diff --git a/content/what-is/how-to-step-up-cloud-infrastructure-testing.md b/content/what-is/how-to-step-up-cloud-infrastructure-testing.md index 9fce0fbbf496..503b11db626f 100644 --- a/content/what-is/how-to-step-up-cloud-infrastructure-testing.md +++ b/content/what-is/how-to-step-up-cloud-infrastructure-testing.md @@ -1,7 +1,6 @@ --- title: How to Step Up Cloud Infrastructure Testing -meta_desc: | - Learn more about modeling infrastructure testing on standard software practices by including other tests, such as unit, integration, policy and security tests. +meta_desc: "Learn how to test cloud infrastructure as code: unit, property, integration, and security tests, where each fits in CI/CD, and the tools to use." type: what-is page_title: "Stepping Up Your Infrastructure Testing: A Quick Introduction" @@ -27,72 +26,195 @@ customer_logos: - webflow - supabase - ro -authors: ["zack-chase"] +authors: ["cam-soper"] --- -Infrastructure testing isn’t new and, over the years, there have been various tools that people have used to perform it. However, what tends to happen is that standard ops testing focuses on acceptance tests. That means the ops team spins up some infrastructure in the cloud and they then test that infrastructure to see if it’s correct. Of course, if it wasn’t spun up correctly, the team needs to destroy and recreate it. That’s not a great approach because, potentially, something that shouldn’t have happened already has, depending on how quickly the team took a look. 
+**Cloud infrastructure testing is the practice of validating the code that defines your cloud resources the same way you'd validate application code: with unit tests, property tests, integration tests, and security tests running in CI before changes reach production.** When infrastructure lives in source control as [infrastructure as code (IaC)](/what-is/what-is-infrastructure-as-code/), the same engineering disciplines that catch bugs in applications can catch bugs in VPCs, IAM policies, Kubernetes clusters, and DNS records before they cause an outage or a compliance finding. -A better approach is to model infrastructure testing on standard software practices for applications. That means including other tests, such as unit, integration, policy and security tests. All these tests can be part of a test-driven development (TDD) strategy. With TDD, you first write the test cases that specify and validate what the code will do. The test cases are created and tested first. The initial run will fail, of course, because the code isn’t written. Then, you write the new code, which should pass the tests. If it doesn’t, you know early on that you’ve got some bugs that need to be fixed. +Most teams that adopt IaC stop at deploy-and-verify: spin up the resources, click around, hope for the best. That's the infrastructure equivalent of "no automated tests, just QA." It works for small estates and breaks badly at scale. A real testing program for infrastructure mirrors the testing pyramid software teams already use: fast tests run on every commit, slower tests run on PRs, the slowest run pre-deploy. This article walks through the layers, what each one catches, and how to wire them into a CI/CD pipeline. -TDD is one way to shift risk to the left. That means you move testing to as early in the development cycle as possible, instead of testing at the end of the cycle. 
Fixing problems early, before there’s a lot of code with all its complexities and interdependencies in place, will make your debugging sessions simpler. The end result is you’ll deliver faster. +In this article, we'll cover the key questions about cloud infrastructure testing: -## Unit Tests +* Why test infrastructure as code? +* What are the layers of infrastructure testing? +* What is IaC unit testing? +* What is property testing for infrastructure? +* What is integration testing for infrastructure? +* What is security and policy testing for infrastructure? +* Where do these tests fit in a CI/CD pipeline? +* What tools are used for infrastructure testing? +* How does Pulumi support infrastructure testing? +* Frequently asked questions about infrastructure testing -Unit tests evaluate the behavior of your infrastructure in isolation. External dependencies, such as databases, are replaced by mocks to check your resource configuration and responses. It’s possible to use mocks because responses from cloud providers are well known and tested. You already know how, given some parameters, the provider will respond. +## Why test infrastructure as code? -Unit tests run in memory without any out-of-process calls, which makes them very fast. Use them for fast feedback loops during development. Unit tests really help you solve problems early in the life cycle of your infrastructure. +Three reasons make automated infrastructure tests worth the investment: -A few examples of what you can verify are: +* **Misconfigurations are a leading source of cloud incidents.** Public buckets, overly broad IAM, open security groups, and missing encryption show up repeatedly in reported cloud breaches. The cheapest moment to catch any of those is a test that fails before the merge. +* **Manual review doesn't scale.** A typical cloud-native app spans hundreds of resources changing daily. 
No human reviewer reliably catches a regression like "the new module just opened SSH to the internet" by reading a diff. A test does. +* **Rollback is expensive.** Reverting an infrastructure change can cascade: dropped database, deleted secrets, broken DNS. Catching the issue in CI is dramatically cheaper than rolling forward through an incident. -- Resources are correctly tagged. -- Instances don’t have an SSH connection open to the Internet. -- Web site URLs are valid. +## What are the layers of infrastructure testing? -When you’re planning your tests, think about using a tool that lets you write your tests in a general purpose language such as Python, Go, TypeScript or C#, rather than in a special-purpose DSL. Standard languages all have well-understood tools and frameworks that make it much easier to test your code. +Like application testing, infrastructure testing has layers that trade speed for fidelity. -## Integration Tests +| Layer | What it catches | Provisions real resources? | Typical runtime | +|---|---|---|---| +| **Static analysis / linting** | Syntax errors, obvious misconfigs, drift from project standards | No | Seconds | +| **Unit tests** | Logic in your IaC code (loops, conditionals, components) | No (mocked) | Seconds | +| **Property tests** | Constraints on the planned/deployed resource shape | No (or transient) | Seconds to minutes | +| **Integration / end-to-end tests** | Whether the resulting infrastructure actually works | Yes, ephemeral | Minutes to tens of minutes | +| **Security / policy tests** | Compliance, security posture, organizational rules | No or partial | Seconds | -Integration testing (also known as black-box testing) comes after unit testing and it takes a different approach. Integration tests deploy cloud resources and validate their actual behavior but in an ephemeral environment. An ephemeral environment is a short-lived environment that mimics a production environment. 
It’s often simpler and only includes the first-level dependencies of the code you’re testing. +A healthy program runs every layer; the cheap ones run on every commit, the expensive ones run on PRs or pre-deploy. -Some of the behaviors you can verify are: +## What is IaC unit testing? -- Your project’s code is syntactically well-formed and runs without errors. -- Your stack’s configuration and secrets work and are interpreted correctly. -- Your project can be successfully deployed to your cloud provider. -- The infrastructure behaves as expected: for example, a health-check endpoint returns a valid HTML document, or a suite of application-level tests succeeds against the public API. +Unit tests verify the logic of your IaC programs without calling a cloud provider. External dependencies are replaced with mocks that return canned responses, so the test runs entirely in memory. Pulumi programs can be unit-tested using the standard test runners for the language you wrote them in (Jest/Vitest for TypeScript, pytest for Python, `go test` for Go, xUnit for C#, JUnit for Java). -Once the integration tests are finished, you can destroy the ephemeral infrastructure. +Examples of what a unit test should assert: -## Property Tests +* Every resource carries the required cost-allocation and ownership tags. +* No security group allows `0.0.0.0/0` ingress on port 22. +* A bucket's public-access-block is enabled in every environment. +* The right number of replicas is created when the stack is configured as `production`. -A type of test you may not be familiar with is a property test. Property tests run resource-level assertions while the infrastructure is being deployed. They are there to test your policies and they rely on you having written your policies as code. +Unit tests run in seconds and produce the tightest feedback loop. The trade-off is that they only test what you wrote; they can't catch a problem in how the cloud provider actually behaves. 
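As a minimal sketch of the assertions listed above, the example below checks two of them (required tags, and no `0.0.0.0/0` ingress on port 22) against a plain in-memory model of a security group. The `SecurityGroup` and `IngressRule` classes, the `build_web_security_group` function, and the `REQUIRED_TAGS` set are hypothetical stand-ins invented for illustration, not Pulumi APIs. A real Pulumi unit test would apply the same assertions to resources created under the SDK's mocks, as described in the unit testing guide.

```python
from dataclasses import dataclass, field


@dataclass
class IngressRule:
    """Hypothetical stand-in for a security group ingress rule."""
    from_port: int
    to_port: int
    cidr_blocks: list[str]


@dataclass
class SecurityGroup:
    """Hypothetical stand-in for a security group defined in IaC."""
    name: str
    ingress: list[IngressRule]
    tags: dict[str, str] = field(default_factory=dict)


# Assumed tagging standard; substitute your organization's required keys.
REQUIRED_TAGS = {"owner", "cost-center"}


def build_web_security_group(environment: str) -> SecurityGroup:
    """The IaC logic under test: HTTPS is public, SSH is internal only."""
    return SecurityGroup(
        name=f"web-{environment}",
        ingress=[
            IngressRule(443, 443, ["0.0.0.0/0"]),
            IngressRule(22, 22, ["10.0.0.0/8"]),
        ],
        tags={"owner": "platform-team", "cost-center": "1234"},
    )


def ssh_open_to_internet(group: SecurityGroup) -> bool:
    """Return True if any rule covering port 22 allows 0.0.0.0/0."""
    return any(
        rule.from_port <= 22 <= rule.to_port and "0.0.0.0/0" in rule.cidr_blocks
        for rule in group.ingress
    )


def missing_tags(group: SecurityGroup) -> set[str]:
    """Return the required tag keys that the resource does not carry."""
    return REQUIRED_TAGS - set(group.tags)


def test_web_security_group() -> None:
    """Runs entirely in memory: no cloud credentials, no provider calls."""
    group = build_web_security_group("production")
    assert not ssh_open_to_internet(group)
    assert missing_tags(group) == set()
```

A runner such as pytest would collect `test_web_security_group` automatically. The port check uses a range comparison so that a rule opening ports 0 through 65535 is caught as well as one that names port 22 exactly.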
-In contrast to “black-box” integration testing, policies have access to all input and output values of all cloud resources in the stack. As opposed to unit testing, property tests can evaluate real values returned from the cloud provider instead of the mocked ones. +For Pulumi-specific patterns, see the [unit testing guide](/docs/iac/guides/testing/unit/). -Use property tests to ensure that your infrastructure complies with your company’s standards. A couple examples are: +## What is property testing for infrastructure? -- Checking that you’re using the correct version of the provider's managed Kubernetes service. -- Ensuring a service can make an API call to a policy engine to determine whether a request is authorized or not. -- Ensuring that a resource is provisioned inside a private VPC, rather than the default one. +Property tests run after a `pulumi preview` (or equivalent) produces a plan, against the planned resource graph or the freshly-deployed real resources. Unlike unit tests, they see actual cloud-provider outputs; unlike integration tests, they assert on specific resource properties rather than end-to-end behavior. -## Security Tests +Property tests are well suited to enforcing organizational rules like: -Too often, security tests are left until the last minute, or code that’s considered “finished” gets thrown over the wall to a security team, who’ve been left out of the entire development process. The phrase “courting disaster” comes to mind when considering this approach. Large companies and governments have all suffered well-publicized data breaches that exposed millions of confidential records. +* "The Kubernetes cluster must use the LTS provider version." +* "Every database must have backups enabled with at least a 7-day retention." +* "All compute lives inside a non-default VPC with private subnets." +* "Every service has logging enabled and shipped to the central log account." 
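Rules like these reduce to small functions evaluated over the planned resource graph. The sketch below implements two of them (backup retention and non-default VPC) in plain Python. The resource dictionary shape, the property names `backup_retention_days` and `vpc`, and the `evaluate` helper are assumptions made for illustration; this is not the Pulumi Policies API, only a model of the idea behind it.

```python
from typing import Callable, Optional

# Illustrative resource shape: {"type": str, "name": str, "props": dict}.
Resource = dict
Rule = Callable[[Resource], Optional[str]]

MIN_BACKUP_RETENTION_DAYS = 7


def database_backups_enabled(resource: Resource) -> Optional[str]:
    """Every database must keep backups for at least 7 days."""
    if resource["type"] != "database":
        return None
    days = resource["props"].get("backup_retention_days", 0)
    if days < MIN_BACKUP_RETENTION_DAYS:
        return (f"{resource['name']}: backup retention is {days} days, "
                f"expected at least {MIN_BACKUP_RETENTION_DAYS}")
    return None


def compute_outside_default_vpc(resource: Resource) -> Optional[str]:
    """All compute must live in a non-default VPC."""
    if resource["type"] != "compute":
        return None
    if resource["props"].get("vpc", "default") == "default":
        return f"{resource['name']}: runs in the default VPC"
    return None


RULES: list[Rule] = [database_backups_enabled, compute_outside_default_vpc]


def evaluate(resources: list[Resource], rules: list[Rule]) -> list[str]:
    """Apply every rule to every resource and collect the violations."""
    violations = []
    for resource in resources:
        for rule in rules:
            message = rule(resource)
            if message is not None:
                violations.append(message)
    return violations


# Hypothetical planned resources, as a preview step might report them.
EXAMPLE_PLAN = [
    {"type": "database", "name": "orders-db", "props": {"backup_retention_days": 3}},
    {"type": "database", "name": "users-db", "props": {"backup_retention_days": 14}},
    {"type": "compute", "name": "api-server", "props": {"vpc": "vpc-private"}},
    {"type": "compute", "name": "batch-worker", "props": {}},
]

if __name__ == "__main__":
    for violation in evaluate(EXAMPLE_PLAN, RULES):
        print(violation)
```

A CI job built on this pattern would fail the build whenever `evaluate` returns a non-empty list. Treating a missing property as a violation (the `.get` defaults) is a deliberate fail-closed choice.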
-Security tests should be as much a part of your workflow as any other type of testing. Just as you start testing your code early with unit tests, so should you start testing early to find security problems. If you have a dedicated security team, involve them right away, so they can help you design effective tests. Make sure those tests are included in your CI/CD pipeline. +In Pulumi, these checks are usually written as [policy as code](/docs/insights/policy/) using Pulumi Policies. The same policy code runs against `pulumi preview` (blocking the merge) and as a deploy-time gate. -Just a few of the things you should do are: +## What is integration testing for infrastructure? -- Strip out all plaintext secrets. -- Make sure all secrets are encrypted. -- Think about adopting services offered by your cloud provider to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. +Integration tests deploy real cloud resources into an ephemeral environment, run end-to-end checks, then tear the environment down. They answer the question your unit and property tests can't: "Does the thing I just deployed actually work?" -As with all the other tests we’ve mentioned, security testing should be done as early in the process as possible. If you have a dedicated security team, make sure to involve them immediately. Don’t write your code and then hand off what you consider to be a production-ready code to them. There are a variety of security tests you can incorporate into your development process. Here are a few: +A typical integration test: -- Vulnerability Scanning: This kind of testing uses automated software to scan a system against known vulnerability signatures. There are vulnerability scanners on the market that you can use. -- Penetration testing or pen tests: This kind of testing simulates an attack from a malicious hacker. 
This testing involves analysis of a particular system to check for potential vulnerabilities to an external hacking attempt. You might want to check out the Open Web Application Security Project ([OWASP](https://owasp.org/www-project-web-security-testing-guide/)), which is a worldwide non-profit organization focused on improving the security of software. The project has multiple tools to pen-test various software environments and protocols. -- Ethical hacking: Try scheduling “game days,” where people in your company deliberately try to hack its systems. +1. Stand up a short-lived stack in a sandbox account using a unique stack name. +1. Wait for resources to converge. +1. Run assertions against the live system: hit a health-check URL, confirm a Lambda returns the expected response, run a SQL query, send a Kafka message and assert it lands. +1. Capture logs and metrics for failure diagnostics. +1. Tear down the stack. -## Learn More +Integration tests are the slowest and most expensive layer (they consume cloud resources), so run them selectively: on PRs that touch production-relevant code paths, on a nightly schedule, or as a deploy-time smoke gate. -Pulumi lets you take advantage of the well-developed testing frameworks that support your favorite programming language. It also includes many features for helping you ensure that your infrastructure works the way it should, is reliable and is secure. Visit us at pulumi.com or [get started](/docs/get-started/) for free today. +See Pulumi's [integration testing guide](/docs/iac/guides/testing/integration/) for end-to-end test patterns. + +## What is security and policy testing for infrastructure? + +Security testing for IaC has two halves. + +**Static scans of the code** catch known bad configurations before deploy: hardcoded secrets, public buckets, missing encryption, dangerous IAM. 
Tools like Checkov, Terrascan, Trivy, and Snyk IaC scan Terraform, CloudFormation, and Kubernetes manifests against built-in rulesets; Checkov also has a Pulumi-output mode. Run them in CI on every change. + +**Policy as code** enforces the rules your security team writes for the organization itself. [Pulumi Policies](/docs/insights/policy/) lets you author policies in TypeScript/JavaScript, Python, or OPA's Rego against the actual Pulumi resource model, with three enforcement levels (`advisory`, `mandatory`, `disabled`). (Pulumi Policies apply to Pulumi stacks written in any supported language, including Go, .NET, and Java.) Policies run during `pulumi preview` and `pulumi up`, so a non-compliant change can't get past CI. + +Beyond IaC-specific scans, the wider security testing menu still applies: + +* **Vulnerability scanning** of container images and dependencies (Trivy, Snyk, Anchore). +* **Penetration testing** of the deployed system, scheduled or ad hoc. +* **Game days and chaos exercises** that deliberately break parts of the running system to test detection and response. + +Whatever the mix, the consistent rule is **shift left**: every security check that can run in CI should, and the deploy gate should fail closed. + +## Where do these tests fit in a CI/CD pipeline? + +A common shape for an IaC pipeline: + +1. **On every commit:** lint, static security scan (Checkov / Trivy), unit tests. +1. **On every pull request:** the above, plus `pulumi preview`, plus Pulumi policies in advisory mode. +1. **On merge to main:** `pulumi preview` against staging, deploy to staging, run integration tests against staging. +1. **On promotion to production:** Pulumi policies in mandatory mode, `pulumi up`, smoke tests against production. + +The principle is the same as application CI: fast feedback for changes in progress, slower and broader checks closer to production. + +## What tools are used for infrastructure testing? 
+ +| Category | Representative tools | +|---|---| +| Unit testing | Jest, Vitest, pytest, `go test`, xUnit, JUnit (standard test runners for the language you write IaC in) | +| Static IaC scanning | Checkov, Terrascan, Trivy, Snyk IaC | +| Policy as code | [Pulumi Policies](/docs/insights/policy/), Open Policy Agent (OPA), HashiCorp Sentinel | +| Property and integration testing | Pulumi automation API, Terratest, Kitchen-Terraform | +| Cloud emulation | LocalStack (AWS), Moto (AWS), Azurite (Azure) | +| Image and dependency scanning | Trivy, Snyk, Anchore, Grype | +| Chaos engineering | Gremlin, AWS Fault Injection Simulator, Chaos Mesh | + +The point of a testing toolchain isn't to have the most tools; it's to have one tool covering each layer, wired into CI, so a regression caught at any layer fails the build. + +## How does Pulumi support infrastructure testing? + +Pulumi treats infrastructure as software, which means the testing tools that exist for your application code are available for your IaC: + +* **Real programming languages.** Write Pulumi programs in TypeScript, Python, Go, C#, Java, or YAML, and use the same test runners and mocking libraries you already know. +* **Unit testing with mocks.** Pulumi's [test mocks](/docs/iac/guides/testing/unit/) replace cloud provider calls with canned responses, so unit tests run in seconds and don't need any cloud credentials. +* **Integration testing through the automation API.** The [automation API](/docs/iac/packages-and-automation/automation-api/) lets you script `pulumi up` and `pulumi destroy` from a test runner, so integration tests can deploy and tear down ephemeral stacks programmatically. +* **Policy as code.** [Pulumi Policies](/docs/insights/policy/) run during preview and update, blocking changes that violate organizational rules. Policy packs are versioned and shipped alongside your code. 
+* **CI/CD integration.** Pulumi runs in GitHub Actions through the [GitHub Actions integration](/docs/iac/guides/continuous-delivery/github-actions/) and in any other CI/CD system that can run a CLI. + +[Get started with Pulumi](/docs/get-started/) to provision and test infrastructure as code in TypeScript, Python, Go, C#, Java, or YAML. + +## Frequently asked questions about infrastructure testing + +### Why test infrastructure at all? + +The same reason you test applications: to catch defects when they're cheap to fix. An IAM misconfiguration found in CI costs a few seconds of compute; the same misconfiguration in production can lead to a breach. + +### What's the difference between IaC unit tests and integration tests? + +Unit tests run in memory with mocked cloud responses, so they're fast (seconds) but only test the logic of your IaC code. Integration tests deploy real cloud resources to an ephemeral environment and assert on the running system, so they're slow (minutes) but catch problems that only show up against the real cloud provider. + +### Do I need to write tests for every resource? + +No. Test the resources and modules that carry real risk: anything with security implications (IAM, networking, secrets), anything that's reused across many stacks (shared components), and anything whose failure would cause an outage. Treat one-off resources the same way you treat one-off scripts. + +### What is "property testing" in the IaC sense? + +A check that runs against the planned or deployed resource graph and asserts properties on it (for example: every database has backups enabled, every Kubernetes cluster uses a supported version). In Pulumi, property tests are typically written as [Pulumi policies](/docs/insights/policy/). + +### How do I test infrastructure without spending a lot on cloud resources? + +Run unit tests with mocks for the bulk of your testing. Use cloud emulators (LocalStack, Moto, Azurite) where the provider's behavior is well-modeled. 
Reserve real-cloud integration tests for the highest-value scenarios, run them in cheap regions, and tear down stacks as soon as the test finishes. + +### Should security tests run before or after deploy? + +Both. Static scanning and policy as code should run before deploy so non-compliant changes never reach production. Dynamic scanning, penetration testing, and chaos exercises run against the deployed system because they examine runtime behavior that pre-deploy checks can't observe. + +### How does test-driven development work for infrastructure? + +The same way as for applications: write a test that describes the resource you want (right tag, right encryption, right policy), watch it fail, write the IaC that makes it pass, then refactor. TDD is particularly effective when building reusable infrastructure components, because the tests document the component's contract. + +### What's the role of policy as code? + +Policy as code is the operating model for organization-wide rules: things like "every database must have backups," "no public S3 buckets," "all compute must be tagged with an owner." Policies live in version control alongside the infrastructure they govern, run automatically on every change, and produce auditable evidence that the rules are enforced. + +### Do compliance frameworks (SOC 2, HIPAA, PCI) accept IaC test results as evidence? + +Often, though acceptance depends on the auditor and the control in question. IaC test output and policy-as-code run logs can serve as evidence that a control is enforced. A Pulumi Policies run, for example, produces a record of a control being checked against a specific change at a specific time, which is more concrete than a written policy with no enforcement mechanism behind it. + +### How do I introduce testing to an existing IaC codebase? + +Start with the cheapest layer that produces the most value: static scanning and policy as code in advisory mode. That gives you a baseline of how compliant the current codebase is without blocking anyone. 
Promote policies to mandatory mode as you remediate findings. Add unit tests to new components as you write them; add integration tests around the highest-risk modules first. + +## Learn more + +Pulumi lets you take advantage of the testing frameworks, mock libraries, and CI/CD tooling that already work for your application code, and apply them to your infrastructure. Combined with [Pulumi policy as code](/docs/insights/policy/), [the automation API](/docs/iac/packages-and-automation/automation-api/), and a real testing pyramid, that closes the gap between how teams treat application code and how they treat the cloud infrastructure it runs on. [Get started today](/docs/get-started/). + +Related reading: + +* [What is Infrastructure as Code (IaC)?](/what-is/what-is-infrastructure-as-code/) +* [Infrastructure as Code for DevOps](/what-is/infrastructure-as-code-for-devops/) +* [What is DevOps?](/what-is/what-is-devops/) +* [What is Cloud Security?](/what-is/what-is-cloud-security/) +* [What is Configuration Management?](/what-is/what-is-configuration-management/)