Infrastructure Testing: Terratest and InSpec

Key Insights

  • Terratest excels at deployment orchestration and integration testing of infrastructure-as-code, while InSpec specializes in compliance validation and runtime state verification—using both together provides comprehensive coverage
  • Infrastructure testing prevents costly production failures by catching misconfigurations early, but requires careful management of cloud costs and test isolation to avoid flaky tests and resource conflicts
  • Implementing infrastructure tests in CI/CD pipelines with proper teardown mechanisms and parallel execution strategies can reduce feedback cycles from days to minutes while keeping cloud spend under control

Introduction to Infrastructure Testing

Infrastructure-as-code has solved configuration drift and manual provisioning errors, but it introduced a new problem: how do you validate that your Terraform modules or CloudFormation templates actually work as intended? Running terraform apply by hand and hoping for the best isn’t sustainable.

Infrastructure testing applies software engineering principles to your infrastructure code. Instead of discovering that your security groups are misconfigured in production, you catch it in a pull request. Instead of wondering if your module works across regions, you verify it programmatically.

Two tools dominate this space: Terratest and InSpec. Terratest, written in Go, focuses on deployment testing—spinning up real infrastructure, validating it works, and tearing it down. InSpec, written in Ruby, specializes in compliance and state verification—checking that running infrastructure meets specific security and configuration requirements. They’re complementary, not competitive.

Terratest Fundamentals

Terratest treats infrastructure code like application code. You write Go tests that deploy your infrastructure, validate it behaves correctly, then destroy it. This means testing against real cloud providers, not mocks—if your S3 bucket configuration doesn’t work in AWS, your test fails.

The typical Terratest workflow follows three phases: deploy, validate, destroy. Registering the destroy phase with defer makes it run even if tests fail, preventing orphaned resources from accumulating in your cloud account.

Here’s a practical example testing a Terraform module that creates an S3 bucket:

package test

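import (
    "strings"
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/random"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestS3BucketCreation(t *testing.T) {
    t.Parallel()

    // Expected values. The random suffix keeps parallel runs from colliding;
    // it is lowercased because S3 bucket names must be lowercase.
    expectedBucketName := "my-test-bucket-" + strings.ToLower(random.UniqueId())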
    awsRegion := "us-west-2"
    
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/s3-bucket",
        Vars: map[string]interface{}{
            "bucket_name": expectedBucketName,
            "region":      awsRegion,
        },
    }
    
    // Ensure cleanup happens
    defer terraform.Destroy(t, terraformOptions)
    
    // Deploy infrastructure
    terraform.InitAndApply(t, terraformOptions)
    
    // Validate bucket exists
    aws.AssertS3BucketExists(t, awsRegion, expectedBucketName)
    
    // Validate bucket versioning is enabled
    versioning := aws.GetS3BucketVersioning(t, awsRegion, expectedBucketName)
    assert.Equal(t, "Enabled", versioning)
    
    // Validate bucket encryption. getS3BucketSSEAlgorithm is a hypothetical
    // local helper (not a Terratest function) that would wrap the S3
    // GetBucketEncryption API and return the default SSE algorithm.
    assert.Equal(t, "AES256", getS3BucketSSEAlgorithm(t, awsRegion, expectedBucketName))
}

This test creates real AWS resources, validates their configuration, and cleans up. The defer statement ensures cleanup happens even if assertions fail. The t.Parallel() call allows multiple tests to run concurrently, reducing total test time.
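
To see why a deferred destroy still runs when validation fails, here is a minimal standalone sketch with no cloud calls. The fakeStack type, the runLifecycle function, and the errValidation value are hypothetical stand-ins for a real deployment, not Terratest APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// errValidation simulates a failed infrastructure assertion.
var errValidation = errors.New("bucket not encrypted")

// fakeStack is a hypothetical stand-in for real infrastructure; it only
// records which lifecycle phases ran.
type fakeStack struct {
	calls []string
}

func (s *fakeStack) deploy()  { s.calls = append(s.calls, "deploy") }
func (s *fakeStack) destroy() { s.calls = append(s.calls, "destroy") }

// runLifecycle mirrors the deploy, validate, destroy flow. The destroy call
// is deferred before deploying, so it runs on every return path.
func runLifecycle(s *fakeStack, validate func() error) error {
	defer s.destroy()
	s.deploy()
	s.calls = append(s.calls, "validate")
	return validate()
}

func main() {
	s := &fakeStack{}
	err := runLifecycle(s, func() error { return errValidation })
	fmt.Println(s.calls, err) // prints: [deploy validate destroy] bucket not encrypted
}
```

The same ordering holds in a real Go test: a failed require assertion stops the test through t.FailNow, which still runs deferred calls before the test function exits.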

Terratest shines for integration testing—validating that your infrastructure actually deploys and functions. But it’s less ideal for detailed compliance checks across dozens of configuration parameters. That’s where InSpec enters.

InSpec Deep Dive

InSpec takes a different approach. Rather than focusing on deployment, it validates the state of existing infrastructure against compliance requirements. You describe what your infrastructure should look like, and InSpec verifies it matches.

InSpec uses a resource-based model. Want to check an EC2 instance? Use the aws_ec2_instance resource. Need to verify security group rules? Use aws_security_group. Tests read naturally and focus on compliance.

Here’s an InSpec profile testing EC2 instance compliance:

# controls/ec2_compliance.rb

control 'ec2-instance-compliance' do
  impact 1.0
  title 'EC2 instances must meet security requirements'
  desc 'Verify EC2 instances have proper configuration and tags'
  
  # Get instances by tag (tags is a hash of tag keys to values)
  instances = aws_ec2_instances.where { tags['Environment'] == 'production' }
  
  instances.instance_ids.each do |instance_id|
    describe aws_ec2_instance(instance_id) do
      it { should exist }
      it { should be_running }
      its('instance_type') { should be_in ['t3.medium', 't3.large'] }
      its('monitoring_state') { should eq 'enabled' }
      
      # Verify required tags
      its('tags') { should include('Owner') }
      its('tags') { should include('CostCenter') }
      its('tags') { should include('Environment') }
    end
    
    # Check security group configuration
    describe aws_security_group(group_id: aws_ec2_instance(instance_id).security_group_ids.first) do
      # No unrestricted SSH access
      it { should_not allow_in(port: 22, ipv4_range: '0.0.0.0/0') }
      
      # HTTPS should be allowed from specific CIDR
      it { should allow_in(port: 443, ipv4_range: '10.0.0.0/8') }
    end
  end
end

control 'ec2-encryption-compliance' do
  impact 1.0
  title 'EC2 volumes must be encrypted'
  
  aws_ebs_volumes.volume_ids.each do |volume_id|
    describe aws_ebs_volume(volume_id) do
      it { should be_encrypted }
    end
  end
end

InSpec excels at expressing compliance requirements as code. The tests are readable by security teams, not just engineers. You can run these profiles against production infrastructure continuously, catching drift before it becomes a security incident.

Combining Terratest and InSpec

The real power emerges when you combine both tools. Use Terratest to orchestrate deployment and high-level validation, then invoke InSpec for detailed compliance checking. This gives you both integration testing and compliance verification in a single test suite.

package test

import (
    "fmt"
    "os/exec"
    "testing"
    
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/require"
)

func TestInfrastructureCompliance(t *testing.T) {
    t.Parallel()
    
    terraformOptions := &terraform.Options{
        TerraformDir: "../infrastructure/vpc",
        Vars: map[string]interface{}{
            "environment": "test",
            "vpc_cidr":    "10.0.0.0/16",
        },
    }
    
    defer terraform.Destroy(t, terraformOptions)
    
    // Deploy infrastructure
    terraform.InitAndApply(t, terraformOptions)
    
    // Get outputs for InSpec
    vpcID := terraform.Output(t, terraformOptions, "vpc_id")
    instanceID := terraform.Output(t, terraformOptions, "instance_id")
    
    // Run basic Terratest validations
    require.NotEmpty(t, vpcID)
    require.NotEmpty(t, instanceID)
    
    // Run InSpec compliance checks against the AWS API (-t aws://),
    // not against the machine running the test
    inspecCmd := exec.Command("inspec", "exec",
        "../compliance/aws-profile",
        "-t", "aws://",
        "--input", fmt.Sprintf("vpc_id=%s", vpcID),
        "--input", fmt.Sprintf("instance_id=%s", instanceID),
        "--reporter", "cli", "json:inspec-results.json")
    
    output, err := inspecCmd.CombinedOutput()
    require.NoError(t, err, "InSpec compliance checks failed:\n%s", string(output))
    
    t.Logf("InSpec validation passed for VPC: %s", vpcID)
}

This pattern gives you the best of both worlds. Terratest handles the deployment lifecycle and ensures your infrastructure actually works. InSpec validates that it meets your organization’s compliance requirements. The test fails if either check doesn’t pass.
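
When the InSpec run fails, it helps to report which controls failed rather than only the exit code. The following is a rough sketch of reading the JSON report in Go. It assumes the report has a profiles, controls, results nesting with id and status fields; check those names against a real inspec-results.json before relying on them. The failedControls function and the sampleReport fixture are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspecReport models only the fields used here. The field names are an
// assumption about the JSON reporter layout, not a complete schema.
type inspecReport struct {
	Profiles []struct {
		Controls []struct {
			ID      string `json:"id"`
			Results []struct {
				Status string `json:"status"`
			} `json:"results"`
		} `json:"controls"`
	} `json:"profiles"`
}

// failedControls returns the IDs of controls with at least one failed result.
func failedControls(raw []byte) ([]string, error) {
	var report inspecReport
	if err := json.Unmarshal(raw, &report); err != nil {
		return nil, fmt.Errorf("parsing InSpec report: %w", err)
	}
	var failed []string
	for _, p := range report.Profiles {
		for _, c := range p.Controls {
			for _, r := range c.Results {
				if r.Status == "failed" {
					failed = append(failed, c.ID)
					break // one failed result is enough to flag the control
				}
			}
		}
	}
	return failed, nil
}

// sampleReport is a hand-written fixture, not real InSpec output.
const sampleReport = `{"profiles":[{"controls":[
  {"id":"ec2-instance-compliance","results":[{"status":"passed"}]},
  {"id":"ec2-encryption-compliance","results":[{"status":"failed"}]}
]}]}`

func main() {
	failed, err := failedControls([]byte(sampleReport))
	fmt.Println(failed, err) // prints: [ec2-encryption-compliance] <nil>
}
```

In the combined test, the raw bytes would come from reading inspec-results.json after the command returns, and the failed IDs could be added to the failure message.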

Testing Patterns and Best Practices

Infrastructure testing requires different patterns than application testing. Tests interact with real cloud APIs, cost money, and take minutes instead of milliseconds. Here are patterns that work:

Test Isolation: Always use unique names and separate accounts/regions for test resources. Parallel tests that share resources will fail unpredictably.
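
One way to get unique names is a small helper that appends a random suffix. This is a sketch using only the standard library; the uniqueName function is hypothetical, and Terratest’s random.UniqueId() serves a similar purpose if you prefer the library helper.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// uniqueName returns the lowercased prefix followed by "-" and eight random
// hex characters. Lowercase output keeps the name valid for resources with
// strict naming rules, such as S3 buckets.
func uniqueName(prefix string) (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generating random suffix: %w", err)
	}
	return strings.ToLower(prefix) + "-" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := uniqueName("my-test-bucket")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // the suffix differs on every run
}
```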

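Retry Logic: Cloud APIs are eventually consistent, so a resource may not be visible immediately after apply. Wrap assertions in retry logic, for example with Terratest’s retry module:

maxRetries := 10
timeBetweenRetries := 6 * time.Second

// Poll until the bucket is visible or the retries run out
retry.DoWithRetry(t, "wait for bucket", maxRetries, timeBetweenRetries, func() (string, error) {
    return "", aws.AssertS3BucketExistsE(t, awsRegion, bucketName)
})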

Cost Management: Destroy resources immediately after validation. Use smaller instance types. Run expensive tests only on main branch merges, not every PR.
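
Gating expensive tests can be as simple as checking an environment variable at the top of the test. The sketch below assumes your CI system exports a hypothetical RUN_EXPENSIVE_TESTS=true variable on main-branch builds; the variable name and the shouldRunExpensive helper are illustrative only.

```go
package main

import (
	"fmt"
	"os"
)

// shouldRunExpensive reports whether costly tests should run. The lookup
// function is injected so the decision can be checked without touching the
// real environment.
func shouldRunExpensive(getenv func(string) string) bool {
	return getenv("RUN_EXPENSIVE_TESTS") == "true"
}

func main() {
	fmt.Println(shouldRunExpensive(os.Getenv))
}
```

Inside a test, the check would look like `if !shouldRunExpensive(os.Getenv) { t.Skip("expensive test; runs on main only") }`, placed before any infrastructure is deployed.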

CI/CD Integration: Here’s a GitHub Actions workflow that runs both tools:

name: Infrastructure Tests

on:
  pull_request:
    paths:
      - 'infrastructure/**'
      - 'test/**'

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v3
      
      - name: Setup Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.21'
      
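      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: '1.6.0'
          # Disable the wrapper script so Terratest can parse terraform output cleanly
          terraform_wrapper: false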
      
      - name: Setup InSpec
        run: |
          curl https://omnitruck.chef.io/install.sh | sudo bash -s -- -P inspec
      
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2
      
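      - name: Run Terratest
        run: |
          cd test
          go test -v -timeout 30m -parallel 5
        env:
          AWS_DEFAULT_REGION: us-west-2
          # Accept the Chef license non-interactively so inspec exec can run in CI
          CHEF_LICENSE: accept-silent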
      
      - name: Upload InSpec Results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: inspec-results
          path: test/inspec-results.json

This workflow runs tests on every PR touching infrastructure code, provides fast feedback, and uploads compliance results for review.

Common Pitfalls and Solutions

Flaky Tests: The biggest issue with infrastructure testing. Cloud APIs have eventual consistency, rate limits, and transient failures. Solutions:

  • Implement exponential backoff retry logic
  • Use terraform.InitAndApplyAndIdempotent to confirm that a second apply reports no changes, which exposes resources that never converge
  • Set appropriate timeouts (infrastructure tests need 20-30 minute timeouts)
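
The exponential backoff mentioned above can be sketched in a few lines of standard-library Go. The retryWithBackoff function, the recorder type, and errNotReady are hypothetical names for illustration. The sleep function is injected so the example runs instantly; pass time.Sleep in real use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady simulates a transient cloud API failure.
var errNotReady = errors.New("resource not ready")

// recorder collects requested waits instead of sleeping, which keeps the
// example fast.
type recorder struct{ waits []time.Duration }

func (r *recorder) sleep(d time.Duration) { r.waits = append(r.waits, d) }

// retryWithBackoff calls fn up to maxAttempts times. After each failure it
// waits base, 2*base, 4*base, and so on before trying again.
func retryWithBackoff(maxAttempts int, base time.Duration, sleep func(time.Duration), fn func() error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			sleep(base << attempt) // double the delay on every retry
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	r := &recorder{}
	calls := 0
	err := retryWithBackoff(5, time.Second, r.sleep, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, r.waits, err) // prints: 3 [1s 2s] <nil>
}
```

Production retry loops often add random jitter to each delay so that parallel tests do not hit a rate-limited API in lockstep; that is left out here to keep the sketch short.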

Orphaned Resources: Failed tests that don’t clean up leave expensive resources running. Solutions:

  • Always use defer terraform.Destroy()
  • Implement resource tagging with timestamps
  • Run cleanup jobs that delete test resources older than the longest expected test run
  • Use separate AWS accounts with budget alerts
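
A cleanup job built on timestamp tags needs a rule for deciding what counts as stale. The sketch below assumes each test resource carries a hypothetical CreatedAt tag in RFC 3339 format; the isStale function, the maxTestAge constant, and the fixedNow value are illustrative names, and listing and deleting the resources is left out.

```go
package main

import (
	"fmt"
	"time"
)

// maxTestAge is an assumed upper bound on how long any test run should take.
const maxTestAge = 2 * time.Hour

// fixedNow is a fixed reference time so the example is deterministic.
var fixedNow = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// isStale reports whether a resource whose CreatedAt tag holds createdAt is
// older than maxAge at the given time.
func isStale(createdAt string, maxAge time.Duration, now time.Time) (bool, error) {
	created, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		return false, fmt.Errorf("parsing CreatedAt tag %q: %w", createdAt, err)
	}
	return now.Sub(created) > maxAge, nil
}

func main() {
	stale, err := isStale("2024-01-01T09:00:00Z", maxTestAge, fixedNow)
	fmt.Println(stale, err) // prints: true <nil>
}
```

Resources with a missing or unparseable tag return an error rather than being treated as stale, so a cleanup job built on this would skip them instead of deleting something it cannot date.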

Long Test Duration: Full infrastructure tests can take 15-20 minutes. Solutions:

  • Run expensive tests only on main branch
  • Use t.Parallel() aggressively
  • Cache Terraform providers and modules
  • Consider using LocalStack or moto for unit tests, real cloud for integration tests

State Management: Terraform state conflicts cause test failures. Solutions:

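  • Use a unique backend key per test, such as test-<random-id>/terraform.tfstate, passed through the BackendConfig field of terraform.Options (backend blocks cannot interpolate variables)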
  • Never share state between tests
  • Clean up state files in CI/CD
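
As a sketch of per-test state isolation, the helper below builds backend settings with a key derived from a test ID. The backendConfigFor function, the bucket name, and the key layout are hypothetical; the map shape matches what the BackendConfig field of terraform.Options accepts for an S3 backend.

```go
package main

import "fmt"

// backendConfigFor builds backend settings whose state key is unique to one
// test run, so parallel tests never read or write the same state file.
func backendConfigFor(stateBucket, region, testID string) map[string]interface{} {
	return map[string]interface{}{
		"bucket": stateBucket,
		"key":    fmt.Sprintf("test-%s/terraform.tfstate", testID),
		"region": region,
	}
}

func main() {
	cfg := backendConfigFor("my-terraform-test-state", "us-west-2", "a1b2c3")
	fmt.Println(cfg["key"]) // prints: test-a1b2c3/terraform.tfstate
}
```

In a test, the result would be assigned to BackendConfig in terraform.Options, and Terratest would hand each entry to terraform init as a backend-config argument. The module itself then needs only an empty backend "s3" {} block.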

Conclusion

Infrastructure testing isn’t optional anymore. The cost of production outages from misconfigured infrastructure far exceeds the investment in proper testing. Terratest and InSpec provide complementary capabilities that together create comprehensive infrastructure validation.

Use Terratest when you need to validate deployment workflows, test module interfaces, and verify infrastructure actually provisions correctly. Use InSpec when you need compliance validation, security posture verification, and ongoing drift detection. Use both together when you want confidence that your infrastructure is both functional and compliant.

Start small: write a single Terratest test for your most critical Terraform module. Add InSpec compliance checks for your security requirements. Integrate them into CI/CD. Expand coverage iteratively. Your future self—and your security team—will thank you.
