Reducing Terraform Drift in Production

October 25, 2025
Terraform plan showing no changes

Our Terraform runs were showing changes on every apply, even when nothing had actually changed. In production infrastructure, that’s a real risk. When the plan is always noisy, it’s easy to miss a change that actually matters. It also blocks any move toward automation — you can’t safely automate applies when you can’t trust what the plan is telling you.

I picked this up on my own in my spare time, initially to fix just one recurring change. But seeing the change set shrink gave me a dopamine hit, and I kept going until there were none left. Five sources of drift, spanning CloudFront, ECS, Secrets Manager, and Terraform provider bugs, all cleaned up.

Why it matters

Noisy plans increase the risk of accidental changes and make it hard to move toward automation. Cleaning up drift ensures the plan only shows real changes, which gives us a reliable baseline for future improvements like automated releases and guardrails.

The fixes

Each fix targets a specific source of drift, with an emphasis on fixing root causes rather than applying quick workarounds. In cases where AWS provider limitations made some drift unavoidable, ignore_changes was used as a last resort.

Make origin_shield dynamic

  • When origin_shield is disabled on a CloudFront distribution, the block is absent from the refreshed state entirely, so a configuration that still declares it shows a diff on every plan
  • Made the block dynamic so it is only included when enabled
  • There was a secondary provider bug when disabling via a dynamic block that needed its own workaround
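
A minimal sketch of the dynamic block, assuming hypothetical variables origin_shield_enabled and origin_shield_region and a placeholder origin; none of these names come from our actual configuration. The rest of the distribution is filled in only so the example is complete, and the workaround for the secondary provider bug is not shown.

```hcl
# Hypothetical inputs, named for illustration only.
variable "origin_shield_enabled" {
  type    = bool
  default = false
}

variable "origin_shield_region" {
  type    = string
  default = "us-east-1"
}

resource "aws_cloudfront_distribution" "example" {
  enabled = true

  origin {
    domain_name = "origin.example.com" # placeholder origin
    origin_id   = "example-origin"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }

    # Emit the block only when Origin Shield is on. When it is off, the
    # configuration has no origin_shield block at all, matching the
    # refreshed state.
    dynamic "origin_shield" {
      for_each = var.origin_shield_enabled ? [1] : []
      content {
        enabled              = true
        origin_shield_region = var.origin_shield_region
      }
    }
  }

  default_cache_behavior {
    target_origin_id       = "example-origin"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
```

The for_each expression yields a one-element list when the flag is true and an empty list otherwise, so the block is rendered either once or not at all.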

Fix null_resources

  • timestamp() in triggers caused Terraform to show changes every run since the value always differs
  • Moved script execution to external data sources, which removes the drift while still running the scripts on every apply
  • Split dependency installation into a separate null_resource so it only runs once
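
Roughly, the before and after look like the sketch below. The script paths are hypothetical, and running them through bash is an assumption made for illustration.

```hcl
# Before: timestamp() differs on every run, so this resource is always
# planned for replacement.
# resource "null_resource" "run_script" {
#   triggers = {
#     always_run = timestamp()
#   }
#   provisioner "local-exec" {
#     command = "${path.module}/scripts/sync.sh"
#   }
# }

# After: dependency installation lives in its own null_resource. With no
# triggers, the provisioner runs when the resource is first created and
# not again on later applies.
resource "null_resource" "install_deps" {
  provisioner "local-exec" {
    command = "${path.module}/scripts/install_deps.sh" # hypothetical script
  }
}

# The script itself runs through an external data source. Data sources are
# read on every run, and reading one does not add a resource change to
# the plan.
data "external" "run_script" {
  # Hypothetical script. The external data source requires the program to
  # print a JSON object whose values are all strings to stdout.
  program = ["bash", "${path.module}/scripts/sync.sh"]

  depends_on = [null_resource.install_deps]
}
```

One thing to keep in mind with this approach: external data sources are also read during terraform plan, so the script needs to be safe to run at plan time, not just on apply.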

Fix task definitions

  • Every apply created a new ECS task definition revision even with no real changes. Task definitions are tracked by AWS Config, so each unnecessary revision also had a cost implication
  • Identified mismatches in environment variable ordering and missing optional properties by comparing Terraform state with the AWS API response
  • Standardized the ordering and filled in the missing fields in the template
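
A sketch of the idea, using jsonencode instead of a template file to keep the example self-contained. The variable, the names, and the particular optional fields written out here are all assumptions; the fields that matter are whichever ones the AWS API response contains and the template omits.

```hcl
# Hypothetical input: environment variables as a map.
variable "environment" {
  type = map(string)
  default = {
    LOG_LEVEL = "info"
    APP_ENV   = "production"
  }
}

resource "aws_ecs_task_definition" "example" {
  family = "example-service" # placeholder

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "example/app:1.0.0" # placeholder image
      essential = true
      memory    = 512

      # Iterating over a map yields keys in lexical order, so the list is
      # always rendered in the same sorted order.
      environment = [
        for name, value in var.environment : {
          name  = name
          value = value
        }
      ]

      # Optional properties written out explicitly so the rendered
      # definition matches what is stored. Assumed fields, for
      # illustration only.
      cpu         = 0
      mountPoints = []
      volumesFrom = []
      portMappings = [
        {
          containerPort = 8080
          hostPort      = 8080
          protocol      = "tcp"
        }
      ]
    }
  ])
}
```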

Webhook secrets

  • An AWS provider issue caused webhook secrets to always show as changed
  • Added ignore_changes for the secret value
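
As an illustration, assuming the webhook secret is stored as a Secrets Manager secret version (the resource type, names, and variable here are hypothetical stand-ins):

```hcl
# Hypothetical input holding the webhook secret.
variable "webhook_secret" {
  type      = string
  sensitive = true
}

resource "aws_secretsmanager_secret" "webhook" {
  name = "example/webhook-secret" # placeholder
}

resource "aws_secretsmanager_secret_version" "webhook" {
  secret_id     = aws_secretsmanager_secret.webhook.id
  secret_string = var.webhook_secret

  lifecycle {
    # Last resort: stop Terraform from planning a change to the value
    # on every run.
    ignore_changes = [secret_string]
  }
}
```

The trade-off is that Terraform sets the value at creation and then ignores later changes to it. Rotating the secret through Terraform would need a forced replacement, for example terraform apply -replace=aws_secretsmanager_secret_version.webhook.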

Resources not managed by Terraform

  • Some resources existed in AWS but were intentionally not managed by Terraform, which showed up as drift on the managed resources they are attached to
  • Added the relevant attributes to ignore_changes
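
A hypothetical example of the pattern: a security group that Terraform manages, with ingress rules that are deliberately created outside Terraform. The resource and attribute are assumptions chosen for illustration, not the ones from our configuration.

```hcl
resource "aws_security_group" "example" {
  name = "example-app" # placeholder

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    # Ingress rules are added outside Terraform on purpose, so do not
    # plan to remove them.
    ignore_changes = [ingress]
  }
}
```

Listing only the specific attributes keeps the rest of the resource under Terraform's control, which is safer than ignoring everything on it.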

What this affords us

Safer infrastructure releases. Planned changes match expected changes. Only plans that deviate from what was expected need manual review before apply.

Faster developer feedback loops. Terraform changes are easier to understand, approve, and debug during speculative planning.

A good foundation for future automation. A clean baseline means we can now move toward automated infrastructure releases — something that wasn’t safe to do when the plan couldn’t be trusted.

What’s next

With a clean plan and apply baseline, we can safely automate infrastructure releases, introduce policy guardrails that highlight genuine issues, and reduce the time spent on manual maintenance.
