How to organize Terraform projects

A Terraform project tends to grow over time, and the growing pile of configuration files can overwhelm team members, especially those who are new to the project. When you notice that so many files have become difficult to maintain, it might be time to refactor the project. But what should you do first?

According to How Do I Decompose Monolithic Terraform Configurations?, Terraform modules solve the monolith problem. The article says:

Gradually you can break that monorepo down into a bunch of smaller repositories that could be managed by different individuals, or different teams.

At first, this made sense to me, but I don't think the approach is applicable to my team. My team is small, fewer than 10 members, so I can easily imagine that we could not maintain many repositories. Ideally, Terraform states should be separated and delegated to their own repositories, but that is virtually impossible for such a small team.

By the way, my team has mainly used three repositories, one of which is "Infrastructure".

If we follow the best practice of breaking the monolith down into smaller repositories, "Infrastructure" is supposed to be decoupled into many repositories, but it's impossible for a small team like ours to maintain that many. Actually, we are struggling to maintain even three repositories.

I think a small team with a single Terraform repository should keep maintaining it as it is unless there is a real need to split it. However, the code structure within the repository should be reconsidered. As How Do I Decompose Monolithic Terraform Configurations? says, Terraform modules should be used because they clarify which resources belong together. HashiCorp has also published another blog post, Structuring HashiCorp Terraform Configuration for Production, which explains the usage of Terraform modules in more detail.

From these articles, I came up with an idea to structure our configurations like this:

$ tree
.
├── modules
│   └── cluster
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── prod
│   ├── analyzer
│   │   └── main.tf # It depends on `cluster`
│   ├── app
│   │   └── main.tf # It depends on `cluster`
│   └── main.tf
└── stg
    ├── analyzer
    │   └── main.tf # It depends on `cluster`
    ├── app
    │   └── main.tf # It depends on `cluster`
    └── main.tf

These files can be managed in a single repository, while prod and stg each have their own Terraform state. So we can test changes on stg first and then apply them to prod. This looks good at first glance, but I soon found a disadvantage of this structure. For example, when I update modules/cluster/main.tf, the change affects all dependants; in our project, prod/app/main.tf, prod/analyzer/main.tf, stg/app/main.tf, and stg/analyzer/main.tf all use this module. In this case, running terraform plan shows the resulting changes, and we learn that all four declared module instances will be affected. Of course, we can test the change on our staging environment, stg, first, but even within stg we have to evaluate its effect on both app and analyzer.

What if we cannot apply the change to app and analyzer at the same time for some reason? In that case, the combination of main configurations and shared modules probably does not work. Ideally, the cluster module would be managed in another repository so that each configuration could pin a different version of it. However, again, multiple repositories would not work for us, so I wanted to find another way.
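For reference, this is what version pinning would look like if the module lived in its own repository, using Terraform's standard Git module source (the repository URL and tag here are hypothetical):

module "cluster" {
  # A Git source lets each configuration pin its own version of the
  # module via the ?ref= query parameter.
  source = "git::https://github.com/example/terraform-cluster.git?ref=v1.2.0"
}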

In the first place, the cluster module may not be a good candidate for sharing. When we run terraform apply, we usually want to confirm how code changes affect our infrastructure resources one by one. In other words, we want a single change in code to correspond to a single update of a resource. The structure above does not satisfy that need.

In this case, cluster should be split into two modules, app and analyzer, with exactly the same names as the main configurations. There will be a lot of duplicated code between them, but we cannot avoid that under the single-repository constraint. If you want to make the code DRY, each module needs its own repository.

Finally, we decided to structure our infrastructure code like this:

$ tree
.
├── modules
│   ├── analyzer
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   └── app
│       ├── main.tf
│       ├── outputs.tf
│       └── variables.tf
├── prod
│   ├── analyzer
│   │   └── main.tf # It depends on `analyzer`
│   ├── app
│   │   └── main.tf # It depends on `app`
│   └── main.tf
└── stg
    ├── analyzer
    │   └── main.tf # It depends on `analyzer`
    ├── app
    │   └── main.tf # It depends on `app`
    └── main.tf
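Under this layout, each environment's configuration calls its dedicated module, so a change to modules/app affects only the app instances. A sketch of prod/app/main.tf (the input variable is hypothetical):

# prod/app/main.tf -- a minimal sketch
module "app" {
  source = "../../modules/app"

  # Hypothetical input; a change to modules/analyzer no longer
  # shows up in this configuration's plan.
  environment = "prod"
}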

Conclusion

If your team is small, I recommend avoiding decoupling the infrastructure code into multiple repositories and instead using modules within your existing repository. A single module per repository is an ideal code structure, but it may fit only large teams. However, keep in mind that modules kept in a single repository are different from such shareable, independently versioned modules.