Should You Maintain A Private Fork Of Open Source Terraform Modules?
By Sudheer S
This post is part of the IAC with Terraform series.
IAC stands for Infrastructure As Code. Modern IT infrastructure can be orchestrated using programmatic methods. Terraform is (was?) a popular open source tool used to orchestrate infrastructure in the cloud and elsewhere.
Terraform has the concept of modules. With modules, you can abstract infrastructure into reusable code. For example, if you are creating the same pattern of infrastructure over and over again, you could abstract the pattern into a Terraform module. Let’s take the example of a web application. It consists of:
- Compute instances for backend API behind a load balancer
- Message broker service
- Compute instances for message consumer application
- PostgreSQL relational database service
- Redis for caching
- Object store for files
- …
You can create a Terraform module to orchestrate the above list of resources. You can keep the module in a private repository, or make it open source and publish it to the Terraform registry. The module would take a few input parameters and orchestrate the resources. The module also outputs information about the resources. The input parameters could be:
- Name of the compute instance group, compute instance type
- Name of the database, its version, size, storage, etc.
- Redis cluster parameters
- …
The module could output:
- IP address or DNS name of the load balancer
- IP addresses of the compute instances
- Database instance IP address
- …
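To make the inputs and outputs concrete, here is a minimal sketch of how a project might call such a module. Everything in it is an assumption for illustration: the module path `./modules/web-app`, the input names, the placeholder values, and the output name `load_balancer_dns_name` are hypothetical and do not refer to any real published module.

```hcl
# Hypothetical call to a web application module.
# The source path, input names, and values are illustrative placeholders.
module "web_app" {
  source = "./modules/web-app" # assumed local module directory

  # Compute inputs
  instance_group_name = "backend-api"
  instance_type       = "example-medium" # placeholder, provider specific

  # Database inputs
  database_name       = "appdb"
  database_version    = "16"
  database_storage_gb = 100

  # Redis inputs
  redis_node_count = 2
}

# Re-export one of the module's (assumed) outputs from the root configuration.
output "load_balancer_dns_name" {
  description = "DNS name of the load balancer created by the module"
  value       = module.web_app.load_balancer_dns_name
}
```

The caller only sees the inputs and outputs. The compute instances, broker, database, cache, and object store stay hidden inside the module.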
You could write the modules yourself. Or you can leverage open source Terraform modules published in the registry. If the open source modules fit the bill, you can focus on your business and worry less about module maintenance.
The question we are considering in this blog post is: should you fork the open source Terraform modules and keep the forks private?
If the module is released under a permissive open source license, you can definitely do it. But should you?
There’s a fear among people that such open source modules might be removed from the registry and might disappear from the Internet someday. Therefore, forking the open source module while it is still available might sound like a good idea.
The answer to the question becomes clear if you apply the same concept to other open source software you use. We are talking about IT infrastructure. There’s a good chance that you are using Linux in your IT infrastructure. Try to apply the same logic to Linux. What if Linux disappears from the Internet someday? Should you maintain a proprietary fork of Linux for your organization? Extend the same logic to other open source software such as programming languages, frameworks, libraries, etc.
My take is that it is not a good idea to fork open source software out of fear of it disappearing from the Internet, especially popular software with community backing. There’s a good chance that someone will have a copy of the software, or that some Internet archive will have one.
Another common objection to using open source Terraform modules that I have come across is stability. If the module is not stable enough, you can:
- Contribute to the open source project and make it stable
- Pin the version of the module in your project if you are worried about unstable releases
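As a sketch of what pinning looks like, a module sourced from the registry accepts a `version` argument. The registry address `example-org/web-app/aws` and the version numbers below are hypothetical placeholders, not a real module.

```hcl
# Pin a registry module to one exact release.
# The registry address and version number are illustrative placeholders.
module "web_app" {
  source  = "example-org/web-app/aws"
  version = "1.4.2" # only this release will be used

  # A looser alternative: allow patch releases within 1.4.x only.
  # version = "~> 1.4.0"
}
```

With an exact pin, a new upstream release does not reach your infrastructure until you change the version yourself and review the plan.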
If you are still worried about losing the source code, maintain your own archive of the software. That way you have a copy and can still avoid maintaining your own fork.
Forking an open source project comes with technical debt. It might sound like a good idea to download a module, make some minor changes to it, and then use it in your project. But the open source version will move faster with bug fixes and new features, and your fork becomes harder to maintain as it drifts from upstream. The heavy lifting required to maintain a fork might not be justified. Successful projects use a CI/CD pipeline and automated tests. You may not have access to such a sophisticated release engineering system for your fork. The amount of scrutiny each change or PR receives in the open source project will likely be substantially more than what you and your colleagues can provide.
Following the security bulletins of upstream projects is an arduous task. The upstream project might have discovered security vulnerabilities and fixed them in later versions. If you do not have a dedicated team and resources to track such vulnerabilities and port the fixes to your fork, you create more security risk in your IT infrastructure.
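One way to keep an archive without forking is to mirror the upstream repository unchanged and point the module source at a release tag in the mirror. This is only a sketch under assumptions: the host `git.example.internal`, the repository path, and the tag `v1.4.2` are hypothetical, and the mirror is assumed to be an unmodified copy of upstream.

```hcl
# Use an unmodified internal mirror of an upstream module.
# Host, repository path, and tag are illustrative placeholders.
module "web_app" {
  # Git sources are pinned with the ref query parameter.
  # The version argument applies only to registry sources.
  source = "git::https://git.example.internal/mirrors/terraform-web-app.git?ref=v1.4.2"
}
```

Because the mirror carries no local changes, moving to a newer upstream release means syncing the mirror and updating the `ref`, with nothing to merge or rebase.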
However, there are some good reasons to fork the module:
- Your use case requires some customization that can be done only via a fork
- The upstream project decided not to include your patches
- Due to the urgency of your project, a temporary fork is required. This is an extreme situation, so tread with caution even with a temporary fork.
The same logic can be applied to Ansible modules, roles, and collections as well. You can pin the role or collection version in your project.
Let go of the fear of disappearing software.