Deployment (VM + container)

This page provides guidance on how to deploy Terraform Enterprise using Docker on a VM running on AWS, Azure or GCP using Terraform modules written and managed by HashiCorp (HVD Modules). The modules are designed to be used in conjunction with the Terraform CLI, are available in the HashiCorp Terraform Registry (see tabs below) and are the best practice method for deploying Terraform Enterprise on the respective cloud.

We expect that the reader has already reviewed the architecture page of this HVD and has a good understanding of the Terraform Enterprise architecture. If you have not done so, we recommend you review the architecture page before proceeding with the deployment. We also expect that the reader has a good understanding of Docker or Podman, and of deploying and managing virtual machines on public clouds.

Architectural summary

Terraform Enterprise is deployed on a VM running on AWS, Azure or GCP, irrespective of the deployment topology. If using Active/Active, additional VMs would be scaled to and these should run in different availability zones (AZs) in the same region.
Deploy one or more Terraform Enterprise container(s) onto respective VMs depending on your chosen deployment topology.
Use managed versions of object storage, PostgreSQL database. If using Active/Active, use a managed Redis cache (as offered by your public cloud, specifically not Redis Cluster or Redis Sentinel), to ensure replicas are distributed in different AZs.
Use a layer 4 load balancer to ingress traffic to the Terraform Enterprise instances. This is because:
- Certificates need to be on the compute nodes for Terraform Enterprise to work.
- It is more secure to terminate the TLS connection on Terraform Enterprise rather than outside it and have to re-encrypt traffic from the inside of the load balancer requiring an additional certificate.
- It is more straight forward to manage than a layer 7 load balancer.
By using three AZs (one Terraform Enterprise pod in each AZ), the system has an n-2 failure profile, surviving failure of two AZs. However, if the entire region is unavailable, then there will be an outage of Terraform Enterprise. Currently, the application architecture is single-region.
We do not recommend that Terraform Enterprise be exposed for ingress to the public Internet. Users should be on the company network to be able to access the Terraform Enterprise API/UI. However, we recommend Terraform Enterprise is allowed to access certain addresses on the Internet(opens in new tab) in order to access:
- The HashiCorp container registry - where the Terraform Enterprise container is vended from
- HashiCorp service APIs (all owned and exclusively operated by HashiCorp except Algolia):
  - registry.terraform.io houses the public Terraform module registry which enterprise customers will want to avoid letting users have unfettered access to as it contains community-contributed modules and providers uncertified by HashiCorp. However, this is where official providers are indexed.
  - releases.hashicorp.com is where HashiCorp host Terraform binary releases. We recommend users stay within two minor releases of latest in order to access the latest security updates, and new features.
  - reporting.hashicorp.services is where we aggregate license usage and as such we strongly recommend including this in egress allow lists in order to ensure our partnership with your organization can be right-sized for your needs going forward.
  - Algolia(opens in new tab) - The Terraform Registry uses Algolia to index the current resources in the registry.
- Additional outbound targets for VCS/SAML etc. depending on the use case.
- Public cloud cost estimation APIs as necessary.
This recommended architecture requires Terraform Enterprise v202309-1 or later. Review the main public documentation(opens in new tab) if you have not already.
The HVD Modules all include example code which you can use to deploy Terraform Enterprise on RHEL (Podman) or Ubuntu (Docker) onto new VM instances, which will be configured with the container runtime appropriate for the operating system chosen.

In some regulated environments, outbound access is limited or totally inaccessible from server environments. If you need to run Terraform Enterprise fully air gapped from the Internet, it would be regularly necessary to manually download provider and Terraform binary versions and host them in the Terraform Enterprise registry when these are released, in order to offer them to your users.

In order to allow Terraform Enterprise access to the public registry, but prevent your user base from accessing community content, we recommend using Sentinel or OPA as part of platform development in order to limit which providers are signed-off for use.

Terraform modules for installation

The primary route to the installation of Terraform Enterprise on VMs using a container runtime is through the use of HVD Modules which require the use of a Terraform CLI binary(opens in new tab). These modules are available in the public HashiCorp Terraform Registry and are linked in the tabs below.

The installation code we provide has been used by HashiCorp Professional Services and HashiCorp partners to set up best practice Terraform Enterprise instances. We strongly recommend leveraging partners and/or HashiCorp Professional Services to accelerate the scaling out of your project.

If you will be installing Terraform Enterprise yourself, we recommend that you follow these high-level steps:

Import the provided Terraform Enterprise modules into your VCS repository.
Establish where to store the Terraform state for the deployment. HashiCorp recommends that, for simplicity, you store the state in cloud-based object store service (S3, Blob Storage, GCS etc.) for which HVD Modules all contain a backend.tf file which you can use to configure the remote state storage location as this is more secure than using the local backend. Free access in HCP Terraform for this can also be used.
Select a machine where the Terraform code will be executed. This machine will need to have the Terraform CLI(opens in new tab) available.
Ensure that cloud credentials are instantiated in the shell used on the machine for Terraform execution.

Note

Terraform state contains sensitive information and needs to be protected. We do not recommend that you store the state file for a Terraform Enterprise deployment in VCS or any unprotected location. It is the only state you will need to separately secure, as all other state generated by your organization will be protected by Terraform Enterprise.

Terraform Enterprise license

Before starting the Terraform Enterprise installation, make sure you have a valid Terraform Enterprise license. If you do not, first reach out to your HashiCorp account team to request the license file. When you receive it, save it as terraform.hclic, and protect it as a company asset.

Ensure that the license file contains a single line with no new line character. Run the following command and ensure the output is 0.

$ wc -l terraform.hclic
       0 terraform.hclic

Process overview

The layout of the HVD Module GitHub repositories follows a standard structure exhibiting these features:

The Terraform code is separated into logical .tf files at the top level of the repository, without the need for submodules and without calls to external child modules. This keeps the codebase whole, simple, and easy to understand.
The main repository README.md file contains the primary instructions which should be read through and then followed closely in order to deploy Terraform Enterprise.
Subdirectories in the respective repository include:
- docs - Auxiliary documentation for the module.
- examples - This directory contains more than one example use case, each of which pertain to a root module which, when configured and run, will use the module to deploy Terraform Enterprise. Expect to run at least the initial development deployment from one of these subdirectories. The names of the subdirectories reflect the specific use case for each example e.g. podman-rhel-internal-lb refers to the deployment of the product onto a RHEL operating system using Podman as the container runtime, with an internal load balancer; similarly, new-gke refers to the deployment of the product onto a new instance of Google Kubernetes Engine which the module will install as part of the product deployment.
- templates - Contains HCL templates used by the module as needed.
- tests - Contains deployment configuration used to test the module contents.

To deploy Terraform Enterprise using the provided modules, you will need to:

Select the relevant tab below for your relevant cloud provider, and then follow the link to the respective public Terraform Registry entry.
In the Registry, review the contents, then click on the Source Code link in the page header to point your browser to the GitHub repository.
Read the GitHub repository for the respective Terraform module in its entirety. Not doing so may result in a failed deployment.
Follow the repository README.md file step by step, ensuring you have all prerequisites in place before starting the deployment; these may take some time to arrange in your environment and should be accounted for in project planning.
Ensure you have the TLS certificate and private key for your Terraform Enterprise installation. The DNS SAN in the certificate should include the FQDN you will use in DNS for the service (which will resolve to the NLB). We also expect you will have a standard organizational CA bundle and process for generating these which we recommend using. We do not recommend self-signed certificates, especially not in production environments. Inspect your certificate with this command.
```
$ openssl x509 -noout -text -in cert.pem
```
The README.md will direct you to complete the configuration and deploy Terraform Enterprise using the terraform init, terraform plan and terraform apply commands. When using the example terraform.tfvars.example file, remember to remove angled brackets from the values of key-value pairs in the resulting terraform.tfvars file.
Once the Terraform run completes, ensure to login to the VM and tail the automated application setup logs, watching for errors. Then return to the README.md to complete the task.
As soon as the software is installed, you will have sixty minutes to access the Initial Admin Console Token (IACT) which is generated during the installation process. As part of your preparation, read the HashiCorp public documentation for using the IACT(opens in new tab) (Initial Admin Console Token) to create the first admin user. This is also linked at the end of the main HVD Module README.md as next steps.

Container-specific guidance

More detailed guidance on the deployment of Terraform Enterprise on Docker is provided in this section.

General guidance

Ensure that project planning allows time and resource for scale testing as close to the degree of scale that is eventually expected in production. We recommend working with your earlier adopter teams in order to understand the expected scale and to ensure that the machines are sized appropriately during development and testing.

Ensure that observability tooling is also in place before load testing so that CPU, RAM and I/O constraints can be understood fully in your specific context, particularly in terms of connectivity to external services.

CPU

At high concurrency, HCP Terraform agent workload may pressure network throughput and is sensitive to the over-allocation of CPU. Memory-optimised instances have been evaluated but are unable to provide sufficient CPU, resulting in possible HCP Terraform agent workspace run failures.

RAM

Memory sizing is the most workload-dependent and RAM usage is driven by the Terraform configuration executed. To size conservatively, start with the system defaults and test thoroughly, preferably using representative workloads, and increase limits as necessary.

The default HCP Terraform agent run pipeline configures a container resource request for every agent at 2GB, so if the instance is not appropriately sized for this reservation, physical memory over-allocation will cause run failure, although this can be adjusted using the TFE_CAPACITY_MEMORY environment variable directive(opens in new tab). The conservative approach is to size on the limit so this can be tuned down carefully if cost efficiency is a priority. If it needs to be uniformly increased, this should be tested exhaustively. If only some workspaces require increases on this limit, consider running a agent pool for those workspaces, where dedicated hardware can run agents with higher resource availability as needed.

Determine maximum HCP Terraform agent RAM requirement and production system overhead

The number of concurrent workspace runs is managed by the TFE_CAPACITY_CONCURRENCY variable, and at the default, RAM requirement equates to TFE_CAPACITY_CONCURRENCY * 2GB. In the nominal example above, with a requirement of 30 agents (a workspace run is run by an agent container), the maximum RAM requirement would be thus 60GB for workspace runs.

For right-sizing of the instance, at least ten percent overhead is prudent, making the total RAM requirement 66GB. Note that this calculation does not include OS requirements or for agents required to be run in discrete network segments outside of the instance, which would be sized separately based on specific graph calculation requirements.

Platform-specific guidance

Select the tab below for further guidance and corresponding machine size choice on your cloud service provider.

For cluster calculation below:

One node in each of three AZs.
Nominal Linux operating system RAM requirement of 2048MiB.
Terraform Enterprise requiring 4096MiB.
Cluster calculations in MiB based on the TFE_CAPACITY_MEMORY system default of 2048MiB.
Part GB calculations are rounded down.

Deployment considerations for AWS

The official Terraform Enterprise HVD Module for deployment onto VMs with Docker/Podman is available at this entry in Terraform Registry(opens in new tab).

Disk sizing

For AWS, we recommend using EBS gp3 volumes in your EC2 instance configuration because they provide the best performance and scale with a baseline performance of 3000 IOPS.

Machine sizing

For ideal CPU sizing:

Avoid instances with burstable CPU or network characteristics (i.e. T type).
Choose the latest generation general purpose instance type x86-64 hosts.
CPU/RAM ratio should be 1:4 or higher.
Do not use memory-optimized instances.

Approximate default Active/Active cluster calculation

Set HVD Module machine type to m7i.2xlarge (8 vCPU, 30517MiB).
30517MiB - 2048MiB - 4096MiB = 24373MiB RAM maximum spare for workspace runs (25GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 24373MiB / 2048MiB = 11.

Approximate scaled Active/Active cluster calculation

Set HVD Module machine type to m7i.4xlarge (16 vCPU, 61035MiB).
61035MiB - 2048MiB - 4096MiB = 54891MiB RAM maximum spare for workspace runs (57GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 54891MiB / 2048MiB = 26.

Database sizing

For AWS we recommend a starting instance of db.r6i.xlarge.

Cache sizing

For AWS we recommend a starting instance of cache.m5.large.

Deployment considerations for Azure

The official Terraform Enterprise HVD Module for deployment onto VMs with Docker/Podman is available at this entry in Terraform Registry(opens in new tab)

Disk sizing

We recommend use of Azure Premium SSD Managed Disks in your Azure instance configuration because they provide the best performance and scale with a baseline performance of 5000 IOPS.

Machine sizing

For ideal CPU sizing:

Avoid instances with burstable CPU or network characteristics (i.e. B type).
Choose the latest generation general purpose instance type x86-64 hosts.
CPU/RAM ratio should be 1:4 or higher
Do not use memory-optimized instances. We evaluated memory-optimized instances with a CPU/RAM ratio of 1:8 and these were found to be CPU-bound under load.

Approximate default Active/Active cluster calculation

Set HVD Module machine type to Standard_D8s_v4 (8 vCPU, 30517MiB).
30517MiB - 2048MiB - 4096MiB = 24373MiB RAM maximum spare for workspace runs (25GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 24373MiB / 2048MiB = 11.

Approximate scaled Active/Active cluster calculation

Set HVD Module machine type to Standard_D16s_v4 (16 vCPU, 61035MiB).
61035MiB - 2048MiB - 4096MiB = 54891MiB RAM maximum spare for workspace runs (57GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 54891MiB / 2048MiB = 26.

Database sizing

For Azure we recommend a starting instance of GP_Standard_D4ds_v4.

Cache sizing

For Azure we recommend a starting SKU of Premium, a family of P and a capacity of 1.

Deployment considerations for GCP

The official Terraform Enterprise HVD Module for deployment onto VMs with Docker/Podman is available at this entry in Terraform Registry(opens in new tab)

Disk sizing

We recommend use of GCP Balanced Persistent SSD Disks in your GCP instance configuration because they provide the best performance and scale with a baseline performance of 10000 IOPS.

Machine sizing

For ideal CPU sizing:

Avoid instances with burstable CPU or network characteristics (i.e. e2-, f1-, g1- types).
Choose the latest generation general purpose instance type x86-64 hosts.
CPU/RAM ratio should be 1:4 or higher
Do not use memory-optimized instances. We evaluated memory-optimized instances with a CPU/RAM ratio of 1:8 and these were found to be CPU-bound under load.

Approximate default Active/Active cluster calculation

Set HVD Module machine type to n2-standard-8 (8 vCPU, 30517MiB).
30517MiB - 2048MiB - 4096MiB = 24373MiB RAM maximum spare for workspace runs (25GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 24373MiB / 2048MiB = 11.

Approximate scaled Active/Active cluster calculation

Set HVD Module machine type to n2-standard-16 (16 vCPU, 61035MiB).
61035MiB - 2048MiB - 4096MiB = 54891MiB RAM maximum spare for workspace runs (57GB).
Maximum TFE_CAPACITY_CONCURRENCY can be set to 54891MiB / 2048MiB = 26.

Database sizing

For GCP we recommend a starting instance of db-custom-4-16384.

Cache sizing

For GCP we recommend a starting configuration of STANDARD_HA.

Personnel and access

Deployment (Managed Kubernetes)