Vault organizational concepts
At the highest level, we have a Vault cluster. There are two deployment options: self-managed, where you install and run Vault manually, and HCP Vault, where we handle the deployment and management of Vault on your behalf, including seamless upgrades.
In both self-managed and HCP Vault, a cluster begins with a primary namespace: "root" for self-managed Vault, and "admin" for HCP Vault. It is important to note that in HCP Vault, the root namespace is reserved for HCP control plane management and is not accessible to end users, even though both self-managed and HCP Vault use the same underlying Vault software.
Once in the admin namespace, an admin can create multiple namespaces that behave like a folder structure; that is, namespaces can be nested (as shown in the diagram above). Each namespace can store:
- Secret Engines: Provide an interface for storing, retrieving, and controlling access to sensitive data, with each type of secret engine handling specific kinds of secrets or workflows, such as key/value storage, database credential issuance, and more.
- Auth Methods: The mechanisms that users, machines, or applications use to authenticate to a Vault cluster.
- Policies: The permissions assigned to authenticated users, machines, or applications, specifying which secrets they can access.
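The folder-like nesting described above can be sketched as a small data model. This is only an illustration in Python, not a Vault API object: the `Namespace` class, its field names, and the example namespace names (`business-unit-a`, `team-1`) are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Namespace:
    """Illustrative model of a Vault namespace (hypothetical, not a Vault API type)."""
    name: str
    # Exclude the parent from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Namespace"] = field(default=None, repr=False, compare=False)
    secret_engines: List[str] = field(default_factory=list)  # mount paths, e.g. "kv/"
    auth_methods: List[str] = field(default_factory=list)    # mount paths, e.g. "oidc/"
    policies: List[str] = field(default_factory=list)        # policy names
    children: List["Namespace"] = field(default_factory=list)

    def add_child(self, name: str) -> "Namespace":
        """Create a nested namespace, like a subfolder."""
        child = Namespace(name=name, parent=self)
        self.children.append(child)
        return child

    def full_path(self) -> str:
        """Walk up the parent chain and join the names with '/'."""
        parts = []
        node: Optional[Namespace] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


admin = Namespace("admin")
team = admin.add_child("business-unit-a").add_child("team-1")
team.secret_engines.append("kv/")
print(team.full_path())  # admin/business-unit-a/team-1
```

Each level keeps its own secret engines, auth methods, and policies, which is what lets a sub-namespace be administered independently of its siblings.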
As depicted in the diagram below, we suggest that you configure the auth method used for administration within the root namespace and reserve the root namespace for operators with super admin access. This protects cluster-level configuration, some of which can only be changed from within the root namespace.
The admin namespace is the organization-level namespace where regular administrators and all other users log in to Vault. This is where administrators define the policies that grant these human users access to the different sub-namespaces. The rationale for this recommendation is to consolidate all human access points into a single, centralized location within the cluster. Typically, organizations use an identity management platform such as Active Directory or Okta to manage human user access.
Configuring Vault using Terraform
We recommend using Terraform to automate the configuration of Vault, such as setting up namespaces and secret mounts. On a small scale, configuring Vault manually may be manageable. However, as the scale and complexity of your use case increase, you will need some form of automation to keep up with the growing management overhead.
For example, you will most likely need to map many OIDC provider groups to Vault external groups across multiple environments (dev, test, staging, prod, etc.). This can quickly become a significant operational overhead without an automated process.
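To give a sense of how quickly this mapping grows, the sketch below expands a small table of OIDC groups into one external-group definition per environment. It is written in Python purely for illustration; the group names, policy names, and the `plan_external_groups` helper are hypothetical. In a Terraform workflow, data of this shape would typically drive the Vault provider's `vault_identity_group` (with `type = "external"`) and `vault_identity_group_alias` resources.

```python
from itertools import product
from typing import Dict, List

ENVIRONMENTS = ["dev", "test", "staging", "prod"]

# Hypothetical mapping: OIDC provider group -> Vault policies to attach.
OIDC_GROUP_POLICIES: Dict[str, List[str]] = {
    "vault-admins": ["namespace-admin"],
    "payments-developers": ["payments-kv-read"],
}


def plan_external_groups(environments: List[str],
                         group_policies: Dict[str, List[str]]) -> List[dict]:
    """Return one external-group definition per (environment, OIDC group) pair."""
    plan = []
    for env, (oidc_group, policies) in product(environments, group_policies.items()):
        plan.append({
            "environment": env,           # which Vault cluster/workspace this applies to
            "group_name": oidc_group,     # name of the Vault external group
            "type": "external",
            "alias_name": oidc_group,     # must match the group value sent by the OIDC provider
            "policies": list(policies),
        })
    return plan


plan = plan_external_groups(ENVIRONMENTS, OIDC_GROUP_POLICIES)
print(len(plan))  # 8 definitions for just 2 groups across 4 environments
```

Even this toy input yields eight definitions to create and keep in sync, which is why doing it by hand does not scale.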
The Vault Terraform provider allows Terraform to read from, write to, and configure HashiCorp Vault. See the Vault provider documentation for more details.
The Vault Terraform provider enables you to take an infrastructure as code (IaC) approach to your Vault configurations. This provides many benefits, including increased productivity, repeatable processes, and fewer human errors. While other infrastructure automation tools such as Ansible can be used to automate Vault configuration, we recommend the Terraform provider because it is tested and supported by HashiCorp and is updated regularly to support new Vault features. Terraform also has a robust state management system that tracks the current configuration of your Vault clusters. For more information on adopting Terraform, please take a look at the HashiCorp Validated Design for Terraform.
We recommend that the Vault admin team set up a “Vault-Admin-Config” repository in a version control system (VCS) such as Git, where Vault configuration is stored as Terraform files. Every configuration change is committed to the repository first and then applied to Vault from there.
This allows configuration changes to be promoted through the various application lifecycle environments.
Set up Vault for consumption by various business units
When Vault is configured as recommended in the “Vault Organizational Concepts” section, all users across the various business units authenticate via a single top-level auth method. Their group membership dictates which namespaces, and which secret engine paths within those namespaces, they can access.
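As a mental model of how group membership translates into access, consider the simplified sketch below. Vault itself performs this evaluation through the policies attached to identity groups; the Python here only illustrates the idea. The group names, namespaces, and paths are hypothetical, and the matcher handles only exact paths and a trailing `*`, which is a deliberate simplification of Vault's policy path matching.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical grants: group -> list of (namespace, path pattern) pairs.
GROUP_GRANTS: Dict[str, List[Tuple[str, str]]] = {
    "payments-developers": [("admin/payments", "kv/data/checkout/*")],
    "risk-analysts": [("admin/risk", "kv/data/models/*")],
}


def path_allowed(pattern: str, path: str) -> bool:
    """Simplified matcher: exact match, or prefix match when the pattern ends in '*'."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern


def can_access(user_groups: Iterable[str], namespace: str, path: str,
               grants: Dict[str, List[Tuple[str, str]]] = GROUP_GRANTS) -> bool:
    """Return True if any of the user's groups grants the path in that namespace."""
    for group in user_groups:
        for granted_ns, pattern in grants.get(group, []):
            if granted_ns == namespace and path_allowed(pattern, path):
                return True
    return False


print(can_access(["payments-developers"], "admin/payments", "kv/data/checkout/api-key"))  # True
print(can_access(["payments-developers"], "admin/risk", "kv/data/models/scoring"))        # False
```

The user authenticates once at the top level; everything after that is decided by which groups they belong to.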
Environments
Separate from your production Vault environment, we recommend deploying an additional cluster as a testing environment. A testing environment should be able to withstand longer periods of downtime without business impacts, allowing for the safe examination and troubleshooting of new configurations, policies, and updates without putting your business at risk. The configuration of the testing environment should mirror the configuration of the production environment as closely as possible in order to ensure that any changes being tested will produce the same behavior in both environments.
You may also wish to provision additional Vault clusters for additional environments separate from production. For example, you may wish to have a separate Vault cluster for a development or staging environment. This decision should be based on your organization's own availability requirements and security policies.
For example, one organization may have CI/CD pipelines for a development environment that are still critical to producing and deploying new software, and therefore may need to be associated with the production Vault cluster to guarantee high availability and security. Another organization may have extremely strict security requirements for secrets associated with its production environment, and therefore may decide to run a separate Vault cluster for its development environment to prevent any commingling of production and development secrets. These considerations can help you decide how many distinct Vault environments are right for your organization.
Important workflows
- App team onboards a new app to Vault.
- App team adds a new secret.
- App team edits/rotates a secret.
- Application consumes a secret.
Onboarding workflow
The Vault onboarding experience should be as simple and pain-free as possible in order to drive adoption of the service. Like most other services, Vault has limits that can lead to service disruptions when exceeded. Therefore, we do not recommend a fully self-service workflow; put guard rails in place to prevent consumers from exceeding Vault’s limits.
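One way to picture such guard rails is as a validation step that runs before an onboarding request is accepted. The sketch below is a minimal Python illustration. The thresholds, the list of approved use cases, and the `OnboardingRequest` fields are all hypothetical values a platform team might choose; they are not Vault's documented limits.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds set by the platform team (NOT Vault's documented limits).
MAX_KV_PATHS_PER_APP = 50
MAX_TOKEN_TTL_HOURS = 24
ALLOWED_USE_CASES = {"static-secrets", "database-credentials"}


@dataclass
class OnboardingRequest:
    app_name: str
    use_case: str
    kv_paths: int
    token_ttl_hours: int


def validate_request(req: OnboardingRequest) -> List[str]:
    """Return a list of guard-rail violations; an empty list means the request passes."""
    errors = []
    if req.use_case not in ALLOWED_USE_CASES:
        errors.append(f"use case '{req.use_case}' is not pre-approved")
    if req.kv_paths > MAX_KV_PATHS_PER_APP:
        errors.append(f"{req.kv_paths} KV paths exceeds the limit of {MAX_KV_PATHS_PER_APP}")
    if req.token_ttl_hours > MAX_TOKEN_TTL_HOURS:
        errors.append(f"token TTL of {req.token_ttl_hours}h exceeds {MAX_TOKEN_TTL_HOURS}h")
    return errors


request = OnboardingRequest("checkout", "static-secrets", kv_paths=10, token_ttl_hours=8)
print(validate_request(request))  # []
```

A check like this could run in a CI pipeline or as part of pull request review, so that requests outside the agreed limits are caught before they reach Vault.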
Developing an onboarding process often depends on organizational preferences and technical tools. In general, it is recommended that the onboarding process be automated as much as possible to minimize human errors and to ensure scalability as adoption of Vault matures.
We recommend that you codify Vault configurations using an IaC tool such as Terraform. Below is an example of an onboarding workflow built around Terraform Enterprise/Cloud (TFE/C).
There are two personas in this workflow: Vault administrators and application owners. Vault administrators are part of the platform team and are responsible for:
- Vault cluster-level configuration, such as setting up audit logs and mounting auth methods and secret engines.
- Creating common Terraform modules for application owners to consume.
- Approving pull requests in the Vault consumer repositories.
Vault administrators manage cluster-level configurations using Terraform through a Vault admin configuration Git repository. The Vault admin configuration repository maps to multiple TFE/C workspaces using the VCS workflow, with one workspace per environment (for example, dev, staging, and prod). When a pull request is merged into the repository, TFE/C kicks off a workflow to apply the configuration changes. For more information on this workflow, please refer to the Terraform Operating Guide for Adoption.
In addition to cluster-level configuration, the Vault administrators are also responsible for creating common Terraform modules to be consumed by application owners. Each module should align to a pre-approved Vault use case and provision the resources necessary for consumers to access and work with Vault. For example, a module for consuming Vault static secrets would provision the appropriate Vault policies, identity groups, auth method roles, and the required paths within the KV secret engine. These modules serve as an interface to Vault, abstracting its configuration details from application owners. They should offer sensible defaults and expose only the necessary input parameters.
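To make the static secrets example more concrete, the sketch below renders the kind of policy document such a module might generate for a KV version 2 mount, where secret data lives under `data/` and metadata under `metadata/`. It is written in Python only to show the shape of the output; the `render_kv_v2_policy` helper and the `kv` / `checkout` names are hypothetical. In a Terraform module, the resulting text would typically be supplied as the `policy` argument of a `vault_policy` resource.

```python
from typing import List, Tuple


def render_kv_v2_policy(mount: str, app_path: str, read_only: bool = True) -> str:
    """Render a Vault policy (HCL text) scoped to one application's KV v2 subtree."""
    # Sensible default: read-only, unless the caller explicitly asks for write access.
    data_caps = ["read"] if read_only else ["create", "read", "update", "delete"]
    rules: List[Tuple[str, List[str]]] = [
        (f"{mount}/data/{app_path}/*", data_caps),              # secret values
        (f"{mount}/metadata/{app_path}/*", ["list", "read"]),   # listing and metadata
    ]
    lines: List[str] = []
    for path, caps in rules:
        caps_text = ", ".join(f'"{cap}"' for cap in caps)
        lines.append(f'path "{path}" {{')
        lines.append(f"  capabilities = [{caps_text}]")
        lines.append("}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


print(render_kv_v2_policy("kv", "checkout"))
```

The helper exposes only three inputs and defaults to read-only access, mirroring the guidance that modules should offer sensible defaults and hide Vault's configuration details from application owners.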
Application owners request access to Vault by using the common Terraform modules and making a pull request to a Vault consumer repository. Each application team requesting access to Vault should have its own consumer repository. The creation of the consumer repository could be initiated by an onboarding request form through ServiceNow, Jira, or simply a Google Form. The consumer repository is where Vault configuration is managed long-term for the application. Similar to the Vault admin configuration repository, each consumer repository maps to multiple environment-specific workspaces in TFE/C. Pull requests submitted to the consumer repository are reviewed and approved by the Vault administration team. Once the pull request is approved and merged, TFE/C kicks off a workflow to apply the changes to Vault. Application owners should then use the Vault UI, CLI, or API to manage the secret values themselves, so that secrets are never committed to the repository.
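As an illustration of managing a secret through the API rather than the repository, the sketch below builds (but does not send) a write request for a KV version 2 secret using only the Python standard library. The Vault address, namespace, mount, and secret path are hypothetical, and the helper name `build_kv_v2_write` is our own. The token is read from the `VAULT_TOKEN` environment variable so that it never appears in source control.

```python
import json
import os
import urllib.request
from typing import Dict


def build_kv_v2_write(vault_addr: str, namespace: str, mount: str,
                      secret_path: str, token: str,
                      data: Dict[str, str]) -> urllib.request.Request:
    """Build an HTTP request that writes a secret to a KV v2 mount."""
    url = f"{vault_addr.rstrip('/')}/v1/{mount}/data/{secret_path}"
    body = json.dumps({"data": data}).encode("utf-8")  # KV v2 wraps values in "data"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Vault-Token": token,
            "X-Vault-Namespace": namespace,  # target the application's namespace
            "Content-Type": "application/json",
        },
    )


req = build_kv_v2_write(
    vault_addr="https://vault.example.com:8200",   # hypothetical address
    namespace="admin/payments",                    # hypothetical namespace
    mount="kv",
    secret_path="checkout/api-key",
    token=os.environ.get("VAULT_TOKEN", ""),
    data={"api_key": "replace-me"},                # placeholder value, not a real secret
)
print(req.get_method(), req.full_url)
# To send it against a real cluster: urllib.request.urlopen(req)
```

Because the secret value is supplied at run time, the consumer repository only ever contains the Terraform configuration that grants access, not the secret itself.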