Introduction
HashiCorp Validated Designs (HVDs) give customers and partners recommendations to deliver a resilient, secure, and highly performant deployment of HashiCorp solutions on various platforms. This operations guide contains HashiCorp's recommendations for deploying Consul Enterprise so your organization can use Consul to discover services in any environment with identity-based networking. Following this guide will improve your organization's operational capabilities with Consul Enterprise and lay a solid foundation for future Consul use cases as your usage matures.
Note
The current version of this guide contains recommendations for operating Consul Enterprise on AWS EC2 with client workload deployments on AWS EKS. Future versions will include deployment options for other cloud providers, such as Azure and GCP, and for on-premises deployments. They will also include information for running clients on additional Kubernetes platforms.
Objectives
This Consul operating guide aims to help you deploy Consul Enterprise. Consul lets you register services in a centralized registry to discover, monitor, and track their health status. It also enables secure network connectivity between services across multiple cloud environments, runtimes, and platforms.
Deploying Consul will help your organization provide the following solutions:
- Identity-based networking provides a dynamic way of referencing services by their identity. This lets you quickly respond to scaling events without updating IP addresses for new instances inside your firewalls and load balancers.
- A multi-platform service discovery solution provides a better way for applications to discover each other across different deployment platforms.
By leveraging these solutions, your organization can:
- Reduce the time required to scale infrastructure. Dynamic service discovery and health status let services communicate securely with dynamically scaled infrastructure.
- Reduce the time required to release new features to consumers. Consul lets development teams test new features and route traffic between old and new versions based on traffic resolution and splitting rules.
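As a rough illustration of the first point, the sketch below filters a health-query result down to the instances whose checks are all passing. The sample data is invented for illustration and only mimics the general shape of a response from Consul's health endpoint (`/v1/health/service/<name>`). The function name `healthy_endpoints` is hypothetical and not part of any Consul client library.

```python
def healthy_endpoints(health_entries):
    """Return (address, port) pairs for instances whose checks all pass.

    `health_entries` is assumed to be a list of dicts shaped like the
    entries returned by Consul's /v1/health/service/<name> endpoint.
    """
    endpoints = []
    for entry in health_entries:
        checks = entry.get("Checks", [])
        # An entry with no checks counts as healthy in this sketch.
        if all(check.get("Status") == "passing" for check in checks):
            service = entry["Service"]
            # Fall back to the node address when the service has none set.
            address = service.get("Address") or entry["Node"]["Address"]
            endpoints.append((address, service["Port"]))
    return endpoints


# Invented sample data: two instances of a hypothetical "web" service.
SAMPLE = [
    {
        "Node": {"Node": "node-a", "Address": "10.0.1.10"},
        "Service": {"ID": "web-1", "Service": "web", "Address": "", "Port": 8080},
        "Checks": [{"Status": "passing"}, {"Status": "passing"}],
    },
    {
        "Node": {"Node": "node-b", "Address": "10.0.1.11"},
        "Service": {"ID": "web-2", "Service": "web", "Address": "10.0.1.11", "Port": 8080},
        "Checks": [{"Status": "passing"}, {"Status": "critical"}],
    },
]

if __name__ == "__main__":
    for address, port in healthy_endpoints(SAMPLE):
        print(f"{address}:{port}")
```

Because callers ask for a service by name and receive only healthy instances, a scaling event changes the result of the query rather than any firewall or load balancer configuration.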
Note
The recommendations in this solution guide are derived from different production Consul deployments. Before implementing these recommendations, carefully evaluate them and determine whether they suit your specific environment.
Prerequisites
This guide assumes that you are using Consul Enterprise (the self-managed, self-hosted version of HCP Consul), have a basic understanding of Consul, and have reviewed or implemented the Consul Enterprise Solution Design Guide.
Please note: In this document, Consul Enterprise refers to both the “Self-Managed” and the “HashiCorp-managed (HCP Consul)” versions of Consul.
Architecture assumptions
Many of the recommendations and best practices outlined in this document are platform-agnostic and could apply to any platform. Where platform-specific details matter, this document focuses on Consul Enterprise server agents deployed on Amazon EC2 instances, with client workloads running on either Amazon EC2 instances or Amazon Elastic Kubernetes Service (Amazon EKS) within a single AWS Region.
The following is a sample high-level architecture diagram that shows the Consul Enterprise deployment this Operating Guide recommends. The Consul Enterprise server agents are deployed by the Terraform module from the Solution Design Guide. Client workloads running in Amazon EKS are segregated into Consul Enterprise Admin Partitions corresponding to each Kubernetes cluster.
Maturity model
Organizations that successfully enable a cloud operating model follow a typical blueprint called a maturity model and rely on centralized cloud platform teams.
Cloud adoption journeys typically follow an established pattern that flows through three stages:
- Adopt: Many organizations start their cloud journey when individual teams independently deploy services and applications to the cloud. This leads to multiple workflows tailored to a particular team’s needs. There is no platform, limited knowledge sharing, and minimal cloud operating strategy.
- Standardize: As cloud usage increases, organizations often create platform teams to standardize how developers interact with the cloud. The platform team is responsible for creating central services around provisioning, security, networking, and application deployment. This process accelerates developer productivity by removing the manual tasks associated with deploying cloud resources. At the same time, it reduces risk by providing a centralized way to apply corporate governance and security policies to all cloud-based resources.
- Scale: Once established in a single cloud environment, platform teams can extend these workflows to other cloud vendors and across an organization’s private estate, creating a consistent platform and system across all development and deployment areas. The team implements enterprise-scale solutions to facilitate self-service cloud workflows across dozens or hundreds of teams.
The first step in adopting a cloud operating model at the networking layer is to find a solution that drives the discovery, registration, and connection of your services, applications, and environments. Organizations can then use this foundation to facilitate proper identity-based zero-trust networking policies and advanced networking systems, such as a service mesh.
HashiCorp Consul enables platform teams to overcome these networking challenges by providing an identity-based networking solution across their entire cloud estate.
- Adopt: Consul helps you discover, register, and connect all your services in your environment. This registry provides a “map” of what services are running, where they are, and their current health. Next, these connections can be secured based on service identities (not IP addresses), a key component in a zero-trust architecture. This enables proper authorization and access to only required services instead of full network access.
- Standardize: As network scale expands, network complexity grows exponentially. Consul can provide a service mesh as a central networking control plane to cope with this complexity. The Consul service mesh contains all necessary networking services bundled in one interface, including service discovery, secured service-to-service communication, and traffic management. This enables a networking platform that developers can use to leverage the secure network layer without needing to deploy manually or understand all of the underlying technology.
- Scale: Once your organization uses Consul to secure service-to-service communication across your cloud, the platform teams can extend your network to other clouds or on-prem infrastructure. This multi-cloud strategy improves resiliency and enables you to expand to new regions. In addition, platform teams can reduce the operational complexity of existing networking infrastructure through automation. Instead of a manual, ticket-based process, Consul automates network operations using Terraform to execute changes based on predefined tasks so product teams can independently deploy applications, and platform teams can rely on Consul to handle downstream automation requirements.
This operating guide focuses on the Adoption phase and will guide you through deploying and configuring Consul Enterprise for the following use cases:
- Multi-platform service registration and health checking
- Service catalog discovery
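To make these two use cases more concrete, here is a minimal sketch of a service registration with an HTTP health check, plus the DNS name a consumer might use to discover it. It assumes a payload shaped like the body of Consul's agent service registration endpoint (`/v1/agent/service/register`) and the default `consul` DNS domain. The helper names, the `web` service, the port, and the `/health` path are hypothetical values chosen for illustration, not recommendations from this guide.

```python
import json


def build_service_registration(name, port, health_path="/health",
                               interval="10s", timeout="2s"):
    """Build a registration payload with one HTTP health check.

    The field names follow the agent service registration API; every
    value here is an illustrative assumption.
    """
    return {
        "Name": name,
        "ID": f"{name}-{port}",
        "Port": port,
        "Check": {
            "Name": f"{name} HTTP health",
            "HTTP": f"http://localhost:{port}{health_path}",
            "Interval": interval,
            "Timeout": timeout,
        },
    }


def service_dns_name(name, domain="consul"):
    """Return the standard DNS lookup name for a service in the local datacenter."""
    return f"{name}.service.{domain}"


if __name__ == "__main__":
    payload = build_service_registration("web", 8080)
    # This JSON document could be sent to a local agent or kept as a definition file.
    print(json.dumps(payload, indent=2))
    print(service_dns_name("web"))
```

Keeping the health check in the same definition as the registration means a service is only discoverable together with a statement of what "healthy" means for it.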
Language and definitions
This documentation intentionally uses technology-agnostic terminology. However, some terms do not translate perfectly between the cloud providers. The following are the definitions of terms this document uses.
Cloud Provider Terms | Definition |
---|---|
Region | A physical location around the world with multiple clusters of data centers. |
Availability zone (AZ) | One or more discrete data centers within a region. Each AZ has redundant power, networking, and connectivity. |
Public subnet | A network accessible by application users. |
Private subnet | A network used by applications, but inaccessible by application users. |
Secrets Manager (SM) | System that can store secrets for bootstrapping. |
Virtual private cloud (VPC) | Software-defined cloud networking. |
Consul Term | Definition |
---|---|
Consul Enterprise | Self-managed / Self-Hosted version of HCP Consul |
Datacenter | A Consul datacenter is the smallest unit of Consul infrastructure that can perform basic Consul operations. This may coincide with cloud regions' boundaries. |
Partition or Admin Partition | A logical boundary within a single Consul datacenter that delineates unique network boundaries or teams. |
Peering or Cluster Peering | An established relationship between a Consul datacenter or admin partition and another datacenter or admin partition. |
Sample project plan
The following table itemizes this Adoption Operating Guide’s artifacts and their expected timelines to help project managers create their project plans. You can exclude some of these activities. Reach out to your HashiCorp account team for additional guidance. They can help you customize this plan to suit your specific requirements.
HashiCorp Validated Design (HVD) adoption stage deliverables
Deliverable | Brief Description | ~Time to Complete |
---|---|---|
People & process | ||
Define user data, roles, and authentication | Define how users interact with the self-service platform and configure the authentication methods, roles, and policies to apply. Create policies as code and document team organization. | 1-2 weeks |
Identify self-service workflow | Identify all personas and teams that will be consumers and producers of the platform. Document the workflow and specify the inputs and outputs required to integrate with all development processes. | 2-3 weeks |
Initial configuration | ||
Configure Consul as code | Automate the initial configuration of your security policies, admin partitions and namespaces, and consumer workflows. | 2 weeks |
Configure DNS forwarding | Set up DNS forwarding for local agents and remote systems to allow for consumer discovery workflows. | 1-2 weeks |
Configure platform monitoring | Configure exporting of telemetry and logging data and integrate with your monitoring platform. | 2 weeks |
Set up backup, restore, and upgrades | Familiarize the team responsible for operating Consul Enterprise with the operational procedures required to adopt new features and provide resilience in the event of unexpected issues. Test and validate the procedures. | 2 weeks |
Security operational procedures | Validate operational rotation procedures and ensure they align with your organizational security requirements. | 1-2 weeks |
Multi-platform service registration and health checking | ||
Build health checking workflow | Determine appropriate health checks for your applications to ensure you only direct discovery traffic to healthy instances. | 2-6 weeks |
Build service registration workflow | Targeting the first pilot consumers, build a service registration workflow that includes your health check definition. | 2-6 weeks |
Service catalog discovery | ||
Build consumer workflow | Identify all of the consumers for the initially targeted applications and ensure you can migrate them smoothly from the way they currently address services to the Consul service catalog. | 2-6 weeks |