Appendix 1: Admin partitions
Consul admin partitions let organizations define tenancy boundaries for services within a single Consul deployment, which helps when services are managed across multiple teams and business units. Each team can manage and customize its own Consul environment without impacting other teams or other Consul environments.
Some organizations want to allow business units to deploy their own installations of Consul Enterprise on their own Kubernetes clusters, but centrally managing many separate Consul installations is an operational challenge. Instead of giving each team its own server cluster, the organization can consolidate these installations onto a shared multi-tenant server cluster. The shared cluster serves as the control plane for Consul clients in the tenant clusters while keeping tenants isolated from one another. This deployment model gives teams the autonomy to configure Consul and application networking as they require, increases their flexibility to manage application deployments, and eliminates the operational overhead of managing individual server clusters.
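As a minimal sketch, a tenant (non-default partition) cluster opts into a named partition through the Helm chart's adminPartitions values; the partition name below is hypothetical, and complete server and tenant values files appear in Appendix 4.

global:
  enableConsulNamespaces: true
  adminPartitions:
    enabled: true
    name: "team-a"   # hypothetical partition name; the server cluster keeps "default"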
Appendix 2: Kubernetes catalog sync
Catalog sync allows you to sync services between Consul and Kubernetes, and must be enabled in the Helm chart before you can use it. The sync can run in one direction or in both directions. Refer to the operating guide for the available configuration options.
Kubernetes services are synced to the Consul catalog so that they are discoverable like any other service within the Consul datacenter, and services registered in Consul are synced into Kubernetes. This allows Kubernetes workloads to use native Kubernetes service discovery to find and connect to external services registered in Consul, and external services to use Consul service discovery to find and connect to Kubernetes services. Refer to the network connectivity section for the related Kubernetes configuration. Services synced from Consul to Kubernetes are discoverable with the built-in Kubernetes DNS once a Consul stub domain is deployed. When bidirectional catalog sync is enabled, it behaves like the two unidirectional setups combined.
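The sync direction is controlled in the Helm chart. The following is a minimal sketch using the chart's toConsul and toK8S values; the namespace filters are illustrative, and enabling both directions together gives the bidirectional behavior described above.

syncCatalog:
  enabled: true
  toConsul: true              # sync Kubernetes services into the Consul catalog
  toK8S: true                 # sync Consul services into Kubernetes
  k8sAllowNamespaces: ['*']   # illustrative: sync from all Kubernetes namespaces
  k8sDenyNamespaces: ['kube-system', 'kube-public']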
Appendix 3: Gossip protocol
Consul uses a gossip protocol to manage membership and broadcast messages to the cluster. The protocol, membership management, and message broadcasting are provided through the Serf library. The gossip protocol used by Serf is a modified version of the SWIM (Scalable Weakly-consistent Infection-style Process Group Membership) protocol.
Consul uses a LAN gossip pool and a WAN gossip pool for different functions, both built on the embedded Serf library. Consul abstracts the library away to simplify the user experience, but developers may find it useful to understand how the library is used.
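Gossip traffic between agents can be encrypted with a shared symmetric key. The following is a minimal sketch of enabling gossip encryption through the Helm chart; the commented secret name and key are hypothetical placeholders for an existing Kubernetes secret.

global:
  gossipEncryption:
    # Let the chart generate and manage the gossip encryption key.
    autoGenerate: true
    # Alternatively, reference an existing Kubernetes secret (hypothetical names):
    # secretName: consul-gossip-encryption-key
    # secretKey: key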
Appendix 4: Sample Terraform and Helm configuration
Consul server
Terraform configuration
# Nodes tainted for running Consul servers
consul_server = {
  name           = "consul_server"
  instance_types = var.consul_server_node_type
  min_size       = 1
  max_size       = 5
  desired_size   = var.consul_server_node_count

  # Nodes only for Consul server agents; excluding other agents
  taints = {
    dedicated = {
      key    = "consul_agent_type"
      value  = "server"
      effect = "NO_SCHEDULE"
    }
  }
  labels = {
    consul_agent_type = "server"
  }
}
Resource requirements (Helm)
# Configure your Consul servers in this section.
server:
  # Specify three servers that wait until all are healthy to bootstrap the Consul cluster.
  replicas: 3
  # Specify the resources that servers request for placement. These values will serve a large environment.
  resources:
    requests:
      memory: '32Gi'
      cpu: '4'
      disk: '50Gi'
    limits:
      memory: '32Gi'
      cpu: '4'
      disk: '50Gi'
Consul client
Terraform configuration
# Nodes to deploy services and the Consul client
consul_workload = {
  name           = "consul_workload"
  instance_types = var.consul_workload_node_type
  min_size       = 0
  max_size       = 200
  desired_size   = var.consul_workload_node_count

  labels = {
    app = "client - Data Plane"
  }
}
Resource requirements (Helm)
# Configure Consul clients in this section.
client:
  # Specify the resources that clients request for deployment.
  resources:
    requests:
      memory: '8Gi'
      cpu: '2'
      disk: '15Gi'
    limits:
      memory: '8Gi'
      cpu: '2'
      disk: '15Gi'
Helm configuration - load balancer
# ELB or Classic Load Balancer
---
global:
  name: consul
  datacenter: dc1
ui:
  enabled: true
  service:
    type: LoadBalancer

# NLB: Network Load Balancer
---
global:
  name: consul
  datacenter: dc1
ui:
  enabled: true
  service:
    type: LoadBalancer
  ingress:
    enabled: true
    ingressClassName: alb
    hosts:
      - host: consul-ui.test.consul.domain
    annotations: |
      'alb.ingress.kubernetes.io/certificate-arn': 'arn:aws:acm:us-east-2:01234xxxxxxx:certificate/f36b75c3-xxxx-40ca-xxx-3a2fad7f419d'
      'alb.ingress.kubernetes.io/listen-ports': '[{"HTTPS": 443}]'
      'alb.ingress.kubernetes.io/backend-protocol': 'HTTPS'
      'alb.ingress.kubernetes.io/healthcheck-path': '/v1/status/leader'
      'alb.ingress.kubernetes.io/group.name': 'envname-consul-server'
---
global:
  name: consul
  datacenter: dc1
ui:
  enabled: true
  service:
    type: LoadBalancer
    annotations: |
      'service.beta.kubernetes.io/aws-load-balancer-type': "external"
      'service.beta.kubernetes.io/aws-load-balancer-nlb-target-type': "instance"
ELB_IRSA - Terraform configuration
resource "kubernetes_service_account" "lb_controller" {
metadata {
name = "aws-load-balancer-controller"
namespace = "kube-system"
labels = {
"app.kubernetes.io/component" = "controller"
"app.kubernetes.io/name" = "aws-load-balancer-controller"
}
annotations = {
"eks.amazonaws.com/role-arn" = module.lb_irsa.iam_role_arn
}
}
depends_on = [
module.eks
]
}
AWS Load Balancer Controller - Terraform configuration
resource "helm_release" "lb_controller" {
name = "aws-load-balancer-controller"
repository = "https://aws.github.io/eks-charts"
chart = "aws-load-balancer-controller"
namespace = "kube-system"
set {
name = "clusterName"
value = module.eks.cluster_name
}
set {
name = "serviceAccount.create"
value = false
}
set {
name = "serviceAccount.name"
value = kubernetes_service_account.lb_controller.metadata[0].name
}
}
EBS CSI driver
Terraform configuration for installing an EBS CSI driver.
data "aws_iam_policy" "ebs_csi_policy" {
arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}
module "irsa-ebs-csi" {
source =
"terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "5.27.0"
create_role = true
role_name =
"AmazonEKSTFEBSCSIRole-${module.eks.cluster_name}"
provider_url = module.eks.oidc_provider
role_policy_arns = [data.aws_iam_policy.ebs_csi_policy.arn]
oidc_fully_qualified_subjects =
["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
}
resource "aws_eks_addon" "ebs-csi" {
cluster_name = module.eks.cluster_name
addon_name = "aws-ebs-csi-driver"
addon_version = "v1.24.0-eksbuild.1"
service_account_role_arn = module.irsa-ebs-csi.iam_role_arn
tags = {
"eks_addon" = "ebs-csi"
"terraform" = "true"
}
preserve = false
}
Consul server - Helm configuration
global:
  name: consul
  image: "hashicorp/consul-enterprise:1.16.X-ent"
  datacenter: default
  adminPartitions:
    enabled: true
    name: "default"
  acls:
    manageSystemACLs: true
  enableConsulNamespaces: true
  enterpriseLicense:
    secretName: consul-enterprise-license
    secretKey: license
    enableLicenseAutoload: true
  peering:
    enabled: true
  tls:
    enabled: true
server:
  replicas: 3
  bootstrapExpect: 3
  exposeService:
    enabled: true
    type: LoadBalancer
  extraConfig: |
    {
      "log_level": "TRACE"
    }
syncCatalog:
  enabled: true
  k8sAllowNamespaces: ["*"]
  consulNamespaces:
    mirroringK8S: true
connectInject:
  enabled: true
  transparentProxy:
    defaultEnabled: false
  consulNamespaces:
    mirroringK8S: true
  k8sAllowNamespaces: ['*']
  k8sDenyNamespaces: []
  apiGateway:
    managedGatewayClass:
      serviceType: LoadBalancer
meshGateway:
  enabled: true
  replicas: 3
  service:
    enabled: true
    type: LoadBalancer
ui:
  enabled: true
  service:
    enabled: true
    type: LoadBalancer
Non-default partition Kubernetes cluster - Helm configuration
global:
  name: consul
  datacenter: default
  enabled: false
  image: "hashicorp/consul-enterprise:1.16.X-ent"
  imageK8S: hashicorp/consul-k8s-control-plane:1.0.2
  imageConsulDataplane: "hashicorp/consul-dataplane:1.0.0"
  enableConsulNamespaces: true
  adminPartitions:
    enabled: true
    name: "prod-partition-1"
  peering:
    enabled: true
  tls:
    enabled: true
    caCert:
      secretName: consul-ca-cert
      secretKey: tls.crt
    caKey:
      secretName: consul-ca-key
      secretKey: tls.key
  acls:
    manageSystemACLs: true
    bootstrapToken:
      secretName: consul-bootstrap-acl-token
      secretKey: token
externalServers:
  enabled: true
  hosts: ["a7d8d3f12cdfb4783af0357050e95416-secondryTest.us-east-1.elb.amazonaws.com"] # External IP (or DNS name) of the exposed Consul servers
  tlsServerName: server.default.consul # <server>.<datacenter>.<dns>
  k8sAuthMethodHost: "CE8490A70630FBF25B9DsecondryTest.gr7.us-east-1.eks.amazonaws.com" # DNS name of the EKS API server for the prod-partition-1 client cluster
  httpsPort: 8501
  grpcPort: 8502
  useSystemRoots: false
server:
  enabled: false
connectInject:
  transparentProxy:
    defaultEnabled: true
  enabled: true
  default: true
  apiGateway:
    managedGatewayClass:
      serviceType: LoadBalancer
meshGateway:
  enabled: true
  replicas: 3
  service:
    enabled: true
    type: LoadBalancer
Agent telemetry
The Consul agent collects various runtime metrics about the performance of different libraries and subsystems. These metrics are aggregated on a ten-second (10s) interval and retained for one minute. An interval is the period of time between instances of data being collected and aggregated.
When telemetry is being streamed to an external metrics store, the interval is defined to be that store's flush interval.
External store | Interval |
---|---|
dogstatsd | 10s |
Prometheus | 60s |
statsd | 10s |
Consul emits metrics under two major categories: Consul health and server health.
Consul health:
- Transaction timing
- Leadership changes
- Autopilot
- Garbage collection
Server health:
- File descriptors
- CPU usage
- Network activity
- Disk activity
- Memory usage
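As a minimal sketch, the Helm chart can expose these agent metrics so that an external store such as Prometheus can scrape them; the retention value mirrors the one-minute retention described above, and the exact settings depend on your metrics pipeline.

global:
  metrics:
    enabled: true                      # expose Consul metrics endpoints
    enableAgentMetrics: true           # include agent runtime metrics
    agentMetricsRetentionTime: "1m"    # matches the one-minute retention noted above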
Consul telemetry collector intention
Create a service-intentions configuration entry that allows all traffic to consul-telemetry-collector:
# consul-telemetry-collector.yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: consul-telemetry-collector
spec:
  destination:
    name: consul-telemetry-collector
  sources:
    - action: allow
      name: '*'
Create the configuration entry:
kubectl apply --namespace consul --filename consul-telemetry-collector.yaml
Consul server sizing - EC2 instance types by Consul cluster size
Provider | Size | Instance Type | CPU (cores) | Memory (GB) | Disk Capacity | Disk IO |
---|---|---|---|---|---|---|
AWS | Small | m5.large | 2 | 8 | min: 100 GB (gp3) | min: 3000 IOPS |
AWS | Medium | m5.xlarge | 4 | 16 | min: 100 GB (gp3) | min: 3000 IOPS |
AWS | Large | m5.2xlarge | 8 | 32 | min: 200 GB (gp3) | min: 7500 IOPS |
AWS | Extra Large | m5.4xlarge | 16 | 64 | min: 200 GB (gp3) | min: 7500 IOPS |
Agents
You can run the Consul binary to start Consul agents, which are daemons that implement Consul control plane functionality. You can start agents as servers or clients.
terraform.auto.tfvars - Terraform deployment
friendly_name_prefix = "consul"
common_tags = {
deployment = "consul"
site = "westeros"
}
route53_failover_record = {
record_name = "consul"
}
secretsmanager_secrets = {
license = {
name = "consul-license"
data = ""
}
ca_certificate_bundle = {
name = "consul-ca-bundle"
path = "./consul-agent-ca.pem"
}
cert_pem_secret = {
name = "consul-public"
path = "./consul-server-public.pem"
}
cert_pem_private_key_secret = {
name = "consul-private"
path = "./consul-server-private.pem"
}
consul_initial_management_token = {
generate = true
}
consul_agent_token = {
generate = true
}
consul_gossip_key = {
generate = true
}
consul_snapshot_token = {
generate = true
}
consul_ingress_gw_token = {
generate = true
}
consul_terminating_gw_token = {
generate = true
}
consul_mesh_gw_token = {
generate = true
}
}
snapshot_interval = "5 min"
s3_buckets = {
snapshot = {
bucket_name = "consul-westeros-snapshots"
force_destroy = true
}
}
route53_zone_name = "test.aws.sbx.hashicorpdemo.com"
ssh_public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCX/57xBO3ZBhWFnHXcO0+DOKyrajTWyvOlxFHUQ/PlH9iNqog4XIWkYlG/3f0ctl61IR0InrH2PRYYctlR4HIeMWO2cbJcC8PovWlaB9nHU3rb16JWsx47C48R6iurTxyvYHkeYPbjYlicqztwMbvdh55jw+vOTZCM85Ni+burz1dxSTYh164rsB2WzRL+G/c74D5L6ufOnY6k9VTlf9VGpZ6Zh72xmm9IKyHwO6t518Ht5QZdQtBPKEjbGMByLSHPBsw1ceq1P+r315YfH7rYR11DnDNDpkrf87RB5nC9TiukMlz53MtW6vdPzThB/XlupqWDjwdlQGmU9BnGMu+jz0eWtUIQkaTANQXxtQAgv/YvuAq2QuRsd/lRLwR49fRbUXy3VRThYVu25oZsvPgknsY4ZarTYh1d65C2qrVVvoEYdnx4w+rBQWWludOhvcwfz5edpvxIoUh9ksdWog1kMlr8fFUCQepCPUF8ObM69sXjJv9sdM3GpGiGtUinda8="
iam_resources = {
  ssm_enable              = true
  cloud_auto_join_enabled = true
  log_forwarding_enabled  = true
  role_name               = "consul-role"
  policy_name             = "consul-policy"
}
rules = {
  consul = {
    server = {
      rpc = {
        enabled   = true
        self      = true
        target_sg = "agent"
      }
      serf_lan_tcp = {
        enabled   = true
        self      = true
        target_sg = "agent"
      }
      serf_lan_udp = {
        enabled   = true
        self      = true
        target_sg = "agent"
      }
      dns_tcp = {
        enabled       = true
        self          = true
        bidirectional = true
      }
      dns_udp = {
        enabled       = true
        self          = true
        bidirectional = true
      }
      https_api = {
        enabled       = true
        self          = true
        bidirectional = false
      }
      grpc = {
        enabled       = true
        self          = true
        bidirectional = false
      }
      grpc_tls = {
        enabled       = true
        self          = true
        bidirectional = false
      }
    }
    agent = {
      rpc = {
        enabled = true
        self    = true
      }
      serf_lan_tcp = {
        enabled = true
        self    = true
      }
      serf_lan_udp = {
        enabled = true
        self    = true
      }
      dns_tcp = {
        enabled       = true
        self          = true
        bidirectional = true
      }
      dns_udp = {
        enabled       = true
        self          = true
        bidirectional = true
      }
      mesh_gateway = {
        enabled = true
        self    = true
      }
      ingress_gateway = {
        enabled = true
        self    = true
      }
    }
  }
}