Looking at Consul – Part 2

In this second part of the series we will get a bit more practical with Consul, install it and define some basic checks.

How to install Consul

Consul can be downloaded as a compiled binary from the Consul download page; right now there are binaries available for OS X (64-bit), Linux (32- and 64-bit) and Windows (32-bit).
If you want to compile from source, the code is available on GitHub. Consul is written in Go and has good documentation; all you need to do is run scripts/build.sh and it will download the dependencies and create a binary for your platform.

We will install Consul on Ubuntu so we can use the precompiled Linux binary.
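
Something along these lines will get the binary in place (this assumes the 0.4.0 Linux 64-bit zip from the download page; the exact file name may differ for other versions):

    # After downloading the Linux 64-bit zip from the Consul download page:
    unzip 0.4.0_linux_amd64.zip
    sudo mv consul /usr/local/bin/consul
    sudo chmod +x /usr/local/bin/consul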

Let’s check that Consul has been properly installed.
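
Running the version subcommand should confirm it; the output below is roughly what you would see with 0.4.0:

    $ consul version
    Consul v0.4.0
    Consul Protocol: 2 (Understands back to: 1)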

We need to do some extra steps to make sure Consul runs as a service. First we need to create a consul user and group.
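
A standard system account will do; the -r flags and the /bin/false shell are just the usual convention for service accounts:

    # Dedicated system group and user for the Consul service.
    sudo groupadd -r consul
    sudo useradd -r -g consul -d /var/lib/consul -s /bin/false consul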

Now we can create the directories Consul will need for regular operations: its configuration directory /etc/consul and the data directory, which we will create at /var/lib/consul as it follows the Filesystem Hierarchy Standard.
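
Both directories should be owned by the consul user we just created:

    # Configuration and data directories for Consul.
    sudo mkdir -p /etc/consul /var/lib/consul
    sudo chown -R consul:consul /etc/consul /var/lib/consul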

Finally we need an init script. If you use RedHat there is one available in this gist, and for Ubuntu there is an Upstart script available in the Consul source code.

You can also use Configuration Management to install Consul. I personally use Kyle Anderson's Puppet module available on the Puppet Forge; there are Chef cookbooks here and here, and there is also a playbook available for Ansible here.

Configuring Consul

Consul will read all the files available in the /etc/consul directory in JSON format. There is one file that is necessary for it to run, and that is config.json; here is an example of a Consul Server config.json file:
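
(The values below are illustrative; in particular the datacenter name and the encryption key are placeholders you will want to change for your own setup.)

    {
      "server": true,
      "bootstrap_expect": 3,
      "datacenter": "us-east-1",
      "data_dir": "/var/lib/consul",
      "encrypt": "cg8StVXbQJ0gPvMd9o7yrg==",
      "log_level": "INFO",
      "enable_syslog": true
    }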

The configuration is very readable, but the parameters that we have to keep in mind are server, bootstrap_expect, datacenter and encrypt.

server is what defines whether this node is a Consul Server or a Consul Agent (client); by setting "server": true we declare this node to be a Consul Server.
bootstrap_expect tells Consul the minimum number of servers it should expect in order to bootstrap the cluster and trigger a leader election; in this example we are running Consul in production, so we expect 3 servers to form a valid quorum.
datacenter defines the difference between local and remote clusters; all machines in the local cluster will have the same datacenter name.
encrypt is the key that we will use to encrypt communication across the whole cluster; you can easily generate keys with the provided consul keygen command:
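
(The key below is just a sample; generate and keep your own.)

    $ consul keygen
    cg8StVXbQJ0gPvMd9o7yrg==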

Bootstrapping the cluster

Starting with version 0.4.0 (the current one), bootstrapping a cluster should be as easy as starting up the minimum number of servers (defined by bootstrap_expect) and asking them to join one another using consul join; you can add as many servers as you want, as the operation is serialized and idempotent.
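
As a sketch, assuming three servers at 10.0.1.10-12 and the configuration directory we created earlier:

    # Start the agent on every server (normally done by the init/upstart script).
    consul agent -config-dir /etc/consul

    # From any one of the servers, ask the others to join (IPs are examples).
    consul join 10.0.1.11 10.0.1.12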

As soon as you have the minimum number of nodes expected, the cluster will bootstrap itself; you can double-check that it has been successful by using either consul monitor or consul info and looking at the number of serf_lan members.
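
For instance, on a three-server cluster the serf_lan section of consul info should report three members (output trimmed to the relevant field):

    $ consul info
    ...
    serf_lan:
            members = 3
    ...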

Now that we have our cluster bootstrapped we can start configuring Consul to get the most out of it.

Defining Health Checks

Consul has the concepts of Health Checks and Services, which are very powerful as they provide a decentralised way to monitor your platform.

Health Checks are your typical old-school Nagios/Icinga/Zabbix checks; Consul is 100% compatible with the exit codes of Nagios checks, so you can use the vast number of checks available at the Nagios Exchange.

Here is a basic check definition for our load average:
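
(A definition like the one below, dropped into /etc/consul, would do; the plugin path and thresholds come from the standard Nagios check_load plugin and are assumptions you should adapt.)

    {
      "check": {
        "id": "load",
        "name": "Load average",
        "script": "/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6",
        "interval": "30s"
      }
    }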

Consul will run our load average check every 30 seconds and report back if it triggers a warning or a critical; you can use ttl instead of interval if you want to feed the check result through the Consul API rather than letting Consul run the check itself.

Defining Services

The main function of Consul is to act as a Service Discovery mechanism: you need to define the services that are running on your server in order to expose them.

Services can have Health Checks attached to their definition. Right now only one health check can be defined per service, although that is a limitation of the JSON configuration; services with more than one health check can be added through the API.

If the service passes its health check it will be exposed as active and available.
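
As a sketch, a service definition with an attached health check could look like this (the service name, port and check script are examples):

    {
      "service": {
        "name": "web",
        "port": 80,
        "check": {
          "script": "/usr/lib/nagios/plugins/check_http -I 127.0.0.1",
          "interval": "10s"
        }
      }
    }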

Getting information from DNS

One of the best things is being able to use a well-known, solid and heavily tested protocol as one of the key parts of your product; this is what Consul does by exposing DNS.

Consul exposes a DNS server on port 8600 for the .consul zone; if we want to integrate this with our current DNS infrastructure, there is a post by Gareth Rushgrove on how to use Dnsmasq to do it.
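
The short version of that integration is a one-line Dnsmasq rule that forwards the .consul zone to the local agent (the file path below is just a common location):

    # /etc/dnsmasq.d/10-consul
    server=/consul/127.0.0.1#8600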

Consul exposes both nodes and services through DNS; nodes can be found using either nodename.node.datacenter.consul or nodename.node.consul, as the datacenter part is optional.

As for services, we can also query them in different ways; the standard way of querying them in Consul is using the format servicename.service.datacenter.consul.

This will give you the IPs available for that service in Consul in round-robin order, but you can also use RFC 2782 style DNS queries (SRV records) to recover more information, like the service port.
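
For example, with dig pointed at the local agent ("node1" and "web" are made-up names):

    # A node lookup against the Consul DNS port.
    dig @127.0.0.1 -p 8600 node1.node.consul

    # A records for a service, returned in round-robin order.
    dig @127.0.0.1 -p 8600 web.service.consul

    # SRV records (RFC 2782) also carry the service port.
    dig @127.0.0.1 -p 8600 web.service.consul SRV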

In the next post I will cover how to use and abuse the k/v storage, what kind of things we can do to connect it with provisioning and configuration management, and how to use Consul as a load balancer with HAProxy.

Looking at Consul – Part 1

Introduction

Service Discovery has become very prominent as of late; it is a key part of any self-healing and autoscaling system.
Companies have been creating these systems in house, as there was no clear open source offering; some companies, like Netflix and Airbnb, have published theirs publicly to try to help others embrace good practices.
HashiCorp created Consul as a natural extension of one of their previous projects, Serf. Serf is a local cluster membership orchestration mechanism that detects failures and executes actions based on them.
Consul takes this concept several steps further, let’s see how.

What is Consul

Consul is a service discovery system: it provides a DNS interface to query services and a k/v storage mechanism which is strongly consistent, as proven by the rigorous Jepsen tests it has withstood.
Consul introduces two concepts that are very well segmented: Service definitions and Health Checks. Health Checks can be included in Service definitions to make sure the services work fine, but they can also run standalone to ensure that the health of the system itself is not compromised.

Consul Architecture

Consul is based on the same Gossip protocol as Serf (it uses Serf itself for Gossip). Gossip transmits, at random intervals, the health state of the system to the rest of the cluster; if a node fails to report in, it will be checked indirectly by a random number of nodes, and if that also fails it will be marked as “suspicious” and taken off the cluster within a reasonable time if the node itself does not challenge the suspicion. Gossip strikes a fairly good balance between decentralisation and the network traffic generated to guarantee quorum in the system.
Consul can use Gossip in a LAN system (also known as a datacentre in the config files) and in a WAN system, as Consul can connect and talk to other Consul clusters in other datacentres. All the clusters share information through the k/v system, which is queryable from every single node, ensuring consistency with a very short “eventuality”.

Consul Servers

Consul needs a quorum of servers in order to run reliably; for production purposes a minimum of three servers is recommended, although this number can be taken up to five or more nodes for higher consistency.
When starting a cluster, one of the server nodes will be the bootstrap node; this node will generate the initial k/v definitions and assume temporary leadership of the cluster until the minimum number of quorum servers is reached. At that point a leader election will be forced and a cluster leader will be chosen amongst all the servers. Having the right number of servers will help the cluster vote for new leaders and ensure consistency; this can be trickier than it looks in some environments, especially datacentre-partitioned ones like AWS. Here is a table I use for configuring mine in AWS.

Region      Zones   Servers per zone   Total servers   Minimum quorum   Consistency
us-east-1   3       1                  3               2                Good
us-east-1   3       2                  6               4                High
us-west-1   2       1.5                3               2                Average
us-west-1   2       2.5                5               3                Good
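
The minimum quorum column is just the usual majority rule, floor(n/2) + 1, which is quick to check from the shell:

    # Majority quorum for n servers.
    n=5; echo $(( n / 2 + 1 ))   # prints 3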

Consul Agents

Consul Agents are any other nodes that run Consul and join the cluster but are not responsible for any of Consul's internal services; they will run health checks for the node and declare the services available on it.
Consul Agents will also have a full copy of the k/v database and share the strong consistency of the system; this way we can always query the services available in the cluster, either through DNS or through the k/v itself.

In the next blog post I will talk about how to install Consul, start a basic cluster and add service and health check definitions.