The backend technology behind the websites managed by Patton Web Concepts

Making Puppet Behave Like Ansible

For many reasons, I've recently started a migration away from Puppet to Ansible (why could be another blog post by itself).

But regardless of the platform, the purpose of a configuration management system remains the same: apply a set of instructions to a host to configure it to a desired state. How these two platforms accomplish this differs in some significant ways.

I've used Puppet for several years and gotten used to how it assigns modules to hosts using a hierarchical design. When I first started looking at Ansible though, I didn't find this same kind of automatic handling of configurations based on hierarchy. I learned that to get something similar, it was necessary to define the hierarchy logic manually using different playbooks.

What follows is some background on each platform along with a description of an Ansible configuration that mimics how Puppet behaves.

How Puppet Works

A theoretical example

A core tenet of Puppet is the concept of a hierarchy. A host is a member of one or more progressively larger groups that are self-contained, similar to a nesting doll.

For example, an application server host called "server3" can be considered to be in each of the following groups, listed in order from most specific to least specific.

  • server3
  • Application servers
  • Web servers
  • RHEL 8 servers
  • Red Hat servers
  • All servers

Each of these groups may contain other groups. So "Red Hat" could contain both "RHEL 8" and "RHEL 9" for different versions of the OS.

If you draw out their relationships, you would get an inverted tree data structure where the root is "All servers" and leaves are individual hosts. Below is an example of this.

If you've studied computer science, you'll know that trees are a fundamental data structure. The real CS nerds will recall that walking the path between a leaf and the root of a balanced tree takes O(log n) steps, so it's very efficient.

Each group in the hierarchy is considered unique and has its own YML configuration file. The diagram has 9 servers organized into 11 groups, so there are 20 different YML files across 6 different levels.

As a result, "server3" will have 6 different YML files applied to it, where each level's YML file contains configurations unique to that level. The common.yml for "All Servers" contains configurations for all systems. The redhat.yml for "Red Hat" contains configs for all Red Hat systems, while the ubuntu.yml for "Ubuntu" contains configs unique to the Ubuntu OS, etc. This increasingly specific ordering of configurations ends with a host-specific configuration, <host>.yml.

The advantage to this design is that if you add "server3" to the "Application Servers" group, it will automatically get the configurations for itself, the application servers, web servers, RHEL 8 servers, Red Hat servers, and all servers just because of its leaf location in the tree. It makes configuring hosts a trivial single step.

I tend to think of this parsing of the tree as going "bottom up" from a leaf to the root, and of the path as a series of individually nested groups.

My Configuration

For my environment, I've defined a simple hierarchy based on all hosts (common), Red Hat 8, Red Hat 9, various roles, and individual hosts.

This is the hiera.yaml file that defines the groupings. Note that the individual hosts are defined at the top of the file, so a tree diagram would be "right side up" with the root at the bottom.
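
The file itself is not reproduced here. As a rough sketch only, a Hiera 5 configuration with this shape might look like the following. The data directory layout, the file names under it, and the custom "role" fact are assumptions made for illustration, not the actual configuration.

```yaml
---
# hiera.yaml (illustrative sketch, not the actual file)
version: 5

defaults:
  datadir: data          # assumed location of the data files
  data_hash: yaml_data   # Hiera's built-in YAML backend

hierarchy:
  # Most specific level first: one file per host.
  - name: "Individual hosts"
    path: "hosts/%{trusted.certname}.yml"

  # Assumes a custom fact named "role" is set on each node.
  - name: "Roles"
    path: "roles/%{facts.role}.yml"

  # Resolves to a file such as os/RedHat-8.yml or os/RedHat-9.yml.
  - name: "OS family and major release"
    path: "os/%{facts.os.family}-%{facts.os.release.major}.yml"

  # Least specific level last: applies to every host.
  - name: "All hosts"
    path: "common.yml"
```

By default Hiera searches these levels from top to bottom, so a value set in a host file takes precedence over the same key in common.yml.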

This way of thinking about organizing systems is intuitive, so over the course of several years this is how I built out our Puppet configuration management system.

How Ansible Works

Ansible Host Grouping

When I first started looking at Ansible, this enforced hierarchy wasn't anywhere to be found. Ansible stores its hosts in a single file, aptly named hosts. When I started learning Ansible, I just used individual hosts since I was just testing things.

Eventually I started playing with grouping hosts and applying playbooks to groups. The example below defines a group called "dns" with two hosts, dell56 and dell57. Ansible also supports ranges in host names as seen in the "homework" group. It has ten hosts vm-hw00 through vm-hw09.
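
As a sketch of the two groups just described, here they are in Ansible's YAML inventory format. The same groups can be written in the INI-style hosts file under [dns] and [homework] headers. The bracketed range expands to the ten hosts vm-hw00 through vm-hw09.

```yaml
# Inventory sketch: two flat groups.
dns:
  hosts:
    dell56:
    dell57:

homework:
  hosts:
    # Numeric range with leading zeros: vm-hw00 ... vm-hw09
    vm-hw[00:09]:
```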

Going a step further, it's also possible to nest the groups using the :children syntax.

I work at a college so we have computer labs. My hierarchy for lab systems looks as follows.

The "everyone" group includes the group "desktops", which includes the group "lab120", which consists of the hosts lab120a through lab120v.
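
A sketch of that nesting, written in Ansible's YAML inventory format, where a children key plays the role of the :children suffix used in the INI-style hosts file. Only the groups named above are shown, so a real inventory would have more branches.

```yaml
# Inventory sketch: everyone > desktops > lab120 > hosts
everyone:
  children:
    desktops:
      children:
        lab120:
          hosts:
            # Alphabetic range: lab120a ... lab120v
            lab120[a:v]:
```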

I can now configure an Ansible playbook to use the [desktops] host group and the list of hosts will be determined recursively from the smaller groups. And again, drawing out the relationships produces a tree structure.

So logically Puppet and Ansible do their host groupings pretty much the same way. So what's different with Ansible?

The difference has to do with the recursive nature of how host configurations are applied. Puppet does this automatically while Ansible does not. If you add a host to Puppet, it will automatically apply configurations for each group that the host is a member of using those YML files.

Push vs Pull

In Puppet, a client node contacts the server and presents its identity. The server then parses the tree, identifying all the roles that pertain to our host. Because Puppet knows the host, it can start at the bottom of the tree with the configuration for the host. From there, it walks up the tree gathering roles until it reaches the root. The server then sends back a list of roles to be executed by the client.

In this way, Puppet clients are pulling their configurations from a server based on their position in the tree.

Ansible, however, is a push mechanism. It expands all of the hosts in the playbook's host definition and then runs the playbook against all of those hosts in parallel.

The Need for Multiple Playbooks

Let's say we wanted to install a certain set of RPMs on every system. We'd use the [everyone] host group. Pretty simple.

But if we wanted distinctly different sets of RPMs on desktops and servers, we'd have to use the [desktops] group in one playbook and the [servers] group in another.

And if we wanted yet a third distinct set of RPMs on the desktops in lab120, we'd have to use the [lab120] group.

So we would have to run a separate playbook for each group that the host is a member of. This is analogous to the way Puppet walks a path from leaf to root, gathering YML files as it goes.
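
To make that concrete, here is a minimal sketch of three such playbooks, shown together as separate YAML documents. The file names and package names are placeholders chosen for illustration. Only the group names come from the inventory described earlier.

```yaml
---
# everyone.yml (hypothetical file name)
- name: Packages for every system
  hosts: everyone
  become: true
  tasks:
    - name: Install the common RPMs
      ansible.builtin.dnf:
        name:
          - tmux          # placeholder package
        state: present
---
# desktops.yml (hypothetical file name)
- name: Packages for desktops only
  hosts: desktops
  become: true
  tasks:
    - name: Install the desktop RPMs
      ansible.builtin.dnf:
        name:
          - firefox       # placeholder package
        state: present
---
# lab120.yml (hypothetical file name)
- name: Packages for lab 120 only
  hosts: lab120
  become: true
  tasks:
    - name: Install the lab-specific RPMs
      ansible.builtin.dnf:
        name:
          - gcc           # placeholder package
        state: present
```

Each file needs its own ansible-playbook run, which is exactly the manual step the rest of this post tries to remove.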

The difference is that we're doing this manually where Puppet does it automatically. If we want to automate this, we're going to have to put some logic into our playbooks to do it.

Make It Work Like Puppet

What I really want is to add a host as a single entry in my hosts file and then run one command to have Ansible apply the roles that pertain to it. I want to mimic starting at the leaf level of the tree and walking up to get my configurations.

Because Ansible has to start at the root, I'd start with the [everyone] host group so that the traversal is guaranteed to find our host somewhere down the tree. But how do we find the path to walk down to the leaf?

Here's an example.

Let's say I want to add a new lab system lab120w to my computer lab using my Ansible groups above. I add it to the [lab120] host group.

Let's also assume we want to apply roles A, B, and C to everyone, D and E to all desktops, F to lab 120 systems, and G to lab120w.

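To do that starting at the root, we need logic along the lines of this pseudo code:

  • If the host is in [everyone], apply roles A, B, and C.
  • If the host is in [desktops], apply roles D and E.
  • If the host is in [lab120], apply role F.
  • If the host is lab120w, apply role G.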

With that logic, you can run a single ansible-playbook command after adding a host by calling a playbook that applies to [everyone].

$ ansible-playbook all-hosts.yml

The ansible-playbook command has the ability to limit what hosts are included at run time with the -l option. So for our new lab host, I could run:

$ ansible-playbook all-hosts.yml -l lab120w

Logic to Ansible YML

The screenshot below is from the all-hosts.yml file we just referenced. In it is an example of how to reference a host's group membership and apply the appropriate role(s).

Here we are working with web servers. If our host is in the [web_servers] group, and it's RHEL 9, then include the web_servers.yml playbook. This playbook's hosts will also reference the [web_servers] group and apply the corresponding roles.
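
The screenshot is not reproduced here. A minimal sketch of what such an entry in all-hosts.yml could look like follows, assuming an earlier play in the same file has already gathered facts. The "common" role name is a placeholder, and the exact condition in the original file may differ.

```yaml
---
# all-hosts.yml (illustrative sketch)

# Root of the tree: runs on every host and gathers facts
# so that later conditions can test the OS version.
- name: Configuration for all hosts
  hosts: everyone
  gather_facts: true
  roles:
    - common   # placeholder role name

# One level down: only pull in the web server plays when the
# host is in [web_servers] and runs major release 9.
- name: Web servers on RHEL 9
  ansible.builtin.import_playbook: web_servers.yml
  when:
    - "'web_servers' in group_names"
    - "ansible_facts['distribution_major_version'] == '9'"
```

Here group_names is Ansible's built-in list of the groups the current host belongs to, and the two conditions in the when list must both be true.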

In turn, the web_servers.yml may have another import_playbook statement in it that references something more specific than [web_servers], such as an individual host. By doing includes recursively this way, it's possible to define only the relevant playbooks for your host.
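
As a sketch of that second level, web_servers.yml might look something like this. The "apache" role and the server3.yml file are hypothetical names used only to show the shape of the chain.

```yaml
---
# web_servers.yml (illustrative sketch)

# Roles shared by every host in the group.
- name: Configuration for web servers
  hosts: web_servers
  roles:
    - apache   # placeholder role name

# A more specific level: plays for a single host.
# server3.yml would declare "hosts: server3" itself.
- name: Host-specific configuration for server3
  ansible.builtin.import_playbook: server3.yml
```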

And voilà. We've made Ansible act like Puppet.

Now we can port all of our Puppet roles over to Ansible while keeping the same logic for the actual configuration bits. It's now just a "simple" matter of translating Puppet manifests into YML files.


About Me

 

I'm Erik. I build and maintain websites so other people don't have to.

My expertise lies in building computing infrastructures for websites that are reliable, fast, and secure. I work primarily with Linux systems in cloud and on-premise environments.

I also do web design and development, with a preference for the Astro JavaScript framework. I've also managed several websites using WordPress.

If you need a new website, an integration to your existing site, or managed hosting, please get in touch.