
Navigating complex Puppet setups

Benny Cornelissen is an Infrastructure Consultant and Solution Developer at CRI Service, technology addict, Mac user, guitar player, cyclist, mechanical keyboard user and Belgian beer aficionado. About a year ago, we asked him to help us design and build a new infrastructure for our company. Benny wrote a very interesting blog post about this project and has been kind enough to share it with us. Thanks Benny!



The downside to 'Puppetizing' everything is that you usually end up with huge amounts of code. In this blog post I will explain how to avoid getting lost in complex Puppet setups.


How to get a complex Puppet setup


About a year ago one of my clients, a software development company called Avisi, asked me to help them design and build a new infrastructure for their company. They had the usual requirements around security, scalability, performance, flexibility, etcetera, but a few requirements stood out:

  • no more 'sudo/root' permissions for developers
  • eliminating manual changes
  • up-to-date documentation (or better: self-documenting)
  • self-service for developers


From the start, it was clear that automation was going to be a key part of our infrastructure. As we had prior experience with Puppet, we decided to 'Puppet all the things'. That was easier said than done, though.


We had to deal with several development teams that had very different requirements and wanted support for different Linux distributions and different versions of Java, Oracle, PHP, MySQL, etcetera. Then there were all the infrastructure services (DNS, LDAP, backup, syslog, monitoring), development services (SCM, build, test, deploy, repository, QA tooling), collaboration services (issue tracking, wiki), some websites and the usual 'one-offs', all of which needed to be 'Puppetized'.


Mission: successful?

One year down the road we seem to have accomplished everything we set out to do. We have eliminated both root permissions for developers and manual changes. Deploying a new server takes just minutes and includes fully automated configuration of monitoring and backup. Most importantly, the developers are actively using Puppet to deploy their applications on the development infrastructure.


This has resulted in a fairly large codebase. Some numbers:

  • total lines of code: about 380k
  • Puppet modules developed in-house: 91
  • upstream Puppet modules used: 49
  • downtime related to config errors: none in the past 6 months


The problem

While the above may seem pretty successful, our fairly complex Puppet setup did introduce a few new challenges:

  • Some tasks that used to take just a few seconds now take up to a few hours because of code reviews.
  • In the past, anyone with rudimentary sysadmin skills could perform basic sysadmin tasks. Now, basic Puppet knowledge gets you nowhere.
  • When you first start working with the Puppet codebase, it can be quite intimidating.
  • When you want to change something, it is sometimes difficult to find the code that needs to be changed.
  • Puppet doesn't guide you in how to handle includes/inheritance.


Solving the problem

There is no single solution to this problem, but there are a few guidelines that can help you steer clear of the common pitfalls of complex Puppet setups.


Separate data from code

Separating data from code has a few advantages. First, it allows for easy re-use of code. Second, it forces you to think ahead while writing code, make your modules highly configurable, and decide on sane defaults. Third, it allows you to expose the 'data-part', or node-classification separately, so the actual configuration of your nodes doesn't necessarily require any programming skills.
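As a minimal sketch of what this can look like, consider a parameterized class in which every site-specific value is a parameter with a default. The module name example_ntp, its parameters and the Debian-style package and service name 'ntp' are assumptions made up for this illustration, not code from the setup described here. The parameters are left untyped so the manifest should parse on both older and current Puppet releases.

```puppet
# Hypothetical module 'example_ntp': the class body contains logic only.
# Everything that differs per site or per node is a parameter with a sane default.
class example_ntp (
  $servers        = ['0.pool.ntp.org', '1.pool.ntp.org'],
  $package_ensure = 'present',
  $service_enable = true
) {
  # Package and service names are assumed to be 'ntp' (Debian-style).
  package { 'ntp':
    ensure => $package_ensure,
  }

  file { '/etc/ntp.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    # Render one "server <name>" line per entry in $servers.
    content => inline_template("<% @servers.each do |s| %>server <%= s %>\n<% end %>"),
    require => Package['ntp'],
    notify  => Service['ntp'],
  }

  service { 'ntp':
    ensure => running,
    enable => $service_enable,
  }
}
```

Configuring a node then comes down to supplying data, without touching the class itself:

```puppet
# Node-level configuration: only data, no logic (hostnames are placeholders).
class { 'example_ntp':
  servers => ['ntp1.example.com', 'ntp2.example.com'],
}
```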


When separating data from code, you obviously need a place to store your data. The most obvious choice currently is Hiera, which is built into Puppet. Hiera is a key/value lookup tool that uses a configurable hierarchy and supports multiple backends.
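To illustrate, here is a small sketch that relies on automatic parameter lookup, which Puppet has offered since version 3: when a class is declared without explicit arguments, Puppet looks up each parameter in Hiera under the key classname::parameter and falls back to the default in the class if no level of the hierarchy defines it. The class example_motd, the file path and the values are hypothetical and only serve as an example.

```puppet
# Hypothetical class with a single configurable value and a sane default.
class example_motd (
  $message = 'Managed by Puppet'
) {
  file { '/etc/motd':
    ensure  => file,
    content => "${message}\n",
  }
}

# The data lives outside the manifest, for instance in a YAML file in the
# Hiera hierarchy (path and value are assumptions for this sketch):
#
#   # hieradata/common.yaml
#   example_motd::message: 'Development server, changes go through Puppet'
#
# No arguments are passed here. Puppet looks up 'example_motd::message' in
# Hiera and uses the class default if the key is not found.
node default {
  include example_motd
}
```

Because the hierarchy is configurable, the same key can be overridden at a more specific level (for example per environment or per node) while the manifest stays unchanged.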


Other options are ENCs (External Node Classifiers) like the Puppet Enterprise Dashboard or The Foreman.

Depending on the node classifier you choose, you can configure nodes using YAML, JSON or web interfaces.


Please continue reading at Benny's blog for more guidelines that can help you steer clear of the common pitfalls of complex Puppet setups!