Getting Capistrano destination hosts from Puppet

26 11 2011

If you’re using Capistrano to deploy code to web servers and Puppet to manage those servers, the DRY principle suggests that it may be a bad idea to hardcode your list of web servers in your Capistrano recipe — instead, you may want to dynamically fetch the list of web servers from Puppet prior to each deploy. Here’s one way of doing this.

Puppetmaster Configuration

First, you’ll need to turn on storeconfigs on your Puppetmaster. This allows Puppet to store all of its node information in a database (which enables us to access it for other purposes). Note that these instructions assume you have a MySQL database for your storeconfigs.

Next, create a script on your Puppetmaster called /usr/local/bin/list_nodes_by_class.rb. Fill in the database username, password, host and schema name used to access your storeconfigs in lines 4-7. Make sure the script is executable.

#!/usr/bin/ruby
require 'mysql'

MY_USER="yourdbuser"
MY_PASS="yourdbpass"
MY_HOST="yourdbserver"
MY_DB="yourdbname"

QUERY="select h.name from hosts h join resources r on h.id = r.host_id join resource_tags rt on r.id = rt.resource_id join puppet_tags pt on rt.puppet_tag_id = pt.id where pt.name = 'class' and r.restype = 'Class' and r.title = '#{ARGV[0].gsub("'","")}' order by h.name;"

my = Mysql.new(MY_HOST, MY_USER, MY_PASS, MY_DB)
res = my.query(QUERY)
res.each_hash { |row| puts row['name'] }

Capistrano Recipe Modification

Note that I’ll assume here that your Capistrano recipe uses the “web” role to determine where to deploy code to, and that the Puppet class you use to designate web servers is “role_webserver” — you may need to change these.

First, add the following task and helper function to your recipe:


task :set_roles_from_puppet, :roles => :puppetmaster do
 get_nodes_by_puppet_class('role_webserver').each {|s| role :web, s}
end

def get_nodes_by_puppet_class(classname)
 hosts = []
 run "/usr/local/bin/list_nodes_by_class.rb #{classname}", :pty => true do |ch, stream, out|
 out.split("\r\n").each { |host| hosts << host }
 end
 hosts
end

Next, go to wherever you’re defining your roles. Add a role for your Puppetmaster:


role :puppetmaster,  "puppetmaster.mynetwork.com"

Finally, delete your existing definition of the “web” role and replace it with the following:

set_roles_from_puppet

Now, when you do a Capistrano deploy, the destination servers should be dynamically retrieved from the Puppet database.





Puppet and the Ghetto DNS

25 11 2011

Suppose you have a small network of Linux servers powering your web site. You’re probably going to want a way of accessing the servers by hostname from one another — for example, so your web servers can find your database servers without resorting to hardcoding IP addresses in your Apache configuration. What’s the best way to do this?

DNS

Well, if you’re already hosting your own DNS servers, you can consider a split-horizon configuration, which is supported by major DNS daemons like BIND. But if you’re using your ISP’s DNS servers, for example, setting up your own DNS for this purpose seems like overkill (and a pain). Personally, I like to run as few services on my network as possible, just as a matter of principle.

Another option is to just add A records for each of your servers to your site’s public DNS zone. But adding private (RFC1918) IP addresses to the public DNS means that anyone who can operate dig can find a complete list of your servers and their private IP addresses. Some would argue that I’m advocating for security by obscurity, but I just can’t see the upside of unnecessarily exposing information about your internal network to the public. Also, while it’ll work, exposing RFC1918 IPs via the public DNS just seems icky.

Hosts file

The “lightweight” approach to the problem of hostname to IP address resolution is to just add entries to the /etc/hosts file on each server. Of course, this is completely unmaintainable if you have more than 2 or 3 servers.

Hosts file + Puppet = Ghetto DNS

But if you’re using Puppet to automate your server infrastructure, it turns out that exported resources offer a nice solution to this problem. You can create and deploy a simple Puppet module to each machine that exports a Host entry for itself, and then assembles a /etc/hosts file based on the Host entries exported by all the machines Puppet knows about:


class hosts {
 host { "localhost.localdomain":
 ip => "127.0.0.1",
 host_aliases => [ "localhost" ],
 ensure => present,
 }
 @@host { "$fqdn":
 ip => $ipaddress_eth1,
 host_aliases => [ $hostname ],
 ensure => present,
 }

Host <<| |>>
}

This “Ghetto DNS” setup can be just right if manually maintaining hosts files seems impractical, but running your own DNS seems like overkill.