The cfn-init and cfn-signal commands are provided by the aws-cfn-bootstrap module, a set of utilities authored by Amazon Web Services. Amazon's recommended install method appears to be calling easy_install against an unversioned tarball artifact:
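The command in question looks roughly like this; the URL is Amazon's long-standing "latest" artifact location, so treat it as illustrative:

```shell
easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz
```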
Here easy_install downloads the artifact, unpacks it, reads its dependencies, connects to the PyPI package index, retrieves information about where to get those dependencies, and so on. This all works well enough, until one of the many different package sources for one of the module's dependencies begins to behave erratically. On more than one occasion this process has taken so long to return an error from a misbehaving artifact source that all stack deployments subsequently fail due to timeouts.
Having been bitten by this more than once, I determined that vendorizing the aws-cfn-bootstrap code, along with its dependencies, would probably be the best way to make my builds more reliable.
Initially I experimented with virtualenv, but ultimately found it difficult to use for manufacturing a truly portable artifact for this purpose. Additional literature review indicated that repackaging aws-cfn-bootstrap and its dependencies as Python wheels might be just what I needed.
On a default Amazon AMI, I installed pip via the prescribed installation method:
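A sketch of that installation sequence, assuming the standard get-pip.py bootstrap:

```shell
curl -O https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install wheel
```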
With pip and wheel installed, building wheels for each module can be done in one simple command:
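That command, sketched with the same unversioned tarball as input and the wheelhouse directory named below:

```shell
pip wheel --wheel-dir=aws-cfn-bootstrap-wheelhouse \
  https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz
```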
The aws-cfn-bootstrap-wheelhouse directory we specified has been created and now contains a .whl file for the aws-cfn-bootstrap module and each of its dependencies.
Creating a tarball of this directory yields an artifact I can place in an S3 bucket for my infrastructure, alongside my own copy of get-pip.py. I have versioned these artifacts with a date stamp in their file names, and because there's nothing proprietary about the artifacts, I have marked them as world-readable. After updating our bootstrap code in the appropriate SparkleFormation registry, the relevant bootstrap script reads as follows:
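A sketch of such a bootstrap script; the bucket name and date stamp are placeholders:

```shell
curl -sO https://s3.amazonaws.com/example-bootstrap-bucket/get-pip.py
python get-pip.py
curl -s https://s3.amazonaws.com/example-bootstrap-bucket/aws-cfn-bootstrap-wheelhouse-20141215.tar.gz | tar xz
pip install --no-index --find-links=aws-cfn-bootstrap-wheelhouse aws-cfn-bootstrap
```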
This process is relatively simple and can be distilled into a CI/CD pipeline job, but, as I have been unable to find tagged versions of the module, it might only be appropriate to build new artifacts on a manual trigger.
This article assumes some familiarity with CloudFormation concepts such as stack parameters, resources, mappings and outputs. See the AWS Advent CloudFormation Primer for an introduction.
Although CloudFormation templates are billed as reusable, many users will attest that as these monolithic JSON documents grow larger, they become “all encompassing JSON file[s] of darkness,” and actually reusing code between templates becomes a frustrating copypasta exercise.
From another perspective these JSON documents are actually just hashes, and with a minimal DSL we can build these hashes programmatically. SparkleFormation provides a Ruby DSL for merging and compiling hashes into CFN templates, and helpers which invoke CloudFormation’s intrinsic functions (e.g. Ref, Attr, Join, Map).
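To make the hashes-all-the-way-down point concrete, here is a plain-Ruby sketch (no SparkleFormation involved) of building a template fragment by merging hashes and serializing it:

```ruby
require 'json'

# A CloudFormation template is, at bottom, just nested hashes.
template = { 'AWSTemplateFormatVersion' => '2010-09-09' }

# A reusable fragment is another hash, merged in with plain Hash#merge.
common_parameters = {
  'Parameters' => {
    'Environment' => { 'Type' => 'String', 'Default' => 'test' }
  }
}
template = template.merge(common_parameters)

puts JSON.pretty_generate(template)
```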
SparkleFormation’s DSL implementation is intentionally loose, imposing little of its own opinion on how your template should be constructed. Provided you are already familiar with CloudFormation template concepts and some minimal amount of Ruby, the rest is merging hashes.
Just as with CloudFormation, the template is the high-level object. In SparkleFormation we instantiate a new template like so:
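A minimal sketch, with an illustrative template name:

```ruby
SparkleFormation.new(:ec2_example)
```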
But an empty template isn’t going to help us much, so let’s step into it and at least insert the required AWSTemplateFormatVersion specification:
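Something like:

```ruby
SparkleFormation.new(:ec2_example) do
  _set('AWSTemplateFormatVersion', '2010-09-09')
end
```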
In the above case we use the _set helper method because we are setting a top-level key with a string value.
When we are working with hashes we can use a block syntax, as shown here adding a parameter to the top-level Parameters hash that CloudFormation expects:
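A sketch using an environment parameter; the parameter name and values are illustrative:

```ruby
SparkleFormation.new(:ec2_example) do
  _set('AWSTemplateFormatVersion', '2010-09-09')

  parameters(:environment) do
    type 'String'
    default 'test'
    allowed_values %w(test staging production)
  end
end
```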
SparkleFormation provides primitives to help you build templates out of reusable code, namely components, dynamics, and registry entries.
Here’s a component we’ll name environment which defines our allowed environment parameter values:
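A sketch of the component, assuming the SparkleFormation.build form used for component files:

```ruby
SparkleFormation.build do
  parameters(:environment) do
    type 'String'
    default 'test'
    allowed_values %w(test staging production)
  end
end
```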
Resources, parameters and other CloudFormation configuration written into a SparkleFormation component are statically inserted into any templates using the load method. Now all our stack templates can reuse the same component, so updating the list of environments across our entire infrastructure becomes a snap. Once a template has loaded a component, it can then step into the configuration provided by the component to make modifications.
In this template example we load the environment component (above) and override the allowed values for the environment parameter the component provides:
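A sketch of the load-and-override pattern:

```ruby
SparkleFormation.new(:ec2_example).load(:environment).overrides do
  parameters(:environment) do
    allowed_values %w(test production)
  end
end
```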
Whereas components are loaded once at the instantiation of a SparkleFormation template, dynamics are inserted one or more times throughout a template. They iteratively generate unique resources based on the name and optional configuration they are passed when inserted.
In this example we insert a launch_config dynamic and pass it a config object containing a run list:
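A sketch using SparkleFormation's dynamic! helper; the dynamic name and run list value are illustrative:

```ruby
SparkleFormation.new(:ec2_example).load(:environment).overrides do
  dynamic!(:launch_config, 'app', :run_list => 'role[app_server]')
end
```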
The launch_config dynamic (not pictured) can then use intrinsic functions like Fn::Join to insert data passed in the config deep inside a launch configuration, as in this case where we want our template to tell Chef what our run list should be.
Similar to dynamics, a registry entry can be inserted at any point in a SparkleFormation template or dynamic. e.g. a registry entry can be used to share the same metadata between both AWS::AutoScaling::LaunchConfiguration and AWS::EC2::Instance resources.
This JSON template from a previous AWS Advent article provisions a single EC2 instance into an existing VPC subnet and security group:
Not terrible, but the JSON is a little hard on the eyes. Here’s the same thing in Ruby, using SparkleFormation:
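An abbreviated sketch of what that translation might look like; the resource names, AMI ID and instance type are placeholders:

```ruby
SparkleFormation.new(:ec2_example) do
  _set('AWSTemplateFormatVersion', '2010-09-09')

  parameters(:key_name) do
    type 'String'
    description 'Name of an existing EC2 KeyPair'
  end

  parameters(:vpc_subnet_id) do
    type 'String'
    description 'ID of an existing VPC subnet'
  end

  parameters(:security_group_id) do
    type 'String'
    description 'ID of an existing security group'
  end

  resources(:ec2_instance) do
    type 'AWS::EC2::Instance'
    properties do
      image_id 'ami-xxxxxxxx'
      instance_type 't2.micro'
      key_name ref!(:key_name)
      subnet_id ref!(:vpc_subnet_id)
      security_group_ids [ref!(:security_group_id)]
    end
  end

  outputs(:instance_id) do
    value ref!(:ec2_instance)
  end
end
```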
Without taking advantage of any of SparkleFormation’s special capabilities, this translation is already a few lines shorter and easier to read as well. That’s a good start, but we can do better.
The template format version specification and parameters required for this template are common to any stack where EC2 compute resources may be used, whether they be single EC2 instances or Auto Scaling Groups, so let’s take advantage of some SparkleFormation features to make them reusable.
Here we have a base component that inserts the common parameters into templates which load it:
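A sketch of such a base component:

```ruby
SparkleFormation.build do
  _set('AWSTemplateFormatVersion', '2010-09-09')

  parameters(:key_name) do
    type 'String'
  end

  parameters(:vpc_subnet_id) do
    type 'String'
  end

  parameters(:security_group_id) do
    type 'String'
  end
end
```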
Now that the template version and common parameters have moved into the new base component, we can make use of them by loading that component as we instantiate our new template, specifying that the template will override any pieces of the component where the two intersect.
Let’s update the SparkleFormation template to make use of the new base component:
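The updated template, sketched:

```ruby
SparkleFormation.new(:ec2_example).load(:base).overrides do
  resources(:ec2_instance) do
    type 'AWS::EC2::Instance'
    properties do
      image_id 'ami-xxxxxxxx'
      instance_type 't2.micro'
      key_name ref!(:key_name)
      subnet_id ref!(:vpc_subnet_id)
      security_group_ids [ref!(:security_group_id)]
    end
  end
end
```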
Because the base component includes the parameters we need, the template no longer explicitly describes them.
Since SparkleFormation is Ruby, we can get a little fancy. Let’s say we want to build 3 subnets into an existing VPC. If we know the VPC’s /16 subnet we can provide it as an environment variable (export VPC_SUBNET="10.1.0.0/16"), and then call that variable in a template that generates additional subnets:
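A sketch of the idea; the vpc_id parameter, availability zones and route table associations are elided for brevity:

```ruby
SparkleFormation.new(:vpc_subnets).load(:base).overrides do
  # First two octets of the VPC's /16, e.g. "10.1"
  prefix = ENV['VPC_SUBNET'].split('.')[0, 2].join('.')

  3.times do |i|
    resources("subnet_#{i}".to_sym) do
      type 'AWS::EC2::Subnet'
      properties do
        vpc_id ref!(:vpc_id)
        cidr_block "#{prefix}.#{i + 1}.0/24"
      end
    end
  end
end
```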
Of course we could place the subnet and route table association resources into a dynamic, so that we could just call the dynamic with some config:
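Sketched, with a hypothetical subnet_with_route_table dynamic:

```ruby
3.times do |i|
  dynamic!(:subnet_with_route_table, i, :cidr => "10.1.#{i + 1}.0/24")
end
```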
SparkleFormation by itself does not implement any means of sending its output to the CloudFormation API. In this simple case, a SparkleFormation template named ec2_example.rb is output to JSON which you can use with CloudFormation as usual:
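One way to do this, sketched as a small Ruby script; the paths are illustrative:

```ruby
require 'sparkle_formation'
require 'json'

template = SparkleFormation.compile('sfn/ec2_example.rb')
File.write('ec2_example.json', JSON.pretty_generate(template))
```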
The knife-cloudformation plugin for Chef’s knife command adds sub-commands for creating, updating, inspecting and destroying CloudFormation stacks described by SparkleFormation code or plain JSON templates. Using knife-cloudformation does not require Chef to be part of your toolchain; it simply leverages knife as an execution platform.
Advent readers may recall a previous article on strategies for reusable CloudFormation templates which advocates a “layer cake” approach to deploying infrastructure using CloudFormation stacks:
The overall approach is that your templates should have sufficient parameters and outputs to be re-usable across environments like dev, stage, qa, or prod and that each layer’s template builds on the next.
Of course this is all well and good, until we find ourselves, once again, copying and pasting. This time it’s stack outputs instead of JSON, but again, we can do better.
The recent 0.2.0 release of knife-cloudformation adds a new --apply-stack parameter which makes operating “layer cake” infrastructure much easier.
When passed one or more instances of --apply-stack STACKNAME, knife-cloudformation will cache the outputs of the named stack and use the values of those outputs as the default values for parameters of the same name in the stack you are creating.
For example, a stack “coolapp-elb” which provisions an ELB and an associated security group has been configured with the following outputs:
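A sketch of those outputs in SparkleFormation terms; the logical resource names are illustrative:

```ruby
outputs do
  elb_name do
    description 'Name of the ELB'
    value ref!(:coolapp_elb)
  end
  elb_security_group do
    description 'Security group associated with the ELB'
    value ref!(:elb_security_group)
  end
end
```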
The values from the ElbName and ElbSecurityGroup outputs would be of use to us in attaching an app server auto scaling group to this ELB, and we could use those values automatically by setting parameter names in the app server template which match the ELB stack’s output names:
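The matching parameters in the app server template might be sketched as:

```ruby
parameters(:elb_name) do
  type 'String'
end

parameters(:elb_security_group) do
  type 'String'
end
```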
Once our coolapp_asg template uses parameter names that match the output names from the coolapp-elb stack, we can deploy the app server layer “on top” of the ELB layer using --apply-stack:
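A sketch of the command; the flag names are illustrative:

```shell
knife cloudformation create coolapp-asg \
  --file coolapp_asg \
  --apply-stack coolapp-elb
```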
Similarly, if we use a SparkleFormation template to build our VPC, we can set a number of VPC outputs that will be useful when building stacks inside the VPC:
This ‘apply stack’ approach is just the latest way in which the SparkleFormation tool chain can help you keep your sanity when building infrastructure with CloudFormation.
I hope this brief tour of SparkleFormation’s capabilities has piqued your interest. For some AWS users, the combination of SparkleFormation and knife-cloudformation helps to address a real pain point in the infrastructure-as-code tool chain, easing the development and operation of layered infrastructure.
Here’s some additional material to help you get started:
I have a lot of warm feelings for Sensu, a flexible, scalable open source monitoring framework. At Needle our team has used Chef to build a Sensu instance for each of our environments, allowing us to test our automated monitoring configuration before promoting it to production, just like any other code we deploy.
Speaking of deploying code, isn’t it obnoxious to see alerts from your monitoring system when you know that your CM tool or deploy method is running? We think so too, so I set about writing a Chef handler to take care of this annoyance.
Among Sensu’s virtues is its RESTful API which provides access to the data Sensu servers collect, such as clients & events.
The API also exposes an interface to stashes. Stashes are arbitrary JSON documents, so any JSON formatted data can be stored under the /stashes API endpoint.
Sensu handlers are expected to check the stashes under the /stashes/silence path when processing events, and silence events whose client has a matching stash at /stashes/silence/$CLIENT or whose client and check match a stash at /stashes/silence/$CLIENT/$CHECK.
Chef’s handler functionality can be used to trigger certain behaviors in response to specific situations during a chef-client run. At this time there are three different handler types implemented by Chef::Handler: start handlers, report handlers and exception handlers.
Combined, Sensu’s stash API endpoint and Chef’s exception and report handlers provide an excellent means for Chef to silence Sensu alerts during the time it is running on a node.
We achieved our goal by implementing Chef::Handler::Sensu::Silence, which runs as a start handler, and Chef::Handler::Sensu::Unsilence, which runs as both an exception and a report handler. All of this is bundled up in our chef-sensu-handler cookbook.
The cookbook installs and configures the handler using the node['chef_client']['sensu_api_url'] attribute. Once configured, the handler will attempt to create a stash under /stashes/silence/$CLIENT when the Chef run starts, and delete that stash when the Chef run fails or succeeds.
We also wanted to guard against conditions where Chef could fail catastrophically and its exception handlers might not run. To counter that possibility, the handler writes a timestamp and owner name into the stash it creates when silencing the client:
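A sketch of the stash body; the field names are illustrative:

```json
{
  "timestamp": 1418275200,
  "owner": "chef"
}
```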
We then authored a Sensu plugin, check-silenced.rb, which compares the timestamp in existing silence stashes against a configurable timeout (in seconds). Once configured as part of our Sensu monitoring system, this plugin acts as a safety net which prevents clients from being silenced too long.
The LWRP should work with any modern version of Chef. When you use include_recipe to access the LWRP in your own recipes, the default recipe for this cookbook will install the required ‘hipchat’ gem.
Attributes:

- room - the name of the room you would like to speak into (required)
- token - authentication token for your HipChat account (required)
- nickname - the nickname to be used when speaking the message (required)
- message - the message to speak. If a message is not specified, the name of the hipchat_msg resource is used.
- notify - toggles whether or not users in the room should be notified by this message (defaults to true)
- color - sets the color of the message in HipChat. Supported colors include: yellow, red, green, purple, or random (defaults to yellow)
- failure_ok - toggles whether or not to catch the exception if an error is encountered connecting to HipChat (defaults to true)
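Usage might look like this; the token is a placeholder:

```ruby
hipchat_msg 'deploy complete' do
  room 'Operations'
  token 'YOUR_HIPCHAT_TOKEN'
  nickname 'Chef'
  message 'myapp deployed successfully'
  color 'green'
end
```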
You can find this cookbook on github or on the Opscode community site.
Since Opscode already provides an aws cookbook with providers for other AWS resources, I figured it would be worthwhile to create a provider for manipulating EC2 resource tags and contribute it back upstream.
The result of this Saturday project is the resource_tag LWRP. Source available here, Opscode ticket here.
Actions:

- add - Add tags to a resource.
- update - Add or modify existing tags on a resource; this is the default action.
- remove - Remove tags from a resource, but only if the specified values match the existing ones.
- force_remove - Remove tags from a resource, regardless of their values.

Attributes:

- aws_secret_access_key, aws_access_key - passed to Opscode::Aws::Ec2 to authenticate; required.
- tags - a hash of key value pairs to be used as resource tags (e.g. { "Name" => "foo", "Environment" => node.chef_environment }); required.
- resource_id - resources whose tags will be modified. The value may be a single ID as a string or multiple IDs in an array. If no resource_id is specified, the name attribute will be used.

resource_tag can be used to manipulate the tags assigned to one or more AWS resources, i.e. EC2 instances, EBS volumes or EBS volume snapshots.
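A sketch of resource_tag in a recipe; the credential lookup via a data bag is illustrative:

```ruby
aws = data_bag_item('aws', 'main')

aws_resource_tag node['ec2']['instance_id'] do
  aws_access_key aws['aws_access_key_id']
  aws_secret_access_key aws['aws_secret_access_key']
  tags('Name' => 'foo',
       'Environment' => node.chef_environment)
end
```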
When setting tags on the node’s own EC2 instance, I recommend wrapping resource_tag resources in a conditional like if node.has_key?('ec2') so that your recipe will still run on Chef nodes outside of EC2 as well.
Since the code I’ve been using to send messages from Chef recipes to Campfire is virtually identical between a number of our cookbooks, I decided to turn that code into a LWRP that anyone can use in their own recipes. The cookbook for this LWRP is available on github.
Requirements:

- tinder gem (installed by the campfire::default recipe)

Attributes:

- subdomain - the subdomain for your Campfire instance (required)
- room - the name of the room you would like to speak into (required)
- token - authentication token for your Campfire account (required)
- message - the message to speak. If a message is not specified, the name of the campfire_msg resource is used.
- paste - toggles whether or not to send the message as a monospaced “paste” (defaults to false)
- play_before - play the specified sound before speaking the message
- play_after - play the specified sound after speaking the message
- failure_ok - toggles whether or not to catch the exception if an error is encountered connecting to Campfire (defaults to true)

A list of emoji and sounds available in Campfire can be found here: http://www.emoji-cheat-sheet.com/
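Usage might look like this; the subdomain and token are placeholders:

```ruby
campfire_msg 'deploy finished' do
  subdomain 'example'
  room 'Ops'
  token 'YOUR_CAMPFIRE_TOKEN'
  play_after 'vuvuzela'
end
```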
Chef’s deploy and deploy_revision resources provide a useful mechanism for deploying applications as part of a chef-client or chef-solo run, without depending on an external system (e.g. Capistrano). Many Chef users learning to use these resources for the first time will find that they also need to install an SSH deploy key and an SSH wrapper script for Git before they can make effective use of these deploy resources, and that the Chef wiki doesn’t provide much documentation around this issue.
Enter deploy_wrapper: a Chef definition which handles the installation of an SSH deploy key and SSH wrapper script to be used by a deploy or deploy_revision resource.
Before deploy_wrapper, a recipe to configure the required resources to make an automated deploy or deploy_revision possible might look something like this:
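A condensed sketch of that pattern; the paths, template sources and attributes are illustrative:

```ruby
directory '/opt/myapp/ssh' do
  owner 'myapp'
  recursive true
end

template '/opt/myapp/ssh/myapp_deploy_key' do
  source 'deploy_key.erb'
  owner 'myapp'
  mode '0600'
  variables(:key => node['myapp']['deploy_key'])
end

template '/opt/myapp/ssh/myapp_deploy_wrapper.sh' do
  source 'deploy_wrapper.sh.erb'
  owner 'myapp'
  mode '0755'
end

deploy_revision '/opt/myapp' do
  repo 'git@github.com:example/myapp.git'
  user 'myapp'
  ssh_wrapper '/opt/myapp/ssh/myapp_deploy_wrapper.sh'
end
```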
Not counting the source to template files for these resources, that’s almost 30 lines of code just to set the stage for a deployment. It didn’t take long for me to grow tired of reusing this rather verbose pattern across a growing number of recipes.
Here’s how I accomplish the same thing with the deploy_wrapper definition:
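Sketched with the parameters discussed below; the paths and attribute lookups are illustrative:

```ruby
deploy_wrapper 'myapp' do
  ssh_key_dir '/opt/myapp/ssh'
  ssh_wrapper_dir '/opt/myapp/bin'
  ssh_key_data node['myapp']['deploy_key']
  sloppy true
end

deploy_revision '/opt/myapp' do
  repo 'git@github.com:example/myapp.git'
  user 'myapp'
  ssh_wrapper '/opt/myapp/bin/myapp_deploy_wrapper.sh'
end
```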
Much better, right? Well, a lot shorter anyway. Now let’s talk about what the deploy_wrapper parameters used in the above example are doing.
The ssh_key_dir and ssh_wrapper_dir parameters specify directories which will be created by Chef. In the case of ssh_wrapper_dir, the git SSH wrapper script will automatically be created in this directory following the pattern “APPNAME_deploy_wrapper.sh”, using the value of the name parameter (in this case, myapp) in place of “APPNAME”.
Similarly, an SSH key file containing the data passed to the ssh_key_data parameter will be created in the directory specified as the value for the ssh_key_dir parameter. The key file will be named following the pattern “APPNAME_deploy_key”, using the value of the name parameter (myapp) in place of “APPNAME”.
The sloppy parameter is the only optional one. Because the default configuration of most SSH installations is to require manual verification when accepting a remote host’s key for the first time, the sloppy parameter allows one to toggle host key checking (StrictHostKeyChecking) on or off.
When the value for sloppy is true, the wrapper script will accept any host key without prompting. The default value for sloppy is false, meaning that additional Chef resources, or … *gasp* … manual intervention, will be required in order to set up a known_hosts file before deployments can run successfully.
Swedish firm Pingdom offers a flexible, affordable service for monitoring the availability and response time of web sites, applications and other services. At Needle we provision an instance of our chat server for each partner we work with, and as a result I’ve found myself creating a Pingdom service check to monitor each of these instances. As you might imagine, this is a rather repetitive task, and the configuration is basically the same for each service check – a process ripe for automation!
Thankfully Pingdom provides a REST API for interacting with the service programmatically, which has made it possible for me to write a Chef LWRP for creating and modifying Pingdom service checks. Source available here: http://github.com/cwjohnston/chef-pingdom
Requires Chef 0.7.10 or higher for Lightweight Resource and Provider support. Chef 0.10+ is recommended as this cookbook has not been tested with earlier versions.
A valid username, password and API key for your Pingdom account are required.
This cookbook provides an empty default recipe which installs the required json gem (version <= 1.6.1). Chef already requires this gem, so it’s really just included in the interests of completeness.
This cookbook provides the Opscode::Pingdom::Check library module which is required by all the check providers.
This cookbook provides a single resource (pingdom_check) and corresponding provider for managing Pingdom service checks.
pingdom_check resources support the actions add and delete, add being the default. Each pingdom_check resource requires the following resource attributes:
- host - indicates the hostname (or IP address) which the service check will target
- api_key - a valid API key for your Pingdom account
- username - your Pingdom username
- password - your Pingdom password

pingdom_check resources may also specify values for the optional type and check_params attributes.
The type attribute will accept any of the service check types supported by the Pingdom API. If no value is specified, the check type will default to http.
The optional check_params attribute is expected to be a hash containing key/value pairs which match the type-specific parameters defined by the Pingdom API. If no attributes are provided for check_params, the type-specific default values will be used.
In order to utilize this cookbook, put the following at the top of the recipe where Pingdom resources are used:
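In a recipe:

```ruby
include_recipe 'pingdom'
```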
The following resource would configure an HTTP service check for the host foo.example.com:
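A sketch; the credential attribute lookups are illustrative:

```ruby
pingdom_check 'foo.example.com http check' do
  host 'foo.example.com'
  api_key node['pingdom']['api_key']
  username node['pingdom']['username']
  password node['pingdom']['password']
end
```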
The resulting HTTP service check would be created using all the Pingdom defaults for HTTP service checks.
The following resource would configure an HTTP service check for the host bar.example.com utilizing some of the parameters specific to the HTTP service check type:
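A sketch using a few HTTP-type parameters; the parameter names follow Pingdom's HTTP check documentation, and the values are illustrative:

```ruby
pingdom_check 'bar.example.com http check' do
  host 'bar.example.com'
  api_key node['pingdom']['api_key']
  username node['pingdom']['username']
  password node['pingdom']['password']
  type 'http'
  check_params :url => '/health',
               :shouldcontain => 'OK',
               :resolution => 1
end
```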
At this time I consider the LWRP to be incomplete. The major gaps and planned improvements are as follows:

- Modifying an existing check’s check_params does not actually update the service check’s configuration. I have done most of the initial work to implement this (available in the check-updating branch on github), but there are still bugs.
- An update action for service checks which modifies existing checks to match the values from check_params
- enable and disable actions for service checks
- Contact management (a pingdom_contact resource)
- Coercion of TrueClass attribute values to "true" strings in check_params values
- Support for contactids in check_params
Recently I have been experimenting with the logging-as-a-service platform at Loggly. It seems pretty promising, and there’s a free tier for those who are indexing less than 200MB per day.
Since I am using Chef to manage my systems, I decided I would take a crack at writing a LWRP that would allow me to manage devices and inputs on my Loggly account through Chef. This makes it possible for new nodes to register themselves as Loggly devices when they are provisioned, without requiring me to make a trip to the Loggly control panel. The resulting cookbook is available here: http://github.com/cwjohnston/chef-loggly
Requirements:

- json ruby gem

Attributes:

- node['loggly']['username'] - Your Loggly username.
- node['loggly']['password'] - Your Loggly password.

In the future these attributes should be made optional so that usernames and passwords can be specified as parameters for resource attributes.
Recipes:

- default - simply installs the json gem. Chef requires this gem as well, so it should already be available.
- rsyslog - creates a loggly input for receiving syslog messages, registers the node as a device on that input and configures rsyslog to forward syslog messages there.

loggly_input - manage a log input

Attributes:

- domain - The subdomain for your loggly account
- description - An optional descriptor for the input
- type - The kind of input to create. May be one of the following: http, syslogudp, syslogtcp, syslog_tls, syslogtcp_strip, syslogudp_strip

Actions:

- create - create the named input (default)
- delete - delete the named input
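Usage might look like this; the domain and description are placeholders:

```ruby
loggly_input 'syslog' do
  domain 'example'
  type 'syslogtcp'
  description 'syslog messages from production hosts'
end
```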
loggly_device - manage a device which sends logs to an input

The name of a loggly_device resource should be the IP address for the device. Loggly doesn’t do DNS lookups, it just wants the device’s IP.
Attributes:

- username - Your Loggly username. If no value is provided for this attribute, the value of node['loggly']['username'] will be used.
- password - Your Loggly password. If no value is provided for this attribute, the value of node['loggly']['password'] will be used.
- domain - The subdomain for your loggly account
- input - the name of the input this device should be added to

Actions:

- add - add the device to the named input (default)
- delete - remove the device from the named input
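Usage might look like this; the domain is a placeholder:

```ruby
loggly_device node['ipaddress'] do
  domain 'example'
  input 'syslog'
  action :add
end
```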