<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<title type="text">Kartar.Net</title>
<generator uri="https://github.com/mojombo/jekyll">Jekyll</generator>
<link rel="self" type="application/atom+xml" href="http://kartar.net/feed.xml" />
<link rel="alternate" type="text/html" href="http://kartar.net/" />
<updated>2015-08-17T16:18:06-04:00</updated>
<id>http://kartar.net/</id>
<author>
  <name>James Turnbull</name>
  <uri>http://kartar.net/</uri>
  <email>james@lovedthanlost.net</email>
</author>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Data]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---data/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---data</id>
  <published>2015-08-12T00:00:00-04:00</published>
  <updated>2015-08-12T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;Over the course of the series I’ve talked about &lt;a href=&quot;/2015/08/monitoring-survey-2015--effectiveness&quot;&gt;monitoring effectiveness&lt;/a&gt;, &lt;a href=&quot;/2015/08/monitoring-survey-2015---environments&quot;&gt;monitoring environments&lt;/a&gt;,
&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;metrics&lt;/a&gt;, &lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;the tools people use to monitor&lt;/a&gt; and &lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;the demographics&lt;/a&gt; of the survey.&lt;/p&gt;

&lt;p&gt;In this last post I am providing the &lt;a href=&quot;/data/Monitoring_Survey_2015.csv&quot;&gt;anonymized source data&lt;/a&gt; that I based my analysis on.
It’s in CSV form and comes directly from SurveyMonkey. The only thing I
have removed is the respondents’ IP addresses, to keep the data anonymous.&lt;/p&gt;

&lt;p&gt;For those interested, I used &lt;a href=&quot;http://www.r-project.org/&quot;&gt;R&lt;/a&gt; and &lt;a href=&quot;http://www.rstudio.com/&quot;&gt;RStudio&lt;/a&gt; to produce the analysis and
&lt;a href=&quot;http://ggplot2.org/&quot;&gt;ggplot2&lt;/a&gt; to produce the graphs.&lt;/p&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---data/&quot;&gt;Monitoring Survey 2015 - Data&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 12, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Effectiveness]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---effectiveness/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---effectiveness</id>
  <published>2015-08-11T00:00:00-04:00</published>
  <updated>2015-08-11T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In the last posts I talked about &lt;a href=&quot;/2015/08/monitoring-survey-2015---environments&quot;&gt;monitoring environments&lt;/a&gt;,
&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;metrics&lt;/a&gt;, &lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;the tools people used in monitoring&lt;/a&gt; and &lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;the demographics&lt;/a&gt; of the survey.&lt;/p&gt;

&lt;p&gt;In this post I am going to look at the questions around the
effectiveness of monitoring, how people handle alerting and the use of
configuration management software.&lt;/p&gt;

&lt;p&gt;As I’ve mentioned in previous posts, the survey got 1,116 responses of
which 884 were complete and my analysis only includes complete
responses.&lt;/p&gt;

&lt;p&gt;This post will cover the questions:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;12. When do you most commonly add monitoring checks or graphs to your environment?
13. Do you ever have unanswered alerts in your monitoring environment?
14. How often does something go wrong that IS NOT detected by your monitoring?
15. Do you use a configuration management tool like Chef, Puppet, Salt or Ansible to manage your monitoring infrastructure?
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;when-do-you-add-monitoring-checks-or-graphs-to-your-environment&quot;&gt;When do you add monitoring checks or graphs to your environment?&lt;/h2&gt;

&lt;p&gt;Question 12 attempts to identify when in the product and infrastructure lifecycle you add monitoring checks to your environment. This is designed to tease out whether your monitoring is proactive or reactive.&lt;/p&gt;

&lt;p&gt;The question had the following choices:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When something goes wrong and we want to monitor for that problem in future.&lt;/li&gt;
  &lt;li&gt;When we build new infrastructure or deploy new applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve provided a graph showing the distribution of answers.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/whencheck.png&quot; alt=&quot;When checks are added&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that most people, 62.7% of them, add checks when infrastructure or applications are deployed, with the remaining 37% adding checks reactively, after something has gone wrong. That’s largely unchanged from &lt;a href=&quot;http://kartar.net/2014/12/monitoring-survey---environments/&quot;&gt;last year’s response&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I’ve also broken the answers down by organization size.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/whencheckorg.png&quot; alt=&quot;When checks are added by size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that very small and very large organizations are slightly more reactive.&lt;/p&gt;

&lt;h2 id=&quot;do-you-ever-have-unanswered-alerts-in-your-monitoring-environment&quot;&gt;Do you ever have unanswered alerts in your monitoring environment?&lt;/h2&gt;

&lt;p&gt;In Question 13 we’re interested in the measurement of alerting hygiene
and how people respond to alerts. I was interested in seeing how many
people had outstanding alerts and how many actioned them immediately.&lt;/p&gt;

&lt;p&gt;Each respondent had the option to answer the question with:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;No - we action them all immediately&lt;/li&gt;
  &lt;li&gt;Yes - we usually have a few&lt;/li&gt;
  &lt;li&gt;Yes - we usually have some&lt;/li&gt;
  &lt;li&gt;Yes - we usually have a lot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve provided a graph showing the distribution of answers.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/alert.png&quot; alt=&quot;Alert Behavior&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that the largest group of respondents, 401 or 45%, usually have a few unanswered alerts. This is identical to last year’s results for this category. The next largest group, 196 or 22% of respondents,
action all alerts immediately. A further 19% have some unanswered alerts and 13% have a lot of unanswered alerts.&lt;/p&gt;

&lt;p&gt;I also broke down alert behavior by organization size.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/alertorg.png&quot; alt=&quot;Alert Behavior by Org Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This year the patterns in this breakdown again felt very familiar. Like last year, as organizations grow, fewer alerts are actioned immediately and the volume of alerts left unactioned increases.&lt;/p&gt;

&lt;p&gt;I was also planning to add a question about alert fatigue in this year’s survey but was unable to frame one that provided viable data.&lt;/p&gt;

&lt;h2 id=&quot;how-often-does-something-go-wrong-that-is-not-detected-by-your-monitoring&quot;&gt;How often does something go wrong that IS NOT detected by your monitoring?&lt;/h2&gt;

&lt;p&gt;Question 14 asked about outages and failures in environments that are
NOT detected via monitoring. The respondents had the option of
answering:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Frequently&lt;/li&gt;
  &lt;li&gt;Occasionally&lt;/li&gt;
  &lt;li&gt;Never&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve graphed the responses here:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/monfail.png&quot; alt=&quot;Monitoring Failure&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that 81% of respondents had something occasionally go wrong
that wasn’t detected by monitoring. 11% stated that failures frequently occurred that were not detected by monitoring. 8% stated that there were never undetected
failures in their environments. This is very close to last year’s results.&lt;/p&gt;

&lt;p&gt;I further analyzed the response by organization size.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/monfailorg.png&quot; alt=&quot;Monitoring Failures by Org Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Again we see some familiar patterns, with more frequent undetected
failures in larger organizations.&lt;/p&gt;

&lt;h2 id=&quot;do-you-use-a-configuration-management-tool&quot;&gt;Do you use a configuration management tool?&lt;/h2&gt;

&lt;p&gt;The last question, Question 15, asked respondents if they used a
configuration management tool to manage their monitoring environment.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/cm.png&quot; alt=&quot;Use of Configuration Management&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This year 71.7% of respondents used configuration management to manage
their monitoring, which is in line with last year’s results.&lt;/p&gt;

&lt;p&gt;Three respondents, or 0.3%, did not know what configuration management was.&lt;/p&gt;

&lt;p&gt;I also analyzed the responses by organization size.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/cmorg.png&quot; alt=&quot;Use of Configuration Management by Org Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Again this year we see less use of configuration management in larger organizations.&lt;/p&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Monitoring Survey 2015 - Effectiveness&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 11, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Metrics]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---metrics/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---metrics</id>
  <published>2015-08-10T00:00:00-04:00</published>
  <updated>2015-08-10T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;style&gt;
#otable {
    font-family: &quot;Trebuchet MS&quot;, Arial, Helvetica, sans-serif;
    width: 100%;
    border-collapse: collapse;
}

#otable td, #otable th {
    font-size: 1em;
    border: 1px solid #98bf21;
    padding: 3px 7px 2px 7px;
}

#otable th {
    font-size: 1.1em;
    text-align: left;
    padding-top: 5px;
    padding-bottom: 4px;
    background-color: #A7C942;
    color: #ffffff;
}

#otable tr.alt td {
    color: #000000;
    background-color: #EAF2D3;
}
&lt;/style&gt;

&lt;p&gt;In the last posts I talked about &lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;the tools people used in monitoring&lt;/a&gt;, &lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;the demographics&lt;/a&gt;, and what &lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;environments people monitor&lt;/a&gt;.
In this post I am going to look at the questions around collecting
metrics and what those metrics are used for by respondents.&lt;/p&gt;

&lt;p&gt;As I’ve mentioned in previous posts, the survey got 1,116 responses of which 884 were complete.&lt;/p&gt;

&lt;p&gt;This post will cover the questions:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;7. Do you collect metrics on your infrastructure and applications?
8. What tools do you use to collect metrics?
9. What tools do you use to store your metrics?
10. What tools do you use to visualize your metrics?
11. If you collect metrics, what do you use the metrics you track for?
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;collecting-metrics&quot;&gt;Collecting Metrics&lt;/h2&gt;

&lt;p&gt;Question 7 asked if the respondents collected metrics. It was a Yes/No question.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/metcoll.png&quot; alt=&quot;Metrics Collection&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that the overwhelming majority, 88% in fact, of respondents collect metrics (slightly down from 90% &lt;a href=&quot;http://kartar.net/2014/12/monitoring-survey---metrics/&quot;&gt;last year&lt;/a&gt;). That continues to be a pretty conclusive indication that metrics matter.&lt;/p&gt;

&lt;p&gt;I also broke the responses down by organization size. I was curious to see which sizes of organization were least likely to collect metrics.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/metorg.png&quot; alt=&quot;Metrics Collection by Organization Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that there is a pretty even distribution of people that do not collect metrics across organization sizes.&lt;/p&gt;

&lt;h2 id=&quot;metric-collection-tools&quot;&gt;Metric collection tools&lt;/h2&gt;

&lt;p&gt;I also asked respondents to tell me about the tools they used to collect metrics. There was a choice of potential tools and an Other option. The choice of tools included:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;collectd&lt;/li&gt;
  &lt;li&gt;Cube&lt;/li&gt;
  &lt;li&gt;DataDog&lt;/li&gt;
  &lt;li&gt;Ganglia&lt;/li&gt;
  &lt;li&gt;Librato&lt;/li&gt;
  &lt;li&gt;Munin&lt;/li&gt;
  &lt;li&gt;New Relic&lt;/li&gt;
  &lt;li&gt;OpenTSDB&lt;/li&gt;
  &lt;li&gt;StatsD&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/metcoltool.png&quot; alt=&quot;Metrics Collection - Tools&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see that both collectd and StatsD are heavily used with New Relic coming in third, in keeping with the data revealed in &lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---tools/&quot;&gt;the tool analysis results&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The responses to the Other option were also interesting. I’ve only included tools that occurred more than once to keep the list manageable.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metrics collection tools - Other&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;In-house&lt;/td&gt;
      &lt;td&gt;77&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Diamond&lt;/td&gt;
      &lt;td&gt;26&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Sensu&lt;/td&gt;
      &lt;td&gt;23&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zabbix&lt;/td&gt;
      &lt;td&gt;19&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;ELK&lt;/td&gt;
      &lt;td&gt;17&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cacti&lt;/td&gt;
      &lt;td&gt;16&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Nagios&lt;/td&gt;
      &lt;td&gt;13&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Check_MK&lt;/td&gt;
      &lt;td&gt;13&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Centreon&lt;/td&gt;
      &lt;td&gt;11&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;pnp4nagios&lt;/td&gt;
      &lt;td&gt;9&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Splunk&lt;/td&gt;
      &lt;td&gt;9&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SolarWinds&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;AppDynamics&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Prometheus&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Icinga2&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;NetCrunch&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Shinken&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zenoss&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;jmxtrans&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;DropWizard&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Observium&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dataloop&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;OpenNMS&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Riemann&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Coda’s Metrics&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cloudwatch&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;OMD&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dynatrace&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Smokeping&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Graphite&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Stackdriver&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Xymon&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;CopperEgg&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Ganglia&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;LogicMonitor&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SignalFX&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The high number of respondents building their own metrics collection tools (77 reported having in-house tooling) is interesting. It potentially suggests that there is still a segment of the market that isn’t happy with the available tooling.&lt;/p&gt;

&lt;p&gt;Also interesting was the support for &lt;a href=&quot;https://github.com/python-diamond/Diamond&quot;&gt;Diamond&lt;/a&gt;, a Python-based metrics collection tool originally written by the &lt;a href=&quot;https://www.brightcove.com/en/&quot;&gt;Brightcove&lt;/a&gt; team and now maintained as a separate open source project.&lt;/p&gt;

&lt;h2 id=&quot;metric-storage-tools&quot;&gt;Metric storage tools&lt;/h2&gt;

&lt;p&gt;We also asked respondents to name the tools they used to store metrics. The options for the question included:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;DataDog&lt;/li&gt;
  &lt;li&gt;Graphite&lt;/li&gt;
  &lt;li&gt;Hosted Graphite&lt;/li&gt;
  &lt;li&gt;InfluxDB&lt;/li&gt;
  &lt;li&gt;Librato&lt;/li&gt;
  &lt;li&gt;OpenTSDB&lt;/li&gt;
  &lt;li&gt;RRDtool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There was also an Other option we’ll report below.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/metstotool.png&quot; alt=&quot;Metrics Storage - Tools&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The clear winner here is &lt;a href=&quot;http://graphite.wikidot.com/&quot;&gt;Graphite&lt;/a&gt;. As one of the longer-standing tools in the metrics space, it’s not overly surprising that it is so well represented. Also present in large numbers is &lt;a href=&quot;http://oss.oetiker.ch/rrdtool/&quot;&gt;RRDtool&lt;/a&gt;, an even older tool in the metrics space. The newer generation of tools is represented by &lt;a href=&quot;https://influxdb.com/&quot;&gt;InfluxDB&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These are the responses to the Other option. I’ve only included tools that occurred more than once to keep the list manageable.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metrics storage tools - Other&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;ELK&lt;/td&gt;
      &lt;td&gt;28&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;In-house&lt;/td&gt;
      &lt;td&gt;27&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Splunk&lt;/td&gt;
      &lt;td&gt;14&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zabbix&lt;/td&gt;
      &lt;td&gt;14&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;New Relic&lt;/td&gt;
      &lt;td&gt;9&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;MySQL&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Prometheus&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cacti&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SignalFX&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;AppDynamics&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;NetCrunch&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dataloop&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SolarWinds&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Stackdriver&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zenoss&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cassandra&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;CopperEgg&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;MSSQL&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Ganglia&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;postgreSQL&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Circonus&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;LogicMonitor&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Check_MK&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;pnp4nagios&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SPM&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;OpenNMS&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;kairosdb&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Xymon&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Redis&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Interesting to note here is the number of people using the ELK stack and in-house tools to store their metric data. I’ve been seeing a lot of tools and services converting data and metrics into Logstash’s JSON format, then using Logstash as a filtering router and Elasticsearch as storage.&lt;/p&gt;

&lt;h2 id=&quot;metric-visualization-tools&quot;&gt;Metric visualization tools&lt;/h2&gt;

&lt;p&gt;Our last question focused on metrics visualization tools.&lt;/p&gt;

&lt;p&gt;Respondents had a choice of the following tools:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;D3&lt;/li&gt;
  &lt;li&gt;Grafana&lt;/li&gt;
  &lt;li&gt;Graphene&lt;/li&gt;
  &lt;li&gt;Graphite&lt;/li&gt;
  &lt;li&gt;Highcharts&lt;/li&gt;
  &lt;li&gt;Rickshaw&lt;/li&gt;
  &lt;li&gt;Tessera&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Respondents could also select an Other option and specify other tools.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/metvistool.png&quot; alt=&quot;Metrics Visualization - Tools&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here &lt;a href=&quot;http://grafana.org/&quot;&gt;Grafana&lt;/a&gt; is a clear favorite, likely given its ability to sit on top of Graphite, InfluxDB and OpenTSDB. The next most popular tool was Graphite itself and then, after a long drop-off, the &lt;a href=&quot;http://d3js.org/&quot;&gt;D3 JavaScript framework&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These are the responses to the Other option. I’ve only included tools that occurred more than once to keep the list manageable.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metrics visualization tools - Other&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;In-house&lt;/td&gt;
      &lt;td&gt;54&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;ELK&lt;/td&gt;
      &lt;td&gt;35&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;pnp4nagios&lt;/td&gt;
      &lt;td&gt;27&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;DataDog&lt;/td&gt;
      &lt;td&gt;24&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cacti&lt;/td&gt;
      &lt;td&gt;22&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zabbix&lt;/td&gt;
      &lt;td&gt;17&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Splunk&lt;/td&gt;
      &lt;td&gt;13&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Munin&lt;/td&gt;
      &lt;td&gt;13&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;New Relic&lt;/td&gt;
      &lt;td&gt;10&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Ganglia&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Observium&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Librato&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;NetCrunch&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Centreon&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;AppDynamics&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SolarWinds&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dataloop&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RRDTool&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dashing&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;OpenNMS&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SignalFX&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Stackdriver&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Promdash&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Check_MK&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;MRTG&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;pnp&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Nagios&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Circonus&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Graphite&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Tableau&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;CopperEgg&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Xymon&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Metrilyx&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Riemann&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Zenoss&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;LogicMonitor&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;SPM&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Nagiosgraph&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;OpenTSDB&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;StatusWolf&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Visage&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Again, a lot of in-house tools are present, as is the ELK stack in the form of Kibana. Given the large number of Nagios users, it’s also no surprise to see &lt;a href=&quot;https://docs.pnp4nagios.org/start&quot;&gt;pnp4nagios&lt;/a&gt; represented.&lt;/p&gt;

&lt;h2 id=&quot;the-purpose-of-metrics-collection&quot;&gt;The purpose of metrics collection&lt;/h2&gt;

&lt;p&gt;I also asked respondents why they collected metrics. As with last year I was curious whether respondents were collecting data for performance analysis or as a fault detection tool. There’s a strong movement in more modern monitoring methodologies to consider metrics a fault detection tool in their own right. I was interested to see if this thinking had grown from last year.&lt;/p&gt;

&lt;p&gt;Respondents were able to select one or more choices from the following list:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Performance analysis and trending&lt;/li&gt;
  &lt;li&gt;Fault and Anomaly detection&lt;/li&gt;
  &lt;li&gt;Capacity Planning&lt;/li&gt;
  &lt;li&gt;A/B Testing&lt;/li&gt;
  &lt;li&gt;We don’t do anything with collected metrics&lt;/li&gt;
  &lt;li&gt;Other&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If respondents answered “No” to the previous question, indicating that they did not collect
metrics, the survey logic skipped them to the next
question.&lt;/p&gt;

&lt;p&gt;I’ve produced a summary table of respondents and their selections.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metrics Purpose&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Performance analysis and trending&lt;/td&gt;
      &lt;td&gt;63%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Fault and Anomaly detection&lt;/td&gt;
      &lt;td&gt;53%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Capacity Planning&lt;/td&gt;
      &lt;td&gt;45%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;A/B Testing&lt;/td&gt;
      &lt;td&gt;11%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;We don’t do anything with collected metrics&lt;/td&gt;
      &lt;td&gt;3%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;We can see that 63% of respondents specified performance analysis and trending as a reason for collecting metrics. Below that, 53% of respondents specified
that they used metrics for fault and anomaly detection. This is &lt;a href=&quot;http://kartar.net/2014/12/monitoring-survey---metrics/&quot;&gt;10% lower than last year’s survey&lt;/a&gt;. The next largest group, 45%, used metrics for capacity planning.&lt;/p&gt;

&lt;p&gt;A very small group, 11%, used metrics for A/B testing.&lt;/p&gt;

&lt;p&gt;I also summarized the Other responses as a table.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metrics Purpose - Other&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Reporting&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Dashboards&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Alerting&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Business KPIs&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Slow call traces&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Marketing&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Retrospectives&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Power management&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Fault diagnosis&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Incident response&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Billing&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Monitoring Survey 2015 - Metrics&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 10, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Environments]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---environments/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---environments</id>
  <published>2015-08-07T00:00:00-04:00</published>
  <updated>2015-08-07T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In the previous posts I’ve talked about &lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;the tools people used in monitoring&lt;/a&gt; and &lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;the demographics&lt;/a&gt; of the survey.&lt;/p&gt;

&lt;p&gt;In this post I am going to look at the question around what parts of
people’s environments are monitored. As I’ve mentioned in previous posts, the survey got 1,116 responses of which 884 were complete.&lt;/p&gt;

&lt;p&gt;This post will cover the question:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;6. What parts of your environment do you monitor? Please select all that apply.
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;what-parts-of-your-environment-do-you-monitor&quot;&gt;What parts of your environment do you monitor?&lt;/h2&gt;

&lt;style&gt;
#otable {
    font-family: &quot;Trebuchet MS&quot;, Arial, Helvetica, sans-serif;
    width: 100%;
    border-collapse: collapse;
}

#otable td, #otable th {
    font-size: 1em;
    border: 1px solid #98bf21;
    padding: 3px 7px 2px 7px;
}

#otable th {
    font-size: 1.1em;
    text-align: left;
    padding-top: 5px;
    padding-bottom: 4px;
    background-color: #A7C942;
    color: #ffffff;
}

#otable tr.alt td {
    color: #000000;
    background-color: #EAF2D3;
}
&lt;/style&gt;

&lt;p&gt;With Question 6 I am most interested in understanding what types of
infrastructure are monitored, especially in areas beyond traditional host-based monitoring. As with last year I asked about network and application monitoring. I also introduced a new category called &lt;code&gt;Cloud Infrastructure&lt;/code&gt; in response to feedback on this question.&lt;/p&gt;

&lt;p&gt;Overall, I divided monitoring types into:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Server Infrastructure&lt;/li&gt;
  &lt;li&gt;Cloud Infrastructure&lt;/li&gt;
  &lt;li&gt;Network Infrastructure&lt;/li&gt;
  &lt;li&gt;Application logic&lt;/li&gt;
  &lt;li&gt;Business logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve compiled the results into a summary table.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Environments Monitored&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Server Infrastructure&lt;/td&gt;
      &lt;td&gt;81%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Cloud Infrastructure&lt;/td&gt;
      &lt;td&gt;49%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Network Infrastructure&lt;/td&gt;
      &lt;td&gt;57%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Application logic&lt;/td&gt;
      &lt;td&gt;59%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Business logic&lt;/td&gt;
      &lt;td&gt;29%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;81% of respondents perform Server Infrastructure
monitoring. 49% monitor Cloud Infrastructure. That half the respondents have cloud infrastructure to monitor is likely a consequence of selection bias in the respondent pool.&lt;/p&gt;

&lt;p&gt;A smaller group, 57% of respondents, monitor Network
Infrastructure. This fits with the results from last year, where I had expected more network monitoring. I posited that this may be related to the siloing of network management in many organizations into a network-specific team, or be a selection bias.&lt;/p&gt;

&lt;p&gt;A slightly smaller group than &lt;a href=&quot;http://kartar.net/2014/12/monitoring-survey---environments/&quot;&gt;last year&lt;/a&gt; perform Application and Business logic monitoring, with 59% and 29% respectively.&lt;/p&gt;

&lt;p&gt;This year I also added an “Other” category to cover other environments or elements that people might monitor.&lt;/p&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Environments Monitored - Other&lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Application&lt;/td&gt;
      &lt;td&gt;12&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Database&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Plant and Physical site&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Workstations&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Backups&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;External services&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;I’m assuming &lt;code&gt;Application&lt;/code&gt; here and &lt;code&gt;Application logic&lt;/code&gt; above are related. A smaller group also considered database monitoring as a separate category.&lt;/p&gt;

&lt;p&gt;In the next post I’ll be looking at metrics and their use in monitoring.&lt;/p&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---environments/&quot;&gt;Monitoring Survey 2015 - Environments&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 07, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Tools]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---tools/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---tools</id>
  <published>2015-08-06T00:00:00-04:00</published>
  <updated>2015-08-06T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In this series I am looking at the results of my recent &lt;a href=&quot;/2015/08/monitoring-survey---background/&quot;&gt;monitoring survey&lt;/a&gt; and specifically the monitoring tools being used by respondents. As I’ve mentioned in previous posts, the survey got 1,116 responses of which 884 were complete.&lt;/p&gt;

&lt;p&gt;This post will cover the question:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;5. What tools do you use for monitoring? (Choose all that apply)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every respondent was required to answer question five. Last year I asked about primary tools and forced respondents to select a single “primary” tool. Feedback indicated that this artificially constrained respondents and many people struggled to select a single tool. This year I allowed respondents to select all tools that they used for monitoring.&lt;/p&gt;

&lt;h2 id=&quot;monitoring-tools&quot;&gt;Monitoring Tools&lt;/h2&gt;

&lt;p&gt;This graph shows the monitoring tools selected. The clear winner this year is again &lt;a href=&quot;https://www.nagios.org/&quot;&gt;Nagios&lt;/a&gt;. This suggests that advances in monitoring approaches are still potentially embryonic and evolutionary rather than revolutionary.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/ptools.png&quot; alt=&quot;Monitoring Tools&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Interestingly, however, the next two most popular choices are AWS CloudWatch and New Relic. This data suggests a few interesting potential trends:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We’re starting to see more SaaS-based monitoring.&lt;/li&gt;
  &lt;li&gt;People are potentially using New Relic and the like as an Application Performance Management or APM tool in conjunction with other tools.&lt;/li&gt;
  &lt;li&gt;With CloudWatch it is also possible that companies that use AWS hosting are using this to supplement and feed into existing monitoring.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also interesting is that the number of &lt;a href=&quot;https://sensuapp.org/&quot;&gt;Sensu&lt;/a&gt; users has doubled from last year. That’s fairly rapid growth but is only half the usage of Nagios.&lt;/p&gt;

&lt;p&gt;There’s also a large number of home-grown tools: 230 people responded that they have a home-grown tool. That’s six times the number who indicated they had a home-grown tool last year. As a result I’ll be adding a question or questions to attempt to unpack that in next year’s survey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: You can find last year’s results in &lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---tools/&quot;&gt;this blog post&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;other-tools&quot;&gt;Other Tools&lt;/h2&gt;

&lt;p&gt;This is the breakdown of the &lt;code&gt;Other&lt;/code&gt; category. 363 respondents specified other tools not specifically listed in Question 5. This table shows the summary listing of all other tools specified.&lt;/p&gt;

&lt;style&gt;
#otable {
    font-family: &quot;Trebuchet MS&quot;, Arial, Helvetica, sans-serif;
    width: 100%;
    border-collapse: collapse;
}

#otable td, #otable th {
    font-size: 1em;
    border: 1px solid #98bf21;
    padding: 3px 7px 2px 7px;
}

#otable th {
    font-size: 1.1em;
    text-align: left;
    padding-top: 5px;
    padding-bottom: 4px;
    background-color: #A7C942;
    color: #ffffff;
}

#otable tr.alt td {
    color: #000000;
    background-color: #EAF2D3;
}
&lt;/style&gt;

&lt;table id=&quot;otable&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Other tools&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;Count&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;SolarWinds&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;23&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Pingdom&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;21&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Check_MK&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ELK&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Shinken&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;18&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Munin&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;16&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Splunk&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;13&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Graphite&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;13&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;AppDynamics&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Cacti&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;PRTG&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;HP&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Consul&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Monit&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;LogicMonitor&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;OpenNMS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;NetCrunch&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Prometheus&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Dynatrace&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;8&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Observium&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;OMD&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;7&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Nimsoft&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Collectd&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Ganglia&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Circonus&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Dataloop&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Op5&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Scout&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Sentry&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;SignalFX&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Grafana&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Rackspace Cloud Monitoring&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Stackdriver&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Hyperic&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Statsd&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;NodePing&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;CopperEgg&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Monitis&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;What’s Up&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;PagerDuty&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Smokeping&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Netcool&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;MongoDB Manager Service&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;OpenTSDB&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;naemon&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;SPM&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Bosun&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Flapjack&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Kibana&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;BMC&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;CA&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;StackDriver&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;New Relic&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Loggly&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Tivoli&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Azure AppInsights&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;PCP/Vector&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Uptime robot&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Sysdig Cloud&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Graphite-beacon&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Opsware&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Alerta&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;graphite-pager&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ScienceLogic EM7&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Netuitive&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ServerSpec&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Seyren&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Sitescope&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;NMSaaS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Intermapper&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;SNMP&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Opsmatic&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Logwatch&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Monasco&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Big Brother&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Wavefront&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Boundary&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;locust&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Server density&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Elasticsearch&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;torrus&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;LibreNMS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Metrics.net&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Fluentd&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ITRS Geneos&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Argo&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;uptrends&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Livewatch&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;vRealize Operations&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;MonYog&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Jennifer&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Icinga2&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;mon&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Pulseway&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;diamond&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Moogsoft&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;CloudMonix&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;www.cronalarm.com&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;VividCortex&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Runscope&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Rancid&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Catchpoint&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Truk&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Kubernetes&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Gomez/Compuware&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Nagios&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;COTS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Graylog&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Tensor&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Elastic Watcher&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ruxit&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;SumoLogic&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Neustar&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Traverse&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Squash&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;The Dude&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;RightScale&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Geckoboard&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Pandora&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;VeeamOne&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;StatusWolf&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Keymetrics.io for NodeJS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Logsene&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;WebNMS&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Cloudhealth&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;A lot of the use cases here appear to be more domain-specific monitoring: network specific tools like SolarWinds and Pingdom or log management tools like the ELK stack and Splunk.&lt;/p&gt;

&lt;p&gt;Newcomer &lt;a href=&quot;http://prometheus.io/&quot;&gt;Prometheus&lt;/a&gt; also appeared with 8 respondents stating they used it.&lt;/p&gt;

&lt;p&gt;Finally it was also interesting to see 9 respondents report using &lt;a href=&quot;https://www.consul.io/&quot;&gt;Consul&lt;/a&gt; for monitoring. There was a strong negative reaction to &lt;a href=&quot;https://vividcortex.com/blog/2015/05/22/consul-for-cluster-health-monitoring/&quot;&gt;a recent post&lt;/a&gt; suggesting that approach.&lt;/p&gt;

&lt;p&gt;In the next post I’ll look at what environments people monitor.&lt;/p&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---tools/&quot;&gt;Monitoring Survey 2015 - Tools&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 06, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Demographics]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---demographics/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---demographics</id>
  <published>2015-08-05T00:00:00-04:00</published>
  <updated>2015-08-05T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In an earlier post I talked about the 2015 edition of the &lt;a href=&quot;/2015/7/monitoring-survey-2015-background/&quot;&gt;monitoring survey&lt;/a&gt; and the background
to
it. In this post, the first of several posts analyzing the results, I am
going to look at the demographics of the responses.&lt;/p&gt;

&lt;p&gt;The survey got 1,116 responses of which 884 were complete.&lt;/p&gt;

&lt;p&gt;This post will cover the questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Which of the following best describes your IT job role?&lt;/li&gt;
  &lt;li&gt;How big is your organization?&lt;/li&gt;
  &lt;li&gt;Are you responsible for IT monitoring in your organization?&lt;/li&gt;
  &lt;li&gt;If you are not responsible for monitoring, who is?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everyone was required to answer questions 1 to 3. If they answered
“No” to question 3 then they were prompted with question 4. If they
answered that they didn’t do any monitoring they were presented with the
end of the survey. Otherwise they moved onto the next question on the
form.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note - you can find last year’s answers to these questions &lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---demographics/&quot;&gt;in this blog post&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;job-roles&quot;&gt;Job Roles&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/roles.png&quot; alt=&quot;Roles&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Operations, SysAdmins and SRE staff represented 40% of the respondents, compared to 49% of last year’s respondents. The next largest group was DevOps at 28% of respondents, compared with 33% last year. A slightly higher percentage, 12%, of respondents reported themselves as developers, compared to 9.2% last year. This year 15% of respondents classed themselves as management of some kind, an increase of 11% from last year.&lt;/p&gt;

&lt;p&gt;As with last year’s results, the bias towards Operations roles is likely related to the communities where the survey was distributed. But it also may be related
to Operations being the traditional owners of monitoring.&lt;/p&gt;

&lt;h2 id=&quot;organization-size&quot;&gt;Organization Size&lt;/h2&gt;

&lt;p&gt;I also asked respondents about the size of their organization.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/compsize.png&quot; alt=&quot;Company Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The results are reasonably well distributed across organizations of
various sizes. The largest group, 31%, are small organizations of 1 to
50 employees. Closely behind this, at 21%, are slightly larger
organizations of 50 to 250 employees. In third place are
organizations with more than 1000 employees at 18%. This is very similar to &lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---demographics/&quot;&gt;last year’s demographic results&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;roles-by-organization-size&quot;&gt;Roles by Organization Size&lt;/h2&gt;

&lt;p&gt;I also created an overlay of roles distributed by organization size.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/respsize.png&quot; alt=&quot;Roles by Organization Size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The graph reveals results similar to last year with the same slightly higher
distribution of developers responding from smaller organizations and the
more visible presence of architects and security folks in larger
enterprises. We also see the influx of management respondents in the two largest categories of organization.&lt;/p&gt;

&lt;h2 id=&quot;monitoring-responsibility&quot;&gt;Monitoring Responsibility&lt;/h2&gt;

&lt;p&gt;I also asked respondents if they were responsible for monitoring or if
the task belonged to someone else.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/resp.png&quot; alt=&quot;Responsibility for Monitoring&quot; /&gt;&lt;/p&gt;

&lt;p&gt;81% of respondents were responsible for
monitoring. A further 17% of respondents were not responsible for
monitoring (slightly up from 15% last year). A small group, 1.6% of respondents, indicated that their organization did not do monitoring at all. This is slightly down from last year’s result of 2.5%.&lt;/p&gt;

&lt;p&gt;In the case where respondents were not responsible for monitoring I
asked them to indicate which groups were responsible. The respondents
could specify all the groups that were involved in monitoring. I’ve
rolled up the multiple responses into a summary graph.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/othresp.png&quot; alt=&quot;Other Responsibility for Monitoring&quot; /&gt;&lt;/p&gt;

&lt;p&gt;These results again reflect the distribution of roles established by
respondents who did manage monitoring. Strangely, last year’s category of Monitoring Team did not reappear this year.&lt;/p&gt;

&lt;p&gt;I’ve also broken out those people who don’t monitor. Firstly, I’ve looked at the breakdown of roles across organizations that do not monitor at all.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/rolesdont.png&quot; alt=&quot;Roles who don&#39;t monitor by size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I’ve also broken out the count of people by organization size who don’t monitor.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/7/sizedont.png&quot; alt=&quot;Count of respondents who don&#39;t monitor by size&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Obviously it’s a very small sample size (18 respondents) but the largest group of people who don’t monitor are in smaller organizations.&lt;/p&gt;

&lt;p&gt;In the next post I’ll be looking at the tools identified in the survey.&lt;/p&gt;

&lt;p&gt;P.S. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Monitoring Survey 2015 - Demographics&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 05, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015 - Background]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/08/monitoring-survey-2015---background/"/>
  <id>http://kartar.net/2015/08/monitoring-survey-2015---background</id>
  <published>2015-08-04T00:00:00-04:00</published>
  <updated>2015-08-04T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;As many of you are aware I recently ran a small Monitoring survey. I ran
&lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---background/&quot;&gt;a similar survey last year&lt;/a&gt; and decided to see if the results had changed. Assuming interest continues I’ll run it again next year too.&lt;/p&gt;

&lt;p&gt;Again, the intent of the survey was to understand the
state of maturity across some key areas of monitoring. I was
specifically interested in what sort of monitoring people were doing,
some idea of why they were doing that monitoring, and what tools they
were using to do that monitoring. I am also writing &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;a book about monitoring&lt;/a&gt; and wanted to get some insights that could help shape the book.&lt;/p&gt;

&lt;p&gt;The survey greatly benefited from community feedback and was tweaked in
response to that and the data I received last year.&lt;/p&gt;

&lt;p&gt;This year the survey was 15 questions across 5 pages. The questions
(which included some skip logic) are reproduced here:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Which of the following best describes your IT job role?&lt;/li&gt;
  &lt;li&gt;How big is your organization?&lt;/li&gt;
  &lt;li&gt;Are you responsible for IT monitoring in your organization?&lt;/li&gt;
  &lt;li&gt;If you are not responsible for monitoring, who is?&lt;/li&gt;
  &lt;li&gt;What tools do you use for monitoring? (Choose all that apply)&lt;/li&gt;
  &lt;li&gt;What parts of your environment do you monitor? Please select all that
apply.&lt;/li&gt;
  &lt;li&gt;Do you collect metrics on your infrastructure and applications?&lt;/li&gt;
  &lt;li&gt;What tools do you use to collect metrics? (Choose all that apply)&lt;/li&gt;
  &lt;li&gt;What tools do you use to store your metrics?&lt;/li&gt;
  &lt;li&gt;What tools do you use to visualize your metrics?&lt;/li&gt;
  &lt;li&gt;If you collect metrics, what do you use the metrics you track for?
(Select all that apply)&lt;/li&gt;
  &lt;li&gt;When do you most commonly add monitoring checks or graphs to your
environment?&lt;/li&gt;
  &lt;li&gt;Do you ever have unanswered alerts in your monitoring environment?&lt;/li&gt;
  &lt;li&gt;How often does something go wrong that IS NOT detected by your
monitoring?&lt;/li&gt;
  &lt;li&gt;Do you use a configuration management tool like Chef, Puppet, Salt
or Ansible to manage your monitoring infrastructure?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The survey was launched 6/15/2015 and ran until 7/20/2015. It was
advertised on this blog, Twitter, and a number of monitoring, DevOps,
SysAdmin and tools events, publications and mailing lists. As a result
there’s likely some bias in the responses towards more open source,
DevOps, Operations and startup-centric communities.&lt;/p&gt;

&lt;p&gt;In total there were 1,116 responses (slightly more than last year’s
1,016), of which 884 were complete (866 last year). In my analysis I’ve considered complete and some partial responses where appropriate.&lt;/p&gt;

&lt;p&gt;I’ll be again analyzing each section of the survey in a series of posts,
starting with the demographics of the respondents. Once I’ve posted my
analysis I’ll be making the source data available to anyone who wants to
use it.&lt;/p&gt;

&lt;p&gt;The posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---background/&quot;&gt;Background&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---demographics/&quot;&gt;Demographics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---tools/&quot;&gt;Tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---environments/&quot;&gt;Environments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---metrics/&quot;&gt;Metrics&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---effectiveness/&quot;&gt;Effectiveness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/2015/08/monitoring-survey-2015---data/&quot;&gt;Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/08/monitoring-survey-2015---background/&quot;&gt;Monitoring Survey 2015 - Background&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on August 04, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[The Art of Monitoring sample chapter]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/06/aom-sample-chapter/"/>
  <id>http://kartar.net/2015/06/aom-sample-chapter</id>
  <published>2015-06-23T00:00:00-04:00</published>
  <updated>2015-06-23T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;&lt;strong&gt;TL;DR - &lt;a href=&quot;http://artofmonitoring.com/TheArtOfMonitoring_sample.pdf&quot;&gt;The Art of Monitoring has a sample chapter&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m writing a new book on monitoring rather illustriously called &lt;a href=&quot;http://artofmonitoring.com/&quot;&gt;The
Art of Monitoring&lt;/a&gt;. I’ve just released a
sample chapter from the book. The chapter focuses on installing,
learning and using Riemann for monitoring.&lt;/p&gt;

&lt;p&gt;The book is progressing well and I hope to have it out at the end of the
year. If you’re interested in receiving updates and getting notified
when the book is released you can sign up below.&lt;/p&gt;

&lt;!-- Begin MailChimp Signup Form --&gt;
&lt;link href=&quot;//cdn-images.mailchimp.com/embedcode/slim-081711.css&quot; rel=&quot;stylesheet&quot; type=&quot;text/css&quot; /&gt;

&lt;style type=&quot;text/css&quot;&gt;
  #mc_embed_signup{background:#fff; clear:left; font:14px Helvetica,Arial,sans-serif; }
  /* Add your own MailChimp form style overrides in your site stylesheet or in this style block.
     We recommend moving this block and the preceding CSS link to the HEAD of your HTML file. */
&lt;/style&gt;

&lt;div id=&quot;mc_embed_signup&quot;&gt;
&lt;form action=&quot;//artofmonitoring.us6.list-manage.com/subscribe/post?u=f3aa656fdcded6d1354d6f4f0&amp;amp;id=a0633aafc9&quot; method=&quot;post&quot; id=&quot;mc-embedded-subscribe-form&quot; name=&quot;mc-embedded-subscribe-form&quot; class=&quot;validate&quot; target=&quot;_blank&quot; novalidate=&quot;&quot;&gt;
    &lt;div id=&quot;mc_embed_signup_scroll&quot;&gt;
  &lt;label for=&quot;mce-EMAIL&quot;&gt;Subscribe to the mailing list&lt;/label&gt;
  &lt;input type=&quot;email&quot; value=&quot;&quot; name=&quot;EMAIL&quot; class=&quot;email&quot; id=&quot;mce-EMAIL&quot; placeholder=&quot;email address&quot; required=&quot;&quot; /&gt;
    &lt;!-- real people should not fill this in and expect good things - do not remove this or risk form bot signups--&gt;
    &lt;div style=&quot;position: absolute; left: -5000px;&quot;&gt;&lt;input type=&quot;text&quot; name=&quot;b_f3aa656fdcded6d1354d6f4f0_a0633aafc9&quot; tabindex=&quot;-1&quot; value=&quot;&quot; /&gt;&lt;/div&gt;
    &lt;div class=&quot;clear&quot;&gt;&lt;input type=&quot;submit&quot; value=&quot;Subscribe&quot; name=&quot;subscribe&quot; id=&quot;mc-embedded-subscribe&quot; class=&quot;button&quot; /&gt;&lt;/div&gt;
    &lt;/div&gt;
&lt;/form&gt;
&lt;/div&gt;

&lt;!--End mc_embed_signup--&gt;


  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/06/aom-sample-chapter/&quot;&gt;The Art of Monitoring sample chapter&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on June 23, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Monitoring Survey 2015]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/06/monitoring-survey-2015/"/>
  <id>http://kartar.net/2015/06/monitoring-survey-2015</id>
  <published>2015-06-16T00:00:00-04:00</published>
  <updated>2015-06-16T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;&lt;strong&gt;TL;DR - &lt;a href=&quot;https://www.surveymonkey.com/s/monitoringsurvey2015&quot;&gt;Please take the 2015 Monitoring Survey&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Last year I ran &lt;a href=&quot;http://kartar.net/2014/10/monitoring-survey/&quot;&gt;a monitoring survey&lt;/a&gt;, whose data I also reviewed as &lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---background/&quot;&gt;a series of posts on this blog&lt;/a&gt; and presented in several talks. I was interested in running the survey because I think we’re seeing the beginnings of a significant change in the maturity of the monitoring landscape.&lt;/p&gt;

&lt;p&gt;I’ve decided to make the survey a yearly event and am timing the launch of this year’s survey to coincide with &lt;a href=&quot;http://monitorama.com/&quot;&gt;Monitorama&lt;/a&gt; in Portland.&lt;/p&gt;

&lt;p&gt;The survey takes about 5 minutes to fill out and the results will again
be presented on this blog, in some conference talks and made
available as &lt;a href=&quot;http://creativecommons.org/licenses/by-nc/4.0/&quot;&gt;Creative Commons&lt;/a&gt; licensed data. The survey is totally anonymous and the data won’t be used for any commercial purposes.&lt;/p&gt;

&lt;p&gt;You can find the survey at &lt;a href=&quot;https://www.surveymonkey.com/s/monitoringsurvey2015&quot;&gt;https://www.surveymonkey.com/s/monitoringsurvey2015&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Running the survey last year resulted in numerous suggestions on methodology and
approach. I want to thank everyone who responded to the survey and who
provided feedback that contributed to this year’s survey especially: Paul Nasrat, Lindsay Holmwood, and John Allspaw.&lt;/p&gt;

&lt;p&gt;Thanks in advance!&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/06/monitoring-survey-2015/&quot;&gt;Monitoring Survey 2015&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on June 16, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Looking up events in the Riemann index]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/06/looking-up-events-in-the-riemann-index/"/>
  <id>http://kartar.net/2015/06/looking-up-events-in-the-riemann-index</id>
  <published>2015-06-15T00:00:00-04:00</published>
  <updated>2015-06-15T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://artofmonitoring.com&quot;&gt;Forthcoming book - The Art of Monitoring&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the classic problems of monitoring alerts is that they are often
very cryptic. Coupled with the challenge of alert fatigue&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; this makes working out
what to do next when you receive an alert quite tricky. Additionally, alerts often happen when we’re not at the top of our game: an alert
at 4am on a Sunday morning is not likely to foster an exemplary response.&lt;/p&gt;

&lt;p&gt;The quintessential example of a cryptic, unhelpful alert is the Nagios disk space alert.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-text&quot; data-lang=&quot;text&quot;&gt;PROBLEM Host: server.example.com
Service: Disk Space

State is now: WARNING for 0d 0h 2m 4s (was: WARNING) after 3/3 checks

Notification sent at: Thu Aug 7th 03:36:42 UTC 2015 (notification number
1)

Additional info:
DISK WARNING - free space: /data 678912 MB (9% inode=99%)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;What does this alert mean? We can see that the filesystem &lt;code&gt;/data&lt;/code&gt; has 678912 MB of disk space left, or 9%. Should we worry? How fast is it filling up? Is this likely to happen RSN or sometime in the future? What’s on that filesystem? Do I care if it fills up? I already have five questions from a single alert and I haven’t even started to diagnose WHY things might be wrong. Meh, I am going back to sleep.&lt;/p&gt;

&lt;p&gt;Thankfully, in the middle of last year the estimable &lt;a href=&quot;https://twitter.com/ryan_frantz&quot;&gt;Ryan
Frantz&lt;/a&gt; released &lt;a href=&quot;https://codeascraft.com/2014/06/06/introducing-nagios-herald/&quot;&gt;Nagios
Herald&lt;/a&gt;.
Nagios Herald is a decorator for Nagios alerts. It allows you to add
context or further information to alerts generated by Nagios.&lt;/p&gt;

&lt;p&gt;For example, here is a decorated Nagios disk alert.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/6/nagios-herald.png&quot; alt=&quot;Decorated Nagios disk alert&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Much more useful. Nice big stack bar. Helpful graph. Output from the &lt;code&gt;df&lt;/code&gt; command. With this information I’m feeling a lot more comfortable about fixing the issue. (You can find a bunch of other &lt;a href=&quot;https://github.com/etsy/nagios-herald/blob/master/docs/example_alerts.md&quot;&gt;example alerts here too&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;So helpful to all using Nagios. Not so helpful to others. (Although I think there is support for user-supplied attributes in &lt;a href=&quot;https://sensuapp.org/&quot;&gt;Sensu&lt;/a&gt; and &lt;a href=&quot;https://github.com/sensu/uchiwa&quot;&gt;uchiwa&lt;/a&gt; and probably some other tools but nothing quite so well integrated and helpful (yet).)&lt;/p&gt;

&lt;p&gt;So in the spirit of &lt;a href=&quot;http://kartar.net/tags/#riemann&quot;&gt;recent Riemann posts&lt;/a&gt; I thought about what I could do quickly and simply to provide some context for alerts, specifically email alerts. Riemann does have one useful store of information: the index. Every event you index is stored in there until its TTL expires and the expiration reaper runs. So if you’re collecting useful events then some of those might help to color your alerts with helpful context.&lt;/p&gt;

&lt;p&gt;In my environment Riemann receives events from &lt;a href=&quot;http://collectd.org&quot;&gt;collectd&lt;/a&gt; and does most of its alerting based on the values of collectd metrics. One of collectd’s plugins, &lt;code&gt;df&lt;/code&gt;, emits metrics that measure the usage of your filesystems. It emits a metric like so:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;df-root/percent_bytes-used&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;90.334929260253906&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;collectd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1433706333&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;20.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ds_index&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ds_name&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;value&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ds_type&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;gauge&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:type_instance&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;used&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:type&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;percent_bytes&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:plugin_instance&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;root&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:plugin&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;df&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can use this event, through the &lt;code&gt;:service&lt;/code&gt; field, for example &lt;code&gt;:service df-root/percent_bytes-used&lt;/code&gt;, to identify when a specific filesystem has exceeded a threshold.&lt;/p&gt;

&lt;p&gt;We can create a configuration like so to do this:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;

  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;and &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;service&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;^df-(.*)/percent_bytes-used&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;&amp;gt;= &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;90.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;james@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This uses the &lt;code&gt;where&lt;/code&gt; filter stream to select all &lt;code&gt;df&lt;/code&gt;-generated metrics matching &lt;code&gt;df-(.*)/percent_bytes-used&lt;/code&gt;. This should find the percent bytes used for every filesystem we’re monitoring; for example, for the &lt;code&gt;/&lt;/code&gt; filesystem the metric would be &lt;code&gt;df-root/percent_bytes-used&lt;/code&gt;. Our &lt;code&gt;where&lt;/code&gt; filter also matches on the &lt;code&gt;metric&lt;/code&gt; when the percentage is greater than or equal to 90%. If it matches, it sends an email to &lt;code&gt;james@example.com&lt;/code&gt; using the &lt;code&gt;email&lt;/code&gt; function.&lt;/p&gt;
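
&lt;p&gt;To get a feel for what that pattern captures, here is a small standalone sketch in plain Clojure. It assumes the service names look like the collectd example above. The names &lt;code&gt;df-used&lt;/code&gt; and &lt;code&gt;filesystem-of&lt;/code&gt; are made up for illustration and aren’t part of Riemann.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Hypothetical helper names; the pattern mirrors the one used in the where filter.
(def df-used #&amp;quot;^df-(.*)/percent_bytes-used&amp;quot;)

(defn filesystem-of
  &amp;quot;Return the filesystem part of a df service name, or nil if it is not one.&amp;quot;
  [service]
  (second (re-find df-used service)))

(filesystem-of &amp;quot;df-root/percent_bytes-used&amp;quot;) ; returns &amp;quot;root&amp;quot;
(filesystem-of &amp;quot;load/load/shortterm&amp;quot;)        ; returns nil&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Something like this could be handy if you wanted to mention the filesystem by name in the alert.&lt;/p&gt;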

&lt;p&gt;It’s inside our email alerting that we’re going to add the additional context. Inside our &lt;code&gt;email&lt;/code&gt; variable we’re going to &lt;a href=&quot;http://kartar.net/2015/03/custom-emails-with-riemann/&quot;&gt;redefine how Riemann creates the emails it sends&lt;/a&gt;. We do this by adding the &lt;code&gt;:body&lt;/code&gt; option to the &lt;code&gt;mailer&lt;/code&gt; plugin. We’ve defined that plugin inside our &lt;code&gt;email&lt;/code&gt; variable.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;reimann@example.com&amp;quot;&lt;/span&gt;
                    &lt;span class=&quot;ss&quot;&gt;:body&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;format-body&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
                    &lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;:body&lt;/code&gt; option takes a function that receives an &lt;code&gt;events&lt;/code&gt; argument. The &lt;code&gt;events&lt;/code&gt; argument contains one or more events in a sequence that our function, here &lt;code&gt;format-body&lt;/code&gt;, will then parse and format.&lt;/p&gt;

&lt;p&gt;Our new &lt;code&gt;format-body&lt;/code&gt; function will look pretty similar to &lt;a href=&quot;https://github.com/aphyr/riemann/blob/master/src/riemann/common.clj#L271&quot;&gt;the default Riemann email formatting&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;format-body&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Format the email body&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clojure.string/join&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n\n\n&amp;quot;&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;
          &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;Time: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;riemann.common/time-at&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;Host: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;Service: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;Metric: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ratio?&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;double &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;Additional context for host: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n\n&amp;quot;&lt;/span&gt;
              &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;print-context&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;search&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
              &lt;span class=&quot;s&quot;&gt;&amp;quot;\n\n&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We take the &lt;code&gt;events&lt;/code&gt; argument and loop through the sequence of events inside it to produce a notification. Where the function starts to differ is when we begin to populate our additional context. The context is generated by looking up events in the Riemann index. To do this we use two more functions: &lt;code&gt;search&lt;/code&gt; and &lt;code&gt;print-context&lt;/code&gt;. The &lt;code&gt;search&lt;/code&gt; function takes a host, here the host of the current event from the &lt;code&gt;:host&lt;/code&gt; field, and returns all of the other events from that host from the index.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;search&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Search events in the index&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;-&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;list&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
       &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;riemann.index/search&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config/core&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;search&lt;/code&gt; function uses the &lt;a href=&quot;http://riemann.io/api/riemann.index.html#var-search&quot;&gt;&lt;code&gt;riemann.index/search&lt;/code&gt;&lt;/a&gt; function to query the index. It constructs a query using the &lt;code&gt;host&lt;/code&gt; argument and then uses that query to retrieve all matching events for that host from the index. The index itself is taken from the &lt;a href=&quot;http://riemann.io/api/riemann.config.html#var-core&quot;&gt;currently running core&lt;/a&gt;. Any matching events in the index will be returned as a sequence of standard Riemann events.&lt;/p&gt;

&lt;p&gt;We then pass this sequence to the &lt;code&gt;print-context&lt;/code&gt; function as an argument. The &lt;code&gt;print-context&lt;/code&gt; function iterates through the sequence and prints out a list of services and associated metrics.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;print-context&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Print the event content&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clojure.string/join&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;\n&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;
          &lt;span class=&quot;s&quot;&gt;&amp;quot;Service: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot; with metric: &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;round&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The contextual example is a little silly because you probably don’t want all of these services and their metrics, but you could easily select something more useful. (In the example code we’ve also included a &lt;code&gt;lookup&lt;/code&gt; function which uses the other index parsing function: &lt;a href=&quot;http://riemann.io/api/riemann.index.html#var-lookup&quot;&gt;&lt;code&gt;riemann.index/lookup&lt;/code&gt;&lt;/a&gt;. The &lt;code&gt;lookup&lt;/code&gt; function uses a host/service pair to look up specific events inside the index.)&lt;/p&gt;

&lt;p&gt;We also run each event’s metric through the &lt;code&gt;round&lt;/code&gt; function, which uses &lt;code&gt;cl-format&lt;/code&gt; from &lt;code&gt;clojure.pprint&lt;/code&gt; to round any numbers to 2 decimal places.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;round&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Round numbers to 2 decimal places&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clojure.pprint/cl-format&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;~,2f&amp;quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
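
&lt;p&gt;One thing to watch: not every event in the index necessarily carries a numeric &lt;code&gt;:metric&lt;/code&gt;. As a minimal sketch, assuming you would rather print a placeholder than depend on how &lt;code&gt;cl-format&lt;/code&gt; treats a non-number, a guarded variant might look like this. The name &lt;code&gt;round-safe&lt;/code&gt; and the &lt;code&gt;n/a&lt;/code&gt; placeholder are my own inventions for illustration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;(require &amp;#39;[clojure.pprint :as pprint])

(defn round-safe
  &amp;quot;Hypothetical variant of round: format numeric metrics to 2 decimal
  places and return a placeholder string for nil or non-numeric metrics.&amp;quot;
  [metric]
  (if (number? metric)
    ; Coerce to a double first so ratios and integers format the same way.
    (pprint/cl-format nil &amp;quot;~,2f&amp;quot; (double metric))
    &amp;quot;n/a&amp;quot;))

(round-safe 90.334929) ; a string with two decimal places
(round-safe nil)       ; the placeholder string&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You could then call &lt;code&gt;round-safe&lt;/code&gt; in place of &lt;code&gt;round&lt;/code&gt; inside &lt;code&gt;print-context&lt;/code&gt;.&lt;/p&gt;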

&lt;p&gt;Phew! That’s a lot of background. So what actually happens when this alert triggers? In this case you will generate an email much like:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-text&quot; data-lang=&quot;text&quot;&gt;Time: Sun Jun 14 15:22:19 UTC 2015
Host: app2-api
Service: df-root/percent_bytes-used
Metric: 90.33

Additional context for host: app2-api

Service: cpu-0/cpu-system with metric: 0.40
Service: processes-rsyslogd/ps_disk_octets/read with metric: 0.00
Service: processes-collectd/ps_cputime/syst with metric: 3002.70
Service: cpu-0/cpu-wait with metric: 0.00
Service: interface-lo/if_errors/rx with metric: 0.00
Service: swap/swap_io-out with metric: 0.00
Service: interface-docker0/if_errors/rx with metric: 0.00
Service: elasticsearch-productiona/counter-indices.refresh.total with metric: 0.59
Service: interface-eth0/if_octets/tx with metric: 10192.04
Service: processes-collectd/ps_disk_ops/read with metric: 81.07
Service: processes-collectd/ps_data with metric: 621551616.00
Service: processes-rsyslogd/ps_pagefaults/minflt with metric: 0.00
Service: processes/ps_state-paging with metric: 0.00
Service: processes-rsyslogd/ps_count/processes with metric: 1.00
Service: interface-eth0/if_packets/rx with metric: 117.10
Service: interface-lo/if_packets/tx with metric: 0.00
Service: load/load/shortterm with metric: 0.14
. . .&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You could easily modify this to select only specific, relevant events. You could also use any of Riemann’s stream functions or Clojure’s functions to manipulate those events.&lt;/p&gt;
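
&lt;p&gt;As a rough sketch of that idea, assuming you only care about disk, load and CPU services when a disk alert fires, you could filter the sequence before printing it. The &lt;code&gt;context-services&lt;/code&gt; pattern and the &lt;code&gt;relevant-events&lt;/code&gt; function are hypothetical names I’ve made up here; adjust the pattern to whatever is relevant in your environment.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;(def context-services
  &amp;quot;Hypothetical pattern: only services matching this are shown as context.&amp;quot;
  #&amp;quot;^(df-|load/|cpu-)&amp;quot;)

(defn relevant-events
  &amp;quot;Keep only events whose :service matches context-services, sorted by service.&amp;quot;
  [events]
  (sort-by :service
           (filter (fn [event]
                     ; Skip events without a service rather than failing on nil.
                     (when-let [service (:service event)]
                       (re-find context-services service)))
                   events)))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In &lt;code&gt;format-body&lt;/code&gt; you would then wrap the search result, something like &lt;code&gt;(print-context (relevant-events (search (:host event))))&lt;/code&gt;.&lt;/p&gt;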

&lt;p&gt;You could also extend this example beyond the index to retrieve external information. For example, you could retrieve further information from the host, construct a graph, or link to an existing Graphite graph or data source. This could even be extended further to take some action on the host itself in addition to the notification. The possibilities are broad and exciting!&lt;/p&gt;

&lt;p&gt;P.S. You can find a fully-functioning Riemann configuration for this example &lt;a href=&quot;https://gist.github.com/jamtur01/32f678c0b61e45751da2&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;Becoming desensitized to alerts because you get so many. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/06/looking-up-events-in-the-riemann-index/&quot;&gt;Looking up events in the Riemann index&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on June 15, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Connecting Riemann and Zookeeper]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/04/connecting-riemann-and-zookeeper/"/>
  <id>http://kartar.net/2015/04/connecting-riemann-and-zookeeper</id>
  <published>2015-04-21T00:00:00-04:00</published>
  <updated>2015-04-21T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;One of my pet hates is having to maintain configuration inside
monitoring tools. Not only large pieces like host definitions but
smaller pieces like service and component definitions. Using a
configuration management tool makes this much easier but it still
generally requires some convergence to update your monitoring
configuration when a host is added or removed or a service changes.&lt;/p&gt;

&lt;p&gt;An example might be HAProxy. I have an HAProxy instance running with
multiple back-end nodes. I want to know about issues if the node count
drops below a threshold, potentially if it drops at all. With
auto-scaling or just adding and subtracting nodes I need to keep this
count up-to-date in my monitoring system to ensure I am correctly
alerted when something goes wrong and to avoid false positives. I could
do that with configuration management and converge the configuration
when I deploy, using Puppet’s exported resources for example. But in a
dynamic and fast-moving environment I’d really prefer not to wait for
any convergence.&lt;/p&gt;

&lt;p&gt;(Note: This is a somewhat artificial and very pets v. cattle example. I
don’t overly care if individual nodes die because they are disposable
and easily replaced. I could apply the same logic to any host or service
threshold that I wanted to query.)&lt;/p&gt;

&lt;p&gt;Instead I want &lt;a href=&quot;http://riemann.io/&quot;&gt;my monitoring system&lt;/a&gt; to be able to
look up my threshold in some source of truth about the state of my
infrastructure. That source of truth could be something like &lt;a href=&quot;https://zookeeper.apache.org/&quot;&gt;Apache
Zookeeper&lt;/a&gt;,
&lt;a href=&quot;https://www.consul.io/&quot;&gt;Consul&lt;/a&gt;, or a configuration management store
like &lt;a href=&quot;http://docs.puppetlabs.com/puppetdb/&quot;&gt;PuppetDB&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this post I’m going to combine Zookeeper and my Riemann monitoring
stack. Let’s start with some code to connect to Zookeeper. It makes use
of the &lt;a href=&quot;https://github.com/liebke/zookeeper-clj&quot;&gt;Zookeeper-clj&lt;/a&gt; Clojure
client.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;cemerick.pomegranate&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:only&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add-dependencies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add-dependencies&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:coordinates&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;zookeeper-clj&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;0.9.1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
                  &lt;span class=&quot;ss&quot;&gt;:repositories&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;merge &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;cemerick.pomegranate.aether/maven-central&lt;/span&gt;
                                       &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;clojars&amp;quot;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;http://clojars.org/repo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;ns &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;zookeep&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Zookeeper functions&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:use&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;clojure.tools.logging&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:require&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;zookeeper&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:as&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;zk&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;zookeeper.data&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:as&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;client&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zk/connect&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;127.0.0.1:2181&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;get_data&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Gets data from Zookeeper&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;-&amp;gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:data&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zk/data&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;data/to-string&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;read-string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The first part of our code loads the &lt;code&gt;zookeeper-clj&lt;/code&gt; client. We then
define a namespace called &lt;code&gt;zookeep&lt;/code&gt; and require the client (as &lt;code&gt;zk&lt;/code&gt;) and
the client’s &lt;code&gt;zookeeper.data&lt;/code&gt; namespace as &lt;code&gt;data&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We’ve defined a var called &lt;code&gt;client&lt;/code&gt; that is a connection to a local
Zookeeper server. We could easily specify a remote server instead.&lt;/p&gt;

&lt;p&gt;We’ve created a very simple function named &lt;code&gt;get_data&lt;/code&gt; that
retrieves the contents of a specific Zookeeper node specified by the
&lt;code&gt;node&lt;/code&gt; argument.&lt;/p&gt;
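
&lt;p&gt;One caveat: &lt;code&gt;read-string&lt;/code&gt; from &lt;code&gt;clojure.core&lt;/code&gt; can evaluate code embedded in its input when &lt;code&gt;*read-eval*&lt;/code&gt; is enabled, so if other things can write to your Zookeeper nodes you may prefer &lt;code&gt;clojure.edn/read-string&lt;/code&gt;. Here is a minimal sketch of that, assuming the node holds a plain number. The &lt;code&gt;parse-threshold&lt;/code&gt; name and its &lt;code&gt;default&lt;/code&gt; argument are hypothetical and not part of zookeeper-clj.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;(require &amp;#39;[clojure.edn :as edn])

(defn parse-threshold
  &amp;quot;Hypothetical helper: parse a node&amp;#39;s string contents as EDN and return
  the value only if it is a number; otherwise return default.&amp;quot;
  [s default]
  (let [value (try (edn/read-string s)
                   ; Unreadable input falls back to the default.
                   (catch Exception _ default))]
    (if (number? value) value default)))

(parse-threshold &amp;quot;3&amp;quot; 0)      ; returns 3
(parse-threshold &amp;quot;oops&amp;quot; 0)   ; returns 0, a symbol is not a number
(parse-threshold nil 0)      ; returns 0&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Inside &lt;code&gt;get_data&lt;/code&gt; you could then pass the output of &lt;code&gt;data/to-string&lt;/code&gt; to this function instead of to &lt;code&gt;read-string&lt;/code&gt;.&lt;/p&gt;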

&lt;p&gt;Let’s now create a &lt;code&gt;riemann.config&lt;/code&gt; file to make use of our Zookeeper
functions.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;include&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/etc/riemann/include/zookeeper.clj&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;0.0.0.0&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tcp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;udp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ws-server&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;reimann@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;and &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;&amp;lt; &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;metric&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zookeep/get_data&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/app1/haproxy/nodes&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;service&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;haproxy-backend.web-backend/gauge-active_servers&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tagged&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;app1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;throttle&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;james@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In our configuration we’ve included our Zookeeper functions using the
&lt;code&gt;include&lt;/code&gt; function and bound Riemann to all the interfaces on our host.
We’ve also configured the &lt;code&gt;email&lt;/code&gt; plug-in to allow us to send emails
from Riemann.&lt;/p&gt;

&lt;p&gt;Next we’ve defined some streams including a &lt;code&gt;where&lt;/code&gt; filter on an event
generated from &lt;a href=&quot;http://www.collectd.org&quot;&gt;collectd&lt;/a&gt; called
&lt;code&gt;haproxy-backend.web-backend/gauge-active_servers&lt;/code&gt;. This is the active
back-end server count from the &lt;a href=&quot;http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9&quot;&gt;HAProxy
stats&lt;/a&gt;
output.&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;where&lt;/code&gt; filter matches events from this service that are tagged with &lt;code&gt;app1&lt;/code&gt; and
whose metric field is less than the value returned by the
&lt;code&gt;(zookeep/get_data &quot;/app1/haproxy/nodes&quot;)&lt;/code&gt; call. This function,
&lt;code&gt;zookeep/get_data&lt;/code&gt;, takes the node path &lt;code&gt;/app1/haproxy/nodes&lt;/code&gt; and looks it up
in Zookeeper.&lt;/p&gt;

&lt;p&gt;Inside Zookeeper we’ve created this node and populated it with the count
of HAProxy back-end nodes running for this specific application. That
population of the node or its update would normally take place during
deployment.&lt;/p&gt;

&lt;p&gt;Now when the metric arrives in Riemann, the lookup is triggered and
Riemann compares the value of the metric field with the value from the
Zookeeper node. If the metric value is less than the node value then
Riemann sends an email containing the specific event, throttled to at
most one email every 120 seconds. Our monitoring system no longer needs
any changes when our HAProxy configuration changes, so we eliminate the
need to wait for our deployment changes to converge in our monitoring
environment. That means less risk of missing an alert or of generating a
false positive.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/04/connecting-riemann-and-zookeeper/&quot;&gt;Connecting Riemann and Zookeeper&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on April 21, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Just Enough Clojure for Riemann]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/04/just-enough-clojure-for-riemann/"/>
  <id>http://kartar.net/2015/04/just-enough-clojure-for-riemann</id>
  <published>2015-04-12T00:00:00-04:00</published>
  <updated>2015-04-12T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;TL;DR - This is not a comprehensive guide to &lt;a href=&quot;http://clojure.org/&quot;&gt;Clojure&lt;/a&gt;, but it is enough to get you started with &lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;Riemann&lt;/a&gt;. This is also an excerpt from my forthcoming book - &lt;b&gt;&lt;a href=&quot;http://www.artofmonitoring.com&quot;&gt;The Art of Monitoring&lt;/a&gt;&lt;/b&gt;. It&#39;ll also be available in the Riemann documentation at some point.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://riemann.io/&quot;&gt;Riemann&lt;/a&gt; is configured using a Clojure-based configuration file. This means your configuration file is actually processed as a Clojure program. So to process events and send alerts and metrics you&#39;ll be writing Clojure. Don&#39;t panic! You don&#39;t need to become a fully fledged Clojure developer to use Riemann. I can teach you what you need to know in order to use Riemann. Additionally, Riemann comes with a lot of helpers and shortcuts that make it easier to write the Clojure we need to process our events.&lt;/p&gt;
&lt;p&gt;Let&#39;s learn a bit more about Clojure and help you get started with Riemann. Clojure is a dynamic programming language that targets the Java Virtual Machine. It&#39;s a dialect of Lisp and is largely a &lt;a href=&quot;http://clojure.org/functional_programming&quot;&gt;functional programming language&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Functional_programming&quot;&gt;Functional programming&lt;/a&gt; is a programming style that focuses on the evaluation of mathematical functions and steers away from changing state and mutable data. It&#39;s highly declarative, meaning you build programs from expressions that describe &quot;what&quot; a program should accomplish rather than &quot;how&quot; it accomplishes it.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Note&lt;/span&gt; &lt;span&gt;Languages that describe more of the &quot;how&quot; are called imperative languages.&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Examples of declarative programming languages include SQL, CSS, regular expressions and configuration management languages like Puppet and Chef. Let&#39;s take a simple example.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_id&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;users&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;user_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In this SQL query we&#39;re asking for the &lt;code&gt;user_id&lt;/code&gt; for &lt;code&gt;user_name&lt;/code&gt; of &lt;code&gt;Alice&lt;/code&gt; from the &lt;code&gt;users&lt;/code&gt; table. The statement is asking a declarative &quot;what&quot; question. We don&#39;t really care about the &quot;how&quot;; the database engine takes care of those details.&lt;/p&gt;
&lt;p&gt;In addition to their declarative nature, functional programming languages try to eliminate all side effects from changing state. In a functional language when you call a function its output value depends only on the inputs to the function. So if you repeatedly call function &lt;code&gt;f&lt;/code&gt; with the same value for argument &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;f(x)&lt;/code&gt;, it will produce the same result every time. This makes functional programs very easy to understand, test and predict. Functional programming languages call functions that operate like this &quot;pure&quot; functions.&lt;/p&gt;
&lt;p&gt;The best way to get started with Clojure is to understand the basics of its syntax and types. Let&#39;s get a crash course now.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Warning&lt;/span&gt; &lt;span&gt;This is going to be a very high level and not very nuanced introduction to Clojure. It&#39;s designed to give you the knowledge and recognition of various syntax and expressions to allow you to work with Riemann. It is not an article that will teach you how to develop in Clojure.&lt;/span&gt;
&lt;/div&gt;
&lt;section id=&quot;a-brief-introduction-to-clojure&quot; class=&quot;level3&quot;&gt;
&lt;h3&gt;A brief introduction to Clojure&lt;/h3&gt;
&lt;p&gt;Let&#39;s step through the Clojure basic syntax and types. We&#39;ll also show you a tool called REPL that can help you test and build your Clojure snippets. REPL (short for read–eval–print loop) is an interactive programming shell that takes single expressions, evaluates them and returns the results. It&#39;s a great way to get to know Clojure.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Note&lt;/span&gt; &lt;span&gt;If you&#39;re from the Ruby world then REPL is just like &lt;code&gt;irb&lt;/code&gt;. Or in Python when you launch the &lt;code&gt;python&lt;/code&gt; binary interactively.&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;We can install REPL via a tool called &lt;a href=&quot;http://leiningen.org/&quot;&gt;Leiningen&lt;/a&gt;. Leiningen is an automation tool for Clojure that helps you automate the build and management of Clojure projects.&lt;/p&gt;
&lt;/section&gt;
&lt;section id=&quot;installing-leiningen&quot; class=&quot;level3&quot;&gt;
&lt;h3&gt;Installing Leiningen&lt;/h3&gt;
&lt;p&gt;In order to install Leiningen we&#39;ll need to have Java installed on the host. The prerequisite Java packages for Riemann on Ubuntu and Red Hat will be sufficient for Leiningen too.&lt;/p&gt;
&lt;p&gt;To install Leiningen we&#39;re going to download its launcher script, called &lt;code&gt;lein&lt;/code&gt;. Let&#39;s download that into a &lt;code&gt;bin&lt;/code&gt; directory under our home directory.&lt;/p&gt;


&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;mkdir -p ~/bin
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cd&lt;/span&gt; ~/bin
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;curl -o lein https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;chmod a+x lein
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;PATH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PATH&lt;/span&gt;:&lt;span class=&quot;nv&quot;&gt;$HOME&lt;/span&gt;/bin&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve created a new directory called &lt;code&gt;~/bin&lt;/code&gt; and changed into it. We&#39;ve then used the &lt;code&gt;curl&lt;/code&gt; command to download the &lt;code&gt;lein&lt;/code&gt; script and the &lt;code&gt;chmod&lt;/code&gt; command to make it executable. Lastly, we&#39;ve added our &lt;code&gt;~/bin&lt;/code&gt; directory to our path so that our shell can find the &lt;code&gt;lein&lt;/code&gt; command.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Tip&lt;/span&gt; &lt;span&gt;The addition of the &lt;code&gt;~/bin&lt;/code&gt; directory assumes you&#39;re in a Bash shell. It also only lasts for your current shell session. To make it permanent, add the &lt;code&gt;export&lt;/code&gt; line to your &lt;code&gt;.bashrc&lt;/code&gt; or the equivalent startup file for your shell.&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Next we need to run &lt;code&gt;lein&lt;/code&gt; to auto-install its supporting libraries.&lt;/p&gt;


&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;lein
. . .&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This will download Leiningen&#39;s supporting Jar file.&lt;/p&gt;
&lt;p&gt;Finally, we can run REPL using the &lt;code&gt;lein repl&lt;/code&gt; sub-command.&lt;/p&gt;


&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sh&quot; data-lang=&quot;sh&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;lein repl
. . .
&lt;span class=&quot;nv&quot;&gt;user&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This will download Clojure itself (in the form of its Jar file) and launch our interactive Clojure shell.&lt;/p&gt;
&lt;/section&gt;
&lt;section id=&quot;clojure-syntax-and-types&quot; class=&quot;level3&quot;&gt;
&lt;h3&gt;Clojure syntax and types&lt;/h3&gt;
&lt;p&gt;Let&#39;s use this interactive shell to look at some of the syntax and functions we&#39;ve just learnt about. Let&#39;s start by opening our shell.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Now let&#39;s try a simple expression.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;nil&lt;/code&gt; expression is the simplest value in Clojure. It represents literally nothing.&lt;/p&gt;
&lt;p&gt;We can also specify an integer value.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Or a string.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello Ms Event&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;hello Ms Event&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Or Boolean values.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;false&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;false&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;section id=&quot;clojure-functions&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Clojure functions&lt;/h4&gt;
&lt;p&gt;These values aren&#39;t very exciting on their own. To do more interesting things we can use Clojure functions. A function call is structured like this:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;argument&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Tip&lt;/span&gt; &lt;span&gt;If you&#39;re used to the Ruby or Python world a function is broadly the equivalent of a method.&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Let&#39;s look at a function in action by doing something with some values: adding two integers together.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In this case we&#39;ve used the &lt;code&gt;+&lt;/code&gt; function and added &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt; together to get &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But there&#39;s something about this structure that might look familiar to you if you&#39;ve used other programming languages. Our function looks just like &lt;a href=&quot;https://en.wikipedia.org/wiki/List_(abstract_data_type)&quot;&gt;a list&lt;/a&gt;. This is because it is! Our expression might add two numbers together but it’s also a list of three items in a valid list data structure.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Note&lt;/span&gt; &lt;span&gt;Technically it&#39;s an &lt;a href=&quot;http://en.wikipedia.org/wiki/S-expression&quot;&gt;s-expression&lt;/a&gt;.&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This is a feature of Clojure called &lt;a href=&quot;http://en.wikipedia.org/wiki/Homoiconicity&quot;&gt;homoiconicity&lt;/a&gt;, sometimes described as: &quot;code is data, data is code&quot;. This concept is inherited from Clojure&#39;s parent language: &lt;a href=&quot;http://en.wikipedia.org/wiki/Lisp_(programming_language)&quot;&gt;Lisp&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Homoiconicity means that the program&#39;s structure is similar to its syntax. In this case Clojure programs are written in the form of lists. Hence you can gain insight into the program&#39;s internal workings by reading its code. This also makes &lt;a href=&quot;http://en.wikipedia.org/wiki/Metaprogramming&quot;&gt;metaprogramming&lt;/a&gt; really easy because Clojure&#39;s source code is a data structure and the language can treat it like one.&lt;/p&gt;
&lt;p&gt;Now let&#39;s look more closely at the &lt;code&gt;+&lt;/code&gt; function. Each function is referred to by a symbol. A symbol is a bare string of characters, like &lt;code&gt;+&lt;/code&gt; or &lt;code&gt;inc&lt;/code&gt;. Symbols have short names and full names. The short name is used to refer to it locally, for example &lt;code&gt;+&lt;/code&gt;. The full name, or perhaps more accurately the fully qualified name, gives you a way to refer to the symbol unambiguously from anywhere. The fully qualified name of the &lt;code&gt;+&lt;/code&gt; symbol is &lt;code&gt;clojure.core/+&lt;/code&gt;, where &lt;code&gt;clojure.core&lt;/code&gt; is the fundamental library of the Clojure language. We can refer to &lt;code&gt;+&lt;/code&gt; in its fully qualified form here:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clojure.core/+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Symbols refer to other things; generally they point to values. Think about them as a name or identifier that points to a concept: &lt;code&gt;+&lt;/code&gt; is the name, &quot;adding&quot; is the concept. When Clojure encounters a symbol it evaluates it by looking up its meaning. If it can&#39;t find a meaning it&#39;ll generate an error message, for example:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;bob&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;CompilerException&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;java.lang.RuntimeException&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Unable&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;resolve &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;symbol&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;bob&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;this&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;context&lt;/span&gt;, &lt;span class=&quot;nv&quot;&gt;compiling&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;NO_SOURCE_PATH&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:1:1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Clojure also has a syntax for stopping that evaluation. This is called quoting and it is achieved by prefixing the expression with a single quotation mark: &lt;code&gt;&#39;&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This returns the expression itself without evaluating it. This is important because often we want to do things, review things, or test things without evaluating.&lt;/p&gt;
&lt;p&gt;For example, if we need to determine what type of thing something is in Clojure we can use the &lt;code&gt;type&lt;/code&gt; function and quote the symbol like so:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;+&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;clojure.lang.Symbol&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we can see that &lt;code&gt;+&lt;/code&gt; is a Clojure language symbol.&lt;/p&gt;
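&lt;p&gt;As a small sketch of the &quot;code is data&quot; idea discussed earlier, we can quote a whole expression, check that it really is a list, and then hand it to Clojure&#39;s &lt;code&gt;eval&lt;/code&gt; function to evaluate it. The &lt;code&gt;eval&lt;/code&gt; function is part of &lt;code&gt;clojure.core&lt;/code&gt; but isn&#39;t otherwise used in this article; it appears here purely for illustration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (type &amp;#39;(+ 1 1))
clojure.lang.PersistentList
user=&amp;gt; (eval &amp;#39;(+ 1 1))
2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The quoted expression is an ordinary list until we ask Clojure to evaluate it.&lt;/p&gt;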
&lt;/section&gt;
&lt;section id=&quot;lists&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Lists&lt;/h4&gt;
&lt;p&gt;Clojure also has a variety of data structures. Especially useful to us will be collections. Collections are groups of values, for example a list or a map.&lt;/p&gt;
&lt;p&gt;Let&#39;s start by looking at lists. Lists are core to all Lisp-based languages (Lisp means &quot;LISt Processing&quot;). As we discovered above Clojure programs are essentially lists. So we&#39;re going to see a lot of them!&lt;/p&gt;
&lt;p&gt;Lists have zero or more elements and are wrapped in parentheses.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve created a list containing the elements &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt; and &lt;code&gt;c&lt;/code&gt;. We&#39;ve quoted it because we don&#39;t want it evaluated. If we didn&#39;t quote it then evaluation would fail because none of the elements, &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt; or &lt;code&gt;c&lt;/code&gt;, is defined. Let&#39;s see that now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;CompilerException&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;java.lang.RuntimeException&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Unable&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;resolve &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;symbol&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;this&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;context&lt;/span&gt;, &lt;span class=&quot;nv&quot;&gt;compiling&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;NO_SOURCE_PATH&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:1:1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can do a few neat things with lists, for example add an element using the &lt;code&gt;conj&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;conj &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You can see we&#39;ve added a new element, &lt;code&gt;d&lt;/code&gt;, to the front of the list. Why the front? Because a list is really a &lt;a href=&quot;https://en.wikipedia.org/wiki/Linked_list&quot;&gt;linked list&lt;/a&gt; and focusses on providing immediate access to the first value in the list. Lists are most useful for small collections of elements and when you need to read elements in a linear fashion.&lt;/p&gt;
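&lt;p&gt;It&#39;s also worth seeing that &lt;code&gt;conj&lt;/code&gt; returns a new list rather than changing the original, in keeping with the functional style described earlier. This is a minimal sketch that assumes we first bind a list to a hypothetical name, &lt;code&gt;letters&lt;/code&gt;, using Clojure&#39;s &lt;code&gt;def&lt;/code&gt; form, which binds a name to a value.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (def letters &amp;#39;(a b c))
#&amp;#39;user/letters
user=&amp;gt; (conj letters &amp;#39;d)
(d a b c)
user=&amp;gt; letters
(a b c)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The list bound to &lt;code&gt;letters&lt;/code&gt; is unchanged after the call to &lt;code&gt;conj&lt;/code&gt;.&lt;/p&gt;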
&lt;p&gt;We can also return values from a list using a variety of functions.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;first &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;second &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;nth &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve pulled out the first element, second element, and using the &lt;code&gt;nth&lt;/code&gt; function, the third element.&lt;/p&gt;
&lt;p&gt;This last function, &lt;code&gt;nth&lt;/code&gt;, is an example of a function that takes more than one argument. The first argument is the list, &lt;code&gt;&#39;(a b c)&lt;/code&gt;, and the second argument is the index value of the element we want to return, here &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Tip&lt;/span&gt; &lt;span&gt;Like most programming languages Clojure starts counting from &lt;code&gt;0&lt;/code&gt;.&lt;/span&gt;
&lt;/div&gt;
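&lt;p&gt;To make the zero-based indexing concrete, here is a short sketch. The optional third argument to &lt;code&gt;nth&lt;/code&gt;, a value to return when the index is out of range, isn&#39;t covered above and is shown only for illustration; without it, an out-of-range index throws an exception.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (nth &amp;#39;(a b c) 0)
a
user=&amp;gt; (nth &amp;#39;(a b c) 5 nil)
nil&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
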
&lt;p&gt;We can also create a list with the &lt;code&gt;list&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;/section&gt;
&lt;section id=&quot;vectors&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Vectors&lt;/h4&gt;
&lt;p&gt;Another collection available to us is the vector. Vectors are like lists but they are optimized for random access to the elements by index. Vectors are created by adding zero or more elements inside square brackets.&lt;/p&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Tip&lt;/span&gt; &lt;span&gt;Most of the time, given the choice between a list and a vector, you should use a vector for data access. It&#39;s generally faster.&lt;/span&gt;
&lt;/div&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Like lists, we can again use &lt;code&gt;conj&lt;/code&gt; to add to a vector.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;conj &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You&#39;ll note the &lt;code&gt;d&lt;/code&gt; element is added at the end because a vector isn&#39;t focussed on sequential access like a list.&lt;/p&gt;
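&lt;p&gt;To make the contrast concrete, here is a side-by-side sketch of &lt;code&gt;conj&lt;/code&gt; on a list and on a vector. In both cases &lt;code&gt;conj&lt;/code&gt; adds the element where it is cheapest for that collection: the front of a list and the end of a vector.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (conj &amp;#39;(a b c) &amp;#39;d)
(d a b c)
user=&amp;gt; (conj &amp;#39;[a b c] &amp;#39;d)
[a b c d]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
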
&lt;p&gt;There are some other useful functions we can use on lists and vectors, for example to get the last element in a list or vector.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;last &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Or count the elements.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;count &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Because vectors are designed to look up elements by index, we can also use them directly as functions, for example:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve retrieved the value, &lt;code&gt;2&lt;/code&gt;, at index &lt;code&gt;1&lt;/code&gt;.&lt;/p&gt;
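&lt;p&gt;One thing to watch for: calling a vector as a function with an index that doesn&#39;t exist throws an exception. As a hedged alternative, the core &lt;code&gt;get&lt;/code&gt; function performs the same lookup but returns &lt;code&gt;nil&lt;/code&gt; when the index is out of range.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (get [1 2 3] 1)
2
user=&amp;gt; (get [1 2 3] 5)
nil&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
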
&lt;p&gt;We can create a vector with the &lt;code&gt;vector&lt;/code&gt; function, or convert an existing structure, like a list, into a vector with the &lt;code&gt;vec&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;vector &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;vec &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;/section&gt;
&lt;section id=&quot;sets&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Sets&lt;/h4&gt;
&lt;p&gt;There&#39;s a final collection related to lists and vectors called a set. Sets are unordered collections of values, prefixed with &lt;code&gt;#&lt;/code&gt; and wrapped in curly braces, &lt;code&gt;{ }&lt;/code&gt;. They are most useful for collections of values where you want to check whether a value, or several values, are present.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You&#39;ll notice the set was returned in a different order. This is because sets are focussed on presence lookups so order doesn&#39;t matter quite so much.&lt;/p&gt;
&lt;p&gt;Like lists and vectors we can use the &lt;code&gt;conj&lt;/code&gt; function to add an element to a set.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;conj &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Sets can never contain an element more than once, so adding an element which is already present does nothing. You can remove elements with the &lt;code&gt;disj&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;disj &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
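
&lt;p&gt;The point about duplicates can be checked with a short sketch. Comparing with &lt;code&gt;=&lt;/code&gt; and &lt;code&gt;count&lt;/code&gt; avoids depending on the order in which a set happens to print.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (count (conj &amp;#39;#{a b c} &amp;#39;c))
3
user=&amp;gt; (= (conj &amp;#39;#{a b c} &amp;#39;c) &amp;#39;#{a b c})
true&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;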


&lt;p&gt;The most common operation with a set is to check for the presence of a specific value. For this we use the &lt;code&gt;contains?&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;contains? &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;contains? &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;false&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
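
&lt;p&gt;A caveat worth sketching: &lt;code&gt;contains?&lt;/code&gt; checks for keys rather than values. For a set the two are the same thing, but for a vector the keys are the indexes, so the result can be surprising.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (contains? &amp;#39;[a b c] &amp;#39;a)
false
user=&amp;gt; (contains? &amp;#39;[a b c] 0)
true&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;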


&lt;p&gt;Like a vector, you can also use the set itself as a function. This returns the value if it is present or &lt;code&gt;nil&lt;/code&gt; if it is not.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;#39;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;You can make a set out of any other collection with the &lt;code&gt;set&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;set &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve made a set out of a vector.&lt;/p&gt;
&lt;/section&gt;
&lt;section id=&quot;maps&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Maps&lt;/h4&gt;
&lt;p&gt;The last data structure we&#39;re going to look at is the map. Maps are key/value pairs enclosed in braces. You can think about them as being equivalent to a hash.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve defined a map with two key/value pairs: &lt;code&gt;:a 1&lt;/code&gt; and &lt;code&gt;:b 2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You&#39;ll note each key is prefixed with a &lt;code&gt;:&lt;/code&gt;. This denotes another type of Clojure syntax: the keyword. A keyword is much like a symbol but instead of referencing another value it is merely a name or label. Keywords are highly useful for lookups in data structures like maps: you look up the keyword and get back the value.&lt;/p&gt;
&lt;p&gt;We can use the &lt;code&gt;get&lt;/code&gt; function to retrieve a value.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;get &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve specified the keyword &lt;code&gt;:a&lt;/code&gt; and asked Clojure to look it up in our map. It&#39;s returned the value in the key/value pair, &lt;code&gt;1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If the key doesn&#39;t exist in the map then Clojure returns &lt;code&gt;nil&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;get &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;get&lt;/code&gt; function can also take a default value to return instead of &lt;code&gt;nil&lt;/code&gt;, if the key doesn’t exist in that map.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;get &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:c&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:novalue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;ss&quot;&gt;:novalue&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can also use the map itself as a function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can also use keywords as functions to look themselves up in a map.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
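
&lt;p&gt;Both of these function-style lookups accept a default value as well, in the same way as &lt;code&gt;get&lt;/code&gt;. A brief sketch, reusing the &lt;code&gt;:novalue&lt;/code&gt; placeholder from above:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; ({:a 1 :b 2} :c :novalue)
:novalue
user=&amp;gt; (:c {:a 1 :b 2} :novalue)
:novalue&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;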


&lt;p&gt;To add a key/value pair to a map we use the &lt;code&gt;assoc&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;assoc &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:c&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:c&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If a key isn&#39;t present then &lt;code&gt;assoc&lt;/code&gt; adds it. If the key is present then &lt;code&gt;assoc&lt;/code&gt; replaces the value.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;assoc &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;To remove a key we use the &lt;code&gt;dissoc&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dissoc &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
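
&lt;p&gt;It&#39;s worth noting that &lt;code&gt;assoc&lt;/code&gt; and &lt;code&gt;dissoc&lt;/code&gt; return a new map and leave the original untouched. The sketch below assumes the &lt;code&gt;let&lt;/code&gt; form, which creates a temporary local name (here the hypothetical &lt;code&gt;m&lt;/code&gt;), purely for illustration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (let [m {:a 1 :b 2}]
         (assoc m :c 3)          ; builds a new map, which we discard
         (= m {:a 1 :b 2}))      ; m itself has not changed
true&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;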


&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Note&lt;/span&gt; &lt;span&gt;If you&#39;ve come from the Ruby or Python world the terms list, set, vector and map might be a little new. But the syntax probably looks familiar. You can think about lists, vectors and sets as being very similar to arrays, and maps as being very similar to hashes (or dictionaries in Python).&lt;/span&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;section id=&quot;strings&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Strings&lt;/h4&gt;
&lt;p&gt;We can also work with strings. Clojure lets you turn pretty much any value into a string using the &lt;code&gt;str&lt;/code&gt; function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;holiday&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;holiday&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;str&lt;/code&gt; function turns anything specified into a string. We can also use it to concatenate strings.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;james needs &amp;quot;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot; holidays&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;james needs 2 holidays&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
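
&lt;p&gt;As a quick sketch of &lt;code&gt;str&lt;/code&gt; turning other kinds of values into strings, here it is applied to a number, a keyword, a vector and &lt;code&gt;nil&lt;/code&gt;. Note that &lt;code&gt;nil&lt;/code&gt; becomes the empty string.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (str 42)
&amp;quot;42&amp;quot;
user=&amp;gt; (str :a)
&amp;quot;:a&amp;quot;
user=&amp;gt; (str [1 2 3])
&amp;quot;[1 2 3]&amp;quot;
user=&amp;gt; (str nil)
&amp;quot;&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;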


&lt;/section&gt;
&lt;section id=&quot;creating-our-own-functions&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Creating our own functions&lt;/h4&gt;
&lt;p&gt;Up until now we&#39;ve run functions as stand-alone expressions, for example here&#39;s the &lt;code&gt;inc&lt;/code&gt; function which increments the number passed to it:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;inc &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This isn&#39;t overly practical, except to demonstrate how a function works. If we want to do more with Clojure we need to be able to define our own functions. To do this Clojure provides a function called &lt;code&gt;fn&lt;/code&gt;. Let us construct our first function.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;So what&#39;s going on here? We&#39;ve used the &lt;code&gt;fn&lt;/code&gt; function to create a new function. The &lt;code&gt;fn&lt;/code&gt; function takes a vector as an argument. This vector contains any arguments being passed to our function. Then we specify the actual action our function is going to perform. In our case we&#39;re mimicking the behavior of the &lt;code&gt;inc&lt;/code&gt; function. The function will take the value of &lt;code&gt;a&lt;/code&gt; and add &lt;code&gt;1&lt;/code&gt; to it.&lt;/p&gt;
&lt;p&gt;If we evaluate this code now, Clojure simply returns the new function. Nothing is added yet because we haven&#39;t called the function, so &lt;code&gt;a&lt;/code&gt; has no value bound to it. Let&#39;s call our function now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve called our function, this time naming its parameter &lt;code&gt;x&lt;/code&gt;, and passed in an argument of &lt;code&gt;2&lt;/code&gt;. This is bound to the &lt;code&gt;x&lt;/code&gt; symbol inside the function. The function adds &lt;code&gt;x&lt;/code&gt;, now set to &lt;code&gt;2&lt;/code&gt;, and &lt;code&gt;1&lt;/code&gt; and returns the resulting value: &lt;code&gt;3&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There&#39;s also a shorthand for writing functions that we&#39;ll see occasionally in Riemann configurations.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;This shorthand function is the equivalent of &lt;code&gt;(fn [x] (+ x 1))&lt;/code&gt; and we can call it to see the result.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;+ &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
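
&lt;p&gt;Both forms also work with more than one argument. As a small sketch, the parameter names &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt; below are arbitrary, and in the shorthand form &lt;code&gt;%1&lt;/code&gt; and &lt;code&gt;%2&lt;/code&gt; refer to the first and second arguments.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; ((fn [a b] (+ a b)) 2 3)
5
user=&amp;gt; (#(+ %1 %2) 2 3)
5&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;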


&lt;/section&gt;
&lt;section id=&quot;creating-variables&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Creating variables&lt;/h4&gt;
&lt;p&gt;But we&#39;re still a step away from a named function and we&#39;re missing an important piece: how do we define our own variables to hold values? Clojure has a function called &lt;code&gt;def&lt;/code&gt; that allows us to do this.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;smoker&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;joker&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;&amp;#39;user/smoker&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;def&lt;/code&gt; function does two things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It creates a new type of object called a var. Vars, like symbols, are references to other values. You can see our new var &lt;code&gt;#&#39;user/smoker&lt;/code&gt; returned as output of the &lt;code&gt;def&lt;/code&gt; function.&lt;/li&gt;
&lt;li&gt;It binds a symbol to that var, here the symbol &lt;code&gt;smoker&lt;/code&gt; is bound to a var with a value of the string &lt;code&gt;&quot;joker&quot;&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When we evaluate a symbol pointing to a var, it is replaced by the var&#39;s value. Because &lt;code&gt;def&lt;/code&gt; binds the symbol &lt;code&gt;smoker&lt;/code&gt; to our var, we can refer to the value using either the full name or the short name.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;user/smoker&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;joker&amp;quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;smoker&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;joker&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Where did this &lt;code&gt;user/&lt;/code&gt; come from? It&#39;s a Clojure namespace. Namespaces are a way Clojure organizes code and program structure. In this case the REPL creates a namespace called &lt;code&gt;user/&lt;/code&gt; by default. Remember we learnt earlier that a symbol has a short name, for example &lt;code&gt;smoker&lt;/code&gt; that can be used locally to refer to it, and a full name. That full name, here &lt;code&gt;user/smoker&lt;/code&gt;, would be used to refer to this symbol from another namespace.&lt;/p&gt;
&lt;p&gt;We&#39;ll talk more about namespaces and use them to organize our Riemann configuration in the &lt;a href=&quot;http://riemann.io/howto.html#organizing-with-namespaces&quot;&gt;HOWTO&lt;/a&gt;. If you&#39;d like to read more about them then there is an excellent explanation at &lt;a href=&quot;http://www.braveclojure.com/organization/&quot; class=&quot;uri&quot;&gt;http://www.braveclojure.com/organization/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We can also use the &lt;code&gt;type&lt;/code&gt; function to see the type of value the symbol references.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;smoker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;java.lang.String&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we can see that the value &lt;code&gt;smoker&lt;/code&gt; resolves to is a string.&lt;/p&gt;
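&lt;p&gt;To make the distinction between a symbol, its var and the var&#39;s value a little more concrete, here is a short REPL sketch. It assumes the &lt;code&gt;smoker&lt;/code&gt; var defined above. The &lt;code&gt;var&lt;/code&gt; special form and the &lt;code&gt;deref&lt;/code&gt; function are part of core Clojure but aren&#39;t otherwise covered in this section.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;;; Ask for the var itself rather than its value
user=&amp;gt; (var smoker)
#&amp;#39;user/smoker
;; The var is its own type of object
user=&amp;gt; (type (var smoker))
clojure.lang.Var
;; Dereferencing the var gives us back the value it holds
user=&amp;gt; (deref (var smoker))
&amp;quot;joker&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
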
&lt;/section&gt;
&lt;section id=&quot;creating-named-functions&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Creating named functions&lt;/h4&gt;
&lt;p&gt;Now with the combination of &lt;code&gt;def&lt;/code&gt; and &lt;code&gt;fn&lt;/code&gt; we can create our own named functions.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;* &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;&amp;#39;user/grow&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Firstly, we&#39;ve defined a var (and symbol) called &lt;code&gt;grow&lt;/code&gt;. Inside that we&#39;ve defined a function. Our function takes a single argument, &lt;code&gt;number&lt;/code&gt;, and passes that number to the &lt;code&gt;*&lt;/code&gt; function, the mathematical multiplication operator in Clojure, and multiplies it by &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let&#39;s call our function now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Here we&#39;ve called the &lt;code&gt;grow&lt;/code&gt; function and passed it a value of &lt;code&gt;10&lt;/code&gt;. The &lt;code&gt;grow&lt;/code&gt; function multiplies that value by &lt;code&gt;2&lt;/code&gt; and returns the result: &lt;code&gt;20&lt;/code&gt;. Pretty awesome eh?&lt;/p&gt;
&lt;p&gt;But the syntax is a little cumbersome. Thankfully Clojure offers a shortcut, called &lt;code&gt;defn&lt;/code&gt;, for creating a var and binding it to a function. Let&#39;s rewrite our function using this form.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;* &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;&amp;#39;user/grow&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;That&#39;s a little neater and easier to read. Now how about we add a second argument? Let&#39;s make both the number to be multiplied and the multiplier arguments.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;multiple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;* &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;multiple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;&amp;#39;user/grow&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Let&#39;s call our &lt;code&gt;grow&lt;/code&gt; function again.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;ArityException&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Wrong&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;passed&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;user/grow&lt;/span&gt;  &lt;span class=&quot;nv&quot;&gt;clojure.lang.AFn.throwArity&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;AFn.java&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:429&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Oops, not enough arguments. Let&#39;s add the second argument.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;grow&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;40&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
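

&lt;p&gt;If we wanted &lt;code&gt;(grow 10)&lt;/code&gt; to keep working, one option is to give the function more than one arity. This is only a sketch of that idea, and nothing that follows relies on it: the one-argument version assumes a default multiplier of &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (defn grow
         ;; One argument: fall back to a multiplier of 2
         ([number] (grow number 2))
         ;; Two arguments: multiply the number by the multiplier
         ([number multiple] (* number multiple)))
#&amp;#39;user/grow
user=&amp;gt; (grow 10)
20
user=&amp;gt; (grow 10 4)
40&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;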


&lt;p&gt;We can also add a doc string to our function to help us articulate what it does.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;defn &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;grow&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&amp;quot;Multiplies numbers - can specify the number and multiplier&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;multiple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;* &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;multiple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;We can access a function&#39;s doc string using the &lt;code&gt;doc&lt;/code&gt; function.&lt;/p&gt;


&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;nv&quot;&gt;user=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;doc &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;grow&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;-------------------------&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;user/grow&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;multiple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;Multiplies&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;numbers&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;can&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;specify&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;and &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;multiplier&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;doc&lt;/code&gt; function prints the full name of the function, the arguments it accepts, and the docstring. The trailing &lt;code&gt;nil&lt;/code&gt; is the return value of &lt;code&gt;doc&lt;/code&gt; itself.&lt;/p&gt;
&lt;p&gt;That&#39;s the end of our crash course.&lt;/p&gt;
&lt;/section&gt;
&lt;section id=&quot;learning-more-clojure&quot; class=&quot;level4&quot;&gt;
&lt;h4&gt;Learning more Clojure&lt;/h4&gt;
&lt;p&gt;I recommend trying to get an understanding of the basics of Clojure to get the most out of Riemann. If you&#39;d like to start to learn a bit about Clojure then Kyle Kingsbury&#39;s excellent &lt;a href=&quot;https://aphyr.com/posts/301-clojure-from-the-ground-up-welcome&quot;&gt;Clojure from the ground up&lt;/a&gt; series is a great place to start. This section is very much an abbreviated crash course in parts of that tutorial and I can&#39;t thank Kyle enough for writing it. A reading of this tutorial will add significantly to the knowledge we&#39;ve shared here. I recommend at least a solid reading of the first three posts in the series:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;a href=&quot;https://aphyr.com/posts/301-clojure-from-the-ground-up-welcome&quot;&gt;Welcome&lt;/a&gt; post.&lt;/li&gt;
&lt;li&gt;The post on &lt;a href=&quot;https://aphyr.com/posts/302-clojure-from-the-ground-up-basic-types&quot;&gt;Basic types&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The post on &lt;a href=&quot;https://aphyr.com/posts/303-clojure-from-the-ground-up-functions&quot;&gt;Functions&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&quot;admonition&quot;&gt;
&lt;span class=&quot;admonition-title&quot;&gt;Tip&lt;/span&gt; &lt;span&gt;Another resource if you&#39;re interested in learning a bit more about the basics of Clojure is &lt;a href=&quot;http://learn-clojure.com/&quot; class=&quot;uri&quot;&gt;http://learn-clojure.com/&lt;/a&gt;.&lt;/span&gt;
&lt;/div&gt;
&lt;/section&gt;&lt;/section&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/04/just-enough-clojure-for-riemann/&quot;&gt;Just Enough Clojure for Riemann&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on April 12, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Custom emails with Riemann]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/03/custom-emails-with-riemann/"/>
  <id>http://kartar.net/2015/03/custom-emails-with-riemann</id>
  <published>2015-03-27T00:00:00-04:00</published>
  <updated>2015-03-27T00:00:00-04:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;I’ve recently started &lt;a href=&quot;http://kartar.net/2015/01/riemann-streams/&quot;&gt;alerting on expired events from Riemann&lt;/a&gt; via email. The default email alert looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/3/am210.png&quot; alt=&quot;Riemann email alerts&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It contains some useful information but it is pretty basic: the subject is the
name of the alerted service and the body contains a basic printout of the
event’s fields.&lt;/p&gt;

&lt;p&gt;I decided I’d like to build some alternative emails and so I went digging into
the &lt;a href=&quot;http://riemann.io/api/riemann.email.html#var-mailer&quot;&gt;mailer plug-in&lt;/a&gt; code
to find out how.&lt;/p&gt;

&lt;p&gt;You would normally configure the mailer plug-in something like this:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;riemann@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This defines a new function called &lt;code&gt;email&lt;/code&gt; that passes events to the &lt;code&gt;mailer&lt;/code&gt;
plug-in. We’ve configured a single option for the plug-in: &lt;code&gt;:from&lt;/code&gt; which controls the source address for emails, here &lt;code&gt;riemann@example.com&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If we want to update the subject or body of the email we can pass in the
&lt;code&gt;:subject&lt;/code&gt; and &lt;code&gt;:body&lt;/code&gt; options. Each option takes a function that accepts a collection of events and
returns a formatted string. For example, the default subject is set by a function like:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;riemann@example.com&amp;quot;&lt;/span&gt;
                    &lt;span class=&quot;ss&quot;&gt;:subject&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                     &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clojure.string/join&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;, &amp;quot;&lt;/span&gt;
                       &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map &lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;:subject&lt;/code&gt; option takes a function with a single argument, &lt;code&gt;events&lt;/code&gt;, which is the collection of incoming events. The &lt;code&gt;map&lt;/code&gt; function extracts the value of the &lt;code&gt;:service&lt;/code&gt; field from each event. If there is more than one event, &lt;code&gt;clojure.string/join&lt;/code&gt; joins the services into a comma-separated list, and the resulting string becomes the subject line of our email. Hence &lt;code&gt;riemanna riemann server tcp 0.0.0.0:5555...&lt;/code&gt; as the subject in our example email above.&lt;/p&gt;
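
&lt;p&gt;We can try this logic at a Clojure REPL. This is a sketch rather than Riemann’s own code: the name &lt;code&gt;default-subject&lt;/code&gt; and the sample events are made up for illustration, with plain maps standing in for Riemann events.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;user=&amp;gt; (require &amp;#39;clojure.string)
nil
;; Hypothetical stand-in for the default subject function
user=&amp;gt; (def default-subject (fn [events] (clojure.string/join &amp;quot;, &amp;quot; (map :service events))))
#&amp;#39;user/default-subject
;; Two made-up events, represented as plain maps
user=&amp;gt; (default-subject [{:service &amp;quot;cpu&amp;quot;} {:service &amp;quot;disk /&amp;quot;}])
&amp;quot;cpu, disk /&amp;quot;
;; A single event produces just its service name
user=&amp;gt; (default-subject [{:service &amp;quot;cpu&amp;quot;}])
&amp;quot;cpu&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;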

&lt;p&gt;If instead I wanted to build a custom email subject, let’s say to notify me when a specific host was down, I could pass my own &lt;code&gt;:subject&lt;/code&gt; option to the &lt;code&gt;mailer&lt;/code&gt; function:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host_email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;riemann@example.com&amp;quot;&lt;/span&gt;
                        &lt;span class=&quot;ss&quot;&gt;:subject&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;str &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Host &amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get-in&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;first &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot; is down&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we’ve given the &lt;code&gt;:subject&lt;/code&gt; option a function that takes our &lt;code&gt;events&lt;/code&gt; collection. It builds the string “Host … is down”, replacing the &lt;code&gt;...&lt;/code&gt; with the hostname of the event. We get the hostname from the &lt;code&gt;:host&lt;/code&gt; field of the first event in our collection.&lt;/p&gt;
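
&lt;p&gt;Again we can check the function on its own at the REPL before wiring it into Riemann. In this sketch the name &lt;code&gt;host-subject&lt;/code&gt; and the sample event are hypothetical, and the event is a plain map.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;;; Hypothetical copy of the subject function, defined on its own
user=&amp;gt; (def host-subject (fn [events] (str &amp;quot;Host &amp;quot; (get-in (first events) [:host]) &amp;quot; is down&amp;quot;)))
#&amp;#39;user/host-subject
user=&amp;gt; (host-subject [{:host &amp;quot;web01&amp;quot; :service &amp;quot;ssh&amp;quot; :state &amp;quot;expired&amp;quot;}])
&amp;quot;Host web01 is down&amp;quot;
;; With an empty collection there is no first event, so the hostname is simply missing
user=&amp;gt; (host-subject [])
&amp;quot;Host  is down&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;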

&lt;p&gt;We can then trigger these alerts with something like:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expired&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;by&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;host_email&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;james@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we’re filtering on all &lt;code&gt;expired&lt;/code&gt; events and splitting the stream by the &lt;code&gt;:host&lt;/code&gt; field using the &lt;code&gt;by&lt;/code&gt; function. This creates a new stream for each host. We then call the &lt;code&gt;host_email&lt;/code&gt; function to send the email.&lt;/p&gt;

&lt;p&gt;The resulting email would look like:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/3/am211.png&quot; alt=&quot;Riemann new email alerts&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We could do similar things to modify the body of the email using the &lt;code&gt;:body&lt;/code&gt; option.&lt;/p&gt;
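
&lt;p&gt;As a rough sketch of what that might look like, the &lt;code&gt;:body&lt;/code&gt; function below writes one line per event. The wording of each line and the choice of fields are assumptions for illustration. &lt;code&gt;:host&lt;/code&gt;, &lt;code&gt;:service&lt;/code&gt; and &lt;code&gt;:state&lt;/code&gt; are standard event fields, but your events may carry others.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;(def host_email (mailer {:from &amp;quot;riemann@example.com&amp;quot;
                         :subject (fn [events]
                                    (str &amp;quot;Host &amp;quot; (get-in (first events) [:host]) &amp;quot; is down&amp;quot;))
                         ;; Hypothetical body: one line per event
                         :body (fn [events]
                                 (clojure.string/join &amp;quot;\n&amp;quot;
                                   (map (fn [event]
                                          (str (:service event) &amp;quot; on &amp;quot; (:host event) &amp;quot; is &amp;quot; (:state event)))
                                        events)))}))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;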

&lt;p&gt;P.S. I am slowly teaching myself Clojure. I’ve thus far found the &lt;a href=&quot;http://www.tryclj.com/&quot;&gt;Try Clojure site&lt;/a&gt;, the &lt;a href=&quot;http://learn-clojure.com/&quot;&gt;Learn Clojure&lt;/a&gt; site and &lt;a href=&quot;https://aphyr.com/tags/Clojure-from-the-ground-up&quot;&gt;Clojure from the ground up&lt;/a&gt; to be most useful for this.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/03/custom-emails-with-riemann/&quot;&gt;Custom emails with Riemann&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on March 27, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Treat GitHub Wiki like a repository]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/02/treat-github-wiki-like-a-repository/"/>
  <id>http://kartar.net/2015/02/treat-github-wiki-like-a-repository</id>
  <published>2015-02-27T00:00:00-05:00</published>
  <updated>2015-02-27T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;I recently needed to export all the articles from a
&lt;a href=&quot;https://www.github.com&quot;&gt;GitHub&lt;/a&gt; wiki. I had thought I’d need to scrape
it but I discovered that each GitHub wiki is in fact a Git repo.&lt;/p&gt;

&lt;p&gt;If you need a copy of the content you can just clone it via Git.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;git clone git@github.com:username/repo_name.wiki.git&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That’s neat and I hope it’s useful to someone else.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/02/treat-github-wiki-like-a-repository/&quot;&gt;Treat GitHub Wiki like a repository&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on February 27, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[The Art of Monitoring]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/02/the-art-of-monitoring/"/>
  <id>http://kartar.net/2015/02/the-art-of-monitoring</id>
  <published>2015-02-02T00:00:00-05:00</published>
  <updated>2015-02-02T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;TL;DR - I am writing a &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;book&lt;/a&gt; about
monitoring and you can sign up for updates
&lt;a href=&quot;http://logstashbook.us6.list-manage.com/subscribe/post?u=f3aa656fdcded6d1354d6f4f0&amp;amp;id=a0633aafc9&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s begin with an origin story. Once upon a time(-series) there was a
sysadmin. She managed infrastructure that lived in a data center. Every
time a new host was added to that environment she installed some
software and set up some checks. Every now and again one of those servers
would break and a check would trigger. An alert would be sent and she
would wake up and run &lt;code&gt;rm -fr /var/log/*.log&lt;/code&gt; to fix it.&lt;/p&gt;

&lt;p&gt;For many years this approach worked just fine. Oh there were some
dramas: sometimes things would go wrong for which there wasn’t a check,
or there just wasn’t time to action some alerts, or some applications
and services on top of those hosts weren’t monitored. But things were
mostly fine.&lt;/p&gt;

&lt;p&gt;Then things started to change in the IT industry. Virtualization was
introduced and a lot more hosts appeared. Many of those hosts were run
by people who weren’t sysadmins or were even outsourced to
third-parties. Then some of the hosts in her data center were moved into
the Cloud or replaced with Software-as-a-Service applications.&lt;/p&gt;

&lt;p&gt;Most importantly, applications and services that were previously merely
seen as technology now became critical to selling to customers and
providing high quality customer service. Suddenly IT wasn’t a cost
centre but rather something the company’s revenue relied on.&lt;/p&gt;

&lt;p&gt;As a result aspects of monitoring began to break down. It became hard to
keep track of hosts (there were a lot more of them!), applications and
infrastructure became more complex, and expectations around availability
and quality became more aggressive. It became harder and harder to check
for all the possible things that could go wrong using the current
system. More and more alerts piled up. More hosts and services meant
more demand on monitoring systems, most of which were only able to
vertically scale. Faults and outages became harder to find and slower to
detect under these loads.&lt;/p&gt;

&lt;p&gt;Additionally, the organization began demanding more and more data to both
demonstrate the quality of the service they were delivering to customers
and to justify the increasing spend on IT services. Many of these
demands were made for data that existing monitoring simply wasn’t
measuring or couldn’t generate. The monitoring system became a tangled
mess.&lt;/p&gt;

&lt;p&gt;This is monitoring right now for many people in the industry. But it
doesn’t have to be like that. You can build a better solution that
addresses the change in the way IT works and that scales for the future.&lt;/p&gt;

&lt;p&gt;Welcome to &lt;a href=&quot;http://artofmonitoring.com&quot;&gt;The Art of Monitoring&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a hands-on book that teaches you how to build a modern, scalable
monitoring environment using up-to-date tools and techniques.&lt;/p&gt;

&lt;p&gt;We include lessons for both sysadmins and developers. We’ll show
developers how they can better enable monitoring and metrics and we’ll
show sysadmins how to take advantage of that data to do better fault
detection and get insights into performance.&lt;/p&gt;

&lt;p&gt;We try to address the change in IT environments with virtualization,
containerization and the Cloud. We help you provide a monitoring
environment that helps you and your customers manage IT better.&lt;/p&gt;

&lt;p&gt;The book will contain:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Chapter 1: An Introduction to Monitoring&lt;/li&gt;
  &lt;li&gt;Chapter 2: Building a metrics-centric monitoring environment&lt;/li&gt;
  &lt;li&gt;Chapter 3: Metrics, metrics and measurement&lt;/li&gt;
  &lt;li&gt;Chapter 4: Building a service-centric and dynamic fault detection system&lt;/li&gt;
  &lt;li&gt;Chapter 5: Alerting&lt;/li&gt;
  &lt;li&gt;Chapter 6: Trending&lt;/li&gt;
  &lt;li&gt;Chapter 8: Visualization&lt;/li&gt;
  &lt;li&gt;Chapter 9: Anomaly Detection for fun and profit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Likely to change…)&lt;/p&gt;

&lt;p&gt;In the book we look at a variety of open source tools, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://riemann.io&quot;&gt;Riemann&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/graphite-project&quot;&gt;Graphite&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://sensuapp.org/&quot;&gt;Sensu&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://flapjack.io/&quot;&gt;Flapjack&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.logstash.net&quot;&gt;Logstash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The book will be published late in 2015.&lt;/p&gt;

&lt;p&gt;You can find more information on the book and its status
&lt;a href=&quot;http://artofmonitoring.com&quot;&gt;here&lt;/a&gt; and you can sign up for updates
&lt;a href=&quot;http://logstashbook.us6.list-manage.com/subscribe/post?u=f3aa656fdcded6d1354d6f4f0&amp;amp;id=a0633aafc9&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;


  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/02/the-art-of-monitoring/&quot;&gt;The Art of Monitoring&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on February 02, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Riemann Sample Configurations]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/02/riemann-sample-configurations/"/>
  <id>http://kartar.net/2015/02/riemann-sample-configurations</id>
  <published>2015-02-01T00:00:00-05:00</published>
  <updated>2015-02-01T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;One of the challenges of getting to know Riemann is that its
configuration is in Clojure. Your Riemann configuration is actually a
Clojure program that executes when Riemann is running. For some folks
this is a very new language and sometimes a new approach.&lt;/p&gt;
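
&lt;p&gt;To give a feel for what that means, here is a minimal sketch of a Riemann configuration, loosely modelled on the sample configuration that ships with Riemann. The log file path, listen address and TTL are assumptions chosen for illustration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clj&quot; data-lang=&quot;clj&quot;&gt;; Log to a file (the path is an assumption)
(logging/init {:file &amp;quot;/var/log/riemann/riemann.log&amp;quot;})

; Listen for events on the local interface over TCP and UDP
(let [host &amp;quot;127.0.0.1&amp;quot;]
  (tcp-server {:host host})
  (udp-server {:host host}))

; Scan the index for expired events every 5 seconds
(periodically-expire 5)

; Index every incoming event, giving events without a TTL one of 60 seconds
(let [index (index)]
  (streams
    (default :ttl 60
      index)))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;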

&lt;p&gt;To help with this process I’m keen on collecting a bunch of sample
Riemann configurations from people who have already “been there and done
this”. There are a few already online - &lt;a href=&quot;https://github.com/guardian/riemann-config&quot;&gt;The
Guardian&lt;/a&gt; have theirs up for
example - but I’d love to have more.&lt;/p&gt;

&lt;p&gt;I’ve created &lt;a href=&quot;https://github.com/jamtur01/riemann.config&quot;&gt;a repository to hold
them&lt;/a&gt; and it’d be great if
folks would create a pull request and add theirs. I’d also be happy to
manually add configurations via gist, pastie,
&lt;a href=&quot;&amp;#109;&amp;#097;&amp;#105;&amp;#108;&amp;#116;&amp;#111;:&amp;#106;&amp;#097;&amp;#109;&amp;#101;&amp;#115;&amp;#064;&amp;#108;&amp;#111;&amp;#118;&amp;#101;&amp;#100;&amp;#116;&amp;#104;&amp;#097;&amp;#110;&amp;#108;&amp;#111;&amp;#115;&amp;#116;&amp;#046;&amp;#110;&amp;#101;&amp;#116;&quot;&gt;email&lt;/a&gt; or any other way you’d like to
get them to me.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/02/riemann-sample-configurations/&quot;&gt;Riemann Sample Configurations&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on February 01, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Using Riemann for Metrics]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/01/using-riemann-for-metrics/"/>
  <id>http://kartar.net/2015/01/using-riemann-for-metrics</id>
  <published>2015-01-19T00:00:00-05:00</published>
  <updated>2015-01-19T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In &lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;my first post I introduced you to
Riemann&lt;/a&gt; and &lt;a href=&quot;http://kartar.net/2015/01/riemann-streams/&quot;&gt;my second post discussed Riemann for fault detection&lt;/a&gt;. In those posts we’ve discovered that Riemann aggregates events from distributed hosts and services. One of the cool outcomes of this aggregation is the ability to generate metrics from the events. We can then use a tool like &lt;a href=&quot;https://graphite.readthedocs.org/en/latest/index.html&quot;&gt;Graphite&lt;/a&gt; to store the metric data and render graphs from it. In this post you’ll see how to:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Install Graphite.&lt;/li&gt;
  &lt;li&gt;Generate metrics.&lt;/li&gt;
  &lt;li&gt;Integrate Riemann with Graphite.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;installing-graphite&quot;&gt;Installing Graphite&lt;/h2&gt;

&lt;p&gt;The first step we’re going to take is to install &lt;a href=&quot;https://graphite.readthedocs.org/en/latest/index.html&quot;&gt;Graphite&lt;/a&gt;. Graphite is an engine that stores time-series data and then renders graphs from that data.&lt;/p&gt;

&lt;p&gt;On an Ubuntu 14.04 or later host Graphite is available from APT packages. It’s made up of three components:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A web interface.&lt;/li&gt;
  &lt;li&gt;A storage engine called Carbon.&lt;/li&gt;
  &lt;li&gt;A database library called Whisper.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Graphite’s web interface also relies on a database backend. The default database is SQLite3 but you can specify PostgreSQL or MySQL/MariaDB if you wish (and I recommend one of these for a production environment - they are both far more robust than the default). We’re going to stick with the default right now as we’re just testing.&lt;/p&gt;

&lt;h3 id=&quot;installing-packages&quot;&gt;Installing Packages&lt;/h3&gt;

&lt;p&gt;Let’s install the packages we need.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo apt-get update
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo apt-get -y install graphite-web graphite-carbon apache2 libapache2-mod-wsgi&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We’ve first updated our APT package cache and then we’ve installed the &lt;code&gt;graphite-web&lt;/code&gt; and &lt;code&gt;graphite-carbon&lt;/code&gt; packages. The &lt;code&gt;graphite-web&lt;/code&gt; package contains Graphite’s web interface and the &lt;code&gt;graphite-carbon&lt;/code&gt; package contains the Carbon storage engine. We’ve also installed Apache and its &lt;code&gt;mod_wsgi&lt;/code&gt; module to run the Graphite web interface.&lt;/p&gt;

&lt;p&gt;You’ll be prompted during installation as to whether your graph database should be removed if you uninstall Graphite. Answer “No” to ensure your graph data is preserved.&lt;/p&gt;

&lt;h3 id=&quot;configuring-graphite&quot;&gt;Configuring Graphite&lt;/h3&gt;

&lt;p&gt;Next we need to configure Graphite. First we edit the &lt;code&gt;/etc/graphite/local_settings.py&lt;/code&gt; configuration file.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo vi /etc/graphite/local_settings.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We need to change two items in this file. The first, &lt;code&gt;SECRET_KEY&lt;/code&gt;, is used to salt hashes for Graphite’s authentication and the second, &lt;code&gt;TIME_ZONE&lt;/code&gt;, controls the time zone. The latter is important if you want your metrics to have the right time and date.&lt;/p&gt;

&lt;p&gt;We want to uncomment &lt;code&gt;SECRET_KEY&lt;/code&gt; and set it to a long random string. Let’s generate a string now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;cat /dev/urandom &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; tr -dc &lt;span class=&quot;s1&quot;&gt;&amp;#39;a-zA-Z0-9&amp;#39;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; fold -w &lt;span class=&quot;m&quot;&gt;256&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; head -1
SyN1cmnVFCOvHhKJ4Jxrfc5osJx5HNmOc60LVEFahYM0dusIYmCRndd2mFEfHi6WAf9Sv8xBksmsmdQSh6PcoBKhA0MeX6DMNszKZEyGTBpx3kU5AArbcAtoeyTHz6ROk25DSKmjw7MlbmVVuM5Nbf5ewCIl6OVN3iXDhPLX0wvkE7nKJHKDcqelIOR0EyXDoa25Z88W374TXVNSucpxlyLDXWhHP6XShXCza4EQKCu6GePvFLHl1pjpYrb4sv7J&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now let’s add this random string to &lt;code&gt;SECRET_KEY&lt;/code&gt; and uncomment and update our &lt;code&gt;TIME_ZONE&lt;/code&gt; setting inside &lt;code&gt;/etc/graphite/local_settings.py&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;SECRET_KEY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;SyN1cmnVFCOvHhKJ4Jxrfc5osJx5HNmOc60LVEFahYM0dusIYmCRndd2mFEfHi6WAf9Sv8xBksmsmdQSh6PcoBKhA0MeX6DMNszKZEyGTBpx3kU5AArbcAtoeyTHz6ROk25DSKmjw7MlbmVVuM5Nbf5ewCIl6OVN3iXDhPLX0wvkE7nKJHKDcqelIOR0EyXDoa25Z88W374TXVNSucpxlyLDXWhHP6XShXCza4EQKCu6GePvFLHl1pjpYrb4sv7J&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;TIME_ZONE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;America/New_York&amp;#39;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Later in the same file you’ll find a hash of database settings.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;DATABASES&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s1&quot;&gt;&amp;#39;default&amp;#39;&lt;/span&gt;: &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;NAME&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;/var/lib/graphite/graphite.db&amp;#39;&lt;/span&gt;,
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;ENGINE&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;django.db.backends.sqlite3&amp;#39;&lt;/span&gt;,
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;USER&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;,
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;PASSWORD&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;,
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;HOST&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;,
    &lt;span class=&quot;s1&quot;&gt;&amp;#39;PORT&amp;#39;&lt;/span&gt;: &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For the default SQLite3 database you won’t need to change this, but this is where you’d make changes if you wanted to use PostgreSQL or MySQL. In the default configuration you’ll find the database stored in &lt;code&gt;/var/lib/graphite/graphite.db&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;prepping-our-database&quot;&gt;Prepping our database&lt;/h3&gt;

&lt;p&gt;Next we need to prep our initial database using the &lt;code&gt;syncdb&lt;/code&gt; option of the &lt;code&gt;graphite-manage&lt;/code&gt; command. This populates our database with the required initial tables and structure.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo graphite-manage syncdb
Creating tables ...
Creating table account_profile
Creating table account_variable
Creating table account_view
Creating table account_window
Creating table account_mygraph
Creating table dashboard_dashboard_owners
Creating table dashboard_dashboard
Creating table events_event
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_groups
Creating table auth_user_user_permissions
Creating table auth_user
Creating table django_session
Creating table django_admin_log
Creating table django_content_type
Creating table tagging_tag
Creating table tagging_taggeditem

You just installed Django&lt;span class=&quot;s1&quot;&gt;&amp;#39;s auth system, which means you don&amp;#39;&lt;/span&gt;t have any superusers defined.
Would you like to create one now? &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;yes/no&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;: yes
Username &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;leave blank to use &lt;span class=&quot;s1&quot;&gt;&amp;#39;root&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;:
Email address: james@example.com
Password:
Password &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;again&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;:
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; object&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;s&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; from &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; fixture&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;s&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We also define a super-user to use with our database. I specify the default &lt;code&gt;root&lt;/code&gt;, an email address and then a secure password.&lt;/p&gt;

&lt;h3 id=&quot;configuring-carbon&quot;&gt;Configuring Carbon&lt;/h3&gt;

&lt;p&gt;Next I want to tweak Carbon’s density of metric retention, essentially how long metrics should be stored and how detailed those metrics should be. This is configured in the &lt;code&gt;/etc/carbon/storage-schemas.conf&lt;/code&gt; file. Let’s look at this file now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;# Schema definitions for Whisper files. Entries are scanned in order,&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# and first match wins. This file is scanned for changes every 60 seconds.&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#  [name]&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#  pattern = regex&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#  retentions = timePerPoint:timeToStore, timePerPoint:timeToStore, ...&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Carbon&amp;#39;s internal metrics. This entry should match what is specified in&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# CARBON_METRIC_PREFIX and CARBON_METRIC_INTERVAL settings&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;carbon&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;pattern&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; ^carbon&lt;span class=&quot;se&quot;&gt;\.&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;retentions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 60:90d

&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default_1min_for_1day&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;pattern&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; .*
&lt;span class=&quot;nv&quot;&gt;retentions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 60s:1d&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each schema entry matches specific metrics by name and specifies one or more retention periods. The first entry, &lt;code&gt;[carbon]&lt;/code&gt;, manages Carbon’s own metrics. A regular expression &lt;code&gt;pattern&lt;/code&gt; is matched to find these, here any metric starting with &lt;code&gt;carbon&lt;/code&gt;. The retentions are then set with the &lt;code&gt;retentions&lt;/code&gt; entry. You can specify one or more retentions in the form of:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;sample_time:retention_period&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For the Carbon metrics a data point is created every 60 seconds and kept for 90 days: &lt;code&gt;60:90d&lt;/code&gt;. This means each data point represents 60 seconds and we want to keep enough data points for 90 days of data.&lt;/p&gt;

&lt;p&gt;All other metrics use the &lt;code&gt;default_1min_for_1day&lt;/code&gt; schema, whose &lt;code&gt;pattern&lt;/code&gt; of &lt;code&gt;.*&lt;/code&gt; matches every metric. In this schema, Graphite creates data points every 60 seconds and keeps enough data to represent 1 day. That’s a pretty low resolution by most standards and Riemann processes events much more quickly. So we’re going to create a new schema and comment out the old one.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;c&quot;&gt;#[default_1min_for_1day]&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#pattern = .*&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;#retentions = 60s:1d&lt;/span&gt;

&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;default&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;pattern&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; .*
&lt;span class=&quot;nv&quot;&gt;retentions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 10s:1h, 1m:7d, 15m:30d, 1h:2y&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This new schema includes multiple retentions. Multiple retentions allow graceful downsampling of historical data, saving you disk and performance. Our first retention, &lt;code&gt;10s:1h&lt;/code&gt; creates data points every 10 seconds and keeps enough data for 1 hour and then our next retention, &lt;code&gt;1m:7d&lt;/code&gt;, retains 1 minute data points for 7 days and so on.&lt;/p&gt;

&lt;p&gt;To do the downsample from &lt;code&gt;10s:1h&lt;/code&gt; to &lt;code&gt;1m:7d&lt;/code&gt; Graphite gathers all of the data from the past minute (this should be six data points, one generated every 10 seconds). It then averages the data points to aggregate them and retains this new data point for 7 days. By default, each retention averages the total as it downsamples so you can determine metrics totals by reversing the average.&lt;/p&gt;
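
&lt;p&gt;As a rough check on what this schema stores, each retention needs its retention period divided by its sample time in data points. Here’s a quick sketch using shell arithmetic (treating a year as 365 days, which is my assumption rather than anything Graphite-specific):&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# Data points per retention: 10s:1h, 1m:7d, 15m:30d and 1h:2y, in that order
echo $(( 3600 / 10 ))
echo $(( 7 * 86400 / 60 ))
echo $(( 30 * 86400 / 900 ))
echo $(( 2 * 365 * 86400 / 3600 ))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That works out to 360, 10080, 2880 and 17520 data points respectively, or 30840 per metric, which is why the coarser retentions are what make two years of history affordable.&lt;/p&gt;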

&lt;p&gt;You can also configure Graphite to use alternate methods to aggregate the data points including &lt;code&gt;min&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;, &lt;code&gt;sum&lt;/code&gt; and &lt;code&gt;last&lt;/code&gt;. This is done by configuring a &lt;code&gt;/etc/carbon/storage-aggregation.conf&lt;/code&gt; file. There’s a sample file in &lt;code&gt;/usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example&lt;/code&gt;. We’re not going to do that right now but there’s an annoyingly frequent log message that appears in your Carbon logs, &lt;code&gt;/var/log/carbon/console.log&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;/etc/carbon/storage-aggregation.conf not found, ignoring.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Creating an empty &lt;code&gt;/etc/carbon/storage-aggregation.conf&lt;/code&gt; file stops the message so let’s do that now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo touch /etc/carbon/storage-aggregation.conf&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can see a lot more about how Carbon is configured &lt;a href=&quot;http://graphite.readthedocs.org/en/latest/config-carbon.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;run-carbon-at-startup&quot;&gt;Run Carbon at startup&lt;/h3&gt;

&lt;p&gt;Now let’s configure Carbon to run by default by editing the &lt;code&gt;/etc/default/graphite-carbon&lt;/code&gt; file.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo vi /etc/default/graphite-carbon&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Change the value of &lt;code&gt;CARBON_CACHE_ENABLED=false&lt;/code&gt; to &lt;code&gt;CARBON_CACHE_ENABLED=true&lt;/code&gt;.&lt;/p&gt;
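
&lt;p&gt;If you’d rather make that change without opening an editor, a &lt;code&gt;sed&lt;/code&gt; one-liner along these lines should do it. This is a sketch that assumes the file contains the setting exactly as &lt;code&gt;CARBON_CACHE_ENABLED=false&lt;/code&gt; at the start of a line; check the file afterwards to confirm.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo sed -i &amp;#39;s/^CARBON_CACHE_ENABLED=false/CARBON_CACHE_ENABLED=true/&amp;#39; /etc/default/graphite-carbon
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;grep CARBON_CACHE_ENABLED /etc/default/graphite-carbon&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;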

&lt;h3 id=&quot;installing-graphites-web-interface&quot;&gt;Installing Graphite’s web interface&lt;/h3&gt;

&lt;p&gt;As our last setup step we’re going to install Graphite’s web interface. To do this we’re going to install it as Apache’s default website. First, disable the existing default site.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo a2dissite 000-default&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now copy in Graphite’s Apache configuration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo cp /usr/share/graphite-web/apache2-graphite.conf /etc/apache2/sites-available&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And enable it.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo a2ensite apache2-graphite&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And we’re done.&lt;/p&gt;

&lt;h3 id=&quot;starting-carbon-and-graphite&quot;&gt;Starting Carbon and Graphite&lt;/h3&gt;

&lt;p&gt;Finally, let’s start or reload the required services.&lt;/p&gt;

&lt;p&gt;First Carbon.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo service carbon-cache start&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And then Apache.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo service apache2 reload&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can then view the Graphite web interface in your browser.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/1/graphite_web.png&quot; alt=&quot;Graphite Dashboard&quot; /&gt;&lt;/p&gt;
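
&lt;p&gt;To check that Carbon is accepting data before wiring up Riemann, you can push a single data point at Carbon’s plaintext listener. This is a sketch that assumes Carbon is listening on its default plaintext port, 2003, on &lt;code&gt;localhost&lt;/code&gt;, and that your &lt;code&gt;nc&lt;/code&gt; supports the &lt;code&gt;-q&lt;/code&gt; option; &lt;code&gt;test.metric&lt;/code&gt; is a made-up metric name.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;# Plaintext format is: metric.path value unix_timestamp
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;echo &amp;quot;test.metric 42 $(date +%s)&amp;quot; | nc -q0 localhost 2003&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If everything is working the new metric should show up in the metric tree in the web interface shortly afterwards.&lt;/p&gt;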

&lt;h2 id=&quot;configuring-riemann-for-graphite&quot;&gt;Configuring Riemann for Graphite&lt;/h2&gt;

&lt;p&gt;Riemann uses a Clojure-based configuration file to specify how events are processed and handled. On an Ubuntu host we can find that file at &lt;code&gt;/etc/riemann/riemann.config&lt;/code&gt;. We’re going to add a Graphite output to the configuration we used in the last posts on Riemann. Let’s look at an updated configuration now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;c1&quot;&gt;; -*- mode: clojure; -*-&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; vim: filetype=clojure&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;logging/init&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:file&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/var/log/riemann/riemann.log&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Listen on the local interface over TCP (5555), UDP (5555), and websockets&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; (5556)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;0.0.0.0&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tcp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;udp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ws-server&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Expire old events from the index every 10 seconds.&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;periodically-expire&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:keep-keys&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]})&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;graph&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;graphite&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;localhost&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

      &lt;span class=&quot;c1&quot;&gt;; graph all&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can see we’ve added a function called &lt;code&gt;graph&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;graph&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;graphite&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;localhost&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This defines a connection to our local Graphite server, here &lt;code&gt;localhost&lt;/code&gt;. You could also specify the name of a remote Graphite server and you can use either TCP or UDP to send events.&lt;/p&gt;

&lt;p&gt;Inside our &lt;code&gt;streams&lt;/code&gt; block we can then use the &lt;code&gt;graph&lt;/code&gt; function to send events through to Graphite. In our current configuration we’re graphing everything. This means every event sent to Riemann will get passed to Graphite and turned into a graph.&lt;/p&gt;

&lt;p&gt;Alternatively, if we don’t want to send everything to Graphite we can be more selective. For example, we could select only metrics from specific services.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;service&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;heartbeat&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we’re only sending events from the &lt;code&gt;heartbeat&lt;/code&gt; service through to Graphite.&lt;/p&gt;

&lt;p&gt;Now let’s send some metrics through to Graphite.&lt;/p&gt;

&lt;h3 id=&quot;sending-metrics-to-riemann-and-graphite&quot;&gt;Sending metrics to Riemann and Graphite&lt;/h3&gt;

&lt;p&gt;For our metrics we’re going to choose some Nginx metrics. We’ve got a host running Nginx and are going to use the &lt;code&gt;riemann-nginx-status&lt;/code&gt; command provided by the &lt;code&gt;riemann-tools&lt;/code&gt; gem to send the metrics.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo gem install riemann-tools&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;riemann-nginx-status&lt;/code&gt; command assumes the presence of an Nginx status page at &lt;code&gt;http://localhost:8080/nginx_status&lt;/code&gt;. You can configure a page like that in your Nginx configuration. You can also override the default location with the &lt;code&gt;--uri&lt;/code&gt; option.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;location /nginx_status &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  stub_status on&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  access_log   off&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  allow 127.0.0.1&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  deny all&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
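
&lt;p&gt;Before running the collector it’s worth confirming that the status page answers. Here’s a quick check, assuming Nginx is serving the status page on port 8080 as above and that &lt;code&gt;curl&lt;/code&gt; is installed on the Nginx host:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;curl http://localhost:8080/nginx_status&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You should get back a few lines of plain text beginning with the active connection count. If you get an error instead, check the &lt;code&gt;location&lt;/code&gt; block and the &lt;code&gt;allow&lt;/code&gt; rule first.&lt;/p&gt;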

&lt;p&gt;The Nginx status stub provides connection and status metrics. You can also control which metrics get sent to Riemann and specify any required thresholds. Let’s run &lt;code&gt;riemann-nginx-status&lt;/code&gt; now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-nginx-status --host riemann.example.com&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We’re sending our metrics from our Nginx host to &lt;code&gt;riemann.example.com&lt;/code&gt; and we should start to see events like these hit Riemann shortly:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;:host artemisia.example.com, :service nginx health, :state ok, :description Nginx status connection ok, :metric nil, :tags nil, :time 1421514112, :ttl 10.0&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;:host artemisia.example.com, :service nginx active, :state ok, :description nil, :metric 3, :tags nil, :time 1421514112, :ttl 10.0&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we have a health check and the active connections metric. These events should also have been passed through to Graphite. Let’s look at the resulting graphs in the Graphite web console.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/1/graphite_web2.png&quot; alt=&quot;Graphite Dashboard 2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We can see several metrics in our graph but not our health event. This is because Riemann only forwards events that have metrics. As the health event has a &lt;code&gt;metric&lt;/code&gt; value of &lt;code&gt;nil&lt;/code&gt; it’s not forwarded along to Graphite.&lt;/p&gt;

&lt;p&gt;Pretty simple eh? Instant graph gratification.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;We’ve seen how to install Graphite and connect it to Riemann. We’ve also seen how easy it is to turn our metrics into useful graphs. Building on this we could easily add categorization, filtering and manipulation (you remember all those cool things Riemann can do to events and their contents). A good starting point is &lt;a href=&quot;https://github.com/guardian/riemann-config&quot;&gt;The Guardian’s Riemann configuration&lt;/a&gt;. There are lots of useful examples and ideas there. Enjoy!&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/01/using-riemann-for-metrics/&quot;&gt;Using Riemann for Metrics&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on January 19, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[A Monitoring Maturity Model]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/01/a-monitoring-maturity-model/"/>
  <id>http://kartar.net/2015/01/a-monitoring-maturity-model</id>
  <published>2015-01-13T00:00:00-05:00</published>
  <updated>2015-01-13T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;I’ve been thinking a lot about &lt;a href=&quot;http://kartar.net/2014/11/monitoring-survey---background/&quot;&gt;monitoring maturity&lt;/a&gt;. Based on some research I did last year and a number of conversations with people in the industry I’ve documented a simple monitoring maturity model. I present it largely because some folks might be interested rather than as any sweeping revelation.&lt;/p&gt;

&lt;p&gt;The three-level maturity model reflects the various stages of monitoring evolution I’ve seen organizations experience. The three stages are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Manual&lt;/li&gt;
  &lt;li&gt;Reactive&lt;/li&gt;
  &lt;li&gt;Proactive&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On to the details of the stages.&lt;/p&gt;

&lt;h3 id=&quot;manual-or-none&quot;&gt;Manual or None&lt;/h3&gt;

&lt;p&gt;Monitoring is largely done manually or not at all. If monitoring is performed you will commonly see checklists, or simple scripts and other non-automated processes. Much of the monitoring is &lt;a href=&quot;https://en.wikipedia.org/wiki/Cargo_cult&quot;&gt;cargo cult&lt;/a&gt; behaviour where the components that are monitored are those that have broken in the past. Faults in these components are remediated by repeatedly following rote steps that have also “worked in the past”.&lt;/p&gt;

&lt;p&gt;The focus is entirely on minimizing downtime and managing assets. Monitoring provides little or no value in measuring quality or service and provides little or no data that helps IT justify budgets, costs or new projects.&lt;/p&gt;

&lt;p&gt;This is typical in small organizations with limited IT staffing, where there are no dedicated IT staff or where the IT function is run or managed by non-IT staff, such as a Finance team.&lt;/p&gt;

&lt;h3 id=&quot;reactive&quot;&gt;Reactive&lt;/h3&gt;

&lt;p&gt;Monitoring is mostly automatic with some remnants of manual or unmonitored components. Tooling of varying sophistication has been deployed to perform the monitoring. You will commonly see tools like Nagios with stock checks of basic concerns like disk, CPU and memory. Some performance data may be collected. Most alerting will be simple and via email or messaging services. There may be one or more centralized consoles displaying monitoring status.&lt;/p&gt;

&lt;p&gt;There is a broad focus on measuring availability and managing IT assets. There may be some movement towards using monitoring data to measure customer experience. Monitoring provides some data that measures quality or service and provides some data that helps IT justify budgets, costs or new projects. Most of this data needs to be manipulated or transformed before it can be used though. A small number of operationally-focussed dashboards exist.&lt;/p&gt;

&lt;p&gt;This is typical in small to medium enterprises and common in divisional IT organizations inside larger enterprises. Typically here monitoring is built and deployed by an operations team. You’ll often find large backlogs of alerts and stale check configuration and architecture. Updates to monitoring systems tend to be reactive in response to incidents and outages. New monitoring checks are usually the last step in application or infrastructure deployments.&lt;/p&gt;

&lt;h3 id=&quot;proactive&quot;&gt;Proactive&lt;/h3&gt;

&lt;p&gt;Monitoring is considered core to managing infrastructure and the business. Monitoring is automatic and often driven by configuration management tooling. You’ll see tools like Nagios, Sensu, and Graphite with widespread use of metrics and graphing. Checks will tend to be more application-centric, with many applications being instrumented as part of development. Checks will also focus on measuring application performance and business outcomes rather than stock concerns like disk and CPU. Performance data will be collected and frequently used for analysis and fault resolution. Alerting will be annotated with context and likely include escalations and automatic responses.&lt;/p&gt;

&lt;p&gt;There is a focus on measuring quality of service and customer experience. Monitoring provides data that measures quality or service and provides data that helps IT justify budgets, costs or new projects. Much of this data is provided directly to business units, application teams and other interested parties via dashboards and reports.&lt;/p&gt;

&lt;p&gt;This is typical in web-centric organizations and many mature startups. Monitoring will still largely be managed by an operations team but responsibility for ensuring new applications and services are monitored may be devolved to application developers. Products will not be considered feature complete or ready for deployment without monitoring and instrumentation.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;I don’t believe or claim this model is perfect (or overly scientific). It’s also largely designed so I can quantify some work I am conducting. The evolution of monitoring in organizations varies dramatically, or as William Gibson said: “The future is not evenly distributed.” The stages I’ve identified are broad. Organizations may be at varying points of a broad spectrum inside those stages.&lt;/p&gt;

&lt;p&gt;Additionally, what makes measuring this maturity difficult is that I don’t think all organizations experience this evolution linearly or holistically. This can be the consequence of having employees with varying levels of skill and experience over different periods. Or it can be that different segments, business units or divisions of an organization have quite different levels of maturity. Or both.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/01/a-monitoring-maturity-model/&quot;&gt;A Monitoring Maturity Model&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on January 13, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[Using Riemann for Fault Detection]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2015/01/riemann-streams/"/>
  <id>http://kartar.net/2015/01/riemann-streams</id>
  <published>2015-01-05T00:00:00-05:00</published>
  <updated>2015-01-05T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;p&gt;In &lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;the last post I introduced you to
Riemann&lt;/a&gt;. I
mentioned streams in that post and how they are at the heart of
Riemann’s power. However I only provided a vague teaser of streams and
left you having to go fish for yourself.&lt;/p&gt;

&lt;p&gt;In this post I’m going to build on our example Riemann configuration. I’ll show you how to do simple service management with streams and introduce you to Riemann’s state table: the index. We’ll see:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;How the index works.&lt;/li&gt;
  &lt;li&gt;How we can alert on services and hosts using events.&lt;/li&gt;
  &lt;li&gt;How we can send those alerts via email and PagerDuty.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;configuring-streams&quot;&gt;Configuring Streams&lt;/h2&gt;

&lt;p&gt;Streams are specified in Riemann’s Clojure-based configuration file. On our example Ubuntu host we can find that file at &lt;code&gt;/etc/riemann/riemann.config&lt;/code&gt;. We edited that configuration in the &lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;last post&lt;/a&gt; to bind Riemann to all interfaces and to add some more logging. Let’s look at it again now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;logging/init&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:file&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/var/log/riemann/riemann.log&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Listen on all interfaces over TCP (5555), UDP (5555), and websockets&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; (5556)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;0.0.0.0&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tcp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;udp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ws-server&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Expire old events from the index every 5 seconds.&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;periodically-expire&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

      &lt;span class=&quot;c1&quot;&gt;; Log expired events.&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expired&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;expired&amp;quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In our configuration we can see a section called &lt;code&gt;(streams&lt;/code&gt;. Inside this section is where we configure Riemann’s streams. The first entry in this section specifies a default time to live for events. More on this shortly. The second entry tells Riemann to index all events.&lt;/p&gt;

&lt;h2 id=&quot;the-riemann-index&quot;&gt;The Riemann Index&lt;/h2&gt;

&lt;p&gt;The index is a table of the current state of all services being tracked by Riemann. In the last post, when we introduced events, we discovered that each Riemann event is a struct that can contain any of a number of optional fields, including a host, a service, a state, a time, a description, a metric value and a time to live. Each event you tell Riemann to index is added and mapped by its host and service fields. The index retains the most recent event for each host and service pair. You can think about the index as Riemann’s worldview. The Riemann dashboard, which we also saw in the last post, uses the index as its source of truth.&lt;/p&gt;
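
&lt;p&gt;To make that concrete, here’s a minimal sketch in plain Clojure of how I think about the index: a map keyed by host and service where a newer event replaces an older one. This is a mental model rather than Riemann’s actual implementation, and the &lt;code&gt;index-event&lt;/code&gt; helper and the &lt;code&gt;client_req&lt;/code&gt; values are made up for illustration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Conceptual model only: index-event is a hypothetical helper, not part of Riemann.
; It stores each event under its [host service] pair, replacing any older event.
(defn index-event
  [idx event]
  (assoc idx [(:host event) (:service event)] event))

; Three sample events; the first and third share the same host and service.
(def events
  [{:host &amp;quot;varnish.example.com&amp;quot; :service &amp;quot;varnish client_conn&amp;quot; :state &amp;quot;ok&amp;quot; :metric 13795.0}
   {:host &amp;quot;varnish.example.com&amp;quot; :service &amp;quot;varnish client_req&amp;quot; :state &amp;quot;ok&amp;quot; :metric 20410.0}
   {:host &amp;quot;varnish.example.com&amp;quot; :service &amp;quot;varnish client_conn&amp;quot; :state &amp;quot;ok&amp;quot; :metric 13802.0}])

; Build the table by folding the events into an empty map.
(reduce index-event {} events)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Reducing over those three events leaves two entries in the map, and the &lt;code&gt;varnish client_conn&lt;/code&gt; entry holds the most recent metric.&lt;/p&gt;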

&lt;p&gt;Each indexed event has a Time To Live or TTL. The TTL can be set by the event’s &lt;code&gt;ttl&lt;/code&gt; field or as a default. In our configuration we’ve set the default TTL to 60 seconds with the &lt;a href=&quot;http://riemann.io/api/riemann.streams.html#var-default&quot;&gt;&lt;code&gt;default&lt;/code&gt;&lt;/a&gt; function. This is the TTL applied to any event which doesn’t already have one.&lt;/p&gt;

&lt;p&gt;After an event’s TTL expires it is dropped from the index and fed back into the stream with a &lt;code&gt;state&lt;/code&gt; of &lt;code&gt;expired&lt;/code&gt;. This seems pretty innocuous right? Nope! This is where the change in monitoring methodology that Riemann facilitates starts to become clear (and exciting).&lt;/p&gt;

&lt;h2 id=&quot;detecting-down-services&quot;&gt;Detecting down services&lt;/h2&gt;

&lt;p&gt;In &lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;the last post&lt;/a&gt; I talked a bit about pull/polling models versus push models for monitoring. In the monitoring “pull model” we actively poll services, for example using an active check like a Nagios plugin. If any of those services fails to respond or returns a malformed response, our monitoring system alerts us to that. This active monitoring generally results in a centralized, monolithic and vertically scaled solution. That’s not an ideal architecture.&lt;/p&gt;

&lt;p&gt;In an event-driven push model we don’t do any active monitoring. Our services generate events. Those events are pushed to Riemann. Each event has a TTL and the last event received is stored in the index. When the TTL expires Riemann will expire the event and feed it back into the stream. In that stream I can then monitor for events with a &lt;code&gt;state&lt;/code&gt; of &lt;code&gt;expired&lt;/code&gt; and alert on those. A much simpler, more scalable and IMHO more elegant solution.&lt;/p&gt;
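
&lt;p&gt;The expiry rule itself is simple enough to sketch in plain Clojure. Again this is only a model of the idea, not Riemann’s code: &lt;code&gt;expired-at?&lt;/code&gt; is a hypothetical helper, and I’m assuming the 60 second default TTL from our configuration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Conceptual model only: expired-at? is a hypothetical helper, not Riemann API.
; An event is stale once its age in seconds exceeds its TTL (default 60 here).
(defn expired-at?
  [now event]
  (&amp;gt; (- now (:time event)) (:ttl event 60)))

(def event
  {:host &amp;quot;varnish.example.com&amp;quot; :service &amp;quot;varnish client_conn&amp;quot;
   :state &amp;quot;ok&amp;quot; :time 1419404501 :ttl 10.0})

(expired-at? 1419404505 event) ; 4 seconds old, inside the 10 second TTL
(expired-at? 1419404520 event) ; 19 seconds old, past the 10 second TTL&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As long as the service keeps sending events faster than its TTL runs out, the second case never happens. Silence is the signal.&lt;/p&gt;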

&lt;p&gt;So let’s see how this might work for a service. In the last post we looked at some of the &lt;a href=&quot;https://github.com/aphyr/riemann-tools&quot;&gt;Riemann tools&lt;/a&gt; for service checking. Let’s use the &lt;code&gt;riemann-varnish&lt;/code&gt; tool again for our testing.&lt;/p&gt;

&lt;p&gt;On our Varnish host we need to install &lt;code&gt;riemann-tools&lt;/code&gt; via RubyGems.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo gem install riemann-tools&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can then use &lt;code&gt;riemann-varnish&lt;/code&gt; to send our events.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-varnish --host riemann.example.com&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;riemann-varnish&lt;/code&gt; command wraps the &lt;code&gt;varnishstat&lt;/code&gt; command and converts Varnish statistics into Riemann events. For example, the client connections accepted metric generates an event like so:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_conn&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;connections&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;accepted&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;13795.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can see that the event has a &lt;code&gt;host&lt;/code&gt; and a &lt;code&gt;service&lt;/code&gt;, the combination of which Riemann will use to track state in the index. The event also has a &lt;code&gt;state&lt;/code&gt; field of &lt;code&gt;ok&lt;/code&gt; plus other useful information like the actual client connections accepted metric.&lt;/p&gt;

&lt;p&gt;We’re going to use this data, plus the TTL, to do basic service monitoring with Riemann. Let’s update our configuration to email us when a service changes state.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mailer&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:from&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;riemann@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;changed-state&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:init&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;ok&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;james@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The first thing we’ve added is a function called &lt;code&gt;email&lt;/code&gt; that configures the emailing of events. Under the covers Riemann uses &lt;a href=&quot;https://github.com/drewr/postal&quot;&gt;Postal&lt;/a&gt; to send email for you. This basic configuration uses local sendmail to send emails. The &lt;code&gt;From&lt;/code&gt; email will be &lt;code&gt;riemann@example.com&lt;/code&gt;. You could also configure &lt;a href=&quot;http://riemann.io/api/riemann.email.html&quot;&gt;sending via SMTP&lt;/a&gt;. To send emails you’ll need to ensure you have local mail configured on your host. To do this I usually install the &lt;code&gt;mailutils&lt;/code&gt; package.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo apt-get -y install mailutils&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If you don’t install a suitable local mail server then you’ll receive a somewhat cryptic error in your Riemann log along the lines of:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;riemann.email$mailer$make_stream threw java.lang.NullPointerException
&lt;/code&gt;&lt;/pre&gt;
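
&lt;p&gt;If you’d rather not run a local mail server, the SMTP route mentioned above might look something like this. The keys follow the example in the Riemann email documentation as I remember it, so check them against your version. The relay hostname and credentials are placeholders.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Hypothetical SMTP relay and credentials; replace them with your own.
(def email (mailer {:host &amp;quot;smtp.example.com&amp;quot;
                    :user &amp;quot;riemann&amp;quot;
                    :pass &amp;quot;secret&amp;quot;
                    :from &amp;quot;riemann@example.com&amp;quot;}))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The rest of the configuration stays the same, as the streams only ever refer to &lt;code&gt;email&lt;/code&gt;.&lt;/p&gt;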

&lt;p&gt;Next we’ve used a helper shortcut called &lt;code&gt;changed-state&lt;/code&gt; to monitor for events whose state has changed. The &lt;code&gt;:init&lt;/code&gt; option specifies the assumed initial state of an event, here &lt;code&gt;ok&lt;/code&gt;. We need this because Riemann doesn’t know about the previous state of events when it starts, so this tells Riemann to assume previous events were all okay. Now the &lt;code&gt;changed-state&lt;/code&gt; shortcut will match any event whose state differs from the previous state seen for the same host and service, and pass it to the &lt;code&gt;email&lt;/code&gt; function we defined earlier.&lt;/p&gt;
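
&lt;p&gt;As I understand it, &lt;code&gt;changed-state&lt;/code&gt; is roughly shorthand for splitting the stream by host and service and then watching the &lt;code&gt;:state&lt;/code&gt; field for changes. Here’s a sketch of the longhand form, assuming the &lt;code&gt;email&lt;/code&gt; definition from earlier. Treat it as an illustration of the idea rather than a drop-in replacement.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;(streams
  ; Track each host and service pair separately.
  (by [:host :service]
    ; Pass an event on only when its :state differs from the last one seen,
    ; assuming a starting state of ok.
    (changed :state {:init &amp;quot;ok&amp;quot;}
      (email &amp;quot;james@example.com&amp;quot;))))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Spelling it out this way is handy if you want to watch a field other than the state, or group events by something other than host and service.&lt;/p&gt;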

&lt;p&gt;Let’s see this in action. First, we need to restart or HUP Riemann. Next, whilst I’ve been explaining this, the &lt;code&gt;riemann-varnish&lt;/code&gt; tool has been sending events to Riemann. Those events are from my Varnish host, &lt;code&gt;varnish.example.com&lt;/code&gt;, and an event is generated for each Varnish metric. Each event has a state of &lt;code&gt;ok&lt;/code&gt; and a TTL of 10 seconds.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_conn&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;connections&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;accepted&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;13795.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If Varnish fails or I stop the &lt;code&gt;riemann-varnish&lt;/code&gt; tool then the flow of events will cease. When the TTL has expired, 10 seconds later, this should trigger an event with a state of &lt;code&gt;expired&lt;/code&gt; and email notifications telling us that the Varnish services have changed state.&lt;/p&gt;

&lt;p&gt;If we check our Riemann log file we should see the following event.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1420058947163&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;/1000&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;expired&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;7184.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_conn&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We’ll also see additional events for each of the other Varnish metrics that have expired. If we check our inbox we should also see email notifications for each service that has stopped reporting.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/1/riemann_email.png&quot; alt=&quot;Riemann Email Notification&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If the service starts working again you’ll receive another set of notifications that things are back to normal.&lt;/p&gt;

&lt;h2 id=&quot;preventing-spikes-and-flapping&quot;&gt;Preventing spikes and flapping&lt;/h2&gt;

&lt;p&gt;Like most monitoring systems we also have to be conscious of the potential for state spikes and flapping. Riemann provides a useful stream to help us here called &lt;a href=&quot;http://riemann.io/api/riemann.streams.html#var-stable&quot;&gt;stable&lt;/a&gt;. This stream allows us to specify a time period and an event field, like the &lt;code&gt;state&lt;/code&gt; (or usefully the &lt;code&gt;metric&lt;/code&gt; for certain types of monitoring), and it monitors for spiky or flapping behavior. Let’s add &lt;code&gt;stable&lt;/code&gt; to our example.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
      &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;changed-state&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:init&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;ok&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;stable&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt;
          &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;email&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;james@example.com&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we’ve specified the &lt;code&gt;stable&lt;/code&gt; stream with a time period of 60 seconds, watching the &lt;code&gt;state&lt;/code&gt; of events. This means that Riemann will only pass on events where the state remains the same for at least 60 seconds, hopefully avoiding service flapping. (Also potentially interesting here is the ability to &lt;a href=&quot;http://riemann.io/howto.html#roll-up-and-throttle-events&quot;&gt;rollup and throttle event streams&lt;/a&gt;.)&lt;/p&gt;
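
&lt;p&gt;As a rough sketch of that rollup idea, here’s how I’d expect it to slot into our example, again assuming the &lt;code&gt;email&lt;/code&gt; definition from earlier. The limit of 5 emails per 3600 seconds is an arbitrary choice for illustration, so tune it to your own tolerance for inbox noise.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Fragment: this goes inside the (default :ttl 60 ...) block, after index.
(changed-state {:init &amp;quot;ok&amp;quot;}
  (stable 60 :state
    ; Send at most 5 emails per hour. Any further events in that hour
    ; are batched into a single summary email when the hour ends.
    (rollup 5 3600
      (email &amp;quot;james@example.com&amp;quot;))))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If you’d rather drop the extra events than summarize them, &lt;code&gt;throttle&lt;/code&gt; takes the same count and time period arguments.&lt;/p&gt;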

&lt;h2 id=&quot;sending-events-to-pagerduty&quot;&gt;Sending events to PagerDuty&lt;/h2&gt;

&lt;p&gt;We aren’t limited to email for alerting either. Riemann comes with some additional options, most notably &lt;a href=&quot;http://www.pagerduty.com/&quot;&gt;PagerDuty&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;def &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;pd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;pagerduty&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;pagerduty-service-key&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;changed-state&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:init&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;ok&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;stable&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;ok&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:resolve&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;expired&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:trigger&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we’ve defined a function called &lt;code&gt;pd&lt;/code&gt; that creates a connection to PagerDuty. We’ve specified a service key we previously defined in PagerDuty. We’ve updated our state monitoring to trigger in two cases:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When an event has a state of &lt;code&gt;expired&lt;/code&gt; we send an alert trigger to PagerDuty.&lt;/li&gt;
  &lt;li&gt;When an event has a state of &lt;code&gt;ok&lt;/code&gt; we send a resolution signal to PagerDuty.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures we can both trigger and resolve issues created from Riemann.&lt;/p&gt;

&lt;p&gt;Let’s trigger some PagerDuty alerts. First, we need to restart or HUP Riemann to update our configuration. Next, we can generate some alerts by stopping our &lt;code&gt;riemann-varnish&lt;/code&gt; tool again. The &lt;code&gt;expired&lt;/code&gt; events should trigger some PagerDuty alerts like these.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2015/1/riemann_pd.png&quot; alt=&quot;Riemann PagerDuty Notification&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;Pretty cool stuff eh? Well this post just scratches the surface of things you can do with Riemann streams. There are a bunch of other ideas and examples in the &lt;a href=&quot;http://riemann.io/howto.html&quot;&gt;Riemann HOWTO&lt;/a&gt; section that you can explore. Also look out for my next post on Riemann where I’ll be looking at streams again, this time with a focus on metrics and Graphite.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2015/01/riemann-streams/&quot;&gt;Using Riemann for Fault Detection&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on January 05, 2015.&lt;/p&gt;</content>
</entry>


<entry>
  <title type="html"><![CDATA[An Introduction to Riemann]]></title>
  <link rel="alternate" type="text/html" href="http://kartar.net/2014/12/an-introduction-to-riemann/"/>
  <id>http://kartar.net/2014/12/an-introduction-to-riemann</id>
  <published>2014-12-26T00:00:00-05:00</published>
  <updated>2014-12-26T00:00:00-05:00</updated>
  <author>
    <name>James Turnbull</name>
    <uri>http://kartar.net</uri>
    <email>james@lovedthanlost.net</email>
  </author>
  <content type="html">&lt;blockquote&gt;
  &lt;p&gt;If only I had the theorems! Then I should find the proofs easily enough - Bernhard Riemann&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For the last year I’ve been using nights and weekends to look at a
variety of monitoring and logging tools. &lt;a href=&quot;http://artofmonitoring.com/&quot;&gt;For
reasons&lt;/a&gt;. I’ve spent a lot of hours playing
with Nagios again (some years ago &lt;a href=&quot;http://tinyurl.com/pronagios&quot;&gt;I wrote a book about
it&lt;/a&gt;) as well as looking at tools like
&lt;a href=&quot;http://sensuapp.org/&quot;&gt;Sensu&lt;/a&gt; and
&lt;a href=&quot;http://kartar.net/2014/09/a-whole-lot-of-heka/&quot;&gt;Heka&lt;/a&gt;. One of the tools
I am reviewing and am quite excited about is
&lt;a href=&quot;http://riemann.io/&quot;&gt;Riemann&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Riemann is a monitoring tool that aggregates events from hosts
and applications and can feed them into a stream processing language to
be manipulated, summarized or actioned. The idea behind Riemann is to
make monitoring and measuring events an easy default. Riemann also
provides alerting and notifications, the ability to send events onto
other services and storage and a variety of other integrations. Overall,
Riemann is fast and highly configurable.  Most importantly however it is
an event-centric push model.&lt;/p&gt;

&lt;p&gt;So why does this matter? Most monitoring systems I’ve been examining are
pull or polling-based systems like Nagios where your monitoring system
queries the components being monitored. A classic (perhaps even
traditional) check might be an ICMP-based ping of a server. This type of
polling is focused on measuring uptime and availability. There’s nothing
fundamentally wrong with wanting to know that assets are available and
running. Except if that’s the only question you ask. Then it reinforces
the view of IT as a cost center.&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Everything in the IT organization
tends to be focused around minimizing downtime rather than maximizing
value.&lt;/p&gt;

&lt;p&gt;Push-based models, in comparison, are generally about measurement. You
still get availability measurement but as a side effect of measuring
components and services. The push model also introduces some changes in
the way monitoring is architected. Monitoring is no longer a monolithic
central function and we don’t need to vertically scale that monolith as
hosts are added. Instead pushes are decentralized and the focus is on
measuring your applications, your business and your user experience.
This changes the focus inside your IT organization towards measuring
value, throughput and performance. All levers that are about profit
rather than cost.&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So with this in mind, let’s take a look at installing Riemann,
configuring it and doing some basic service and event monitoring.&lt;/p&gt;

&lt;h2 id=&quot;introducing-riemann&quot;&gt;Introducing Riemann&lt;/h2&gt;

&lt;p&gt;Riemann is &lt;a href=&quot;https://github.com/aphyr/riemann&quot;&gt;open source&lt;/a&gt; and licensed
with the &lt;a href=&quot;https://github.com/aphyr/riemann/blob/master/LICENSE&quot;&gt;Eclipse Public
license&lt;/a&gt;. It is
primarily authored by &lt;a href=&quot;https://aphyr.com/&quot;&gt;Kyle Kingsbury&lt;/a&gt; aka
Aphyr.&lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; Riemann is written in Clojure and runs on top of a
&lt;a href=&quot;https://en.wikipedia.org/wiki/Java_virtual_machine&quot;&gt;JVM&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;installing-riemann&quot;&gt;Installing Riemann&lt;/h2&gt;

&lt;p&gt;We’re going to install Riemann onto an Ubuntu 14.04 host. We’re going to
use &lt;a href=&quot;https://aphyr.com/riemann/riemann_0.2.6_all.deb&quot;&gt;the Riemann project’s DEB
packages&lt;/a&gt;.  Also
available are RPM packages and tarballs. I am going to do a manual
install so you can see the steps involved but you could also install
Riemann via
&lt;a href=&quot;https://registry.hub.docker.com/search?q=riemann&amp;amp;searchfield=&quot;&gt;Docker&lt;/a&gt;,
&lt;a href=&quot;https://forge.puppetlabs.com/garethr/riemann&quot;&gt;Puppet&lt;/a&gt;, &lt;a href=&quot;https://github.com/garethr/riemann-vagrant&quot;&gt;Vagrant&lt;/a&gt;, or &lt;a href=&quot;https://github.com/hudl/riemann-cookbook&quot;&gt;Chef&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, we’ll need Java and Ruby installed: Java to run Riemann
itself, and Ruby for some supporting libraries, a client and the Riemann
dashboard. For Java we’re going to use the default OpenJDK available on
Ubuntu. For Ruby we’re going to install the &lt;code&gt;ruby-dev&lt;/code&gt; package which
will drag in Ruby and all the required dependencies we need. We also
need the &lt;code&gt;build-essential&lt;/code&gt; package to allow us to compile some of the
Ruby dependencies.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo apt-get -y install default-jre ruby-dev build-essential&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then let’s check Java is installed correctly.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;java -version
java version &lt;span class=&quot;s2&quot;&gt;&amp;quot;1.7.0_65&amp;quot;&lt;/span&gt;
OpenJDK Runtime Environment &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;IcedTea 2.5.3&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;7u71-2.5.3-0ubuntu0.14.04.1&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
OpenJDK 64-Bit Server VM &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;build 24.65-b04, mixed mode&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now let’s grab the DEB package of the current release.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;wget https://aphyr.com/riemann/riemann_0.2.8_all.deb&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And then install it via the &lt;code&gt;dpkg&lt;/code&gt; command.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo dpkg -i riemann_0.2.8_all.deb&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Riemann DEB package installs the &lt;code&gt;riemann&lt;/code&gt; binary and supporting
files, service management and a default configuration file.&lt;/p&gt;

&lt;p&gt;Lastly, let’s install some supporting tools, the Riemann client and
dashboard.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo gem install --no-ri --no-rdoc riemann-client riemann-tools riemann-dash&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id=&quot;running-riemann&quot;&gt;Running Riemann&lt;/h2&gt;

&lt;p&gt;We can run Riemann interactively via the command line or as a daemon. If
we’re running it as a daemon we can use the Ubuntu service management
commands:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo service riemann start
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo service riemann stop
. . .&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let’s start though with running it interactively using the &lt;code&gt;riemann&lt;/code&gt;
binary. To do this we need to specify a configuration file. Conveniently
the installation process has added one at &lt;code&gt;/etc/riemann/riemann.config&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;sudo riemann /etc/riemann/riemann.config
loading bin
INFO &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;2014-12-21 18:13:21,841&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; main - riemann.bin - PID 18754
INFO &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;2014-12-21 18:13:22,056&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; clojure-agent-send-off-pool-2 - riemann.transport.websockets - Websockets server 127.0.0.1 &lt;span class=&quot;m&quot;&gt;5556&lt;/span&gt; online
INFO &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;2014-12-21 18:13:22,091&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; clojure-agent-send-off-pool-4 - riemann.transport.tcp - TCP server 127.0.0.1 &lt;span class=&quot;m&quot;&gt;5555&lt;/span&gt; online
INFO &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;2014-12-21 18:13:22,099&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; clojure-agent-send-off-pool-3 - riemann.transport.udp - UDP server 127.0.0.1 &lt;span class=&quot;m&quot;&gt;5555&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;16384&lt;/span&gt; online
INFO &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;2014-12-21 18:13:22,102&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; main - riemann.core - Hyperspace core online&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can see that Riemann has been started and a couple of services have
been started: a Websockets server on port 5556 and TCP and UDP servers
on port 5555.  By default Riemann binds to &lt;code&gt;localhost&lt;/code&gt; only.&lt;/p&gt;

&lt;p&gt;The default configuration on Ubuntu logs to
&lt;code&gt;/var/log/riemann/riemann.log&lt;/code&gt; and you can also follow the daemon’s
activity there.&lt;/p&gt;

&lt;h2 id=&quot;configuring-riemann&quot;&gt;Configuring Riemann&lt;/h2&gt;

&lt;p&gt;Riemann is configured using a Clojure configuration file. By default on
Ubuntu it is available at &lt;code&gt;/etc/riemann/riemann.config&lt;/code&gt;. Let’s take a
quick look at the default file.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;c1&quot;&gt;; -*- mode: clojure; -*-&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; vim: filetype=clojure&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;logging/init&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:file&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;/var/log/riemann/riemann.log&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Listen on the local interface over TCP (5555), UDP (5555), and websockets&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; (5556)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;127.0.0.1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tcp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;udp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ws-server&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;; Expire old events from the index every 5 seconds.&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;periodically-expire&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;index &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;; Inbound events will be passed to these streams:&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;; Index all events immediately.&lt;/span&gt;
    &lt;span class=&quot;nv&quot;&gt;index&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;; Log expired events.&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;expired&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;expired&amp;quot;&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can see the file is broken into a few stanzas. The first stanza sets
up Riemann’s logging to a file: &lt;code&gt;/var/log/riemann/riemann.log&lt;/code&gt;. The
second stanza controls Riemann’s interfaces: binding TCP, UDP and
Websockets interfaces to &lt;code&gt;localhost&lt;/code&gt; by default. Let’s make a quick
change here to bind these interfaces to all available networks.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;let &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;0.0.0.0&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;tcp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;udp-server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ws-server&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We’ve updated the &lt;code&gt;host&lt;/code&gt; value from &lt;code&gt;127.0.0.1&lt;/code&gt; to &lt;code&gt;0.0.0.0&lt;/code&gt;. This means
if one of your interfaces is on the Internet then your Riemann server is
now on the Internet. If you’re worried about security you can also
&lt;a href=&quot;http://riemann.io/howto.html#securing-traffic-using-tls&quot;&gt;configure Riemann with
TLS&lt;/a&gt;.&lt;/p&gt;
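
&lt;p&gt;As a rough sketch of what a TLS-enabled listener can look like, the stanza below swaps the plain &lt;code&gt;tcp-server&lt;/code&gt; line for one that passes the &lt;code&gt;:tls?&lt;/code&gt;, &lt;code&gt;:key&lt;/code&gt;, &lt;code&gt;:cert&lt;/code&gt; and &lt;code&gt;:ca-cert&lt;/code&gt; options described in the Riemann howto. The file paths are placeholders made up for illustration, not files the package installs, so treat this as a starting point rather than a tested configuration.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Sketch only: the paths below are hypothetical. Point them at your own
; PKCS8 private key, server certificate and CA certificate.
(let [host &amp;quot;0.0.0.0&amp;quot;]
  ; This replaces the plain tcp-server line; two TCP servers cannot share a port.
  (tcp-server {:host    host
               :tls?    true
               :key     &amp;quot;/etc/riemann/tls/server.pkcs8&amp;quot;
               :cert    &amp;quot;/etc/riemann/tls/server.crt&amp;quot;
               :ca-cert &amp;quot;/etc/riemann/tls/cacert.pem&amp;quot;}))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Clients then need their own key and certificate signed by the same CA before they can connect.&lt;/p&gt;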

&lt;p&gt;The remaining sections configure indexing and streams. Streams are a big
part of why Riemann is very cool. Streams are functions you can pass
events to for aggregation, modification, or escalation. Streams can also
have child-streams that they can pass events to, allowing filtering or
partitioning of the event stream. Using streams is amazingly powerful
and you can find &lt;a href=&quot;http://riemann.io/howto.html#working-with-streams&quot;&gt;sample configurations and a wide variety of howtos on
the Riemann site&lt;/a&gt;.&lt;/p&gt;
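
&lt;p&gt;To give a feel for child streams, here is a minimal sketch that uses the &lt;code&gt;where&lt;/code&gt; stream to pass only events with a state of &lt;code&gt;critical&lt;/code&gt; to a child that logs them. The &lt;code&gt;critical&lt;/code&gt; state and the log message are assumptions chosen for illustration, and this stanza isn’t needed for the rest of the walkthrough.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Sketch only: filter the event stream, then hand matches to a child stream.
(streams
  ; where passes an event to its children only when the predicate matches.
  (where (state &amp;quot;critical&amp;quot;)
    ; The child is an ordinary function of one event; here it logs it.
    #(info &amp;quot;critical event&amp;quot; %)))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;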

&lt;p&gt;Let’s make a small change to our &lt;code&gt;streams&lt;/code&gt; stanza to output events to
&lt;code&gt;STDOUT&lt;/code&gt; and our log file. Add the following at the bottom of the file
after all of the other stanzas.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;c1&quot;&gt;;print events to the log&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;streams&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;prn&lt;/span&gt;

  &lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;prn&lt;/code&gt; prints all events to &lt;code&gt;STDOUT&lt;/code&gt; and the &lt;code&gt;#(info %)&lt;/code&gt; sends events
to the log file. Now restart Riemann to enable our new configuration.&lt;/p&gt;

&lt;h2 id=&quot;sending-data-to-riemann&quot;&gt;Sending data to Riemann&lt;/h2&gt;

&lt;p&gt;Riemann has a variety of ways you can send data to it including a set of
tools and a variety of client native language bindings. You can find a
full list of the clients &lt;a href=&quot;http://riemann.io/clients.html&quot;&gt;here&lt;/a&gt; and
we’ll see how to use a client below. The collection of tools are written
in Ruby and available via the &lt;code&gt;riemann-tools&lt;/code&gt; gem we installed above.
Each tool ships as a separate binary and you can see a list of the
available tools
&lt;a href=&quot;https://github.com/aphyr/riemann-tools/tree/master/bin&quot;&gt;here&lt;/a&gt;. They
include basic health checks, web services like Apache and Nginx, Cloud
services like AWS and a variety of others. The code is clear and you
could easily extend or adapt these to provide a variety of other
monitoring capabilities.&lt;/p&gt;

&lt;p&gt;The easiest of these tools to test is &lt;code&gt;riemann-health&lt;/code&gt;. It sends CPU,
memory, disk and load statistics to Riemann. Open up a new session and launch
it now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-health&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can either run it locally on the same host you’re running Riemann on
or you can point it at a Riemann server using the &lt;code&gt;--host&lt;/code&gt; flag.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-health --host myriemann.example.com&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Remember that by default Riemann is only bound to &lt;code&gt;localhost&lt;/code&gt; but we updated
our configuration to bind to all interfaces.&lt;/p&gt;

&lt;p&gt;Now let’s look at our incoming data. Let’s start with looking at the
Riemann log file.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;tail&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;/var/log/riemann/riemann.log&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;INFO&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2014-12-23&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:23:47&lt;/span&gt;,&lt;span class=&quot;mi&quot;&gt;050&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pool-1-thread-16&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.codec.Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;riemann.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;disk&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;/&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;used&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.11&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419373427&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;INFO&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2014-12-23&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:23:47&lt;/span&gt;,&lt;span class=&quot;mi&quot;&gt;055&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pool-1-thread-18&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.codec.Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;riemann.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;load&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;-minute&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;load &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;average/core&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.11&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.11&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419373427&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;. . &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we can see a couple of events, one for disk space and another for
load. Each Riemann event is a struct. Each event can contain any of a
number of optional fields including: host, service, state, a time and
description, a metric value or a TTL. They can also contain custom
fields.&lt;/p&gt;
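
&lt;p&gt;Because events can be read like Clojure maps, a stream can pull out individual fields by keyword. The sketch below is illustrative and not part of the default configuration. It logs just the host, service and metric of each event rather than the whole struct.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;; Sketch only: read selected fields from each incoming event.
(streams
  (fn [event]
    ; Fields an event does not carry come back as nil.
    (info &amp;quot;host&amp;quot; (:host event)
          &amp;quot;service&amp;quot; (:service event)
          &amp;quot;metric&amp;quot; (:metric event))))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;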

&lt;p&gt;Let’s examine one of the disk events &lt;code&gt;riemann-health&lt;/code&gt; has sent:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;riemann.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;disk&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;/&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;used&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.11&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419373427&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can see the event has a host, service, and state. If we peek over at
&lt;a href=&quot;https://github.com/aphyr/riemann-tools/blob/master/bin/riemann-health#L64&quot;&gt;the code that produced the
event&lt;/a&gt;
we can see how it is generated and
&lt;a href=&quot;https://github.com/aphyr/riemann-tools/blob/master/lib/riemann/tools.rb#L55&quot;&gt;sent&lt;/a&gt;.
As event APIs go it’s very lightweight but still hugely extensible.&lt;/p&gt;

&lt;p&gt;Let’s try another tool, &lt;code&gt;riemann-varnish&lt;/code&gt;, which reports Varnish
metrics. On one of my hosts with Varnish installed I run.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-varnish --host riemann.example.com&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And on the Riemann host I see in &lt;code&gt;/var/log/riemann/riemann.log&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;nv&quot;&gt;INFO&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2014-12-24&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;02&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:01:41&lt;/span&gt;,&lt;span class=&quot;mi&quot;&gt;660&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pool-1-thread-19&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.codec.Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_conn&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;connections&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;accepted&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;13795.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;INFO&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2014-12-24&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;02&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:01:41&lt;/span&gt;,&lt;span class=&quot;mi&quot;&gt;706&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pool-1-thread-21&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.codec.Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_drop&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Connection&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;dropped&lt;/span&gt;, &lt;span class=&quot;nv&quot;&gt;no&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;sess/wrk&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;INFO&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2014-12-24&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;02&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:01:41&lt;/span&gt;,&lt;span class=&quot;mi&quot;&gt;751&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pool-1-thread-22&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.config&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;- &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;riemann.codec.Event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_req&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;requests&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;received&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;15452.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let’s drill down to a specific event.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;varnish&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;client_conn&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ok&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Client&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;connections&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;accepted&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;13795.0&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419404501&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here we can see the Varnish client connections accepted metric. If we
look at the &lt;code&gt;riemann-varnish&lt;/code&gt;
&lt;a href=&quot;https://github.com/aphyr/riemann-tools/blob/master/bin/riemann-varnish&quot;&gt;code&lt;/a&gt;
we can see a shell-out to &lt;code&gt;varnishstat&lt;/code&gt; that captures our metrics and
sends them to Riemann. Pretty easy to replicate for a variety of
services.&lt;/p&gt;
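
&lt;p&gt;To make the shell-out-and-parse idea concrete, here’s a minimal sketch of the parsing half in Ruby. It is not the actual &lt;code&gt;riemann-varnish&lt;/code&gt; code: the &lt;code&gt;parse_stats&lt;/code&gt; helper and the sample text are made up for illustration, and I’m assuming input shaped like &lt;code&gt;varnishstat -1&lt;/code&gt; output (name, value, rate, description). It only builds event hashes and doesn’t talk to Riemann.&lt;/p&gt;

```ruby
# Hypothetical helper: turn "name value rate description" lines into
# Riemann-style event hashes. Pure function, no network access.
def parse_stats(output, host: 'varnish.example.com', prefix: 'varnish')
  output.each_line.map do |line|
    # Split into at most four fields so the description keeps its spaces.
    name, value, _rate, description = line.strip.split(/\s+/, 4)
    next if name.nil? || value.nil? # skip blank or malformed lines

    {
      host: host,
      service: "#{prefix} #{name}",
      metric: value.to_f,
      description: description,
      state: 'ok'
    }
  end.compact
end

# Sample text invented for illustration; a real tool would capture
# the output of the stats command instead.
sample = [
  'client_conn 13795 0.50 Client connections accepted',
  'client_drop 0 0.00 Connection dropped, no sess/wrk'
].join("\n")

events = parse_stats(sample)
events.each { |event| puts event.inspect }
```

&lt;p&gt;In a real tool you’d feed it the output of the command itself and hand each hash to the Riemann client.&lt;/p&gt;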

&lt;p&gt;If you think the shell-out and parse is a little clumsy then we can also
write our own tool or use the Riemann client directly. Let’s embed
Riemann into a Sinatra application.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-ruby&quot; data-lang=&quot;ruby&quot;&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rubygems&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;sinatra&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;riemann/client&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;socket&amp;#39;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;configure&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:bind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;0.0.0.0&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;get&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;/&amp;#39;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;send_event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;s1&quot;&gt;&amp;#39;&amp;lt;h1&amp;gt;This does something awesome&amp;lt;/h1&amp;gt;&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;send_event&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Riemann&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;localhost&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5555&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Socket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gethostname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;service&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;something awesome&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;description&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;What an awesome number: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;metric&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Our Sinatra app is very basic. It responds on &lt;code&gt;/&lt;/code&gt; with the HTML:
&lt;code&gt;&amp;lt;h1&amp;gt;This does something awesome&amp;lt;/h1&amp;gt;&lt;/code&gt;. As part of handling that request it
also sends an event to Riemann using the Riemann client we installed
earlier.&lt;/p&gt;

&lt;p&gt;To do this we’ve required the &lt;code&gt;riemann/client&lt;/code&gt; and inside the
&lt;code&gt;send_event&lt;/code&gt; method we’ve connected to the Riemann host on &lt;code&gt;localhost&lt;/code&gt;.
This method then accepts a metric, which is a random number created by
the &lt;code&gt;rand&lt;/code&gt; method, from the &lt;code&gt;get&lt;/code&gt; block and sends that metric with an
event.&lt;/p&gt;

&lt;p&gt;Let’s run this app (you might need to &lt;code&gt;gem install sinatra&lt;/code&gt; first).&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;ruby riemann_sinatra.rb&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If we then look at our Riemann logs we’ll see an event much like this:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-clojure&quot; data-lang=&quot;clojure&quot;&gt;&lt;span class=&quot;ss&quot;&gt;:host&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;riemann.example.com&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:service&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;something&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;awesome&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:state&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:description&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;What&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;an&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;awesome&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.9984397664300542&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:metric&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.9984397664300542&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:tags&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:time&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1419449388&lt;/span&gt;, &lt;span class=&quot;ss&quot;&gt;:ttl&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;nil&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
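
&lt;p&gt;Notice the &lt;code&gt;:state&lt;/code&gt; and &lt;code&gt;:ttl&lt;/code&gt; fields in that event are &lt;code&gt;nil&lt;/code&gt; because our app never sets them. As a rough sketch, a small helper could fill them in before the event is sent. The &lt;code&gt;build_event&lt;/code&gt; name, the 0.9 threshold, the 10 second TTL and the &lt;code&gt;sinatra&lt;/code&gt; tag are all assumptions for illustration, not anything Riemann requires.&lt;/p&gt;

```ruby
require 'socket'

# Hypothetical helper: builds the hash we would hand to the Riemann client.
# The 0.9 threshold, 10 second TTL and 'sinatra' tag are illustrative only.
def build_event(metric, now: Time.now)
  {
    host: Socket.gethostname,
    service: 'something awesome',
    metric: metric,
    # Anything outside 0.0..0.9 is flagged; pick thresholds that suit you.
    state: metric.between?(0.0, 0.9) ? 'ok' : 'warning',
    ttl: 10,
    tags: ['sinatra'],
    description: "What an awesome number: #{metric}",
    time: now.to_i
  }
end

event = build_event(rand)
puts event.inspect
```

&lt;p&gt;The resulting hash could be passed to the client in place of the literal hash inside &lt;code&gt;send_event&lt;/code&gt;.&lt;/p&gt;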

&lt;h2 id=&quot;displaying-riemann-events&quot;&gt;Displaying Riemann events&lt;/h2&gt;

&lt;p&gt;Obviously reading events from the log output isn’t overly practical or
useful. To allow you to work with your events Riemann comes with a
dashboard. It’s a Sinatra application and we already installed it via
the &lt;code&gt;riemann-dash&lt;/code&gt; gem.&lt;/p&gt;

&lt;p&gt;Let’s start it now.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;riemann-dash&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can then view it on port &lt;code&gt;4567&lt;/code&gt; on &lt;code&gt;localhost&lt;/code&gt;. You can also
change the dashboard’s configuration by &lt;a href=&quot;https://github.com/aphyr/riemann-dash/blob/master/example/config.rb&quot;&gt;creating a &lt;code&gt;config.rb&lt;/code&gt;
file&lt;/a&gt;
in the directory from which you launch the dashboard. This provides
control over where and how the dashboard binds and some other
configuration options.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2014/12/riemann_dash.png&quot; alt=&quot;Riemann Dashboard&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The dashboard is a little janky in places but can produce some excellent
dashboards. The dashboard is made up of view panels that are
configurable. You can select or add a view using the boxes and plus
symbol in the top left of the dashboard.&lt;/p&gt;

&lt;p&gt;We just want to see the events coming into our dashboard though. So
let’s edit our current view to show those events. First, Ctrl-Click (or
Meta-Click on OS X) on the big Riemann title in the centre top of the
dashboard to select this view. This will highlight it gray (the Escape
key de-selects the view). Now type “e” to edit the view.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2014/12/riemann_dash2.png&quot; alt=&quot;Edit Riemann Dashboard&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Change the view from &lt;code&gt;Title&lt;/code&gt; to &lt;code&gt;Grid&lt;/code&gt; and then put &lt;code&gt;true&lt;/code&gt; into the
query box.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2014/12/riemann_dash3.png&quot; alt=&quot;Edit Riemann Dashboard&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This will change the view into a grid, which shows a table of events.
The &lt;code&gt;true&lt;/code&gt; in the query box selects all events. This is the simplest
query you can create but you can do much more. To get started you can
find some sample queries
&lt;a href=&quot;https://github.com/aphyr/riemann/blob/master/test/riemann/query_test.clj&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now you should see some of the events you’re generating displayed in a
per-host grid.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2014/12/riemann_dash4.png&quot; alt=&quot;Edit Riemann Dashboard&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you’re not taken with the Riemann dashboard there is a &lt;a href=&quot;https://github.com/exoscale/riemann-grid&quot;&gt;Grid layout&lt;/a&gt; alternative or, for
graphing, you could &lt;a href=&quot;http://riemann.io/howto.html#forward-to-graphite&quot;&gt;direct all your
metrics&lt;/a&gt; to
&lt;a href=&quot;http://graphite.wikidot.com/&quot;&gt;Graphite&lt;/a&gt; which has a fully featured
dashboard.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;We’ve barely scratched the surface of Riemann’s capabilities with this
introduction. From here we could configure a variety of streams,
matching events by service or host, and convert our events into
summaries, metrics and collections.&lt;sup id=&quot;fnref:4&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; We can take alerting actions
(email, PagerDuty) based on everything from failed services (replace
Nagios anyone?), to metric thresholds, or even Holt-Winters anomaly
detection. We can also send data onto longer-term storage or into other
tools like Graphite. The &lt;a href=&quot;http://riemann.io/howto.html&quot;&gt;Riemann HOWTO&lt;/a&gt;
has a number of examples and ideas to help you build your Riemann
environment further. I really recommend taking a look at Riemann if
you’re interested in where modern monitoring is headed.&lt;/p&gt;

&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;It also tends to reward conservatism and fear of change. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;This is a highly simplistic analysis of the potential for change in IT monitoring behaviour. Your mileage may vary. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot;&gt;
      &lt;p&gt;Kingsbury also published &lt;a href=&quot;https://aphyr.com/tags/Jepsen&quot;&gt;an excellent series on the CAP properties of a variety of distributed systems&lt;/a&gt;. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot;&gt;
      &lt;p&gt;Of course there’s even a &lt;a href=&quot;http://kartar.net/2012/05/sending-events-from-puppet-to-riemann/&quot;&gt;Puppet Riemann report processor&lt;/a&gt;. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

  &lt;p&gt;&lt;a href=&quot;http://kartar.net/2014/12/an-introduction-to-riemann/&quot;&gt;An Introduction to Riemann&lt;/a&gt; was originally published by James Turnbull at &lt;a href=&quot;http://kartar.net&quot;&gt;Kartar.Net&lt;/a&gt; on December 26, 2014.&lt;/p&gt;</content>
</entry>

</feed>
