New collectd codec (Logstash 1.4.1+) configuration

With the advent of Logstash 1.4.1, I wanted to make sure everyone knows about the new collectd codec.

In Logstash 1.3.x, we introduced the collectd input plugin. It was awesome! We could process metrics in Logstash, store them in Elasticsearch, and view them with Kibana. The only downside was that you could only get around 3,100 events per second through the plugin. With Logstash 1.4.0, we introduced a revamped UDP input plugin that is multi-threaded and has an internal queue. I refactored the collectd input plugin into a codec (with some help from my co-workers and the community) to take advantage of this huge performance increase. Now, with only 3 threads on my dual-core MacBook Air, I can get over 45,000 events per second through the collectd codec!

So, I wanted to provide some quick examples you could use to change your plugin configuration to use the codec instead.

The old way:

input {
  collectd {}
}

The new way:

input {
  udp {
    port => 25826         # Must be specified. 25826 is the default for collectd
    buffer_size => 1452   # Should be specified. 1452 is the default for recent versions of collectd
    codec => collectd { } # This will invoke the default options for the codec
    type => "collectd"
  }
}

This new configuration will use 2 threads and a queue size of 2000 by default for the UDP input plugin. With this you should easily be able to break 30,000 events per second!
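If you need more throughput than the defaults allow, the UDP input's thread count and queue size can be raised. The values below are hypothetical starting points, not tuned recommendations (the `workers` and `queue_size` option names come from the UDP input plugin):

```
input {
  udp {
    port        => 25826
    buffer_size => 1452
    workers     => 4      # default is 2
    queue_size  => 4000   # default is 2000
    codec       => collectd { }
    type        => "collectd"
  }
}
```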

I have provided a gist with some other configuration examples. For more information, please check out the Logstash documentation for the collectd codec.

Happy Logstashing!

18 thoughts on “New collectd codec (Logstash 1.4.1+) configuration”

  1. Paul says:

    When I try to use collectd as input (udp or the old way) and GELF as output, it seems that the message field is not filled, so GELF complains about short_message being empty (:message=>"Trouble sending GELF event", :gelf_event=>{"short_message"=>nil, "full_message"=>"%{message}"}). Would you have any hints for me?
    When I use stdout with codec => rubydebug, I see what would also go into Elasticsearch, whereas without rubydebug I see %{message}.

    Thanks

  2. Aaron says:

    This would be because events created by the collectd codec do not have a message field. If you wanted to send something in the message field, you’d have to use mutate to create it.
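    Such a mutate filter might look like the sketch below; the field names (host, plugin, type_instance, value) are those the collectd codec emits, and the format string is just an example:

    ```
    filter {
      if [type] == "collectd" {
        mutate {
          # Compose a human-readable message from the decoded collectd fields
          add_field => { "message" => "%{host}/%{plugin}/%{type_instance}: %{value}" }
        }
      }
    }
    ```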

  3. Athreya says:

    Hi Aaron,

    When I run Logstash with the config below,

    input {
      udp {
        port => 25826         # Must be specified. 25826 is the default for collectd
        buffer_size => 1452   # Should be specified. 1452 is the default for recent versions of collectd
        codec => collectd { } # This will invoke the default options for the codec
        type => "collectd"
      }
    }

    output {
      stdout { }
    }

    I am getting the following :
    2014-12-31T09:40:25.995+0000 %{message}

    How do I get the contents of the %{message} ?

    Please advise on how to mutate the field and give me an example.

    • Aaron says:

      You shouldn’t see %{message} as that implies a completely empty event. I’d be willing to bet that something on the collectd side either isn’t sending properly or is misconfigured somehow. You can verify this by using collectd-tg (collectd traffic generator) to send dummy messages to this host/port in Logstash (or choose another).
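      For reference, a collectd-tg run might look something like this (flag meanings per the collectd-tg man page; the destination and counts here are just examples):

      ```
      # Generate 1000 value lists from 10 simulated hosts and send them
      # to a collector listening on 127.0.0.1:25826
      collectd-tg -n 1000 -H 10 -d 127.0.0.1 -D 25826
      ```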

      • Athreya says:

        Hi Aaron,

        We have installed collectd and Logstash on the same CentOS 6 box, and both are being used in client mode.

        Find below the collectd config in client mode:

        Hostname "WhizSMS-Deployment-Server-Dev"
        BaseDir "/var/lib/collectd"
        TypesDB "/usr/share/collectd/types.db"
        Interval 1

        ReadThreads 5

        LoadPlugin logfile
        <Plugin logfile>
          LogLevel info
          File "/var/log/collectd-new.log"
          Timestamp true
          PrintSeverity false
        </Plugin>

        LoadPlugin cpu
        LoadPlugin interface
        LoadPlugin network

        <Plugin interface>
          Interface "eth0"
          IgnoreSelected false
        </Plugin>

        # client setup:
        <Plugin network>
          Interface "eth0"
        </Plugin>

        I am using tcpdump to get the packets transmitted to the port using the command:
        tcpdump -Av 'udp port 25826'

        I got the following output:
        WhizSMS-DeploymentServer-Dev.54107 > localhost.25826: UDP, length 1299
        [binary collectd payload follows; plugin and field names such as cpu, interface, eth0, if_octets, and if_packets are readable in the dump]

        Now, I tried to receive these metrics through the Logstash client using the following config:

        input {
          udp {
            port => 25826         # Must be specified. 25826 is the default for collectd
            buffer_size => 1452   # Should be specified. 1452 is the default for recent versions of collectd
            codec => collectd {}
            type => "collectd"
          }
        }
        output {
          stdout { }
        }

        But we are not getting any output on stdout when I run the Logstash command below:
        logstash -f -l

        Currently we are running a POC to decide whether the ELK stack will work for us in production. So far we absolutely love the ELK stack,
        but need it to work with collectd to capture other parameters.

        Need your help urgently to sort the issue we are facing.

      • Tim Ecklund says:

        Hi Athreya, I was having the same problem. It turns out I needed to add a codec to my output, like this:

        output { stdout { codec => json } }

        Once I did that the %{message} was replaced with the correct output.

  4. Aaron says:

    Since it’s clear that data is flowing, have you checked the possibility that Logstash is running into SELinux issues? tcpdump usually requires root access to the device, but Logstash, not running as root, may be hitting some security restriction.

  5. Marc says:

    Hi Aaron

    Just a side question, as I’m considering shoving my collectd metrics into Logstash/Elasticsearch: how are counter-typed metrics handled? Does the codec perform some roll-up to generate rates from counter increments, or are the raw counter values passed on to the next step of the Logstash pipeline?

    Best regards,

    m.

  6. melanie says:

    Hi,

    How do you manage tidying up Elasticsearch?
    I have a collectd index per day. I want to have 5-minute graphs, monthly graphs, and yearly graphs, but I don’t want to keep all of the 6-month-old data at the maximum resolution.

    • Aaron says:

      If you’re looking to average your results at different intervals, there are no quick ways to do so yet. Elastic is working on a plugin to be able to do these things on the fly, but it’s a ways out. In the meantime, you may be able to do this with some clever collectd and Logstash work, putting aggregated results into different indices. I will caution you about having too many indices open in an Elasticsearch cluster. There is a limit to how many you can keep open, per node, at any given time without causing issues.

      • melanie says:

        ok. thanks a lot !
        I think I will keep collectd with graphite and grafana for the moment for metrics, and ELK for logs.

  7. Nacho Pérez says:

    Hi,
    Is it possible to use logstash-forwarder to send network metrics to a centralized Logstash server using the collectd codec?

    Or

    How can I send network metrics from multiple servers to the Logstash server?

    Thanks
    Nacho

    • Aaron says:

      Unfortunately, no. Logstash-Forwarder is for file-tailing only. It cannot do anything else. Collectd itself can send to a centralized Logstash instance, however, without the need for any intermediary. It can also send securely, if that’s desired.
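      On the collectd side, that client configuration is handled by the network plugin. A minimal sketch (the hostname here is hypothetical; the encrypted variant requires collectd built with libgcrypt):

      ```
      LoadPlugin network
      <Plugin network>
        # Plain UDP to the Logstash host
        Server "logstash.example.com" "25826"

        # Or, for signed/encrypted transport:
        # <Server "logstash.example.com" "25826">
        #   SecurityLevel Encrypt
        #   Username "collectd"
        #   Password "secret"
        # </Server>
      </Plugin>
      ```

      The Logstash collectd codec has matching security_level and authfile options for verifying such packets.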

  8. Peter Pham says:

    It seems to me that the collectd codec cannot reassemble fragmented segments. Instead, it fails when it detects that it has not received the full segment of collectd data. Has anyone seen this before?
