Using elasticsearch mappings appropriately to map as type IP, int, float, etc.

Update 2015-08-31: My most recent template/mapping can be found here.
Update 2012-11-05: My most recent template/mapping can be found here.

I am updating previous templates in blogs accordingly, just FYI.

Logstash allows you to tag certain fields as types within elasticsearch. This is useful for performing statistical analysis on numbers, such as byte counts or the duration of a transaction in milli- or microseconds. In grok, this is as simple as appending :int or :float to an expression, e.g. %{POSINT:bytes:int}. This results in the correct mapping when the event is sent to elasticsearch. However, since we’re trying to avoid using grok and are sending values as pre-formatted JSON, values sometimes end up improperly tagged.

Jordan instructed me not to encapsulate values in double quotes if the value is a number. That way, the value is automatically sent as type long (long integer). However, elasticsearch also allows us to store IP addresses as type ip. This is crucial for range-based queries across IP blocks/subnets, e.g. clientip:[172.16.0.0 TO 172.23.255.255].
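Concretely, the quoting decides what elasticsearch will infer. A quick sketch with a made-up event (the field names are just examples, and python here only stands in to show the types the JSON actually carries):

```shell
# Hypothetical event: unquoted numbers parse as numbers, quoted ones as
# strings -- elasticsearch infers its mapping from exactly this distinction.
echo '{"@fields": {"bytes": 1024, "duration": 0.25, "port": "443"}}' |
  python3 -c '
import json, sys
for name, value in json.load(sys.stdin)["@fields"].items():
    print(name, type(value).__name__)
'
# prints:
#   bytes int
#   duration float
#   port str
```

Because "443" is quoted, it would be mapped as a string even though it looks numeric.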

In the past, I tried appending :ip, just like :int or :float. I thought it was working, because I was able to do a range query. But then it became clear that it was limited to a single dotted-quad difference, such as 192.168.0.1 TO 192.168.0.255; it would not work with a larger subnet. To discover whether this is configured correctly, pull the _mapping from your index:
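The limitation makes sense once you recall that a string-typed field compares lexicographically, while type ip effectively compares the address as a single number. A shell sketch (ip_to_long is a hypothetical helper for illustration, not anything elasticsearch exposes):

```shell
# Lexicographic order (what type "string" gives you): "192.168.10.5" sorts
# BEFORE "192.168.2.1", so subnet-wide range queries misfire.
printf '192.168.10.5\n192.168.2.1\n' | LC_ALL=C sort | head -n1
# -> 192.168.10.5

# Numeric order (what type "ip" effectively gives you): reduce the dotted
# quad to one integer, and ranges behave as expected.
ip_to_long() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)
ip_to_long 172.16.0.0      # -> 2886729728
ip_to_long 172.23.255.255  # -> 2887254015
```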

curl -XGET 'http://localhost:9200/logstash-2012.10.12/_mapping?pretty=true'
(truncated)
           "clientip" : {
              "type" : "string"
            },

In this case, type “string” is not desired. We want to see type: “ip”. It turns out my mapping was misconfigured. The correct way to do this is as follows (see the mappings section in particular):

curl -XPUT http://localhost:9200/_template/logstash_per_index -d '{
    "template" : "logstash*",
    "settings" : {
        "number_of_shards" : 4,
        "index.cache.field.type" : "soft",
        "index.refresh_interval" : "5s",
        "index.store.compress.stored" : true,
        "index.query.default_field" : "@message",
        "index.routing.allocation.total_shards_per_node" : 2
    },
    "mappings" : {
        "_default_" : {
           "_all" : {"enabled" : false},
           "properties" : {
              "@fields" : {
                   "type" : "object",
                   "dynamic": true,
                   "path": "full",
                   "properties" : {
                       "clientip" : { "type": "ip"}
                   }
              },
              "@message": { "type": "string", "index": "analyzed" },
              "@source": { "type": "string", "index": "not_analyzed" },
              "@source_host": { "type": "string", "index": "not_analyzed" },
              "@source_path": { "type": "string", "index": "not_analyzed" },
              "@tags": { "type": "string", "index": "not_analyzed" },
              "@timestamp": { "type": "date", "index": "not_analyzed" },
               "@type": { "type": "string", "index": "not_analyzed" }    
           }   
        }
   }
}
'
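A template this size is easy to break with a stray comma, and a malformed body can fail in confusing ways. One way to guard against that (a sketch; the file path is arbitrary, and the body below is a cut-down copy of the template just to keep the example short):

```shell
# Hypothetical pre-flight check: write the template body to a file, confirm
# it parses as JSON, and only then send it with curl -XPUT -d @file.
cat > /tmp/logstash_per_index.json <<'EOF'
{
    "template" : "logstash*",
    "mappings" : {
        "_default_" : {
            "properties" : {
                "@fields" : {
                    "properties" : {
                        "clientip" : { "type" : "ip" }
                    }
                }
            }
        }
    }
}
EOF
python3 -m json.tool < /tmp/logstash_per_index.json > /dev/null && echo 'template JSON OK'
```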

After applying this template, now I have type “ip” showing up:

curl -XGET 'http://localhost:9200/logstash-2012.10.12/_mapping?pretty=true'
(truncated)
           "clientip" : {
              "type" : "ip"
            },

The same logic is applicable to all other fields in the object @fields (logstash’s default object for everything not prepended with an @ sign). Try it out! Enjoy! Keep in mind that this will not change existing data, but will work on new indexes created after replacing your template.

3 thoughts on “Using elasticsearch mappings appropriately to map as type IP, int, float, etc.”

  1. Ulli says:

    Hello Aaron,

    Thank you very much. This helped me to get my type mapping working, at least with the put api. Did you try this as well as a permanent configuration by using a template file? I’m just despairing of getting elasticsearch to use my template file …

    Thanks in advance
    Ulli

    • Aaron says:

      Hi Ulli,

      Yes, you can create a template file. In reality, though, my file is a shell script that wipes out the former template and applies a new one with the same name. I find it’s easier to do that with something large and complex like this JSON.
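      Something along these lines, perhaps — a sketch of the wipe-and-reapply approach. The template name matches the post; the file path and the fallback echoes are assumptions:

      ```shell
      # Hypothetical wrapper: delete the old template, then re-apply the new
      # body (saved to a file) under the same name. Assumes elasticsearch on
      # localhost:9200; the || branches just report when it is not reachable.
      TEMPLATE=logstash_per_index
      FILE=logstash_per_index.json   # the JSON body from the post, saved to disk
      curl -s -XDELETE "http://localhost:9200/_template/$TEMPLATE" ||
        echo "delete failed (is elasticsearch running?)"
      curl -s -XPUT "http://localhost:9200/_template/$TEMPLATE" -d @"$FILE" ||
        echo "put failed (is elasticsearch running?)"
      ```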
