I figured it was time to share my current template again, as much has changed since Logstash 1.2. The changes include:
- doc_values everywhere applicable
- Defaults for all numeric types, using doc_values
- Proper mapping for the raw sub-field
- Leaving the message field analyzed, and with no raw sub-field
- Added ip, latitude, and longitude fields to the geoip mapping, using doc_values
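As a rough illustration of the "everywhere applicable" rule, here is a minimal Python sketch. It assumes Elasticsearch 1.x behaviour, where an analyzed string cannot use doc_values but a not_analyzed one can. The helper name `doc_values_permitted` and the `DOC_VALUES_TYPES` set are hypothetical and cover only the types used in this template.

```python
# Types assumed to accept doc_values in Elasticsearch 1.x.
# This set is an illustration limited to the types used in the template.
DOC_VALUES_TYPES = {
    "float", "double", "byte", "short", "integer", "long",
    "date", "ip", "geo_point",
}


def doc_values_permitted(mapping):
    """Return True if this field mapping may set "doc_values": true."""
    field_type = mapping.get("type")
    if field_type == "string":
        # Analyzed strings cannot use doc_values; only not_analyzed ones can.
        return mapping.get("index") == "not_analyzed"
    return field_type in DOC_VALUES_TYPES


# The string_fields mapping from the template: analyzed parent, not_analyzed raw.
string_fields = {
    "type": "string",
    "index": "analyzed",
    "omit_norms": True,
    "fields": {
        "raw": {
            "type": "string",
            "index": "not_analyzed",
            "doc_values": True,
            "ignore_above": 256,
        }
    },
}

parent_ok = doc_values_permitted(string_fields)
raw_ok = doc_values_permitted(string_fields["fields"]["raw"])
print(parent_ok, raw_ok)
```

This is why the string mapping carries doc_values only on the raw sub-field, and why the message field, which stays analyzed and has no raw sub-field, has none at all.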
If you couldn’t tell, I’m crazy about doc_values. Using doc_values (where permitted) keeps your Elasticsearch Java heap from growing out of control when performing large aggregations (for example, a month’s worth of data in Kibana), because the field values are written to disk at index time instead of being loaded into the heap as fielddata. The upfront cost is very little additional storage.
This is mostly generic, but it does have a few things which are specific to my use case (like the Nginx entry). Feel free to adapt to your needs.
{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : true, "omit_norms" : true},
      "dynamic_templates" : [ {
        "message_field" : {
          "match" : "message",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true
          }
        }
      }, {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : {"type": "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256}
            }
          }
        }
      }, {
        "float_fields" : {
          "match" : "*",
          "match_mapping_type" : "float",
          "mapping" : { "type" : "float", "doc_values" : true }
        }
      }, {
        "double_fields" : {
          "match" : "*",
          "match_mapping_type" : "double",
          "mapping" : { "type" : "double", "doc_values" : true }
        }
      }, {
        "byte_fields" : {
          "match" : "*",
          "match_mapping_type" : "byte",
          "mapping" : { "type" : "byte", "doc_values" : true }
        }
      }, {
        "short_fields" : {
          "match" : "*",
          "match_mapping_type" : "short",
          "mapping" : { "type" : "short", "doc_values" : true }
        }
      }, {
        "integer_fields" : {
          "match" : "*",
          "match_mapping_type" : "integer",
          "mapping" : { "type" : "integer", "doc_values" : true }
        }
      }, {
        "long_fields" : {
          "match" : "*",
          "match_mapping_type" : "long",
          "mapping" : { "type" : "long", "doc_values" : true }
        }
      }, {
        "date_fields" : {
          "match" : "*",
          "match_mapping_type" : "date",
          "mapping" : { "type" : "date", "doc_values" : true }
        }
      } ],
      "properties" : {
        "@timestamp": { "type": "date", "doc_values" : true },
        "@version": { "type": "string", "index": "not_analyzed", "doc_values" : true },
        "clientip": { "type": "ip", "doc_values" : true },
        "geoip" : {
          "type" : "object",
          "dynamic": true,
          "properties" : {
            "ip": { "type": "ip", "doc_values" : true },
            "location" : { "type" : "geo_point", "doc_values" : true },
            "latitude" : { "type" : "float", "doc_values" : true },
            "longitude" : { "type" : "float", "doc_values" : true }
          }
        }
      }
    },
    "nginx_json" : {
      "properties" : {
        "duration" : { "type" : "float", "doc_values" : true },
        "status" : { "type" : "short", "doc_values" : true }
      }
    }
  }
}
You can also find this in a GitHub gist.
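The post does not say how the template gets installed, so the following is only a sketch of one possible route: sending it to Elasticsearch's index template endpoint (`PUT /_template/<name>`) from Python. The template name `logstash`, the host `http://localhost:9200`, and the helper `build_template_request` are assumptions for illustration, and the `template` dict is a shortened stand-in for the full JSON above. The request is built but not sent.

```python
import json
import urllib.request


def build_template_request(template, name="logstash", host="http://localhost:9200"):
    """Build (but do not send) a PUT request for the index template endpoint."""
    body = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url="{}/_template/{}".format(host, name),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Shortened stand-in; in practice, load the full template with json.load().
template = {
    "template": "logstash-*",
    "settings": {"index.refresh_interval": "5s"},
}

request = build_template_request(template)
print(request.get_method(), request.full_url)

# To actually install the template against a running cluster:
# urllib.request.urlopen(request)
```

A template only applies to indices created after it is installed, so existing logstash-* indices keep their old mappings until the next index is created.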
Feel free to add any suggestions, or adaptations you may have used, in the comments below!
Would be cool to have a “logstash config builder” type site where you can tick boxes to say what you want it to do or what use cases to cover and it outputs a working logstash config. This might already exist, I haven’t even looked.