Implementing Splunk 7 (Third Edition)

Indexed field case 2 - splitting words

In some log formats, multiple pieces of information may be encoded into a single word without whitespace or punctuation to separate the useful pieces of information. For instance, consider a log message such as this:

4/2/12 6:35:50.000 PM kernel: abc5s2: 0xc014 (UNDEFINED). 

Let's pretend that 5s2 (a made-up string of characters for an example) is an important piece of information that we need to be able to search for efficiently. The query *5s2 would find the events but would be a very inefficient search (in essence, a full table scan), because the leading wildcard prevents a direct lookup of the word. By defining an indexed field, you can search for this instance of the string 5s2 very efficiently because, in essence, you create a new word in the metadata of this event.
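To make the difference concrete, here is a minimal Python sketch of a toy word index. It is only an illustration of the idea, not how Splunk stores or searches its index: the field name slot, the regular expression, the second sample event, and the helper functions are all assumptions invented for this example.

```python
import re
from collections import defaultdict

# The first event is the sample from the text; the second is made up for contrast.
EVENTS = [
    "4/2/12 6:35:50.000 PM kernel: abc5s2: 0xc014 (UNDEFINED).",
    "4/2/12 6:35:51.000 PM kernel: abc7s1: 0xc014 (UNDEFINED).",
]

# Hypothetical pattern: lowercase letters followed by the digits-s-digits
# portion that we want to expose as its own searchable word.
SLOT_RE = re.compile(r"\b[a-z]+(\d+s\d+)\b")


def build_index(events):
    """Map each word (and each slot::value term) to the events containing it."""
    index = defaultdict(set)
    for event_id, raw in enumerate(events):
        # Ordinary words: split on anything that is not a letter or digit.
        for word in re.findall(r"[A-Za-z0-9]+", raw):
            index[word.lower()].add(event_id)
        # The "indexed field": one extra term added at index time.
        match = SLOT_RE.search(raw)
        if match:
            index["slot::" + match.group(1)].add(event_id)
    return index


def search_exact(index, term):
    """A single dictionary lookup; returns (matching events, terms examined)."""
    return set(index.get(term, set())), 1


def search_leading_wildcard(index, suffix):
    """Emulates *suffix: every term in the index has to be examined."""
    hits = set()
    examined = 0
    for term, event_ids in index.items():
        examined += 1
        if term.endswith(suffix):
            hits |= event_ids
    return hits, examined


if __name__ == "__main__":
    index = build_index(EVENTS)
    print(search_exact(index, "slot::5s2"))
    print(search_leading_wildcard(index, "5s2"))
```

Both searches return the same event, but the exact term is found with one lookup, while the wildcard search has to walk every term in the toy index. On a real index with millions of distinct words, that gap is what makes the extra index-time term worthwhile.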

Defining an indexed field only makes sense if you know the format of the logs before indexing, if you believe the field will actually make the query more efficient (see the previous section), and if you will be searching for the field value. If you will only be reporting on the values of this field, an extracted field will be sufficient, except in the most extreme performance cases.