
Which Type Of Splunk Query Searches Through Unstructured Log Records

You have created an architecture that will make it very difficult to code an efficient search. In addition, including a wildcard in the csv means you need to strip the wildcard out before line 8 adds the concatenated %, or your results will be in error. Also, I'm not sure how you got that to work at all with a hyphen in the middle of the field names; I would expect Splunk to have replaced that with an underscore.

Here are some suggestions.

First, limit the initial selection by sourcetype. This should reduce the number of records involved. Something like this should speed the query up a bit…

index="application_error" [| inputlookup my_lookup_table.csv | rename Field_SourceType as sourcetype, Field_Substring as killname | format | rex field=search mode=sed "s/killname=//g" ]

If your sourcetypes have multiple extensions on them, then you will want to concatenate a "*" onto them in the subsearch above, after the rename and before the format.
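As a rough sketch, using the same stand-in index, lookup, and field names as above, that would be:

index="application_error" [| inputlookup my_lookup_table.csv | rename Field_SourceType as sourcetype, Field_Substring as killname | eval sourcetype=sourcetype."*" | format | rex field=search mode=sed "s/killname=//g" ]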

Next, kill all unneeded fields, thereby limiting the number of bytes involved. We're saving the field _time just in case you want a timechart or something, and sourcetype so we can try something with it later.

Come back and strip those out if you don’t need them; a byte’s a byte.


| rename _time as Time | eval Raw = lower(_raw) | fields - _* | fields Time Raw sourcetype

Run that into your current search and see what happens to your run time.
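For reference, with the two pieces above chained together (the index and lookup names are still just the stand-ins from earlier), the front of the search would look roughly like this:

index="application_error" [| inputlookup my_lookup_table.csv | rename Field_SourceType as sourcetype, Field_Substring as killname | format | rex field=search mode=sed "s/killname=//g" ] | rename _time as Time | eval Raw = lower(_raw) | fields - _* | fields Time Raw sourcetype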

One other potential architecture for this next section is to do a join on sourcetype. This could be much more efficient as long as the lookup table has less than a few thousand lines and as long as the sourcetype does NOT have multiple potential extensions on it.

Adapt the method as needed. For example, you could create a synthetic match field of a certain length, and then do a full test of sourcetype for each of the remaining potential matches (a rough sketch of that appears after the search below).

| join max=0 sourcetype [| inputlookup my_lookup_table.csv | rename Field_SourceType as sourcetype, Field_Substring as teststring | table sourcetype teststring | eval teststring=lower(teststring) ] | rename COMMENT as "change asterisks to % and add % at front and end of teststring, then test and stats it up" | rex mode=sed field=teststring "s/[*]/%/g s/^(.)/%\1/g s/(.)$/\1%/g" | where like(Raw,teststring) | stats count by sourcetype, teststring
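If you go the synthetic-match-field route instead, a rough sketch might look like the following; the eight-character prefix and the matchkey and lookup_sourcetype names are placeholders of my own, and it assumes any wildcard in the lookup's sourcetype value falls after that prefix. The teststring handling and stats would then continue as in the search above.

| eval matchkey=substr(sourcetype,1,8) | join max=0 matchkey [| inputlookup my_lookup_table.csv | rename Field_SourceType as lookup_sourcetype, Field_Substring as teststring | eval matchkey=substr(lookup_sourcetype,1,8), teststring=lower(teststring) | table matchkey lookup_sourcetype teststring ] | where like(sourcetype, replace(lookup_sourcetype, "[*]", "%"))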

By the way, sorry we missed your earlier question.
