There are countless blogs, articles, and Splunk ‘answers’ regarding the optimization of Splunk queries (and here’s another one on SPL optimization).
In this article, I’d like to share a few consistent tips that I’ve learned to improve the performance of queries. The following tips are listed in the order that they are used within the search.
1. Minimize the number of trips to the indexers
Namely, avoid the subsearches that come with ‘join’ and ‘append.’ While the ‘join’ and ‘append’ commands are widely used and familiar to most of us, they are not necessarily the most efficient commands. Why is this the case? A few problems include:
- Both commands rely on a subsearch (the part between the square brackets). Each subsearch is one more search that has to be run against the indexers, with all of the communication and overhead that goes with it.
- Subsearches have limits. By default, the subsearches used by these commands time out after 60 seconds and return at most 50,000 results (see subsearch_maxtime and subsearch_maxout in limits.conf). Hitting either limit truncates the results, which leads to incorrect answers. This can easily go unnoticed, so pay attention to the warning messages that are returned with the use of these commands.
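If you want to confirm which limits apply in your own environment, one possible sketch is to read the effective limits.conf values through the rest command. This assumes your role is allowed to query the configuration REST endpoint, and that the settings live in the [join] and [subsearch] stanzas as they do in a default install; treat it as a starting point rather than a definitive check.

```spl
| rest /services/configs/conf-limits splunk_server=local
| search title=join OR title=subsearch
| fields title, subsearch_maxout, subsearch_maxtime, maxout, maxtime
```

Each row corresponds to one stanza; columns that do not apply to a stanza are simply empty.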
What’s the solution? The above problems can be mitigated by combining your subsearch with your primary search and accomplishing the ‘join’ with the use of a stats command. An example of this is shown below.
Using join (before)
index=_internal sourcetype=splunkd component=Metrics
| stats count as metric_count by host
| join type=left host
[search index=_audit sourcetype=audittrail
| stats count as audit_count by host]
| table host metric_count audit_count
Using stats (after)
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| stats count(eval(sourcetype="splunkd")) as metric_count count(eval(sourcetype="audittrail")) as audit_count by host
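The same technique can replace the ‘append’ command. One difference to keep in mind: the stats version returns every host found in either data set (with a count of 0 where a host has no matching events), whereas the left join returns only hosts present in the primary search. Wherever possible, I use the stats command in place of the following commands: dedup, table, join, append.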
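As a sketch of the same idea applied to ‘append’: suppose the original search counted Metrics events by host and then appended a second, bracketed subsearch that counted audit events by host. Assuming the same two data sets used in the join example, a single pass with stats can produce the equivalent stacked rows:

```spl
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| stats count by sourcetype, host
```

Because there is no subsearch, the subsearch timeout and result limits no longer come into play.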
2. SPL optimization: minimize the amount of data coming back from the indexers
Another item that is also mentioned in many articles is the goal to filter your data early in order to help lower the number of events returned. While this cuts down on the number of events (vertical), there can also be substantial benefits to limiting the number of fields that are retrieved (horizontal).
By using the ‘fields’ streaming command early within your SPL, you lower not only the sheer amount of data that is pulled from the indexers, but also the amount that has to be transferred to, and processed by, the search head.
Where possible, I’ve made it a habit to use the fields command right after the first pipe of my SPL.
| fields <field list>
| fields - _raw
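Put together with a base search, the pattern might look like the sketch below. The field names (host, series, kb) are illustrative assumptions based on the throughput metrics that splunkd typically writes to metrics.log; substitute whichever fields your later commands actually need.

```spl
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields host, series, kb
| fields - _raw
| stats sum(kb) as total_kb by host, series
```

The second fields command is there because internal fields such as _raw are kept by default unless they are removed explicitly.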
A sample job that I created showed the following improvements when simply limiting the number of fields early within the query:
|Query||# of Fields||Disk Usage||Events||Time Spent|
|Query without use of fields||155||8458240||498478||166s|
|Query with use of fields||18||5681152||498478||103s|
3. SPL optimization: perform calculations on the smallest amount of data
Try to hold off on calculations that use commands such as eval, lookup, and foreach until you have a succinct data set that has been culled by the steps above. Combine commands where possible. For example, I combine all eval statements into one comma-delimited eval statement.
Separate eval statements (before)
| eval var1="value1"
| eval var2="value2"
| eval var3="value3"
Combined eval statement (after)
| eval var1="value1", var2="value2", var3="value3"
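A slightly more realistic sketch of the combined form is shown below. The expressions in a comma-delimited eval are processed left to right, so a later expression can reference a field created earlier in the same statement. The field names (kb, total_kb, total_mb, total_gb) are illustrative assumptions, and the calculation runs only after stats has reduced the data to one row per host.

```spl
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields host, kb
| stats sum(kb) as total_kb by host
| eval total_mb=round(total_kb/1024, 2), total_gb=round(total_mb/1024, 2)
```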
4. SPL optimization: use non-streaming commands as late in the query as possible
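Save non-streaming, transforming commands for last. These are the commands that actually produce the answers you’re looking for, such as stats, chart, and timechart. Once a transforming command runs, the rest of the pipeline is processed on the search head rather than in parallel on the indexers, so everything that can be done beforehand should be.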
In summary, the basic structure of my queries follows a format similar to the one below:
|Base query||Base query|
|Minimize data||| fields <list of fields>|
|Combine/Summarize data||| use of stats for join/append/summarizations|
|Execute calculations||| eval, lookup, etc|
|Format the data||| stats, chart, timechart, etc.|
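As a rough illustration of that structure, here is one way the stages could line up in a single search. It reuses the two data sets from the join example; the kb field and the derived field names are assumptions made for the sake of the sketch, not part of the original example.

```spl
(index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput) OR (index=_audit sourcetype=audittrail)
| fields host, sourcetype, kb
| fields - _raw
| stats sum(kb) as total_kb count(eval(sourcetype="audittrail")) as audit_count by host
| eval total_mb=round(total_kb/1024, 2)
| sort - total_mb
```

Reading top to bottom: the base query, the fields commands that minimize the data, the stats command that combines and summarizes both data sets, the eval calculation on the reduced result, and a final formatting step.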
As we all know, every query and requirement is different, and the thoughts above aren’t strict rules, but rather guidelines I’ve found helpful. Below are a couple of links that should help you along the way.