
How to Make the Most of Splunk Lookups

Over the past few years, I have worked with various customers on different Splunk use cases, and one thing I have noticed is that most customers are not taking full advantage of Splunk lookups.

Splunk Lookups are a powerful way to enrich your data and enhance your search experience. Lookups enable you to add context and do more creative correlations with your machine data.

Today, I am going to cover a few lookup use cases and some tips around lookups (plus a little bit about KV Store) to help you make the most of this Splunk feature. External Lookups, Time-based Lookups, and Geospatial Lookups will be covered in a future article.

I will start with some lookup terminology and then cover a few use cases followed by some tips for using lookups effectively.

Splunk Lookup Terminology

Lookup Definition – The lookup definition provides the name of the lookup and path to its file.

Lookup Tables – Lookup tables are CSV files used to add details/fields to a Splunk event based on matching a field between a CSV file and a Splunk event.

External Lookup – Also referred to as a Scripted Lookup, this type of lookup uses Python code or an executable to populate a Splunk event with additional details from the external world.

Static Lookup – Static lookups are CSV files created and uploaded manually by the user. As the name suggests, the CSV file does not change over time.

Dynamic Lookup – Dynamic Lookups refer to a CSV file that is periodically updated through a saved search or script(s).

Splunk Lookup Use Cases

The Splunk documentation covers lookups in detail, and lookups in general are a very broad topic. I am going to cover the use cases here that provide the most value from the Splunk lookup feature. Let’s start with a relatively simple one.

Use Case 1: You have events indexed in Splunk but these events don’t contain all the information you want to record and you want to add some context to the field values. This use case leads us to a CSV file-based Static Lookup.

Let’s say you have events with an HTTP Response Code but you want to add a description for each of the error codes to make it understandable for the user. HTTP Response Code 200 corresponds to “Success” and HTTP Code 404 corresponds to “Page Not Found.” This is the simplest example as most users probably know the meaning of most HTTP Response Codes.

However, understanding field values can be tricky when you are working with proprietary technologies and the various codes associated with them. For any coded, numeric, or otherwise not user-friendly field whose values can easily be mapped to something meaningful for users, this is the solution for you. You can follow the Splunk docs to configure a CSV file-based Static Lookup either through the web manager or through a configuration file.
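
As an illustration, a minimal lookup CSV for this scenario might look like the following (the file name http_status.csv, the lookup definition name http_status, and the sourcetype web_access are hypothetical placeholders):

code,description
200,Success
404,Page Not Found
500,Internal Server Error

With a lookup definition named http_status pointing to that file, the description field can be pulled into a search like this:

sourcetype = web_access | lookup http_status code output description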

Use Case 2: You want to create a lookup; however, the data you want to put in the lookup is dynamic in nature and arrives as Splunk events. This use case leads us to the saved search-based Dynamic Lookup.

The best approach to handle this use case is to create and schedule a saved search that puts the result in table format, and then use the outputlookup command to periodically populate your lookup with the result of that saved search.

For example, you may be collecting data from multiple storage arrays and want to maintain an up-to-date table of each storage array and its disks. The saved search for this use case looks like this:

sourcetype = storage:array | dedup disk_name, disk_id, array_id, array_name | table array_name,array_id,disk_id,disk_name | outputlookup array_disk_mapping

Suppose that later we want to display a timechart of disk latency by disk name on a dashboard panel. However, we first want to apply a filter using the array name, which is not available in the raw event for disk latency. You can pull the array name by running a lookup command against the array_disk_mapping lookup we created previously. Your search would look like this:

sourcetype = storage:disk | lookup array_disk_mapping disk_name output array_name | search array_name = "Array 1" | timechart max(latency) by disk_name

Use Case 3: You want to show data from 2 different sources in the same table by using the “join” command in your search and integrating both data sources with a common field. However, due to the high number of events, your search is at risk of taking too long to execute because of the performance trade-off introduced by the “join” command.

You can avoid this situation by replacing “join” with a combination of a saved search-based Dynamic Lookup and the lookup command in your search query. This solution goes back to the one described in Use Case 2 above. By keeping “join”-based searches running in the background on an interval as saved searches, and using the lookup command in dashboard panel searches to get the matching field values, you can ensure that user experience and dashboard performance are not compromised.
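
As a rough sketch of this swap, reusing the storage example from Use Case 2, a join-based dashboard search such as this one:

sourcetype = storage:disk | join type=left disk_name [ search sourcetype = storage:array | fields disk_name, array_name ] | timechart max(latency) by array_name

can be replaced with the scheduled outputlookup search shown earlier plus a lightweight lookup in the panel search:

sourcetype = storage:disk | lookup array_disk_mapping disk_name output array_name | timechart max(latency) by array_name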

Use Case 4: You have a lookup in place and frequently need to use it in your searches. This is the perfect use case for an Automatic Lookup.

In this case, the best approach for utilizing a lookup is to use Splunk’s Automatic Lookup feature. Rather than fetching a field using the lookup command in your searches, you can define an Automatic Lookup on a source, sourcetype, and/or host. The lookup fields will then automatically be available to you at search time.
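
For reference, an automatic lookup is defined in props.conf against a sourcetype (or source/host) stanza, or through Settings > Lookups > Automatic lookups in Splunk Web. A minimal sketch, borrowing the error_description lookup and web_errors sourcetype used in the tips below:

[web_errors]
LOOKUP-error_description = error_description code OUTPUT Description

With this in place, the Description field is available in any search against the web_errors sourcetype without an explicit lookup command.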

7 Tips for Using Splunk Lookups

Now that we have covered lookup use cases and the associated terminology, let’s look at some tips around Splunk lookup usage that will save you time and effort as you use lookups more heavily. For many Splunkers, using lookups can be relatively easy and painless, but these tips help address some of the common “gotchas.”

1. Use a lookup after a transforming command to make your search execute more quickly.
Where you run a lookup in the search pipeline matters. By running a lookup before the transforming command (e.g. “stats”) you are asking the lookup to execute against every single raw event.

By running a lookup AFTER the transforming command, the lookup only has to execute against the (likely) much reduced transformed version of the data.

For example, use this search query:

sourcetype = web_errors | stats count by code | lookup error_description code output Description

Instead of this search query:

sourcetype = web_errors | lookup error_description code output Description | stats count by code

2. Don’t forget to use append=t with the outputlookup command when you want to add to your lookup file over time.
By default, the outputlookup command completely rewrites the CSV file, so you will lose all previous content from the lookup each time you repopulate it.

To prevent this from happening, you can append the result of your query to the existing lookup by using the “append=t” parameter with the outputlookup command. Be careful when using the append feature – you need to be sure the final result of your query does not include duplicate records before dumping the result to the lookup.

Be very careful with this approach. Over time, you may have a large number of stale entries in your lookup which are no longer needed. It’s difficult to identify and remove these entries manually.
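
Here is a hedged sketch of both options, reusing the array_disk_mapping lookup from Use Case 2 (the choice of disk_id as the dedup key is an assumption). The first search simply appends new rows; the second reads the existing lookup back in, appends the latest results, removes duplicates, and rewrites the whole table, which avoids piling up duplicate or stale rows:

sourcetype = storage:array | table array_name,array_id,disk_id,disk_name | outputlookup append=t array_disk_mapping

| inputlookup array_disk_mapping | append [ search sourcetype = storage:array | table array_name,array_id,disk_id,disk_name ] | dedup disk_id | outputlookup array_disk_mapping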

3. Use the lookup definition name instead of the lookup CSV file name in search queries.
You should use the lookup definition name from transforms.conf in your search queries instead of using the CSV file name directly. This ensures that a future change to the lookup’s CSV file name will not break your lookup, inputlookup, or outputlookup commands.

If your transforms.conf looks like this,

[name_id_mapping]
filename = name_id.csv

Use “outputlookup name_id_mapping” rather than “outputlookup name_id.csv” in your search query.

4. Keep an eye on aggregate lookup file size in your environment.
If you think your lookup file is going to be larger than 100 MB at any future point in time, you should turn to KV Store, described at the end of this post. If KV Store is not an option, consider any of the following routes to avoid breaking the searches and dashboards that use lookup tables.

  • Try to increase the size of the knowledge object bundle that the search head distributes to the indexers. You can look at the Splunk docs to see how to increase the maxBundleSize setting in the distsearch.conf file.
  • You may need to blacklist your large lookup files from the knowledge object bundle that goes to the indexers. You can look at the Splunk docs to see how to set the replicationBlacklist parameter in the distsearch.conf file (a sketch of both settings follows this list).
  • Use the “local=t” parameter with your lookup command to force the lookup to run locally on the search head rather than being distributed to the indexers.
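
A minimal distsearch.conf sketch on the search head covering both settings (the 4096 MB bundle size and the big_lookup.csv path are illustrative assumptions; check the distsearch.conf spec for your Splunk version before applying them):

[replicationSettings]
maxBundleSize = 4096

[replicationBlacklist]
big_lookup = apps/search/lookups/big_lookup.csv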

5. Understand lookup processing order with other knowledge objects
Lookups are processed after calculated fields but before event types, so make sure you are not doing lookups based on event types and/or tags. To learn more about the processing order for knowledge objects at search time, refer to the Splunk docs.

6. Understand lookup processing order with multiple Automatic Lookups in place
Splunk processes automatic lookups in ASCII order of their names. If you want Splunk to apply the lookups in a particular sequence, name your automatic lookup definitions accordingly (e.g. 1_lookup1, 2_lookup2, 3_lookup3) so that the ASCII order matches the order you want.
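
A hedged props.conf sketch (the lookup names and fields here are hypothetical): because of ASCII ordering, 1_user_mapping runs before 2_department_mapping, so the second lookup can safely use the user_name field produced by the first:

[web_errors]
LOOKUP-1_user_mapping = user_mapping user_id OUTPUT user_name
LOOKUP-2_department_mapping = department_mapping user_name OUTPUT department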

7. Preserve Dynamic Lookups in case of an empty search result
If you have a saved search-based Dynamic Lookup in place and you want to keep the existing lookup as is when the latest execution of your saved search returns no results (perhaps because data ingestion stopped or the log format has changed), add the parameter “override_if_empty=false” to your outputlookup command. This approach ensures that dashboard panels that depend on the lookup do not break due to a lack of data.
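
Applied to the saved search from Use Case 2, the outputlookup portion would simply become:

sourcetype = storage:array | dedup disk_name, disk_id, array_id, array_name | table array_name,array_id,disk_id,disk_name | outputlookup override_if_empty=false array_disk_mapping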

Please note that the override_if_empty parameter is available only in Splunk version 7.1.0 and above.

New World…Better World

By this point, you should have a fair idea of some of the advantages and drawbacks of lookups. The biggest limitation of CSV file-based lookups is that they are not designed to handle a large number of records. However, having millions of records to store in a lookup is not unusual in today’s data-centric world, so we need to look at alternatives that mitigate the limitations of traditional CSV file-based lookups. Yes, you guessed it right… KV Store, a MongoDB-based store integrated natively with Splunk.

When to Consider Using a KV Store

KV Store surpasses a traditional CSV file-based lookup when 1) you are going to put millions of field-value pairs in your lookup and the file size is no longer measured in KBs, and/or 2) you frequently need to update only a few of the records in your lookup at a time.

Remember that CSV file-based (dynamic) lookups overwrite the existing CSV table content rather than updating or modifying it. Also, large data sets put an extra load on Splunk machines to transfer lookup files to search peers. (The default limit is 2 GB for a knowledge object bundle.)

In a typical three-tier Splunk architecture, the search head sends CSV-based lookups to the Splunk indexers over the network, whereas the KV Store lives only on the search head. If you have multiple large lookups that change frequently, converting them to KV Store collections will provide better performance.
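
For orientation, a KV Store-backed lookup is defined with a collection in collections.conf and a lookup definition in transforms.conf. A minimal sketch, reusing the array/disk example from Use Case 2 (the collection and definition names are assumptions):

If your collections.conf looks like this,

[array_disk_collection]

and your transforms.conf looks like this,

[array_disk_mapping_kv]
external_type = kvstore
collection = array_disk_collection
fields_list = _key, array_name, array_id, disk_id, disk_name

then the lookup, inputlookup, and outputlookup commands work against array_disk_mapping_kv just as they do against a CSV-based lookup.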

Summary

In this article, we discussed the common ways Splunk lookups can be leveraged and some of the considerations required to use them effectively. Splunk lookups remain a relatively simple but effective tool for enriching your data and enhancing your search experience. Not only can they simplify investigations by automatically enriching data, but lookups can also enable you to do deeper and more meaningful context-based analysis. The next time you find yourself struggling to enhance or analyze a data set, consider lookups another tool in your Splunk toolbox.

About SP6

SP6 is a Splunk consulting firm focused on Splunk professional services including Splunk deployment, ongoing Splunk administration, and Splunk development. SP6 also has a separate division that offers Splunk recruitment and the placement of Splunk professionals into direct-hire (FTE) roles for companies that need help acquiring their own full-time staff in today’s challenging market.