3 Splunk Best Practice Lessons We Learned the Hard Way


There are times we’ve been bitten by simple omissions we made while laying the foundation for a Splunk environment. Picture it: you’re winding down your evening when the call comes in that Splunk is down.

You race to troubleshoot the problem, only to discover it could have been prevented during your initial Splunk deployment. With that in mind, here are 3 Splunk best practices that will save you time (and face), and let you enjoy your evenings in peace.

Splunk Best Practice #1: Use Volumes to Manage Your Indexes

We’ll assume you’re already familiar with the maxTotalDataSizeMB setting in the indexes.conf file – it’s used to set the maximum size per index (default value: 500 GB). 

While maxTotalDataSizeMB is your first line of defense against hitting the minimum free disk space threshold (the point at which indexing halts), volumes will protect you from a miscalculation made when creating a new index. Even if you’ve diligently sized your indexes to account for the right growth, retention, and space available, another admin who creates an index in your absence may not be as prudent.

Enter volumes. With them, you can bind indexes together and make sure that, combined, they do not surpass the limit you set. Volumes are configured via indexes.conf and they require a very simple stanza:

[volume:CustomerIndexes]
path = /san/splunk
maxVolumeDataSizeMB = 120000

In the example above, the stanza tells Splunk we want to define a volume called “CustomerIndexes.” 

In addition, we want it to use the path “/san/splunk” to store the associated indexes. Finally, we want to limit the total size of all of the indexes assigned to this volume to 120,000 MB. 

No doubt your mind has already conceived the next step, which is where we assign indexes to our “CustomerIndexes” volume. This is also done in indexes.conf by prefixing your index’s home (hot/warm) and cold paths with the name of the volume:

[AppIndex]
homePath = volume:CustomerIndexes/AppIndex/db
coldPath = volume:CustomerIndexes/AppIndex/colddb
thawedPath = $SPLUNK_DB/AppIndex/thaweddb

[RouterIndex]
homePath = volume:CustomerIndexes/RouterIndex/db
coldPath = volume:CustomerIndexes/RouterIndex/colddb
thawedPath = $SPLUNK_DB/RouterIndex/thaweddb

*PRO TIP – use $_index_name to reference the name of your index definition

[RouterIndex]
homePath = volume:CustomerIndexes/$_index_name/db
coldPath = volume:CustomerIndexes/$_index_name/colddb
thawedPath = $SPLUNK_DB/$_index_name/thaweddb

Why this approach? Even if the next admin is unaware of your storage limits, a volume-based path convention is difficult to miss when creating a new index via indexes.conf or the web interface. And once indexes live on a volume, the volume’s “maxVolumeDataSizeMB” setting caps their combined size, regardless of each index’s own “maxTotalDataSizeMB” setting.

If left to their own devices, the AppIndex and RouterIndex would grow to their default maximum size of 500,000 MB each, taking up a total of 1 TB of storage. With volumes, we no longer have to worry about this. As a bonus, there is nothing stopping you from using separate volumes for cold and warm/hot buckets, in case you have different tiers of storage available.
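To make that arithmetic concrete, here is a minimal Python sketch that reads an indexes.conf like the one above and compares each volume’s cap with the sum of the per-index caps. The function name `volume_report` and the report keys are our own invention for illustration, and the parser assumes a simple file with one `key = value` setting per line; it is not a full Splunk .conf parser.

```python
import configparser

# Default per-index cap cited in the article (maxTotalDataSizeMB).
DEFAULT_MAX_TOTAL_MB = 500_000
VOLUME_PREFIX = "volume:"


def volume_report(conf_text):
    """Summarize, per volume, its cap and the indexes whose homePath uses it."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep Splunk's case-sensitive setting names
    parser.read_string(conf_text)

    # First pass: collect every [volume:<name>] stanza and its cap.
    report = {}
    for section in parser.sections():
        if section.startswith(VOLUME_PREFIX):
            report[section[len(VOLUME_PREFIX):]] = {
                "cap_mb": parser.getint(section, "maxVolumeDataSizeMB", fallback=0),
                "indexes": [],
                "sum_index_caps_mb": 0,
            }

    # Second pass: attach each index to the volume named in its homePath.
    for section in parser.sections():
        if section.startswith(VOLUME_PREFIX):
            continue
        home = parser.get(section, "homePath", fallback="")
        if not home.startswith(VOLUME_PREFIX):
            continue
        volume = home[len(VOLUME_PREFIX):].split("/", 1)[0]
        if volume in report:
            report[volume]["indexes"].append(section)
            report[volume]["sum_index_caps_mb"] += parser.getint(
                section, "maxTotalDataSizeMB", fallback=DEFAULT_MAX_TOTAL_MB
            )
    return report


EXAMPLE_CONF = """
[volume:CustomerIndexes]
path = /san/splunk
maxVolumeDataSizeMB = 120000

[AppIndex]
homePath = volume:CustomerIndexes/AppIndex/db
coldPath = volume:CustomerIndexes/AppIndex/colddb
thawedPath = $SPLUNK_DB/AppIndex/thaweddb

[RouterIndex]
homePath = volume:CustomerIndexes/RouterIndex/db
coldPath = volume:CustomerIndexes/RouterIndex/colddb
thawedPath = $SPLUNK_DB/RouterIndex/thaweddb
"""

if __name__ == "__main__":
    for name, info in volume_report(EXAMPLE_CONF).items():
        print(name, info)
```

For the example config, the two indexes could claim 1,000,000 MB between them on their own defaults, while the volume holds them to 120,000 MB combined.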

Splunk Best Practice #2: Use Apps and Add-Ons Wherever Possible

As cliché as this phrase is nowadays, in the Splunk world, it pays to be the admin that says, “there’s an app for that”. 

Yes, you can download apps from Splunkbase to extend Splunk’s functionality. Furthermore, you’ve probably already used the deployment server to manage inputs and technical add-ons (TAs) on universal forwarders. But what about using apps to manage an environment? It’s not only possible. It’s recommended.

Let’s assume we have all of our Splunk nodes configured to use our deployment server. If not, then you can use this handy CLI command to do that on each instance:

splunk set deploy-poll <IP_address/hostname>:<management_port>
splunk restart

After you’ve done that, continue by identifying stanzas that will be common across groups of nodes (search heads, indexers, forwarders, etc.) or all nodes. For instance, there are two useful stanzas in outputs.conf that make sure every node is aware of the indexers it needs to forward data to:

[tcpout]
defaultGroup=Indexers

[tcpout:Indexers]
server=IndexerA:9997, IndexerB:9996

Deployment server directory structure

Next, create the following directory structure on your deployment server (under $SPLUNK_HOME/etc/deployment-apps) to accommodate our new app’s config files. You may then place your version of the outputs.conf file with the stanzas above in the “local” subdirectory. In this example, we’re naming our app “all_outputs”.

[Image: Splunk directory structure]
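If you would rather script this step, the following Python sketch creates the app layout and drops outputs.conf into its “local” subdirectory. The helper name `scaffold_deployment_app` is hypothetical, and a temporary directory stands in for $SPLUNK_HOME so the example is safe to run anywhere; point it at your real installation only once you are happy with the result.

```python
import tempfile
from pathlib import Path

# The outputs.conf stanzas from the article.
OUTPUTS_CONF = """[tcpout]
defaultGroup=Indexers

[tcpout:Indexers]
server=IndexerA:9997, IndexerB:9996
"""


def scaffold_deployment_app(splunk_home, app_name, conf_name, conf_text):
    """Create <splunk_home>/etc/deployment-apps/<app_name>/local/<conf_name>."""
    local_dir = Path(splunk_home) / "etc" / "deployment-apps" / app_name / "local"
    local_dir.mkdir(parents=True, exist_ok=True)
    conf_path = local_dir / conf_name
    conf_path.write_text(conf_text, encoding="utf-8")
    return conf_path


if __name__ == "__main__":
    # A temporary directory stands in for $SPLUNK_HOME (an assumption for the demo).
    with tempfile.TemporaryDirectory() as fake_home:
        created = scaffold_deployment_app(
            fake_home, "all_outputs", "outputs.conf", OUTPUTS_CONF
        )
        print(created.relative_to(fake_home))
```

The same helper can be reused for each additional app you create, by changing the app name, the config file name, and its contents.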

That was the hard part! Go ahead and repeat this exercise and create a new app on the deployment server for every config file that a group of nodes has in common. Here are a few ideas:

  • All search heads usually share the same search peers, and this can be accomplished via an app that provides distsearch.conf
  • Indexers will need to have the same version of props.conf and transforms.conf to consistently parse the data they ingest
  • Forwarders can use an app configuring the allowRemoteLogin setting via server.conf, allowing them to be managed remotely

In order to tie everything together, log on to your deployment server’s GUI and go to Settings > Forwarder Management. Create server classes for the different groups of nodes in your Splunk environment. Assign the appropriate apps and hosts to each server class.

[Image: Splunk Forwarder settings]
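The server classes you build in the GUI end up in serverclass.conf on the deployment server, so the same mapping can be kept as text. The Python sketch below renders such a file from a plain dictionary. Treat it as an illustration rather than a reference: the stanza and key names (`serverClass:<name>`, `whitelist.<n>`, `restartSplunkd`) reflect our understanding of serverclass.conf and should be checked against the spec for your Splunk version, and the class names, host patterns, and the “sh_distsearch” app are made up for the example.

```python
def render_serverclass_conf(server_classes):
    """Render serverclass.conf text from {class: {"hosts": [...], "apps": [...]}}."""
    lines = []
    for class_name, spec in server_classes.items():
        # One stanza per server class, listing the hosts that belong to it.
        lines.append(f"[serverClass:{class_name}]")
        for position, host_pattern in enumerate(spec["hosts"]):
            lines.append(f"whitelist.{position} = {host_pattern}")
        lines.append("")
        # One stanza per app assigned to that server class.
        for app_name in spec["apps"]:
            lines.append(f"[serverClass:{class_name}:app:{app_name}]")
            lines.append("restartSplunkd = true")
            lines.append("")
    return "\n".join(lines)


# Hypothetical server classes, host patterns, and app names.
EXAMPLE_CLASSES = {
    "all_nodes": {"hosts": ["*"], "apps": ["all_outputs"]},
    "search_heads": {"hosts": ["sh-*"], "apps": ["sh_distsearch"]},
}

if __name__ == "__main__":
    print(render_serverclass_conf(EXAMPLE_CLASSES))
```

Keeping the mapping in one place like this makes it easier to review which apps land on which group of nodes before anything is deployed.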

Here comes the fun part. Next time someone calls you up asking how to stand up a new heavy forwarder (or any other instance type), you can answer, “There’s an app for that.”

Splunk Best Practice #3: Keep an Eye on Free Disk Space

We know from experience that Splunk frequently checks the free space available on any partition that contains indexes. Before executing a search, it also checks for enough free space on the partition where the search dispatch directory is mounted (usually wherever Splunk is installed).

By default, the threshold is set at 5,000 MB and is configurable via the “minFreeSpace” setting in server.conf. When it’s reached, expect a call from your users informing you Splunk has stopped indexing, or that searches are not working.

It’s important to keep a close eye on this when your instance is running on a partition with less than 20 GB of free space, because Splunk will use several GB for its own processes. It’s difficult to predict with certainty how an environment will grow, as several directories grow with the daily use of Splunk and are not governed by the limits set on indexes or volumes.
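As a rough illustration of those two thresholds, this Python sketch measures the free space on a partition and classifies it against the 5,000 MB default and the 20 GB comfort margin. The function names and the “critical/warning/ok” labels are our own, and treating “at or below the threshold” as critical is an assumption made for the example, not a statement about Splunk’s exact comparison.

```python
import shutil

MIN_FREE_MB = 5_000    # Splunk's default minFreeSpace, per the article
WARN_FREE_MB = 20_000  # the 20 GB comfort margin discussed above


def free_mb(path):
    """Return the free space, in MB, on the partition that holds `path`."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def classify_free_space(free_space_mb, min_free_mb=MIN_FREE_MB,
                        warn_free_mb=WARN_FREE_MB):
    """Label a free-space reading (assumption: at or below the minimum is critical)."""
    if free_space_mb <= min_free_mb:
        return "critical"
    if free_space_mb < warn_free_mb:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # "." is a stand-in; in practice, check the partitions holding your
    # indexes and the search dispatch directory.
    reading = free_mb(".")
    print(f"{reading} MB free: {classify_free_space(reading)}")
```

Run on a schedule, a check like this gives you the “warning” call well before your users make the “critical” one.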

The top places to look for growth in an environment

  • Dispatch directory ($SPLUNK_HOME/var/run/splunk/dispatch)
  • KV store directory ($SPLUNK_DB/kvstore)
  • Configuration bundle directory ($SPLUNK_HOME/var/run/splunk/cluster/remote-bundle)
  • Knowledge bundle directory ($SPLUNK_HOME/var/run/searchpeers)
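To see which of these directories is actually growing, a short Python sketch like the one below can total the file sizes under each path and sort them largest first. The function names are hypothetical, the paths are the ones listed above, and the fallback locations in the demo (/opt/splunk and var/lib/splunk beneath it) are assumptions about a typical Linux install; set SPLUNK_HOME and SPLUNK_DB if yours differ.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of regular files under `path`; a missing directory counts as 0."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                pass  # the file vanished mid-walk (common in the dispatch directory)
    return total


def growth_report(splunk_home, splunk_db):
    """Return (label, size_in_bytes) pairs for the growth hot spots, largest first."""
    run_dir = os.path.join(splunk_home, "var", "run")
    candidates = {
        "dispatch": os.path.join(run_dir, "splunk", "dispatch"),
        "kvstore": os.path.join(splunk_db, "kvstore"),
        "cluster bundles": os.path.join(run_dir, "splunk", "cluster", "remote-bundle"),
        "knowledge bundles": os.path.join(run_dir, "searchpeers"),
    }
    sizes = {label: dir_size_bytes(path) for label, path in candidates.items()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Fallback paths are assumptions about a typical install.
    home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    db = os.environ.get("SPLUNK_DB", os.path.join(home, "var", "lib", "splunk"))
    for label, size in growth_report(home, db):
        print(f"{label}: {size / (1024 * 1024):.1f} MB")
```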

There’s a way to avoid any surprises: use your monitoring tool of choice to alert for low disk space. A favorite fix of ours: implementing NMON across our cluster. It provides all types of useful metrics when troubleshooting and monitoring your environment. And NMON conveniently has a predefined low disk space alert you can adjust to your environment.


In summary

We hope this overview of 3 Splunk best practice lessons helps make your life easier. If you have questions about this article or about Splunk, we’re always happy to help. Contact us at any time.