
Splunk 5.0.2

Distributed Deployment Manual
Generated: 5/08/2013 9:26 pm

Copyright 2013 Splunk, Inc. All Rights Reserved

Table of Contents

Overview
    Distributed Splunk overview
    How data moves through Splunk: the data pipeline
    Scale your deployment: Splunk components
    Components and roles
Estimate hardware requirements
    Hardware capacity planning for a distributed Splunk deployment
    Distribute indexing and searching
    How Splunk looks through your data
    Reference hardware
    Accommodate concurrent users and searches
    How Splunk apps affect resource requirements
    Summary of performance recommendations
Forward data
    About forwarding and receiving
    Types of forwarders
    Forwarder deployment topologies
Configure forwarding
    Set up forwarding and receiving
    Enable a receiver
    Configure forwarders with outputs.conf
    Protect against loss of in-flight data
Use the forwarder to create deployment topologies
    Consolidate data from multiple machines
    Set up load balancing
    Route and filter data
    Forward data to third-party systems
Deploy the universal forwarder
    Introducing the universal forwarder
    Universal forwarder deployment overview
    Deploy a Windows universal forwarder via the installer GUI
    Deploy a Windows universal forwarder via the command line
    Remotely deploy a Windows universal forwarder with a static configuration
    Deploy a *nix universal forwarder manually
    Remotely deploy a *nix universal forwarder with a static configuration
    Make a universal forwarder part of a system image
    Migrate a Windows light forwarder
    Migrate a *nix light forwarder
    Supported CLI commands
Deploy heavy and light forwarders
    Deploy a heavy or light forwarder
    Heavy and light forwarder capabilities
Search across multiple indexers
    About distributed search
    Install a dedicated search head
    Configure distributed search
    Mount the knowledge bundle
    Configure search head pooling
    How authorization works in distributed searches
    Use distributed search
    Troubleshoot distributed search
Monitor your deployment
    About Splunk Deployment Monitor App
Deploy configuration updates and apps across your environment
    About deployment server
    Plan a deployment
    Configure deployment clients
    Define server classes
    Deploy apps and configurations
    Extended example: deploy configurations to several forwarders
    Example: add an input to forwarders
    Example: deploy an app
    Deploy in multi-tenant environments
Upgrade your deployment
    Upgrade your distributed environment
    Upgrade the Windows universal forwarder
    Upgrade the universal forwarder for *nix systems

Overview
Distributed Splunk overview
This manual describes how to distribute various components of Splunk functionality across multiple machines. By distributing Splunk, you can scale its functionality to handle the data needs for enterprises of any size and complexity. In single-machine deployments, one instance of Splunk handles the entire end-to-end process, from data input through indexing to search. A single-machine deployment can be useful for testing and evaluation purposes and might serve the needs of department-sized environments. For larger environments, however, where data originates on many machines and where many users need to search the data, you'll want to distribute functionality across multiple Splunk instances. This manual describes how to deploy and use Splunk in such a distributed environment.

How Splunk scales


Splunk performs three key functions as it moves data through the data pipeline. First, Splunk consumes data from files, the network, or elsewhere. Then it indexes the data. (Actually, it first parses and then indexes the data, but for purposes of this discussion, we consider parsing to be part of the indexing process.) Finally, it runs interactive or scheduled searches on the indexed data.

You can split this functionality across multiple specialized instances of Splunk, ranging in number from just a few to thousands, depending on the quantity of data you're dealing with and other variables in your environment. You might, for example, create a deployment with many Splunk instances that only consume data, several other instances that index the data, and one or more instances that handle search requests. These specialized instances of Splunk are known collectively as components. There are several types of components.

For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk, called forwarders, on the machines where the data originates. The forwarders consume data locally and then forward the data across the network to another Splunk component, called the indexer. The indexer does the heavy lifting; it indexes the data and runs searches. It should reside on a machine by itself. The forwarders, on the other hand, can easily co-exist on the machines generating the data, because the data-consuming function has minimal impact on machine performance. This diagram shows several forwarders sending data to a single indexer:

As you scale up, you can add more forwarders and indexers. For a larger deployment, you might have hundreds of forwarders sending data to a number of indexers. You can use load balancing on the forwarders, so that they distribute their data across some or all of the indexers. Not only does load balancing help with scaling, but it also provides a fail-over capability if one of the indexers goes down. The forwarders automatically switch to sending their data to any indexers that remain alive. In this diagram, each forwarder load-balances its data across two indexers:

To coordinate and consolidate search activities across multiple indexers, you can also separate out the functions of indexing and searching. In this type of deployment, called distributed search, each indexer just indexes data and performs searches across its own indexes. A Splunk instance dedicated to search management, called the search head, coordinates searches across the set of indexers, consolidating the results and presenting them to the user:

For the largest environments, you can deploy a pool of several search heads sharing a single configuration set. With search head pooling, you can coordinate simultaneous searches across a large number of indexers:

These diagrams illustrate a few basic deployment topologies. You can actually combine the Splunk functions of data input, indexing, and search in a great variety of ways. For example, you can set up the forwarders so that they route data to multiple indexers, based on specified criteria. You can also configure forwarders to process data locally before sending the data on to an indexer for storage. In another scenario, you can deploy a single Splunk instance that serves as both search head and indexer, searching across not only its own indexes but the indexes on other Splunk indexers as well. You can mix-and-match Splunk components as needed. The possible scenarios are nearly limitless. This manual describes how to scale a deployment to fit your exact needs, whether you're managing data for a single department or for a global enterprise... or for anything in between.

Use index clusters


Starting with version 5.0, you can group Splunk indexers into clusters. Clusters are groups of Splunk indexers configured to replicate each others' data, so that the system keeps multiple copies of all data. This process is known as index replication. By maintaining multiple, identical copies of Splunk data, clusters prevent data loss while promoting data availability for searching. Splunk clusters feature automatic failover from one indexer to the next. This means that, if one or more indexers fail, incoming data continues to get indexed and indexed data continues to be searchable.

Besides enhancing data availability, clusters have other key features that you should consider when you're scaling a deployment. For example, they include a capability to coordinate configuration updates easily across all indexers in the cluster. They also include a built-in distributed search capability. For more information on clusters, see "About clusters and index replication" in the Managing Indexers and Clusters manual.

Manage your Splunk deployment


Splunk provides a few key tools to help manage a distributed deployment:

Deployment server. This Splunk component provides a way to centrally manage configurations and content updates across your entire deployment. See "About deployment server" for details.

Deployment monitor. This app can help you manage and troubleshoot your deployment. It tracks the status of your forwarders and indexers and provides early warning if problems develop. Read the Deploy and Use Splunk Deployment Monitor App manual for details.

What comes next


The rest of this Overview section covers:

How data moves through Splunk: the data pipeline
Scale your deployment: Splunk components
Components and roles

It starts by describing the data pipeline, from the point that the data enters Splunk to when it becomes available for users to search on. Next, the overview describes how Splunk functionality can be split into modular components. It then correlates the available Splunk components with their roles in facilitating the data pipeline. The remaining sections of this manual describe the Splunk components in detail, explaining how to use them to create a distributed Splunk deployment.

For information on capacity planning based on the scale of your deployment, read "Hardware capacity planning for your Splunk deployment" in the Installation manual.

How data moves through Splunk: the data pipeline


Data in Splunk transitions through several phases, as it moves along the data pipeline from its origin in sources such as logfiles and network feeds to its transformation into searchable events that encapsulate valuable knowledge. The data pipeline includes these segments:

Input
Parsing
Indexing
Search

You can assign each of these segments to a different Splunk instance, as described here. This diagram outlines the data pipeline:

Splunk instances participate in one or more segments of the data pipeline, as described in "Scale your deployment".

Note: The diagram represents a simplified view of the indexing architecture. It provides a functional view of the architecture and does not fully describe Splunk internals. In particular, the parsing pipeline actually consists of three pipelines: parsing, merging, and typing, which together handle the parsing function. The distinction can matter during troubleshooting, but does not ordinarily affect how you configure or deploy Splunk.

Input
In the input segment, Splunk consumes data. It acquires the raw data stream from its source, breaks it into 64K blocks, and annotates each block with some metadata keys. The keys apply to the entire input source overall. They include the host, source, and source type of the data. The keys can also include values that are used internally by Splunk, such as the character encoding of the data stream, and values that control later processing of the data, such as the index into which the events should be stored. During this phase, Splunk does not look at the contents of the data stream, so the keys apply to the entire source, not to individual events. In fact, at this point, Splunk has no notion of individual events at all, only of a stream of data with certain global properties.

Parsing
During the parsing segment, Splunk examines, analyzes, and transforms the data. This is also known as event processing. It is during this phase that Splunk breaks the data stream into individual events. The parsing phase has many sub-phases:

Breaking the stream of data into individual lines.
Identifying, parsing, and setting timestamps.
Annotating individual events with metadata copied from the source-wide keys.
Transforming event data and metadata according to Splunk regex transform rules.

Indexing
During indexing, Splunk takes the parsed events and writes them to the index on disk. It writes both compressed raw data and the corresponding index files. For brevity, parsing and indexing are often referred to together as the indexing process. At a high level, that's fine. But when you need to look more closely at the actual processing of data, it can be important to consider the two segments individually. A detailed diagram that depicts the indexing pipelines and explains how indexing works can be found in "How Indexing Works" in the Community Wiki.

Search
Splunk's search function manages all aspects of how the user sees and uses the indexed data, including interactive and scheduled searches, reports and charts, dashboards, and alerts. As part of its search function, Splunk stores user-created knowledge objects, such as saved searches, event types, views, and field extractions. For more information on the various steps in the pipeline, see "How indexing works" in the Managing Indexers and Clusters manual.

Scale your deployment: Splunk components


To accommodate your deployment topology and performance requirements, you can allocate the different Splunk roles, such as data input and indexing, to separate Splunk instances. For example, you can have instances that just gather data inputs, which they then forward to another, central instance for indexing. Or you can distribute indexing across several instances that coordinate with a separate instance that processes all search requests.

To facilitate the distribution of roles, Splunk can be configured into a range of separate component types, each mapping to one or more of the roles. You create most components by enabling or disabling specific functions of the full Splunk instance. These are the Splunk component types available for use in a distributed environment:

Indexer
Forwarder
Search head
Deployment server

All components are variations of the full Splunk instance, with certain features either enabled or disabled, except for the universal forwarder, which is its own executable.

Indexers
The indexer is the Splunk component that creates and manages indexes. The primary functions of an indexer are:

Indexing incoming data.
Searching the indexed data.

In single-machine deployments consisting of just one Splunk instance, the indexer also handles the data input and search management functions. For larger-scale needs, indexing is split out from the data input function and sometimes from the search management function as well. In these larger, distributed deployments, the Splunk indexer might reside on its own machine and handle only indexing (usually along with parsing), along with searching of its indexed data. In those cases, other Splunk components take over the non-indexing/searching roles. Forwarders consume the data, indexers index and search the data, and search heads coordinate searches across the set of indexers.

For information on indexers, see the Managing Indexers and Clusters manual, starting with the topic "About indexes and indexers".

Forwarders
One role that's typically split off from the indexer is the data input function. For instance, you might have a group of Windows and Linux machines generating data that needs to go to a central Splunk indexer for consolidation. Usually the best way to do this is to install a lightweight instance of Splunk, known as a forwarder, on each of the data-generating machines. These forwarders manage the data input and send the resulting data streams across the network to a Splunk indexer, which resides on its own machine.

There are two types of forwarders:

Universal forwarders. These have a very light footprint and forward only unparsed data.
Heavy forwarders. These have a larger footprint but can parse, and even index, data before forwarding it.

Note: There is also a third type of forwarder, the light forwarder. The light forwarder is essentially obsolete, having been replaced in release 4.2 by the universal forwarder, which provides similar functionality in a smaller footprint.

For information on forwarders, start with the topic "About forwarding and receiving".

Search heads
In situations where you have a large amount of indexed data and numerous users concurrently searching on it, it can make sense to distribute the indexing load across several indexers, while offloading the search query function to a separate machine. In this type of scenario, known as distributed search, one or more Splunk components called search heads distribute search requests across multiple indexers. For information on search heads, see "About distributed search".
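
As a rough sketch of how this relationship gets wired up, each indexer is added to the search head as a search peer. One common way is from the search head's CLI; the host name, port, and credentials below are placeholders, and "Configure distributed search" describes the supported procedure and options.

    splunk add search-server -host indexer1.example.com:8089 \
        -auth admin:changeme -remoteUsername admin -remotePassword changeme
    splunk restart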

Deployment server
To update a distributed deployment, you can use Splunk's deployment server. The deployment server lets you push out configurations and content to sets of Splunk instances (referred to, in this context, as deployment clients), grouped according to any useful criteria, such as OS, machine type, application area, location, and so on. The deployment clients are usually forwarders or indexers. For example, once you've made and tested an updated configuration on a local Linux forwarder, you can push the changes to all the Linux forwarders in your deployment.

The deployment server can share a Splunk instance with another Splunk component, either a search head or an indexer, if your deployment is small (fewer than around 30 deployment clients). It should run on its own Splunk instance in larger deployments. For more information, see this tech note on the Community Wiki. For detailed information on the deployment server, see "About deployment server".
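
To make this concrete, here is a minimal, hypothetical sketch of the two configuration files involved on each side; the group (server class) name, host pattern, app name, and server address are placeholders, and "Define server classes" and "Configure deployment clients" document the real attributes.

    # serverclass.conf, on the deployment server
    [serverClass:linux_forwarders]
    whitelist.0 = linuxhost*

    # map an app (a bundle of configuration files) to that class
    [serverClass:linux_forwarders:app:forwarder_base_config]

    # deploymentclient.conf, on each deployment client
    [target-broker:deploymentServer]
    targetUri = deploymentserver.example.com:8089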

Deployment monitor
Although it's actually an app, not a Splunk component, the deployment monitor has an important role to play in distributed environments. Distributed deployments can scale to forwarders numbering into the thousands, sending data to many indexers, which feed multiple search heads. To view and troubleshoot these distributed deployments, you can use the deployment monitor, which provides numerous views into the state of your forwarders and indexers. For detailed information on the deployment monitor, read the Deploy and Use Splunk Deployment Monitor App manual.

Where to go next
While the fundamental issues of indexing and event processing remain the same no matter what the size or nature of your distributed deployment, it is important to take into account deployment needs when planning your indexing strategy. To do that effectively, you must also understand how components map to Splunk roles. For information on hardware requirements for scaling your deployment, see "Hardware capacity planning for your Splunk deployment".

Components and roles


Each segment of the data pipeline directly corresponds to a role that one or more Splunk components can perform. For instance, data input is a Splunk role. Either an indexer or a forwarder can perform the data input role. For more information on the data pipeline, look here.

How components support the data pipeline


This table correlates the pipeline segments and Splunk roles with the components that can perform them:

Data pipeline segment | Splunk role | Splunk components that can perform this role
Data input | Data input | indexer; universal forwarder; heavy forwarder
Parsing | Parsing | indexer; heavy forwarder
Indexing | Indexing | indexer
Search | Search | indexer; search head
n/a | Managing distributed updates | deployment server
n/a | Troubleshooting deployments | deployment monitor app (not actually a component, but rather a key feature for managing distributed environments)

As you can see, some roles can be filled by different components depending on the situation. For instance, data input can be handled by an indexer in single-machine deployments, or by a forwarder in larger deployments.

For more information on components, look here.

Components in action
These are some of the common ways in which Splunk functionality is distributed and managed.

Forward data to an indexer

In this deployment scenario, forwarders handle data input, collecting data and sending it on to a Splunk indexer. Forwarders come in two flavors:

Universal forwarders. These maintain a small footprint on their host machine. They perform minimal processing on the incoming data streams before forwarding them on to an indexer, also known as the receiver.
Heavy forwarders. These retain much of the functionality of a full Splunk instance. They can parse data before forwarding it to the receiving indexer. (See "How data moves through Splunk" for the distinction between parsing and indexing.)

Both types of forwarders tag data with metadata such as host, source, and source type, before forwarding it on to the indexer.

Note: There is also a third type of forwarder, the light forwarder. The light forwarder is essentially obsolete, having been replaced in release 4.2 by the universal forwarder, which provides similar functionality in a smaller footprint.

Forwarders allow you to use resources efficiently while processing large quantities or disparate types of data. They also enable a number of interesting deployment topologies, by offering capabilities for load balancing, data filtering, and routing.

For an extended discussion of forwarders, including configuration and detailed use cases, see "About forwarding and receiving".

Search across multiple indexers

In distributed search, Splunk instances send search requests to other Splunk instances and merge the results back to the user. This is useful for a number of purposes, including horizontal scaling, access control, and managing geo-dispersed data. The Splunk instance that manages search requests is called the search head. The instances that maintain the indexes and perform the actual searching are indexers, called search peers in this context.

For an extended discussion of distributed search, including configuration and detailed use cases, see "About distributed search".

Manage distributed updates

When dealing with distributed deployments consisting potentially of many forwarders, indexers, and search heads, the Splunk deployment server simplifies the process of configuring and updating Splunk components, mainly forwarders and indexers. Using the deployment server, you can group the components (referred to as deployment clients in this context) into server classes, making it possible to push updates based on common characteristics.

A server class is a set of Splunk instances that share configurations. Server classes are typically grouped by OS, machine type, application area, location, or other useful criteria. A single deployment client can belong to multiple server classes, so a Linux universal forwarder residing in the UK, for example, might belong to a Linux server class and a UK server class, and receive configuration settings appropriate to each.

For an extended discussion of deployment management, see "About deployment server".

View and troubleshoot your deployment

Use the deployment monitor app to view the status of your Splunk components and troubleshoot them. The deployment monitor is functionally part of the search role, so it resides with either the indexer or a search head. It looks at internal events generated by Splunk forwarders and indexers.

The home page for this app is a dashboard that provides charts with basic stats on index throughput and forwarder connections over time. It also includes warnings for unusual conditions, such as forwarders that appear to be missing from the system or indexers that aren't currently indexing any data. The charts, warnings, and other information on this page provide an easy way to monitor potentially serious conditions. The page itself provides guidance on what each type of warning means.

The deployment monitor also provides pages that consolidate data for all forwarders and indexers in your deployment. You can drill down further, to obtain detailed information on any forwarder or indexer. For more information on the deployment monitor, read the Deploy and Use Splunk Deployment Monitor App manual.

For more information


In summary, these are the fundamental components and features of a Splunk distributed environment:

Indexers. See "About indexes and indexers" in the Managing Indexers and Clusters manual.
Forwarders. See "About forwarding and receiving" in this manual.
Search heads. See "About distributed search" in this manual.
Deployment server. See "About deployment server" in this manual.
Deployment monitor. Read the Deploy and Use Splunk Deployment Monitor App manual.

For guidance on where to configure various Splunk settings, see "Configuration parameters and the data pipeline" in the Admin manual. That topic lists key configuration settings and the data pipeline segments they act upon. If you know which components in your Splunk topology handle which segments of the data pipeline, you can use that topic to determine where to configure the various settings. For example, if you use a search head to handle the search segment, you'll need to configure any search-related settings on the search head and not on your indexers.


Estimate hardware requirements


Hardware capacity planning for a distributed Splunk deployment
Overview
If you have larger indexing or searching requirements, run Splunk apps or solutions that generate or execute a lot of saved searches, or regularly employ I/O-intensive searches, then you should scale your Splunk deployment to address the increased resource overhead that those operations incur. For an overview of what a distributed Splunk deployment is, review "Distributed Splunk overview" in this manual. In many cases, this involves using distributed search to run searches in parallel across multiple indexers at once. You can gather data from machines using Splunk forwarders and, optionally, configure those servers to send data to multiple indexers at once to reduce search time. For information on the individual elements of a Splunk deployment, read "Components of a Splunk deployment" in the Installation Manual.

Estimate hardware requirements


While determining the hardware requirements for your distributed Splunk deployment, there are a number of things you must consider. You must understand how various Splunk activities affect the resource overhead required to perform them. Each of the following activities has a direct impact on the overall performance of Splunk:

The amount of data you index.
The number of concurrent Splunk users.
The number of saved searches you run.
The types of search you employ.
The number of apps or solutions you implement.
When you run apps, whether or not those apps execute a large number of saved searches.

When you add more indexers to a Splunk deployment, you increase the amount of available indexing capacity by reducing the indexing overhead per server. Consequently, reduced indexing overhead also means reduced search time. But that is only half the story. While Splunk scales across multiple indexers, the amount of indexing throughput becomes less important as either the number of concurrent users or saved searches increases. Additionally, depending on the kinds of searches you employ against your data, the resource needs for searching can become as important as the resource needs for indexing.

For additional information on estimating your hardware requirements, read the following topics, all in this manual:

"Distribute indexing and searching" - for details on how to begin structuring your distributed environment.
"How Splunk looks through your data" - to learn what the different search types are, and how they impact performance on an indexer.
"Reference hardware" - to learn about the reference servers and the indexing and searching performance they are capable of.
"Accommodate concurrent users and searches" - for various scenarios on addressing search performance.
"How Splunk apps affect resource requirements" - for additional information on how Splunk apps consume computing resources.

Considerations for clusters


There are some additional hardware issues to consider if you're implementing Splunk clusters. See "System requirements and other deployment considerations" in the Managing Indexers and Clusters manual.

Distribute indexing and searching


This topic discusses the concepts and hardware requirements for distributing the indexing and searching components of your Splunk deployment.

Concepts of distributed indexing and searching


You scale your Splunk deployment by distributing searching and indexing across multiple servers. Indexers bring in, store, and search the data. Search heads manage search requests and present results. Since indexers require much more disk I/O throughput than search heads do, you give your environment more indexing capacity by reducing the overhead required for searching. The key points to remember are:

The more indexers you add to the deployment, the faster data is consumed and prepared for searches.
The more search heads you add to the deployment, the faster you are able to find the data you indexed.

Considerations for search performance vs. indexing performance


While the two points shown above are best practice for improving indexing speed, there are some important caveats to note as well, particularly when it comes to search speed.

As your indexers consume data, they store it in buckets - individual elements of an index. As more data comes in, the number of buckets increases. An increased number of buckets - particularly those which hold smaller amounts of data - can impact search speed because of the throughput required to navigate through those buckets for the data that you're searching. Additionally, as the number of buckets increases, Splunk must manage the buckets on an individual system. Splunk does this by "rolling" buckets - thus making room for new incoming data. This procedure takes up I/O cycles, as well as cycles that could be used to fetch events for search requests. The key points to understand are:

You can't necessarily improve search performance simply by adding search heads to your distributed deployment. A mix of search heads and indexers is vital.
The number and types of search also impact indexer performance. Some search types tax an indexer's CPU, others apply pressure to the disk subsystem.

More detail about how to plan for simultaneous searches is found in "Accommodate concurrent users and searches" in this manual.

How Splunk looks through your data


As you scale your deployment up, the importance of understanding the different types of search and how they impact Splunk's performance increases. This knowledge helps you determine how many indexers and search heads you should add to your distributed deployment.

Search types: the details


There are four basic types of search that you can invoke against data stored in a Splunk index. Each of these search types impacts the Splunk indexer in a different way. The search types are:

Dense. A dense search is a search that returns a large percentage (10% or more) of matching results for a given set of data in a given period of time. A reference server should be able to fetch up to 50,000 matching events per second for a dense search. Dense searches usually tax a server's CPU first, because of the overhead required to decompress the raw data stored in a Splunk index.

Sparse. Sparse searches return smaller numbers of results for a given set of data in a given period of time (anywhere from .01 to 1%) than dense searches do. A reference indexer should be able to fetch up to 5,000 matching events per second when executing a sparse search.

Super-sparse. A super-sparse search is a "needle in the haystack" search that retrieves only a very small number of results across the same set of data within the same time period as the other searches. A super-sparse search is very I/O intensive because the indexer must look through all of the buckets of an index to find the desired results. This can take up to two seconds per searched bucket. If you have a large amount of data stored on your indexer, there are a lot of buckets, and a super-sparse search can take a very long time to complete.

Rare. Rare searches are like super-sparse searches in that they match just a handful of results across a number of index buckets. The major difference with rare searches is that bloom filters - data structures that test whether or not an element is a member of a set - significantly reduce the number of buckets that need to be searched by eliminating those buckets which do not contain events that match the search request. This allows a rare search to complete anywhere from 20 to 100 times faster than a super-sparse search, for the same amount of data searched.
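
To make the dense/rare distinction concrete, consider two illustrative searches against a hypothetical web-access index; the source type and field value are placeholders. The first matches most events in the time range and is therefore dense. The second looks for a single session ID and behaves like a rare search when bloom filters can rule out buckets that do not contain that term.

    A dense search:
        sourcetype=access_combined | stats count by status

    A rare ("needle in the haystack") search:
        sourcetype=access_combined session_id=3f9a2c71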

Summary
The following table summarizes the different search types. Note that for dense and sparse searches, Splunk measures performance based on the number of matching events, while with super-sparse and rare searches, performance is measured based on total indexed volume.

Search type | Description | Ref. indexer throughput | Performance impact
Dense | Dense searches return a large percentage of results for a given set of data in a given period of time. | Up to 50,000 matching events per second | Generally CPU-bound
Sparse | Sparse searches return a smaller amount of results for a given set of data in a given period of time than dense searches do. | Up to 5,000 matching events per second | Generally CPU-bound
Super-sparse | Super-sparse searches return a very small number of results from each index bucket which match the search. Depending on how large the set of data is, these types of search can take a long period of time. | Up to 2 seconds per index bucket | Primarily I/O bound
Rare | Rare searches are similar to super-sparse searches, but are assisted by bloom filters, which help eliminate index buckets that do not match the search request. Rare searches return results anywhere from 20 to 100 times faster than a super-sparse search does. | From 10 to 50 index buckets per second | Primarily I/O bound

Reference hardware
At higher data indexing rates and/or user counts, you must take into account the differing needs of indexers and search heads. Dedicated search heads do not need extremely fast disk throughput, nor do they need much local storage. They do, however, require far more CPU resources than indexers do.

The reference indexer for a distributed Splunk deployment is somewhat different from a reference indexer in a single-server deployment. Of particular note is the disk subsystem: a reference indexer in a distributed deployment has significantly larger disk throughput requirements than a single-server reference indexer. The main reason for this is that an indexer in a distributed deployment indexes more data and handles more search requests than one in a single-server deployment.

Following are the recommendations for both search heads and indexers in a distributed deployment:

Dedicated search head


Intel 64-bit chip architecture
4 CPUs, 4 cores per CPU, 2.5-3 GHz per core
8 GB RAM
2 x 300 GB, 10,000 RPM SAS hard disks, configured in RAID 0
Standard 1Gb Ethernet NIC, optional 2nd NIC for a management network
Standard 64-bit Linux or Windows distribution

Note: Since search heads require raw computing power and are likely to become CPU-bound, it is better to add additional CPU cores to a search head if faster performance per server is desired. The guideline of 1 core per active Splunk user still applies. Additionally:

Don't forget to account for scheduled searches in your CPU allowance. A typical search request requires 1 CPU core.
Depending on the type of search you use, you might need to add CPU cores to account for the increased load those search types cause.

Indexer
Indexers in a distributed deployment configuration have higher disk I/O bandwidth requirements than indexers in a non-distributed environment. This is because indexers must both write new data and service the remote requests of search heads.

Intel 64-bit chip architecture
2 CPUs, 4 cores per CPU, 2.5-3 GHz per core
8 GB RAM
Disk subsystem capable of 1200 average input/output operations per second (IOPS) (for example: 12 x 300 GB, 15,000 RPM SAS hard disks, configured in RAID 1+0)
Standard 1Gb Ethernet NIC, optional 2nd NIC for a management network
Standard 64-bit Linux or Windows distribution

At higher daily volumes, local disk will likely not provide cost-effective storage for the time frames where speedy search is desired. In these cases, we suggest deploying fast attached storage or networked storage, such as storage area networks (SAN) over fiber. While there are too many types of storage to recommend, consider these guidelines when planning your storage infrastructure:

Indexers do many bulk reads.
Indexers do many disk seeks.

Therefore, more disks (specifically, more spindles) are better for indexing performance. Total throughput of the entire system is important as well: the ratio of disks to disk controllers in a particular system should be higher, similar to how you configure a database server.

Ratio of indexers to search heads


Technically, there is no practical Splunk limitation on the number of search heads an indexer can support, or the number of indexers a search head can search against. However, systems limitations suggest a ratio of approximately 8 to 1 in most use cases. That is a rough guideline - if you have many searchers compared to your total data volume, then more search heads will increase search efficiency. In some cases, the best use of a separate search head is to populate summary indexes. This search head then acts like an indexer to the primary search head that users log into.

Accommodate concurrent users and searches


This topic discusses how to accommodate many simultaneous searches, search types, and concurrent users with your distributed Splunk deployment.

Overview
The biggest performance impacts on a Splunk deployment are the number of concurrent users, the number of concurrent searches, and the types of searches you employ during the course of the deployment's operation. Each of these elements affects the deployment in different ways.

How concurrent users and searches affect performance


When a user submits a search request, the search request may take up to one CPU core on each indexer to process while it is running. Any additional searches launched by that user also account for one CPU core.

The type of search the user invokes also impacts resource usage. The additional amount of CPU or disk usage varies depending on the search type. For additional information about Splunk search types, read "How Splunk looks through your data" in this chapter.

How to maximize search performance


The best way to address the resource overhead for many concurrent searches is to configure the environment to handle search requests completely within available physical server memory. While this is important for the servers that you designate as search heads, it's particularly important for indexers in your deployment, as indexers must both index incoming data and search existing data.

For example, in a deployment that has up to 48 concurrent searches happening at a time, a single search that uses 200 MB of memory translates to nearly 10 GB of memory required to satisfy the 48 concurrent search requests. The amount of available memory is a very important statistic - while performance on an indexer only declines gradually with increased CPU usage from concurrent search jobs, it drops dramatically when the server exhausts all available physical memory. A single reference indexer could not handle 48 concurrent searches - at the very least, additional memory is required for satisfactory performance.

Notwithstanding the memory constraints, a search's run time increases proportionally as the number of free CPU cores on a system decreases. For example, on an idle system with 8 available cores, the first eight searches to arrive get serviced immediately by each core, and complete within a short period of time (for the purposes of this example, 10 seconds). If there are 48 searches running concurrently, then completion time increases significantly, as shown in the calculation below:

48 concurrent searches / 8 available cores = 6 searches per core; total time per search = 6 x 10 sec. = 60 sec.

Since indexers do the bulk of the work in search operations (reading data off disk, decompressing it, extracting knowledge and reporting), it's best practice to add indexers to decrease the amount of time per search. In this example, to return to the performance level of 10-second searches, you must deploy 6 indexers with eight cores each:

8 concurrent searches / 48 available cores = 6 cores per search; total time per search = 10 sec. / 6 = about 1.6 sec.
48 concurrent searches / 48 available cores = 1 core per search; total time per search = 10 sec.

Note: In this example, one search head to service search requests is okay. It might be appropriate, however, to set aside a second search head to create summary indexes.

In many cases, though, the system isn't idle before searches arrive. For example, if an indexer is indexing 150 GB/day of data at peak times, then up to 4 of the 8 cores are in use indexing the data. In this case, search times increase significantly:

4 concurrent searches / 4 available cores = 1 search per core; total time per search = 10 sec.
48 concurrent searches / 4 available cores = 12 searches per core; total time per search = 12 x 10 sec. = 120 sec.

Increasing the number of cores per server does decrease the amount of time taken per search, but is not the most effective way to streamline search operations. One system with 16 cores has the following performance (again, assuming an idle system before searches arrive):

16 concurrent searches / 16 available cores = 1 search per core; total time per search = 10 sec.
48 concurrent searches / 16 available cores = 3 searches per core; total time per search = 3 x 10 sec. = 30 sec.

With two 8-core servers, the performance profile becomes the following:

8 concurrent searches / 16 available cores = 2 cores per search; total time per search = 10 sec. / 2 = 5 sec.
16 concurrent searches / 16 available cores = 1 core per search; total time per search = 10 sec.
48 concurrent searches / 16 available cores = 3 searches per core; total time per search = 3 x 10 sec. = 30 sec.

Two 8-core servers cost only slightly more than one 16-core server. In addition, two 8-core servers provide significantly more available disk throughput than one 16-core server does, based on spindle count. This is especially important for indexers because of the high disk bandwidth that they require. Adding indexers reduces the indexing load on any system and frees CPU cores for searching. Also, since the performance of almost all types of search scales with the number of indexers, searches will be faster, which mitigates the effect of reduced performance from sharing resources amongst both indexing and searching. Increasing search speed reduces the chance of concurrent searches with concurrent users. In real-world situations with hundreds of users, each user will run a search every few minutes, though not at the exact same time as other users. By adding indexers and reducing the search time, you reduce the concurrency factor and lower the concurrency-related I/O and memory contention.


How Splunk apps affect resource requirements


This topic discusses how Splunk apps and solutions affect hardware requirements in a distributed Splunk deployment. If you use a Splunk app or solution that gets knowledge by executing a large number of saved searches, then you can easily overwhelm a single-server Splunk deployment. This is because multiple searches quickly exhaust available CPU resources on an indexer. (Review "Accommodate concurrent users and searches" in this manual for specifics on how Splunk deals with multiple concurrent searches.)

When installing an app or solution, read the system requirements outlined in that app or solution's documentation. If that information is not available, contact the authors of the app or solution to get specific details about what is needed to run the app properly.

Summary of performance recommendations


The table below depicts the performance recommendations based on the reference servers described earlier in this chapter. For specifics on those reference servers, read "Reference hardware."

Important: The table shows approximate guidelines only. You should modify these figures based on your specific use case. If you need additional guidance, contact Splunk. You might want to engage a member of Professional Services depending on the deployment's initial size.

Daily Volume | Number of Search Users | Recommended Indexers | Recommended Search Heads
< 2 GB/day | < 2 | 1, shared | N/A
2 to 100 GB/day | up to 4 | 1, dedicated | N/A
100 to 200 GB/day | up to 8 | 2 | 1
200 to 300 GB/day | up to 12 | 3 | 1
300 to 400 GB/day | up to 8 | 4 | 1
400 to 500 GB/day | up to 16 | 5 | 2
500 GB to 1 TB/day | up to 24 | 10 | 2
1 TB to 20 TB/day | up to 100 | 100 | 24
20 TB to 60 TB/day | up to 100 | 300 | 32

Answers
Have questions? Visit Splunk Answers to see what questions and answers other Splunk users had about hardware and Splunk.


Forward data
About forwarding and receiving
You can forward data from one Splunk instance to another Splunk server or even to a non-Splunk system. The Splunk instance that performs the forwarding is typically a smaller footprint version of Splunk, called a forwarder. A Splunk instance that receives data from one or more forwarders is called a receiver. The receiver is usually a Splunk indexer, but can also be another forwarder, as described here. This diagram shows three forwarders sending data to a single Splunk receiver (an indexer), which then indexes the data and makes it available for searching:

Forwarders represent a much more robust solution for data forwarding than raw network feeds, with their capabilities for:

Tagging of metadata (source, source type, and host)
Configurable buffering
Data compression
SSL security
Use of any available network ports

The forwarding and receiving capability makes possible all sorts of interesting Splunk topologies to handle functions like data consolidation, load balancing, and data routing. For more information on the types of deployment topologies that you can create with forwarders, see "Forwarder deployment topologies". Splunk provides a number of types of forwarders to meet various needs. These are described in "Types of forwarders".
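
As a quick illustration of the forwarder/receiver relationship, the two sides can be wired together from the Splunk CLI; the port, host name, and credentials below are placeholders, and the procedures in "Enable a receiver" and "Set up forwarding and receiving" are the authoritative steps.

    # On the receiving indexer: listen for forwarded data on port 9997
    splunk enable listen 9997 -auth admin:changeme

    # On the forwarder: point it at the receiver, then restart
    splunk add forward-server indexer.example.com:9997 -auth admin:changeme
    splunk restart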


Types of forwarders
There are three types of forwarders:

The universal forwarder is a streamlined, dedicated version of Splunk that contains only the essential components needed to forward data to receivers.
A heavy forwarder is a full Splunk instance, with some features disabled to achieve a smaller footprint.
A light forwarder is also a full Splunk instance, with most features disabled to achieve as small a footprint as possible. The universal forwarder, with its even smaller footprint yet similar functionality, supersedes the light forwarder for nearly all purposes.

Note: The light forwarder has been deprecated in Splunk version 6.0. For a list of all deprecated features, see the topic "Deprecated features" in the Release Notes.

In nearly all respects, the universal forwarder represents the best tool for forwarding data to indexers. Its main limitation is that it forwards only unparsed data, as described later in this topic. Therefore, you cannot use it to route data based on event contents. For that, you must use a heavy forwarder. You also cannot index data locally on a universal forwarder; only a heavy forwarder can index and forward.

The universal forwarder


The universal forwarder is Splunk's new lightweight forwarder. You use it to gather data from a variety of inputs and forward the data to a Splunk server for indexing and searching. You can also forward data to another forwarder, as an intermediate step before sending the data onwards to an indexer.

The universal forwarder's sole purpose is to forward data. Unlike a full Splunk instance, you cannot use the universal forwarder to index or search data. To achieve higher performance and a lighter footprint, it has several limitations:

The universal forwarder has no searching, indexing, or alerting capability.
The universal forwarder does not parse data.
Unlike full Splunk, the universal forwarder does not include a bundled version of Python.

For details on the universal forwarder's capabilities, see "Introducing the universal forwarder".

Note: The universal forwarder is a separately downloadable piece of software. Unlike the heavy and light forwarders, you do not enable it from a full Splunk instance. To learn how to download, install, and deploy a universal forwarder, see "Universal forwarder deployment overview".

Heavy and light forwarders


While the universal forwarder is generally the preferred way to forward data, you might have reason (legacy-based or otherwise) to use heavy or light forwarders as well. Unlike the universal forwarder, which is an entirely separate, streamlined executable, both heavy and light forwarders are actually full Splunk instances with certain features disabled. Heavy and light forwarders differ in capability and the corresponding size of their footprints.

A heavy forwarder (sometimes referred to as a "regular forwarder") has a smaller footprint than a Splunk indexer but retains most of the capability, except that it lacks the ability to perform distributed searches. Much of its default functionality, such as Splunk Web, can be disabled, if necessary, to reduce the size of its footprint. A heavy forwarder parses data before forwarding it and can route data based on criteria such as source or type of event. One key advantage of the heavy forwarder is that it can index data locally, as well as forward data to another Splunk instance. You must turn this capability on; it's disabled by default. See "Configure forwarders with outputs.conf" in this manual for details.

A light forwarder has a smaller footprint with much more limited functionality. It forwards only unparsed data. Starting with 4.2, it has been superseded by the universal forwarder, which provides very similar functionality in a smaller footprint. The light forwarder continues to be available mainly to meet any legacy needs. We recommend that you always use the universal forwarder to forward unparsed data. When you install a universal forwarder, the installer gives you the opportunity to migrate checkpoint settings from any (version 4.0 or greater) light forwarder residing on the same machine. See "Introducing the universal forwarder" for a more detailed comparison of the universal and light forwarders.

For detailed information on the capabilities of heavy and light forwarders, see "Heavy and light forwarder capabilities".

To learn how to enable and deploy a heavy or light forwarder, see "Deploy a heavy or light forwarder".
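
As a sketch of the index-and-forward capability described above, a heavy forwarder's outputs.conf might look roughly like the following; the group name and receiver address are placeholders, and "Configure forwarders with outputs.conf" covers the exact syntax.

    [tcpout]
    defaultGroup = primary_indexers
    # keep a local copy in this instance's own indexes while forwarding
    indexAndForward = true

    [tcpout:primary_indexers]
    server = indexer.example.com:9997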

Forwarder comparison
This table summarizes the similarities and differences among the three types of forwarders: Features and capabilities
Type of Splunk instance Footprint (memory, CPU load) Bundles Python? Handles data inputs? Forwards to Splunk? Forwards to 3rd party systems? Serves as intermediate forwarder?

Universal forwarder
Dedicated executable

Light forwarder
Full Splunk, with most features disabled Small Yes All types Yes Yes

Heavy forwarder
Full Splunk, with some features disabled Medium-to-large (depending on enabled features) Yes All types Yes Yes

Smallest No All types (but scripted inputs might require Python installation) Yes Yes

Yes

Yes

Yes

Indexer acknowledgment Optional (guaranteed delivery)? Load balancing? Data cloning? Per-event filtering? Event routing? Event parsing? Yes Yes No No No

Optional (version 4.2+) Optional (version 4.2+) Yes Yes No No No Yes Yes Yes Yes Yes Optional, by setting indexAndForward

Local indexing?

No

No

attribute in
outputs.conf Optional Optional

Searching/alerting? Splunk Web?

No No

No No

29

For detailed information on specific capabilities, see the rest of this topic, as well as the other forwarding topics in the manual.

Types of forwarder data


Forwarders can transmit three types of data:

Raw
Unparsed
Parsed

The type of data a forwarder can send depends on the type of forwarder it is, as well as how you configure it. Universal forwarders and light forwarders can send raw or unparsed data. Heavy forwarders can send raw or parsed data.

With raw data, the data stream is forwarded as raw TCP; it is not converted into Splunk's communications format. The forwarder just collects the data and forwards it on. This is particularly useful for sending data to a non-Splunk system.

With unparsed data, a universal forwarder performs only minimal processing. It does not examine the data stream, but it does tag the entire stream with metadata to identify source, source type, and host. It also divides the data stream into 64K blocks and performs some rudimentary timestamping on the stream, for use by the receiving indexer in case the events themselves have no discernible timestamps. The universal forwarder does not identify, examine, or tag individual events.

With parsed data, a heavy forwarder breaks the data into individual events, which it tags and then forwards to a Splunk indexer. It can also examine the events. Because the data has been parsed, the forwarder can perform conditional routing based on event data, such as field values.

The parsed and unparsed formats are both referred to as cooked data, to distinguish them from raw data. By default, forwarders send cooked data (in the universal forwarder's case, unparsed data, and in the heavy forwarder's case, parsed data). To send raw data instead, set the sendCookedData=false attribute/value pair in outputs.conf.
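
For example, a minimal outputs.conf stanza that sends a raw (uncooked) TCP stream, such as to a non-Splunk system, might look like this sketch; the group name, host, and port are placeholders.

    [tcpout:raw_to_third_party]
    server = thirdparty.example.com:514
    # send the raw stream rather than Splunk's cooked format
    sendCookedData = false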

Forwarders and indexes


Forwarders forward and route data on an index-by-index basis. By default, they forward all external data, as well as data for the _audit internal index. In some cases, they also forward data for the _internal internal index. You can change this behavior as necessary. For details, see "Filter data by target index".

Forwarder deployment topologies


You can deploy Splunk forwarders in a wide variety of scenarios. This topic provides an overview of some of the most useful types of topologies that you can create with forwarders. For detailed information on how to configure various deployment topologies, refer to the topics in the section "Use the forwarder to create deployment topologies".

Data consolidation
Data consolidation is one of the most common topologies, with multiple forwarders sending data to a single Splunk server. The scenario typically involves universal forwarders forwarding unparsed data from workstations or production non-Splunk servers to a central Splunk server for consolidation and indexing. With their lighter footprint, universal forwarders have minimal impact on the performance of the systems they reside on. In other scenarios, heavy forwarders can send parsed data to a central Splunk indexer. Here, three universal forwarders are sending data to a single Splunk indexer:

For more information on data consolidation, read "Consolidate data from multiple machines".

Load balancing
Load balancing simplifies the process of distributing data across several Splunk indexers to handle considerations such as high data volume, horizontal scaling for enhanced search performance, and fault tolerance. In load balancing, the forwarder routes data sequentially to different indexers at specified intervals.


Splunk forwarders perform automatic load balancing, in which the forwarder switches receivers at set time intervals. If parsing is turned on (for a heavy forwarder), the switching will occur at event boundaries. In this diagram, three universal forwarders are each performing load balancing between two indexers:

For more information on load balancing, read "Set up load balancing".

Routing and filtering


In data routing, a forwarder routes events to specific Splunk or third-party servers, based on criteria such as source, source type, or patterns in the events themselves. Routing at the event level requires a heavy forwarder. A forwarder can also filter and route events to specific queues, or discard them altogether by routing to the null queue. Here, a heavy forwarder routes data to three Splunk indexers based on event patterns:

For more information on routing and filtering, read "Route and filter data".


Forwarders and clusters


You can use forwarders to send data to peer nodes in a cluster. It is recommended that you use load-balanced forwarders for that purpose. This diagram shows two load-balanced forwarders sending data to a cluster:

To learn more about forwarders and clusters, read "Use forwarders to get your data" in the Managing Indexers and Clusters Manual. To learn more about clusters in general, read "About clusters and index replication".

Forwarding to non-Splunk systems


You can send raw data to a third-party system such as a syslog aggregator. You can combine this with data routing, sending some data to a non-Splunk system and other data to one or more Splunk servers. Here, three forwarders are routing data to two Splunk servers and a non-Splunk system:


For more information on forwarding to non-Splunk systems, read "Forward data to third-party systems".

Intermediate forwarding
To handle some advanced use cases, you might want to insert an intermediate forwarder between a group of forwarders and the indexer. In this type of scenario, the end-point forwarders send data to a consolidating forwarder, which then forwards the data on to an indexer, usually after indexing it locally.

Typical use cases are situations where you need an intermediate index, either for "store-and-forward" requirements or to enable localized searching. (In this case, you would need to use a heavy forwarder.) You can also use an intermediate forwarder if you have some need to limit access to the indexer machine; for instance, for security reasons.

To enable intermediate forwarding, you need to configure the forwarder as both a forwarder and a receiver. For information on how to configure a receiver, read "Enable a receiver".


Configure forwarding
Set up forwarding and receiving
Once you've determined your Splunk forwarder deployment topology and what type of forwarder is necessary to implement it, the steps for setting up forwarding and receiving are straightforward. This topic outlines the key steps and provides links to the detailed topics.

To set up forwarding and receiving, you need to perform two basic actions, in this order:

1. Set up one or more Splunk indexers as receivers. These will receive the data from the forwarders.

2. Set up one or more Splunk forwarders. These will forward data to the receivers.

The remainder of this topic lists the key steps involved, with links to more detailed topics. The procedures vary somewhat according to whether the forwarder is a universal forwarder or a heavy/light forwarder. Universal forwarders can sometimes be installed and configured in a single step. Heavy/light forwarders are first installed as full Splunk instances and then configured as forwarders.

Note: This topic assumes that your receivers are indexers. However, in some scenarios, discussed elsewhere, a forwarder also serves as receiver. The setup is basically the same for any kind of receiver.

Forwarders and clusters


When using forwarders to send data to peer nodes in a cluster, you set up forwarding and receiving a bit differently from the description in this topic. To learn more about forwarders and clusters, read "Use forwarders to get your data" in the Managing Indexers and Clusters Manual.

Set up forwarding and receiving: universal forwarders


1. Install the full Splunk instances that will serve as receivers. See the Installation Manual for details.

2. Use Splunk Web or the CLI to enable receiving on the instances designated as receivers. See "Enable a receiver" in this manual.

3. Install, configure, and deploy the universal forwarders. Depending on your forwarding needs, there are a number of best practices deployment scenarios. See "Universal forwarder deployment overview" for details. Some of these scenarios allow you to configure the forwarder during the installation process.

4. If you have not already done so during installation, you must specify data inputs for each universal forwarder. See "What Splunk can index" in the Getting Data In manual.

Note: Since the universal forwarder does not include Splunk Web, you must configure inputs through either the CLI or inputs.conf; you cannot configure with Splunk Manager.

5. If you have not already done so during installation, you must specify the universal forwarders' output configurations. You can do so through the CLI or by editing the outputs.conf file. You get the greatest flexibility by editing outputs.conf. For details, see the other topics in this section, including "Configure forwarders with outputs.conf". See the sketch after this list for the CLI approach.

6. Test the results to confirm that forwarding, along with any configured behaviors like load balancing or filtering, is occurring as expected.
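As a minimal sketch of steps 4 and 5 on a *nix universal forwarder, you could run commands like the following from $SPLUNK_HOME/bin/; the receiver address, port, file path, and credentials are placeholders:

# Point the forwarder at a receiving indexer (writes to outputs.conf)
./splunk add forward-server indexer.example.com:9997 -auth admin:changeme

# Add a file monitor input (writes to inputs.conf)
./splunk add monitor /var/log/messages -auth admin:changeme

# Restart the forwarder so the changes take effect
./splunk restart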

Set up forwarding and receiving: heavy or light forwarders


Note: The light forwarder has been deprecated in Splunk version 6.0. For a list of all deprecated features, see the topic "Deprecated features" in the Release Notes.

1. Install the full Splunk instances that will serve as forwarders and receivers. See the Installation Manual for details.

2. Use Splunk Web or the CLI to enable receiving on the instances designated as receivers. See "Enable a receiver" in this manual.

3. Use Splunk Web or the CLI to enable forwarding on the instances designated as forwarders. See "Deploy a heavy or light forwarder" in this manual.

4. Specify data inputs for the forwarders in the usual manner. See "What Splunk can index" in the Getting Data In manual.

5. Specify the forwarders' output configurations. You can do so through Splunk Manager, the CLI, or by editing the outputs.conf file. You get the greatest flexibility by editing outputs.conf. For details, see "Deploy a heavy or light forwarder", as well as the other topics in this section, including "Configure forwarders with outputs.conf".

6. Test the results to confirm that forwarding, along with any configured behaviors like load balancing or routing, is occurring as expected.

Manage your forwarders


In environments with multiple forwarders, you might find it helpful to use the deployment server to update and manage your forwarders. See "About deployment server" in this manual. To view the status of your forwarders, you can use the deployment monitor.

Enable a receiver
To enable forwarding and receiving, you configure both a receiver and a forwarder. The receiver is the Splunk instance receiving the data; the forwarder sends data to the receiver. Depending on your needs (for example to enable load balancing), you might have multiple receivers for each forwarder. Conversely, a single receiver usually receives data from many forwarders. The receiver is either a Splunk indexer (the typical case) or another forwarder (referred to as an "intermediate forwarder") configured to receive data from forwarders. You must set up the receiver first. You can then set up forwarders to send data to that receiver.

Compatibility between forwarders and indexers


These are the compatibility restrictions between versions of forwarders and indexers:

4.2+/5.0+ forwarders (universal/light/heavy) are backwards compatible down to 4.2+ indexers. For example, a 4.3 forwarder can send data to a 4.2 indexer but not to a 4.1 indexer.

Pre-4.2 forwarders are backwards compatible down to 4.0 indexers.

All indexers are backwards compatible with any forwarder and can receive data from any earlier version forwarder. For example, a 4.2 indexer can receive data from a 4.1 forwarder.

For each app, check Splunkbase for version compatibility.

Note: Splunk recommends that the indexer version should be the same or newer than the version of the forwarders sending to it. Although we strive to ensure backward compatibility, it is not always possible.

Set up receiving
Before enabling a Splunk instance (either an indexer or a forwarder) as a receiver you must, of course, first install it. You can then enable receiving on a Splunk instance through Splunk Web, the CLI, or the inputs.conf configuration file.

Set up receiving with Splunk Web

Use Splunk Manager to set up a receiver:

1. Log into Splunk Web as admin on the server that will be receiving data from a forwarder.

2. Click the Manager link in the upper right corner.

3. Select Forwarding and receiving in the Data area.

4. Click Add new in the Receive data section.

5. Specify which TCP port you want the receiver to listen on (the listening port, also known as the receiving port). For example, if you enter "9997," the receiver will receive data on port 9997. By convention, receivers listen on port 9997, but you can specify any unused port. You can use a tool like netstat to determine what ports are available on your system. Make sure the port you select is not in use by splunkweb or splunkd.

6. Click Save. You must restart Splunk to complete the process.

Set up receiving with Splunk CLI

To access the CLI, first navigate to $SPLUNK_HOME/bin/. This is unnecessary if you have added Splunk to your path. To enable receiving, enter:

./splunk enable listen <port> -auth <username>:<password>

For <port>, substitute the port you want the receiver to listen on (the receiving port). For example, if you enter "9997," the receiver will receive data on port 9997. By convention, receivers listen on port 9997, but you can specify any unused port. You can use a tool like netstat to determine what ports are available on your system. Make sure the port you select is not in use by splunkweb or splunkd. To disable receiving, enter:

./splunk disable listen -port <port> -auth <username>:<password>

Set up receiving with the configuration file

You can enable receiving on your Splunk instance by configuring inputs.conf in $SPLUNK_HOME/etc/system/local. For most purposes, you just need to add a [splunktcp] stanza that specifies the receiving port. In this example, the receiving port is 9997:

[splunktcp://9997]

For further details, refer to the inputs.conf spec file. To configure a universal forwarder as an intermediate forwarder (a forwarder that functions also as a receiver), use this method.
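As a sketch of that intermediate-forwarder setup, the universal forwarder would carry both a receiving stanza in inputs.conf and a forwarding stanza in outputs.conf; the port and downstream indexer address below are placeholders:

# inputs.conf -- listen for data from the end-point forwarders
[splunktcp://9997]

# outputs.conf -- forward the received data on to the indexer
[tcpout]
defaultGroup = downstream_indexers

[tcpout:downstream_indexers]
server = indexer.example.com:9997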

Searching data received from a forwarder running on a different operating system


In most cases, a Splunk instance receiving data from a forwarder on a different OS will need to install the app for that OS. However, there are numerous subtleties that affect this; read on for the details.

Forwarding and indexing are OS-independent operations. Splunk supports any combination of forwarders and receivers, as long as each is running on a certified OS. For example, a Linux receiver can index data from a Windows universal forwarder.

Once data has been forwarded and indexed, the next step is to search or perform other knowledge-based activities on the data. At this point, the Splunk instance performing such activities might need information about the OS whose data it is examining. You typically handle this by installing the app specific to that OS. For example, if you want a Linux Splunk instance to search OS-specific data forwarded from Windows, you will ordinarily want to install the Windows app on the Linux instance.

If the data you're interested in is not OS-specific, such as web logs, then you do not need to install the Splunk OS app. In addition, if the receiver is only indexing the data, and an external search head is performing the actual searches, you do not need to install the OS app on the receiver, but you might need to install it on the search head.

As an alternative, you can use a search head running the OS. For example, to search data forwarded from Windows to a Linux receiver, you can use a Windows search head pointing to the Linux indexer as a remote search peer. For more information on search heads, see "Set up distributed search".

Important: After you have downloaded the relevant OS app, remove its inputs.conf file before enabling the app, to ensure that its default inputs are not added to your indexer. For the Windows app, the location is:
%SPLUNK_HOME%\etc\apps\windows\default\inputs.conf.

In summary, you only need to install the app for the forwarder's OS on the receiver (or search head) if it will be performing searches on the forwarded OS data.

Troubleshoot forwarder to receiver connectivity


Confusing the receiver's receiving and management ports

As part of setting up a forwarder, you specify the receiver's hostname/IP_address and port. The forwarder uses these to send data to the receiver. Be sure to specify the port that was designated as the receiving port at the time the receiver was configured. If you mistakenly specify the receiver's management port, the receiver will generate an error similar to this:


splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error for fd from HOST:localhost.localdomain, IP:127.0.0.1, PORT:53075
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error for fd from HOST:localhost.localdomain, IP:127.0.0.1, PORT:53076
splunkd.log:03-01-2010 13:35:28.653 ERROR TcpInputFd - SSL Error = error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - SSL Error for fd from HOST:localhost.localdomain, IP:127.0.0.1, PORT:53077
splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - SSL Error = error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
splunkd.log:03-01-2010 13:35:28.654 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0

Closed receiver socket

If a receiving indexer's queues become full, it will close the receiver socket, to prevent additional forwarders from connecting to it. If a forwarder with load-balancing enabled can no longer forward to that receiver, it will send its data to another indexer on its list. If the forwarder does not employ load-balancing, it will hold the data until the problem is resolved. The receiver socket will reopen automatically when the queue gets unclogged.

Typically, a receiver gets behind on the dataflow because it can no longer write data due to a full disk or because it is itself attempting to forward data to another Splunk instance that is not accepting data.

The following warning message will appear in splunkd.log if the socket gets blocked:

Stopping all listening ports. Queues blocked for more than N seconds.

This message will appear when the socket reopens:

Started listening on tcp ports. Queues unblocked.


Answers
Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has around configuring forwarding.

Configure forwarders with outputs.conf


The outputs.conf file defines how forwarders send data to receivers. You can specify some output configurations at installation time (Windows universal forwarders only) or through Splunk Web (heavy/light forwarders only) or the CLI, but most advanced configuration settings require that you directly edit outputs.conf. The topics describing various topologies, such as load balancing and data routing, provide detailed examples on configuring outputs.conf to support those topologies.

Important: Although outputs.conf is a critical file for configuring forwarders, it specifically addresses the outputs from the forwarder. To specify the inputs to a forwarder, you must separately configure the inputs, as you would for any Splunk instance. For details on configuring inputs, see "Add data and configure inputs" in the Getting Data In manual.

Types of outputs.conf files


A single forwarder can have multiple outputs.conf files (for instance, one located in an apps directory and another in /system/local). No matter how many outputs.conf files the forwarder has and where they reside, the forwarder combines all their settings, using the rules of location precedence, as described in "Configuration file precedence". Your installation will contain both default and custom outputs.conf files.

Default versions

Splunk ships with these default versions of outputs.conf:

On the universal forwarder: The universal forwarder has two default outputs.conf files, one in $SPLUNK_HOME/etc/system/default and the other in $SPLUNK_HOME/etc/apps/SplunkUniversalForwarder/default. The default version in the SplunkUniversalForwarder app has precedence over the version under /etc/system.

On heavy and light forwarders: These have a single default outputs.conf file, located in $SPLUNK_HOME/etc/system/default.

Important: Do not touch default versions of any configuration files, for reasons explained in "About configuration files".

Custom versions

When you configure forwarding behavior, those changes get saved in custom versions of outputs.conf. There are several ways you can specify forwarding behavior:

While installing the forwarder (Windows universal forwarder only)
By running CLI commands
By using Splunk Manager (heavy/light forwarders only)
By directly editing an outputs.conf file

Splunk automatically creates or edits custom versions of outputs.conf in response to the first three methods. The locations of those versions vary, depending on the type of forwarder and other factors:

The universal forwarder. If you use the CLI to make changes to universal forwarder output behavior, it creates or edits a copy of outputs.conf in $SPLUNK_HOME/etc/system/local. However, the Windows installation process writes configuration changes to an outputs.conf file located in the MSICreated app. For more information on configuring the universal forwarder, look here.

Heavy and light forwarders. When you enable a heavy/light forwarder through Splunk Web or the CLI, Splunk creates an outputs.conf file in the directory of the currently running app. For example, if you're working in the search app, Splunk places the file in $SPLUNK_HOME/etc/apps/search/local/. You can then edit it there.

In addition to any outputs.conf files that you create and edit indirectly (for example, through the CLI), you can also create or edit an outputs.conf file directly. It is recommended that you work with just a single copy of the file, which you place in $SPLUNK_HOME/etc/system/local/. (If a copy of the file already exists in that directory, because of configuration changes made through the CLI, just edit that copy.) For purposes of distribution and management simplicity, you can combine settings from all non-default versions into a single custom outputs.conf file.

After making changes to outputs.conf, you must restart the forwarder for the changes to take effect.
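To see how the forwarder is actually merging its multiple outputs.conf files, you can use the btool utility. A minimal sketch, run from $SPLUNK_HOME/bin/:

# Show the merged outputs.conf settings, with the file each setting comes from
./splunk cmd btool outputs list --debug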


For detailed information on outputs.conf, look here for the spec and examples.

Configuration levels
There are two types of output processors: tcpout and syslog. You can configure them at three levels of stanzas:

Global. At the global level, you specify any attributes that you want to apply globally, as well as certain attributes only configurable at the system-wide level for the output processor. This stanza is optional.

Target group. A target group defines settings for one or more receiving indexers. There can be multiple target groups per output processor. Most configuration settings can be specified at the target group level.

Single server. You can specify configuration values for single servers (receivers) within a target group. This stanza type is optional.

Configurations at the more specific level take precedence. For example, if you specify compressed=true for a target group, the forwarder will send the servers in that target group compressed data, even if compressed is set to "false" for the global level.

Note: This discussion focuses on the tcpout processor, which uses the [tcpout] header. For the syslog output processor, see "Forward data to third-party systems" for details.

Global stanza

Here you set any attributes that you want to apply globally. This stanza is not required. However, there are several attributes that you can set only at the global level, including defaultGroup and indexAndForward. The global stanza for the tcpout processor is specified with the [tcpout] header. Here's an example of a global tcpout stanza:

[tcpout]
defaultGroup=indexer1
indexAndForward=true

This global stanza includes two attribute/value pairs:

defaultGroup=indexer1 This tells the forwarder to send all data to the "indexer1" target group. See "Default target groups" for more information.

indexAndForward=true This tells the forwarder to index the data locally, as well as forward the data to receiving indexers in the target groups. If set to "false" (the default), the forwarder just forwards data but does not index it. This attribute is only available for heavy forwarders; universal and light forwarders cannot index data.
Default target groups

To set default groups for automatic forwarding, include the defaultGroup attribute at the global level, in your [tcpout] stanza:

[tcpout]
defaultGroup= <target_group1>, <target_group2>, ...

The defaultGroup specifies one or more target groups, defined later in tcpout:<target_group> stanzas. The forwarder will send all events to the specified groups. If you do not want to forward data automatically, don't set the defaultGroup attribute. (Prior to 4.2, you were required to set the defaultGroup to some value. This is no longer necessary.)

For some examples of using the defaultGroup attribute, see "Route and filter data".

Target group stanza

The target group identifies a set of receivers. It also specifies how the forwarder sends data to those receivers. You can define multiple target groups. Here's the basic pattern for the target group stanza:

[tcpout:<target_group>]
server=<receiving_server1>, <receiving_server2>, ...
<attribute1> = <val1>
<attribute2> = <val2>
...

To specify a receiving server in a target group, use the format <ipaddress_or_servername>:<port>, where <port> is the receiving server's receiving port. For example, myhost.Splunk.com:9997. You can specify multiple receivers and the forwarder will load balance among them.
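Because settings made at the target group level override the same settings made at the global level (as described under "Configuration levels" above), you can also use a target group stanza to override a global attribute for one group of receivers. A minimal sketch with placeholder addresses, using the compressed attribute:

[tcpout]
compressed = false

[tcpout:compressed_group]
server = indexerA.example.com:9997
# Overrides the global setting for this target group only
compressed = true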

Note: Starting with Splunk version 4.3, you can use an IPv6 address when specifying the receiving indexer. For more information, see "Configure Splunk for IPv6" in the Admin manual.

See "Define typical deployment topologies", later in this topic, for information on how to use the target group stanza to define several deployment topologies.

Single-server stanza

You can define a specific configuration for an individual receiving indexer. However, the receiver must also be a member of a target group. When you define an attribute at the single-server level, it takes precedence over any definition at the target group or global level. Here is the syntax for defining a single-server stanza:

[tcpout-server://<ipaddress_or_servername>:<port>]
<attribute1> = <val1>
<attribute2> = <val2>
...

Example

The following outputs.conf example contains three stanzas for sending tcpout to Splunk receivers:

Global settings. In this example, there is one setting, to specify a defaultGroup.

Settings for a single target group consisting of two receivers. Here, we are specifying a load-balanced target group consisting of two receivers.

Settings for one receiver within the target group. In this stanza, you can specify any settings specific to the mysplunk_indexer1 receiver.

[tcpout]
defaultGroup=my_indexers

[tcpout:my_indexers]
server=mysplunk_indexer1:9997, mysplunk_indexer2:9996

[tcpout-server://mysplunk_indexer1:9997]


Define typical deployment topologies


This section shows how you can configure a forwarder to support several typical deployment topologies. See the other topics in the "Forward data" section of this book for information on configuring forwarders for other topologies.

Load balancing

To perform load balancing, specify one target group with multiple receivers. In this example, the target group consists of three receivers:

[tcpout:my_LB_indexers]
server=10.10.10.1:9997,10.10.10.2:9996,10.10.10.3:9995

The forwarder will load balance between the three receivers listed. If one receiver goes down, the forwarder automatically switches to the next one available.

Data cloning

To perform data cloning, specify multiple target groups, each in its own stanza. In data cloning, the forwarder sends copies of all its events to the receivers in two or more target groups. Data cloning usually results in similar, but not necessarily exact, copies of data on the receiving indexers. Here's an example of how you set up data cloning:

[tcpout]
defaultGroup=indexer1,indexer2

[tcpout:indexer1]
server=10.1.1.197:9997

[tcpout:indexer2]
server=10.1.1.200:9997

The forwarder will send duplicate data streams to the servers specified in both the indexer1 and indexer2 target groups.

Data cloning with load balancing

You can combine load balancing with data cloning. For example:

[tcpout]
defaultGroup=cloned_group1,cloned_group2

[tcpout:cloned_group1]
server=10.10.10.1:9997, 10.10.10.2:9997, 10.10.10.3:9997

[tcpout:cloned_group2]
server=10.1.1.197:9997, 10.1.1.198:9997, 10.1.1.199:9997, 10.1.1.200:9997

The forwarder will send full data streams to both cloned_group1 and cloned_group2. The data will be load-balanced within each group, rotating among receivers every 30 seconds (the default frequency). Note: For syslog and other output types, you must explicitly specify routing as described here: "Route and filter data".

Commonly used attributes


The outputs.conf file provides a large number of configuration options that offer considerable control and flexibility in forwarding. Of the attributes available, several are of particular interest:

defaultGroup
Default: n/a. Where configured: global stanza.
A comma-separated list of one or more target groups. The forwarder sends all events to all specified target groups. Don't set this attribute if you don't want events automatically forwarded to a target group.

indexAndForward
Default: false. Where configured: global stanza.
If set to "true", the forwarder will index all data locally, in addition to forwarding the data to a receiving indexer. Important: This attribute is only available for heavy forwarders. A universal forwarder cannot index locally.

server
Default: n/a. Where configured: target group stanza.
Required. Specifies the server(s) that will function as receivers for the forwarder. This must be set to a value using the format <ipaddress_or_servername>:<port>, where <port> is the receiving server's receiving port. Note: Starting with Splunk version 4.3, you can use an IPv6 address when specifying the receiving indexer. For more information, see "Configure Splunk for IPv6" in the Admin manual.

disabled
Default: false. Where configured: any stanza level.
Specifies whether the stanza is disabled. If set to "true", it is equivalent to the stanza not being there.

sendCookedData
Default: true. Where configured: global or target group stanza.
Specifies whether data is cooked before forwarding.

compressed
Default: false. Where configured: global or target group stanza.
Specifies whether the forwarder sends compressed data.

ssl....
Default: n/a. Where configured: any stanza level.
Set of attributes for configuring SSL. See "About securing data from forwarders" in the Securing Splunk manual for information on how to use these attributes.

useACK
Default: false. Where configured: global or target group stanza.
Specifies whether the forwarder waits for indexer acknowledgment confirming that the data has been written to the file system. See "Protect against loss of in-flight data".

dnsResolutionInterval
Default: 300. Where configured: global or target group stanza.
Specifies the base time interval in seconds at which indexer DNS names will be resolved to IP addresses. See "DNS resolution interval".

The outputs.conf.spec file, which you can find here, along with several examples, provides details for these and all other configuration options. In addition, most of these settings are discussed in topics dealing with specific forwarding scenarios.

Note: In 4.2, the persistent queue capability was much improved. It is now a feature of data inputs and is therefore configured in inputs.conf. It is not related in any way to the previous, deprecated persistent queue capability, which was configured through outputs.conf. See "Use persistent queues to prevent data loss" for details.

DNS resolution interval

The dnsResolutionInterval attribute specifies the base time interval (in seconds) at which receiver DNS names will be resolved to IP addresses. This value is used to compute the run-time interval as follows:


run-time interval = dnsResolutionInterval + (number of receivers in server attribute - 1) * 30

The run-time interval is extended by 30 seconds for each additional receiver specified in the server attribute; that is, for each additional receiver across which the forwarder is load balancing. The dnsResolutionInterval attribute defaults to 300 seconds. For example, if you leave the attribute at the default setting of 300 seconds and the forwarder is load-balancing across 20 indexers, DNS resolution will occur every 14.5 minutes:

(300 + ((20 - 1) * 30)) = 870 seconds = 14.5 minutes

If you change dnsResolutionInterval to 600 seconds, and keep the number of load-balanced indexers at 20, DNS resolution will occur every 19.5 minutes:

(600 + ((20 - 1) * 30)) = 1170 seconds = 19.5 minutes
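Pulling several of the commonly used attributes together, here is a minimal sketch of a custom outputs.conf; the group name and receiver addresses are placeholders:

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997
# Wait for indexer acknowledgment before releasing blocks from memory
useACK = true
# Compress the data stream between forwarder and receivers
compressed = true
# Re-resolve the receiver DNS names every 600 seconds (base interval)
dnsResolutionInterval = 600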

Protect against loss of in-flight data


To guard against loss of data when forwarding to an indexer, you can use Splunk's indexer acknowledgment capability. With indexer acknowledgment, the forwarder will resend any data not acknowledged as "received" by the indexer. This feature is disabled by default, because it can affect performance. You enable it in outputs.conf, as described later. Indexer acknowledgment is available for all varieties of forwarders: universal, light, or heavy.

Note: Both forwarders and indexers must be at version 4.2 or higher for acknowledgment to function. Otherwise, the transmission between forwarder and indexer will proceed without acknowledgment.

Indexer acknowledgment and clusters


When using forwarders to send data to peer nodes in a cluster, you must enable indexer acknowledgment. To learn more about forwarders and clusters, read "Use forwarders to get your data" in the Managing Indexers and Clusters Manual.

How indexer acknowledgment works when everything goes well


The forwarder sends data continuously to the indexer, in blocks of approximately 64kB. The forwarder maintains a copy of each block in memory, in its wait queue, until it gets an acknowledgment from the indexer. While waiting, it continues to send more data blocks.

If all goes well, the indexer:

1. Receives the block of data.

2. Parses the data.

3. Writes the data to the file system as events (raw data and index data).

4. Sends an acknowledgment to the forwarder.

The acknowledgment tells the forwarder that the indexer received the data and successfully wrote it to the file system. Upon receiving the acknowledgment, the forwarder releases the block from memory.

If the wait queue is of sufficient size, it doesn't fill up while waiting for acknowledgments to arrive. But see this section for possible issues and ways to address them, including how to increase the wait queue size.

How indexer acknowledgment works when there's a failure


When there's a failure in the round-trip process, the forwarder does not receive an acknowledgment. It will then attempt to resend the block of data.

Why no acknowledgment?

These are the reasons that a forwarder might not receive acknowledgment:

Indexer goes down after receiving the data -- for instance, due to machine failure.

Indexer is unable to write to the file system -- for instance, because the disk is full.

Network goes down while acknowledgment is en route to the forwarder.

How the forwarder deals with failure

After sending a data block, the forwarder maintains a copy of the data in its wait queue until it receives an acknowledgment. In the meantime, it continues to send additional blocks as usual. If the forwarder doesn't get acknowledgment for a block within 300 seconds (by default), it closes the connection. You can change the wait time by setting the readTimeout attribute in outputs.conf.

If the forwarder is set up for auto load balancing, it then opens a connection to the next indexer in the group (if one is available) and sends the data to it. If the forwarder is not set up for auto load balancing, it attempts to open a connection to the same indexer as before and resend the data.

The forwarder maintains the data block in the wait queue until acknowledgment is received. Once the wait queue fills up, the forwarder stops sending additional blocks until it receives an acknowledgment for one of the blocks, at which point it can free up space in the queue.

Other reasons the forwarder might close a connection

There are actually three conditions that can cause the forwarder to close the network connection:

Read timeout. The forwarder doesn't receive acknowledgment within 300 (default) seconds. This is the condition described above.

Write timeout. The forwarder is not able to finish a network write within 300 (default) seconds. The value is configurable in outputs.conf by setting writeTimeout.

Read/write failure. Typical causes include the indexer's machine crashing or the network going down.

In all these cases, the forwarder will then attempt to open a connection to the next indexer in the load-balanced group, or to the same indexer again if load-balancing is not enabled.

The possibility of duplicates

It's possible for the indexer to index the same data block twice. This can happen if there's a network problem that prevents an acknowledgment from reaching the forwarder. For instance, assume the indexer receives a data block, parses it, and writes it to the file system. It then generates the acknowledgment. However, on the round-trip to the forwarder, the network goes down, so the forwarder never receives the acknowledgment. When the network comes back up, the forwarder then resends the data block, which the indexer will parse and write as if it were new data.

To deal with such a possibility, every time the forwarder resends a data block, it writes an event to its splunkd.log noting that it's a possible duplicate. The admin is responsible for using the log information to track down the duplicate data on the indexer. Here's an example of a duplicate warning:

10-18-2010 17:32:36.941 WARN TcpOutputProc - Possible duplication of events with channel=source::/home/jkerai/splunk/current-install/etc/apps/sample_app /logs/maillog.1|host::MrT|sendmail|, streamId=5941229245963076846, offset=131072 subOffset=219 on host=10.1.42.2:9992

Enable indexer acknowledgment


You enable indexer acknowledgment solely on the forwarder. You do not set any attribute on the indexer side; it will send acknowledgments if the forwarder tells it to. (But remember, both the forwarder and the indexer must be at version 4.2 or greater.) To enable indexer acknowledgment, set the useACK attribute to true in the forwarder's outputs.conf:

[tcpout:<target_group>]
server=<server1>, <server2>, ...
useACK=true
...

A value of useACK=true enables indexer acknowledgment. By default, this feature is disabled: useACK=false.

Note: You can set useACK either globally or by target group, at the [tcpout] or [tcpout:<target_group>] stanza levels. You cannot set it for individual servers at the [tcpout-server: ...] stanza level.


Indexer acknowledgment influence on forwarded data throughput


Indexer acknowledgment can limit and/or reduce forwarder throughput in some scenarios. Here, we describe how this can occur and steps that you should take to avoid this.

If you have enabled indexer acknowledgment on the forwarder through useACK and the receiving indexer is at version 4.2+, the forwarder will use a wait queue to manage the acknowledgment process. Otherwise, it won't have a wait queue. This section describes how to manage the wait queue for performance.

Because the forwarder sends data blocks continuously and does not wait for acknowledgment before sending the next block, its wait queue will typically maintain many blocks, each waiting for its acknowledgment. The forwarder will continue to send blocks until its wait queue is full, at which point it will stop forwarding. The forwarder then waits until it receives an acknowledgment, which allows it to release a block from its queue and thus resume forwarding.

A wait queue can fill up when something is wrong with the network or indexer; however, it can also fill up even though the indexer is functioning normally. This is because the indexer only sends the acknowledgment after it has written the data to the file system. Any delay in writing to the file system will slow the pace of acknowledgment, leading to a full wait queue. There are a few reasons that a normal functioning indexer might delay writing data to the file system (and so delay its sending of acknowledgments):

The indexer is very busy. For example, at the time the data arrives, the indexer might be dealing with multiple search requests or with data coming from a large number of forwarders.

The indexer is receiving too little data. For efficiency, an indexer only writes to the file system periodically -- either when a write queue fills up or after a timeout of a few seconds. If a write queue is slow to fill up, the indexer will wait until the timeout to write. If data is coming from only a few forwarders, the indexer can end up in the timeout condition, even if each of those forwarders is sending a normal quantity of data. Since write queues exist on a per hot bucket basis, the condition occurs when some particular bucket is getting a small amount of data. Usually this means that a particular index is getting a small amount of data.

To ensure that throughput does not degrade because the forwarder is waiting on the indexer for acknowledgment, you might need to use a larger wait queue size, ensuring it has sufficient space to maintain all blocks in memory while waiting for acknowledgments to arrive. You'll need to experiment with the queue size that's right for your forwarder's specific environment. On the other hand, if you have many forwarders feeding a single indexer, and a moderate number of data sources per forwarder, you may be able to conserve a few megabytes of memory by using a smaller size.

Note: You cannot configure the size of the wait queue directly. Its size is always relative to the size of the in-memory output queue, as described below.

Configure the wait queue size

Important: To optimize the wait queue size for indexer acknowledgment, Splunk recommends that you increase the maxQueueSize attribute.

The maximum wait queue size is 3x the size of the in-memory output queue, which you set with the maxQueueSize attribute in outputs.conf:

maxQueueSize = [<integer>|<integer>[KB|MB|GB]]

For example, if you set maxQueueSize to 7MB, the maximum wait queue size will be 21MB.

Note the following:

This attribute sets the maximum size of the forwarder's in-memory (RAM) output queue. It also determines the maximum size of the wait queue, which is 3x the setting for the output queue.

If specified as a lone integer (for example, maxQueueSize=100), it determines the maximum number of queued events (for parsed data) or blocks of data (for unparsed data). A block of data is approximately 64KB. For forwarders sending unparsed data (mainly universal forwarders), maxQueueSize is the maximum number of data blocks. For heavy forwarders sending parsed data, maxQueueSize is the maximum number of events. Since events are typically much shorter than data blocks, the memory consumed by the output and wait queues on a parsing forwarder will likely be much smaller than on a non-parsing forwarder, if you use this version of the setting.

If specified as an integer followed by KB, MB, or GB (for example, maxQueueSize=100MB), it determines the maximum RAM allocated to the output queue and, indirectly, to the wait queue. If configured as maxQueueSize=100MB, the maximum size of the output queue will be 100MB and the maximum size of the wait queue, if any, will be 300MB.

maxQueueSize defaults to 500KB. The default wait queue size is 3x that amount: 1500KB.

Although the wait queue and the output queues are configured by the same attribute, they are separate queues.

Important: To ensure that the forwarder does not get blocked while waiting for acknowledgment from an indexer, you should set the maxQueueSize to 7MB:

[tcpout]
maxQueueSize = 7MB

Note the following points regarding this recommendation:

This assumes that no thruput limit has been set in the forwarder's limits.conf file.

This configuration will cause memory consumption to go up by about 28MB.

Important: If you're enabling indexer acknowledgment, be careful to take into account your system's available memory when setting maxQueueSize. You'll need to accommodate 4x the maxQueueSize setting (1x for the output queue + 3x for the wait queue).

When the receiver is a forwarder, not an indexer


You can also use indexer acknowledgment when the receiving instance is an intermediate forwarder, instead of an indexer. Assume you have an originating forwarder that sends data to an intermediate forwarder, which in turn forwards that data to an indexer. There are two main possibilities to consider:

The originating forwarder and the intermediate forwarder both have acknowledgment enabled. In this case, the intermediate forwarder waits until it receives acknowledgment from the indexer and then sends acknowledgment back to the originating forwarder.

The originating forwarder has acknowledgment enabled; the intermediate forwarder does not. In this case, the intermediate forwarder sends acknowledgment back to the originating forwarder as soon as it sends the data on to the indexer. It relies on TCP to safely deliver the data to the indexer. Because it doesn't itself have useACK enabled, the intermediate forwarder cannot verify delivery of the data to the indexer. This use case has limited value and is not recommended.

If you use indexer acknowledgment, you should generally enable it on all forwarding tiers. That is the only way to ensure that data gets delivered all the way from the originating forwarder to the indexer.
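As a sketch of the recommended approach (acknowledgment enabled on every tier), both the originating forwarder and the intermediate forwarder would set useACK in their outputs.conf target groups; the host names and ports below are placeholders:

# outputs.conf on the originating forwarder
[tcpout:intermediate_tier]
server = intermediate.example.com:9997
useACK = true

# outputs.conf on the intermediate forwarder
[tcpout:indexer_tier]
server = indexer.example.com:9997
useACK = true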


Use the forwarder to create deployment topologies


Consolidate data from multiple machines
One of the most common forwarding use cases is to consolidate data originating across numerous machines. Forwarders located on the machines forward the data to a central Splunk indexer. With their small footprint, universal forwarders ordinarily have little impact on their machines' performance. This diagram illustrates a common scenario, where universal forwarders residing on machines running diverse operating systems send data to a single Splunk instance, which indexes and provides search capabilities across all the data:

The diagram illustrates a small deployment. In practice, the number of universal forwarders in a data consolidation use case could number upwards into the thousands. This type of use case is simple to configure:

1. Determine what data, originating from which machines, you need to access.

2. Install a Splunk instance, typically on its own server. This instance will function as the receiver. All indexing and searching will occur on it.

3. Enable the instance as a receiver through Splunk Web or the CLI. Using the CLI, enter this command from $SPLUNK_HOME/bin/:

./splunk enable listen <port> -auth <username>:<password>

For <port>, substitute the port you want the receiver to listen on. This is also known as the "receiver port".

4. If any of the universal forwarders will be running on a different operating system from the receiver, install the app for the forwarder's OS on the receiver. For example, assume the receiver in the diagram above is running on a Linux box. In that case, you'll need to install the Windows app on the receiver. You might need to install the *nix app as well; however, since the receiver is on Linux, you probably have already installed that app. Details and provisos regarding this can be found here. After you have downloaded the relevant app, remove its inputs.conf file before enabling it, to ensure that its default inputs are not added to your indexer. For the Windows app, the location is:
$SPLUNK_HOME/etc/apps/windows/default/inputs.conf.

5. Install universal forwarders on each machine that will be generating data. These will forward the data to the receiver.

6. Set up inputs for each forwarder. See "What Splunk can index".

7. Configure each forwarder to forward data to the receiver. For Windows forwarders, you can do this at installation time, as described here. For *nix forwarders, you must do this through the CLI:

./splunk add forward-server <host>:<port> -auth <username>:<password>

For <host>:<port>, substitute the host and receiver port number of the receiver. For example, splunk_indexer.acme.com:9995. Alternatively, if you have many forwarders, you can use an outputs.conf file to specify the receiver. For example:

[tcpout:my_indexers]
server=splunk_indexer.acme.com:9995

You can create this file once, then distribute copies of it to each forwarder.

Set up load balancing


With load balancing, a Splunk forwarder distributes data across several receiving Splunk instances. Each receiver gets a portion of the total data, and together the receivers hold all the data. To access the full set of forwarded data, you need to set up distributed searching across all the receivers. For information on distributed search, see "About distributed search" in this manual.

Load balancing enables horizontal scaling for improved performance. In addition, its automatic switchover capability ensures resiliency in the face of machine outages. If a machine goes down, the forwarder simply begins sending data to the next available receiver.

Load balancing can also be of use when getting data from network devices like routers. To handle syslog and other data generated across port 514, a single heavy forwarder can monitor port 514 and distribute the incoming data across several Splunk indexers.

Note: When implementing load balancing between forwarders and receivers, you must use the forwarder's inherent capability. Do not use an external load balancer. The use of external load balancers between forwarders and receivers will not work properly.
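As a sketch of the syslog scenario above, the heavy forwarder could listen for syslog traffic on UDP port 514 in inputs.conf and load balance the output across several indexers in outputs.conf; the source type, group name, and indexer addresses are placeholders:

# inputs.conf on the heavy forwarder
[udp://514]
sourcetype = syslog

# outputs.conf on the heavy forwarder
[tcpout]
defaultGroup = syslog_indexers

[tcpout:syslog_indexers]
server = 10.10.10.1:9997, 10.10.10.2:9997, 10.10.10.3:9997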

How load balancing works


Splunk forwarders perform "automatic load balancing". The forwarder routes data to different indexers based on a specified time interval. For example, assume you have a load-balanced group consisting of three indexers: A, B, and C. At some specified interval, such as every 30 seconds, the forwarder switches the data stream to another indexer in the group, selected at random. So, the forwarder might switch from indexer B to indexer A to indexer C, and so on. If one indexer is down, the forwarder immediately switches to another.

To expand on this a bit, there is a data stream for each of the inputs that the forwarder is configured to monitor. The forwarder determines if it is safe for a data stream to switch to another indexer. Then, at the specified interval, it switches the data stream to the newly selected indexer. If it cannot switch the data stream to the new indexer safely, it keeps the connection to the previous indexer open and continues to send the data stream until it has been safely sent.

Important: Universal forwarders are not able to switch indexers when monitoring TCP network streams of data (including syslog) unless an EOF is reached or an indexer goes down, at which point the forwarder will switch to the next indexer in the list. Because the universal forwarder does not parse the data and identify event boundaries before forwarding the data to the indexer (unlike a heavy forwarder), it has no way of knowing when it's safe to switch to the next indexer unless it receives an EOF.

Note: Round-robin load balancing, which was previously available as an alternative to automatic load balancing, was deprecated in Splunk version 4.2.

This diagram shows a distributed search scenario, in which three forwarders are performing load balancing across three receivers:

Targets for load balancing


When configuring the set of target receivers, you can employ either DNS or static lists.

DNS lists provide greater flexibility and simplified scale-up, particularly for large deployments. Through DNS, you can change the set of receivers without needing to re-edit each forwarder's outputs.conf file.

The main advantage of a static list is that it allows you to specify a different port for each receiver. This is useful if you need to perform load balancing across multiple receivers running on a single host. Each receiver can listen on a separate port.

Static list target

To use a static list for the target, you simply specify each of the receivers in the target group's [tcpout] stanza in the forwarder's outputs.conf file. In this example, the target group consists of three receivers, specified by IP address and receiver port number:

[tcpout:my_LB_indexers]
server=10.10.10.1:9997,10.10.10.2:9996,10.10.10.3:9995


The universal forwarder will load balance between the three receivers listed. If one receiver goes down, the forwarder automatically switches to another one on the list.

DNS list target

To use a DNS list, edit your forwarder's outputs.conf file to specify a single host in the target group's [tcpout] stanza. For example:

[tcpout:my_LB_indexers]
server=splunkreceiver.mycompany.com:9997

In your DNS server, create a DNS A record for each host's IP address, referencing the server name you specified in outputs.conf. For example:

splunkreceiver.mycompany.com    A    10.10.10.1
splunkreceiver.mycompany.com    A    10.10.10.2
splunkreceiver.mycompany.com    A    10.10.10.3

The forwarder will use the DNS list to load balance, sending data in intervals, switching among the receivers specified. If a receiver is not available, the forwarder skips it and sends data to another one on the list. If you have a topology with many forwarders, the DNS list method allows you to update the set of receivers by making changes in just a single location, without touching the forwarders' outputs.conf files.

Configure load balancing for horizontal scaling


To configure load balancing, first determine your needs, particularly your horizontal scaling and failover requirements. Then develop a topology based on those needs, possibly including multiple forwarders, as well as receivers and a search head to search across the receivers.

Assuming the topology of three universal forwarders and three receivers illustrated by the diagram at the start of this topic, set up load balancing with these steps:

1. Install and enable a set of three Splunk instances as receivers. This example uses a DNS list to designate the receivers, so they must all listen on the same port. For example, if the port is 9997, enable each receiver by going to its $SPLUNK_HOME/bin/ location and using this CLI command:

./splunk enable listen 9997 -auth <username>:<password>

2. Install the set of universal forwarders, as described here.

3. Set up a DNS list with an A record for each receiver's IP address:

splunkreceiver.mycompany.com    A    10.10.10.1
splunkreceiver.mycompany.com    A    10.10.10.2
splunkreceiver.mycompany.com    A    10.10.10.3

4. Create a single outputs.conf file for use by all the forwarders. This one specifies the DNS server name used in the DNS list and the port the receivers are listening on:

[tcpout]
defaultGroup=my_LB_indexers

[tcpout:my_LB_indexers]
disabled=false
autoLBFrequency=40
server=splunkreceiver.mycompany.com:9997

This outputs.conf file uses the autoLBFrequency attribute to set a load-balance frequency of 40 seconds. Every 40 seconds, the forwarders will switch to another receiver. The default frequency, which rarely needs changing, is 30 seconds.

5. Distribute the outputs.conf file to all the forwarders. You can use the deployment server to handle the distribution.

Specify load balancing from the CLI


You can also use the CLI to specify load balancing. You do this when you start forwarding activity to a set of receivers, using this syntax:

./splunk add forward-server <host>:<port> -method autobalance

where <host>:<port> is the host and receiver port of the receiver. This example creates a load-balanced group of four receivers:

./splunk add forward-server indexer1:9997 -method autobalance
./splunk add forward-server indexer2:9997 -method autobalance


./splunk add forward-server indexer3:9997 -method autobalance
./splunk add forward-server indexer4:9997 -method autobalance

Route and filter data


Forwarders can filter and route data to specific receivers based on criteria such as source, source type, or patterns in the events themselves. For example, a forwarder can send all data from one group of hosts to one Splunk server and all data from a second group of hosts to a second Splunk server. Heavy forwarders can also look inside the events and filter or route accordingly. For example, you could use a heavy forwarder to inspect WMI event codes to filter or route Windows events. This topic describes a number of typical routing scenarios.

Besides routing to receivers, forwarders can also filter and route data to specific queues, or discard the data altogether by routing to the null queue.

Important: Only heavy forwarders can route or filter data at the event level. Universal forwarders and light forwarders do not have the ability to inspect individual events, but they can still forward data based on a data stream's host, source, or source type. They can also route based on the data's input stanza, as described below, in the subtopic, "Route inputs to specific indexers based on the data's input".

Here's a simple illustration of a forwarder routing data to three Splunk receivers:

This topic describes how to filter and route event data to Splunk instances. See "Forward data to third-party systems" in this manual for information on routing to non-Splunk systems. This topic also describes how to perform selective indexing and forwarding, in which you index some data locally on a heavy forwarder and forward the non-indexed data to one or more separate indexers. See "Perform selective indexing and forwarding" later in this topic for details.
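As a sketch of the per-input routing mentioned above (routing based on the data's input stanza, which even universal and light forwarders can do), you can set _TCP_ROUTING in an inputs.conf stanza to direct that input to a particular target group defined in outputs.conf; the monitored path, group name, and receiver address below are placeholders:

# inputs.conf on the forwarder
[monitor:///var/log/secure]
_TCP_ROUTING = security_indexers

# outputs.conf on the forwarder
[tcpout:security_indexers]
server = 10.1.1.197:9997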

Configure routing
This is the basic pattern for defining most routing scenarios (using a heavy forwarder):
1. Determine what criteria to use for routing. How will you identify categories of events, and where will you route them?
2. Edit props.conf to add a TRANSFORMS-routing attribute to determine routing based on event metadata:

[<spec>]
TRANSFORMS-routing=<transforms_stanza_name>

<spec> can be:
<sourcetype>, the source type of an event
host::<host>, where <host> is the host for an event
source::<source>, where <source> is the source for an event
Use the <transforms_stanza_name> specified here when creating an entry in transforms.conf (below). Examples later in this topic show how to use this syntax.
3. Edit transforms.conf to specify target groups and to set additional criteria for routing based on event patterns:

[<transforms_stanza_name>]
REGEX=<routing_criteria>
DEST_KEY=_TCP_ROUTING
FORMAT=<target_group>,<target_group>,....

Note: <transforms_stanza_name> must match the name you defined in props.conf.
In <routing_criteria>, enter the regex rules that determine which events get routed. This line is required. Use "REGEX = ." if you don't need additional filtering beyond the metadata specified in props.conf.
DEST_KEY should be set to _TCP_ROUTING to send events via TCP. It can also be set to _SYSLOG_ROUTING or _HTTPOUT_ROUTING for other output processors.
Set FORMAT to a <target_group> that matches a target group name you defined in outputs.conf. A comma-separated list will clone events to multiple target groups.
Examples later in this topic show how to use this syntax.
4. Edit outputs.conf to define the target group(s) for the routed data:

[tcpout:<target_group>]
server=<ip>:<port>

Note: Set <target_group> to match the name you specified in transforms.conf. Set the IP address and port to match the receiving server. The use cases described in this topic generally follow this pattern.

Filter and route event data to target groups


In this example, a heavy forwarder filters three types of events, routing them to different target groups. The forwarder filters and routes according to these criteria:
Events with a source type of "syslog" to a load-balanced target group
Events containing the word "error" to a second target group
All other events to a default target group
Here's how you do it:
1. Edit props.conf in $SPLUNK_HOME/etc/system/local to set two TRANSFORMS-routing attributes, one for syslog data and a default for all other data:

[default]
TRANSFORMS-routing=errorRouting

[syslog]
TRANSFORMS-routing=syslogRouting

2. Edit transforms.conf to set the routing rules for each routing transform:

[errorRouting]
REGEX=error
DEST_KEY=_TCP_ROUTING
FORMAT=errorGroup

[syslogRouting]
REGEX=.
DEST_KEY=_TCP_ROUTING
FORMAT=syslogGroup

Note: In this example, if a syslog event contains the word "error", it will route to syslogGroup, not errorGroup. This is due to the settings previously specified in props.conf. Those settings dictated that all syslog events be filtered through the syslogRouting transform, while all non-syslog (default) events be filtered through the errorRouting transform. Therefore, only non-syslog events get inspected for errors.
3. Edit outputs.conf to define the target groups:

[tcpout]
defaultGroup=everythingElseGroup

[tcpout:syslogGroup]
server=10.1.1.197:9996, 10.1.1.198:9997

[tcpout:errorGroup]
server=10.1.1.200:9999

[tcpout:everythingElseGroup]
server=10.1.1.250:6666

syslogGroup and errorGroup receive events according to the rules specified in transforms.conf. All other events get routed to the default group, everythingElseGroup.

Replicate a subset of data to a third-party system


This example uses data filtering to route two data streams. It forwards:
All the data, in cooked form, to a Splunk indexer (10.1.12.1:9997)
A replicated subset of the data, in raw form, to a third-party server (10.1.12.2:1234)
The example sends both streams as TCP. To send the second stream as syslog data, first route the data through an indexer.
1. Edit props.conf:

[syslog]
TRANSFORMS-routing = routeAll, routeSubset

2. Edit transforms.conf:

[routeAll]
REGEX=(.)
DEST_KEY=_TCP_ROUTING
FORMAT=Everything

[routeSubset]
REGEX=(SYSTEM|CONFIG|THREAT)
DEST_KEY=_TCP_ROUTING
FORMAT=Subsidiary,Everything

3. Edit outputs.conf:

[tcpout]
defaultGroup=nothing

[tcpout:Everything]
disabled=false
server=10.1.12.1:9997

[tcpout:Subsidiary]
disabled=false
sendCookedData=false
server=10.1.12.2:1234

For more information, see "Forward data to third-party systems" in this manual.

Filter event data and send to queues


Although similar to forwarder-based routing, queue routing can be performed by an indexer, as well as a heavy forwarder. It does not use the outputs.conf file, just props.conf and transforms.conf.
You can eliminate unwanted data by routing it to nullQueue, Splunk's /dev/null equivalent. When you filter out data in this way, the filtered data is not forwarded or added to the Splunk index at all, and doesn't count toward your indexing volume.
Discard specific events and keep the rest
This example discards all sshd events in /var/log/messages by sending them to nullQueue:
1. In props.conf, set the TRANSFORMS-null attribute:

[source::/var/log/messages]
TRANSFORMS-null= setnull

2. Create a corresponding stanza in transforms.conf. Set DEST_KEY to "queue" and FORMAT to "nullQueue":

[setnull]
REGEX = \[sshd\]
DEST_KEY = queue
FORMAT = nullQueue

That does it.
Keep specific events and discard the rest
Here's the opposite scenario. In this example, you use two transforms to keep only the sshd events. One transform routes sshd events to indexQueue, while another routes all other events to nullQueue.
Note: In this example, the order of the transforms in props.conf matters. The null queue transform must come first; if it comes later, it will invalidate the previous transform and route all events to the null queue.
1. In props.conf:

[source::/var/log/messages]
TRANSFORMS-set= setnull,setparsing

2. In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = \[sshd\]
DEST_KEY = queue
FORMAT = indexQueue

Filter WMI events
To filter on WMI events, use the [WMI:WinEventLog:Security] source type stanza in props.conf. The following example uses regex to filter out two Windows event codes, 592 and 593:
In props.conf:

[WMI:WinEventLog:Security]
TRANSFORMS-wmi=wminull

Note: In pre-4.2.x versions of Splunk, you must use [wmi] as the sourcetype in order to send events to nullQueue.
In transforms.conf:

[wminull]
REGEX=(?m)^EventCode=(592|593)
DEST_KEY=queue
FORMAT=nullQueue

Filter data by target index


Forwarders have a forwardedindex filter that allows you to specify whether data gets forwarded, based on the data's target index. For example, if you have one data input targeted to "index1" and another targeted to "index2", you can use the filter to forward only the data targeted to index1, while ignoring the index2 data. The forwardedindex filter uses whitelists and blacklists to specify the filtering.
For information on setting up multiple indexes, see the topic "Set up multiple indexes".
Note: The forwardedindex filter is only applicable under the global [tcpout] stanza. This filter does not work if it is created under any other stanza in outputs.conf.


Use the forwardedindex.<n>.whitelist|blacklist attributes in outputs.conf to specify which data should get forwarded on an index-by-index basis. You set the attributes to regexes that filter the target indexes.
Default behavior
By default, Splunk forwards data targeted for all external indexes, including the default index and any user-created indexes. Regarding data for internal indexes, the default behavior varies according to who is doing the forwarding:
The universal forwarder forwards the data for the _audit internal index only. It does not forward data for other internal indexes. Its default outputs.conf file, located in $SPLUNK_HOME/etc/apps/SplunkUniversalForwarder/default, specifies that behavior with these attributes:

[tcpout]
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = _audit

Heavy forwarders and full Splunk instances with forwarding enabled (for example, a search head with forwarding enabled) forward the data for the _audit and _internal internal indexes. Their default outputs.conf file, located in $SPLUNK_HOME/etc/system/default, specifies that behavior with these attributes:

[tcpout]
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = (_audit|_internal)

Note: The default behavior for heavy forwarders and full Splunk instances changed in Splunk version 5.0.2. In earlier versions, the _internal index was not forwarded by default. Those forwarder types had the same behavior as the universal forwarder: only data for the _audit internal index was forwarded. In most deployments, you will not need to override the default settings. See outputs.conf for more information on how to whitelist and blacklist indexes. For more information on default and custom outputs.conf files and their locations, see "Types of outputs.conf files".


Forward all external and internal index data
If you want to forward all internal index data, as well as all external data, you can override the default forwardedindex filter attributes like this:

#Forward everything
[tcpout]
forwardedindex.0.whitelist = .*
# disable these
forwardedindex.1.blacklist =
forwardedindex.2.whitelist =

Forward data for a single index only
If you want to forward only the data targeted for a single index (for example, as specified in inputs.conf), and drop any data that's not targeted for that index, here's how to do it:

[tcpout]
#Disable the current filters from the default outputs.conf
forwardedindex.0.whitelist =
forwardedindex.1.blacklist =
forwardedindex.2.whitelist =
#Forward data for the "myindex" index
forwardedindex.0.whitelist = myindex

This first disables all filters from the default outputs.conf file. It then sets the filter for your own index. Be sure to start the filter numbering with 0: forwardedindex.0.
Note: If you've set other filters in another copy of outputs.conf on your system, you must disable those as well. You can use the CLI btool command to ensure that there aren't any other filters located in other outputs.conf files on your system:

splunk btool outputs list tcpout

This command returns the content of the tcpout stanza, after all versions of the configuration file have been combined.
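If you also want to see which configuration file each setting comes from, you can add btool's --debug flag, which prefixes each line of output with the path of the file that supplies it:

splunk btool outputs list tcpout --debug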


Use the forwardedindex attributes with local indexing
On a heavy forwarder, you can index locally. To do that, you must set the indexAndForward attribute to "true". Otherwise, the forwarder just forwards your data and does not save it on the forwarder.
On the other hand, the forwardedindex attributes only filter forwarded data; they do not filter any data that gets saved to the local index.
In a nutshell, local indexing and forwarder filtering are entirely separate operations, which do not coordinate with each other. This can have unexpected implications when you're performing blacklist filtering:
If you set indexAndForward to "true" and then filter out some data through forwardedindex blacklist attributes, the blacklisted data will not get forwarded, but it will still get locally indexed.
If you set indexAndForward to "false" (no local indexing) and then filter out some data, the filtered data will get dropped entirely, since it doesn't get forwarded and it doesn't get saved (indexed) on the forwarder.
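To illustrate the first case, here is a minimal sketch, not a recommended configuration; the target group name, index name, and IP address are placeholders. The forwarder indexes everything locally, forwards all indexes except the hypothetical "debuglogs" index, and still writes "debuglogs" events to its local index because the blacklist affects forwarding only:

[tcpout]
defaultGroup = primary_indexers
indexAndForward = true
# Forward everything except data targeted at the "debuglogs" index.
# That data is dropped from forwarding but is still indexed locally.
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = debuglogs

[tcpout:primary_indexers]
server = 10.1.1.100:9997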

Route inputs to specific indexers based on the data's input


There is one type of routing that doesn't require a heavy forwarder. In this scenario, you use inputs.conf and outputs.conf to route data to specific indexers, based on the data's input. Here's an example that shows how this works.
1. In outputs.conf, you create stanzas for each receiving indexer:

[tcpout:systemGroup]
server=server1:9997

[tcpout:applicationGroup]
server=server2:9997

2. In inputs.conf, you use _TCP_ROUTING to specify the stanza in outputs.conf that each input should use for routing:

[monitor://.../file1.log]
_TCP_ROUTING = systemGroup

[monitor://.../file2.log]
_TCP_ROUTING = applicationGroup


The forwarder will route data from file1.log to server1 and data from file2.log to server2.

Perform selective indexing and forwarding


With a heavy forwarder only, you can index and store data locally, as well as forward the data onwards to a receiving indexer. There are two ways to do this:
Index all the data before forwarding it. To do this, just enable the indexAndForward attribute in outputs.conf.
Index a subset of the data before forwarding it or other data. This is called selective indexing. With selective indexing, you can index just some of the data locally and then forward it on to a receiving indexer. Alternatively, you can choose to forward only the data that you don't index locally.
Important: Do not enable the indexAndForward attribute in the [tcpout] stanza if you're also enabling selective forwarding.
Configure selective indexing
To use selective indexing, you need to modify both your inputs.conf and outputs.conf files:
1. In outputs.conf:
a. Add the [indexAndForward] stanza:

[indexAndForward]
index=true
selectiveIndexing=true

The presence of this stanza, including the index and selectiveIndexing attributes, turns on selective indexing for the forwarder. It enables local indexing for any input (specified in inputs.conf) that has the _INDEX_AND_FORWARD_ROUTING attribute. Use the entire [indexAndForward] stanza exactly as shown here.
Note: This is a global stanza, which only needs to appear once in outputs.conf.
b. Include the usual target group stanzas for each set of receiving indexers:

[tcpout:<target_group>]
server = <ip address>:<port>, <ip address>:<port>, ...
...

The named <target_group> is used in inputs.conf to route the inputs, as described below.
2. In inputs.conf:
a. Add the _INDEX_AND_FORWARD_ROUTING attribute to the stanzas of each input that you want to index locally:

[input_stanza]
_INDEX_AND_FORWARD_ROUTING=<any_string>
...

The presence of the _INDEX_AND_FORWARD_ROUTING attribute tells the heavy forwarder to index that input locally. You can set the attribute to any string value you want. Splunk just looks for the attribute itself; the string value has no effect at all on behavior.
b. Add the _TCP_ROUTING attribute to the stanzas of each input that you want to forward:

[input_stanza]
_TCP_ROUTING=<target_group>
...

The <target_group> is the name used in outputs.conf to specify the target group of receiving indexers.
The next several sections show how to use selective indexing in a variety of scenarios.
Index one input locally and then forward the remaining inputs
In this example, the forwarder indexes data from one input locally but does not forward it. It also forwards data from two other inputs but does not index those inputs locally.
1. In outputs.conf, create these stanzas:


[tcpout]
defaultGroup=noforward
disabled=false

[indexAndForward]
index=true
selectiveIndexing=true

[tcpout:indexerB_9997]
server = indexerB:9997

[tcpout:indexerC_9997]
server = indexerC:9997

Since the defaultGroup is set to the non-existent group "noforward" (meaning that there is no defaultGroup), the forwarder will only forward data that's been routed to explicit target groups in inputs.conf. All other data will get dropped.
2. In inputs.conf, create these stanzas:

[monitor:///mydata/source1.log]
_INDEX_AND_FORWARD_ROUTING=local

[monitor:///mydata/source2.log]
_TCP_ROUTING=indexerB_9997

[monitor:///mydata/source3.log]
_TCP_ROUTING=indexerC_9997

The result:
Splunk indexes the source1.log data locally but does not forward it (because there's no explicit routing in its input stanza and there's no default group in outputs.conf).
Splunk forwards the source2.log data to indexerB but does not index it locally.
Splunk forwards the source3.log data to indexerC but does not index it locally.
Index one input locally and then forward all inputs
This example is nearly identical to the previous one. The difference is that here, you index just one input locally, but then you forward all inputs, including the one you've indexed locally.


1. In outputs.conf, create these stanzas:

[tcpout]
defaultGroup=noforward
disabled=false

[indexAndForward]
index=true
selectiveIndexing=true

[tcpout:indexerB_9997]
server = indexerB:9997

[tcpout:indexerC_9997]
server = indexerC:9997

This outputs.conf is identical to the previous example.
2. In inputs.conf, create these stanzas:

[monitor:///mydata/source1.log]
_INDEX_AND_FORWARD_ROUTING=local
_TCP_ROUTING=indexerB_9997

[monitor:///mydata/source2.log]
_TCP_ROUTING=indexerB_9997

[monitor:///mydata/source3.log]
_TCP_ROUTING=indexerC_9997

The only difference from the previous example is that here, you've specified the _TCP_ROUTING attribute for the input that you're indexing locally. Splunk will route both source1.log and source2.log to the indexerB_9997 target group, but it will only locally index the data from source1.log.
Another way to index one input locally and then forward all inputs
You can achieve the same result as in the previous example by setting the defaultGroup to a real target group.
1. In outputs.conf, create these stanzas:

[tcpout]
defaultGroup=indexerB_9997
disabled=false

[indexAndForward]
index=true
selectiveIndexing=true

[tcpout:indexerB_9997]
server = indexerB:9997

[tcpout:indexerC_9997]
server = indexerC:9997

This outputs.conf sets the defaultGroup to indexerB_9997.
2. In inputs.conf, create these stanzas:

[monitor:///mydata/source1.log]
_INDEX_AND_FORWARD_ROUTING=local

[monitor:///mydata/source2.log]
_TCP_ROUTING=indexerB_9997

[monitor:///mydata/source3.log]
_TCP_ROUTING=indexerC_9997

Even though you haven't set up an explicit routing for source1.log, Splunk will still forward it to the indexerB_9997 target group, since outputs.conf specifies that group as the defaultGroup.
Selective indexing and internal logs
Once you enable selective indexing in outputs.conf, the forwarder will only index locally those inputs with the _INDEX_AND_FORWARD_ROUTING attribute. This applies to the internal logs in the /var/log/splunk directory (specified in the default etc/system/default/inputs.conf). By default, they will not be indexed. If you want to index them, you must add their input stanza to your local inputs.conf file (which takes precedence over the default file) and include the _INDEX_AND_FORWARD_ROUTING attribute:

[monitor://$SPLUNK_HOME/var/log/splunk]
index = _internal
_INDEX_AND_FORWARD_ROUTING=local


Forward data to third-party systems


Splunk forwarders can forward raw data to non-Splunk systems. They can send the data over a plain TCP socket or packaged in standard syslog. Because they are forwarding to a non-Splunk system, they can send only raw data. By editing outputs.conf, props.conf, and transforms.conf, you can configure a heavy forwarder to route data conditionally to third-party systems, in the same way that it routes data conditionally to other Splunk instances. You can filter the data by host, source, or source type. You can also use regex to further qualify the data.

TCP data
To forward TCP data to a third-party system, edit the forwarder's outputs.conf file to specify the receiving server and port. You must also configure the receiving server to expect the incoming data stream on that port. You can use any kind of forwarder, such as a universal forwarder, to perform this type of forwarding.
To route the data, you need to use a heavy forwarder, which has the ability to parse data. Edit the forwarder's props.conf and transforms.conf files as well as outputs.conf.
Edit the configuration files
To simply forward data, edit outputs.conf:
Specify target groups for the receiving servers.
Specify the IP address and TCP port for each receiving server.
Set sendCookedData to false, so that the forwarder sends raw data.
To route and filter the data (heavy forwarders only), also edit props.conf and transforms.conf:
In props.conf, specify the host, source, or sourcetype of your data stream. Specify a transform to perform on the input.
In transforms.conf, define the transform and specify _TCP_ROUTING. You can also use regex to further filter the data.


Forward all data
This example shows how to send all the data from a universal forwarder to a third-party system. Since you are sending all the data, you only need to edit outputs.conf:

[tcpout]

[tcpout:fastlane]
server = 10.1.1.35:6996
sendCookedData = false

Forward a subset of data
This example shows how to use a heavy forwarder to filter a subset of data and send the subset to a third-party system:
1. Edit props.conf and transforms.conf to specify the filtering criteria.
In props.conf, apply the bigmoney transform to all host names beginning with nyc:

[host::nyc*]
TRANSFORMS-nyc = bigmoney

In transforms.conf, configure the bigmoney transform to specify _TCP_ROUTING as the DEST_KEY and the bigmoneyreader target group as the FORMAT:

[bigmoney]
REGEX = .
DEST_KEY=_TCP_ROUTING
FORMAT=bigmoneyreader

2. In outputs.conf, define both a bigmoneyreader target group for the non-Splunk server and a default target group to receive any other data:

[tcpout]
defaultGroup = default-clone-group-192_168_1_104_9997

[tcpout:default-clone-group-192_168_1_104_9997]
server = 192.168.1.104:9997

[tcpout:bigmoneyreader]
server=10.1.1.197:7999
sendCookedData=false

The forwarder will send all data from host names beginning with nyc to the non-Splunk server specified in the bigmoneyreader target group. It will send data from all other hosts to the server specified in the default-clone-group-192_168_1_104_9997 target group.
Note: If you want to forward only the data specifically identified in props.conf and transforms.conf, set defaultGroup=nothing.
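For instance, reusing the target group from the example above, the [tcpout] stanza for that variation might look like this (a sketch only; the defaultGroup value just needs to name a group that does not exist):

[tcpout]
defaultGroup = nothing

[tcpout:bigmoneyreader]
server=10.1.1.197:7999
sendCookedData=false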

Syslog data
You can configure a heavy forwarder to send data in standard syslog format. The forwarder sends the data through a separate output processor. You can also filter the data with props.conf and transforms.conf. You'll need to specify _SYSLOG_ROUTING as the DEST_KEY.
Note: The syslog output processor is not available for universal or light forwarders.
The syslog output processor sends RFC 3164 compliant events to a TCP/UDP-based server and port, making the payload of any non-compliant data RFC 3164 compliant. Yes, that means Windows event logs!
To forward syslog data, identify the third-party receiving server and specify it in a syslog target group in the forwarder's outputs.conf file.
Note: If you have defined multiple event types for syslog data, the event type names must all include the string "syslog".
Forward syslog data
In outputs.conf, specify the syslog target group:

[syslog:<target_group>]
<attribute1> = <val1>
<attribute2> = <val2>
...

The target group stanza requires this attribute:

server
    Value: This must be in the format <ipaddress_or_servername>:<port>. This is a combination of the IP address or servername of the syslog server and the port on which the syslog server is listening. Note that syslog servers use port 514 by default.
    Default: n/a

These attributes are optional:

type
    Value: The transport protocol. Must be set to "tcp" or "udp".
    Default: udp

priority
    Value: Syslog priority. This must be an integer 1 to 3 digits in length, surrounded by angle brackets; for example: <34>. This value will appear in the syslog header. Mimics the number passed via the syslog interface call; see outputs.conf for more information. Compute the priority value as (<facility> * 8) + <severity>. If facility is 4 (security/authorization messages) and severity is 2 (critical conditions), the priority value will be: (4 * 8) + 2 = 34, which you specify in the conf file as <34>.
    Default: <13>, which signifies a facility of 1 ("user") and a severity of 5 ("notice")

syslogSourceType
    Value: This must be in the format sourcetype::syslog, the source type for syslog messages.
    Default: n/a

timestampformat
    Value: The format used when adding a timestamp to the header. This must be in the format: <%b %e %H:%M:%S>. See "Configure timestamps" in the Getting Data In manual for details.
    Default: ""

Send a subset of data to a syslog server
This example shows how to configure a heavy forwarder to forward data from hosts whose names begin with "nyc" to a syslog server named "loghost.example.com" over port 514:
1. Edit props.conf and transforms.conf to specify the filtering criteria.
In props.conf, apply the send_to_syslog transform to all host names beginning with nyc:

[host::nyc*]
TRANSFORMS-nyc = send_to_syslog

In transforms.conf, configure the send_to_syslog transform to specify _SYSLOG_ROUTING as the DEST_KEY and the my_syslog_group target group as the FORMAT:

[send_to_syslog]
REGEX = .
DEST_KEY = _SYSLOG_ROUTING
FORMAT = my_syslog_group

2. In outputs.conf, define the my_syslog_group target group for the non-Splunk server:

[syslog:my_syslog_group]
server = loghost.example.com:514


Deploy the universal forwarder


Introducing the universal forwarder
The universal forwarder is Splunk's new lightweight forwarder. Use the universal forwarder to gather data from a variety of inputs and forward the data to a Splunk server for indexing and searching. This section of the Distributed Deployment manual describes how to deploy the universal forwarder for a variety of systems and needs.
For information on the different kinds of forwarders and detailed information on configuring them for a range of topologies and use cases, see the "Forward data" chapter of this manual.
The universal forwarder replaces the light forwarder.
Note: The universal forwarder is a separate executable from Splunk. Instances of Splunk and the universal forwarder can co-exist on the same system.
For information on deploying the universal forwarder, see "Universal forwarder deployment overview".

How universal forwarder compares to full Splunk


The universal forwarder's sole purpose is to forward data. Unlike a full Splunk instance, you cannot use the universal forwarder to index or search data. To achieve higher performance and a lighter footprint, it has several limitations:
The universal forwarder has no searching, indexing, or alerting capability.
The universal forwarder does not parse data.
The universal forwarder does not output data via syslog.
Unlike full Splunk, the universal forwarder does not include a bundled version of Python.
Scripted inputs and Python
Full Splunk comes bundled with Python. The universal forwarder does not. Therefore, if you're currently using scripted inputs with Python and you want to use those scripts with the universal forwarder, you must first install your own version of Python. If you have been using calls specific to Splunk's Python libraries, you cannot do so with the universal forwarder, since those libraries exist only in full Splunk.
You may use other scripting languages for scripted inputs with the universal forwarder if they are otherwise supported on the target host (for example, PowerShell on Windows Server 2008).

How universal forwarder compares to the light forwarder


The universal forwarder is a streamlined, self-contained forwarder that includes only the essential components needed to forward data to Splunk indexers. The light forwarder, by contrast, is a full Splunk instance, with certain features disabled to achieve a smaller footprint. In all respects, the universal forwarder represents a better tool for forwarding data to indexers.
When you install the universal forwarder, you can migrate from an existing light forwarder, version 4.0 or greater. See "Migrating from a light forwarder" for details.
Compared to the light forwarder, the universal forwarder provides a better performing and more streamlined solution to forwarding. These are the main technical differences between the universal forwarder and the light forwarder:
The universal forwarder puts less load on the CPU, uses less memory, and has a smaller disk footprint.
The universal forwarder has a default data transfer rate of 256Kbps.
The universal forwarder does not come bundled with Python.
The universal forwarder is a forwarder only; it cannot be converted to a full Splunk instance.

Read on!
For information on deploying the universal forwarder, see the topics that directly follow this one.
For information on using the universal forwarder to forward data and participate in various distributed topologies, see the topics in the "Forward data" section of this manual. Those topics also discuss light and heavy forwarders.
For information on third-party Windows binaries that the Windows version of the Splunk universal forwarder ships with, read "Information on Windows third-party binaries distributed with Splunk" in the Installation Manual.
For information about running the Splunk universal forwarder in Windows Safe Mode, read "Splunk Architecture and Processes" in the Installation Manual.


Universal forwarder deployment overview


The topics in this chapter describe how to install and deploy the universal forwarder. They include use cases that focus on installing and configuring the forwarder for a number of different scenarios.
Important: Before attempting to deploy the universal forwarder, you must be familiar with how forwarding works and the full range of configuration issues. See:
the chapter "Forward data" for an overview of forwarding and forwarders.
the topics in the chapter "Configure forwarding" to learn how to configure forwarders.
the subtopic "Set up forwarding and receiving: universal forwarders" for an overview of configuring Splunk forwarding and receiving.

Types of deployments
These are the main scenarios for deploying the universal forwarder:
Deploy a Windows universal forwarder manually, either with the installer GUI or from the command line.
Deploy a *nix universal forwarder manually, using the CLI to configure it.
Remotely deploy a universal forwarder (Windows or *nix).
Make the universal forwarder part of a system image.
Each scenario is described in its own topic. For most scenarios, there are separate Windows and *nix topics.
Note: The universal forwarder is its own downloadable executable, separate from full Splunk. Unlike the light and heavy forwarders, you do not enable it from a full Splunk instance. To download the universal forwarder, go to http://www.splunk.com/download/universalforwarder.

Migrating from a light forwarder?


The universal forwarder provides all the functionality of the old light forwarder but in a smaller footprint with better performance. Therefore, you might want to migrate your existing light forwarder installations to universal forwarders. Splunk provides tools that ease the migration process and ensure that the new universal forwarder does not send an indexer any data already sent by the old light forwarder.

Note: You can only migrate from light forwarders of version 4.0 or later.
Migration is available as an option during the universal forwarder installation process. See "Migrate a Windows light forwarder" or "Migrate a *nix light forwarder" for details. You will want to uninstall the old light forwarder instance once your universal forwarder is up and running (and once you've tested to ensure migration worked correctly).
What migration does
Migration copies checkpoint data, including the fishbucket directory, from the old forwarder to the new universal forwarder. This prevents the universal forwarder from re-forwarding data that the previous forwarder had already sent to an indexer. This in turn avoids unnecessary re-indexing, ensuring that you maintain your statistics and keep your license usage under control. Specifically, migration copies:
the fishbucket directory (contains seek pointers for tailed files).
checkpoint files for WinEventLog (Windows only), WMI remote log (Windows only), and fschange.
What migration does not do
Migration does not copy any configuration files, such as inputs.conf or outputs.conf. This is because it would not be possible to conclusively determine where all existing versions of configuration files reside on the old forwarder. Therefore, you still need to configure your data inputs and outputs, either during installation or later. If you choose to configure later, you can copy over the necessary configuration files manually or you can use the deployment server to push them out to all your universal forwarders. See this section below for more information on configuration files.
If the data inputs for the universal forwarder differ from the old forwarder, you can still migrate. Migrated checkpoint data pertaining to any inputs not configured for the universal forwarder will just be ignored. If you decide to add those inputs later, the universal forwarder will use the migrated checkpoints to determine where in the data stream to start forwarding.
Migration also does not copy over any apps from the light forwarder. If you have any apps that you want to migrate to the universal forwarder, you'll need to do so manually.


Before you start


Forwarders and clusters
When using forwarders to send data to peer nodes in a cluster, you deploy and configure them a bit differently from the description in this topic. To learn more about forwarders and clusters, read "Use forwarders to get your data" in the Managing Indexers and Clusters Manual.
Indexer and universal forwarder compatibility
See "Compatibility between forwarders and indexers" for details.
System requirements
See the Installation manual for specific hardware requirements and supported operating systems.
Licensing requirements
The universal forwarder ships with a pre-installed license. See "Types of Splunk licenses" in the Admin manual for details.
Other requirements
You must have admin or equivalent rights on the machine where you're installing the universal forwarder.

Steps to deployment
The actual procedure varies depending on the type of deployment, but these are the typical steps:
1. Plan your deployment.
2. Download the universal forwarder from http://www.splunk.com/download/universalforwarder
3. Install the universal forwarder on a test machine.
4. Perform any post-installation configuration.
5. Test and tune the deployment.

6. Deploy the universal forwarder to machines across your environment (for multi-machine deployments).
These steps are described below in more detail.
Important: Deploying your forwarders is just one step in the overall process of setting up Splunk forwarding and receiving. For an overview of that process, read "Set up forwarding and receiving: universal forwarders".
Plan your deployment
Here are some of the issues to consider when planning your deployment:
How many (and what type of) machines will you be deploying to?
Will you be deploying across multiple OS's?
Do you need to migrate from any existing forwarders?
What, if any, deployment tools do you plan to use?
Will you be deploying via a system image or virtual machine?
Will you be deploying fully configured universal forwarders, or do you plan to complete the configuration after the universal forwarders have been deployed across your system?
What level of security does the communication between universal forwarder and indexer require?
Install, test, configure, deploy
For next steps, see the topic in this chapter that matches your deployment requirements most closely. Each topic contains one or more use cases that cover specific deployment scenarios from installation through configuration and deployment:
"Deploy a Windows universal forwarder via the installer GUI"
"Deploy a Windows universal forwarder via the command line"
"Remotely deploy a Windows universal forwarder with a static configuration"
"Deploy a *nix universal forwarder manually"
"Remotely deploy a *nix universal forwarder with a static configuration"
"Make a universal forwarder part of a system image"

But first, read the next section to learn more about universal forwarder configuration. Note: The universal forwarder's executable is named splunkd, the same as the executable for full Splunk. The service name is SplunkUniversalForwarder.

General configuration issues


Because the universal forwarder has no Splunk Web GUI, you must perform all configuration either during installation (Windows-only) or later, as a separate step. To perform post-installation configuration, you can use the CLI, modify the configuration files directly, or use deployment server.
Where to configure
Key configuration files include inputs.conf (for data inputs) and outputs.conf (for data outputs). Others include server.conf and deploymentclient.conf.
When you make configuration changes with the CLI, the universal forwarder writes the changes to configuration files in the search app (except for changes to outputs.conf, which it writes to a file in $SPLUNK_HOME/etc/system/local/). The search app is the default app for the universal forwarder, even though you cannot actually use the universal forwarder to perform searches. If this seems odd, it is.
Important: The Windows installation process writes configuration changes to an app called "MSICreated", not to the search app.
Note: The universal forwarder also ships with a SplunkUniversalForwarder app, which must be enabled. (This happens automatically.) This app includes preconfigured settings that enable the universal forwarder to run in a streamlined mode. No configuration changes get written there. We recommend that you do not make any changes or additions to that app.
Learn more about configuration
Refer to these topics for some important information:
"About configuration files" and "Configuration file precedence" in the Admin manual, for details on how configuration files work.
"Configure forwarders with outputs.conf", for information on outputs.conf specifically.
The topics in the "Use the forwarder to create deployment topologies" section, for information on configuring outputs with the CLI.

"Configure your inputs" in the Getting Data In manual, for details on configuring data inputs with inputs.conf or the CLI. Deploy configuration updates These are the main methods for deploying configuration updates across your set of universal forwarders: Edit or copy the configuration files for each universal forwarder manually (for small deployments only). Use the Splunk deployment server to push configured apps to your set of universal forwarders. Use your own deployment tools to push configuration changes. Restart the universal forwarder Some configuration changes might require that you restart the forwarder. (The topics covering specific configuration changes will let you know if a change does require a restart.) To restart the universal forwarder, use the same CLI restart command that you use to restart a full Splunk instance: On Windows: Go to %SPLUNK_HOME%\bin and run this command:
> splunk restart

On *nix systems: From a shell prompt on the host, run this command:
# splunk restart
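As an illustration of the deploymentclient.conf file mentioned under "Where to configure" above, a minimal client-side configuration that points a universal forwarder at a deployment server might look like this sketch (the hostname and port are placeholders):

[deployment-client]

[target-broker:deploymentServer]
targetUri = deploymentserver.example.com:8089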

Deploy a Windows universal forwarder via the installer GUI


This topic describes how to manually install, configure, and deploy the universal forwarder in a Windows environment using the installer GUI. It assumes that you're installing directly onto the Windows machine, rather than using a deployment tool. This method of installation best suits these needs:
small deployments
proof-of-concept test deployments
system image or virtual machine for eventual cloning
If you are interested in a different deployment method or a different operating system, look for another topic in this section that better fits your needs.
You can also install the universal forwarder from the command line, using msiexec. See "Deploy a Windows universal forwarder via the command line" for more information.
Important: If you do not want the universal forwarder to start immediately after installation, you must install via the command line.
Before following the procedures in this topic, read "Deployment overview" to further understand the mechanics of a distributed Splunk deployment.

Steps to deployment
Once you have downloaded the universal forwarder and planned your deployment, as described in "Deployment overview", perform these steps:
1. Install the universal forwarder (with optional migration and configuration).
2. Test and tune the deployment.
3. Perform any post-installation configuration.
4. Deploy the universal forwarder across your environment.

Before you install


Choose where the universal forwarder will get its data from
When you install the universal forwarder, you can select where the forwarder will get its data. You have two choices:
Local Data Only
Remote Windows Data
If you tell the installer to collect Local Data only, the universal forwarder can collect any kind of data that is available on the local machine. It cannot, however, collect data from other machines.


You must tell the forwarder to collect Remote Windows Data if you intend to do any of the following:
Read Event Logs remotely
Collect performance counters remotely
Read network shares for log files
Enumerate the Active Directory schema, using Active Directory monitoring
If you tell the installer to collect Remote Windows data, you must then specify a user which has access to this data. Read "Choose the Windows user Splunk should run as" in the Installation Manual for concepts and procedures on the user requirements that must be in place before you collect remote Windows data.
Important: You should choose - and configure - the user that Splunk will run as before attempting to install a universal forwarder for remote Windows data collection.
Configure your Windows environment for remote data collection
If you do not need to install the universal forwarder to collect remote Windows data, you can continue to the installation instructions below. If your monitoring needs require you to install the universal forwarder to collect remote Windows data, then you must configure your Windows environment for the proper installation of the forwarder.
1. Create and configure security groups with the user you want the universal forwarder to run as.
2. Optionally, configure the universal forwarder account as a managed service account.
3. Create and configure Group Policy objects for security policy and user rights assignment.
4. Assign appropriate user rights to the GPO.
5. Deploy the GPO(s) with the updated settings to the appropriate objects.
Note: These steps are high-level procedures only. For step-by-step instructions, read "Prepare your Windows network for a Splunk installation as a network or domain user" in the Installation Manual.


Install the universal forwarder


The Windows installer guides you through the process of installing and configuring your universal forwarder. It also offers you the option of migrating your checkpoint settings from an existing Splunk forwarder. To install the universal forwarder, double-click the appropriate MSI file: splunkuniversalforwarder-<...>-x86-release.msi (for 32-bit platforms) splunkuniversalforwarder-<...>-x64-release.msi (for 64-bit platforms) The value of <...> varies according to the particular release; for example, splunkuniversalforwarder-4.2-86454-x64-release.msi. Important: Running the 32-bit version of the universal forwarder on a 64-bit platform is not recommended. If you can run 64-bit universal forwarder on 64-bit hardware, we strongly recommend it. The performance is greatly improved over the 32-bit version. A series of dialogs guides you through the installation. When you're through with a dialog, click Next to move to the next in the series. Here are the dialogs, in order: 1. "Welcome" dialog To begin the installation, click Next. 2. "License Agreement" dialog Read the license agreement and select "I accept the terms in the license agreement". 3. "Destination Folder" dialog The installer puts the universal forwarder into the C:\Program Files\SplunkUniversalForwarder directory by default. Click Change... to specify a different installation directory. Caution: Do not install the universal forwarder over an existing installation of full Splunk. This is particularly important if you are upgrading from a light forwarder, as described in "Migrate a Windows light forwarder". The default installation directory for full Splunk is C:\Program Files\Splunk, so, if you stick with the

defaults, you're safe. 4. "Migration" pop-up If the installer detects an existing version of Splunk, it will ask you whether or not you want to migrate the existing Splunk's data checkpoint settings to the universal forwarder. If you click Yes, it will automatically perform the migration. Important: This is the only time when you can migrate old settings to this universal forwarder. You cannot perform the migration later. See "Migrate a Windows forwarder" for more information on what migration does. 5. "Deployment Server" dialog Enter the hostname or IP address and management port for your deployment server. The default management port is 8089. You can use the deployment server to push configuration updates to the universal forwarder. See "About deployment server" for details. Note: This step is optional, but if you skip it, you must enter a receiving indexer in step 6; otherwise, the universal forwarder does not do anything, as it does not have any way of determining which indexer to forward data to. 6. "Receiving Indexer" dialog Enter the hostname or IP address and receiving port of the receiving indexer (receiver). For information on setting up a receiver, see "Enable a receiver". Note: This step is optional, but if you skip it, you must enter a deployment server in step 5; otherwise, the universal forwarder does not do anything, as it does not have any way of determining which indexer to forward to. 7. "Certificate Information" dialog Select an SSL certificate for verifying the identity of this machine. This step is optional. Depending on your certificate requirements, you might need to specify a password and a Root Certificate Authority (CA) certificate to verify the identity of the certificate. If not, these fields can be left blank.


Note: This dialog will only appear if you previously specified a receiving indexer (step 6). 8. "Where do you want to get data from?" dialogs This step in the installer requires one or two dialogs, depending on whether the universal forwarder will be collecting local data exclusively. In the first dialog, you specify the user context: whether you want the universal forwarder to collect only local data or also remote Windows data. The installer uses this information to determine the permissions the universal forwarder needs. Note: If you select Local data only, the universal forwarder installs as the Local System user, and network resources will not be available to it. This is recommended for improved security, unless this universal forwarder will be collecting event logs or metrics from remote machines. For more help in determining what to select here, see "Before you install" earlier in this topic. Once you've made your choice, click Next. If you specified Local data only, the installer skips the second screen and takes you directly to the "Enable Windows Inputs" dialog (step 8). If you specified Remote Windows data, the installer now takes you to a second dialog, where you need to enter domain and user information for this instance of the universal forwarder. The universal forwarder will run as the user you specify in this dialog. Important: The user you specify must have specific rights assigned to it prior to completing the installation. Failure to do so might result in a failed installation. Read "Before you install" earlier in this topic for specific information and links to step-by-step instructions. 9. "Enable Windows Inputs" dialog Select one or more Windows inputs from the list. This step is optional. You can enable inputs later, by editing inputs.conf within the universal forwarder directory.


10. "Ready to Install the Program" dialog Click Install to proceed. The installer runs and displays the Installation Completed dialog. Once the installation is complete, the universal forwarder automatically starts. SplunkForwarder is the name of the universal forwarder service. You should confirm that it is running.

Test the deployment


Test your configured universal forwarder on a single machine, to make sure it functions correctly, before deploying the universal forwarder across your environment. Confirm that the universal forwarder is getting the desired inputs and sending the right outputs to the indexer. You can use the deployment monitor to validate the universal forwarder. If you migrated from an existing forwarder, make sure that the universal forwarder is forwarding data from where the old forwarder left off. If it isn't, you need to modify or add data inputs, so that they conform to those on the old forwarder. Important: Migration does not automatically copy any configuration files. You must set those up yourself. The usual way to do this is to copy the files, including inputs.conf, from the old forwarder to the universal forwarder. Compare the inputs.conf files on the universal forwarder and the old forwarder to ensure that the universal forwarder has all the inputs that you want to maintain. If you migrated from an existing forwarder, you can delete that old instance once your universal forwarder has been thoroughly tested and you're comfortable with the results.

Perform additional configuration


You can update your universal forwarder's configuration, post-installation, by directly editing its configuration files, such as inputs.conf and outputs.conf. You can also update the configuration using the CLI. See "Deployment overview" for information. Note: When you use the CLI, you might need to authenticate into the Splunk forwarder to complete commands. The default credentials for a universal forwarder are:

Username: admin
Password: changeme
For information on distributing configuration changes across multiple universal forwarders, see "About deployment server".
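For example, you can pass these credentials inline with the -auth flag when running a CLI command on the forwarder. This sketch lists the configured receivers; change the default password as soon as practical:

splunk list forward-server -auth admin:changeme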

Deploy the universal forwarder across your environment


If you need just a few universal forwarders, you might find it simpler just to repeat the manual installation process, as documented in this topic. If you need to install a larger number of universal forwarders, it will probably be easier to deploy them remotely with a deployment tool or else as part of a system image or virtual machine.

Uninstall the universal forwarder


To uninstall the universal forwarder, perform the following steps: 1. Use the Services MMC snap-in (Start > Administrative Tools > Services) to stop the SplunkForwarder service. Note: You can also stop the service from the command line with the following command:
NET STOP SplunkForwarder

2. Next, use the Add or Remove Programs control panel to uninstall the forwarder. On Windows 7, 8, Server 2008, and Server 2012, that option is available under Programs and Features. Note: Under some circumstances, the Microsoft installer might present a reboot prompt during the uninstall process. You can safely ignore this request without rebooting.

Deploy a Windows universal forwarder via the command line


This topic describes how to install, configure, and deploy the universal forwarder in a Windows environment using the command line interface. If you prefer to use a GUI installer, see "Deploy a Windows universal forwarder via the installer GUI".


When to install from the command line?


You can manually install the universal forwarder on individual machines from a command prompt or PowerShell window. Here are some scenarios where installing from the command line is useful:
You want to install the forwarder, but don't want it to start right away.
You want to automate installation of the forwarder with a script.
You want to install the forwarder on a system that you will clone later.
You want to use a deployment tool such as Group Policy or System Center Configuration Manager.
Read the following topics for additional information on installing universal forwarders:
"Deployment overview" for basics on universal forwarders.
"Remotely deploy a Windows universal forwarder with a static configuration" for detailed information on using the command line interface with a deployment tool.

Steps to deployment
Once you have downloaded the universal forwarder and have planned your deployment, as described in "Deployment overview", perform these steps:
1. Install the universal forwarder (with optional migration and configuration).
2. Test and tune the deployment.
3. Perform any post-installation configuration.
4. Deploy the universal forwarder across your environment.

Before you install


Choose the Windows user the universal forwarder should run as
When you install the universal forwarder, you can select the user it should run as. By default, the user is Local System. To specify a domain account, use the flags LOGON_USERNAME and LOGON_PASSWORD, described later in this topic.
If you install the forwarder as the Local System user, the forwarder can collect any kind of data that is available on the local machine. It cannot, however, collect data from other machines. This is by design.
You must give the universal forwarder a user account if you intend to do any of the following:
Read Event Logs remotely
Collect performance counters remotely
Read network shares for log files
Enumerate the Active Directory schema, using Active Directory monitoring
Read "Choose the Windows user Splunk should run as" in the Installation Manual for concepts and procedures on the user requirements that must be in place before you collect remote Windows data.
Important: You must choose - and configure - the user that Splunk will run as before attempting to install a universal forwarder for remote Windows data collection. Failure to do so can result in a failed installation.
Configure your Windows environment prior to installation
To configure your Windows environment for the proper installation of the forwarder, follow these steps:
1. Create and configure security groups with the user you want the universal forwarder to run as.
2. Optionally, configure the universal forwarder account as a managed service account.
3. Create and configure Group Policy or Local Security Policy objects for user rights assignments.
4. Assign appropriate security settings.
5. If using Active Directory, deploy the Group Policy object(s) with the updated settings to the appropriate objects.
Note: These steps are high-level procedures only. For step-by-step instructions, read "Prepare your Windows network for a Splunk installation as a network or domain user" in the Installation Manual.


Install the universal forwarder


You can install the universal forwarder from the command line by invoking msiexec.exe, Microsoft's installer program. For 32-bit platforms, use splunkuniversalforwarder-<...>-x86-release.msi:

msiexec.exe /i splunkuniversalforwarder-<...>-x86-release.msi [<flag>]... [/quiet]

For 64-bit platforms, use splunkuniversalforwarder-<...>-x64-release.msi:

msiexec.exe /i splunkuniversalforwarder-<...>-x64-release.msi [<flag>]... [/quiet]

The value of <...> varies according to the particular release; for example, splunkuniversalforwarder-4.2-86454-x64-release.msi.
Important: Running the 32-bit version of the universal forwarder on a 64-bit platform is not recommended. If you can run 64-bit universal forwarder on 64-bit hardware, we strongly recommend it. The performance is greatly improved over the 32-bit version.
Command line flags allow you to configure your forwarder at installation time. Using command line flags, you can specify a number of settings, including:
The user the universal forwarder runs as. (Be sure the user you specify has the appropriate permissions to access the content you want to forward.)
The receiving Splunk instance that the universal forwarder will send data to.
A Splunk deployment server for updating the configuration.
The Windows event logs to index.
Whether the universal forwarder should start automatically when the installation is completed.
Whether to migrate checkpoint data from an existing light forwarder.
The following sections list the flags available and provide a few examples of various configurations.
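As an illustration, a silent installation that points the forwarder at a receiving indexer and enables the Security event log might look like the following sketch. The hostname and port are placeholders, and the flags are described in the list below:

msiexec.exe /i splunkuniversalforwarder-<...>-x64-release.msi AGREETOLICENSE=Yes RECEIVING_INDEXER="indexer1.example.com:9997" WINEVENTLOG_SEC_ENABLE=1 /quiet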


List of supported flags


Important: The installer for the full version of Splunk is a separate executable, with its own installation flags. Review the installation flags for the full Splunk installer at "Install on Windows" in the Installation Manual.

AGREETOLICENSE=Yes|No
Use this flag to agree to the EULA. This flag must be set to Yes for a silent installation.
Default: No

INSTALLDIR="<directory_path>"
Specifies the installation directory. Important: Do not install the universal forwarder over an existing installation of full Splunk. This is particularly vital if you are migrating from a light forwarder as described in "Migrate a Windows light forwarder". The default install directory for full Splunk is C:\Program Files\Splunk, so, if you stick with the defaults, you're safe.
Default: C:\Program Files\SplunkUniversalForwarder

LOGON_USERNAME="<domain\username>" LOGON_PASSWORD="<pass>"
Use these flags to provide domain\username and password information for the user to run the SplunkForwarder service. You must specify the domain with the username, in the format domain\username. If you don't include these flags, the universal forwarder installs as the Local System user. See "Choose the Windows user Splunk should run as".
Default: n/a

RECEIVING_INDEXER="<host:port>"
Use this flag to specify the receiving indexer to which the universal forwarder will forward data. Enter the name (hostname or IP address) and receiving port of the Splunk receiver. For information on setting up a receiver, see "Enable a receiver". Note: This flag is optional, but if you don't specify it and also don't specify DEPLOYMENT_SERVER, the universal forwarder will be unable to function, as it will not have any way of determining which indexer to forward to.
Default: n/a

DEPLOYMENT_SERVER="<host:port>"
Use this flag to specify a deployment server for pushing configuration updates to the universal forwarder. Enter the deployment server's name (hostname or IP address) and port. Note: This flag is optional, but if you don't specify it and also don't specify RECEIVING_INDEXER, the universal forwarder will be unable to function, as it will not have any way of determining which indexer to forward to.
Default: n/a

LAUNCHSPLUNK=1|0
Use this flag to specify whether the universal forwarder should be configured to launch automatically when the installation finishes.
Default: 1 (yes)

SERVICESTARTTYPE=auto|manual
Use this flag to specify whether the universal forwarder should start automatically when the system reboots. Note: By setting LAUNCHSPLUNK to 0 and SERVICESTARTTYPE to auto, you will cause the universal forwarder to not start forwarding until the next system boot. This is useful when cloning a system image.
Default: auto

MONITOR_PATH="<directory_path>"
Use this flag to specify a file or directory to monitor.
Default: n/a

WINEVENTLOG_APP_ENABLE=1|0 WINEVENTLOG_SEC_ENABLE=1|0 WINEVENTLOG_SYS_ENABLE=1|0 WINEVENTLOG_FWD_ENABLE=1|0 WINEVENTLOG_SET_ENABLE=1|0
Use these flags to enable these Windows event logs, respectively: application, security, system, forwarders, setup. Note: You can specify multiple flags.
Default: 0 (no)

PERFMON=<input_type>,<input_type>,...
Use this flag to enable perfmon inputs. <input_type> can be any of these: cpu, memory, network, diskspace.
Default: n/a

ENABLEADMON=1|0
Use this flag to enable Active Directory monitoring for a remote deployment.
Default: 0 (not enabled)

CERTFILE=<c:\path\to\certfile.pem> ROOTCACERTFILE=<c:\path\to\rootcacertfile.pem> CERTPASSWORD=<password>
Use these flags to supply SSL certificates. CERTFILE is the path to the cert file that contains the public/private key pair. ROOTCACERTFILE is the path to the file that contains the Root CA cert for verifying that CERTFILE is legitimate (optional). CERTPASSWORD is the password for the private key of CERTFILE (optional). Note: You must also set RECEIVING_INDEXER for these flags to have any effect.
Default: n/a

MIGRATESPLUNK=1|0
Determines whether migration from an existing forwarder will occur during installation. If MIGRATESPLUNK=1, the installer stops the existing forwarder and copies its checkpoint files to the new universal forwarder. You are responsible for uninstalling the old forwarder. See "Deployment overview" and "Migrate a Windows light forwarder" for details.
Default: 0 (no migration)

CLONEPREP=1|0
Tells Splunk to delete any instance-specific data in preparation for creating a clone of a machine. This invokes the splunk clone-prep command from the CLI.
Default: 0 (do not prepare the instance for cloning)

Silent installation
To run the installation silently, add /quiet to the end of your installation command string. You must also set the AGREETOLICENSE=Yes flag. If your system is running UAC (which is sometimes on by default), you must run the installation as Administrator. To do this, when opening a cmd prompt, right click and select "Run As Administrator". Then use the cmd window to run the silent install command.

Examples
The following are some examples of using different flags.

Install the universal forwarder to run as the Local System user and request configuration from deploymentserver1

You might do this for new deployments of the forwarder.

msiexec.exe /i splunkuniversalforwarder_x86.msi DEPLOYMENT_SERVER="deploymentserver1:8089" AGREETOLICENSE=Yes /quiet

Install the universal forwarder to run as a domain user, but do not launch it immediately

You might do this when preparing a sample host for cloning.

msiexec.exe /i splunkuniversalforwarder_x86.msi LOGON_USERNAME="AD\splunk" LOGON_PASSWORD="splunk123" DEPLOYMENT_SERVER="deploymentserver1:8089" LAUNCHSPLUNK=0 AGREETOLICENSE=Yes /quiet


Install the universal forwarder, enable indexing of the Windows security and system event logs, and run the installer in silent mode

You might do this to collect just the Security and System event logs through a "fire-and-forget" installation.

msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" WINEVENTLOG_SEC_ENABLE=1 WINEVENTLOG_SYS_ENABLE=1 AGREETOLICENSE=Yes /quiet

Install the universal forwarder, migrate from an existing forwarder, and run the installer in silent mode

You might do this if you want to migrate now and redefine your inputs later, perhaps after a validation step.

msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" MIGRATESPLUNK=1 AGREETOLICENSE=Yes /quiet

Test the deployment


Test your configured universal forwarder on a single machine, to make sure it functions correctly, before deploying the universal forwarder across your environment. Confirm that the universal forwarder is getting the desired inputs and sending the right outputs to the indexer. You can use the deployment monitor to validate the universal forwarder.

If you migrated from an existing forwarder, make sure that the universal forwarder is forwarding data from where the old forwarder left off. If it isn't, you probably need to modify or add data inputs, so that they conform to those on the old forwarder.

Important: Migration does not automatically copy any configuration files; you must set those up yourself. The usual way to do this is to copy the files, including inputs.conf, from the old forwarder to the universal forwarder. Compare the inputs.conf files on the universal forwarder and the old forwarder to ensure that the universal forwarder has all the inputs that you want to maintain.

If you migrated from an existing forwarder, you can delete that old instance once your universal forwarder has been thoroughly tested and you're comfortable with the results.

Perform additional configuration


You can update your universal forwarder's configuration, post-installation, by directly editing its configuration files, such as inputs.conf and outputs.conf. You can also update the configuration using the CLI. See "Deployment overview" for information.

Note: When you use the CLI, you might need to authenticate into the Splunk forwarder to complete commands. The default credentials for a universal forwarder are:

Username: admin
Password: changeme

For information on distributing configuration changes across multiple universal forwarders, see "About deployment server".

Deploy the universal forwarder across your environment


If you need just a few universal forwarders, you might find it simpler just to repeat the command line installation process manually, as documented in this topic. If you need to install a larger number of universal forwarders, it will probably be easier to deploy them remotely with a deployment tool or else as part of a system image or virtual machine.

Uninstall the universal forwarder


To uninstall the universal forwarder, perform the following steps: 1. Stop the service from the command line with the following command:
NET STOP SplunkForwarder

Note: You can also use the Services MMC snap-in (Start > Administrative Tools > Services) to stop the SplunkForwarder service.

2. Next, use the Add or Remove Programs control panel to uninstall the forwarder. On Windows 7, 8, Server 2008, and Server 2012, that option is available under Programs and Features.

Note: Under some circumstances, the Microsoft installer might present a reboot prompt during the uninstall process. You can safely ignore this request without rebooting.
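If you prefer to script the removal rather than use the control panel, Windows Installer can typically uninstall the same MSI package from the command line. This is a generic msiexec sketch, not a documented Splunk procedure; substitute the package you originally installed:

msiexec.exe /x splunkuniversalforwarder-<...>-x64-release.msi /quiet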

Remotely deploy a Windows universal forwarder with a static configuration


You typically deploy the universal forwarder with a static configuration for one of these reasons:

You don't need to change the configuration later - "fire-and-forget".
You'll be making any post-installation changes with a non-Splunk deployment tool such as System Center Configuration Manager, Altiris, or BigFix/Tivoli.

For this type of deployment, you install via the Windows command line interface. During installation, you must specify all configuration options and use silent mode (/quiet). See "Deploy a Windows universal forwarder via the command line" for information on the command line interface, including a list of supported flags.

Steps to deployment
Once you have downloaded the universal forwarder and have planned your deployment, as described in "Deployment overview", perform these steps:

1. Install and configure the universal forwarder on a test machine, using the command line interface with the desired flags.
2. Test and tune the deployment.
3. Load the universal forwarder MSI into your deployment tool, specifying the tested flags.
4. Execute deployment with your deployment tool.
5. Use the deployment monitor to verify that the universal forwarders are functioning.

Required installation flags


Besides specifying /quiet mode, you must include, at a minimum, these command line flags:

AGREETOLICENSE=Yes
RECEIVING_INDEXER="<server:port>"
At least one data input flag, such as WINEVENTLOG_APP_ENABLE=1. You can add as many data input flags as you need.

See "Deploy a Windows universal forwarder via the command line" for a list of all available command line flags.

Example installation
This example sets the universal forwarder to run as Local System user, get inputs from Windows security and system event logs, send data to indexer1, and launch automatically:

msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" WINEVENTLOG_SEC_ENABLE=1 WINEVENTLOG_SYS_ENABLE=1 AGREETOLICENSE=Yes /quiet

Deploy with a secure configuration


To deploy a secure configuration, you can specify an SSL certificate. Use these installation flags:

CERTFILE=<c:\path\to\certfile.pem>
ROOTCACERTFILE=<c:\path\to\rootcacertfile.pem>
CERTPASSWORD=<password>

For more information, see the list of supported command line flags.
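For example, a silent installation that forwards to a receiver over SSL might combine these flags with RECEIVING_INDEXER, as in the following sketch. The certificate paths and password shown here are placeholders, not defaults:

msiexec.exe /i splunkuniversalforwarder_x86.msi RECEIVING_INDEXER="indexer1:9997" CERTFILE=c:\certs\forwarder.pem ROOTCACERTFILE=c:\certs\ca.pem CERTPASSWORD=<password> AGREETOLICENSE=Yes /quiet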

Test the deployment


Test your configured universal forwarder on a single machine, to make sure it functions correctly, before deploying the universal forwarder across your environment. Confirm that the universal forwarder is getting the desired inputs and sending the right outputs to the indexer. You can use the deployment monitor to validate the universal forwarder.

Deploy a *nix universal forwarder manually


This topic describes how to manually configure and deploy the universal forwarder in a *nix environment, such as Linux or Solaris. It assumes that you're installing directly onto the *nix machine, rather than using a deployment tool. This type of deployment best suits these needs:

small deployments
proof-of-concept test deployments
system image or virtual machine for eventual cloning

If you are interested in a different deployment scenario, look for another topic in this section that better fits your needs. Before following the procedures in this topic, read "Deployment overview".

Steps to deployment
Once you have downloaded the universal forwarder and have planned your deployment, as described in "Deployment overview", perform these steps:

1. Install the universal forwarder.
2. Configure (and optionally migrate) the universal forwarder.
3. Test and tune the deployment.
4. Perform any additional post-installation configuration.

Install the universal forwarder


You can install the universal forwarder on a *nix machine using a package or a tar file. To install the universal forwarder on any of the supported *nix environments, see the set of *nix install topics in the Installation manual:

Install on Linux
Install on Solaris
Install on Mac OS
Install on FreeBSD
Install on AIX
Install on HP-UX

You install the universal forwarder the same way that you install a full Splunk instance, as documented in these topics in the Installation manual. There are only two differences:

The package name.
The default installation directory.


The package name

When installing a package, substitute the name of the universal forwarder package for the full Splunk package name used in the commands in the Installation manual. For example, if installing the universal forwarder onto Red Hat Linux, use this command:

rpm -i splunkforwarder_<package_name>.rpm

instead of this command for a full Splunk instance:

rpm -i splunk_<package_name>.rpm

The only difference is the prefix to the package name: "splunkforwarder", instead of "splunk".

The default install directory

The universal forwarder installs by default in /opt/splunkforwarder. (The default install directory for full Splunk is /opt/splunk.)

Important: Do not install the universal forwarder over an existing installation of full Splunk. This is particularly vital if you will be migrating from a light forwarder as described in "Migrate a *nix light forwarder".

Configure the universal forwarder


The universal forwarder can run as any user on the local system. If you run the universal forwarder as a non-root user, make sure that it has the appropriate permissions to read the inputs that you specify. Refer to the instructions for running Splunk as a non-root user for more information. As part of configuration, you can migrate checkpoint settings from an existing forwarder to the universal forwarder. See "Deployment overview". Use the Splunk CLI to start and configure your universal forwarders.


Start the universal forwarder

Important: If you want to migrate from an existing forwarder, you must perform a specific set of actions before you start the universal forwarder for the first time. See "Migrate a *nix light forwarder" for details.

To start the universal forwarder, run the following command from the $SPLUNK_HOME/bin directory (where $SPLUNK_HOME is the directory into which you installed the universal forwarder):

splunk start

Accept the license agreement automatically

The first time you start the universal forwarder after a new installation, you must accept the license agreement. To start the universal forwarder and accept the license in one step:

splunk start --accept-license

Note: There are two dashes before the accept-license option.

Configuration steps

After you start the universal forwarder and accept the license agreement, follow these steps to configure it:

1. Configure the universal forwarder to auto-start:

splunk enable boot-start

2. Configure universal forwarder to act as a deployment client (optional). To do this, just specify the deployment server:

splunk set deploy-poll <host>:<port>

where: <host> is the deployment server's hostname or IP address and <port> is the port it's listening on.

This step also automatically enables the deployment client functionality.

3. Configure the universal forwarder to forward to a specific receiving indexer, also known as the "receiver" (optional):

splunk add forward-server <host>:<port> -auth <username>:<password>

where:

<host> is the receiving indexer's hostname or IP address and <port> is the port it's listening on. By convention, the receiver listens for forwarders on port 9997, but it can be set to listen on any port, so you'll need to check with the receiver's administrator to obtain the port number. For information on setting up a receiver, see "Enable a receiver".
<username>:<password> is the username and password for logging into the forwarder. By default, these are "admin:changeme". (To set a different password than the default, issue the command: splunk edit user admin -password <new password> -role admin -auth admin:changeme)

During this step, you can also configure a certificate for secure intra-Splunk communications, using a set of optional ssl flags to specify a certificate, root CA, and password. For example:

splunk add forward-server <host>:<port> -ssl-cert-path /path/ssl.crt -ssl-root-ca-path /path/ca.crt -ssl-password <password>

Note: If you do not specify a receiving indexer, be sure to configure the universal forwarder to act as a deployment client, as described in step 2, so that it can later be configured for a receiving indexer.

4. To configure the universal forwarder's inputs, use the CLI add command or edit inputs.conf. See "About the CLI" and subsequent topics for details on using the CLI. For a complete list of CLI commands supported in the universal forwarder, see "Supported CLI commands".
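For instance, to add a file monitoring input with the CLI add command, you might run something like the following from $SPLUNK_HOME/bin. The path and sourcetype here are illustrative only:

./splunk add monitor /var/log/messages -sourcetype syslog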

Test the deployment


Test your configured universal forwarder on a single machine, to make sure it functions correctly, before deploying the universal forwarder across your environment. Confirm that the universal forwarder is getting the desired inputs and sending the right outputs to the indexer. You can use the deployment monitor to validate the universal forwarder.

If you migrated from an existing forwarder, make sure that the universal forwarder is forwarding data from where the old forwarder left off. If it isn't, you probably need to modify or add data inputs, so that they conform to those on the old forwarder. Examine the two inputs.conf files to ensure that the new universal forwarder has all the inputs that you want to maintain.

If you migrated from an existing forwarder, you can delete that old instance once your universal forwarder has been thoroughly tested and you're comfortable with the results.

See "Troubleshoot your deployment" for troubleshooting tips.

Perform additional configuration


In addition to using the CLI, you can update the universal forwarder's configuration by editing its configuration files, such as inputs.conf and outputs.conf, directly. See "Deployment overview" for information. For information on distributing configuration changes across multiple universal forwarders, see "About deployment server".

Deploy the universal forwarder across your environment


If you need just a few universal forwarders, you might find it simpler just to repeat the installation process manually, as documented in this topic. If you need to install a larger number of universal forwarders, however, it will probably be easier to deploy them remotely (using scripting or a deployment tool) or else as part of a system image or virtual machine.

Troubleshoot your deployment


The universal forwarder forwards some internal logs to the receiving indexer. These are:

$SPLUNK_HOME/var/log/splunk/splunkd.log $SPLUNK_HOME/var/log/splunk/license_audit.log


The logs can be searched on the indexer for errors (index=_internal host=<ua-machine>). If the universal forwarder is malfunctioning such that it cannot forward the logs, use a text editor or grep to examine them on the universal forwarder machine itself.
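For example, from the receiving indexer you might narrow the internal logs to errors from a specific forwarder with a search along these lines (the host value is a placeholder):

index=_internal host=<ua-machine> source=*splunkd.log* log_level=ERROR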

Remotely deploy a *nix universal forwarder with a static configuration


One of the main ways to deploy multiple universal forwarders remotely is through scripting. You can also use deployment management tools such as yum and Puppet. This topic focuses on script deployment.

For information on how to install and configure a single universal forwarder, see "Deploy a *nix universal forwarder manually". That topic explains how to install onto a wide variety of *nix platforms from a package or a tar file and how to configure (and optionally migrate) using the Splunk CLI.

Steps to deployment
Once you have downloaded the universal forwarder and have planned your deployment, as described in "Deployment overview", perform these steps:

1. Install and configure the universal forwarder on a test machine, as described in "Deploy a *nix universal forwarder manually".
2. Test and tune the configuration.
3. Create a script wrapper for the installation and configuration commands.
4. Run the script on representative target machines to verify that it works with all required shells.
5. Execute the script against the desired set of hosts.
6. Use the deployment monitor to verify that the universal forwarders are functioning properly.


Create and execute the script


Once you've validated your installation and configuration process by testing a fully configured universal forwarder, you're ready to incorporate the process into a script.

Script requirements

You need to place the installation package or tar file in a network location accessible by the target machines. You can set this up so that the script pushes the file over to each target host, or you can place the file in a generally accessible location, such as an NFS mount. The script is responsible for error reporting. Logging to Splunk, either directly or via a flat file, is recommended.

Sample script

Here's a sample script you can use as a starting point. Note that this is only an example of the type of script you could create for your deployment. The comments in the script provide some guidance on how to modify it for your needs; however, the script will likely require further modification, beyond that indicated by the comments.

Among other things, the script:

Deploys the forwarder's tar file to a list of hosts specified in a file that the HOSTS_FILE variable points to. You will need to provide this file, in the format specified in the script comments.
Specifies the location on each destination host where the tar file will get unpacked.
Specifies a Splunk instance to serve as a deployment server that can subsequently manage and update the forwarders. This is an optional configuration step.
Starts the forwarder executable on each host.

The script is well commented; be sure to study it carefully before modifying it for your environment. Here's the sample deployment script:

#!/bin/sh
# This script provides an example of how to deploy the Splunk universal forwarder
# to many remote hosts via ssh and common Unix commands.
#
# Note that this script will only work unattended if you have SSH host keys
# setup & unlocked.
# To learn more about this subject, do a web search for "openssh key management".

# ----------- Adjust the variables below -----------

# Populate this file with a list of hosts that this script should install to,
# with one host per line. You may use hostnames or IP addresses, as
# applicable. You can also specify a user to login as, for example, "foo@host".
#
# Example file contents:
# server1
# server2.foo.lan
# you@server3
# 10.2.3.4
HOSTS_FILE="/path/to/splunk.install.list"

# This is the path to the Splunk tar file that you wish to push out. You may
# wish to make this a symlink to a versioned Splunk tar file, so as to minimize
# updates to this script in the future.
SPLUNK_FILE="/path/to/splunk-latest.tar.gz"

# This is where the Splunk tar file will be stored on the remote host during
# installation. The file will be removed after installation. You normally will
# not need to set this variable, as $NEW_PARENT will be used by default.
#
# SCRATCH_DIR="/home/your_dir/temp"

# The location in which to unpack the new Splunk tar file on the destination
# host. This can be the same parent dir as for your existing Splunk
# installation (if any). This directory will be created at runtime, if it does
# not exist.
NEW_PARENT="/opt"

# After installation, the forwarder will become a deployment client of this
# host. Specify the host and management (not web) port of the deployment server
# that will be managing these forwarder instances. If you do not wish to use
# a deployment server, you may leave this unset.
#
# DEPLOY_SERV="splunkDeployMaster:8089"

# A directory on the current host in which the output of each installation
# attempt will be logged. This directory need not exist, but the user running
# the script must be able to create it. The output will be stored as
# $LOG_DIR/<[user@]destination host>. If installation on a host fails, a
# corresponding file will also be created, as
# $LOG_DIR/<[user@]destination host>.failed.
LOG_DIR="/tmp/splunkua.install"

# For conversion from normal Splunk installs to the Splunk Universal Agent:
# After installation, records of Splunk's progress in indexing files (monitor)
# and filesystem change events (fschange) can be imported from an existing
# Splunk (non-forwarder) installation. Specify the path to that installation here.
# If there is no prior Splunk instance, you may leave this variable empty ("").
#
# NOTE: THIS SCRIPT WILL STOP THE SPLUNK INSTANCE SPECIFIED HERE.
#
# OLD_SPLUNK="/opt/splunk"

# If you use a non-standard SSH port on the remote hosts, you must set this.
# SSH_PORT=1234

# You must remove this line, or the script will refuse to run. This is to
# ensure that all of the above has been read and set. :)
UNCONFIGURED=1

# ----------- End of user adjustable settings -----------

# helpers.
faillog() {
  echo "$1" >&2
}

fail() {
  faillog "ERROR: $@"
  exit 1
}

# error checks.
test "$UNCONFIGURED" -eq 1 && \
  fail "This script has not been configured. Please see the notes in the script."
test -z "$HOSTS_FILE" && \
  fail "No hosts configured! Please populate HOSTS_FILE."
test -z "$NEW_PARENT" && \
  fail "No installation destination provided! Please set NEW_PARENT."
test -z "$SPLUNK_FILE" && \
  fail "No splunk package path provided! Please populate SPLUNK_FILE."

if [ ! -d "$LOG_DIR" ]; then
  mkdir -p "$LOG_DIR" || fail "Cannot create log dir at \"$LOG_DIR\"!"
fi

# some setup.
if [ -z "$SCRATCH_DIR" ]; then
  SCRATCH_DIR="$NEW_PARENT"
fi
if [ -n "$SSH_PORT" ]; then
  SSH_PORT_ARG="-p${SSH_PORT}"
  SCP_PORT_ARG="-P${SSH_PORT}"
fi
NEW_INSTANCE="$NEW_PARENT/splunkforwarder" # this would need to be edited for non-UA...
DEST_FILE="${SCRATCH_DIR}/splunk.tar.gz"

# # # create script to run remotely. # # #
REMOTE_SCRIPT="
fail() {
  echo ERROR: \"\$@\" >&2
  test -f \"$DEST_FILE\" && rm -f \"$DEST_FILE\"
  exit 1
}
"

### try untarring tar file.
REMOTE_SCRIPT="$REMOTE_SCRIPT
(cd \"$NEW_PARENT\" && tar -zxf \"$DEST_FILE\") || fail \"could not untar $DEST_FILE to $NEW_PARENT.\"
"

### setup seed file to migrate input records from old instance, and stop old instance.
if [ -n "$OLD_SPLUNK" ]; then
  REMOTE_SCRIPT="$REMOTE_SCRIPT
  echo \"$OLD_SPLUNK\" > \"$NEW_INSTANCE/old_splunk.seed\" || fail \"could not create seed file.\"
  \"$OLD_SPLUNK/bin/splunk\" stop || fail \"could not stop existing splunk.\"
  "
fi

### setup deployment client if requested.
if [ -n "$DEPLOY_SERV" ]; then
  REMOTE_SCRIPT="$REMOTE_SCRIPT
  \"$NEW_INSTANCE/bin/splunk\" set deploy-poll \"$DEPLOY_SERV\" --accept-license --answer-yes \
    --auto-ports --no-prompt || fail \"could not setup deployment client\"
  "
fi

### start new instance.
REMOTE_SCRIPT="$REMOTE_SCRIPT
\"$NEW_INSTANCE/bin/splunk\" start --accept-license --answer-yes --auto-ports --no-prompt || \
  fail \"could not start new splunk instance!\"
"

### remove downloaded file.
REMOTE_SCRIPT="$REMOTE_SCRIPT
rm -f \"$DEST_FILE\" || fail \"could not delete downloaded file $DEST_FILE!\"
"

# # # end of remote script. # # #

exec 5>&1 # save stdout.
exec 6>&2 # save stderr.

echo "In 5 seconds, will copy install file and run the following script on each"
echo "remote host:"
echo
echo "===================="
echo "$REMOTE_SCRIPT"
echo "===================="
echo
echo "Press Ctrl-C to cancel..."
test -z "$MORE_FASTER" && sleep 5
echo "Starting."

# main loop. install on each host.
for DST in `cat "$HOSTS_FILE"`; do
  if [ -z "$DST" ]; then continue; fi
  LOG="$LOG_DIR/$DST"
  FAILLOG="${LOG}.failed"
  echo "Installing on host $DST, logging to $LOG."
  # redirect stdout/stderr to logfile.
  exec 1> "$LOG"
  exec 2> "$LOG"
  if ! ssh $SSH_PORT_ARG "$DST" \
      "if [ ! -d \"$NEW_PARENT\" ]; then mkdir -p \"$NEW_PARENT\"; fi"; then
    touch "$FAILLOG"
    # restore stdout/stderr.
    exec 1>&5
    exec 2>&6
    continue
  fi
  # copy tar file to remote host.
  if ! scp $SCP_PORT_ARG "$SPLUNK_FILE" "${DST}:${DEST_FILE}"; then
    touch "$FAILLOG"
    # restore stdout/stderr.
    exec 1>&5
    exec 2>&6
    continue
  fi
  # run script on remote host and log appropriately.
  if ! ssh $SSH_PORT_ARG "$DST" "$REMOTE_SCRIPT"; then
    touch "$FAILLOG" # remote script failed.
  else
    test -e "$FAILLOG" && rm -f "$FAILLOG" # cleanup any past attempt log.
  fi
  # restore stdout/stderr.
  exec 1>&5
  exec 2>&6
  if [ -e "$FAILLOG" ]; then
    echo " --> FAILED <--"
  else
    echo " SUCCEEDED"
  fi
done

FAIL_COUNT=`ls "${LOG_DIR}" | grep -c '\.failed$'`
if [ "$FAIL_COUNT" -gt 0 ]; then
  echo "There were $FAIL_COUNT remote installation failures."
  echo "  ( see ${LOG_DIR}/*.failed )"
else
  echo
  echo "Done."
fi
# Voila.

Execute the script

Once you've executed the script, check any log files generated by your installation script for errors. For example, the sample script saves to /tmp/splunkua.install/<destination hostname>.
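With the sample script above, one quick check is to list the failure marker files that the script creates, for example:

ls /tmp/splunkua.install/*.failed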

Make a universal forwarder part of a system image


This topic describes how to deploy a universal forwarder as part of a system image or virtual machine. This is particularly useful if you have a large number of universal forwarders to deploy. If you have just a few, you might find it simpler to install them manually, as described for Windows and *nix machines. Before following the procedures in this topic, read "Deployment overview".

Steps to deployment
Once you have downloaded the universal forwarder and have planned your deployment, as described in "Deployment overview", perform these steps:


1. Install the universal forwarder on a test machine. See below.
2. Perform any post-installation configuration, as described below.
3. Test and tune the deployment, as described below.
4. Install the universal forwarder with the tested configuration onto a source machine.
5. Stop the universal forwarder.
6. Run this CLI command on the forwarder:

./splunk clone-prep-clear-config

This clears instance-specific information, such as the server name and GUID, from the forwarder. This information will then be configured on each cloned forwarder at initial start-up.

7. Prep your image or virtual machine, as necessary, for cloning.
8. On *nix systems, set the splunkd daemon to start on boot using cron or your scheduling system of choice (see the sketch after this list). On Windows, set the service to Automatic but do not start it.
9. Distribute system image or virtual machine clones to machines across your environment and start them.
10. Use the deployment monitor to verify that the cloned universal forwarders are functioning.
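For step 8 on *nix, one possible approach is a cron @reboot entry on the cloned image (where your cron supports it); this sketch assumes the default install path, and splunk enable boot-start is an alternative if you prefer init scripts:

@reboot /opt/splunkforwarder/bin/splunk start --accept-license --answer-yes --no-prompt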

Referenced procedures
Steps in the above deployment procedure reference these subtopics.

Install the universal forwarder

Install the universal forwarder using the procedure specific to your operating system:

To install on a *nix machine, see "Deploy a *nix universal forwarder manually".

For a Windows machine, you can use the installer GUI or the command line interface. To install with the GUI, see "Deploy a Windows universal forwarder via the installer GUI". For information on the command line interface, see "Deploy a Windows universal forwarder via the command line".

Important: On a Windows machine, if you do not want the universal forwarder to start immediately after installation, you must use the command line interface. Using the proper command line flags, you can configure the universal forwarder so that it does not start on the source machine when installed but does start automatically on the clones, once they're activated.

At the time of installation, you can also configure the universal forwarder. See "General configuration issues" in the Deployment Overview.

Perform additional configuration

You can update your universal forwarder's configuration, post-installation, by directly editing its configuration files, such as inputs.conf and outputs.conf. See "Deployment overview" for information. For information on distributing configuration changes across multiple universal forwarders, see "About deployment server".

Test the deployment

Test your configured universal forwarder on a single machine, to make sure it functions correctly, before deploying the universal forwarder across your environment. Confirm that the universal forwarder is getting the desired inputs and sending the right outputs to the indexer. You can use the deployment monitor to validate the universal forwarder.

Migrate a Windows light forwarder


If you want to replace an existing light forwarder with a universal forwarder, you need to first migrate its checkpoint data to the new forwarder. Checkpoint data is internal data that the forwarder compiles to keep track of what data it has already forwarded to an indexer. By migrating the checkpoint data, you prevent the new universal forwarder from forwarding any data already sent by the old light forwarder. This ensures that the same data does not get indexed twice.


You can migrate checkpoint data from an existing Windows light forwarder (version 4.0 or later) to the universal forwarder. For an overview of migration, see "Migrating from a light forwarder" in the Deployment Overview.

If you want to migrate, you must do so during the installation process. You cannot migrate post-installation. You perform a Windows installation with either the installer GUI or the command line:

If you use the installer GUI, one of the screens will prompt you to migrate. See "Deploy a Windows universal forwarder via the installer GUI" for a walkthrough of the GUI installation procedure.
If you install via the command line, the flag MIGRATESPLUNK=1 specifies migration. See "Deploy a Windows universal forwarder via the command line" for a list of supported flags and how to use them to configure your installation.

Important: You must install the universal forwarder in a different directory from the existing light forwarder. Since the default install directory for the universal forwarder is C:\Program Files\SplunkUniversalForwarder and the default install directory for full Splunk (including the light forwarder) is C:\Program Files\Splunk, you'll be safe if you just stick with the defaults.

What the installer does

Whichever installation method you use, the Windows installer performs the following actions:

1. Searches for an existing heavy or light forwarder on the machine.
2. Determines whether the forwarder is eligible for migration (it must be at version 4.0 or above).
3. If it finds an eligible forwarder, the GUI offers the user the option of migrating. (The command line installer looks to see whether the MIGRATESPLUNK=1 flag exists.)
4. If the user specifies migration (or the MIGRATESPLUNK=1 flag exists), the installer shuts down any running services (splunkd and, if running, splunkweb) for the existing forwarder. It also sets the startup type of the services to manual, so that they don't start up again upon reboot.

5. Migrates the checkpoint files to the universal forwarder.
6. Completes installation and configuration of the universal forwarder.

What you need to do

At the end of this process, you might want to perform additional configuration on the universal forwarder. Since the migration process only copies checkpoint files, you will probably want to manually copy over the old forwarder's inputs.conf configuration file (or at least examine it, to determine what data inputs it was monitoring). Once the universal forwarder is up and running (and after you've tested to ensure migration worked correctly), you can uninstall the old forwarder.

Migrate a *nix light forwarder


If you want to replace an existing light forwarder with a universal forwarder, you need to first migrate its checkpoint data to the new forwarder. Checkpoint data is internal data that the forwarder compiles to keep track of what data it has already forwarded to an indexer. By migrating the checkpoint data, you prevent the new universal forwarder from forwarding any data already sent by the old light forwarder. This ensures that the same data does not get indexed twice.

You can migrate checkpoint data from an existing *nix light forwarder (version 4.0 or later) to the universal forwarder. For an overview of migration, see "Migrating from a light forwarder" in the Deployment Overview.

Important: Migration can only occur the first time you start the universal forwarder, post-installation. You cannot migrate at any later point.

To migrate, do the following:

1. Stop any services (splunkd and splunkweb, if running) for the existing forwarder:

$SPLUNK_HOME/bin/splunk stop

2. Complete the basic installation of the universal forwarder, as described in "Deploy a *nix universal forwarder manually". Do not yet start the universal forwarder.

Important: Make sure you install the universal forwarder into a different directory from the existing light forwarder. Since the default install directory for the universal forwarder is /opt/splunkforwarder and the default install directory for full Splunk (including the light forwarder) is /opt/splunk, you'll be safe if you just stick with the defaults.

3. In the universal forwarder's installation directory (the new $SPLUNK_HOME), create a file named old_splunk.seed; in other words: $SPLUNK_HOME/old_splunk.seed. This file must contain a single line, consisting of the path of the old forwarder's $SPLUNK_HOME directory. For example: /opt/splunk. (A short sketch of this step follows the procedure.)

4. Start the universal forwarder:

$SPLUNK_HOME/bin/splunk start

The universal forwarder will migrate the checkpoint files from the forwarder specified in the $SPLUNK_HOME/old_splunk.seed file. Migration only occurs the first time you run the start command. You can leave the old_splunk.seed file in place; Splunk only looks at it the first time you start the forwarder after installing it.

5. Perform any additional configuration of the universal forwarder, as described in "Deploy a *nix universal forwarder manually". Since the migration process only copies checkpoint files, you will probably want to manually copy over the old forwarder's inputs.conf configuration file (or at least examine it, to determine what data inputs it was monitoring).

Once the universal forwarder is up and running (and after you've tested to ensure migration worked correctly), you can uninstall the old forwarder.
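As an illustration of step 3, if the old light forwarder lives in /opt/splunk and the new universal forwarder in /opt/splunkforwarder, creating the seed file could be as simple as:

echo /opt/splunk > /opt/splunkforwarder/old_splunk.seed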

Supported CLI commands


The universal forwarder supports a subset of objects for use in CLI commands. Certain objects valid in full Splunk, like index (as in add index), make no sense in the context of the universal forwarder. Commands act upon objects. If you type an invalid command/object combination, the universal forwarder will return an error message.


Valid CLI objects


The universal forwarder supports all CLI commands for these objects:

ad
app
config
datastore-dir
default-hostname
deploy-client
deploy-poll
eventlog
exec
forward-server
monitor
oneshot
perfmon
registry
servername
splunkd-port
tcp
udp
user
wmi

Note: A few commands, such as start and stop, can be run without an object. A command with no object is also valid for the universal forwarder.

A brief introduction to CLI syntax


The general syntax for a CLI command is:

./splunk <command> [<object>] [[-<parameter>] <value>]...

As described above, it's the object that determines whether a command is valid in the universal forwarder. For example, the above list includes the monitor object. Therefore, the add monitor and edit monitor command/object combinations are both valid. For more information on the monitor object, see "Use the CLI to monitor files and directories" in the Getting Data In manual. For more details on using the CLI in general, see the "Use Splunk's command line interface" chapter in the Admin manual. In particular, the topic "CLI admin commands" provides details on CLI syntax, including a list of all commands supported by full Splunk and the objects they can act upon.
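For example, because monitor and forward-server appear in the list of valid objects above, command/object combinations like these should be accepted by the universal forwarder:

./splunk list monitor
./splunk list forward-server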

Deploy heavy and light forwarders


Deploy a heavy or light forwarder
Note: The light forwarder has been deprecated in Splunk version 6.0. For a list of all deprecated features, see the topic "Deprecated features" in the Release Notes.

To enable forwarding and receiving, you configure both a receiver and a forwarder. The receiver is the Splunk instance receiving the data; the forwarder sends data to the receiver. You must first set up the receiver. You can then set up forwarders to send data to that receiver.

Note: The receiver must be running the same (or later) version of Splunk as its forwarder. A 4.0 receiver can receive data from a 3.4 forwarder, but a 3.4 receiver cannot receive from a 4.0 forwarder.

Setting up a heavy or light forwarder is a two step process:

1. Install a full Splunk instance.
2. Enable forwarding on the Splunk instance.

The sections that follow describe these steps in detail.

Important: This topic describes deployment and configuration issues specific to light and heavy forwarders. For information on how to deploy a universal forwarder, see "Universal forwarder deployment overview". For information on how to enable receiver functionality on a Splunk instance, see "Enable a receiver".

Install a full Splunk instance


To deploy a light or heavy forwarder, you must first install a full Splunk instance. For detailed information about installing Splunk, including system requirements and licensing issues, see the Installation manual.

Once the Splunk instance has been installed, you can enable forwarder functionality on it. You can also determine whether the forwarder should be a light forwarder or a heavy forwarder. For information on the differences between these types of forwarders, see "Types of forwarders".

Set up forwarding
You can use Splunk Web or the Splunk CLI as a quick way to enable forwarding in a Splunk instance. You can also enable, as well as configure, forwarding by creating an outputs.conf file for the Splunk instance. Although setting up forwarders with outputs.conf requires a bit more initial knowledge, there are obvious advantages to performing all forwarder configurations in a single location. Most advanced configuration options are available only through outputs.conf. In addition, if you will be enabling and configuring a number of forwarders, you can easily accomplish this by editing a single outputs.conf file and making a copy for each forwarder. See the topic "Configure forwarders with outputs.conf" for more information.

Note: When you install a Splunk instance to be used as a light forwarder, select the forwarder license. You can then enable the light forwarder, as described below. For a heavy forwarder that performs indexing, you need an Enterprise license. For more information, see "Types of Splunk licenses".

Set up heavy forwarding with Splunk Web

Use Splunk Manager to set up a forwarder. To set up a heavy forwarder:

1. Log into Splunk Web as admin on the server that will be forwarding data.
2. Click the Manager link in the upper right corner.
3. Select Forwarding and receiving in the Data area.
4. Click Add new in the Forward data section.
5. Enter the hostname or IP address for the receiving Splunk instance(s), along with the receiving port specified when the receiver was configured. For example, you might enter: receivingserver.com:9997. You can enter multiple hosts as a comma-separated list.
6. Click Save.

You must restart Splunk to complete the process.

You can use Splunk Web to perform one other configuration (for heavy forwarders only). To store a copy of indexed data local to the forwarder:

1. From Forwarding and receiving, select Forwarding defaults.
2. Select Yes to store and maintain a local copy of the indexed data on the forwarder.

Important: A heavy forwarder has a key advantage over light and universal forwarders in that it can index your data locally, as well as forward the data to another Splunk index. However, local indexing is turned off by default. If you want to store data on the forwarder, you must enable that capability - either in the manner described above or by editing outputs.conf. All other configuration must be done in outputs.conf.

Set up light forwarding with Splunk Web

Use Splunk Manager to set up a forwarder. To enable light forwarding, you must first enable heavy forwarding on the Splunk instance. Then you separately enable light forwarding. This procedure combines the two processes:

1. Log into Splunk Web as admin on the server that will be forwarding data.
2. Click the Manager link in the upper right corner.
3. Select Forwarding and receiving in the Data area.
4. Click Add new in the Forward data section.
5. Enter the hostname or IP address for the receiving Splunk instance, along with the receiving port specified when the receiver was configured. For example, you might enter: receivingserver.com:9997.
6. Click Save.
7. Return to Manager > Forwarding and receiving.
8. Click Enable lightweight forwarding in the Forward data section.

You must restart Splunk to complete the process.


Important: When you enable a light forwarder, Splunk Web is immediately disabled. You will then need to use the Splunk CLI or outputs.conf to perform any further configuration on the forwarder. Therefore, if you want to use Splunk Web to configure your forwarder, do so before you enable light forwarding.

Set up forwarding with the Splunk CLI

With the CLI, setting up forwarding is a two step process. First you enable forwarding on the Splunk instance. Then you start forwarding to a specified receiver.

To access the CLI, first navigate to $SPLUNK_HOME/bin/. This is unnecessary if you have added Splunk to your path. To enable the forwarder mode, enter:

./splunk enable app [SplunkForwarder|SplunkLightForwarder] -auth <username>:<password>

Note: In the CLI enable command, SplunkForwarder represents the heavy forwarder.

Important: After this step, make sure you restart your Splunk instance as indicated! Attempting to start forwarding activity using the CLI before restarting splunkd will not work!

To disable the forwarder mode, enter:

./splunk disable app [SplunkForwarder|SplunkLightForwarder] -auth <username>:<password>

By disabling forwarding, this command reverts the Splunk instance to a full server.

Start forwarding activity from the Splunk CLI

To access the CLI, first navigate to $SPLUNK_HOME/bin/. This is unnecessary if you have added Splunk to your path. To start forwarding activity, enter:


./splunk add forward-server <host>:<port> -auth <username>:<password>

To end forwarding activity, enter:

./splunk remove forward-server <host>:<port> -auth <username>:<password>

Note: Although this command ends forwarding activity, the Splunk instance remains configured as a forwarder. To revert the instance to a full Splunk server, use the disable command:

./splunk disable app [SplunkForwarder|SplunkLightForwarder] -auth <username>:<password>

Important: Make sure you restart your Splunk instance as indicated by the CLI to take these changes into account.

Upgrade a forwarder
To upgrade a forwarder to a new version, just upgrade the Splunk instance in the usual fashion. For details, read the upgrade section of the Installation manual.

Important: Before doing an upgrade, consider whether you really need to. In many cases, there's no compelling reason to upgrade a forwarder. Forwarders are always compatible with later version indexers, so you do not need to upgrade them just because you've upgraded the indexers they're sending data to.

Back up your files first

Before you perform the upgrade, we strongly recommend that you back up all of your files. Most importantly, back up your Splunk configuration files. For information on backing up configurations, read "Back up configuration information" in the Admin manual. If you're upgrading a heavy forwarder that's indexing data locally, you also need to back up the indexed data. For information on backing up data, read "Back up indexed data" in the Managing Indexers and Clusters manual.

Splunk does not provide a means of downgrading to previous versions; if you need to revert to an older forwarder release, just reinstall it.
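One simple way to back up the configuration files is to archive $SPLUNK_HOME/etc before upgrading; this is just an illustrative sketch, and the archive name and location are arbitrary:

tar -czf /tmp/splunk-etc-backup.tar.gz $SPLUNK_HOME/etc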


Heavy and light forwarder capabilities


Certain capabilities are disabled in heavy and light forwarders. This section describes forwarder capabilities in detail. Note: The light forwarder has been deprecated in Splunk version 6.0. For a list of all deprecated features, see the topic "Deprecated features" in the Release Notes.

Splunk heavy forwarder details


The heavy forwarder has all Splunk functions and modules enabled by default, with the exception of the distributed search module. The file $SPLUNK_HOME/etc/apps/SplunkForwarder/default/default-mode.conf includes this stanza:

[pipeline:distributedSearch] disabled = true

For a detailed view of the exact configuration, see the configuration files for the SplunkForwarder application in $SPLUNK_HOME/etc/apps/SplunkForwarder/default.

Splunk light forwarder details


Most features of Splunk are disabled in the Splunk light forwarder. Specifically, the Splunk light forwarder:

Disables event signing and checking whether the disk is full ($SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/default-mode.conf).
Limits internal data inputs to splunkd and metrics logs only, and makes sure these are forwarded ($SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/inputs.conf).
Disables all indexing ($SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/indexes.conf).
Does not use transforms.conf and does not fully parse incoming data, but the CHARSET, CHECK_FOR_HEADER, NO_BINARY_CHECK, PREFIX_SOURCETYPE, and sourcetype properties from props.conf are used.
Disables the Splunk Web interface ($SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/web.conf).
Limits throughput to 256KBps ($SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/limits.conf).

Disables the following modules in $SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/default-mode.conf:

[pipeline:indexerPipe]
disabled_processors= indexandforward, diskusage, signing, tcp-output-generic-processor, syslog-output-generic-processor, http-output-generic-processor, stream-output-processor

[pipeline:distributedDeployment]
disabled = true

[pipeline:distributedSearch]
disabled = true

[pipeline:fifo]
disabled = true

[pipeline:merging]
disabled = true

[pipeline:typing]
disabled = true

[pipeline:udp]
disabled = true

[pipeline:tcp]
disabled = true

[pipeline:syslogfifo]
disabled = true

[pipeline:syslogudp]
disabled = true

[pipeline:parsing]
disabled_processors=utf8, linebreaker, header, sendOut

[pipeline:scheduler]
disabled_processors = LiveSplunks

These modules include the deployment server (not the deployment client), distributed search, named pipes/FIFOs, direct input from network ports, and the scheduler. The defaults for the light forwarder can be tuned to meet your needs by overriding the settings in $SPLUNK_HOME/etc/apps/SplunkLightForwarder/default/default-mode.conf on a case-by-case basis.

Purge old indexes

When you convert a Splunk indexer instance to a light forwarder, among other things, you disable indexing. In addition, you no longer have access to any data previously indexed on that instance. However, the data still exists. If you want to purge that data from your system, you must first disable the SplunkLightForwarder app, then run the CLI clean command, and then re-enable the app. For information on the clean command, see "Remove indexed data from Splunk" in the Managing Indexers and Clusters manual.
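A rough sketch of that sequence from $SPLUNK_HOME/bin follows. Note that the clean command requires splunkd to be stopped, and restarts may be needed for the app changes to take effect; treat this as an illustration rather than an exact procedure:

./splunk disable app SplunkLightForwarder -auth admin:changeme
./splunk stop
./splunk clean eventdata -f
./splunk start
./splunk enable app SplunkLightForwarder -auth admin:changeme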


Search across multiple indexers


About distributed search
In distributed search, a Splunk instance called a search head sends search requests to a group of Splunk indexers, which perform the actual searches on their indexes. The search head then merges the results back to the user. In a typical scenario, one search head manages searches on several indexers.

These are some of the key use cases for distributed search:

Horizontal scaling for enhanced performance. Distributed search facilitates horizontal scaling by providing a way to distribute the indexing and searching loads across multiple Splunk components, making it possible to search and index large quantities of data.
Access control. You can use distributed search to control access to indexed data. In a typical situation, some users, such as security personnel, might need access to data across the enterprise, while others need access only to data in their functional area.
Managing geo-dispersed data. Distributed search allows local offices to access their own data, while maintaining centralized access at the corporate level. Chicago and San Francisco can look just at their local data; headquarters in New York can search its local data, as well as the data in Chicago and San Francisco.

The Splunk instance that does the searching is referred to as the search head. The Splunk indexers that participate in a distributed search are called search peers or indexer nodes. A search head can also index and serve as a search peer. However, in performance-based use cases, such as horizontal scaling, it is recommended that the search head only search and not index. In that case, it is referred to as a dedicated search head.

A search head by default runs its searches across all its search peers. You can limit a search to one or more search peers by specifying the splunk_server field in your query. See "Search across one or more distributed servers" in the Search manual.
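For example, to restrict a search to a single peer, you might include the splunk_server field in the query; the index, sourcetype, and server name here are placeholders:

index=main sourcetype=access_combined splunk_server=peerA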


You can run multiple search heads across a set of search peers. To coordinate the activity of multiple search heads (so that they share configuration settings, search artifacts, and job management), you need to enable search head pooling.

Some search scenarios


This diagram shows a simple distributed search scenario for horizontal scaling, with one search head searching across three peers:

In this diagram showing a distributed search scenario for access control, a "security" department search head has visibility into all the indexing search peers. Each search peer also has the ability to search its own data. In addition, the department A search peer has access to both its data and the data of department B:

Finally, this diagram shows load balancing with distributed search. There's a dedicated search head and a search head on each indexer. All the search heads can search across the entire set of indexers:


For more information on load balancing, see "Set up load balancing" in this manual. For information on Splunk distributed searches and capacity planning, see "Dividing up indexing and searching" in the Installation manual.

Search heads and clusters


In index replication, clusters use search heads to search across the set of indexers, or peer nodes. You deploy and configure search heads very differently when they are part of a cluster. To learn more about search heads and clusters, read "Configure the search head" in the Managing Indexers and Clusters Manual.

What search heads send to search peers


When initiating a distributed search, the search head replicates and distributes its knowledge objects to its search peers. Knowledge objects include saved searches, event types, and other entities used in searching across indexes. The search head needs to distribute this material to its search peers so that they can properly execute queries on its behalf. The set of data that the search head distributes is called the knowledge bundle.

The indexers use the search head's knowledge bundle to execute queries on its behalf. When executing a distributed search, the indexers are ignorant of any local knowledge objects. They have access only to the objects in the search head's knowledge bundle.

The process of distributing knowledge bundles means that indexers by default receive nearly the entire contents of all the search head's apps. If an app contains large binaries that do not need to be shared with the indexers, you can reduce the size of the bundle by means of the [replicationWhitelist] or [replicationBlacklist] stanza in distsearch.conf. See "Limit knowledge bundle size" in this manual.
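As an illustration, a [replicationBlacklist] entry in distsearch.conf on the search head might exclude a large lookup file from the bundle; the stanza value is a regular expression, and the app and file names here are hypothetical:

[replicationBlacklist]
excludeBigLookup = apps[/\\]myapp[/\\]lookups[/\\]huge_lookup\.csv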

The knowledge bundle gets distributed to the $SPLUNK_HOME/var/run/searchpeers/ directory on each search peer. Because the search head distributes its knowledge, search scripts should not hardcode paths to resources. The knowledge bundle will reside at a different location on the search peer's file system, so hardcoded paths will not work properly.

By default, the search head replicates and distributes the knowledge bundle to each search peer. For greater efficiency, you can instead tell the search peers to mount the knowledge bundle's directory location, eliminating the need for bundle replication. When you mount a knowledge bundle, it's referred to as a mounted bundle. To learn how to mount bundles, read "Mount the knowledge bundle".

User authorization
All authorization for a distributed search originates from the search head. At the time it sends the search request to its search peers, the search head also distributes the authorization information. It tells the search peers the name of the user running the search, the user's role, and the location of the distributed authorize.conf file containing the authorization information.

Licenses for distributed deployments


Each indexer node in a distributed deployment must have access to a license pool. Search heads performing no indexing or only summary indexing can use the forwarder license. If the search head performs any other type of indexing, it must have access to a license pool. See "Licenses for search heads" in the Installation manual for a detailed discussion of licensing issues.

Cross-version compatibility
It's recommended that you upgrade search heads and search peers to any new version at the same time to take full advantage of the latest search capabilities. This section describes the consequences of deploying multi-version distributed search for specific scenarios.


All search nodes must be 4.x or later

All search nodes must be running Splunk 4.x or 5.x to participate in the distributed search. Distributed search is not backwards compatible with Splunk 3.x.

Search nodes and 5.0 features

You need to upgrade both search heads and search peers to version 5.0 to take advantage of search capabilities that are new to 5.0, such as report acceleration.

Compatibility between 4.3 search heads and 4.2 search peers

You can run 4.3 search heads across 4.2 search peers, but some 4.3 search-related functionality will not be available. These are the main features that require 4.3 search peers:

The spath search command.
Bloom filters.
Historical backfill for real-time data from the search peers.

Also, note that 4.3-specific stats/chart/timechart functionality is less efficient when used against 4.2.x search peers because the search peers can't provide map/reduce capability for that functionality. The functionality affected includes sparklines and the earliest() and latest() functions.

Compatibility between 4.2.5+ search heads and pre-4.2.5 search peers

Because of certain feature incompatibilities, pre-4.2.5 search peers can consume 20-30% more CPU resources when deployed with a 4.2.5 or later search head. You might see error messages such as "ConfObjectManagerDB - Ignoring invalid database setting" in splunkd.log on the search peers.

Bundle replication warning when running a 4.2 search head against a 4.1.x search peer

Bundle replication is the process by which the search head distributes knowledge bundles, containing the search-time configurations, to its search peers. This ensures that all peers run searches using the same configurations, so that, for example, all peers use the same definition of an event type.

Starting with 4.2, bundle replication occurs asynchronously. The search head performs bundle replication in a non-blocking fashion that allows in-progress searches to continue on the search peers. When issuing searches, the search head specifies the bundle version that the peers must use to run those searches. The peers will not start using the newly replicated bundles until the search head confirms that all peers have the latest bundle version.

However, the new 4.2 search head behavior can cause pre-4.2 search peers to get out of sync and use different bundles when running their searches. If you run a 4.2 search head against 4.1.x search peers, you'll get this warning message:

"Asynchronous bundle replication might cause (pre 4.2) search peers to run searches with different bundle/config versions. Results might not be correct."

Note: This issue goes away in 4.2.1 search heads. Starting with 4.2.1, the search head will revert to synchronized bundle replication if any of the search peers is pre-4.2.

Install a dedicated search head


Distributed search is enabled by default on every Splunk instance, with the exception of forwarders. This means that every Splunk server can function as a search head to a specified group of indexers, referred to as search peers.

In some cases, you might want a single Splunk instance to serve as both a search head and a search peer. In other cases, however, you might want to set up a dedicated search head. A dedicated search head performs only searching; it does not do any indexing.

Note: If you do want to use a Splunk instance as both a search head and a search peer, or otherwise perform indexing on the search head, just install the search head as a regular Splunk instance with a normal license, as described in "About Splunk licenses" in the Installation manual.

To install a dedicated search head, follow these steps:

1. Determine your hardware needs by reading this topic in the Installation manual.

2. Install Splunk, as described in the topic in the Installation manual specific to your operating system.

3. Add the search head to your Enterprise license group even if it's a dedicated search head that's not expected to index any data. For more information, see "Types of Splunk licenses".

4. Establish distributed search from the search head to all the indexers, or "search peers", you want it to search. See "Configure distributed search" for how to do this.

5. Log in to your search head and do a search for *. Look at your results and check the splunk_server field. Verify that all your search peers are listed in that field (see the example search after these steps).

6. Set up the authentication method you want to use on the search head, just as you would for any other Splunk instance.

Do not set up any indexing on your search head, since that will violate its license.

Note: Clusters use search heads to search across the set of indexers, or peer nodes. You deploy search heads very differently when they are part of a cluster. To learn more about deploying search heads in clusters, read "Enable the search head" in the Managing Indexers and Clusters Manual.
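As a quick way to perform the check in step 5, you could run a search such as the following and confirm that every peer appears in the results. This is just one illustrative approach:

* | stats count by splunk_server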

Configure distributed search


Distributed search is available on every Splunk server, with the exception of forwarders. This means that every Splunk server can function as a search head to a specified group of indexers, referred to as search peers.

The distributed search capability is enabled by default. To activate distributed search, a Splunk instance that you designate as a search head just needs to add search peers. Ordinarily, you do this by specifying each search peer manually. A number of other configuration options are available, but ordinarily you do not need to alter their default values.

When configuring Splunk instances as search heads or search peers, keep this key distinction in mind:

A search head must maintain a list of search peers, or it will have nothing to search on. A dedicated search head does not have additional data inputs, beyond the default ones.

A search peer must have specified data inputs, or it will have nothing to index. A search peer does not maintain a list of other search peers.

These roles are not necessarily distinct. A Splunk instance can, and frequently does, function simultaneously as both a search head and a search peer.


Important: Clusters also use search heads to search across the set of indexers, or peer nodes. You deploy and configure search heads very differently when they are part of a cluster. To learn more about configuring search heads in clusters, read "Configure the search head" in the Managing Indexers and Clusters Manual.

Configuration overview
You set up distributed search on a search head using any of these methods:

Splunk Web
Splunk CLI
The distsearch.conf configuration file

Splunk Web is the recommended method for most purposes.

The configuration happens on the designated search head. The main step is to specify the search head's search peers. The distributed search capability itself is already enabled by default.

Important: Before an indexer can function as a search peer, you must change its password from the default "changeme". Otherwise, the search head will not be able to authenticate against it.

Aside from changing the password, no configuration is generally necessary on the search peers. Access to the peers is controllable through public key authentication. However, you do need to perform some configuration on the search peers if you mount the knowledge bundles, so that the peers can access them. See "Mount the knowledge bundle" for details.
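For example, you can change a search peer's admin password from the CLI on that peer. The new password shown here is only a placeholder:

splunk edit user admin -password <new_password> -auth admin:changeme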

Use Splunk Web


Specify the search peers

You specify the search peers through Manager on the search head:

1. Log into Splunk Web and click Manager.

2. Click Distributed search in the Distributed Environment area.

3. Click Search peers.

4. On the Search peers page, select New.

5. Specify the search peer, along with any authentication settings.

6. Click Save.

7. Repeat for each of the search head's search peers.

Configure miscellaneous distributed search settings

To configure other settings:

1. Log into Splunk Web and click Manager.

2. Click Distributed search in the Distributed Environment area.

3. Click Distributed search setup.

4. Change any settings as needed.

5. Click Save.

Use the CLI


Follow these instructions to enable distributed search via Splunk's CLI. To use Splunk's CLI, navigate to the $SPLUNK_HOME/bin/ directory and use the ./splunk command.

Enable distributed search

Distributed search is enabled by default, so this step is ordinarily not required:

splunk enable dist-search -auth admin:password

Add a search peer

Use the splunk add search-server command to add a search peer. When you run this command, make sure you specify the splunkd management port of the peer server. By default, this is 8089, although it might be different for your deployment.

Be sure to provide credentials for both the local and the remote machines. Use the -auth flag for your local credentials and the -remoteUsername and -remotePassword flags to specify the remote credentials (in this example, for search peer 10.10.10.10):

splunk add search-server -host 10.10.10.10:8089 -auth admin:password -remoteUsername admin -remotePassword passremote

A message indicates success, along with the need to restart the server.

Edit distsearch.conf
In most cases, the settings available through Splunk Web provide sufficient options for configuring distributed search environments. Some advanced configuration settings, however, are only available through distsearch.conf. Edit this file in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/.

For the detailed specification and examples, see distsearch.conf. For more information on configuration files in general, see "About configuration files".

Distribute the key files

If you add search peers via Splunk Web or the CLI, Splunk automatically handles authentication. However, if you add peers by editing distsearch.conf, you must distribute the key files manually. After enabling distributed search on a Splunk instance (and restarting the instance), you will find the keys in this location:
$SPLUNK_HOME/etc/auth/distServerKeys/

Distribute the file $SPLUNK_HOME/etc/auth/distServerKeys/trusted.pem on the search head to $SPLUNK_HOME/etc/auth/distServerKeys/<searchhead_name>/trusted.pem on the indexer nodes.
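For example, assuming the search head is named searcher01, the indexer's host name is indexer01, and Splunk is installed at /opt/splunk on the indexer (all hypothetical values), you could copy the key with standard tools like these, and then restart the indexer:

ssh admin@indexer01 "mkdir -p /opt/splunk/etc/auth/distServerKeys/searcher01"
scp $SPLUNK_HOME/etc/auth/distServerKeys/trusted.pem admin@indexer01:/opt/splunk/etc/auth/distServerKeys/searcher01/trusted.pem
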
Support for keys from multiple Splunk instances

Any number of Splunk instances can have their certificates stored on other instances for authentication. The instances can store keys in
$SPLUNK_HOME/etc/auth/distServerKeys/<peername>/trusted.pem


For example, if you have Splunk search heads A and B and they both need to search Splunk index node C, do the following:

1. On C, create $SPLUNK_HOME/etc/auth/distServerKeys/A/ and $SPLUNK_HOME/etc/auth/distServerKeys/B/.

2. Copy A's trusted.pem to $SPLUNK_HOME/etc/auth/distServerKeys/A/ and B's trusted.pem to $SPLUNK_HOME/etc/auth/distServerKeys/B/.

3. Restart C.

Limit the knowledge bundle size


The knowledge bundle is the data that the search head replicates and distributes to each search peer to enable its searches. For information on the contents and purpose of this bundle, see "What search heads send to search peers".

The knowledge bundle can grow quite large, as, by default, it includes nearly the entire contents of all the search head's apps. To limit the size of the bundle, you can create a replication whitelist. To do this, edit distsearch.conf and specify a [replicationWhitelist] stanza:

[replicationWhitelist]
<name> = <whitelist_regex>
...

All files that satisfy the whitelist regex will be included in the bundle that the search head distributes to its search peers. If multiple regexes are specified, the bundle will include the union of those files. In this example, the knowledge bundle will include all files with extensions of either ".conf" or ".spec":

[replicationWhitelist]
allConf = *.conf
allSpec = *.spec

The names, such as allConf and allSpec, are used only for layering. That is, if you have both a global and a local copy of distsearch.conf, the local copy can be configured so that it overrides only one of the regexes. For instance, assume that the example shown above is the global copy. Assume you then specify a whitelist in your local copy like this:

[replicationWhitelist]
allConf = *.foo.conf

The two conf files will be layered, with the local copy taking precedence. Thus, the search head will distribute only files that satisfy these two regexes:

allConf = *.foo.conf
allSpec = *.spec

For more information on attribute layering in configuration files, see "Attribute precedence" in the Admin manual.

You can also create a replication blacklist, using the [replicationBlacklist] stanza. The blacklist takes precedence over any whitelist. See distsearch.conf in the Admin manual for details.

As an alternative to replicating and distributing a knowledge bundle, large or small, to search peers, you can mount the knowledge bundle on shared storage. For more information, read "Mount the knowledge bundle".
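As a rough illustration of the blacklist option mentioned above, a stanza like the following would exclude CSV lookup files from the bundle. The stanza name and pattern are only examples; adjust them to your own apps and verify against the distsearch.conf specification:

[replicationBlacklist]
noLookupFiles = *.csv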

Manage distributed server names


The name of each search head and search peer is determined by its serverName attribute, specified in server.conf. serverName defaults to the server's machine name. In a distributed search cluster, all nodes must have unique names.

The serverName has three specific uses:

For authenticating search heads. When search peers are authenticating a search head, they look for the search head's key file in /etc/auth/distServerKeys/<searchhead_name>/trusted.pem.

For identifying search peers in search queries. serverName is the value of the splunk_server field that you specify when you want to query a specific node. See "Search across one or more distributed servers" in the Search manual.

For identifying search peers in search results. serverName gets reported back in the splunk_server field.


Note: serverName is not used when adding search peers to a search head. In that case, you identify the search peers through their domain names or IP addresses. The only reason to change serverName is if you have multiple instances of Splunk residing on a single machine, and they're participating in the same distributed search cluster. In that case, you'll need to change serverName to distinguish them.
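For example, to rename one of the instances, you might set the attribute in its $SPLUNK_HOME/etc/system/local/server.conf and then restart that instance. The name shown here is arbitrary:

[general]
serverName = searchhead02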

Remove a search peer


You can remove a search peer from a search head through the Distributed search page on Manager. However, this only removes the search peer entry from the search head; it does not remove the search head key from the search peer. In most cases, this is not a problem and no further action is needed. If you need to disable the trust relationship between a search peer and a search head, you can delete the search-head-specific trusted.pem file from the directory $SPLUNK_HOME/etc/auth/distServerKeys/<searchhead_name>. It's unlikely that you'll need to do this.

Mount the knowledge bundle


The set of data that a search head distributes to its search peers is called the knowledge bundle. For information on the contents and purpose of this bundle, see "What search heads send to search peers". The bundle contents reside in the search head's $SPLUNK_HOME/etc/{apps,users,system} subdirectories. By default, the search head replicates and distributes the knowledge bundle to each search peer. For greater efficiency, you can instead tell the search peers to mount the knowledge bundle's directory location, eliminating the need for bundle replication. When you mount a knowledge bundle on shared storage, it's referred to as a mounted bundle. Important: Most shared storage solutions don't work well across a WAN. Since mounted bundles require shared storage, you generally should not implement them across a WAN.


Why use mounted bundles


Mounted bundles are useful if you have large amounts of search-time data, which could otherwise slow the replication process. One common cause of slow bundle replication is large lookup tables. Splunk writes a warning in splunkd.log if bundle replication is taking a long time. For example:

DistributedBundleReplicationManager - bundle replication to 2 peer(s) took too long (12943ms), bundle file size=296150KB ...

Mounted bundle architectures


Depending on your search head configuration, there are a number of ways to set up mounted bundles. These are some of the typical ones: For a single search head. Mount the knowledge bundle on shared storage. All the search peers then access the bundle to process search requests. This diagram illustrates a single search head with a mounted bundle on shared storage:

For multiple pooled search heads. For multiple search heads, you can combine mounted bundles with search head pooling. The pooled search heads maintain one bundle on the shared storage, and all search peers access that bundle. This diagram shows search head pooling with a mounted bundle:


For multiple non-pooled search heads. Maintain the knowledge bundle(s) on each search head's local storage (no search head pooling). In this diagram, each search head maintains its own bundle, which each search peer mounts and accesses individually:

There are numerous other architectures you can design with mounted bundles. You could, for example, use shared storage for multiple search heads, but without search head pooling. On the shared storage, you would maintain separate bundles for each search head. The search peers would need to access each bundle individually. In each case, the search peers need access to each search head's $SPLUNK_HOME/etc/{apps,users,system} subdirectories. In the case of search head pooling, the search peers need access to the pool's shared set of subdirectories. Important: The search peers use the mounted directories only when fulfilling the search head's search requests. For indexing and other purposes not directly related to distributed search, the search peers will use their own, local apps, users, and system directories, the same as any other indexer.

Configure mounted bundles


To set up mounted bundles, you need to configure both the search head and its search peers. The procedures described here assume the bundles are on shared storage, but they do not need to be. They just need to be in some location that both the search head and its search peers can access.

Note: It's best not to locate mounted bundles in the search head's local $SPLUNK_HOME path.

These procedures also assume a single search head (no search head pooling). For details on how to configure mounted bundles with search head pooling, see "Use mounted bundles with search head pooling" below.

Configure the search head

Here are the steps you take on the search head:

1. Mount the bundle subdirectories ($SPLUNK_HOME/etc/{apps,users,system}) on shared storage. The simplest way to do this is to mount the search head's entire $SPLUNK_HOME/etc directory:

On *nix platforms, set up an NFS mount.
On Windows, set up a CIFS (SMB) share.

Important: The search head's Splunk user account needs read/write access to the shared storage location. The search peers need read access to the bundle subdirectories.

2. In the distsearch.conf file on the search head, set:

shareBundles=false

This stops the search head from replicating bundles to the search peers.

3. Restart the search head.

Configure the search peers

For each search peer, follow these steps to access the mounted bundle:

1. Mount the bundle directory on the search peer.

2. Create a distsearch.conf file in $SPLUNK_HOME/etc/system/local/ on the search peer. For each search head that the peer is connected to, create a [searchhead:<searchhead-splunk-server-name>] stanza, with these attributes:

[searchhead:<searchhead-splunk-server-name>]
mounted_bundles=true
bundles_location=<path_to_bundles>

Note the following:

The search peer's configuration file must contain only the [searchhead:<searchhead-splunk-server-name>] stanza(s). The other stanzas in distsearch.conf are for search heads only.

To identify the <searchhead-splunk-server-name>, run this command on the search head:

splunk show servername

The <path_to_bundles> needs to specify the mountpoint on the search peer, not on the search head. For example, say $SPLUNK_HOME on your search head is /opt/splunk, and you export /opt/splunk/etc via NFS. Then, on the search peer, you mount that NFS share at /mnt/splunk-head. The value of <path_to_bundles> should be /mnt/splunk-head, not /opt/splunk.

Important: If multiple search heads will be distributing searches to this search peer, you must create a separate stanza on the search peer for each of them. This is necessary even if you're using search head pooling.

3. Restart the search peer.

Note: You can optionally set up symbolic links to the bundle subdirectories (apps, users, system) to ensure that the search peer has access only to the necessary subdirectories in the search head's /etc directory. See the following example for details on how to do this.

Example configuration

Here's an example of how to set up mounted bundles on shared storage:


Search head

On a search head whose Splunk server name is "searcher01":

1. Mount the search head's $SPLUNK_HOME/etc directory to shared storage with read/write access.

2. In the distsearch.conf file on the search head, set:

[distributedSearch]
...
shareBundles = false

3. Restart the search head.


Search peers

For each search peer: 1. Mount the search head's $SPLUNK_HOME/etc directory on the search peer to:

/mnt/searcher01

2. (Optional.) Create a directory that consists of symbolic links to the bundle subdirectories:

/opt/shared_bundles/searcher01
/opt/shared_bundles/searcher01/system -> /mnt/searcher01/system
/opt/shared_bundles/searcher01/users -> /mnt/searcher01/users
/opt/shared_bundles/searcher01/apps -> /mnt/searcher01/apps

Note: This optional step is useful for ensuring that the peer has access only to the necessary subdirectories.

3. Create a distsearch.conf file in $SPLUNK_HOME/etc/system/local/ on the search peer, with this stanza:

[searchhead:searcher01]
mounted_bundles = true
bundles_location = /opt/shared_bundles/searcher01

4. Restart the search peer.



5. Repeat the process for each search peer.

Use mounted bundles with search head pooling


The process for configuring mounted bundles is basically no different if you're using search head pooling to manage multiple search heads. A few things to keep in mind:

Use the same shared storage location for both the search head pool and the mounted bundles.

Search head pooling uses a subset of the directories required for mounted bundles. Search head pooling itself only requires that you mount the $SPLUNK_HOME/etc/{apps,users} directories. However, when using mounted bundles, you must also provide a mounted $SPLUNK_HOME/etc/system directory. This doesn't create any conflict among the search heads, as they will always use their own versions of the system directory and ignore the mounted version.

The search peers must create separate stanzas in distsearch.conf for each search head in the pool. The bundles_location in each of those stanzas must be identical.

See "Configure search head pooling" for information on setting up a search head pool.

Example configuration: Search head pooling with mounted bundles

This example shows how to combine search head pooling and mounted bundles in one system. There are two main sections to the example:

1. Set up a search head pool consisting of two search heads. In this part, you also mount the bundles.

2. Set up the search peers so that they can access bundles from the search head pool.

The example assumes you're using an NFS mount for the shared storage location.
Part 1: Set up the search head pool

Before configuring the pool, perform these preliminary steps:


1. Enable two Splunk instances as search heads. This example assumes that the instances are named "searcher01" and "searcher02".

2. Set up a shared storage location accessible to each search head. This example assumes that you set up an NFS mountpoint, specified on the search heads as /mnt/search-head-pooling.

For detailed information on these steps, see "Create a pool of search heads".

Now, configure the search head pool:

1. On each search head, stop splunkd:

splunk stop splunkd

2. On each search head, enable search head pooling. In this example, you're using an NFS mount of /mnt/search-head-pooling as your shared storage location:

splunk pooling enable /mnt/search-head-pooling [--debug]

Among other things, this step creates empty /etc/apps and /etc/users directories under /mnt/search-head-pooling. Step 3 uses those directories.

3. Copy the contents of the $SPLUNK_HOME/etc/apps and $SPLUNK_HOME/etc/users directories on one of the search heads into the /etc/apps and /etc/users subdirectories under /mnt/search-head-pooling:

cp -r $SPLUNK_HOME/etc/apps/* /mnt/search-head-pooling/etc/apps
cp -r $SPLUNK_HOME/etc/users/* /mnt/search-head-pooling/etc/users

4. Copy one search head's $SPLUNK_HOME/etc/system directory to /mnt/search-head-pooling/etc/system.

cp -r $SPLUNK_HOME/etc/system /mnt/search-head-pooling/etc/

5. On each search head, edit the distsearch.conf file to set shareBundles = false:


[distributedSearch]
...
shareBundles = false

6. On each search head, start splunkd:

splunk start splunkd

Your search head pool should now be up and running.


Part 2: Mount bundles on the search peers

Now, mount the bundles on the search peers. On each search peer, perform these steps:

1. Mount the shared storage location (the same location that was earlier set to /mnt/search-head-pooling on the search heads) so that it appears as /mnt/bundles on the peer.

2. Create a directory that consists of symbolic links to the bundle subdirectories:

/opt/shared_bundles/bundles/system -> /mnt/bundles/etc/system
/opt/shared_bundles/bundles/users -> /mnt/bundles/etc/users
/opt/shared_bundles/bundles/apps -> /mnt/bundles/etc/apps

3. Create a distsearch.conf file in $SPLUNK_HOME/etc/system/local/ on the search peer, with stanzas for each of the two search heads:

[searchhead:searcher01]
mounted_bundles = true
bundles_location = /opt/shared_bundles/bundles

[searchhead:searcher02]
mounted_bundles = true
bundles_location = /opt/shared_bundles/bundles

4. Restart the search peer:

splunk restart splunkd


Repeat the process for each search peer.

Configure search head pooling


Important: Search head pooling is an advanced feature. It's recommended that you contact the Splunk sales team to discuss your deployment before attempting to implement it.

Search head pooling feature overview


You can set up multiple search heads so that they share configuration and user data. This is known as search head pooling. The main reason for having multiple search heads is to facilitate horizontal scaling when you have large numbers of users searching across the same data. Search head pooling can also reduce the impact if a search head becomes unavailable.

You must enable search head pooling on each search head, so that they can share configuration and user data. Once search head pooling has been enabled, these categories of objects will be available as common resources across all search heads in the pool:

configuration data -- configuration files containing settings for saved searches and other knowledge objects.
search artifacts -- records of specific search runs.
scheduler state -- so that only one search head in the pool runs a particular scheduled search.

For example, if you create and save a search on one search head, all the other search heads in the pool will automatically have access to it.

Search head pooling makes all files in $SPLUNK_HOME/etc/{apps,users} available for sharing. This includes *.conf files, *.meta files, view files, search scripts, lookup tables, etc.

Important: Be aware of these key issues:

Most shared storage solutions don't perform well across a WAN. Since search head pooling requires low-latency shared storage capable of serving a high number of operations per second, implementing search head pooling across a WAN is not supported.

All search heads in a pool should be running the same version of Splunk. Be sure to upgrade all of them at once. See "Upgrade your distributed deployment" for details.

Search head pooling and knowledge bundles


The set of data that a search head distributes to its search peers is known as the knowledge bundle. For details, see "What search heads send to search peers".

By default, only one search head in a search head pool sends the knowledge bundle to the set of search peers. Also, if search heads in a pool are also search peers of each other, they will not send bundles to each other, since they can access the bundles in the pool. This is an optimization introduced in version 4.3.2 but made the default in version 5.0. It is controllable by means of the useSHPBundleReplication attribute in distsearch.conf.

As a further optimization, you can mount knowledge bundles on shared storage, as described in "Mount the knowledge bundle". By doing so, you eliminate the need to distribute the bundle to the search peers. For information on how to combine search head pooling with mounted knowledge bundles, read the section in that topic called "Use mounted bundles with search head pooling".

Create a pool of search heads


To create a pool of search heads, follow these steps:

1. Set up a shared storage location accessible to each search head.

2. Configure each individual search head.

3. Stop the search heads.

4. Enable pooling on each search head.

5. Copy user and app directories to the shared storage location.

6. Restart the search heads.

The steps are described below in detail:

1. Set up a shared storage location accessible to each search head

So that each search head in a pool can share configurations and artifacts, they need to access a common set of files via shared storage:

On *nix platforms, set up an NFS mount.
On Windows, set up a CIFS (SMB) share.

Important: The Splunk user account needs read/write access to the shared storage location. When installing a search head on Windows, be sure to install it as a user with read/write access to shared storage. The Local System user does not have this access. For more information, see "Choose the user Splunk should run as" in the Installation manual.

2. Configure each search head

a. Set up each search head individually, specifying the search peers in the usual fashion. See "Configure distributed search".

b. Make sure that each search head has a unique serverName attribute, configured in server.conf. See "Manage distributed server names" for detailed information on this requirement. If the search head does not have a unique serverName, Splunk will generate a warning at start-up. See "Warning about unique serverName attribute" for details.

c. Specify the necessary authentication. You have two choices:

Specify user authentication on each search head separately. A valid user on one search head is not automatically a user on another search head in the pool. You can use LDAP to centrally manage user authentication, as described in "Set up user authentication with LDAP".

Place a common authentication configuration on shared storage, to be used by all pool members. You must restart the pool members after any change to the authentication.

Note: Any authentication change made on an individual pool member (for example, via its Splunk Manager) overrides for that pool member only any configuration on shared storage. You should, therefore, generally avoid making authentication changes through Splunk Manager if a common configuration already exists on shared storage.

3. Stop the search heads

Before enabling pooling, you must stop splunkd. Do this for each search head in the pool.

4. Enable pooling on each search head

You use the pooling enable CLI command to enable pooling on a search head. The command sets certain values in server.conf. It also creates subdirectories within the shared storage location and validates that Splunk can create and move files within them. Here's the command syntax:
splunk pooling enable <path_to_shared_storage> [--debug]

Note: On NFS, <path_to_shared_storage> should be the NFS share's mountpoint. On Windows, <path_to_shared_storage> should be the UNC path of the CIFS/SMB share.

The --debug parameter causes the command to log additional information to btool.log.

Execute this command on each search head in the pool. The command sets values in the [pooling] stanza of the server.conf file in $SPLUNK_HOME/etc/system/local. For detailed information, see the server.conf specification.

5. Copy user and app directories to the shared storage location

Copy the contents of the $SPLUNK_HOME/etc/apps and $SPLUNK_HOME/etc/users directories on an existing search head into the empty /etc/apps and /etc/users directories in the shared storage location. Those directories were created in step 4 and reside under the <path_to_shared_storage> that you specified at that time.

For example, if your NFS mount is at /tmp/nfs, copy the apps subdirectories that match this pattern:

$SPLUNK_HOME/etc/apps/*

into

/tmp/nfs/etc/apps

This results in a set of subdirectories like:

/tmp/nfs/etc/apps/search /tmp/nfs/etc/apps/launcher /tmp/nfs/etc/apps/unix [...]

Similarly, copy the user subdirectories:

$SPLUNK_HOME/etc/users/*

into

/tmp/nfs/etc/users

Important: You can choose to copy over just a subset of apps and user subdirectories; however, be sure to move them to the precise locations described above.

6. Restart the search heads

After running the pooling enable command, restart splunkd. Do this for each search head in the pool.

Use a load balancer


You will probably want to run a load balancer in front of your search heads. That way, users can access the pool of search heads through a single interface, without needing to specify a particular one.

Another reason for using a load balancer is to ensure access to search artifacts and results if one of the search heads goes down. Ordinarily, RSS and email alerts provide links to the search head where the search originated. If that search head goes down (and there's no load balancer), the artifacts and results become inaccessible. However, if you've got a load balancer in front, you can set the alerts so that they reference the load balancer instead of a particular search head.

Configure the load balancer

There are a couple of issues to note when selecting and configuring the load balancer:

The load balancer must employ layer-7 (application-level) processing.
Configure the load balancer so that user sessions are "sticky" or "persistent". This ensures that the user remains on a single search head throughout their session.

Generate alert links to the load balancer

To generate alert links to the load balancer, you must edit alert_actions.conf:

1. Copy alert_actions.conf from a search head to the appropriate app directory in the shared storage location. In most cases, this will be /<path_to_shared_storage>/etc/apps/search/local.

2. Edit the hostname attribute to point to the load balancer:

hostname = <proxy host>:<port>

For details, see alert_actions.conf in the Admin manual. The alert links should now point to the load balancer, not the individual search heads.

Other pooling operations


Besides the pooling enable CLI command, there are several other commands that are important for managing search head pooling:

pooling validate
pooling disable
pooling display

You must stop splunkd before running pooling enable or pooling disable. However, you can run pooling validate and pooling display while splunkd is either stopped or running.

Validate that each search head has access to shared resources

The pooling enable command validates search head access when you initially set up search head pooling. If you ever need to revalidate the search head's access to shared resources (for example, if you change the NFS configuration), you can run the pooling validate CLI command:
splunk pooling validate [--debug]

Disable search head pooling

You can disable search head pooling with this CLI command:
splunk pooling disable [--debug]

Run this command for each search head that you need to disable.

Important: Before running the pooling disable command, you must stop splunkd. After running the command, you should restart splunkd.

Display pooling status

You can use the pooling display CLI command to determine whether pooling is enabled on a search head:
splunk pooling display

This example shows how the system response varies depending on whether pooling is enabled:

$ splunk pooling enable /foo/bar
$ splunk pooling display
Search head pooling is enabled with shared storage at: /foo/bar
$ splunk pooling disable
$ splunk pooling display
Search head pooling is disabled


Manage configuration changes


Important: Once pooling is enabled on a search head, you must notify the search head whenever you directly edit a configuration file. Specifically, if you add a stanza to any configuration file in a local directory, you must run the following command:

splunk btool fix-dangling

Note: This is not necessary if you make changes by means of Splunk Web Manager or the CLI.

Deployment server and search head pooling


With search head pooling, all search heads access a single set of configurations, so you don't need to use a deployment server or a third party deployment management tool like Puppet to push updates to multiple search heads. However, you might still want to use a deployment tool with search head pooling, in order to consolidate configuration operations across all Splunk instances. If you want to use the deployment server to manage your search head configuration, note the following:

Designate one of the search heads as a deployment client by creating a deploymentclient.conf file in $SPLUNK_HOME/etc/system/local. You only need to designate one search head as a deployment client.

In serverclass.conf on the deployment server, define a server class for the search head. Set its repositoryLocation attribute to the shared storage mountpoint on the search head. You can also specify the value in deploymentclient.conf on the search head, but in either case, the value must point to the shared storage mountpoint.

For detailed information on the deployment server, see "About deployment server" and the topics that follow it.
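As a rough sketch of this arrangement (all host names and paths are hypothetical), the designated search head's deploymentclient.conf might look like this:

[deployment-client]

[target-broker:deploymentServer]
targetUri = deploymentserver.example.com:8089

and the corresponding server class in serverclass.conf on the deployment server might point at the shared storage mountpoint, consistent with the note above:

[serverClass:searchheadpool]
repositoryLocation = /mnt/search-head-pooling/etc/apps
whitelist.0 = searchhead01.example.com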

Select timing for configuration refresh


You might want to configure search head pooling to synchronize from the storage location less frequently than the default. Particularly in deployments with large numbers of users (in the hundreds or thousands), Splunk can spend excessive time reading configuration changes from the pool.

In the server.conf file (typically $SPLUNK_HOME/etc/system/local/server.conf), the following settings affect configuration refresh timing:

# current defaults
[pooling]
poll.interval.rebuild = 2s
poll.interval.check = 5s

With these default values, a change made on one search head would become available on another search head at most seven seconds later. There is usually no need for updates to be propagated that quickly. By changing the settings to use values in the scale of minutes, the load on the shared storage system is greatly reduced.
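For example, a minute-scale configuration might look like the following. The values are illustrative; choose intervals appropriate for your environment:

[pooling]
poll.interval.rebuild = 1m
poll.interval.check = 1m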

Answers
Have questions? Visit Splunk Answers and see what questions and answers the Splunk community has about search head pooling.

How authorization works in distributed searches


The authorization settings that a search peer uses when processing distributed searches are different from those which it uses for its local activities, such as administration and local search requests:

When processing a distributed search, the search peer uses the settings contained in the knowledge bundle that the search head distributes to all the search peers when it sends them a search request. These settings are created and managed on the search head.

When performing local activities, the search peer uses the authorization settings created and stored locally on the search peer itself.

When managing distributed searches, it is therefore important that you distinguish between these two types of authorization. You need to be particularly aware of how authorization settings get distributed through the knowledge bundle when you're managing a system with search head pooling or mounted bundles.

For background information on key concepts, read these topics:

Splunk authorization: "About role-based user access" in the Securing Splunk manual
Mounted bundles: "Mount the knowledge bundle" in this chapter
Search head pooling: "Configure search head pooling" in this chapter

Manage authorization for distributed searches


All authorization settings are stored in one or more authorize.conf files. This includes settings configured through Splunk Manager or the CLI. It is these authorize.conf files that get distributed from the search head to the search peers. On the knowledge bundle, the files are usually located in either /etc/system/{local,default} and/or /etc/apps/<app-name>/{local,default}.

Since search peers automatically use the settings in the knowledge bundle, things normally work fine. You configure roles for your users on the search head, and the search head automatically distributes those configurations to the search peers when it distributes the search itself.

With search head pooling, however, you must take care to ensure that the search heads and the search peers all use the same set of authorize.conf file(s). For this to happen, you must make sure:

All search heads in the pool use the same set of authorize.conf files.

The set of authorize.conf files that the search heads use goes into the knowledge bundle so that they get distributed to the search peers.

This topic describes the four main scenarios, based on whether or not you're using search head pooling or mounted bundles. It describes the scenarios in order from simple to complex.

Four scenarios
What you need to do with the distributed search authorize.conf files depends on whether your deployment implements search head pooling or mounted bundles. The four scenarios are:

No search head pooling, no mounted bundles
No search head pooling, mounted bundles
Search head pooling, no mounted bundles
Search head pooling, mounted bundles


The first two scenarios "just work" but the last two scenarios require careful planning. For the sake of completeness, this section describes all four scenarios.

Note: These scenarios address authorization settings for distributed search only. Local authorization settings function the same independent of your distributed search deployment.

No search head pooling, no mounted bundles

Whatever authorization settings you have on the search head get automatically distributed to its search peers as part of the replicated knowledge bundle that they receive with distributed search requests.

No search head pooling, mounted bundles

Whatever authorization settings you have on the search head get automatically placed in the mounted bundle and used by the search peers during distributed search processing.

Search head pooling, no mounted bundles

The search heads in the pool share their /apps and /users directories but not their /etc/system/local directories. Any authorize.conf file in an /apps subdirectory will be automatically shared by all search heads and included in the knowledge bundle when any of the search heads distributes a search request to the search peers.

The problem arises because authorization changes can also wind up in an authorize.conf file in a search head's /etc/system/local directory (for example, if you update the search head's authorization settings via Manager). This directory does not get shared among the search heads in the pool, but it still gets distributed to the search peers as part of the knowledge bundle. Unless you account for this situation, the search peers can end up using different authorization settings for different searches, depending on which search head distributed the search to them. For most situations, this is likely not what you want to occur.

Therefore, you need to make sure that any changes made to a search head's /etc/system/local/authorize.conf file get propagated to all search heads in the pool. One way to handle this is to move any changed /etc/system/local/authorize.conf file into an app subdirectory, since all search heads in the pool share the /apps directory.


Search head pooling, mounted bundles

This is similar to the previous scenario. The search heads in the pool share their /apps and /users directories but not their /etc/system/local directories. Any authorize.conf file in an /apps subdirectory will be automatically shared by all search heads. It will also be included in the mounted bundle that the search peers use when processing a search request from any of the search heads.

However, authorization changes can also wind up in an authorize.conf file in a search head's /etc/system/local directory (for example, if you update the search head's authorization settings via Manager). This directory does not get automatically shared among the search heads in the pool. It also does not get automatically distributed to the mounted bundle that the search peers use. Therefore, you must provide some mechanism that ensures that all the search heads and all the search peers have access to that version of authorize.conf.

The simplest way to handle this is to move any changed /etc/system/local/authorize.conf file into an app subdirectory, since both the pooled search heads and all the search peers share the /apps directory.
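For example, on one of the pooled search heads you might move the file into the search app's local directory on the shared storage. The mountpoint shown here is hypothetical:

mv $SPLUNK_HOME/etc/system/local/authorize.conf /mnt/search-head-pooling/etc/apps/search/local/authorize.conf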

Use distributed search


From the user standpoint, specifying and running a distributed search is essentially the same as running any other search. Behind the scenes, the search head distributes the query to its search peers and consolidates the results when presenting them to the user. Users can limit the search peers that participate in a search. They might also need some awareness of the distributed search configuration when troubleshooting.

Perform distributed searches


In general, you specify a distributed search through the same set of commands as for a local search. However, Splunk provides several additional commands and options to assist with controlling and limiting a distributed search. A search head by default runs its searches across all search peers in its cluster. You can limit a search to one or more search peers by specifying the splunk_server field in your query. See "Search across one or more distributed servers" in the User manual.


The search command localop is also of use in defining distributed searches. It enables you to limit the execution of subsequent commands to the search head. See the description of localop in the Search Reference for details and an example. In addition, the lookup command provides a local argument for use with distributed searches. If set to true, the lookup occurs only on the search head; if false, the lookup occurs on the search peers as well. This is particularly useful for scripted lookups, which replicate lookup tables. See the description of lookup in the Search Reference for details and an example.
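For example, a search like the following forces the lookup to run only on the search head. The lookup table and field names are hypothetical:

sourcetype=access_combined | lookup local=true usertogroup user OUTPUT group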

Troubleshoot distributed search


This topic describes issues to be aware of when configuring or using distributed search.

General configuration issues


Clock skew between search heads and search peers can affect search behavior

It's important to keep the clocks on your search heads and search peers in sync, via NTP (network time protocol) or some similar means. If the clocks are out-of-sync by more than a few seconds, you can end up with search failures or premature expiration of search artifacts.

Search head pooling configuration issues


When implementing search head pooling, there are a few potential issues you should be aware of, mainly having to do with coordination among search heads.

Authentication and authorization changes made in Manager apply only to a single search head

Authentication and authorization changes made through a search head's Manager apply only to that search head and not to other search heads in that pool. Each member of the pool maintains its local configurations in $SPLUNK_HOME/etc/system/local. To share configurations across the pool, set them up in shared storage, as described in "Configure search head pooling".


Clock skew between search heads and shared storage can affect search behavior

It's important to keep the clocks on your search heads and shared storage server in sync, via NTP (network time protocol) or some similar means. If the clocks are out-of-sync by more than a few seconds, you can end up with search failures or premature expiration of search artifacts.

Permission problems on the shared storage server can cause pooling failure

On each search head, the user account Splunk runs as must have read/write permissions to the files on the shared storage server.

NFS client concurrency limits can cause search timeouts or slow search behavior

The search performance in a search head pool is a function of the throughput of the shared storage and the search workload. The combined effect of concurrent search users and concurrent scheduled searches running will yield a total IOPS that the shared volume needs to support. IOPS requirements will also vary by the kind of searches run. To adequately provision a device to be shared between search heads, you need to know the number of concurrent users submitting searches and the number of jobs/apps that will be executed simultaneously.

If searches are timing out or running slowly, you might be exhausting the maximum number of concurrent requests supported by the NFS client. To solve this problem, increase your client concurrency limit. For example, on a Linux NFS client, adjust the tcp_slot_table_entries setting, as in the example command following this section.

NFS latency for large user count can incur Splunk configuration access latency or slow dispatch reaping

Splunk synchronizes the search head pool storage configuration state with the in-memory state when it detects changes. Essentially, it reads the configuration into memory when it detects updates. When dealing either with overloaded search pool storage or with large numbers of users, apps, and configuration files, this synchronization process can reduce performance. To mitigate this, the minimum frequency of reading can be increased, as discussed in "Select timing for configuration refresh".
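For example, on many Linux distributions the NFS client concurrency limit mentioned above is exposed as the sunrpc tcp_slot_table_entries setting, which you might raise with a command like the one below. The value is illustrative; check your distribution's documentation before changing it:

sysctl -w sunrpc.tcp_slot_table_entries=128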


Warning about unique serverName attribute

Each search head in the pool must have a unique serverName attribute. Splunk validates this condition when each search head starts. If it finds a problem, it generates this error message:

serverName "<xxx>" has already been claimed by a member of this search head pool in <full path to pooling.ini on shared storage>
There was an error validating your search head pooling configuration. For more information, run 'splunk pooling validate'

The most common cause of this error is that another search head in the pool is already using the current search head's serverName. To fix the problem, change the current search head's serverName attribute in $SPLUNK_HOME/etc/system/local/server.conf.

There are a few other conditions that also can generate this error:

The current search head's serverName has been changed.
The current search head's GUID has been changed. This is usually due to /etc/instance.cfg being deleted.

To fix these problems, run

splunk pooling replace-member

This updates the pooling.ini file with the current search head's serverName->GUID mapping, overwriting any previous mapping.

Artifacts and incorrectly-displayed items in Manager UI after upgrade

When upgrading pooled search heads, you must copy all updated apps - even those that ship with Splunk (such as the Search app and the data preview feature, which is implemented as an app) - to the search head pool's shared storage after the upgrade is complete. If you do not, you might see artifacts or other incorrectly-displayed items in Manager.

To fix the problem, copy all updated apps from an upgraded search head to the shared storage for the search head pool, taking care to exclude the local sub-directory of each app.


Important: Excluding the local sub-directory of each app from the copy process prevents the overwriting of configuration files on the shared storage with local copies of configuration files. Once the apps have been copied, restart Splunk on all search heads in the pool.
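For example, you might copy an upgraded app to the shared storage with rsync, excluding its local sub-directory. The app name and mountpoint are illustrative:

rsync -av --exclude='local' $SPLUNK_HOME/etc/apps/search /mnt/search-head-pooling/etc/apps/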

Distributed search error messages


This table lists some of the more common search-time error messages associated with distributed search:

Error message                Meaning
status=down                  The specified remote peer is not available.
status=not a splunk server   The specified remote peer is not a Splunk server.
duplicate license            The specified remote peer is using a duplicate license.
certificate mismatch         Authentication with the specified remote peer failed.


Monitor your deployment


About Splunk Deployment Monitor App
In pre-5.0 versions of Splunk, the Splunk Deployment Monitor App was included as part of the core distribution. This is no longer the case. You now need to download it from Splunkbase. However, if you are migrating from a pre-5.0 version of Splunk with an enabled Deployment Monitor App, the app will remain on your system. In addition, documentation for the Deployment Monitor App is no longer located here in the Distributed Deployment Manual. Instead, to learn about the latest version of the app, read the Deploy and Use Splunk Deployment Monitor App manual.


Deploy configuration updates and apps across your environment


About deployment server
Important: The deployment server handles configuration and content updates to existing Splunk installations. You cannot use it to install or upgrade Splunk software components. To learn how to install and deploy Splunk, see "Step-by-step installation procedures" for full Splunk and "Universal forwarder deployment overview" for the Splunk universal forwarder. To learn how to upgrade your deployment to a new version of Splunk, see "Upgrade your deployment".

The deployment server is Splunk's tool for pushing out configurations, apps, and content updates to distributed Splunk instances. You can use it to push updates to any Splunk component: forwarder, indexer, or search head.

A key use case is to manage configuration for groups of forwarders. For example, if you have several sets of forwarders, each set residing on a different machine type, you can use the deployment server to push out different content according to machine type. Similarly, in a distributed search environment, you can use a deployment server to push out content to sets of indexers.

Important: Do not use deployment server to manage configuration files across nodes in a cluster. Instead, use the configuration bundle method discussed in "Update common peer configurations" in the Managing Indexers and Clusters manual.

The first several topics in this section explain how to configure a deployment server and its clients. Topics then follow that show how to employ this technology for specific use cases.

The big picture (in words and diagram)


In a Splunk deployment, you use a deployment server to push out content and configurations (collectively called deployment apps) to deployment clients, grouped into server classes.

A deployment server is a Splunk instance that acts as a centralized configuration manager, collectively managing any number of Splunk instances, called "deployment clients". Any full, enterprise Splunk instance -- even one indexing data locally -- can act as a deployment server.

A deployment client is a Splunk instance remotely configured by a deployment server. A Splunk instance can be both a deployment server and client at the same time. Each deployment client belongs to one or more server classes.

A server class is a set of deployment clients, grouped by configuration characteristics, managed as a unit. You can group clients by application, OS, type of data, or any other feature of your Splunk deployment. To update the configuration for a set of clients, the deployment server pushes configuration files to all or some members of a server class. Besides configuration files, you can push any sort of content. You configure server classes on the deployment server.

This diagram provides a conceptual overview of the relationship between a deployment server and its set of deployment clients and server classes:

In this example, each deployment client is a Splunk forwarder that belongs to two server classes, one for its OS and the other for its geographical location. The deployment server maintains the list of server classes and uses those server classes to determine what content to push to each client. For an example of how to implement this type of arrangement to govern the flow of content to clients, see "Deploy several forwarders".

A deployment app is a set of deployment content (including configuration files) deployed as a unit to clients of a server class. A deployment app might consist of just a single configuration file, or it can consist of many files. Depending on filtering criteria, an app might get deployed to all clients in a server class or to a subset of clients. Over time, an app can be updated with new content and then redeployed to its designated clients. The deployment app can be an existing Splunk app, or one developed solely to group some content for deployment purposes.

Note: The term "app" has a somewhat different meaning in the context of the deployment server from its meaning in the general Splunk context. For more information on Splunk apps in general, see "What are apps and add-ons?".

For more information on deployment servers, server classes, and deployment apps, see "Define server classes". For more information on deployment clients, see "Configure deployment clients".

A multi-tenant environment means that you have more than one deployment server running on the same Splunk instance, with each deployment server serving content to its own set of deployment clients. For information about multi-tenant environments, see "Deploy in multi-tenant environments".

Key terms
Here's a recap of the key definitions:

deployment server: A Splunk instance that acts as a centralized configuration manager. It pushes configuration updates to other Splunk instances.

deployment client: A remotely configured Splunk instance. It receives updates from the deployment server.

server class: A deployment configuration category shared by a group of deployment clients. A deployment client can belong to multiple server classes.

deployment app: A unit of content deployed to one or more members of a server class or classes.

multi-tenant environment: A deployment environment involving multiple deployment servers.

Communication between deployment server and clients


Each deployment client periodically polls the deployment server, identifying itself. The deployment server then determines whether it has new or updated content to push to that particular client. If there is content, the deployment server tells the client, which then retrieves the content and treats it according to the instructions for the server class it belongs to. Depending on those instructions, the client might restart, run a script, or wait for further instructions.
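As an illustration of the polling relationship, here is a minimal deploymentclient.conf sketch that lengthens the check-in interval; the attributes are documented in "Configure deployment clients" later in this manual, and the server host name is a placeholder for your own deployment server:

[deployment-client]
# poll the deployment server every 300 seconds instead of the default 60
phoneHomeIntervalInSecs = 300

[target-broker:deploymentServer]
targetUri = deploymentserver.example.com:8089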


Lookup tables and deployment server


In some cases, your indexers or search heads might be running apps that save information in lookup tables. Be careful about using the deployment server to manage such instances. When the deployment server pushes an updated app configuration, it overwrites the existing app. At that point, you'll lose those lookup tables.

Plan a deployment
If you've got Splunk instances serving a variety of different groups within your organization, chances are their configurations vary depending on who uses them and for what purpose. You might have some Splunk instances serving the help desk team, configured with a specific app to accelerate troubleshooting of Windows desktop issues. You might have another group of Splunk instances in use by your operations staff, set up with a few different apps designed to emphasize tracking of network issues, security incidents, and email traffic management. A third group of Splunk instances might serve the Web hosting group within the operations team.

Rather than trying to manage and maintain these divergent Splunk instances one at a time, you can group them based on their use, identify the configurations and apps needed by each group, and then use the deployment server to update their apps and configurations as needed.

In addition to grouping Splunk instances by use, there are other useful types of groupings you can specify. For example, you might group Splunk instances by OS or hardware type, by version, or by geographical location or timezone.

Configuration overview
For the great majority of deployment server configurations, perform these steps:

1. Designate one of your Splunk instances as the deployment server.

Note: While in small environments (fewer than 30 deployment clients) it may be perfectly viable to provide the deployment server service from an indexer or search head node, Splunk strongly recommends putting the deployment server on its own Splunk instance when using it with larger numbers of clients. Also consider the need to restart the deployment server when making certain configuration changes, which can affect user searches if it shares a system with a search head. For additional information about deployment server sizing, refer to the topic about the deployment server on the Splunk Community Wiki.

2. Group the deployment clients into server classes. A server class defines the clients that belong to it and what content gets pushed out to them. Each deployment client can belong to multiple server classes.

3. Create a serverclass.conf file on the deployment server. It specifies the server classes and the location of the deployment apps. Refer to "Define server classes" in this manual for details.

Note: You can also add server classes and perform simple configuration through Splunk Manager, as described in "Define server classes".

4. Create the directories for your deployment apps, and put the content to be deployed into those directories. Refer to "Deploy apps and configurations" in this manual for details.

5. On each deployment client, create a deploymentclient.conf file. It specifies which deployment server the client should communicate with, the specific location on that server from which it should pick up content, and where it should put the content locally. Refer to "Configure deployment clients" in this manual for details.

6. For more complex deployments with multiple deployment servers, create a tenants.conf file on one of the deployment servers. This allows you to define multiple deployment servers on a single Splunk instance and redirect incoming client requests to a specific server according to rules you specify. Most deployment server topologies don't need tenants.conf. Refer to "Deploy in multi-tenant environments" in this manual for more information about configuring tenants.conf.

For an example of an end-to-end configuration, see "Deploy several forwarders".

Note: The deployment server and its deployment clients must agree in the SSL setting for their splunkd management ports: they must all have SSL enabled, or they must all have SSL disabled. To configure SSL on a Splunk instance, set the enableSplunkdSSL attribute in server.conf to "true" or "false". For detailed information on using SSL with deployment server, see "Securing deployment server and clients".
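For illustration, a minimal server.conf sketch for enabling SSL on the splunkd management port; this assumes the attribute lives in the standard [sslConfig] stanza (verify against the server.conf specification file in the Admin manual), and the same value must be applied on the deployment server and on every client:

# $SPLUNK_HOME/etc/system/local/server.conf -- sketch only
[sslConfig]
enableSplunkdSSL = true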


Restart or reload?
The first time you configure the deployment server and its clients, you'll need to restart all instances of Splunk. When you restart the deployment server, it automatically deploys any new content to its clients. Later on, to deploy new or updated content without restarting, you can use the CLI reload command, as described in "Deploy apps and configurations" in this manual.

Enable and disable deployment server using the CLI


To enable a deployment server, run the following command from the Splunk CLI:
./splunk enable deploy-server

Now restart the deployment server to make the change take effect. To disable a deployment server, run the following command:
./splunk disable deploy-server

Configure deployment clients


This topic explains how to set up deployment clients to receive content from a deployment server. Each deployment client belongs to one or more server classes. Server classes define what content a deployment client should download, based on qualifying criteria.

Two configuration files control this process:

The deployment server has a serverclass.conf file, which specifies its set of server classes. See "Define server classes" for details on how to configure this file.

Each deployment client has a deploymentclient.conf file, which specifies which deployment server it contacts for content, along with some associated information. The current topic describes how to configure this file.

You can view deployment client settings in Splunk Manager. Go to Splunk Manager > Deployment and select the Deployment client row.


For information about configuration files, including an explanation of their basic structure, see About configuration files in the Admin manual. The next section provides a reference for the deployment client configuration settings. You might want to read it while referring to the set of simple example configurations presented later in this topic. In addition, there are several longer and more complete examples in later topics in this manual, including "Deploy several forwarders". Important: Although a deployment server can be a deployment client of another deployment server, it cannot be a deployment client of itself.

Enable a deployment client


To enable a Splunk instance as a deployment client, create a deploymentclient.conf file in $SPLUNK_HOME/etc/system/local. You can also enable a deployment client through the CLI. See "Enable and disable deployment clients using the CLI", later in this topic.

The deploymentclient.conf file provides two stanzas:

[deployment-client]: Includes a number of configuration attributes, including where to find new or updated content. The default values for this stanza usually work fine.

[target-broker:deploymentServer]: Specifies the location of this client's deployment server. deploymentServer is the default name for a deployment server. This is the most important configuration in this file.

These are the main attributes available in the [deployment-client] stanza:

disabled: Set to true or false. If true, it disables the deployment client. Default: false.

clientName: A name the deployment server can use to filter on. It takes precedence over hostnames. Default: deploymentClient.

workingDir: A temporary folder used by the deployment client to download server classes and applications. Default: $SPLUNK_HOME/var/run/dep...

repositoryLocation: The repository location where deployment apps are installed after being downloaded from a deployment server. Apps must be installed in the default location ($SPLUNK_HOME/etc/apps) or Splunk will be unable to recognize them. Settings defined in serverclass.conf on the deployment server can also specify a repository location for a deployment client. The deployment client uses its serverRepositoryLocationPolicy attribute to determine which setting to use. Default: $SPLUNK_HOME/etc/apps.

serverRepositoryLocationPolicy: Set to one of these values: acceptSplunkHome - accept the repository location supplied by the deployment server, if and only if it is rooted by $SPLUNK_HOME; acceptAlways - always accept the repository location supplied by the deployment server; rejectAlways - always reject the server-supplied repository location and instead use the repositoryLocation specified in this configuration file. Default: acceptSplunkHome.

endpoint: The HTTP endpoint from which content should be downloaded. Note: The deployment server can specify a different endpoint from which to download each set of content (individual apps, etc). The deployment client uses the serverEndpointPolicy attribute to determine which value to use. $deploymentServerUri$ resolves to targetUri, defined in the [target-broker] stanza. $serviceClassName$ and $appName$ mean what they say. Default: $deploymentServerUri$/se...

serverEndpointPolicy: Set to one of these values: acceptAlways - always accept the endpoint supplied by the deployment server; rejectAlways - always reject the endpoint supplied by the deployment server and instead use the endpoint defined by the endpoint attribute. Default: acceptAlways.

phoneHomeIntervalInSecs: A number that determines how frequently the deployment client should check for new content. Default: 60.

In the [target-broker:deploymentServer] stanza, you specify the deployment server for this client:

targetUri: Set to <Deployment_server_URI>:<Mgmt_port>. Specifies the deployment server connection information. The management port is typically 8089. Default: n/a.

Note: For a complete list of attribute values, see the deploymentclient.conf specification file in the Admin manual.


Examples
Here are several examples of defining deployment clients through the deploymentclient.conf file:

# Example 1
# Deployment client receives apps, placing them into the same repositoryLocation
# locally, relative to $SPLUNK_HOME, that it picked them up from - typically
# $SPLUNK_HOME/etc/apps.
# There is nothing in [deployment-client], because the deployment client is not
# overriding the location value set on the deployment server side.
[deployment-client]

[target-broker:deploymentServer] targetUri= deploymentserver.splunk.mycompany.com:8089

# Example 2
# Deployment server keeps apps to be deployed in a non-standard location on the
# server side (perhaps for organizational purposes).
# Deployment client receives apps and places them in the standard location.
# This configuration rejects any location specified by the deployment server and
# replaces it with the standard client-side location.
[deployment-client]
serverRepositoryLocationPolicy = rejectAlways
repositoryLocation = $SPLUNK_HOME/etc/apps

[target-broker:deploymentServer]
targetUri= deploymentserver.splunk.mycompany.com:8089

# Example 3
# Deployment client should get apps from an HTTP server that is different from
# the one specified by the deployment server.
[deployment-client]
serverEndpointPolicy = rejectAlways
endpoint = http://apache.mycompany.server:8080/$serviceClassName$/$appName$.tar

[target-broker:deploymentServer]
targetUri= deploymentserver.splunk.mycompany.com:8089

# Example 4
# Deployment client should get apps from a location on the file system and not
# from a location specified by the deployment server.
[deployment-client]
serverEndpointPolicy = rejectAlways
endpoint = file:/<some_mount_point>/$serviceClassName$/$appName$.tar

[target-broker:deploymentServer]
targetUri= deploymentserver.splunk.mycompany.com:8089

Enable and disable deployment clients using the CLI


To enable a Splunk instance as a deployment client, run this command on the client:
splunk set deploy-poll <IP address/hostname>:<port>

You specify the IP address/hostname and management port of the deployment server you want the client to connect with. You must restart the deployment client for the change to take effect. To disable a deployment client, run this command on the deployment client:
splunk disable deploy-client

Define server classes


A server class defines a deployment configuration shared by a group of deployment clients. It defines both the criteria for being a member of the class and the set of content to deploy to members of the class. This content (encapsulated as "deployment apps") can consist of Splunk apps, Splunk configurations, and other related content, such as scripts, images, and supporting material.

You can define different server classes to reflect the different requirements, OSes, machine types, or functions of your deployment clients.

You define server classes in serverclass.conf on the deployment server. You can also define server classes and perform basic configuration through Splunk Manager. To perform advanced configuration tasks, however, you'll need to edit serverclass.conf.

Use serverclass.conf
You can define server classes in serverclass.conf on the deployment server. Create one in $SPLUNK_HOME/etc/system/local. For information about configuration files, including an explanation of their basic structure, see "About configuration files" in the Admin manual.

If you have multiple server classes, you might want to define one server class that applies to all deployment clients by default. You can then override various aspects of it as needed by defining more specific server classes. For example, if you have a mix of Windows and Linux universal forwarders sending data to the same indexer, you might want to make sure that all forwarders get a common outputs.conf file, but that Windows forwarders get one inputs.conf file while Linux forwarders get a different one. In that case, you could define an "all forwarder" server class that distributes a deployment app containing the outputs.conf file to all forwarders, while also defining Windows and Linux server classes that distribute separate apps containing the different inputs.conf files to the appropriate subsets of forwarders. (A sketch of this layered arrangement follows the examples later in this topic.)

In addition to defining attributes and content for specific server classes, you can also define attributes that pertain just to a single app within a server class.

A deployment client has its own configuration, defined in deploymentclient.conf. The information in deploymentclient.conf tells the deployment client where to go to get the content that the server class it belongs to says it should have.

The next section provides a reference for the server class configuration settings. You might want to read it while referring to the set of simple example configurations presented later in this topic. In addition, there are several longer and more complete examples presented later in this manual, including "Deploy several forwarders".

What you can configure for a server class

You can specify settings for a global server class, as well as for individual server classes or apps within server classes. There are three levels of stanzas to enable this:

[global]: The global server class. Attributes defined here pertain to all server classes.

[serverClass:<serverClassName>]: An individual server class. A serverClass is a collection of apps. Attributes defined here pertain to just the server class <serverClassName>. Note: <serverClassName> cannot contain spaces.

[serverClass:<serverClassName>:app:<appName>]: An app within a server class. Attributes defined here pertain to just the specified deployment app <appName> within the specified <serverClassName>. To indicate all apps within <serverClassName>, <appName> can be the wildcard character: *, in which case it will cause all content in the repositoryLocation to be added to this serverClass.

Important: When defining app names, you should be aware of the rules of configuration file precedence, as described in the topic "Configuration file precedence" in the Admin manual. In particular, note that app directories are evaluated by ASCII sort order. For example, if you set an attribute/value pair whatever=1 in the file x.conf in an app directory named "A", the setting in app A overrides the setting whatever=0 in x.conf in an app named "B", etc. For details, see the subtopic "How app names affect precedence".

Attributes in more specific stanzas override less specific stanzas. Therefore, an attribute defined in a [serverClass:<serverClassName>] stanza will override the same attribute defined in [global]. The attributes are definable for each stanza level, unless otherwise indicated. Here are the most common ones:

repositoryLocation: The location on the deployment server where the content to be deployed for this server class is stored. Default: $SPLUNK_HOME/etc/deployment-apps.

targetRepositoryLocation: The location on the deployment client where the content to be deployed for this server class should be installed. You can override this in deploymentclient.conf on the deployment client. Default: $SPLUNK_HOME/etc/apps.

continueMatching: If set to false, the deployment server will look through the list of server classes in this configuration file and stop when it matches the first one to a client. If set to true, the deployment server will continue to look and match. This option is available because you can define multiple, layered sets of server classes. A serverClass can override this property and stop the matching. Default: true.

endpoint: The HTTP location from which content can be downloaded by a deployment client. The deployment server fills in the variable substitutions itself, based on information received from the client. You can provide any URI here, as long as it uses the same variables. In most cases, this attribute does not need to be specified. Default: $deploymentServerUri$/services/stre...

filterType: Set to "whitelist" or "blacklist". This determines the order of execution of filters. If filterType is whitelist, all whitelist filters are applied first, followed by blacklist filters. If filterType is blacklist, all blacklist filters are applied first, followed by whitelist filters. Default: whitelist.

The whitelist setting indicates a filtering strategy that pulls in a subset: Items are not considered to match the stanza by default. Items that match any whitelist entry, and do not match any blacklist entry, are considered to match the stanza. Items that match any blacklist entry are not considered to match the stanza, regardless of whitelist.

The blacklist setting indicates a filtering strategy that rules out a subset: Items are considered to match the stanza by default. Items that match any blacklist entry, and do not match any whitelist entry, are considered to not match the stanza. Items that match any whitelist entry are considered to match the stanza.

More briefly:

whitelist: default no-match -> whitelists enable -> blacklists disable
blacklist: default match -> blacklists disable -> whitelists enable

You can override this value at the serverClass and serverClass:app levels. If you specify whitelist at the global level, and then specify blacklist for an individual server class, the setting becomes blacklist for that server class, and you have to provide another filter in that server class definition to replace the one you overrode.

whitelist.<n>, blacklist.<n>: <n> is a number starting at 0, and incrementing by 1. Default: n/a.

Set the attribute to ipAddress, hostname, or clientName:

ipAddress is the IP address of the deployment client. Can use wildcards, such as 10.1.1.*

hostname is the host name of the deployment client. Can use wildcards, such as *.splunk.com.

clientName is a logical or tag name that can be assigned to a deployment client in deploymentclient.conf. clientName takes precedence over ipAddress or hostname when matching a client to a filter.

Here are some examples:

When filterType is whitelist:

whitelist.0=*.fflanda.com
blacklist.0=printer.fflanda.com
blacklist.1=scanner.fflanda.com

This will cause all hosts in fflanda.com, except 'printer' and 'scanner', to match this server class.

When filterType is blacklist:

blacklist.0=*
whitelist.0=*.web.fflanda.com
whitelist.1=*.linux.fflanda.com

This will cause only the 'web' and 'linux' hosts to match the server class. No other hosts will match.

You can override this value at the serverClass and serverClass:app levels. Important: Overriding one type of filter (whitelist/blacklist) causes the other to be overridden too. If, for example, you override the whitelist, the blacklist will not be inherited from the parent; you must provide one in the stanza.

stateOnClient: Set to "enabled", "disabled", or "noop". This setting specifies whether the deployment client receiving an app should enable or disable the app once it is installed. The "noop" value is for apps that do not require enablement; for example, apps containing only Splunk knowledge, such as event or source types. Default: enabled.

machineTypesFilter: Matches any of the machine types in a comma-separated list. Default: n/a.

This setting lets you use the hardware type of the deployment client as a filter. This filter will be used only if a client matches the whitelist/blacklist filters. The value for machineTypesFilter is a comma-separated list of machine types; for example, linux-i686, linux-x86_64, etc. Each machine type is a specific string designated by the hardware platform itself.

Note: A machineTypesFilter value can contain wildcards; for example: linux-*, windows-*, or aix-*.

The method for finding this string on the client varies by platform, but if the deployment client is already connected to the deployment server, you can determine the string's value by using this Splunk CLI command on the deployment server:

./splunk list deploy-clients

This will return a value for utsname that you can use to specify machineTypesFilter. This setting will match any of the machine types in a comma-delimited list. Commonly-used machine types are linux-x86_64, windows-x64, linux-i686, freebsd-i386, darwin-i386, sunos-sun4u, sunos-i86pc, freebsd-amd64.

restartSplunkWeb: Set to "true" or "false". Determines whether the client's Splunk Web restarts after the installation of a server class or app. Default: false.

restartSplunkd: Set to "true" or "false". Determines whether the client's splunkd restarts after the installation of a server class or app. Default: false.

Note: The most accurate and up-to-date list of settings available for a given configuration file is in the .spec file for that configuration file. You can find the latest version of the .spec and .example files for serverclass.conf in the Configuration file reference in the Admin manual, or in $SPLUNK_HOME/etc/system/README.

Examples

Here are several examples of defining server classes in the serverclass.conf file:

# Example 1
# Matches all clients and includes all apps in the server class
[global]
whitelist.0=*
# whitelist matches all clients.
[serverClass:AllApps]
[serverClass:AllApps:app:*]
# a server class that encapsulates all apps in the repositoryLocation

# Example 2
# Assign server classes based on hostnames.
[global]
[serverClass:AppsForOps]
whitelist.0=*.ops.yourcompany.com
[serverClass:AppsForOps:app:unix]
[serverClass:AppsForOps:app:SplunkLightForwarder]
[serverClass:AppsForDesktops]
filterType=blacklist
# blacklist everybody except the Windows desktop machines.
blacklist.0=*
whitelist.0=*.desktops.yourcompany.com
[serverClass:AppsForDesktops:app:SplunkDesktop]

# Example 3
# Deploy server class based on machine types
[global]
# whitelist.0=* at the global level ensures that the machineTypesFilter attribute
# invoked later will apply.
whitelist.0=*
[serverClass:AppsByMachineType]
machineTypesFilter=windows-*, linux-i686, linux-x86_64
[serverClass:AppsByMachineType:app:SplunkDesktop]
# Deploy this app only to Windows boxes.
machineTypesFilter=windows-*
[serverClass:AppsByMachineType:app:unix]
# Deploy this app only to unix boxes - 32/64 bit.
machineTypesFilter=linux-i686, linux-x86_64
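To illustrate the layered arrangement described at the start of this topic (a common outputs app for all forwarders, plus OS-specific inputs apps), here is a sketch; the server class and app names (all_forwarders, outputs_common, win_inputs, linux_inputs) are invented for the example, and the machine type strings should be adjusted to what your clients actually report:

# Sketch only: one class matching every forwarder, plus OS-specific classes.
[global]
whitelist.0=*

[serverClass:all_forwarders]
whitelist.0=*
[serverClass:all_forwarders:app:outputs_common]
# every client gets the app containing the shared outputs.conf
stateOnClient=enabled
restartSplunkd=true

[serverClass:win_forwarders]
whitelist.0=*
machineTypesFilter=windows-*
[serverClass:win_forwarders:app:win_inputs]
stateOnClient=enabled
restartSplunkd=true

[serverClass:linux_forwarders]
whitelist.0=*
machineTypesFilter=linux-i686, linux-x86_64
[serverClass:linux_forwarders:app:linux_inputs]
stateOnClient=enabled
restartSplunkd=true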

Use Splunk Manager


To define deployment server classes in Splunk Manager:

1. Go to Splunk Manager > Deployment.

2. Select Add new from the Deployment server row.

3. Enter values for the Server class and Repository location fields.

4. Optionally, enter values for whitelists or blacklists.

For more information on what to enter in these fields, see the descriptions for the corresponding attributes in "What you can configure for a server class", located above.

You can only configure the most basic options using Splunk Manager. For advanced configurations, edit serverclass.conf directly.

Deploy apps and configurations


After configuring the deployment server and clients, you're ready to distribute content to the deployment clients. This involves two steps:

1. Put the new or updated content into deployment directories on the deployment server.

2. Inform the clients that it's time to download new content.

Put the content into directories on the deployment server


You place the deployment apps into individual subdirectories in a special location on the deployment server. From there, the deployment server can push the content to its deployment clients. The default location is $SPLUNK_HOME/etc/deployment-apps, but this is configurable through the repositoryLocation attribute in serverclass.conf. Underneath this location, each app must have its own subdirectory, with the same name as the app itself. The app name is specified in serverclass.conf.

Note: On the deployment clients, the downloaded apps reside in a different location, which defaults to $SPLUNK_HOME/etc/apps. This location is configurable in deploymentclient.conf, but, for most purposes, you should not change it from the default.

This example creates a deployment directory in the default repository location on the deployment server, for an app named "fwd_to_splunk1":

mkdir -p $SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk1/default

Place the content for each app into the app's subdirectory. To update the app with new or changed content, just add or overwrite the files in the directory.
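For example, a sketch of pushing a revised outputs.conf into the fwd_to_splunk1 app created above and then notifying its clients (the reload command is described in the next section):

cp outputs.conf $SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk1/default/
$SPLUNK_HOME/bin/splunk reload deploy-server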

Inform the clients of new content


When you first configure the deployment server, and whenever you update its configuration by editing serverclass.conf, you'll need to restart or reload it for the changes to take effect. The clients will then pick up any new or changed content. Instead of restarting the server, you can use the CLI reload deploy-server command. This version of the command checks all server classes for change and notifies the relevant clients:

splunk reload deploy-server

This version of the command notifies and updates only the server class you specify:

splunk reload deploy-server -class <server class>

For example:

splunk reload deploy-server -class www

In this example, the command notifies and updates only the clients that are members of the "www" server class.

Confirm the deployment update


To confirm that all clients received the configuration correctly, run this command from the deployment server:

splunk list deploy-clients

This lists all the deployment clients that have contacted the deployment server within the last two minutes. It specifies the last time that they successfully synced.

App management issues


Once you start using the deployment server to manage an app, you cannot later stop using the deployment server to manage that app. It is important to understand the implications of this. If you remove an app from the deployment server's repositoryLocation (defined in serverclass.conf), the deployment client will delete its copy of the app. There is no way to instead tell the deployment client to start managing the app on its own.

For example, say you are using the deployment server to manage updates to "appA". To do this, you have created a directory called "appA" in the repositoryLocation on the deployment server and you have placed the app's contents there. From now on, whenever the deployment clients poll the server to check for updates, they compare their checksum for appA with the server's checksum for appA. If the checksums differ, the clients download the latest version of the app from the server. However, if appA has been deleted from the server's app repository, the clients delete their own instances of the app.

Therefore, by deleting an app from the deployment server's repositoryLocation, you are not telling the clients to stop using the deployment server to manage the app and to start managing it on their own. Instead, you're actually telling them to delete the app. Once the deployment server manages an app, it always manages that app.

Warning: Because of this behavior, you should be extremely cautious before deciding to use the deployment server to manage the Splunk search app. By managing the search app through the deployment server, you are preventing users from saving any unique searches on their search heads. And since there's no way to tell the deployment server to stop managing an app, you're effectively stuck with that decision.

Extended example: deploy configurations to several forwarders


What we're aiming for
A common use for deployment servers is to manage configuration files for forwarders. In some distributed environments, forwarders can number into the thousands, and the deployment server greatly eases the work of configuring and updating them. This example shows how to use the deployment server to initially configure a set of dissimilar universal forwarders. A follow-up example, in "Example: add an input to forwarders", shows how to use the deployment server to update the forwarders' configurations with new inputs.

Before reading this example, you should already be familiar with how forwarders and receivers communicate, as described in "About forwarding and receiving" and the topics that follow it.

The example sets up the following distributed environment, in which a deployment server deploys configurations for three universal forwarders sending data to two indexers:

The deployment server Fflanda-SPLUNK3 (10.1.2.4) manages deployment for these universal forwarders: Fflanda-WIN1, Fflanda-LINUX1, and Fflanda-LINUX2.

Fflanda-WIN1 forwards Windows event logs to the receiving indexer Fflanda-SPLUNK1 (10.1.2.2).

Fflanda-LINUX1 and Fflanda-LINUX2 forward Linux messages to the receiving indexer Fflanda-SPLUNK2 (10.1.2.3).

The forwarders are deployment clients, receiving their configuration files from the deployment server. Here's the basic set up:


For information on forwarders (including the universal forwarder), see "About forwarding and receiving" in this manual. For information on monitoring Windows event log data, see "Monitor Windows event log data". For information on monitoring files, such as message logs, see "Monitor files and directories".

Overview of the set up


Here's an overview of the set up process (the detailed steps follow in the next section):

On the deployment server:

1. Create the set of server classes and apps for the deployment clients (forwarders) in serverclass.conf. You'll create two server classes to represent the two OS types (Windows, Linux). For each server class, you'll also create two deployment apps, for a total of four apps. The apps encapsulate:

The type of input -- the data that the universal forwarder will monitor (Windows event logs or Linux messages).

The type of output -- the indexer the forwarder will send data to (SPLUNK1 or SPLUNK2).

This configuration results in each universal forwarder belonging to a server class and receiving two apps: one for its inputs and one for its outputs.

2. Create directories to hold the deployment apps.

3. Create configuration files (outputs.conf and inputs.conf) to deploy to the forwarders. These files constitute the deployment apps and reside in the app directories.

4. Restart the deployment server.

On each Splunk indexer that will be receiving data from universal forwarders:

1. Enable receiving through the Splunk CLI.

2. Restart the indexer.

On each forwarder/deployment client:

1. Create a deploymentclient.conf file that points to the deployment server.

2. Restart the forwarder.

The rest is Splunk magic. After a short delay (while the forwarders receive and act upon their deployed content), Windows event logs begin flowing from Fflanda-WIN1 to Fflanda-SPLUNK1, and /var/log/messages begin flowing from Fflanda-LINUX1 and Fflanda-LINUX2 to Fflanda-SPLUNK2.

Detailed configuration steps


On the deployment server, Fflanda-SPLUNK3: 1. Install Splunk, if you haven't already done so. 2. Set up your server classes. Create
$SPLUNK_HOME/etc/system/local/serverclass.conf

with the following settings:

# Global server class
[global]
# Filter (whitelist) all clients
whitelist.0=*

# Server class for Windows
[serverClass:Fflanda-WIN]
# Filter (whitelist) all Windows clients
whitelist.0=Fflanda-WIN*

# App for inputting Windows event logs
# This app is only for clients in the server class Fflanda-WIN
[serverClass:Fflanda-WIN:app:winevt]
# Enable the app and restart Splunk, after the client receives the app
stateOnClient=enabled
restartSplunkd=true

# App for forwarding to SPLUNK1
# This app is only for clients in the server class Fflanda-WIN
[serverClass:Fflanda-WIN:app:fwd_to_splunk1]
stateOnClient=enabled
restartSplunkd=true

# Server class for Linux
[serverClass:Fflanda-LINUX]
# Filter (whitelist) all Linux clients
whitelist.0=Fflanda-LINUX*

# App for inputting Linux messages
# This app is only for clients in the server class Fflanda-LINUX
[serverClass:Fflanda-LINUX:app:linmess]
stateOnClient=enabled
restartSplunkd=true

# App for forwarding to SPLUNK2
# This app is only for clients in the server class Fflanda-LINUX
[serverClass:Fflanda-LINUX:app:fwd_to_splunk2]
stateOnClient=enabled
restartSplunkd=true

See "Define server classes" for details on how to configure this file. 3. Create the app deployment directories with the following commands:

mkdir -p $SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk1/default
mkdir -p $SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk2/default
mkdir -p $SPLUNK_HOME/etc/deployment-apps/winevt/default
mkdir -p $SPLUNK_HOME/etc/deployment-apps/linmess/default

Each app gets its own directory, so that they can be deployed individually. In addition, the directory name determines the name of the app. 4. Create
$SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk1/default/outputs.conf

with the following settings:

[tcpout]
defaultGroup=splunk1

[tcpout:splunk1]
# Specifies the server that receives data from the forwarder.
server=10.1.2.2:9997

For information on outputs.conf, see "Configure forwarders with outputs.conf" in this manual. 5. Create
$SPLUNK_HOME/etc/deployment-apps/fwd_to_splunk2/default/outputs.conf

with the following settings:

[tcpout]
defaultGroup=splunk2

[tcpout:splunk2]
server=10.1.2.3:9997

6. Create $SPLUNK_HOME/etc/deployment-apps/winevt/default/inputs.conf with the following settings:

[WinEventLog:Application]
disabled=0

[WinEventLog:Security]
disabled=0

[WinEventLog:System]
disabled=0

For information on monitoring Windows event log data, see "Monitor Windows event log data" in the Getting Data In manual. 7. Create $SPLUNK_HOME/etc/deployment-apps/linmess/default/inputs.conf with the following settings:

[monitor:///var/log/messages]
disabled=false
sourcetype=syslog

For information on monitoring files, such as message logs, see "Monitor files and directories" in the Getting Data In manual.


8. Restart Splunk on the deployment server.

Note: Because the deployment server in this example is newly configured, it requires a restart for its configuration to take effect. When clients poll the server for the first time, they'll get all the content designated for them. To deploy subsequent content, you generally do not need to restart the server. Instead, you just invoke the Splunk CLI reload command on the server, as described in "Deploy apps and configurations". By doing so, you ensure that the server will inform its clients of content changes. However, whenever you edit serverclass.conf, you must always restart the deployment server for the configuration changes to take effect.

To set up the receiving indexers, Fflanda-SPLUNK1 and Fflanda-SPLUNK2, perform these steps on the machines where you want them to reside:

1. Install Splunk, if you haven't already done so.

2. Run the following CLI command:

./splunk enable listen 9997 -auth <username>:<password>

This specifies that the indexer will serve as a receiver, listening for data on port 9997. With proper authorization, any forwarder can now send data to the receiver by designating its IP address and port number. You must enable the indexers as receivers before you enable the forwarders that will be sending data to them. For more information on enabling receivers, see "Enable a receiver" in this manual.

3. Restart Splunk.

On each universal forwarder, Fflanda-WIN1, Fflanda-LINUX1, and Fflanda-LINUX2:

1. Install the forwarder, if you haven't already done so.

2. You can specify the deployment server as part of the installation process, or you can specify it in $SPLUNK_HOME/etc/system/local/deploymentclient.conf after the installation, using these settings:

[deployment-client]

[target-broker:deploymentServer]
# Specify the deployment server that the client will poll.
targetUri= 10.1.2.4:8089

See "Configure deployment clients" for details on how to configure this file. To learn how to specify the deployment server at installation time, see "Universal forwarder deployment overview" and the topics that follow it. 3. Restart the universal forwarder. Each forwarder will now poll the deployment server, download its configuration files, restart, and begin forwarding data to its receiving indexer. For a follow-up example showing how to use the deployment server to update forwarder configurations, see "Example: Add an input to forwarders".

What the communication between the deployment server and its clients looks like
Using the above example, the communication from Fflanda-WIN1 to Fflanda-SPLUNK3 on port 8089 would look like this:

Fflanda-WIN1: Hello, I am Fflanda-WIN1.

Fflanda-SPLUNK3: Hello, Fflanda-WIN1. I have been expecting to hear from you. I have you down as a member of the Fflanda-WIN server class, and you should have the fwd_to_splunk1 (checksum=12345) and winevt (checksum=12378) apps.

Fflanda-WIN1: Hmmm, I don't have those configs. Using this connection I just opened up to you, can I grab the configs from you?

Fflanda-SPLUNK3: Sure! I have them ready for you.

Fflanda-WIN1: Thanks! I am going to back off a random number of seconds between 1 and 60 (in case you have a lot of clients that are polling you at the moment) ... OK, now send me the files.

Fflanda-SPLUNK3: Done! You now have fwd_to_splunk1-timestamp.bundle and winevt-timestamp.bundle.

Fflanda-WIN1: Awesome! I am going to store them in my $SPLUNK_HOME/etc/apps directory. Now I am going to restart myself, and when I come back up I am going to read the configurations that you sent me directly out of the .bundle files, which I know are just tar balls with a different extension.

A couple of minutes go by....

Fflanda-WIN1: Hello, I am Fflanda-WIN1.

Fflanda-SPLUNK3: Hello, Fflanda-WIN1. I have been expecting to hear from you. I have you down as a member of the Fflanda-WIN server class, and you should have the fwd_to_splunk1 (checksum=12345) and winevt (checksum=12378) apps.

Fflanda-WIN1: Hmmm, I already have both of those, but thanks anyway!

Later on, an admin modifies the winevt/inputs.conf file on Fflanda-SPLUNK3 to disable the collection of system event logs, and then runs the CLI command splunk reload deploy-server to force the deployment server to rescan serverclass.conf and the app directories. The next time Fflanda-WIN1 talks to Fflanda-SPLUNK3, it goes like this:

Fflanda-WIN1: Hello, I am Fflanda-WIN1.

Fflanda-SPLUNK3: Hello, Fflanda-WIN1. I have been expecting to hear from you. I have you down as a member of the Fflanda-WIN server class, and you should have the fwd_to_splunk1 (checksum=12345) and winevt (checksum=13299) apps.

Fflanda-WIN1: Hmmm, I know I have those configs, but the checksum I have for the winevt configs is different than the one you just told me about. Using this connection I just opened up to you, can I grab the updated winevt config from you?

Fflanda-SPLUNK3: Sure! I have it ready for you.

Fflanda-WIN1: Thanks! I am going to back off a random number of seconds between 1 and 60 (in case you have a lot of clients that are polling you at the moment) ... OK, now send me the updated config.

Fflanda-SPLUNK3: Done! You now have winevt-newer_timestamp.bundle.

Fflanda-WIN1: Awesome! I am going to store it in my $SPLUNK_HOME/etc/apps directory and move the old winevt.bundle I had out of the way. Now I am going to restart myself, and when I come back up, I'll have the most up-to-date config.

Example: add an input to forwarders


The previous topic, "Extended example: deploy several forwarders", described setting up a deployment environment to manage a set of universal forwarders. It showed how to configure a new deployment server to deploy content to a new set of deployment clients. The current example follows on directly from there, using the configurations created in that topic. It shows how to update a forwarder configuration file and deploy the updated file to a subset of forwarders, defined by a server class.

Overview of the update process


This example starts with the set of configurations and Splunk instances created in the topic "Extended example: deploy several forwarders". The Linux universal forwarders now need to start monitoring data from a second source. To accomplish this, perform these steps on the deployment server:

1. Edit the inputs.conf file for the Linux server class to add the new source, overwriting the previous version in its apps directory.

2. Use the CLI to reload the deployment server, so that it becomes aware of the change and can deploy it to the appropriate set of clients (forwarders).

You need to make changes only on the deployment server. When the deployment clients in the Linux server class next poll the server, they'll be notified of the new inputs.conf file. They'll download the file, enable it, restart Splunk, and immediately begin monitoring the second data source.

Detailed configuration steps


On the deployment server: 1. Edit $SPLUNK_HOME/etc/deployment-apps/linmess/default/inputs.conf to add a new input:

[monitor:///var/log/messages]
disabled=false
sourcetype=syslog


[monitor:///var/log/httpd]
disabled=false
sourcetype = access_common

2. Use Splunk CLI to reload the deployment server:

./splunk reload deploy-server -class Fflanda-LINUX

Once this command has been run, the deployment server notifies the clients that are members of the Fflanda-LINUX server class of the changed file. Since the change doesn't affect the Fflanda-WIN server class, its members don't need to know about it.

Example: deploy an app


This example walks through the configuration needed to deploy an app, in this case, the Splunk light forwarder.

On the deployment server


1. Copy the SplunkLightForwarder app from $SPLUNK_HOME/etc/apps to the deployment directory, $SPLUNK_HOME/etc/deployment-apps, on the deployment server.

2. Edit serverclass.conf in /system/local on the deployment server. Add a server class named "lightforwarders" that includes the light forwarder app:

[global]
whitelist.0=*

[serverClass:lightforwarders]
whitelist.0=*

[serverClass:lightforwarders:app:SplunkLightForwarder]
stateOnClient=enabled
restartSplunkd=true

Note the following:

The [global] stanza is required. It contains any settings that should be globally applied.

In the [global] stanza, whitelist.0=* signifies that all of the deployment server's clients match all server classes defined in this configuration file. In this example, there is just a single server class.

The server class name is "lightforwarders". You can call your server classes anything you want.

In the [serverClass:lightforwarders] stanza, whitelist.0=* signifies that all clients match the lightforwarders server class.

The [serverClass:lightforwarders:app:SplunkLightForwarder] stanza contains settings specific to the SplunkLightForwarder app on the lightforwarders server class. stateOnClient specifies that this app should be enabled on the client when it is deployed. restartSplunkd specifies that when this app is deployed, splunkd should be restarted.

See "Define server classes" for details on how to configure this file.

On the deployment client


Edit deploymentclient.conf in /system/local on the deployment client to tell the client how to contact the deployment server:

[deployment-client]

[target-broker:deploymentServer]
targetUri=<IP:port>

Note the following:

deploymentServer is the default name for a deployment server.

<IP:port> is the IP address and port number for this client's deployment server.

The file points the client to the deployment server located at IP:port. There, it will pick up the Splunk light forwarder app, enable it, and restart.

See "Configure deployment clients" for details on how to configure this file.

Deploy in multi-tenant environments


Important: It is recommended that you work with Splunk Professional Services when designing a multi-tenant deployment.


A multi-tenant deployment server topology means that you have more than one deployment server running on the same Splunk instance, and each deployment server is serving content to its own set of deployment clients. (You can also achieve the same effect by using two Splunk instances, each with its own configuration.)

Use tenants.conf to redirect incoming requests from deployment clients to another deployment server or servers. The typical reason for doing this is to offload splunkd's HTTP server: if too many deployment clients are simultaneously hitting the splunkd HTTP server to download apps and configurations, it can overload the deployment server. Over 400 connections at one time has been shown to bog down splunkd's HTTP server, although this figure does not take into account hardware or the size of the package the client is downloading.

To set up multiple deployment servers on a single Splunk instance, you:

Create a tenants.conf containing a whitelist or blacklist that tells deployment clients which deployment server instance to use.

Create a separate instance of serverclass.conf for each deployment server, named for that deployment server, like so: <tenantName>-serverclass.conf.

For each deployment client, configure deploymentclient.conf the way you would if there were just one deployment server.
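For instance, with the dept1 and dept2 tenants used in the example later in this topic, the deployment server host would carry configuration files along these lines; this sketch assumes the per-tenant serverclass files live alongside tenants.conf in $SPLUNK_HOME/etc/system/local:

$SPLUNK_HOME/etc/system/local/tenants.conf
$SPLUNK_HOME/etc/system/local/dept1-serverclass.conf
$SPLUNK_HOME/etc/system/local/dept2-serverclass.conf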

What you can define in tenants.conf


You identify the different deployment servers as "tenants" in tenants.conf on the Splunk instance that will host these deployment servers. There isn't a tenants.conf file by default, so you must create one in $SPLUNK_HOME/etc/system/local and define the tenants in it.

For each tenant, create a stanza with the heading [tenant:<tenantName>] with these attributes:

filterType: Set to whitelist or blacklist. Determines the type of filter to use. Deployment clients use the whitelist filter to determine which deployment server to access. Default: whitelist.

whitelist.<n>, blacklist.<n>: <n> is a number starting at 0, and incrementing by 1. The client stops looking at the filter when <n> breaks. Default: n/a. Set the attribute to one of these value categories:

ipAddress: The IP address of the deployment client. You can use wildcards, for example, 10.1.1.*

hostname: The host name of the deployment client. You can use wildcards, for example, *.splunk.com

clientName: A logical, or tag, name that can be assigned to a deployment client in deploymentclient.conf. A clientName takes precedence over ipAddress or hostname when matching a client to a filter.
Example
Here is an example of defining two tenants in the tenants.conf file:

# Define two tenants - dept1 and dept2.
# Deployment server configuration for dept1 will be in a matching dept1-serverclass.conf
# Deployment server configuration for dept2 will be in a matching dept2-serverclass.conf
[tenant:dept1]
whitelist.0=*.dept1.splunk.com

[tenant:dept2]
whitelist.0=*.dept2.splunk.com


Upgrade your deployment


Upgrade your distributed environment
This topic discusses the process of upgrading the components of a distributed Splunk deployment. Upgrading a distributed Splunk environment presents challenges beyond those of upgrading an indexer-only Splunk installation. To reduce downtime and ensure that no data is lost, we strongly recommend that you upgrade your Splunk components in the specific order given in the instructions below.

Note: This is high-level guidance on upgrading Splunk in a distributed environment. Every distributed Splunk environment is different, so we do not offer detailed step-by-step procedures. If you have additional questions about upgrading your distributed Splunk environment after reading this topic, you can log a case via the Splunk Support Portal.

Cross-version compatibility between distributed components


For information on compatibility between different versions of search heads and search peers (indexers), see "Cross-version compatibility for search heads". For information on compatibility between indexers and forwarders, see "Indexer and universal forwarder compatibility".

Test your apps prior to the upgrade


Before upgrading your distributed environment, make sure that all of your Splunk apps work on the version of Splunk that you plan to upgrade to.

Important: This procedure is required if you are upgrading a distributed environment with a search head pool, because pooled search heads use shared storage space for apps and configurations.

To ensure that your apps work on the desired upgraded version of Splunk:

1. On a reference machine, install the full version of Splunk that you currently run.

Note: You can also use an existing Splunk instance, provided that it is not indexing relevant data and is at the same version level as the other instances in your environment.

2. Install the apps on this Splunk instance.

3. Confirm that the apps work as expected.

4. Upgrade the Splunk instance to the desired version.

5. Test the apps again to make sure they work as desired in the new version.

If the apps work as expected, you can move them to the appropriate location during the upgrade of your distributed Splunk environment:

If you use non-pooled search heads, move the apps to $SPLUNK_HOME/etc/apps on each search head during the search head upgrade process.

If you use pooled search heads, move the apps to the shared storage location where the pooled search heads expect to find the apps.

Caution: The migration utility warns you of apps that need to be copied to shared storage for pooled search heads when you upgrade them. It does not, however, copy them for you. You must manually copy all updated apps - including apps that ship with Splunk (such as the Search app and the data preview feature, which is implemented as an app) - to shared storage during the upgrade process. Failure to do so can cause problems with Splunk's user interface after the upgrade is complete.

Upgrade a distributed environment with multiple indexers and non-pooled search heads
To maintain availability when upgrading a distributed Splunk environment with multiple indexers and non-pooled search heads, Splunk recommends that you upgrade the search heads first, then upgrade the indexing infrastructure that supports them. If you have deployment servers in the environment, be sure to disable those prior to upgrading your search heads.

To upgrade a distributed Splunk environment with multiple indexers and non-pooled search heads:


Prepare the upgrade

1. Confirm that any apps that the search heads use will work on the upgraded version of Splunk, as described in "Test your apps prior to the upgrade" in this topic.

2. If you use a deployment server in your environment, disable it temporarily. This prevents the server from distributing invalid configurations to your other Splunk components.

3. Upgrade your deployment server, but do not restart it.

Upgrade the search heads

4. Disable and upgrade one of the search heads. Do not allow it to restart.

5. After you upgrade the search head, place the confirmed working apps into the $SPLUNK_HOME/etc/apps directory of the search head.

6. Restart this search head and test for operation and functionality.

7. If there are no problems with the search head, then disable and upgrade the remaining search heads, one by one. Repeat this step until you have reached the last search head in your environment. Optionally, you can test each search head for operation and functionality after you bring it up.

8. Once you have upgraded the last search head, test all of the search heads for operation and functionality.

Upgrade the indexers

9. Disable and upgrade your indexers, one by one. You can restart the indexers immediately after you upgrade them.

10. Test your search heads to ensure that they find data across all your indexers.

11. After all indexers have been upgraded, restart your deployment server.

Upgrade a distributed environment with multiple indexers and pooled search heads
If your distributed Splunk environment has pooled search heads, the process to upgrade the environment becomes significantly more complex. If your organization has restrictions on downtime, this type of upgrade is best done within a maintenance window.

The key concepts to understand about upgrading this kind of environment are:

Pooled search heads must be enabled and disabled as a group.

The version of Splunk on all pooled search heads must be the same.

Apps and configurations that the search heads use must be tested prior to upgrading the search head pool.

If you have additional concerns about the guidance shown here, you can log a case via the Splunk Support Portal.

To upgrade a distributed Splunk environment with multiple indexers and pooled search heads:

Prepare the upgrade

1. Confirm that any apps that the pooled search heads use will work on the upgraded version of Splunk, as described in "Test your apps prior to the upgrade" in this topic.

2. If you use a deployment server in your environment, disable it temporarily. This prevents the server from distributing invalid configurations to your other Splunk components.

3. Upgrade your deployment server, but do not restart it.

Upgrade the search head pool

4. Designate a search head (Search Head #1) in your search head pool to upgrade as a test for functionality and operation.

Note: Search heads must be removed from the search head pool temporarily before you upgrade them. This must be done for several reasons:

To prevent changes to the apps and/or user objects hosted on the search head pool shared storage.

To stop the inadvertent migration of local apps and system settings to shared storage during the upgrade.

To ensure that you have a valid local configuration to use as a fallback, should a problem occur during the upgrade.


If problems occur as a result of the upgrade, search heads can be temporarily used in a non-pooled configuration as a backup.

5. Bring down all of the search heads in your environment.

Note: Search capability will be unavailable at this time, and will remain unavailable until you restart all of the search heads after upgrading.

6. Place the confirmed working apps (as tested in Step 1) in the search head pool shared storage area.

7. Remove Search Head #1 from the search head pool.

Note: Review "Configure search head pooling" for instructions on how to enable and disable search head pooling on each search head. A minimal command sketch also follows this procedure.

8. Upgrade Search Head #1.

9. Restart Search Head #1 and test for operation and functionality.

Note: In this case, 'operation and functionality' means that the Splunk instance starts and that you can log into it. It does not mean that you can use apps or objects hosted on shared storage. It also does not mean distributed searches will run correctly.

10. If the upgraded Search Head #1 functions as desired, bring it down and add it back to the search head pool.

11. Upgrade the remaining search heads in the pool, one by one, following Steps 7 through 10 above.

Caution: Remove each search head from the search head pool before you upgrade it, and add it back to the pool after you upgrade it. While it is not necessary to confirm operation and functionality of each search head, only one search head at a time can be up during the upgrade phase. Do not start any of the other search heads until you have upgraded all of them.

12. Once you have upgraded the last search head in the pool, restart all of them.

13. Test all search heads for operation and functionality across all of the apps and user objects that are hosted on the search head pool.


14. Test distributed search across all of your indexers.

Upgrade the indexers

15. Once you have confirmed that your search heads are functioning as desired, choose an indexer to keep the environment running (Indexer #1), and another to upgrade initially (Indexer #2).

Note: If you do not have downtime concerns, you do not need to perform this step.

16. Bring down all of the indexers except Indexer #1.

Note: If you do not have downtime concerns, you can bring down all of the indexers.

17. Upgrade Indexer #2.

18. Bring up Indexer #2 and test for operation and functionality.

Note: Search heads running the latest version of Splunk can communicate with indexers running earlier versions of Splunk.

19. Once you have confirmed proper operation on Indexer #2, bring down Indexer #1.

20. Upgrade Indexer #1 and all of the remaining indexers, one by one. You can restart the indexers immediately after you upgrade them.

21. Confirm operation and functionality across all of your indexers.

22. Restart your deployment server, and confirm its operation and functionality.
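For steps 7 and 10, the following is a minimal sketch of taking a single search head out of the pool, upgrading it, and adding it back, run on the search head itself. The shared storage path is a placeholder, and the exact pooling CLI syntax should be confirmed against "Configure search head pooling" for your version; treat this as an illustration, not the definitive command set.

# Take this search head out of the pool before upgrading (step 7)
$SPLUNK_HOME/bin/splunk stop
$SPLUNK_HOME/bin/splunk pooling disable

# Upgrade the Splunk software on this host (step 8), then restart and verify (step 9)
$SPLUNK_HOME/bin/splunk start --accept-license

# Once it checks out, bring it down and rejoin the pool (step 10)
$SPLUNK_HOME/bin/splunk stop
$SPLUNK_HOME/bin/splunk pooling enable /mnt/searchhead-pool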

Upgrade forwarders
When you upgrade your distributed Splunk environment, you can also upgrade any universal forwarders in that environment. This is not required, however, so consider whether you actually need to. Forwarders are always compatible with indexers running later versions, so you do not need to upgrade them just because you've upgraded the indexers they send data to. To upgrade universal forwarders, review the following topics in this manual:


Upgrade the Windows universal forwarder
Upgrade the universal forwarder on *nix systems

Upgrade the Windows universal forwarder


This topic describes the procedure for upgrading your Splunk universal forwarder from version 4.2.x or 4.3.x to 5.0. The upgrade process is much simpler than the original installation. The MSI does a straightforward upgrade with no configuration changes. If you need to change any configuration settings on your forwarders, you can do so after the upgrade, preferably through the deployment server.

Important: Before doing an upgrade, consider whether you really need to. In most cases, there's no compelling reason to upgrade a forwarder. Forwarders are always compatible with later version indexers, so you do not need to upgrade them just because you've upgraded the indexers they're sending data to.

This topic describes three upgrade scenarios:

Upgrade a single forwarder with the GUI installer
Upgrade a single forwarder with the command line installer
Perform a remote upgrade of a group of forwarders

For deployments of any size, you will most likely want to use this last scenario.

Before you upgrade


Be sure to read this section before performing an upgrade.

Back your files up

Before you perform the upgrade, we strongly recommend that you back up your Splunk configuration files. For information on backing up configurations, read "Back up configuration information" in the Admin manual.

Configure auto load balancing for clustering

If you plan to do load-balanced forwarding to a Splunk cluster, you must configure your existing forwarders to use auto-load balancing. To learn how to do this, read "Set up load balancing" in this manual.

Splunk does not provide a means of downgrading to a previous version; if you need to revert to an older forwarder release, just uninstall the current version and reinstall the older release.

Upgrade using the GUI installer


You can upgrade a single forwarder with the GUI installer:

1. Download the new MSI file from the Splunk universal forwarder download page.

2. Double-click the MSI file. The Welcome panel is displayed. Follow the onscreen instructions to upgrade the forwarder.

Note: You do not need to stop the forwarder before upgrading. The MSI will do this automatically as part of the upgrade process.

3. The forwarder will start automatically when you complete the installation.

The installer puts a log of upgrade changes in %TEMP%. It also reports any errors in the Application Event log.

Upgrade using the command line


You can upgrade a single forwarder by running the command line installer. To upgrade a group of forwarders, you can load the command line installer into a deployment tool, as described below.

Here are the steps for using the command line installer to upgrade a single forwarder:

1. Download the new MSI file from the Splunk universal forwarder download page.

2. Install the universal forwarder from the command line by invoking msiexec.exe.

For 32-bit platforms, use
splunkuniversalforwarder-<...>-x86-release.msi:

msiexec.exe /i splunkuniversalforwarder-<...>-x86-release.msi [AGREETOLICENSE=Yes /quiet]


For 64-bit platforms, use


splunkuniversalforwarder-<...>-x64-release.msi:

msiexec.exe /i splunkuniversalforwarder-<...>-x64-release.msi [AGREETOLICENSE=Yes /quiet]

The value of <...> varies according to the particular release; for example, splunkuniversalforwarder-5.0-142438-x64-release.msi.

Important: You cannot make configuration changes during upgrade. If you specify any command line flags besides "AGREETOLICENSE", the MSI just ignores them.

Note: You do not need to stop the forwarder before upgrading. The MSI will do this automatically as part of the upgrade process.

3. The forwarder will start automatically when you complete the installation.

The installer puts a log of upgrade changes in %TEMP%. It also reports any errors in the Application Event log.
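After the upgrade completes, you may want to confirm the installed version from the command line. This is a minimal check that assumes the default universal forwarder install path; adjust the path if you installed elsewhere.

"C:\Program Files\SplunkUniversalForwarder\bin\splunk.exe" version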

Perform a remote upgrade


To upgrade a group of forwarders across your environment:

1. Load the universal forwarder MSI into your deployment tool. In most cases, you will want to run the command like this:

msiexec.exe /i splunkuniversalforwarder-<...>.msi AGREETOLICENSE=Yes /quiet

See the previous section, "Upgrade using the command line", for details on the MSI command.

2. Execute deployment with your deployment tool.

3. Use the deployment monitor to verify that the universal forwarders are functioning properly.

You might want to test the upgrade locally on one machine before performing a remote upgrade across all your forwarders.
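If you do not already have a deployment tool, one possible approach is to run the silent MSI command on each host with a remote-execution utility. The following sketch, run from a batch (.cmd) file, assumes PsExec from Windows Sysinternals, a hosts.txt file listing the target machines one per line, and an SMB share that every host can read; none of these are part of the documented procedure, so adapt it to whatever tooling your organization uses.

rem Hypothetical remote upgrade loop: run the silent MSI upgrade on each host in hosts.txt
for /f %%h in (hosts.txt) do psexec \\%%h -s msiexec.exe /i \\fileserver\installers\splunkuniversalforwarder-<...>-x64-release.msi AGREETOLICENSE=Yes /quiet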


Upgrade the universal forwarder for *nix systems


This topic describes the procedure for upgrading your Splunk universal forwarder from version 4.2.x or 4.3.x to 5.0.

Important: Before doing an upgrade, consider whether you really need to. In most cases, there's no compelling reason to upgrade a forwarder. Forwarders are always compatible with later version indexers, so you do not need to upgrade them just because you've upgraded the indexers they're sending data to.

This topic describes two upgrade scenarios:

Upgrade a single forwarder manually
Perform a remote upgrade of a group of forwarders

For deployments of any size, you will most likely want to use this second scenario.

Before you upgrade


Be sure to read this section before performing an upgrade.

Back your files up

Before you perform the upgrade, we strongly recommend that you back up your Splunk configuration files. For information on backing up configurations, read "Back up configuration information" in the Admin manual.

Splunk does not provide a means of downgrading to a previous version; if you need to revert to an older forwarder release, just reinstall it.

Configure auto load balancing for clustering

If you plan to do load-balanced forwarding to a Splunk cluster, you must configure your existing forwarders to use auto-load balancing. To learn how to do this, read "Set up load balancing" in this manual.
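As a minimal illustration of the "Back your files up" recommendation above, you can archive the forwarder's configuration directory before you upgrade. The archive name and location below are placeholders.

tar -czf /tmp/splunkforwarder-etc-backup.tar.gz -C $SPLUNK_HOME etc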

How upgrading works


After you install the new version, the configuration changes are not actually made until you start Splunk. You can run the migration preview utility at that time to see what will be changed before the files are updated. If you choose to view the changes before proceeding, a file containing the changes that the upgrade script proposes to make is written to:

$SPLUNK_HOME/var/log/splunk/migration.log.<timestamp>
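For example, one way to preview the migration non-interactively and then inspect the proposed changes (a sketch; the exact timestamp suffix will vary) is:

$SPLUNK_HOME/bin/splunk start --accept-license --answer-no
less $SPLUNK_HOME/var/log/splunk/migration.log.*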

Upgrade a single forwarder


1. Execute the stop command:

$SPLUNK_HOME/bin/splunk stop

Important: Make sure no other processes will start the forwarder automatically (such as Solaris SMF).

2. Install the Splunk package over your existing Splunk deployment:

If you are using a .tar file, expand it into the same directory with the same ownership as your existing universal forwarder instance. This overwrites and replaces matching files but does not remove unique files.

If you are using a package manager, such as an RPM, type rpm -U
<splunk_package_name>.rpm

If you are using a .dmg file (on MacOS), double-click it and follow the instructions. Be sure to specify the same installation directory as your existing installation.

If you use init scripts, be sure to include the following so the EULA gets accepted:

./splunk start --accept-license

3. Execute the start command:

$SPLUNK_HOME/bin/splunk start

The following output is displayed:

This appears to be an upgrade of Splunk.
--------------------------------------------------------------------------------
Splunk has detected an older version of Splunk installed on this machine. To
finish upgrading to the new version, Splunk's installer will automatically
update and alter your current configuration files. Deprecated configuration
files will be renamed with a .deprecated extension.

You can choose to preview the changes that will be made to your configuration
files before proceeding with the migration and upgrade:

If you want to migrate and upgrade without previewing the changes that will be
made to your existing configuration files, choose 'y'.

If you want to see what changes will be made before you proceed with the
upgrade, choose 'n'.

Perform migration and upgrade without previewing configuration changes? [y/n]

4. Choose whether you want to run the migration preview script to see what changes will be made to your existing configuration files, or proceed with the migration and upgrade right away.

5. If you choose to view the expected changes, the script provides a list.

6. Once you've reviewed these changes and are ready to proceed with migration and upgrade, run $SPLUNK_HOME/bin/splunk start again.

Note: You can complete Steps 3 to 5 in one line:

To accept the license and view the expected changes (answer 'n') before continuing the upgrade:

$SPLUNK_HOME/bin/splunk start --accept-license --answer-no

To accept the license and begin the upgrade without viewing the changes (answer 'y'):

$SPLUNK_HOME/bin/splunk start --accept-license --answer-yes

Perform a remote upgrade


To upgrade a group of forwarders across your environment:

1. Upgrade the universal forwarder on a test machine, as described above.

2. Create a script wrapper for the upgrade commands, as described in "Remotely deploy a *nix universal forwarder with a static configuration". You will need to modify the sample script to meet the needs of an upgrade.

3. Run the script on representative target machines to verify that it works with all required shells.

4. Execute the script against the desired set of hosts.

5. Use the deployment monitor to verify that the universal forwarders are functioning properly.
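To make step 2 concrete, here is a minimal sketch of the kind of wrapper a remote upgrade might use. The host list, package name, install directory, and use of ssh/scp are all assumptions for illustration; the sample script referenced above is the authoritative starting point.

#!/bin/sh
# Hypothetical host list and package name; adjust both for your environment.
HOSTS="fwd1.example.com fwd2.example.com"
PACKAGE=splunkforwarder-5.0.2-<build>-Linux-x86_64.tgz

for HOST in $HOSTS; do
    # Copy the new package to the target host
    scp "$PACKAGE" "$HOST:/tmp/"
    # Stop the forwarder, unpack the new version over the existing install,
    # then restart non-interactively, accepting the license and the migration
    ssh "$HOST" "/opt/splunkforwarder/bin/splunk stop && \
        tar -xzf /tmp/$PACKAGE -C /opt && \
        /opt/splunkforwarder/bin/splunk start --accept-license --answer-yes"
done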

