
19th International Symposium on Software Reliability Engineering

Analysis of Computer Security Incident Data Using Time Series Models


Edward Condon¹, Angela He², and Michel Cukier¹
¹ Center for Risk and Reliability, Department of Mechanical Engineering
² Department of Mathematics
University of Maryland, College Park
{econdon, dhe1, mcukier}@umd.edu

Abstract
Organizations face increasing challenges in addressing and preventing computer and network security incidents. Security incidents carry financial consequences, including lost time and resources used during recovery, possible theft of personal and/or proprietary information, and reputational damage that may depress stock prices or reduce consumer confidence in a company. Being able to understand and predict trends in computer and network security incidents can aid an organization in allocating resources for the prevention of such incidents and in evaluating mitigation strategies. We apply time series models to a large set of security incident data. We examine the appropriateness of the data for modeling and consider the necessary transformations. Parameter search and model selection criteria are discussed. Forecasts from the time series models are then compared to forecasts from Non-Homogeneous Poisson Process (NHPP) software reliability growth (SRG) models.

1. Introduction and Motivation


Computer and network security incidents have financial consequences for both individuals and organizations. Large-scale botnets [1, 2] are increasingly used for organized criminal activities [3] such as extortion and phishing/pharming [4] attacks, in addition to the distribution of pop-up advertisements and spam [5]. An increasing number of attacks are motivated by prospects of financial gain [6]. The 2007 CSI Computer Crime and Security Survey report [7] (in previous years the survey was performed in conjunction with the FBI) indicates that the reported average annual loss by U.S. companies almost doubled compared to the previous year's survey. Financial fraud and system penetration by outsiders were

identified as significant causes of loss. A recent technical report explores the economic structure of malicious websites in China and provides insight into the financial nature and incentives behind such an underground market [8]. Laws and regulations may also require companies to publicly disclose data breaches, which can negatively affect stock prices and other valuations of the organization [9, 10].

Valid models will enable researchers to identify characteristics of incident occurrence, quantify existing trends, and perhaps provide insight into future trends. Models developed from the analysis of security incident data can be used by organizations to assist with the efficient allocation of resources and to gauge the impact of prevention and control measures or changes to security policies. This can be accomplished by comparing model forecasts to observed activity. If the observed number of incidents begins to trend increasingly away from the forecast values, an external factor may have changed (such as the development of a new variant of an attack) and an organization should reevaluate its control measures and make the necessary adjustments. A similar strategy can be used for evaluating the impact of changes to the internal security environment (such as policy changes, IDS/IPS deployment, system updates, etc.). Model forecasts can be compared to actual observations before and after implementing a new control measure. The effectiveness of the change to the security environment can be gauged if the observations trend away from the forecast values after the new control measure has been implemented.

Since many computer security incidents may be the result of a computer virus or worm, techniques used in the field of epidemiology and infectious disease surveillance provide a useful framework. One approach considers the biological mechanisms involved with disease transmission. The Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infected-Recovered (SEIR) models are two examples.

Table 1. Summary of incident data

Incident Type   # of Incidents   Start Date    End Date     # of Days   # of Weeks
all data        11966            6/14/2001     3/14/2007    2100        300.00
msblast         2133             8/12/2003     9/12/2003    32          4.57
genbot          1940             6/8/2004      3/12/2007    1008        144.00
klez            1118             2/12/2002     12/2/2003    659         94.14
bagle           849              1/20/2004     11/28/2005   679         97.00
ircbot          690              4/8/2002      3/14/2007    1802        257.43
agobot          589              10/13/2003    9/13/2005    702         100.29
rogueftp        500              4/19/2002     2/7/2007     1756        250.86
nethicsreq      509              11/12/2001    3/14/2007    1949        278.43
spamrelay       429              6/11/2002     3/9/2007     1733        247.57
nachi           317              9/16/2003     4/1/2004     199         28.43

These models involve sets of differential equations relating the transitions between states and can include parameters for transmission and recovery rates. More complex models may also factor in birth and death rates (analogous to the rate at which new machines are added and the rate at which old machines are decommissioned) and the proportion vaccinated (similar to patching in software terms). Examples of using epidemic models to model computer worms and viruses include [11] and [12]. However, other factors (such as the academic calendar of a university) can cause fluctuations in populations. Different interconnection topologies can result in varying epidemic patterns even when other factors are the same [13], and new strains (or variants) of viruses can emerge, rendering existing vaccinations less effective. Models that attempt to incorporate or account for the influence of these factors become increasingly computationally complex. Also, when it comes to modeling computer security incidents, some incident types may not be aptly described by, or result from, a contagious process.

Another approach is to use time series analysis and modeling as a surrogate for epidemic models for control purposes. Time series models are not explanatory models and do not rely on knowing the underlying dynamics producing the observed behavior. Instead, time series models attempt to make use of the time-dependent structure present in a set of observations. This method is often encountered in the fields of economics and finance, and it has been described in [14] as a general technique for the early identification of outbreaks of infectious diseases, with applications for intervention strategies. The use of these techniques is also illustrated in [15], applied to childhood infectious disease (pertussis, mumps, measles and rubella) data from Canada.

[16] describes monitoring the SARS epidemic in China using time series models, and [17] uses similar methods to evaluate intervention measures taken in Singapore regarding SARS. [18] suggests that classification of infectious diseases may be possible based on their time series structure.

We will explore the use of time series modeling for a range of computer security incident data. Several incident types may spread due to factors directly analogous to those affecting the spread of contagious diseases in animals. The occurrences of other incident types are likely more influenced by different factors. Since time series models do not attempt to directly account for or use information about causal factors, they will be applied to a range of incident types. This paper uses almost six years of security incidents collected from the University of Maryland's intranet. The data collection method assures there are no false positives among the recorded incidents. The data set includes 51 different incident types. We used time series analysis techniques to transform the data and to construct models of incident occurrence. We also examined model selection criteria for forecasting purposes and compared time series model forecasts to SRG model forecasts of the same data.

This paper is structured as follows. The data set of security incidents used in this paper is described in Section 2. In Section 3, time series modeling aspects such as data transformations, autoregressive integrated moving average (ARIMA) parameters, and model selection criteria are discussed. The process of fitting time series models to the data for the purpose of forecast evaluation is presented in Section 4. In Section 5, a comparison is made between the time series forecasts and predictions obtained from SRG models. Section 6 discusses some of the challenges and provides conclusions regarding the results.


Figure 1. Timeline of incident data


Figure 2. Cumulative number of all incidents

Figure 3. Histogram of number of incidents per week for "spamrelay"

2. The Incidents Data


The Office of Information Technology (OIT) at the University of Maryland provided the data set used in this paper. The data set consists of 11,966 security incidents recorded over a period of 5.75 years (from June 14, 2001 to March 14, 2007) and includes 51 different incident types. The number of incidents detected in a single day varies from 0 to 580. Table 1 summarizes the incident data and provides more details on the top ten incident types by frequency of occurrence; these are discussed in more detail below.

In the context of computer security, a distinction is sometimes made between an event and an incident. An event is typically something that attracts attention or appears abnormal. An incident is usually considered an event that has been verified as attributable to a security failure (as opposed to a hardware failure or a misinterpretation of data). The incidents recorded were based on three sources of events: 1) an intrusion detection system (IDS) (Snort [19] using a combination of regular rules and in-house rules), 2) reports from users, and 3) reports from other system administrators. Since recorded incidents led to the blocking of the suspected computer's IP address, OIT verified the authenticity of each incident. Therefore, all incidents obtained from these three sources were manually reviewed. OIT launched port scans and packet captures to validate the suspicious behavior of identified hosts. Based on the IDS rule that raised an alert, about 60% of the alerts were inconclusive (e.g., when the rule was too broad). Among the remaining 40%, about half led to the direct action of OIT blocking the IP address and half required a confirmation. Among the incident alerts, very few were reports from users.


Figure 4. Difference transformations and ACF values for "spamrelay" (panels show the cumulative incident counts vs. time, the differenced cumulative data with d = 2, the differenced log-transformed weekly data with d = 1, and the corresponding ACF plots)
The reports from other system administrators were defined as incidents in roughly 75% of the cases, based on the trustworthiness of the source. Because of the method used by OIT to validate each incident, all incidents were actual failures. Thus, there are no false positives among the reported incidents. However, we cannot quantify the number of undetected attacks and intrusions that did not lead to a recorded security incident.

We thoroughly examined the ten incident types that occurred most frequently in the data set. These ten types, out of 51 in total, represent only 19.6% of the incident types classified. However, they account for 9,074 of the 11,966 incidents, or 75.8% of the entire data set. Details on these ten incident types are provided in Table 1, which contains the number of incidents recorded for each type. Most incident type names are self-explanatory. However, note that nethicsreq represents the identification of illegal use of copyrighted material, such as music and movies, or other violations of the campus network acceptable use guidelines or policies. While this incident type has some aspects unique to the campus, it is similar to the fraud, waste, and abuse incidents encountered in a corporate setting. The Start Date indicates the date when the first incident associated with that incident type was recorded. The End Date represents the date when the last incident associated with the incident type was recorded or the end date of the data collection, whichever is earlier. # of Days is the inclusive number of days in the interval bounded by the Start Date and End Date, and # of Weeks is simply this number divided by seven.

Figure 1 shows a timeline of the intervals for all of the data and for the top ten most frequent incident types. Figure 2 shows the cumulative number of all the incident types for the full time interval. There is a noticeable sharp increase around day 790, attributable to the start of the msblast incident detections. Days 244 and 994 are two other examples of a rapid increase in the number of incidents over a short period of time. Day 244 coincides with the start of detections for klez incidents, and a large number of bagle incidents were detected around day 994, likely due to a new variant of this worm being released at the time. These incident types contribute to the total number of incidents in short bursts of activity. Figure 3 shows a histogram of the number of spamrelay incidents per week. It illustrates several groups of resurgent activity over time with somewhat increasing density. We grouped the data into weekly intervals before performing the analysis and calculating the models.
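As an illustration of this weekly grouping (not the procedure used in the study), the sketch below shows one way the aggregation could be performed. The file name `incidents.csv` and the column names `incident_type` and `detected_on` are hypothetical stand-ins for the OIT records.

```python
import pandas as pd

# Hypothetical input: one row per verified incident, with the incident type
# and the date on which it was detected.
incidents = pd.read_csv("incidents.csv", parse_dates=["detected_on"])

# Keep a single incident type (e.g., "spamrelay") and count incidents per week.
spam = incidents[incidents["incident_type"] == "spamrelay"]
weekly = (
    pd.Series(1, index=pd.DatetimeIndex(spam["detected_on"]))
      .resample("W")   # calendar-week bins
      .sum()           # weeks with no incidents count as zero
)

# Cumulative counts are used for some of the models discussed later.
cumulative = weekly.cumsum()
print(weekly.head(), cumulative.tail(), sep="\n")
```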

3. Time Series Modeling


Univariate time series models can be used to predict future values of a variable based only on its past measurements. These models do not rely on being able to measure or explain the causal factors underlying the behavior of the observed variable. Instead, the time series models are constructed to account for patterns in


past movements in a manner useful for forecasting future behavior [20]. While there are many types of time series models, we focus on a widely used and commonly applied class of linear models presented by Box and Jenkins [21] for univariate time series. These models combine autoregressive (AR) and moving average (MA) models into combined autoregressive moving average (ARMA) models. In particular, we explore autoregressive integrated moving average (ARIMA) models, which are among the most commonly used models for univariate time series. The ARIMA model is a generalization of the ARMA model that includes a parameter for a differencing transformation of the data. An AR model of order p, denoted AR(p), expresses the observed value at time t as:

X_t = \varepsilon_t + c + \sum_{i=1}^{p} \varphi_i X_{t-i}    [1]

where X_t is a weighted average of the previous p values in time with weights \varphi_i, \varepsilon_t is an additional random error term, and c is a constant. The error term is generally assumed to be sampled from a white noise process with mean zero and constant variance. Similarly, an MA model of order q, denoted MA(q), expresses the observed value at time t as:

X_t = \varepsilon_t + c + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    [2]

where X_t is a weighted average of the previous q random error terms with weights \theta_i, \varepsilon_t is an additional random error term, and c is a constant. The AR(p) and MA(q) models can be combined to form an ARMA(p, q) model expressed as:

X_t = \varepsilon_t + c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    [3]

These models assume the underlying stochastic process being observed is time invariant, or stationary. It is common to apply transformations to non-stationary data to obtain a stationary series before modeling. Differencing is one technique for stabilizing the mean; it refers to successive applications of the following transformation:

\nabla X_t = X_t - X_{t-1}    [4]

A backshift operator is commonly defined as:

B X_t = X_{t-1}    [5]

which allows differencing to be expressed as:

\nabla^d X_t = (1 - B)^d X_t    [6]

with d as the number of repeated applications of the differencing transformation. An ARIMA(p, d, q) model is an ARMA(p, q) model with the differencing transformation applied d times. For data that exhibit cycles or seasonality, a form of the backshift operator can also be applied to remove these effects and can be incorporated into a multiplicative seasonal ARIMA (SARIMA) model; see [22] for more details.

The use of different transformations can be explored if the variance changes over time. Common transforms include the logarithm, square root, and reciprocal. Mean-range plots may indicate whether a particular transformation is appropriate. It is also common to plot and examine the sample autocorrelation function (ACF) to check for seasonality in the data and to assist with the selection of model parameters. The sample ACF provides an estimate of the interdependency between data points X_t and X_{t+k}, usually expressed as a function of k, where k is referred to as the lag. Stationary data will have ACF values that decay towards zero quickly as k increases. Figure 4 shows the cumulative values for the spamrelay data and the associated sample ACF values for d = 0 and d = 2. Also shown is a once-differenced series of the log-transformed weekly values and the corresponding ACF plot. We see that there is a high amount of autocorrelation present in the cumulative values, indicating that this data is non-stationary. Differencing the cumulative data (d = 2) appears to stabilize the mean. Applying the log transformation appears to reduce some of the variance. The actual transform was log[x+1] because there are weeks in which no incidents occurred. We explored the use of this transformation for all of the top incident types. Another related function is the partial autocorrelation function (PACF). Patterns in the ACF and PACF plots can sometimes indicate order values to try for the MA and AR parts of the models, respectively. For a discussion of ACF and PACF properties for different MA(q) and AR(p) processes, see [23].

In this study, we use two information-based selection criteria for choosing the model order (p, q). While examining the residuals (also referred to as innovations) of one-step-ahead model predictions can provide an indication of model fit, it is important to avoid overfitting the data to ensure the best forecasts. Akaike [24] described an approach to model selection that attempted to minimize a measure consisting of two parts: a term measuring model fit and a penalty term for the addition of parameters (as the inclusion of more parameters might lead to overfitting).
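To make the transformation and ACF checks concrete, the following sketch applies the log[x+1] transform and differencing to a weekly count series and computes the sample ACF, mirroring the workflow behind Figure 4. It assumes the `weekly` series from the earlier aggregation sketch; the analysis in this paper was carried out in R [28], so the Python/statsmodels code here is only an approximate stand-in.

```python
import numpy as np
from statsmodels.tsa.stattools import acf

# `weekly` is assumed to be the weekly incident-count series from the
# aggregation sketch above (a pandas Series indexed by week).
counts = weekly.to_numpy(dtype=float)

# Cumulative series and its second difference (d = 2), as in Figure 4.
cumulative = np.cumsum(counts)
diff_cum = np.diff(cumulative, n=2)

# log[x+1] transform of the weekly counts, then a single difference (d = 1).
log_weekly = np.log1p(counts)
diff_log = np.diff(log_weekly, n=1)

# Sample ACF values; a slowly decaying ACF suggests a non-stationary series.
for name, series in [("cumulative", cumulative),
                     ("diff cumulative (d=2)", diff_cum),
                     ("diff log weekly (d=1)", diff_log)]:
    print(name, np.round(acf(series, nlags=10, fft=True), 2))
```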


Table 2. ARIMA models and forecasts (AICc based selection)
Columns: Incident Type, Selection Criterion, Forecast RMS (Not Seasonal: Cumulative Data, Log weekly Data; Seasonal: Cumulative Data, Log weekly Data), Seasonal Period (Weeks)
all data AICc 1418.04 2104.02 869.32 52 224.59
msblast AICc 1333.48 53.31 n/a n/a n/a
genbot AICc 182.01 117.50 220.93 26 36.46 53.43 26
klez AICc 66.93 86.61 43.54
bagle AICc 72.62 220.03 17.86 26 6.35 153.43 93.26 52
ircbot AICc 80.58 58.20
agobot AICc 21.59 1.96 1.58 5.46 26
rogueftp AICc 124.22 52.67 767.64 34.60 52
nethicsreq AICc 27.96 137.11 52 41.09 63.47
spamrelay AICc 72.35 55.66 52 111.08 35.25
nachi AICc 4.03 27.43 7.61 5 4.14
Shaded cells indicate lower RMS values compared to BIC selected models

Table 3. ARIMA models and forecasts (BIC based selection)
Columns: Incident Type, Selection Criterion, Forecast RMS (Not Seasonal: Cumulative Data, Log weekly Data; Seasonal: Cumulative Data, Log weekly Data), Seasonal Period (Weeks)
all data BIC ** 273.14 ** ** 52 ** n/a n/a n/a
msblast BIC 1222.72
genbot BIC ** ** 65.37 26 108.99
klez BIC ** ** 430.62 ** 26
bagle BIC ** 8.90 ** ** 26
ircbot BIC 84.76 93.79 52 72.54 88.14
agobot BIC ** ** 26 0.17 0.17 ** ** 52
rogueftp BIC * 48.48
nethicsreq BIC 41.23 ** 65.24 52 26.15
spamrelay BIC ** 120.74 35.44 52 54.27
nachi BIC 20.03 ** ** 5 6.86
** indicates same model selected using AICc, see Table 2 for RMS value
Shaded cells indicate lower RMS values compared to AICc selected models

We use the form presented in [25], identified as AICc. An equation for calculating AICc is given by:

AICc = \ln(RSS / n) + (n + k) / (n - k - 2)    [8]

where RSS is the sum of squared residuals, n is the number of observations, and k is the number of parameters (k = p + q). A second information criterion examined uses a different penalty term based on Bayes factors [26] and is known as the Bayesian information criterion (BIC), but can also be referred to as the Schwarz information criterion (SIC). The equation for calculating BIC is given by:

BIC = \ln(RSS / n) + k \ln(n) / n    [9]

where RSS is the sum of squared residuals, n is the number of observations, and k is the number of parameters (k = p + q). Models were calculated for combinations of p and q values each ranging from 0 to 5, and the models with the lowest AICc and BIC values were identified. (There are methods for evaluating models with AICc or BIC


values within a certain range of closeness to the lowest values; however, these methods can involve some subjective judgments and were not explored.) The residuals of the selected models should be checked to verify that they resemble white noise.
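A minimal sketch of this order-selection procedure is given below, assuming the log-transformed weekly series from the earlier sketches. The AICc and BIC values follow equations [8] and [9], computed from the residual sum of squares; the estimation in this study was done in R [28], so the statsmodels-based loop here is illustrative only.

```python
import warnings
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def aicc(rss, n, k):
    # Equation [8]: AICc = ln(RSS/n) + (n + k)/(n - k - 2)
    return np.log(rss / n) + (n + k) / (n - k - 2)

def bic(rss, n, k):
    # Equation [9]: BIC = ln(RSS/n) + k*ln(n)/n
    return np.log(rss / n) + k * np.log(n) / n

# `log_weekly` is the log[x+1] weekly series from the earlier sketch; d = 1.
y = log_weekly
n = len(y)
best = {"AICc": (np.inf, None), "BIC": (np.inf, None)}

for p in range(6):          # p = 0..5
    for q in range(6):      # q = 0..5
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                fit = ARIMA(y, order=(p, 1, q)).fit()
        except Exception:
            continue        # skip orders that fail to estimate
        rss = float(np.sum(fit.resid ** 2))
        k = p + q
        for name, crit in (("AICc", aicc(rss, n, k)), ("BIC", bic(rss, n, k))):
            if crit < best[name][0]:
                best[name] = (crit, (p, 1, q))

print("lowest AICc order:", best["AICc"][1])
print("lowest BIC order:", best["BIC"][1])
```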

4. Fitting Time Series Models to the Data


We follow a basic modeling process. First, transformations (differencing, log) are applied to the data to stabilize the mean and variance. Next, models are generated for a range of parameters (p, q). From the generated models, models are selected based on the lowest AICc and BIC values. We are interested in developing models for forecasting, so we compare the accuracy of model predictions to unseen incident data. We do this by splitting the data and using the hold-out cross-validation technique [27] for comparing model predictions. Using this method, some of the data is withheld and not used for parameter estimation or model selection. Model forecast values are then compared to the withheld data. Model parameters are estimated using the maximum likelihood method with the R statistical software package [28].

We opted to split the data by time, since we are more interested in examining expected trends over time than in forecasting when a specific number of incidents will have occurred. Data from the first two-thirds of the time interval was used for model order selection and model parameter estimation. The remaining data was used for forecast validation.

In addition, we examine some multiplicative seasonal ARIMA (SARIMA) models. A one-year period was used when possible to explore possible influences of the academic calendar on occurrences of incidents. Since the length of the data interval being modeled must be at least twice the length of the seasonal period to allow for differencing, not all of the incident types could be modeled using a 52-week period after being split into modeling and validation sets. A period of 26 weeks was used for incidents with shorter intervals of observation because it is a factor of 52 and provides some overlapping of spring and fall semester activity. While it seems reasonable to expect that some incident types might be affected by the academic calendar (observed occurrences increase during spring and fall semesters), this is difficult to capture using a single lag offset value due to variations in the lengths of semester breaks. In the case of the nachi incident type, a period of 5 weeks was chosen due to the shorter time interval of the data. A period of 5 weeks may capture some monthly influences if present.
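The sketch below illustrates this hold-out evaluation under the same assumptions as the previous sketches: the first two-thirds of the weekly series is used for fitting, the remainder for validation, and forecasts of the log-transformed weekly counts are converted back to cumulative counts before the root-mean-squared error is computed. The order (2, 1, 1) is only a placeholder for whatever order the AICc/BIC search selects.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# `weekly` is the weekly incident-count series from the aggregation sketch.
counts = weekly.to_numpy(dtype=float)
split = int(len(counts) * 2 / 3)          # first two-thirds for fitting
train, test = counts[:split], counts[split:]

# Fit on log[x+1]-transformed weekly counts (the order shown is a placeholder
# for whatever the AICc/BIC search selects).
fit = ARIMA(np.log1p(train), order=(2, 1, 1)).fit()

# Forecast the withheld weeks, invert the log transform, and accumulate so the
# comparison is made on cumulative counts, as in Tables 2 and 3.
log_forecast = np.asarray(fit.forecast(steps=len(test)))
forecast_cum = np.sum(train) + np.cumsum(np.expm1(log_forecast))
actual_cum = np.sum(train) + np.cumsum(test)

rms = np.sqrt(np.mean((forecast_cum - actual_cum) ** 2))
print("forecast RMS on withheld data:", round(float(rms), 2))
```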

Closer examination of the ACF plots for each incident type might indicate other seasonal periods to try.

Tables 2 and 3 show the models selected for each incident type based on the lowest AICc and BIC values. AICc and BIC are widely used information criteria for model selection, but alternative selection criteria have been proposed, such as the Hannan-Quinn information criterion (HQC) [29]. Model selection was independent of the data withheld for validation. The root-mean-squared (RMS) errors were calculated to compare forecasts from the selected models to the data withheld for validation. Table 2 shows the results of the AICc-selected models using the cumulative values and the log-transformed weekly values, while Table 3 shows the results for the BIC-selected models. The forecasts for the log-transformed weekly value models were converted to cumulative values for comparison purposes. The msblast incident type did not have enough weeks of data to calculate seasonal models.

We can compare how the choice of model selection criterion affects forecasting. In half of the cases (21 out of 42), AICc and BIC select the same model. Of the remaining 21 cases where AICc and BIC select different models, using AICc resulted in picking a better predictive model 11 times, while using BIC resulted in a better predictive model in 10 instances. Simply using AICc to select models therefore did slightly better than using BIC. If overfitting is a concern, another strategy might be to examine the models selected using AICc and BIC and then choose the one of lower order. Although the model orders are not shown in Table 2 or Table 3, this strategy would not have improved upon using only AICc for model selection.

Comparing the non-seasonal models constructed using the cumulative values to the ones using the log-transformed values shows improved forecasting performance using the log values in eight out of eleven incident categories. Overall, the log transform improves forecasts in more cases than it degrades them, and the magnitude of the forecasting improvements also appears to be larger than that of most of the degradations that occur. Including a seasonal element in the models improved forecasts using the cumulative data for three incident types (klez, spamrelay, and ircbot), while for the log-transformed values it improved forecasts for four incident types (genbot, klez, rogueftp, and spamrelay). Other than klez, the incident types benefiting from a seasonal factor are named after an observed function of a compromised machine, and these types are not easily associated with a particular piece of malware or exploited vulnerability.


Incidents in these categories could potentially result from several different underlying causes yet exhibit the same outward behavior. The frequency of these incident types is more likely to be influenced by the number of active machines on the campus network than by the exploitation of a particular vulnerability. Some forms of malware are modular in nature and have multiple variants. These variants can incorporate exploits for new vulnerabilities as they are discovered. Malware of this type often provides some desired functionality to an attacker (such as relaying spam or becoming part of a botnet to be rented out for distributed denial-of-service attacks or phishing schemes). These incident types may persist even as specific vulnerabilities are patched, as long as an economic incentive exists for the desired functionality provided. Besides the academic calendar, another potential source of periodic behavior is the monthly patch cycle used by some vendors. To examine this, the data could be grouped by months and scaled to account for differences in the number of days per month. This could be applied to the complete set of incidents as well as to some individual incident types.
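A minimal sketch of such a monthly grouping is shown below, again assuming the hypothetical `incidents` DataFrame from the first sketch; scaling by the number of days in each month gives a per-day rate that is comparable across months.

```python
import pandas as pd

# `incidents` is the hypothetical DataFrame from the aggregation sketch.
counts = pd.Series(1, index=pd.DatetimeIndex(incidents["detected_on"]))

# Group by calendar month, then scale by the month length to obtain a
# per-day incident rate that can be compared across months.
monthly = counts.resample("MS").sum()
per_day = monthly / monthly.index.days_in_month

print(per_day.head(12))
```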

Table 4. Comparison of best model forecast RMS values for time series and SRG models

Incident Type   TS       SRG
all             224.59   937.44
msblast         53.31    6.14
genbot          36.46    26.89
klez            43.54    15.36
bagle           6.35     8.44
ircbot          58.20    43.00
agobot          0.17     8.90
rogueftp        34.60    64.67
nethicsreq      26.15    22.05
spamrelay       35.25    19.51
nachi           4.03     3.88
Shaded cells indicate lower RMS values

5. Comparing Time Series and Software Reliability Growth Model Forecasts


Software reliability growth (SRG) models are often used to describe and predict failures and faults in software systems. In many systems, reliability is assumed to increase over time as faults are found and fixed and as users become more familiar with the features of a system and can avoid or adjust to different failure states. While security incident data are not the same as software failure data, there are some similarities. A previous study explored the concept of viewing computer security incidents on a network as analogous to software errors in a system and tested the prediction ability of some common SRG models [34]. Growth models have also been applied to modeling SARS epidemic data [16]. Another study (involving several small sets of data from different types of repairable systems) compared forecasts from ARIMA time series models to forecasts from an SRG model and found the time series forecasts to be similar in most cases and slightly better in a few instances [35].

The comparisons provided in this section are based on four different SRG models: the Goel-Okumoto (GO) model [36], the S-Shaped model [37], the K-Stage Curve model [38], and the Duane model [39]. These models were selected because they represent some of the common Non-Homogeneous Poisson Process (NHPP) models in use today.
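As an illustration of how one of these models can be fit to cumulative incident counts, the sketch below estimates the Goel-Okumoto mean value function m(t) = a(1 − e^(−bt)) by nonlinear least squares and forecasts the withheld weeks. It reuses the `train`/`test` split from the hold-out sketch and is an assumed stand-in, not the fitting procedure actually used in this study; the other SRG models could be handled the same way by swapping in their mean value functions.

```python
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    # Goel-Okumoto NHPP mean value function: expected cumulative count at time t.
    return a * (1.0 - np.exp(-b * t))

# Cumulative weekly counts; `train` and `test` come from the hold-out sketch.
cum_train = np.cumsum(train)
t_train = np.arange(1, len(train) + 1)
t_test = np.arange(len(train) + 1, len(train) + len(test) + 1)

# Fit a (total expected incidents) and b (detection rate) on the training window.
(a_hat, b_hat), _ = curve_fit(
    goel_okumoto, t_train, cum_train,
    p0=[cum_train[-1] * 1.5, 0.01],   # rough starting values
    maxfev=10000,
)

# Forecast the cumulative counts for the withheld weeks and compute the RMS error.
forecast_cum = goel_okumoto(t_test, a_hat, b_hat)
actual_cum = cum_train[-1] + np.cumsum(test)
rms = np.sqrt(np.mean((forecast_cum - actual_cum) ** 2))
print("Goel-Okumoto forecast RMS:", round(float(rms), 2))
```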


Table 4 shows a comparison of the best time series model forecasts to the best SRG model forecasts for each incident type. The best SRG model yields better forecasts than the best selected time series model in seven of eleven cases. This result suggests that viewing computer security incident data (when separated into different types) as similar to a class of errors in a software system can be a useful approach for forecasting and control. However, when modeling the total aggregate number of incidents over time (resulting from the combination of all incident types), a time series model may be more appropriate. This is not surprising: even if the factors affecting the propagation and occurrence of each incident type were well understood, the process by which new incident types appear is likely less well understood and better captured by time series models.

The spamrelay incident type actually displays a slight increasing trend over time, and the best fit was a Duane model. This continued increasing trend probably results from economic incentives for compromised machines with the ability to send and relay spam. These incentives drive innovation in developing new techniques for compromising machines for this purpose.

In some cases, applying a single SRG model across a whole interval may not be appropriate. The observed agobot incidents mostly fall into two distinct groupings: a small group at the beginning and a much larger group about a third of the way into the interval. Only a few occurrences are observed after week 40. Agobot is an example of modular malware that can easily be modified to include new exploits. The second grouping of incidents likely results from a different exploited vulnerability than the first group. If so, it would be more appropriate to apply SRG models to each group separately. Our data does not identify different variants of the agobot malware.

We examined the most commonly observed incident types. These incident types likely attracted enough attention for some form of intervention and control measures to be taken to reduce the number of occurrences. As each measure has some cost involved (labor, equipment, inconvenience to users, etc.), being able to gauge the effectiveness of the measures is important. Comparing model forecasts to actual observations before and after implementing a control measure can provide valuable feedback on its effectiveness.
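This monitoring strategy can be sketched as follows: fit a model to the weeks before a control measure is deployed, forecast the following weeks with a prediction interval, and flag where the observed counts leave that band. The change week and model order below are hypothetical placeholders, and the sketch again assumes the `weekly` series from the earlier examples.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# `weekly` is the weekly count series from the aggregation sketch; suppose a
# control measure was deployed at week `change_week` (hypothetical).
counts = weekly.to_numpy(dtype=float)
change_week = 150                      # hypothetical deployment week

# Fit on the pre-change weeks and forecast the post-change period with a
# 95% interval (the order is again just a placeholder).
fit = ARIMA(np.log1p(counts[:change_week]), order=(1, 1, 1)).fit()
pred = fit.get_forecast(steps=len(counts) - change_week)
lower, upper = np.expm1(pred.conf_int(alpha=0.05)).T

observed = counts[change_week:]
below = observed < lower   # fewer incidents than forecast: measure may be working
above = observed > upper   # more incidents than forecast: reassess controls
print("weeks below forecast band:", int(below.sum()),
      "weeks above forecast band:", int(above.sum()))
```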

6. Conclusion
Modeling computer security incidents can be a challenging task. Some incident types may exhibit propagation behavior similar to that of contagious diseases in animals. However, the connective topology influencing transmission can depend on the structure of data communication networks or of social communication networks. Other incident types can result from economic incentives external to an organization or environment. For example, an increase in the number of spamrelay incidents may not be the result of a new variant of a worm or virus, but could reflect an increase in the economic value to external entities of using compromised machines to relay spam.

As malware becomes increasingly modular, variants of some worms and viruses persist. The functional aspects of the malware can remain more or less the same, while the infection mechanism can easily be adapted to exploit newer vulnerabilities as they are discovered and older exploits become less usable due to patching. The persistence of an incident type may be influenced more by the functional utility the malware provides in economic terms [40] than by the prevalence of the exploited vulnerabilities.

Identifying the causes or underlying dynamics affecting an incident type such as nethicsreq may not be practical. Observed changes in frequency could just as likely result from a policy change and/or increased enforcement of existing policies. However, being able to model such incident types can be useful for assessing whether an observed change in occurrences reflects a change in actual occurrences or simply a change in the evaluation and measurement processes. Being able to monitor and assess control measures is still valuable.

By modeling computer security incident data using time series techniques and comparing the forecast results to predictions made by SRG models, we gain some insight into the challenges of understanding the underlying dynamics. SRG models tended to produce better forecasts when applied to individual incident types, while time series models did better at forecasting the overall aggregate number of incidents resulting from the combination of all incident types. This suggests that incorporating, or making some assumptions about, the underlying causal mechanisms contributing to the observed behavior of a specific incident type is more useful than relying only on its time-dependent structure. Also, learning more about what leads to new incident types could help forecasts of the aggregate total number of incidents.

Future work could involve examining the root causes of each incident type as a means to explore whether incorporating more causal information into the models improves forecasts. Incidents with similar root causes and propagation methods could be compared more closely to identify influential aspects. Taking a closer look at such factors may be a way to identify conditions that could result in the spread of new incident types.

Acknowledgments
We thank Gerry Sneeringer, Director of Security at OIT, for providing the incident data set and for providing insightful discussion during the development of this paper. This research has been supported in part by NSF CAREER Award 0237493.

References
[1] P. Bächer, T. Holz, M. Kötter, G. Wicherski, Know Your Enemy: Tracking Botnets, The Honeynet Project and Research Alliance, Mar. 2005; http://www.honeynet.org/papers/bots/.
[2] US-CERT. (2007, Mar.). Quarterly Trends and Analysis Report. [Online]. 2 (1); http://www.us-cert.gov/press_room/trendsandanalysisQ107.pdf.
[3] Reuters, Cybercrime is Getting Organized, Wired, Sept. 2006; http://www.wired.com/techbiz/media/news/2006/09/71793.
[4] Anti-Phishing Working Group, Apr. 2007; http://www.antiphishing.org/.
[5] D. Goodin, California Man Pleads Guilty to Felony Hacking, Breitbart, Jan. 2005; http://www.breitbart.com/article.php?id=D8FALFU05&show_article=1.
[6] M. Braverman, J. Williams, Z. Mador, Microsoft Security Intelligence Report January-June 2006, 2006; http://download.microsoft.com/download/4/b/f/4bff7b73-9250-4004-a07a-81191ddae984/MS_Security_Report_JanJun06.pdf.
[7] R. Richardson, 2007 CSI Computer Crime and Security Survey, Sep. 2007; http://i.cmpnet.com/v2.gocsi.com/pdf/CSISurvey2007.pdf.
[8] J. Zhuge, T. Holz, C. Song, J. Guo, X. Han, W. Zou, Studying Malicious Websites and the Underground Economy on the Chinese Web, Department for Mathematics and Computer Science, University of Mannheim, TR-2007-011, Dec. 2007; http://madoc.bib.uni-mannheim.de/madoc/volltexte/2007/1718/.
[9] K. Campbell, L. A. Gordon, M. P. Loeb, L. Zhou, The Economic Cost of Publicly Announced Information Security Breaches: Empirical Evidence from the Stock Market, Jour. of Computer Security, vol. 11, (3), 2003, pp. 431-448.
[10] A. Acquisti, A. Friedman and R. Telang, Is There a Cost to Privacy Breaches? An Event Study, The Fifth Workshop on the Economics of Information Security (WEIS 2006), University of Cambridge, England, Jun. 26-28, 2006; http://weis2006.econinfosec.org/docs/40.pdf.
[11] H.K. Browne, W.A. Arbaugh, J. McHugh and W.L. Fithen, A Trend Analysis of Exploitations, Proc. Security and Privacy, Oakland, California, USA, May 14-16, 2001, pp. 214-229.
[11] J. Kim, S. Radhakrishnan, S.K. Dhall, Measurement and analysis of worm propagation on Internet network topology, Proc. 13th Int. Conference on Computer Communications and Networks, Oct. 11-13, 2004, pp. 495-500.
[12] C.C. Zou, W. Gong, and D. Towsley, Code Red Worm Propagation Modeling and Analysis, Proc. 9th ACM Conference on Computer and Communications Security, Nov. 2002, pp. 138-147.
[13] D.J. Watts, R. Muhamad, D.C. Medina, and P.S. Dodds, Multiscale, resurgent epidemics in a hierarchical metapopulation model, Proc. National Academy of Sciences USA, vol. 102, (32), Aug. 2005, pp. 11157-11162.
[14] R. Allard, Use of Time-series Analysis in Infectious Disease Surveillance, Bulletin of the World Health Organization, vol. 76, (4), 1998, pp. 327-333.
[15] H. Trottier, P. Philippe and R. Roy, Stochastic Modeling of Empirical Time Series of Childhood Infectious Diseases Data before and After Mass Vaccination, Emerging Themes in Epidemiology, vol. 3, (9), Aug. 2006; http://www.ete-online.com/content/3/1/9.
[16] D. Lai, Monitoring the SARS Epidemic in China: A Time Series Analysis, Journal of Data Science, vol. 3, 2005, pp. 279-293.
[17] B. Han, T.Y. Leong, We Did the Right Thing: An Intervention Analysis Approach to Modeling Intervened SARS Propagation in Singapore, Proc. of World Congress on Medical Informatics (MEDINFO), 2004, pp. 1246-1250.
[18] U. Helfenstein, Box-Jenkins Modeling of Some Viral Infectious Diseases, Statistics in Medicine, vol. 5, 1986, pp. 37-47.
[19] Snort; http://www.snort.org.
[20] R. Pindyck and D. Rubinfeld, Econometric Models and Economic Forecasts, New York: McGraw-Hill Book Company, 1981.

[21] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970.
[22] P.J. Brockwell and R.A. Davis, Time Series: Theory and Methods, New York: Springer-Verlag, 1991.
[23] R. Shumway and D. Stoffer, Time Series Analysis and Its Applications, New York: Springer-Verlag, 2000.
[24] H. Akaike, Information Theory and the Extension of the Maximum Likelihood Principle, in Second International Symposium on Information Theory, B.N. Petrov and F. Csaki, eds., Budapest: Akademiai Kiado, 1973, pp. 267-281.
[25] C. Hurvich and C. Tsai, Regression and Time Series Model Selection in Small Samples, Biometrika, vol. 76, (2), Jun. 1989, pp. 297-307.
[26] G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol. 6, (2), Mar. 1978, pp. 461-464.
[27] R. Picard, R. Cook, Cross-validation of Regression Models, Journal of the American Statistical Association, vol. 79, (387), Sep. 1984, pp. 575-583.
[28] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2008, ISBN 3-900051-07-0; http://www.R-project.org.
[29] E.J. Hannan and B.G. Quinn, The Determination of the Order of an Autoregression, Journal of the Royal Statistical Society, Series B, vol. 41, (2), 1979, pp. 190-195.
[30] F-Secure Computer Virus Information Pages: Klez; http://www.f-secure.com/v-descs/klez.shtml.
[31] F-Secure Computer Virus Information Pages: Bagle; http://www.f-secure.com/v-descs/bagle.shtml.
[32] F-Secure Computer Virus Information Pages: Lovsan; http://www.f-secure.com/v-descs/msblast.shtml.
[33] F-Secure Computer Virus Information Pages: Welchi; http://www.f-secure.com/v-descs/welchi.shtml.
[34] E. Condon, M. Cukier, and T. He, Applying Software Reliability Models on Security Incidents, Proc. 18th IEEE Intl. Symposium on Software Reliability Engineering, Trollhättan, Sweden, Nov. 5-9, 2007, pp. 159-168.
[35] M. Xie and S. Ho, Analysis of Repairable System Failure Data Using Time Series Models, Journal of Quality in Maintenance Engineering, vol. 5, (1), 1999, pp. 50-61.
[36] A. L. Goel and K. Okumoto, Time-Dependent Error-Detection Rate Model for Software Reliability and Other Performance Measures, IEEE Trans. on Reliability, vol. R-28, (3), Aug. 1979, pp. 206-211.
[37] S. Yamada, M. Ohba and S. Osaki, S-shaped Software Reliability Growth Models and Their Application, IEEE Trans. on Reliability, vol. R-33, (4), Oct. 1984, pp. 289-291.
[38] T.M. Khoshgoftaar, Nonhomogeneous Poisson Processes for Software Reliability Growth, Proc. 8th Symposium in Computational Statistics, Copenhagen, Denmark, Aug. 1988, pp. 11-12.
[39] J. T. Duane, Learning Curve Approach to Reliability Monitoring, IEEE Trans. on Aerospace, vol. AS-2, 1964, pp. 563-566.
[40] J. Franklin, V. Paxson, A. Perrig, and S. Savage, An Inquiry into the Nature and Causes of the Wealth of Internet Miscreants, Proc. ACM Conf. on Computer and Communications Security, Oct. 2007.

