4.1 Introduction
In Chapters 1 and 2 we described the need for an epidemiological approach to the investigation of disease problems. We also implied that such investigations usually have the basic objective of describing and quantifying disease problems and of examining associations between determinants and disease. With these objectives in mind, epidemiological investigations are normally conducted in a series of stages, which can be broadly classified as follows:
1. A diagnostic phase, in which the presence of the disease is confirmed.
2. A descriptive phase, which describes the populations at risk and the distribution of the disease, both in time and space, within these populations. This may then allow a series of hypotheses to be formed about the likely determinants of the disease and the effects of these on the frequency with which the disease occurs in the populations at risk.
3. An investigative phase, which normally involves the implementation of a series of field studies designed to test these hypotheses.
4. An experimental phase, in which experiments are performed under controlled conditions to test these hypotheses in more detail, should the results of phase 3 prove promising.
5. An analytical phase, in which the results produced by the above investigations are analysed. This is often combined with attempts to model the epidemiology of the disease using the information generated. Such a process often enables the epidemiologist to determine whether any vital bits of information about the disease process are missing.
6. An intervention phase, in which appropriate methods for the control of the disease are examined either under experimental conditions or in the field. Interventions in the disease process are effected by manipulating existing determinants or introducing new ones.
7. A decision-making phase, in which a knowledge of the epidemiology of the disease is used to explore the various options available for its control. This often involves the modelling of the effects that these different options are likely to have on the incidence of the disease. These models can be combined with other models that examine the costs of the various control measures and compare them with the benefits, in terms of increased productivity, that these measures are likely to produce. The optimum control strategy can then be selected as a result of the expected decrease in disease incidence in the populations of livestock at risk.
8. A monitoring phase, which takes place during the implementation of the control measures to ensure that these measures are being properly applied, are having the desired effect on reducing disease incidence, and that developments that are likely to jeopardise the success of the control programme are quickly detected.
The following two sections are concerned with describing ways in which epidemiological investigations can be designed and implemented, and the data produced analysed.
Prospective studies, which look forward over a period of time and normally attempt to examine associations between determinants and the frequency of occurrence of a disease by comparing attack rates or incidences of disease in groups of individuals in which the determinant is either present or absent, or its frequency of occurrence varies.
Retrospective studies, which look backward over a period of time and normally attempt to compare the frequency of occurrence of a determinant in groups of diseased and non-diseased individuals.
Cross-sectional studies, which attempt to examine and compare estimates of disease prevalence between various populations and subsets of populations at a particular point in time.
Frequently, however, these approaches may be combined in a general study of a disease problem. In such studies, other morbidity and mortality rates may be compared as well as other variables such as weight gain, milk yield etc. depending on the objectives of the particular study.
distribution of the determinant that is to be studied. The individual animals selected for the study are assigned to groups or cohorts. (For this reason, prospective studies are often called cohort studies). The determinant to be studied is then introduced into one cohort and the other cohort is kept free of the determinant as a control. The two cohorts are observed over a period of time and the frequencies with which disease occurs in them are noted and compared.
Often, however, the investigator has no control over the distribution of the determinant being studied. In such a case he will select the individuals that have been or are exposed to the determinant concerned, while another group of individuals that do not have, or have not been exposed to, that determinant is used as a control. The frequency of occurrence of the disease in the different groups is then observed over a period of time and compared.
In prospective studies, the cohorts being compared should consist, ideally, of animals of the same age, breed and sex and should be drawn from within the same herds or flocks, since there may be many differences in the way that different herds or flocks are kept and managed, which may be expected to have an effect on the frequency of occurrence of the disease being investigated. If such cohorts can be selected, prospective studies can demonstrate accurately the association between determinants and disease, since the cohorts will differ from each other merely in the presence or absence of the particular determinant being studied. This will only be possible if the investigator has control over the distribution of the determinant being selected. Even then, such conditions are often very difficult to fulfil in the field, where the investigator is dependent on the cooperation of livestock owners who may be unwilling to alter their management systems to fit in with the study design.
If the investigator has no control over the distribution of the determinant being studied, the study design becomes more complicated and the investigation may have to be repeated to take into account the variations in the many different factors involved.
Prospective studies have the disadvantage that if the incidence of the disease is low, or the difference one wishes to demonstrate between groups is small, the size of the study groups has to be large. (Methods for analysing the results of prospective studies and for estimating the size of cohorts needed are described in Chapter 5). The problem of low disease incidence can sometimes be overcome by artificially challenging the different cohort groups with the disease in question. However, this may not be acceptable under field conditions, since livestock owners take grave exception to having their animals artificially infected! For these reasons, prospective studies are normally performed on diseases of high incidence and where the expected difference in disease frequencies between the groups studied is likely to be large.
Retrospective studies have various advantages and disadvantages when compared with prospective studies. The principal advantage of retrospective studies is that they make use of data that have already been collected and can, therefore, be performed quickly and cheaply. In addition, because diseased individuals have already been identified, retrospective studies are particularly useful in investigating diseases of low incidence.
The main disadvantage is that the investigator has no control over how the original data were collected, unless he or she collected them. If the data are old, it may not be possible to contact the individuals who had collected them, and thus there is often no way of knowing whether the data are biased or incomplete (see also Section 4.7 on some other disadvantages in using already generated data in epidemiological work).
The second major disadvantage is that although one knows the frequency of occurrence of the determinant in the case group, one does not know its frequency of occurrence in non-diseased individuals from the same population. The latter is normally determined by sampling from a population of non-diseased individuals at the time that the study is being carried out. There is no way of knowing the extent of the similarity between the two different populations from which the case and control groups are taken. Consequently, there is no way of ascertaining the distribution within these populations of undetermined factors which could affect the frequency of the disease. Great caution has to be exercised, therefore, in making inferences about associations between determinants and disease frequencies from retrospective studies.
A third disadvantage is that historical data on cases of disease that are sufficiently accurate to merit further study are hard to come by in veterinary medicine. The opportunities for doing case-control studies are thus rather limited. They are much more common in human medical studies.
In spite of the fact that classic case-control studies are rarely performed in veterinary epidemiology, retrospective data are often used in livestock disease studies. The advantages and disadvantages of using such data are discussed later on in this chapter. Methods for analysing case-control study data and for calculating the sizes of case and control study groups are described in the following chapter.
Unfortunately, in most instances the populations studied are large and censuses become difficult and expensive to undertake. A further drawback with censuses in large populations is that, because of the practical constraints of staff and facilities, each individual unit within a population can be allocated only a limited amount of time and effort. Consequently, the amount of data that can be obtained from each unit sampled is limited.
Sample surveys
Sample surveys have the advantage of being cheaper and easier to perform than censuses. Because the population is being sampled, the actual number of units being measured is relatively small, and as a result more time and effort can be devoted to each unit. This enables a considerable amount of data to be collected on each sample unit. The question is, how closely do the results of the survey correspond to the real situation in the population being sampled? If undertaken properly, sample surveys can generate reliable information at a reasonable cost; if they are performed improperly, the results may be very misleading. This is also true of censuses.
Random sampling removes bias in the selection of the sample and thereby removes one of the main sources of error in epidemiological studies. The first step in random sampling is to construct a list of all the individual sample units in the population being sampled. This is known as the sample frame. Each unit in the sample frame can then be assigned an identification number, which is normally the numerical order in which the units appear in the sample frame. Random numbers can be generated by a computer program or read from a table of the output of such a program. (A random number table is given in Appendix 1). As each number is produced, the unit to be sampled can be identified from the sample frame. Random numbers are selected from a random number table by starting anywhere in the table and then reading either horizontally across the rows or vertically down the columns.
Example: Suppose we are interested in detecting the presence of brucellosis in a dairy herd of 349 cows. We decide that, for our purposes, we wish to be 90% sure of detecting the disease and we estimate, although we do not know, that the prevalence of brucellosis in the herd is not likely to be less than 8% (see Section 4.4 on estimating sample sizes). From Table 10 we see that in order to be 90% sure of detecting the disease at this level of prevalence in a herd of 349 cows, we need a random sample of 27 animals. The animals in the herd are not tagged, but the herdsman is able to identify each animal by name. We can, therefore, construct a sample frame of the animals in the herd by listing their names. If, for any reason, two or more animals had the same name, we could further identify them by a number (e.g. Daisy 1, Daisy 2 etc). A similar procedure can sometimes be used to establish the identity of certain unnamed animals in a herd by identifying them as the first calf of Emma, the second calf of Flora etc.
To select the animals to be sampled we could simply write the name of each animal in the herd on a piece of paper, place the name cards in a hat and then draw out 27 cards. Alternatively, we could use a random number generator or table to produce a set of three-digit numbers. Rejecting all numbers greater than 349, we continue until we have 27 three-digit numbers. A series of such numbers might for instance read 001, 088, 045, 008, 016, 344 etc. We would then select the first, the eighty-eighth, the forty-fifth, the eighth, the sixteenth, the three-hundred-and-forty-fourth etc animal from the sample frame. Since we now know the names of the animals to be sampled, we can identify them in the herd and include them in the sample. As a simple alternative, we could run the herd through a chute and select the animals as they come through, taking the first, eighth, sixteenth, forty-fifth etc animal for the sample.
Note that if the population to be sampled was between 10 and 99, we would use two-digit numbers to select the sample; if it was between 100 and 999, three-digit numbers would be used; for populations between 1000 and 9999, and between 10 000 and 99 999, four-digit and five-digit numbers, respectively, would be selected. Any number in these categories greater than the size of the population being sampled is rejected. If during the sampling procedure the same unit is selected a second time, the number that led to that selection is also rejected.
If we were selecting animals from the same herd for the purposes of a prospective study, we could use random numbers to identify them in the sample frame and then assign each animal in turn to the appropriate group. Thus, in the above example, if we wanted to select three groups from the herd, the first cow on the list would be assigned to group 1, the eighty-eighth cow on the list to group 2, the forty-fifth cow on the list to group 3, the eighth cow to group 1, the sixteenth cow to group 2, the three-hundred-and-forty-fourth cow to group 3 and so on.
There are many ways of selecting random samples, but the principles are substantially the same as those outlined above. Apart from removing bias in the selection of the sample, random sampling has other advantages, the main one being that we can easily calculate an estimate of the error for the values of a population parameter estimated by a random sample. This is done by the use of a statistic known as the standard error (see Section 4.4). Having calculated the error, we can adjust the size of the sample according to how precise we require our sample estimate to be. It is possible to calculate estimates of errors in other forms of sampling, but the calculations involved are more complex. For this reason, random sampling is normally the method of choice when circumstances permit.
The main disadvantage of random sampling is that it cannot be attempted if the size of the population is not known. In most instances, a sample frame must be constructed before sampling can begin. This sample frame must contain all the sample units in the population, and the sample units must be identifiable by some means or other in the population which is being sampled. Sample frames are notoriously difficult to construct: certain sample units may occur in the frame more than once, thus increasing their chance of selection, or certain sectors of the population to be sampled may be omitted. Moreover in Africa, where records of individually identifiable animals are seldom available, sample frames of individual animal units can rarely be constructed.
For this reason, simple random sampling based on individual animals as sample units is rarely attempted in Africa. Furthermore, random sampling is impossible where the type of unit being sampled does not permit the population size to be determined beforehand. If, for instance, events such as births or deaths are being sampled, there is simply no way of knowing with absolute precision how many births or deaths there will be in a population over the study period.
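The selection procedure in the brucellosis example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `draw_sample` and `assign_in_turn` are hypothetical, and Python's pseudo-random generator stands in for the random number table. Numbers are drawn with as many digits as the population size, and any number that is zero, larger than the population, or already selected is rejected.

```python
import random


def draw_sample(population_size, sample_size, rng):
    """Pick distinct positions (1-based) in the sample frame by rejection.

    Hypothetical helper: mimics reading fixed-width numbers from a random
    number table and rejecting those that cannot be used.
    """
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    digits = len(str(population_size))      # e.g. 349 -> three-digit numbers
    largest = 10 ** digits - 1              # e.g. 999
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randint(0, largest)
        if number < 1 or number > population_size:
            continue                        # outside the sample frame: reject
        if number in seen:
            continue                        # unit already selected: reject
        seen.add(number)
        chosen.append(number)
    return chosen


def assign_in_turn(positions, n_groups):
    """Assign the selected units to groups 1..n_groups in rotation."""
    groups = {g: [] for g in range(1, n_groups + 1)}
    for index, position in enumerate(positions):
        groups[index % n_groups + 1].append(position)
    return groups


if __name__ == "__main__":
    rng = random.Random(2024)               # fixed seed so the run is repeatable
    print(draw_sample(349, 27, rng))        # 27 positions in the list of cows
    print(assign_in_turn([1, 88, 45, 8, 16, 344], 3))
```

Applied to the series 001, 088, 045, 008, 016, 344, `assign_in_turn` places cows 1 and 8 in group 1, cows 88 and 16 in group 2, and cows 45 and 344 in group 3, as in the text.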
sampling is often for travel, the advantages of sampling all the animals in the herd, village or farm during one visit are obvious. For this reason, cluster sampling is often the method of choice in epidemiological studies in Africa.
An alternative method of cluster sampling is to define the target population as all the livestock of a particular type within a region demarcated by well defined geographical boundaries. An areal sampling method is then used whereby the region is divided into small units, with all the animals in each unit being defined as a single cluster. The advantage of this procedure is that the investigator knows how many areal units there are in total, since he has defined them, and this in turn enables him to construct a sample frame easily. The disadvantage is that it may be difficult to find all the animals in a given small area, or even to be sure to which areal unit a particular animal belongs.
Cluster sampling has some advantages and disadvantages when compared with simple random sampling. These are discussed in detail in the next chapter but it may be useful to include a brief summary here. The first advantage of cluster sampling is a saving in travel costs. Much less travelling is involved in sampling animals on a cluster basis than if animals are selected at random from a target population. Provided that the complete collection of animals in each cluster is included in the sample, it is not too difficult to calculate an estimate of the variable being investigated and the corresponding standard error. (It is not very difficult even if only a subset is used). However, since the variation in disease prevalence is likely to be greater between clusters than within clusters, examining animals within clusters will give less information than examining animals from different clusters. This is particularly so in the case of infectious diseases.
The more infectious the disease, the more likely it is that in any particular cluster of animals either none or most of the animals will be infected. Because of this, cluster sampling will almost always increase the standard error, sometimes very considerably, and hence the uncertainty involved in the estimation of the particular variable being considered. One implication of this is that the minimum number of cases required for a reliable estimate of disease prevalence or incidence in the target population as a whole will be several times larger than that required in simple random sampling. The sample size in a cluster sample has to be correspondingly larger, therefore, to produce an estimate of the same reliability. If, as a result, the procedures for measuring a particular variable become time consuming and/or costly, the time and money spent may outweigh the benefits of reduced travel costs and increased administrative convenience gained by cluster sampling.
The main advantage of systematic sampling is that it is easier to do than random sampling, particularly if the sample frame is large. It also enables sampling a population whose exact size is not known. This is impossible in random sampling. Thus systematic sampling is used to sample such events as births or deaths, whose total number cannot be known before the study begins, or livestock populations at abattoirs or dips where, again, the population size may not be determinable at the outset.
The main disadvantage of systematic sampling is that if the sample units are distributed in the sample frame or in the population periodically, and this periodicity coincides with the sampling interval, the sample estimate may be very misleading. Estimating the standard error is thus more difficult and depends on making the assumption that there is no periodicity in the data.
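A minimal sketch of systematic sampling follows, assuming the usual definition (every k-th unit is taken after a random start within the first interval); the function name `systematic_sample` and its parameters are hypothetical. Because the units are consumed one at a time, the total number of units does not need to be known in advance, which is the property described above.

```python
import random
from itertools import islice


def systematic_sample(units, interval, start=None, rng=random):
    """Take every `interval`-th unit from an iterable of units.

    `start` is the 1-based position of the first unit taken; if omitted,
    it is chosen at random between 1 and `interval`.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if start is None:
        start = rng.randint(1, interval)
    return list(islice(units, start - 1, None, interval))


if __name__ == "__main__":
    # Animals passing through a dip, numbered in order of arrival.
    arrivals = iter(range(1, 101))
    print(systematic_sample(arrivals, interval=10, start=3))

    # Periodicity problem: with daily records and an interval of 7,
    # every sampled day falls on the same day of the week.
    days = range(1, 29)
    print(systematic_sample(days, interval=7, start=1))
```

The second call illustrates the periodicity problem: if the variable of interest follows a weekly cycle, a sampling interval of seven days observes only one point of that cycle.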
4.3.5 Stratification
This involves treating the population to be sampled as a series of defined sub-populations or strata. Suppose, for example, that we wished to sample a population of 4000 goat flocks in order to estimate the prevalence of a particular disease in an area, and that this population consisted of 200 large-sized flocks containing 51 animals or more; 800 medium-sized flocks containing between 20 and 50 animals; and 3000 small-sized flocks containing 19 animals or less. If we took a 1% random sample of all flocks, we might find that this would give us a sample consisting of, say, 1 large flock, 9 medium-sized flocks and 30 small flocks.
Suppose, however, that one of the determinants we were interested in was the influence of flock size on the prevalence of the disease. We would obviously want to know more about the larger flocks than our present system of sampling would tell us. We could, therefore, divide the population to be sampled into strata according to flock size, and sample each stratum in turn. We could also take larger samples from those strata that we are particularly interested in and smaller from those that we are not. For example, we might decide to take a 5% random sample from the large-flock stratum, a 2% sample from the medium-flock stratum and a 0.5% sample from the small-flock stratum. This might give us 10 large flocks, 16 medium flocks and 15 small flocks. Note that the actual sample size has increased from 40 to 41 only, although if we were cluster sampling more animals would be involved.
This technique is known as stratification with a variable sampling fraction, and its usefulness lies in that it allows us to concentrate the facilities at our disposal on those sections of the population that are of particular interest to us. Many different systems of stratification are possible, depending on the purpose of the study being undertaken. Common variables for stratification include area, production system, herd size, age, breed and sex.
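The goat-flock example can be reproduced with a short sketch. The stratum sizes and sampling fractions are those given in the text; the dictionary layout, the function names and the numbering of flocks from 1 upward within each stratum are assumptions made for illustration only.

```python
import random

# Stratum name -> (number of flocks in the stratum, sampling fraction in %)
STRATA = {
    "large": (200, 5.0),
    "medium": (800, 2.0),
    "small": (3000, 0.5),
}


def stratum_sample_sizes(strata):
    """Number of flocks to draw from each stratum."""
    return {name: round(size * pct / 100) for name, (size, pct) in strata.items()}


def draw_stratified_sample(strata, rng):
    """Simple random sample of flock numbers within each stratum."""
    sample = {}
    for name, (size, pct) in strata.items():
        k = round(size * pct / 100)
        # Flocks are assumed to be numbered 1..size within the stratum.
        sample[name] = sorted(rng.sample(range(1, size + 1), k))
    return sample


if __name__ == "__main__":
    sizes = stratum_sample_sizes(STRATA)
    print(sizes, "total:", sum(sizes.values()))
    print(draw_stratified_sample(STRATA, random.Random(5))["large"])
```

The total of 41 flocks is almost the same as the 40 flocks of the 1% overall sample, but 10 of them rather than about 1 now come from the large-flock stratum.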
sometimes the same cluster will appear more than once in the sample, though this will happen rarely if the total number of clusters is large compared to the sample being selected. (The interested reader should consult Chapters 9 and 10 in Cochran (1977) for further details).
There are many variations and combinations of sampling possible even within one particular study. Detailed descriptions of all the possible permutations involved are beyond the scope of this manual, and the ensuing discussions in this and the next chapter will focus on simple random and cluster sampling.
carry out the survey and obtain an estimated prevalence p. If we repeated the whole survey a second time using the same sampling method and the same sample size, we would get a different estimate p of the prevalence P. If it were possible to go on repeating the survey many times with the same sample size, we would get a whole series of estimates from which we could draw a histogram. This would resemble Figure 7 if n, the sample size, was large.
Figure 7. Distribution of different estimates of disease prevalence in a large-sized sample.
It can be shown that the average of all the estimates p1, p2 etc will be almost exactly the true prevalence P, and that 68% of the estimates will differ from the true value by less than the quantity called the standard error of the estimated prevalence (SE), where:

$$SE = \sqrt{\frac{P\,Q}{n}}$$

P = true prevalence (%), Q = 100 - P, and n = size of the sample. Similarly, 95% of the estimates would differ from the true value by less than twice the standard error, and 99% of the estimates would be within three standard errors of the true value.
This suggests a method for stating how precise we would like the results to be. We might, for example, say that we would like to be 95% sure of being within 1% of the correct, true prevalence P (%). This implies that we want twice the standard error to be no greater than 1%, or that the standard error should not be greater than 0.5%. This means that it is always possible to fix a given accuracy level by choosing the sample size so that the standard error of the estimate is controlled.
Requirements for precision can be stated in terms of absolute or relative accuracy. If we talk in terms of absolute accuracy we might say that "we want the error in the prevalence estimate to be no more than 1%" i.e. p = P ± 1%. For example, if the true prevalence is 3%, we will be requiring an estimate that lies in the range of 2 to 4%. If the true prevalence is 20%, we require the estimated value to fall between 19 and 21%.
If we want to state our requirements in terms of relative accuracy, we might require that the estimated value lie within 10% of the true value. For example, if the true prevalence is 20%, this would mean obtaining an estimate in the range of 18 to 22%, since 2 is 10% of 20. If the true value was 5%, we would be demanding an estimate between 4.5 and 5.5%, since 0.5 is 10% of 5. In principle, there is nothing wrong in stating accuracy requirements in this way, but high relative accuracy will not be possible when true prevalence is low (see Table 9).
Table 8 shows the sample sizes required for estimating prevalences at different levels of absolute accuracy from large populations. Note that no sample size is given unless the standard error is smaller than the true prevalence. The entries have been calculated using the formula:

$$n = \frac{P(100 - P)}{SE^2}$$

If the sample size is a large proportion of the population, say greater than 10%, then it is better to use the more exact formula:

$$n = \frac{P(100 - P)}{SE^2 + P(100 - P)/N}$$

where N is the total size of the population.
Table 8. Sample size (n) for controlling the standard error (SE) of estimated prevalence for different values of the true prevalence (P) in large populations.
            SE (%)
P (%)       0.1      0.5     1.0     1.5     2.0     2.5
 0.5       4975        -       -       -       -       -
 1.0       9900      396       -       -       -       -
 1.5      14775      591     148       -       -       -
 2.0      19600      784     196      87       -       -
 2.5      24375      975     244     108      61       -
 3.0      29100     1164     291     129      73      47
 3.5      33775     1351     338     150      84      54
 4.0      38400     1536     384     171      96      61
 4.5      42975     1719     430     191     107      69
 5.0      47500     1900     475     211     119      76
 6.0      56400     2256     564     251     141      90
 7.0      65100     2604     651     289     162     104
 8.0      73600     2944     736     327     184     118
 9.0      81900     3276     819     364     205     131
10.0      90000     3600     900     400     225     144
20.0     160000     6400    1600     711     400     256
30.0     210000     8400    2100     933     525     336
40.0     240000     9600    2400    1067     600     384
50.0     250000    10000    2500    1111     625     400
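As an illustration, the relationship between standard error and sample size can be written out in Python. This is a sketch rather than part of the manual: the function names are hypothetical, prevalence and SE are both expressed in percent as in the text, and the finite-population version assumes the standard correction for sampling a sizeable fraction of a population of size N.

```python
import math


def standard_error(p, n):
    """SE (in percentage points) of a prevalence estimate from n animals."""
    return math.sqrt(p * (100 - p) / n)


def sample_size_absolute(p, se):
    """Sample size giving standard error `se` when true prevalence is `p`.

    Large-population formula: n = P(100 - P) / SE^2.
    """
    return p * (100 - p) / se ** 2


def sample_size_finite(p, se, population):
    """Same, with an assumed standard finite-population correction."""
    pq = p * (100 - p)
    return pq / (se ** 2 + pq / population)


if __name__ == "__main__":
    for prevalence in (2, 8, 20, 50):
        n = sample_size_absolute(prevalence, 0.5)
        print(f"P = {prevalence}%  SE = 0.5%  n = {round(n)}")
    # The correction matters when n is a large share of the population.
    print(round(sample_size_finite(20, 0.5, 10_000)))
```

For a prevalence of 2% and a standard error of 0.5% the large-population function returns 784, the corresponding entry in Table 8; the corrected figure for a finite population is always smaller.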
Example 1: Suppose we wish to be 95% sure that a survey will give an estimated prevalence within 1% of the true value in absolute terms. Two standard errors will then be less than 1% i.e. 2 SE ≤ 1% or SE ≤ 0.5%. Table 8 gives the sample sizes required for different prevalence rates and standard errors. However, since the sample size we are looking for will depend on true prevalence, whose value we do not know, that being the reason for the survey, this does not seem to help much. It will be rare, however, to have absolutely no idea what value of the true prevalence to expect. We will usually be able to make an estimate and say, for example, that "we believe the prevalence is not greater than 8%". If we then choose the sample size, it might turn out to be much too big, since the correct sample size to measure a prevalence of, say, around 2% to the desired accuracy is 784, while the sample size corresponding to a prevalence of around 8% is 2944. However, there is nothing much we can do about this. Lack of prior knowledge will always result in a need for liberal (i.e. overlarge) sample sizes and hence higher costs. If we do not have the slightest idea what prevalence to expect, we can use the sample size corresponding to the least favourable case (P = 50%) given in Table 8, though if we are demanding a high degree of accuracy the indicated sample size (10 000) may be unrealistically large.
Example 2: We might suspect that the true prevalence is of the order of 20% and would like to be 99% sure that the estimated prevalence is within 2% of the true value. We can be 99% certain that the true value lies within three standard errors of the estimate. Hence, to fulfil the required conditions we must choose the sample size in such a way that 3 SE ≤ 2% or SE ≤ 2/3 = 0.7% approximately. From Table 8 we see that for SE = 0.5% and P = 20%, we need a sample of 6400. For SE = 0.7%, it seems, we will need around 4000.
(In fact the exact sample size as calculated from the formula n = P(100 - P)/SE² is only 3265).
Table 9 gives sample sizes required to estimate prevalence in a large population when the desired precision is stated in terms of relative accuracy. In this case the sample sizes are such as to ensure that the standard error will not be greater than the stated percentage of the true prevalence. The entries in the table have been calculated using the formula:

$$n = \frac{10\,000\,(100 - P)}{P \times SE^2}$$

where SE is here expressed as a percentage of P. If the sample size required represents a very high proportion of, or is greater than, the sampled population itself, the more accurate formula

$$n = \frac{10\,000\,(100 - P)}{P \times SE^2 + 10\,000\,(100 - P)/N}$$

should be used to calculate the sample size. (N is the size of the population being sampled).
Table 9. Sample size (n) to control the standard error (SE) of estimated prevalence relative to the true value of the prevalence.
            SE as a percentage of P
P (%)           1.0         5.0        10.0
 0.5      1 990 000      79 600      19 900
 1.0        990 000      39 600       9 900
 1.5        656 667      26 267       6 567
 2.0        490 000      19 600       4 900
 2.5        390 000      15 600       3 900
 3.0        323 333      12 933       3 233
 3.5        275 714      11 029       2 757
 4.0        240 000       9 600       2 400
 4.5        212 222       8 489       2 122
 5.0        190 000       7 600       1 900
 6.0        156 667       6 267       1 567
 7.0        132 857       5 314       1 329
 8.0        115 000       4 600       1 150
 9.0        101 111       4 044       1 011
10.0         90 000       3 600         900
20.0         40 000       1 600         400
30.0         23 333         933         233
40.0         15 000         600         150
50.0         10 000         400         100
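The entries in Table 9 follow from the absolute-accuracy formula once the relative standard error is converted to percentage points. The sketch below makes that step explicit; the function name `sample_size_relative` is hypothetical and the large-population formula n = P(100 - P)/SE² is assumed.

```python
def sample_size_relative(p, rel_se_pct):
    """Sample size so that the SE is `rel_se_pct` percent of prevalence `p`.

    Both arguments are percentages, e.g. p=5 and rel_se_pct=10 asks for a
    standard error of 0.5 percentage points at a true prevalence of 5%.
    """
    se = p * rel_se_pct / 100          # relative SE -> percentage points
    return p * (100 - p) / se ** 2     # large-population formula


if __name__ == "__main__":
    for prevalence in (0.5, 5, 20, 50):
        sizes = [round(sample_size_relative(prevalence, r)) for r in (1, 5, 10)]
        print(f"P = {prevalence}%: {sizes}")
```

Running the loop shows how quickly the required sample grows as prevalence falls: a standard error of 10% of P needs 100 animals at a prevalence of 50% but 19 900 at a prevalence of 0.5%.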
The sample sizes calculated in the two different exercises were obtained assuming that the sample was to be chosen by simple random sampling i.e. that animals were sampled individually. If we use a different sampling method, these sample sizes will no longer be appropriate. For example in cluster sampling, which increases the variability of any estimates made, we should assume that, to be on the safe side, we will need to examine four times as many animals as for a simple random sample.
If we require an accurate estimate of prevalence not only for the complete population but also within well defined subgroups, as in a stratified survey, we need to choose the sample size sufficiently large within each subgroup. Suppose, for instance, that the population is distributed in six regions. Then, in our first example, if we require to estimate a true prevalence of 2% with an SE of 0.5% for each region, we would need a sample size of 784 in each region, assuming that we take simple random samples within the regions.
Again the answer will depend on the true, but unknown, value of the prevalence of the disease in the target population. For small populations, e.g. individual herds, the answer will depend on the size of the population (Table 10). For populations of over 10 000, the sample sizes in the last column of the table will be approximately correct. The values in Table 10 were calculated from the formula:

$$\text{Probability of detection} = 1 - \frac{N-M}{N} \times \frac{N-M-1}{N-1} \times \cdots \times \frac{N-M-n+1}{N-n+1}$$

where: N = size of population, M = total number of infected animals, and n = sample size.
Where the indicated prevalence did not correspond to a whole number of animals, the value was rounded up to the next whole number (e.g. 3% of 75 = 2.25 animals; this was rounded up to 3). The sample sizes indicated in Table 10 are appropriate only for simple random sampling and would be much larger if cluster sampling was used.
The determination of sample sizes required to estimate continuous variables is discussed in Section 5.3.2.
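The detection formula lends itself to direct computation. The following is a minimal sketch, with hypothetical function names, that evaluates the probability of finding at least one infected animal in a simple random sample and then searches for the smallest sample reaching a target probability. Infected counts are rounded up to whole animals as described above.

```python
import math


def infected_count(population, prevalence_pct):
    """Number of infected animals, rounded up to a whole animal."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(population * prevalence_pct / 100, 9))


def detection_probability(population, infected, sample_size):
    """Probability that a simple random sample holds >= 1 infected animal."""
    if sample_size > population - infected:
        return 1.0                       # sample must include an infected animal
    miss = 1.0
    for i in range(sample_size):
        # Chance that the (i+1)-th animal drawn is also uninfected.
        miss *= (population - infected - i) / (population - i)
    return 1.0 - miss


def min_sample_size(population, infected, target):
    """Smallest sample size with detection probability >= target."""
    if infected < 1:
        raise ValueError("at least one infected animal is required")
    for n in range(1, population + 1):
        if detection_probability(population, infected, n) >= target:
            return n
    return population


if __name__ == "__main__":
    herd = 300
    infected = infected_count(herd, 8)           # 8% of 300 -> 24 animals
    print(min_sample_size(herd, infected, 0.90))
```

For a population of 300 at 8% prevalence and a 90% probability of detection this search returns 27, the figure used in the brucellosis example of Section 4.3.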
production. It should be noted, however, that questionnaires involving a considerable effort in filling in are likely to have a high non-return rate, and the sample size may have to be adjusted accordingly. Furthermore, high non-return rates can introduce substantial bias in the estimates calculated from the returns.
Epidemiological studies often involve visiting the sample units and collecting the relevant data by questioning the owners and/or carrying out the appropriate measurement procedure on the animals concerned. Designing questionnaire formats and interview protocols can be a long and difficult process, particularly where traditional livestock producers are concerned. Remember that questioning a traditional livestock producer about the numbers or performance of his animals is akin to questioning other individuals about their bank accounts! Considerable time and patience are needed to obtain the trust and cooperation of such individuals. Wherever possible, a trusted intermediary should be employed. Nevertheless, as most traditional livestock producers live in close proximity to their animals and normally come from sections of the population with a vast experience of keeping livestock under African conditions, they are obviously an extremely useful and valuable source of information.
Table 10. Sample size as a function of population size, prevalence and minimum probability of detection.
Prevalence                       Population size
(%)            50     75    100    300    500   1000   5000  10 000

a) 90% probability of detection
0.5            50     75    100    271    342    369    439    449
1              45     68     91    161    184    205    224    227
2              45     51     69     95    102    108    113    114
3              34     40     54     67     71     73     76     76
4              34     40     44     52     54     55     57     57
5              27     33     37     42     43     44     45     45
6              27     27     32     35     36     37     38     38
7              22     24     28     31     31     32     32     32
8              22     24     25     27     27     28     28     28
9              18     21     20     22     22     22     22     22
10             18     18     20     22     22     22     22     22

b) 95% probability of detection
0.5            50     72    100    286    388    450    564    581
1              48     72     96    189    225    258    290    294
2              48     58     78    117    129    138    147    148
3              39     47     63     84     90     94     98     98
4              39     47     52     66     69     71     73     74
5              31     39     45     54     56     57     69     59
6              31     33     39     45     47     48     49     49
7              26     29     34     39     40     41     42     42
8              26     29     31     34     35     36     36     36
9              22     26     28     31     31     32     32     32
10             22     23     25     28     28     29     29     29

c) 99% probability of detection
0.5            50     75    100    297    450    601    840    878
1              50     75     99
2              49     68     90
3              48     59     78
4              45     59     68
5              39     51     59
6              39     44     53
7              34     39     47
8              34     39     43
9              29     35     39
10             29     32     36
The success or failure of this type of epidemiological study depends as much on the design of recording forms as it does on the overall survey, the actual field work and the analysis. The latter will be impossible unless the material recorded is intelligible. Much thought should therefore be given to the design of forms, and their efficiency should be tested in pilot trials. The forms should be orderly, with related items grouped together (calf number, date of birth, place of birth), and convenient to use (the form should fit on a clipboard). Technical words not likely to be understood by field staff should be avoided, as should any ambiguities in the terms used. The form should have a title and provisions for the identification of both the officer completing the form and the data source. It should also have a reference number which relates to the survey design (e.g. 06/04/93 might indicate the sixth visit to farm 93 in stratum 4). Completed forms should be checked for errors as soon as possible, so that appropriate corrections can be made while the memory of the interviewer is still fresh and the sample unit accessible.

Some additional points to bear in mind in the design of interviews and questionnaires include:

i) Explain the purposes of the interview to the interviewee. People are generally much more cooperative when they know why they are being questioned.

ii) Being normally very polite, livestock owners tend to answer questions with the answer that they think the interviewer wishes to hear, rather than giving the correct answer. The use of leading questions, which give the interviewee a clue as to the answer expected or desired, should therefore be avoided.

iii) Human memories are short, and there is a tendency to concentrate events into a more limited time period than was actually the case. So if livestock owners are asked about events that occurred in their animals over the last year, they tend to report events that happened over the last 2 or 3 years. This obviously exaggerates data on disease frequencies.

iv) Do not make interviews or questionnaires too long, or else the interviewee will get bored and the quality of his answers will suffer. To avoid this, the most important questions should be asked at the beginning.

v) Questions requiring subjective answers generate data that are extremely difficult to analyse. They should be avoided whenever possible, even though they may give valuable insights.

vi) Long, complicated questions tend to lead to misunderstanding and wrong answers.
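The form reference number mentioned earlier (06/04/93 for the sixth visit to farm 93 in stratum 4) lends itself to automatic checking when forms are entered. The sketch below assumes a visit/stratum/farm layout, which the text offers only as a possibility; the names `FormReference` and `parse_reference` are hypothetical.

```python
from typing import NamedTuple


class FormReference(NamedTuple):
    """Decoded form reference number (assumed layout: visit/stratum/farm)."""
    visit: int
    stratum: int
    farm: int


def parse_reference(ref: str) -> FormReference:
    """Split a reference such as '06/04/93' into its three numeric parts."""
    parts = ref.strip().split("/")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"malformed form reference: {ref!r}")
    visit, stratum, farm = (int(p) for p in parts)
    return FormReference(visit, stratum, farm)


if __name__ == "__main__":
    print(parse_reference("06/04/93"))
```

Rejecting malformed references at data entry is one way of catching recording errors while the sample unit is still accessible.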
An additional problem frequently encountered is that of bias on the part of the observer. If an individual wishes to prove a particular point he may, quite unintentionally, be biased in recording his observations. This problem can be avoided by the use of a "blind" technique whereby the observer is kept ignorant of the distribution of the determinant in the groups being studied, merely being required to record a set of observations about those groups.

Errors due to measurements

Errors inherent in the procedures by which a variable is being measured are common in epidemiological studies. For example, if two weighing scales are being used in a study, one scale may consistently give a higher reading than the other. Obviously, careful checking and monitoring of such apparatus before and during the study will reduce errors of this kind.

Further errors may occur when diagnostic tests are being used to determine the presence or absence of an infectious agent. The terms used to describe the reliability of diagnostic procedures are:

Repeatability, which is the ability of a diagnostic test to give consistent results.

Accuracy, which is the ability of a test to give a true measure of the variable being tested. Accuracy is normally measured by two criteria:

- Sensitivity, which is the capability of that test to identify an individual as being infected with a disease agent when that individual is truly infected with the disease agent in question. In other words, it gives the proportion of infected individuals in the sample that produce a positive test result.

- Specificity, which is the capability of that test to identify an individual as being uninfected with a disease agent when that individual is truly not infected with the disease agent in question. In other words, it gives the proportion of uninfected individuals in the sample that produce a negative test result.

These two terms are illustrated in Table 11.

Table 11. Estimated and true prevalences of a disease agent illustrating the terms specificity and sensitivity.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    a                       b                       a+b
Negative test result    c                       d                       c+d
Total                   a+c                     b+d                     N

Notes: The estimated prevalence is (a+b)/N; the true prevalence is (a+c)/N. The sensitivity of the test is a/(a+c) and its specificity is d/(b+d).
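The four quantities defined in the notes to Table 11 can be computed directly from the cell counts. The following is a minimal sketch; the function name `summarise_two_by_two` is an assumption made for illustration, and the counts used are those of Example 1 (Table 12).

```python
def summarise_two_by_two(a, b, c, d):
    """Summarise a 2 x 2 table laid out as in Table 11.

    a: infected, test positive       b: not infected, test positive
    c: infected, test negative       d: not infected, test negative
    """
    n = a + b + c + d
    return {
        "estimated_prevalence": (a + b) / n,  # proportion testing positive
        "true_prevalence": (a + c) / n,       # proportion truly infected
        "sensitivity": a / (a + c),           # infected that test positive
        "specificity": d / (b + d),           # uninfected that test negative
    }


if __name__ == "__main__":
    for name, value in summarise_two_by_two(a=90, b=90, c=10, d=810).items():
        print(f"{name}: {value:.3f}")
```

In practice the true status of each animal is unknown, so only the row totals are observed; the column totals must come from a separate evaluation of the test.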
Example 1: Suppose that we tested a sample of 1000 animals for the presence of a disease agent using a test of 90% sensitivity and 90% specificity. The results of the testing procedure are shown in Table 12. Table 12 is somewhat artificial in that it gives the column totals, which we are trying to estimate. However, if the disease was distributed through the population in this way and we used a test that was 90% sensitive and 90% specific to estimate the extent of this distribution, we would arrive at an estimated prevalence of 180/1000, which would be an overestimate of the true prevalence of 100/1000. Of the 180 animals that the test identified as positive, 90 were, in fact, not infected with the disease, while of the 820 animals that the test identified as negative, 10 were, in fact, infected with the disease.

Table 12. Results of using a diagnostic test of 90% sensitivity and 90% specificity in a sample of 1000 animals in which the true prevalence of infection is 10%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    90                      90                      180
Negative test result    10                      810                     820
Total                   100                     900                     1000

Example 2: Suppose we used the same diagnostic test on a similar sample of animals but the true prevalence of the infection in the sample was 1%. The results of this test are given in Table 13.

Table 13. Results of using a diagnostic test of 90% sensitivity and 90% specificity in a sample of 1000 animals in which the true prevalence of infection is 1%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    9                       99                      108
Negative test result    1                       891                     892
Total                   10                      990                     1000
The true prevalence of the infection in this case is 10/1000 = 1%, while the estimated prevalence of infection is 108/1000 = 10.8%. Of the 108 animals that the test diagnosed as positive, 92% (i.e. 99/108) were, in fact, not infected with the disease agent in question.

This leads us to another useful statistic, the diagnosibility of a test, which is the proportion of test-positive individuals that are truly infected with the disease agent. In our first example the diagnosibility was 90/180 = 50%, while in the second it was 9/108 = 8.3%. Note that the diagnosibility of a diagnostic test declines as the prevalence of a disease decreases. This means that sensitivity and specificity errors in diagnostic tests produce relatively much greater errors in prevalence estimates of diseases with low true prevalence than would be the case in diseases of high prevalence.

It is obviously desirable to use a test that is as sensitive and specific as possible, so that the numbers of false positives and false negatives in the sample are reduced. The sensitivity and specificity of a test can be determined by administering the test to a number of animals and then comparing its results with the results obtained from a series of detailed diagnostic investigations on the animals concerned. In order for the results to be valid, however, the animals selected for the evaluation must be representative of the population to which the test is to be applied.

Once the sensitivity and specificity of a test are known, a correction factor can be applied to the prevalence estimate to take into account the sensitivity and specificity of the test:

\[
\text{True prevalence} = \frac{\text{estimated prevalence} + \text{specificity} - 1}{\text{sensitivity} + \text{specificity} - 1}
\]

where all values are expressed as decimals. For our example 2 (Table 13):

True prevalence = (0.108 + 0.90 - 1)/(0.90 + 0.90 - 1) = 0.008/0.80 = 0.01 or 1%.

Note that although we can now correct the prevalence estimate, we still have no idea which of the individual animals are truly negative, falsely negative, truly positive and falsely positive. This problem can occur when diagnostic tests are being used in a test-and-slaughter policy for controlling a particular disease. Such policies are normally only implemented after a vaccination campaign has reduced the disease to a low prevalence, when the diagnosibility of a test is likely to be low. In addition, vaccination itself often has an adverse effect on test sensitivity and specificity. We can see from our second example that if we slaughtered all the test positives, 92% of the animals being slaughtered would not actually be infected with the disease agent.

While it is relatively easy to make a test more sensitive, often by lowering the criteria by which a test result is deemed positive, this normally results in the test becoming less specific. Tests which are highly specific are often complicated, time consuming and, consequently, expensive. As such they can rarely be employed on a large scale.

A way round this problem is to apply two separate and independent testing procedures. Initially, a screening test of high sensitivity is needed to ensure that as many infected animals as possible are detected. Once the initial screening test has been performed, all positive reactors can be re-examined by a second test of high specificity. Since only the positive reactors have to be examined and not the entire sample, this cuts down the cost of using a highly specific test.

Example: Suppose we were attempting to eradicate a disease of 1% prevalence from a population of 10 000 animals by a process of test and slaughter. If we first use a test of high sensitivity (95%) but low specificity (85%), our initial results would be as illustrated in Table 14.
Table 14. Results of a diagnostic test of 95% sensitivity and 85% specificity used to examine a population of 10 000 animals for the presence of a disease with true prevalence of 1%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    95                      1 485                   1 580
Negative test result    5                       8 415                   8 420
Total                   100                     9 900                   10 000

We then subject the 1580 test-positive animals to a further test of the same sensitivity but a higher specificity (Table 15).

Table 15. Results of a diagnostic test of 95% sensitivity and 98% specificity applied to the 1580 test-positive animals identified in Table 14.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    90                      30                      120
Negative test result    5                       1 455                   1 460
Total                   95                      1 485                   1 580

This test indicates that we would need to slaughter 120 as opposed to 1580 animals. Admittedly, a few false negatives might have slipped through the testing procedure, but it is hoped that these would be picked up on subsequent testing.
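The diagnosibility, the corrected prevalence and the two-test procedure discussed above can be checked numerically. The sketch below is illustrative only: the function names are invented here, the two tests are assumed to be independent, and expected counts are left unrounded, so the second stage gives 90.25 infected and 29.7 uninfected positives where Table 15 shows the rounded figures 90 and 30.

```python
def diagnosibility(prevalence, sensitivity, specificity):
    """Proportion of test-positive animals that are truly infected."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def corrected_prevalence(estimated, sensitivity, specificity):
    """Correct an estimated prevalence for test error (all values decimals)."""
    return (estimated + specificity - 1) / (sensitivity + specificity - 1)


def serial_testing(n, prevalence, stages):
    """Expected numbers of infected and uninfected animals still positive
    after each test in `stages` is applied to the previous test's positives.

    stages: list of (sensitivity, specificity) pairs, assumed independent.
    """
    infected = n * prevalence
    uninfected = n - infected
    for sensitivity, specificity in stages:
        infected *= sensitivity            # infected animals still positive
        uninfected *= (1 - specificity)    # false positives still positive
    return infected, uninfected


if __name__ == "__main__":
    print(diagnosibility(0.10, 0.90, 0.90))        # Example 1
    print(diagnosibility(0.01, 0.90, 0.90))        # Example 2
    print(corrected_prevalence(0.108, 0.90, 0.90))
    print(serial_testing(10_000, 0.01, [(0.95, 0.85), (0.95, 0.98)]))
```

Running the first two lines with other prevalences shows how quickly the diagnosibility falls as the disease becomes rarer, which is the situation a test-and-slaughter programme usually faces.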
The first step is to write out clearly the objectives of the study and the data that will need to be generated in order to attain them. Throughout the entire planning process, constant reference should be made to these objectives in order to ensure that the procedures being planned are of relevance. If it is found that the resources available may not permit the achievement of the original objectives, the objectives may have to be redefined or additional resources found.

Objectives can often be defined by constructing a hypothesis. An epidemiological hypothesis should:

- Specify the population to which it refers, i.e. the population about which one wishes to make inferences and therefore sample from. This is referred to as the target population. Sometimes, for practical reasons, the population actually sampled may be smaller than the target population. In such cases the findings of the study will relate to the sampled population, and care must be exercised in extrapolating inferences from the sampled population to the target population.

Frequently, inferences may be required about different groups within the target population. For example, one may want to estimate not only the overall prevalence of a specific disease, but also the prevalences or incidences of the disease in various groups or subsets of the population. To obtain estimates with the precision required, the samples taken from these groups must be large enough, and this will obviously affect the design of the study.

A further problem may occur when defining the actual units to be sampled within a population. If, for example, the sample unit was a calf, at what age exactly does a calf cease being a calf? Alternatively, suppose the sample unit is a herd. What exactly is meant by the term "herd"? If a livestock owner has only one animal, does that constitute a herd? Obviously, the sample unit must be precisely defined and appropriate procedures designed to take care of borderline cases.
- Specify the determinant or determinants being considered. Can such disease determinants as "stress", "climate" and "management" be defined accurately? How are these determinants to be quantified and what measurements would be used in their quantification? What are the advantages and disadvantages of these methods of measurement? How accurate are they?

- Specify the disease or diseases being considered. The criteria by which an animal is regarded as suffering from a particular disease must be carefully defined. Will the disease be diagnosed on clinical symptoms alone? If so, what clinical symptoms? Are there likely to be problems with differential diagnoses? Will laboratory confirmation be needed? If so, are there adequate laboratory facilities available? Will they be able to process all the samples submitted? Will diagnostic tests be used? How accurate are these tests? Remember that studies based solely on diagnostic tests may provide data about the rates of infection present in the population being sampled, but they may not indicate whether the infected animals are showing signs of disease or not. Additional data on mortalities and morbidities may have to be generated. What rates are to be calculated? Remember that incidence and attack rates cannot normally be obtained by a cross-sectional study. If estimates on economic losses due to particular diseases are required, various production parameters may have to be recorded. How are these to be measured? How good and how accurate will these measurements be?

- Specify the expected response induced by a determinant on the frequency of occurrence of a disease. In other words, what effect would an increase or decrease in the frequency of occurrence of the determinant have on the frequency of occurrence of the disease? Remember that the determinant must occur prior to the disease. This may be difficult to demonstrate in a retrospective study.

- Make biological sense.
In epidemiological studies we are interested in exploring relationships between the frequency of occurrence of determinants and the frequency of occurrence of disease. We are particularly interested in determining whether the relationship is a causal one, i.e. whether the frequency of occurrence of the particular variable being studied determines the frequency of occurrence of the disease. We analyse such relationships by the use of statistical tests which tell us the probability that the relative distributions of the determinant and the disease in the studied populations occurred by chance. If there is a good probability that the distributions occurred by chance, the result is not significant and the distributions of the variable and the disease are regarded as unrelated. If there is a strong probability that the distributions did not occur by chance, the result is significant and the distributions of the variable and the disease are related in some way. Note that a statistically significant result does not necessarily imply a causal relationship.

Example: Suppose that the frequency of occurrence of variable A is determined by the frequency of occurrence of variable B, which also determines the frequency of occurrence of disease D. What is the relationship between variable A and disease D?

(Figure: variable B determines both variable A and disease D.)
Note that although this arrangement would produce a statistically significant relationship between variable A and the disease D, the relationship is not a causal one, since altering the frequency of occurrence of variable A would have no effect on the frequency of occurrence of the disease, which is determined by variable B. Variables that behave in this way are known as confounding variables and can cause serious problems in the analysis of epidemiological data. For this reason, any hypothesis that is made about the possible association of a determinant and a disease should offer a rational biological explanation as to why this association should be.

Finally, remember that common events occur commonly and that often the simplest explanation for a disease phenomenon is the right one. Complicated hypotheses should not be tested until the simplest ones have been ruled out. For example, the presence of ticks on supposedly dipped animals is more likely to be due to a failure to dip the animals or to improper dipping procedure, rather than to the appearance of a new strain of acaricide-resistant ticks.

These considerations emphasise the need for careful and detailed planning of an epidemiological study. They also illustrate the need to obtain as comprehensive and detailed knowledge as possible about the subject being investigated and the techniques used in the investigation. The time spent reading relevant literature is therefore usually well spent. Extensive literature searches can often be performed quickly and easily by using modern information-processing techniques.

Do not be afraid to ask advice from experts. Such advice is essential when one is conducting investigations or employing techniques outside one's particular area of expertise. Remember that the time to ask for advice is before the study has begun. Whenever possible, consult a statistician on the statistical design of the study in order to ensure that the data generated will be sufficient and can be analysed in the appropriate way to fulfil the objectives of the study.
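The example of variables A and B and disease D given earlier in this section can be made concrete with a small calculation. All the probabilities below are hypothetical and chosen only for illustration, and the function names are invented here. Variable B is made to determine both A and D, while A has no effect of its own on D.

```python
from itertools import product


def joint_distribution(p_b, p_a_given_b, p_d_given_b):
    """Joint probabilities of (B, A, D) when A and D each depend only on B.

    p_a_given_b and p_d_given_b are dicts keyed by True/False (B present
    or absent). All values are hypothetical inputs.
    """
    joint = {}
    for b, a, d in product((True, False), repeat=3):
        pb = p_b if b else 1 - p_b
        pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
        pd = p_d_given_b[b] if d else 1 - p_d_given_b[b]
        joint[(b, a, d)] = pb * pa * pd
    return joint


def disease_risk(joint, a, b=None):
    """P(D | A = a), optionally restricted to animals with B = b."""
    keep = [(k, p) for k, p in joint.items()
            if k[1] == a and (b is None or k[0] == b)]
    return sum(p for k, p in keep if k[2]) / sum(p for _, p in keep)


if __name__ == "__main__":
    joint = joint_distribution(
        p_b=0.5,
        p_a_given_b={True: 0.8, False: 0.2},
        p_d_given_b={True: 0.30, False: 0.05},
    )
    crude = disease_risk(joint, True) / disease_risk(joint, False)
    within_b = (disease_risk(joint, True, b=True)
                / disease_risk(joint, False, b=True))
    print(f"crude risk ratio for A: {crude:.2f}")
    print(f"risk ratio for A among animals with B: {within_b:.2f}")
```

With these figures the disease is 2.5 times as frequent in animals with A as in those without it, yet among animals that share the same status for B the ratio is 1, which is why A appears associated with D although it does not cause it.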
The main advantages of using existing data are:

- Data collection is expensive; using existing data is cheaper although not cost free.

- Time is often essential; analysis of existing data sources gives answers more quickly.

- By using data from various sources, it may become possible to monitor the progress of a disease through different populations and to establish linkages between disease events, so that the sources of disease outbreaks can be traced and populations likely to be at risk of the disease identified.

- The use of existing data sources will help strengthen them or induce the need for change.

- Since the original data collection was performed in ignorance of the ongoing study, there may be a reduced chance of bias in favour of or against any hypothesis being tested.

The main disadvantages encountered in the use of existing data include:

- Data sets are often incomplete. For example, national reports based on compilations of regional reports are almost invariably incomplete and frequently very late in appearing, as some regions are late in reporting. Parts of data sets may have been lent out and not returned.

- The data may have been collected for other purposes than those of the present study. For example, data collected initially for administrative or accounting purposes are unlikely to help identify the associations between a disease and its determinants.

- Existing data may be inconsistent or of unknown consistency. Observers change and so do recording systems. Changes in administrative procedures or policy may alter the type and method of data collection and complicate analysis. Random errors of counting or in reading instruments may cancel each other out in the long term, but errors are often not random. Scales may be consistently misread due to confusion over units and graduations. Different observers may consistently under- or overestimate livestock numbers, weights and ages and differ in their diagnosis of the same disease condition. Calculations of epidemiological rates are often prejudiced by ignorance of the size of the population at risk and of the time over which events were observed.

- The data may not be relevant. Records for Friesians will not be useful in estimating production losses in zebus. Although data may be readily available from commercial producers, they will not relate to the majority of rural enterprises. Since livestock production is dependent on weather, among other factors, data from a series of years need to be examined to obtain representative estimates of means and scatter. Even if such data are available from apparently similar farming systems, checking is necessary to identify any changes that might have occurred in the provision of services, health control, markets and in prices, before taking historical data as being a good estimate of animal health and production at present.

- The method used to collate and analyse the data may not be adequate for epidemiological purposes. If this is the case, the data may have to be obtained in the original form, if still available, and reanalysed. This may be a time-consuming process. Moreover, it may not be possible to subject the original data to the appropriate analysis.

There are nearly always some serious limitations in the value of existing data for epidemiological purposes. This does not mean that the data may not be useful; if the limitations are understood, the probability of their misinterpretation will be reduced.
or to the provision of additional drugs, equipment and facilities. An increased awareness on the part of livestock owners of a particular disease problem or more selective diagnosis and treatment may also lead to an apparent increase in recorded incidence. Probably the most useful data from such sources are those related to notifiable disease outbreaks, on which detailed reports have to be compiled. If the report forms have been properly designed and the investigative procedures specified, such data may allow the appropriate rates to be calculated. However, owners may be reluctant to report such diseases in their livestock, especially if they know that restrictions are likely to be imposed.

Diagnostic laboratories. The data generated by diagnostic laboratories often provide precise diagnoses of disease conditions but can be highly selective. The relative frequencies with which specific diagnoses are reported often reflect the standard and range of laboratory facilities, and the interests or expertise of the field staff and laboratory workers, rather than the actual situation in the field. Unless the laboratory has a field survey capacity, incidence and prevalence rates cannot be established, since the data on diagnoses obtained cannot be related to a source population. Nevertheless, such data are often useful in highlighting disease problems which are of particular concern to the individuals submitting the specimens. The minimum knowledge that disease x was confirmed in location y at time z provides some basis on which to build.

Research laboratories, institutions and universities. Most of the data generated by these institutions are likely to come from experiments and may be difficult to relate to the situation in the field. Nevertheless, if research is being conducted into a particular disease, the data generated are likely to provide valuable insights into the epidemiology of the disease in question. Such institutions are also good sources of reference and advice.

Slaughter houses and slaughter slabs. The data generated from these sources are normally in the form of findings at meat inspection, and may be recorded in a limited and highly administrative format. Major variations in the sensitivity and specificity of diagnoses may occur between different inspectors. The data only pertain to certain sections of livestock populations and are highly biased, since mostly healthy young adults are examined. Significant omissions are common, and relatively rare pathological conditions are not usually differentiated, but the data may provide information on congenital abnormalities and chronic disease conditions which produce distinctive lesions. Slaughter houses and slaughter slabs are frequently used as a starting point for epidemiological investigations since they have facilities for conducting examinations and taking specimens that are not available elsewhere.

Marketing organizations. Data from marketing organizations provide information on sales and offtake and sometimes also on livestock movements. Information on the latter might be used to trace back disease outbreaks to their sources. Unfortunately, this is rarely the case in Africa, since animals are seldom individually identified and therefore their movements cannot be accurately recorded.

Control posts and quarantine stations. Records from these facilities can provide information about livestock movements and outbreaks of notifiable diseases.

Artificial insemination services. Records from AI services may be of assistance in providing some information about fertility. The data are normally collected in the form of non-return rates, i.e. the proportions of first, second, third inseminations etc. for which no further insemination is requested. Such rates often give an overestimate of the true reproductive performance in the populations concerned. Many AI services include a facility for the investigation of infertility problems. Data from such a facility can be of interest but are difficult to relate to a source population.

Insurance companies. Since these companies now offer insurance cover for high-value animals, and may offer limited cover for animals of lower value, they need to calculate and monitor risks, an interest they share with the epidemiologist. As such their records may be useful, but only limited data may be available.

The time required to identify and analyse existing records should not be underestimated, while their value needs to be carefully weighed against the cost. A quick but comprehensive survey of such material should indicate whether it will provide the required answers.
- Trace the course of disease outbreaks with the objective of identifying their sources and the populations of livestock likely to be at risk.

- Provide a comprehensive and readily accessible database on disease in livestock populations for research and planning purposes.

The prime objective of such activities is, however, to provide up-to-date information to disease control authorities to assist them in formulating policy decisions and in the planning and implementation of disease control programmes. Although a detailed discussion on the design and implementation of surveillance systems is beyond the scope of this manual, it may be useful to review briefly some of the considerations involved.

The success of any surveillance or monitoring system depends largely on the speed and efficiency with which the data gathered can be collated and analysed, so that up-to-date information can be rapidly disseminated to interested parties. As a result of recent advances in data processing techniques, particularly in the field of computing, the development of comprehensive and efficient surveillance and monitoring systems at a reasonable cost is now within the reach of most veterinary services.

The capacity of epidemiological units to employ these modern techniques means that such units may be able to offer data-processing services to institutions and organisations in return for the use of their data. This has removed one of the main constraints on the development of such systems in the past, which was the reluctance of various data-generating sources to make their data available to those responsible for surveillance. Such cooperation depends on a clear identification of the information needs of reporting organisations and fulfilling these rapidly and efficiently.

Modern computerised data processing allows complicated analytical procedures to be carried out on large volumes of data quickly and easily. However, they must be used with a great deal of caution and only on data which justify them. If used on incomplete or inaccurate data whose limitations are not understood, they may produce results which are at best confusing or misleading. For this reason, the analysis of surveillance or monitoring data should be kept simple and the limitations of information produced should be clearly stated.

A further consideration is that of confidentiality. Any surveillance or monitoring system will contain a certain amount of confidential data. If such data get into the wrong hands and are used indiscriminately without due regard to their probable limitations, serious problems may result. Appropriate safeguards need to be designed, therefore, to ensure that information is distributed to interested parties on a confidential and need-to-know basis.
tick control programme by dipping. The following objectives should be borne in mind in the design of monitoring systems:

- If control measures are being employed, the monitoring programme should provide a means to ascertain whether these measures are being carried out promptly and efficiently as specified in the programme design, and if not, why not.

- The monitoring programme should provide a means to ascertain whether the control measures being applied are having the desired and predicted effect on disease incidence. This normally implies a prompt and comprehensive disease-reporting system. The system should not be passive, but should include a component that is actively concerned with searching out disease outbreaks.

- The monitoring programme should provide a means for a rapid detection of developments which might jeopardise the control programme, or, in instances where no control measures are being implemented, which might warrant the introduction of control activities.