1 Laboratory of Atmospheric Chemistry, Centre for Atmosphere Watch and Services,
Chinese Academy of Meteorological Sciences, Beijing, China
2 NOAA Air Resources Laboratory, Silver Spring, Maryland, USA
Corresponding Author:
Email: wangyq@cams.cma.gov.cn
Fax: +86-10-62176414
Abstract: Statistical analyses of air mass back trajectories combined with long-term
ambient air pollution measurements are useful tools for source identification. Using
these methods, the geographic information system (GIS) based software, TrajStat, was
developed to view, query, and cluster the trajectories and compute the potential source
when measurement data are included. The paper presents the software structure and
sources and dust pathways to Jiuquan in springtime. The results indicate that the dust
mainly comes from west of Jiuquan via the Taklimakan desert and the Kumtag desert.
Software availability
1. Introduction
Identification of pollutant sources using ambient air quality data is essential for
air pollution management. Air mass back trajectory analysis is frequently used to
point out the direction and sources of air pollution at a receptor site (José et al., 2005;
20% of the traversed distance (Stohl, 1998), but the statistical uncertainty will be
reduced with large sets of trajectories. Trajectory clustering techniques, which assign
flow climatology and pollutant transport pathways with particle or gas measurements
1980s by computing the potential source contribution function (PSCF), also called
calculate and describe possible source locations using back trajectories (Ashbaugh et
al., 1985). This method tends to give good angular resolution but poor radial
(Vasconcelos et al., 1996). A limitation of the PSCF method is that grid cells can have
the same PSCF value when sample concentrations are either only slightly higher or
much higher than the criterion. As a result, it can be difficult to distinguish moderate
(CF), can more easily distinguish source strength by assigning the concentration
values at the receptor site to their corresponding trajectories (Seibert et al., 1994). The
mean or logarithmic mean concentration is computed and used as weight for the
residence time of the trajectory in each grid cell. In the CWT method, a measured
concentration is assigned equally to all segments of its trajectory, but sources of the
air pollutants are often concentrated in “hot spots”. Using the redistributed CWT field,
Samson, 1989) developed the quantitative transport bias analysis (QTBA). When
applied to multi-site data, QTBA fields were overlaid to locate the sources (Keeler
and Samson, 1989). Based on the coupling of residence time analysis and a known
emission inventory, the relative source contribution function (RSCF) was developed
to show the contribution of different kinds of sources (Lin and Chang, 2002). PSCF
and CWT methods have been used widely in recently published studies (Abbott et al.,
2008; Begum et al., 2005; Hsu et al., 2003; Park et al., 2008).
For air mass trajectory visualization and statistical analysis applications, a new
systems (GIS) technique was used for spatial data management, visualization and
analyses. This paper will describe the structure and functions of the software and
demonstrate its use to determine the sources of dust to Jiuquan in springtime.
2. Methods
Hess, 1998). The trajectory model, although included with the TrajStat distribution
and integrated into the GIS, is an external process to the TrajStat software.
the wind, so its trajectory is just the time (t) integration of the particle position vector
(P) in space. The final position for each trajectory segment is computed from the
average velocity (V) at the initial position (P) and first-guess position (P’),
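The integration step described here is a predictor-corrector scheme: a first-guess position is computed from the velocity at the initial position, and the final position uses the average of the velocities at the two positions. A minimal sketch follows, assuming a hypothetical `wind(p, t)` callable that returns the velocity vector at position `p` and time `t` (in HYSPLIT this would come from interpolating the gridded meteorological data). The function names `advect_step` and `back_trajectory` are illustrative only and are not part of HYSPLIT or TrajStat.

```python
from typing import Callable, Tuple

Vec = Tuple[float, float, float]


def advect_step(p: Vec, t: float, dt: float,
                wind: Callable[[Vec, float], Vec]) -> Vec:
    """Advance position p by one time step dt (a negative dt runs backward)."""
    v0 = wind(p, t)
    # First-guess position P' from the velocity at the initial position.
    p_guess = tuple(x + v * dt for x, v in zip(p, v0))
    v1 = wind(p_guess, t + dt)
    # Final position from the average of the two velocities.
    return tuple(x + 0.5 * (a + b) * dt for x, a, b in zip(p, v0, v1))


def back_trajectory(start: Vec, t0: float, dt: float, n_steps: int,
                    wind: Callable[[Vec, float], Vec]) -> list:
    """Return the endpoints of a backward trajectory (dt > 0 is the step size)."""
    points, p, t = [start], start, t0
    for _ in range(n_steps):
        p = advect_step(p, t, -dt, wind)
        t -= dt
        points.append(p)
    return points
```

The list returned by `back_trajectory` corresponds to the segment endpoints that the later statistical methods operate on.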
2.2. Trajectory clustering
Cluster analysis is a multivariate statistical technique that splits a data set into a
method, but the selection of the clustering algorithm, the distance definition and the
al., 1998; Dorling et al., 1992; Harris and Kahl, 1990; Sirois and Bottenheim, 1995)
have been developed, but in TrajStat, Ward’s hierarchical method (Ward, 1963) is
used to form the clusters by combining the nearest trajectories. Euclidean distance is
often used to define the distance between two trajectories using the latitude and
$$ d_{12} = \sqrt{\sum_{i=1}^{n} \left( (X_1(i) - X_2(i))^2 + (Y_1(i) - Y_2(i))^2 \right)}, \tag{3} $$
where X1 (Y1) and X2 (Y2) reference backward trajectories 1 and 2, respectively. The
main disadvantage of using the Euclidean distance is that two backward trajectories
that followed the same path, but with one having higher speed, may be classified into
two different clusters. If the main interest is to determine the direction from which the
air masses reach the site, one could use angle distance (Sirois and Bottenheim, 1995),
$$ d_{12} = \frac{1}{n} \sum_{i=1}^{n} \cos^{-1}\left( 0.5\, \frac{A_i + B_i - C_i}{\sqrt{A_i B_i}} \right), \tag{4} $$
where
$$ A_i = (X_1(i) - X_0)^2 + (Y_1(i) - Y_0)^2, \tag{5} $$
$$ B_i = (X_2(i) - X_0)^2 + (Y_2(i) - Y_0)^2, \tag{6} $$
$$ C_i = (X_2(i) - X_1(i))^2 + (Y_2(i) - Y_1(i))^2. \tag{7} $$
The variables X0 and Y0 define the position of the receptor site (backward
trajectory origin point) and d12 is the mean angle between the two backward
Both Euclidean and angle distance algorithms are included in TrajStat and visual
inspection of the best cluster line shown by the software is used to determine the final
number of clusters.
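The two distance definitions of Eqs. (3) to (7) can be sketched as below. This is an illustration under stated assumptions, not the TrajStat implementation: trajectories are taken as equal-length lists of (X, Y) endpoints, Eq. (3) is assumed to have the usual square-root form of a Euclidean distance, and an endpoint that coincides with the receptor site (where the angle is undefined) is counted as a zero angle. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def euclidean_distance(t1: Sequence[Point], t2: Sequence[Point]) -> float:
    """Eq. (3): root of the summed squared endpoint separations."""
    return math.sqrt(sum((x1 - x2) ** 2 + (y1 - y2) ** 2
                         for (x1, y1), (x2, y2) in zip(t1, t2)))


def angle_distance(t1: Sequence[Point], t2: Sequence[Point],
                   origin: Point) -> float:
    """Eq. (4): mean angle in radians, as seen from the receptor site."""
    x0, y0 = origin
    total = 0.0
    for (x1, y1), (x2, y2) in zip(t1, t2):
        a = (x1 - x0) ** 2 + (y1 - y0) ** 2  # Eq. (5)
        b = (x2 - x0) ** 2 + (y2 - y0) ** 2  # Eq. (6)
        c = (x2 - x1) ** 2 + (y2 - y1) ** 2  # Eq. (7)
        if a == 0.0 or b == 0.0:
            continue  # angle undefined at the origin; counted as zero (assumption)
        cos_angle = 0.5 * (a + b - c) / math.sqrt(a * b)
        # Clamp to [-1, 1] to guard against floating-point rounding.
        total += math.acos(max(-1.0, min(1.0, cos_angle)))
    return total / len(t1)
```

For two trajectories that follow the same path at different speeds, for example `[(1, 0), (2, 0)]` and `[(2, 0), (4, 0)]` with the receptor at `(0, 0)`, the angle distance is zero while the Euclidean distance is not, which is the behaviour the text describes.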
Air parcel back trajectories from the receptor site are represented by the segment
endpoints. To calculate the PSCF, the whole geographic region covered by the
trajectories is divided into an array of grid cells whose size is dependent on the
geographical scale of the problem. The PSCF will be a function of location as defined
by the cell indices i and j while the number of segments with endpoints that fall in the
ijth cell is denoted by nij. The number of endpoints in the ijth cell associated with a
trajectory that arrives at the sampling site at the same time as a corresponding
The PSCF value can be interpreted as the conditional probability that the
concentrations of a given pollutant sample greater than the criterion level are related
to the passage of air parcels through the ijth cell during transport to the receptor site.
That is, cells with high PSCF values are associated with the arrival of air parcels at
the receptor site that have measured concentrations higher than the criterion value.
These cells are indicative of areas of ‘high potential’ contributions for that pollutant.
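The counting procedure behind the PSCF can be sketched as follows, assuming the standard definition in which the PSCF of a cell is the ratio of the number of "polluted" endpoints to the total number of endpoints in that cell (Ashbaugh et al., 1985). The symbol m_ij for the polluted count, the function names, and the grid anchored at longitude 0, latitude 0 are assumptions made for this illustration, not details taken from TrajStat.

```python
import math
from collections import defaultdict


def cell_index(lon, lat, cell_size):
    """Map a longitude/latitude pair to integer grid indices (i, j)."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))


def pscf(trajectories, concentrations, criterion, cell_size=1.0):
    """Return {(i, j): (n_ij, m_ij, PSCF_ij)} for every visited cell.

    trajectories   -- list of trajectories, each a list of (lon, lat) endpoints
    concentrations -- receptor concentration matched to each trajectory
    criterion      -- concentration threshold defining a polluted trajectory
    """
    n = defaultdict(int)  # all endpoints per cell
    m = defaultdict(int)  # endpoints of polluted trajectories per cell
    for endpoints, conc in zip(trajectories, concentrations):
        for lon, lat in endpoints:
            cell = cell_index(lon, lat, cell_size)
            n[cell] += 1
            if conc > criterion:
                m[cell] += 1
    return {cell: (n[cell], m[cell], m[cell] / n[cell]) for cell in n}
```

Returning n_ij alongside the ratio keeps the endpoint count available for a later weighting step on sparsely sampled cells.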
Like the PSCF method, in the CWT method a grid is superimposed over the
weighted concentration from the measured sample associated with the trajectories that
where Cij is the average weighted concentration in the ijth cell, l is the index of the
arrival of trajectory l, and τijl is the time spent in the ijth cell by trajectory l. The time a
located in the cell. Also, the concentrations can be transformed to their logarithmic
values if less weight is desired for the high concentration outliers. A high value for Cij
implies that air parcels traveling over the ijth cell would be, on average, associated
In both the PSCF and CWT methods, cells with few endpoints may result in high
uncertainty. Thus, to reduce the influence of those grid cells, the PSCF or CWT value
is multiplied by an arbitrary weight function (Polissar et al., 1999).
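A minimal sketch of the CWT calculation and of the weighting step is given below. It assumes the usual residence-time-weighted mean, with the time spent in a cell approximated by the number of endpoints located in it. The step thresholds in `EXAMPLE_WEIGHTS` are hypothetical placeholders chosen for illustration; they are not values from this paper or from TrajStat, and the function names are likewise assumptions.

```python
import math
from collections import defaultdict


def cwt(trajectories, concentrations, cell_size=1.0, use_log=False):
    """Return {(i, j): (n_ij, C_ij)}: endpoint count and weighted concentration.

    With use_log=True the result is a mean of log-concentrations.
    """
    weighted_sum = defaultdict(float)  # sum over l of C_l * tau_ijl
    residence = defaultdict(int)       # sum over l of tau_ijl (endpoint counts)
    for endpoints, conc in zip(trajectories, concentrations):
        value = math.log(conc) if use_log else conc
        for lon, lat in endpoints:
            cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
            weighted_sum[cell] += value
            residence[cell] += 1
    return {cell: (residence[cell], weighted_sum[cell] / residence[cell])
            for cell in residence}


# Hypothetical step weights: (minimum endpoint count, weight), highest first.
EXAMPLE_WEIGHTS = ((80, 1.0), (20, 0.7), (10, 0.4), (0, 0.1))


def apply_weight(field, weights=EXAMPLE_WEIGHTS):
    """Down-weight cells with few endpoints; field maps cell -> (n_ij, value)."""
    result = {}
    for cell, (count, value) in field.items():
        # Pick the first step whose minimum count is reached.
        w = next(weight for minimum, weight in weights if count >= minimum)
        result[cell] = value * w
    return result
```

Because the weights are passed as a parameter, the thresholds can be adapted to the number of trajectories and the grid cell size of a particular study.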
3. Software description
There are three approaches to using GIS functionality for an application. In most
cases, applications are built in the GIS environment, which offers a special macro or
professional GIS functions are the main advantages of this kind of approach.
However, a major limitation is that the application can only run in the GIS
programming environment to create the application with a GIS component that can be
run without any installed GIS environment. A third approach would be to write all the
required GIS functions directly into the application. But this approach adds too much
the most flexibility and software independence, the second approach was used to
develop TrajStat.
components such as MapObjects and MapX to build their GIS functions (Rees et al.,
2006; Shen et al., 2005; Symeonidis et al., 2004). However, these commercial
approaches are expensive and there are limitations in distributing the software. In
control (MapWindow open source team, 2007) to generate its basic GIS functions
which all use the ESRI shapefile data formats. These GIS functions include shape
layer management, shape view, map zoom options, attribute data view and edit, label,
shape layer legend edit and several others. All shape layers in a trajectory analysis
task can be saved as one project file, which is convenient for multi-task
can be included in the project for more colorful and attractive map views.
Microsoft Visual Basic 2005 was chosen for implementation of the program. Fig.
1 shows the structure of the TrajStat system, which is divided into two parts:
trajectory data preparation, and trajectory statistics. The trajectory calculation function
and the function for adding measurement data to trajectories are included in the first part. The second
part consists of three models: clustering, PSCF and CWT. The GIS functions of the
system are used to generate, edit and view the corresponding shape files for each
part. The detailed steps and data flow are described below.
which are loaded into the system as an external process. The meteorological data can
the trajectory model parameters to compute one month’s worth of trajectories. These
parameters consist of the start location, start time, run duration, top of the model,
vertical motion method, meteorological data file path and name, and the trajectory file
output path and name. Individual endpoint text files will be generated from this step
Generate trajectory shape file: Internally, the endpoint files from the previous
step are first converted to comma-delimited text files prior to converting them to the
ESRI shape file format. The trajectories calculated from HYSPLIT are
In this type of shape file the x, y and z properties of each point are defined by its
longitude, latitude and air pressure along the trajectory. The MapWindowGIS object
creation functions of point, shape and shapefile were used to generate the trajectory
shape file. In addition, the trajectory start date and height attributes were added to the
shapefile.
For instance, using only the height (z) coordinate, the air pressure profile of each
sources for a given receptor site can be analyzed, the long-term measurement data
measurement data column to the trajectory shape file attribute table according to the
which a user could add an attribute field. For instance, a marker can be added to each
trajectories and then the pollutant pathway could be roughly estimated through
must first be converted to shape files prior to running the clustering model. Euclidean
distance or angle distance can be selected as the cluster model and the maximum
number of clusters to output should also be preset. The clustering output text file
includes the cluster number for each trajectory. The “calculate mean trajectories” function for a
given cluster number will generate the corresponding mean trajectory shape file for
that cluster. A reasonable maximum cluster number can be decided through visual
inspection and comparison of the mean-trajectory maps. Then the cluster ID can be
added to each corresponding trajectory in the trajectory shape files. The mean
pollutant concentration for each cluster can be computed using the cluster statistics
function. Pollutant pathways can then be associated with the high concentration
clusters.
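The cluster statistics step amounts to grouping the measured concentrations by cluster ID. A minimal sketch is shown below, assuming one cluster ID and one matched concentration per trajectory; the function name and the returned field names are hypothetical and do not reflect the TrajStat output format.

```python
from collections import defaultdict


def cluster_statistics(cluster_ids, concentrations, criterion):
    """Summarize concentrations per cluster.

    Returns {cluster_id: dict} with the trajectory count, mean concentration,
    number of polluted trajectories (concentration > criterion) and their mean.
    """
    groups = defaultdict(list)
    for cid, conc in zip(cluster_ids, concentrations):
        groups[cid].append(conc)
    stats = {}
    for cid, values in groups.items():
        polluted = [c for c in values if c > criterion]
        stats[cid] = {
            "count": len(values),
            "mean": sum(values) / len(values),
            "polluted_count": len(polluted),
            # None when the cluster has no polluted trajectories.
            "polluted_mean": sum(polluted) / len(polluted) if polluted else None,
        }
    return stats
```

Clusters with both a high mean and a large polluted count are the natural candidates for the main pollutant pathways.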
polygon shapes from gridded data such as the PSCF and CWT analyses according to
the extent of the domain and grid cell size. For each cell in the PSCF shape file, the
total endpoint number and pollutant endpoint (with corresponding measured pollutant
concentration higher than an arbitrary criterion value) number were counted using the
total endpoint number was also counted in the CWT shape file. Then the PSCF and CWT
values were calculated using Eqns. 8 and 9. To reduce the effect of small total
endpoint numbers, the weighting function should be set in the analyses. Weighted
PSCF or CWT fields can be plotted as polygon maps with various color schemes.
Cells with high PSCF or CWT values identify potential source regions for the receptor site.
The deserts in the mid-latitude area of Asia (such as the Gobi) have been
considered the primary sources of Asian dust. Previous studies have shown that the
deserts of southern Mongolia, the Taklimakan and the Badain Jaran in China contributed
about 70% of total Asian dust emissions (Zhang et al., 2003). However, for a given
receptor site the source attribution results may be quite different and therefore
additional special studies are required to determine the site’s primary dust sources.
These studies may include different chemical tracer analyses (Kanayama et al., 2002;
Wang et al., 2005; Zhang et al., 1996), sensitivity model experiments (Escudero et al.,
Jiuquan, located in the north of Gansu province with the Kumtag desert to the west and
the Badain Jaran desert to the east (Fig. 2), is one of the operational sand and dust storm
The PM10 (particulate matter with diameter less than 10 µm) concentration data in spring
of 2004-2006 have been reported by Wang et al. (2008). These data will
statistical methods can be used to identify the dust sources and transport pathways to
Using TrajStat, daily 3-day back trajectories were calculated during springtime
of 2004-2006, and then the corresponding PM10 measured concentration data were
assigned to each trajectory. The trajectory view functions (spatial distribution, air pressure
profile and 3-D distribution) are shown in Fig. 2, providing a visualization of the
detailed path of each trajectory. The trajectory query function was used to distinguish
the most polluted trajectories by displaying only those trajectories with PM10
concentrations larger than 150 µg m⁻³, the limiting value of the Class II category of
the National Ambient Air Quality Standards in China. These most polluted trajectories
were mainly from the west, with only a few from the east (Fig. 3).
Because air mass transport direction is our main interest in the analysis of dust
storm source regions, angle distance was chosen as the clustering model. A final
cluster number of 5 was selected after the visual examination of the mean trajectory
maps for different cluster numbers (Fig. 4). After the cluster number identification
was assigned to each trajectory, the mean concentrations of all
trajectories and of only the polluted trajectories were calculated for each cluster (Fig. 4).
Cluster 4, with 89 trajectories, showed the highest mean PM10 concentration of all clusters, 227.41
µg m⁻³; it also had the largest number of polluted trajectories (37),
with a mean PM10 concentration of 427.74 µg m⁻³. These results indicate that the dust
transport pathway to Jiuquan is mainly from the west, via the Taklimakan desert and the
Kumtag desert.
The TrajStat PSCF and CWT analysis results are shown in Fig. 5 and Fig. 6,
respectively. The dust source locations identified by these two methods were very
similar to the clustering results. The eastern part of the Taklimakan desert and the Kumtag
desert showed the highest PSCF and CWT values and therefore are the main dust
5. Conclusions
The TrajStat software provides an integrated GIS interface to create, view, and
statistically analyze multiple trajectories and the corresponding sampling data. The
GIS environment is created using open source MapWindowGIS components that have
tools include clustering, PSCF and CWT methods all of which can be associated with
measurement data at the receptor site (backward trajectory origin) by assigning each
source regions can be identified using the software. The trajectory statistical
analysis process is not entirely automated, in that some subjective decisions are
specify the final cluster number desired, the clustering method, and the grid cell size
and sources to Jiuquan during the springtime. In this application, the trajectories could
for each trajectory. With the option of two trajectory distance clustering definitions,
direction. Users could also select PSCF and CWT analyses of the trajectories and
measurement data using TrajStat. The example results showed that the Taklimakan
This paper summarizes the applications available in the first version of TrajStat.
Although there are some limitations, the software does include the most common
trajectory analysis methods, such as clustering, PSCF, CWT, and weighting functions
to reduce the uncertainty of the grid cells with a small number of trajectories.
confidence intervals and smoothing functions for CWT (Seibert et al., 1994) and
Acknowledgements
This study was supported by grants from the Basic Scientific and Operational
Research Fund of CAMS (2008Z004) and the National Basic Research Program of China
(2006CB403700).
References:
Abbott, M.L., Lin, C.-J., Martian, P. and Einerson, J.J., 2008. Atmospheric mercury near Salmon Falls
Creek Reservoir in southern Idaho. Applied Geochemistry, 23, 438-453.
Ashbaugh, L.L., Malm, W.C. and Sadeh, W.Z., 1985. A residence time probability analysis of sulfur
concentrations at Grand Canyon National Park. Atmospheric Environment, 19(8), 1263-1270.
Begum, B.A., Kim, E., Jeong, C.-H., Lee, D.-W. and Hopke, P.K., 2005. Evaluation of the potential
source contribution function using the 2002 Quebec forest fire episode. Atmospheric
Environment, 39, 3719-3724.
Brankov, E., Rao, S.T. and Porter, P.S., 1998. A trajectory-clustering-correlation methodology for
examining the long-range transport of air pollutants. Atmospheric Environment, 32(9),
1525-1534.
Dorling, S.R., Davies, T.D. and Pierce, C.E., 1992. Cluster analysis: a technique for estimating the
synoptic meteorological controls on air and precipitation chemistry - results from Eskdalemuir,
South Scotland. Atmospheric Environment, 26A, 2583-2602.
Draxler, R.R. and Hess, G.D., 1998. An overview of the HYSPLIT_4 modelling system for trajectories,
dispersion, and deposition. Australian Meteorological Magazine, 47, 295-308.
Escudero, M., Stein, A., Draxler, R.R., Querol, X., Alastuey, A., Castillo, S. and Avila, A., 2006.
Determination of the contribution of northern Africa dust source areas to PM10 concentrations
over the central Iberian Peninsula using the Hybrid Single-Particle Lagrangian Integrated
Trajectory model (HYSPLIT) model. Journal of Geophysical Research, 111, D06210,
doi:10.1029/2005JD006395.
Harris, J.M. and Kahl, J.D., 1990. A descriptive atmospheric transport climatology for the Mauna Loa
Observatory, using clustered trajectories. Journal of Geophysical Research, 95, 13651-13667.
Hsu, Y.-K., Holsen, T.M. and Hopke, P.K., 2003. Comparison of hybrid receptor models to locate PCB
sources in Chicago. Atmospheric Environment, 37, 545-562.
José, R.S., Stohl, A., Karatzas, K., Bohler, T., James, P. and Pérez, J.L., 2005. A modelling study of an
extraordinary night time ozone episode over Madrid domain. Environmental Modelling &
Software, 20, 587-593.
Kanayama, S., Yabuki, S., Yanagisawa, F. and Motoyama, R., 2002. The chemical and strontium
isotope composition of atmospheric aerosols over Japan: the contribution of
long-range-transported Asian dust (Kosa). Atmospheric Environment, 36(33), 5159-5175.
Keeler, G.J. and Samson, J., 1989. Spatial representativeness of trace element ratios. Environmental
Science and Technology, 23, 1358-1364.
Lin, C.-H. and Chang, L.-F.W., 2002. Relative source contribution analysis using an air trajectory
statistical approach. Journal of Geophysical Research, 107(D21), 4583.
MapWindow open source team, 2007. MapWinGIS reference manual. Geospatial Software Lab, Idaho
State University, Pocatello.
Park, S.S., Lee, K.-H., Kim, Y.J., Kim, T.Y., Cho, S.Y. and Kim, S.J., 2008. High time-resolution
measurements of carbonaceous species in PM2.5 at an urban site of Korea. Atmospheric
Research, 89, 48-61.
Polissar, A.V., Hopke, P.K., Paatero, P., Kaufmann, Y.J., Hall, D.K., Bodhaine, B.A., Dutton, E.G. and
Harris, J.M., 1999. The aerosol at Barrow, Alaska: long-term trends and source locations.
Atmospheric Environment, 33, 2441-2458.
Rees, H.G., Holmes, M.G.R., Fry, M.J., Young, A.R., Pitson, D.G. and Kansakar, S.R., 2006. An
integrated water resource management tool for the Himalayan region. Environmental
Modelling & Software, 21, 1001-1012.
Rousseau, D.-D., Duzer, D., Etienne, J.-L., Cambon, G., Jolly, D., Ferrier, J. and Schevin, P., 2004.
Pollen record of rapidly changing air trajectories to the North Pole. Journal of Geophysical
Research, 109, D06116, doi:10.1029/2003JD003985.
Seibert, P., Kromp-Kolb, H., Baltensperger, U., Jost, D.T., Schwikowski, M., Kasper, A. and Puxbaum,
H., 1994. Trajectory analysis of aerosol measurements at high alpine sites. In: Borrell, P.M.,
Borrell, P., Cvitaš, T. and Seiler, W. (Editors), Transport and Transformation of Pollutants in the Troposphere.
Academic Publishing, Den Haag, pp. 689-693.
Shen, J., Parker, A. and Riverson, J., 2005. A new approach for a Windows-based watershed modeling
system based on a database-supporting architecture. Environmental Modelling & Software, 20,
1127-1138.
Sirois, A. and Bottenheim, J.W., 1995. Use of backward trajectories to interpret the 5-year record of
PAN and O3 ambient air concentrations at Kejimkujik National Park, Nova Scotia. Journal of
Geophysical Research, 100, 2867-2881.
Stohl, A., 1996. Trajectory statistics - a new method to establish source-receptor relationships of air
pollutants and its application to the transport of particulate sulfate in Europe. Atmospheric
Environment, 30(4), 579-587.
Stohl, A., 1998. Computation, accuracy and applications of trajectories - a review and bibliography.
Atmospheric Environment, 32(6), 947-966.
Symeonidis, P., Ziomas, I. and Proyou, A., 2004. Development of an emission inventory system from
transport in Greece. Environmental Modelling & Software, 19, 413-421.
Vasconcelos, L.A.P., Kahl, J.D.W., Liu, D., Macias, E.S. and White, W.H., 1996. Spatial resolution of a
transport inversion technique. Journal of Geophysical Research, 101, 19337-19342.
Wang, Y.Q., Zhang, X.Y. and Arimoto, R., 2006. The Contribution from Distant Dust Sources to the
Atmospheric Particulate Matter Loadings at XiAn, China during Spring. Science of the Total
Environment, 368(2-3), 875-883.
Wang, Y.Q., Zhang, X.Y., Arimoto, R., Cao, J.J. and Shen, Z.X., 2004. The transport pathways and
sources of PM10 pollution in Beijing during spring 2001, 2002 and 2003. Geophysical
Research Letters, 31(14), L14110, doi:10.1029/2004GL019732.
Wang, Y.Q., Zhang, X.Y., Arimoto, R., Cao, J.J. and Shen, Z.X., 2005. Characteristics of carbonate
content and carbon and oxygen isotopic composition of Northern China soil and dust aerosol
and its application to tracing dust sources. Atmospheric Environment, 39, 2631-2642.
Wang, Y.Q., Zhang, X.Y., Gong, S.L., Zhou, C.H., Liu, H.L., Niu, T. and Yang, Y.Q., 2008. Surface
observation of sand and dust storm in East Asia and its application in CUACE/Dust.
Atmospheric Chemistry and Physics, 8, 545-553.
Ward, J.H., 1963. Hierarchical grouping to optimize an objective function. Journal of the American
Statistical Association, 58, 236-244.
Zhang, X.Y., Gong, S.L., Zhao, T.L., Arimoto, R., Wang, Y.Q. and Zhou, Z.J., 2003. Sources of Asian
dust and role of climate change versus desertification in Asian dust emission. Geophysical
Research Letters, 30(24), 2272, doi:10.1029/2003GL018206.
Zhang, X.Y., Zhang, G.Y., Zhu, G.H., Zhang, D.E., An, Z.S., Chen, T. and Huang, X.P., 1996. Elemental
tracers for Chinese source dust. Science in China Series D, 39(5), 512-521.
Figure Captions
Figure 2. The spatial distribution, 2-D air pressure profile, and 3-D view of the
backward trajectories from Jiuquan.
Figure 3. Trajectories associated with PM10 concentrations larger than 150 µg m⁻³.
Figure 4. Cluster-mean back trajectories and cluster statistical results for Jiuquan
using a final cluster number of 5.
Figure 1. Structure of the TrajStat system (diagram not reproduced; recoverable box labels: Meteorological Data, HYSPLIT Model).
Figure 5. PSCF map of dust sources contributing to high values at Jiuquan.
Figure 6. CWT map of dust sources contributing to high values at Jiuquan.