Atmospheric Environment
journal homepage: www.elsevier.com/locate/atmosenv
ARTICLE INFO

Keywords: Particulate matter; Space-time analysis; Remote sensing; Aerosol optical depth; Bayesian Maximum Entropy; Geographically weighted regression; Meteorological fields

ABSTRACT

With rapid economic development, industrialization and urbanization, the ambient air PM2.5 has become a major pollutant linked to respiratory, heart and lung diseases. In China, PM2.5 pollution constitutes an extreme environmental and social problem of widespread public concern. In this work we estimate ground-level PM2.5 from satellite-derived aerosol optical depth (AOD), topography data, meteorological data, and pollutant emission using an integrative technique. In particular, Geographically Weighted Regression (GWR) analysis was combined with Bayesian Maximum Entropy (BME) theory to assess the spatiotemporal characteristics of PM2.5 exposure in a large region of China and generate informative PM2.5 space-time predictions (estimates). It was found that, due to its integrative character, the combined BME-GWR method offers certain improvements in the space-time prediction of PM2.5 concentrations over China compared to previous techniques. The combined BME-GWR technique generated realistic maps of space-time PM2.5 distribution, and its performance was superior to that of seven previous studies of satellite-derived PM2.5 concentrations in China in terms of prediction accuracy. The purely spatial GWR model can only be used at a fixed time, whereas the integrative BME-GWR approach accounts for cross space-time dependencies and can predict PM2.5 concentrations in the composite space-time domain. The 10-fold results of BME-GWR modeling (R² = 0.883, RMSE = 11.39 μg/m³) demonstrated a high level of space-time PM2.5 prediction (estimation) accuracy over China, revealing a definite trend of severe PM2.5 levels from the northern coast toward inland China (Nov 2015–Feb 2016). Future work should focus on the addition of higher resolution AOD data, developing better satellite-based prediction models, and related air pollutants for space-time PM2.5 prediction purposes.
∗ Corresponding author. Institute of Islands and Coastal Ecosystems, Ocean College, Zhejiang University, Zhoushan, China.
E-mail address: gchristakos@zju.edu.cn (G. Christakos).
https://doi.org/10.1016/j.atmosenv.2017.10.062
Received 11 June 2017; Received in revised form 19 October 2017; Accepted 29 October 2017
Available online 21 November 2017
1352-2310/ © 2017 Elsevier Ltd. All rights reserved.
L. Xiao et al. Atmospheric Environment 173 (2018) 295–305
et al. (2001) have used Bayesian maximum entropy (BME) theory to represent and predict spatiotemporal particulate matter distributions in North Carolina and California, USA. Wang and Christopher (2003) used linear regression models, whereas Liu et al. (2004) proposed a Chemical Transport Model (CTM). Later, Reid et al. (2015) and Donkelaar et al. (2011) also used the CTM. Lee et al. (2011) developed the day-specific Mixed-Effect Model (MEM), and Lee et al. (2012) used a space-time geostatistical kriging model to estimate long-term ambient PM2.5 exposure in the U.S.A. Liu et al. (2009) and Kloog et al. (2011) proposed a two-stage generalized additive model (GAM), and Ma et al. (2014) and Xiao et al. (2017) used Geographically Weighted Regression (GWR) tools. Hence, remote sensing techniques, spatial and temporal modeling, and statistical prediction theory have been employed, individually or in combination, in the quantitative assessment of air pollution and environmental health (Kim et al., 2015; Xiao et al., 2017).

In view of the above considerations, the objective of the present work is to introduce, and validate with real data, a new satellite-based technique of composite space-time modeling and estimation of PM2.5 concentrations in China during a four-month period: this is the combined (or integrative) Bayesian Maximum Entropy-Geographically Weighted Regression (BME-GWR) method. Advantages of the combined BME-GWR method include its rigorous consideration of the physical cross-space-time dependencies of pollutant distribution, the generation of pollutant predictions in a realistic space-time domain rather than in space and time separately, the inclusion of more general cases (non-linear and non-Gaussian predictors), and its ability to jointly incorporate different environmental predictors, including topography data (elevation) and meteorological data (wind speed, precipitation, temperature, relative humidity and pressure), as well as pollutant emission indicators (such as NO2, CO, land use, population, and road network information).

2. Data and method

2.1. Study area

The present study focuses on all of China, with the exception of the Xinjiang, Tibet, Qinghai, Mongolia and Heilongjiang provinces (Fig. 1), which were not included in the study because the PM2.5 monitoring sites in these areas are sparse (while these provinces cover a total area of 5.33 million km², they have only 89 monitoring sites, and this monitoring limitation would seriously affect pollutant estimation accuracy in provinces with serious pollution problems). On the other hand, the study area covers about 4.18 million km² and includes 1408 monitoring sites and 93% of the total population of China. For data processing and mapping purposes, the study area is covered with a grid consisting of 357,997 grid cells of 3 × 3 km² size (Xiao et al., 2017).

2.2. Data

2.2.1. Ground-level pollutant measurements
The 24h-averaged PM2.5, NO2, and CO concentrations at nationally-referenced monitoring stations during the period November 1, 2015 to February 29, 2016 were downloaded from the China Environmental Monitoring Center (CEMC, http://106.37.208.233:20035/). The observed PM2.5 concentrations, which served as the dependent variable of the pollutant space-time prediction (estimation) techniques used in this work, come from 1408 monitoring sites (Fig. 1) with a total of 3009 observations in the study area, together with 43 monitoring sites that were evenly distributed in provinces adjacent to the study area to avoid any edge effects. PM2.5, NO2, and CO concentrations less than 2 μg/m³ (5.6% of total records) were discarded since they are below the established detection limit (EPA, 2008). Also, stations where data were available for fewer than 15 days per month were removed, according to China National Ambient Air Quality Standards (CNAAQS). The 3009 observations were distributed over four months, i.e., there are 953, 707, 529 and 820 observations for the months of November, December, January and February, respectively. Daily PM2.5, NO2, and CO data were used to calculate the monthly average pollutant concentration at each site, and the monthly averages were obtained using the R programming language (R version 3.3.2, https://www.r-project.org/). Notice that most of the PM2.5 monitoring sites are clustered in urban areas (rural areas have little coverage in China).
Fig. 1. Study area. The green dots represent the 1408 PM2.5 monitoring sites within the study area and the 43 sites in neighboring provinces. (For interpretation of the references to colour
in this figure legend, the reader is referred to the web version of this article.)
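The screening and averaging steps of Section 2.2.1 can be illustrated with a minimal sketch. The authors report using R; the Python below is only an illustration, and the record layout `(station, 'YYYY-MM-DD', value)`, the function name `monthly_averages`, and the choice to apply the 15-day rule per station-month are assumptions that do not come from the paper.

```python
from collections import defaultdict
from statistics import mean

DETECTION_LIMIT = 2.0  # ug/m3, threshold stated in the text
MIN_DAYS = 15          # minimum valid days per month, as stated in the text


def monthly_averages(daily_records):
    """Return {(station, 'YYYY-MM'): mean} from (station, 'YYYY-MM-DD', value) tuples.

    Hypothetical helper: the record layout is an assumption for illustration.
    """
    by_station_month = defaultdict(dict)
    for station, date, value in daily_records:
        if value is None or value < DETECTION_LIMIT:
            continue  # discard readings below the detection limit
        by_station_month[(station, date[:7])][date] = value  # one value per day
    return {
        key: mean(days.values())
        for key, days in by_station_month.items()
        if len(days) >= MIN_DAYS  # assumption: drop station-months with too few days
    }
```

A call such as `monthly_averages(records)` then yields one monthly mean per retained station-month, which corresponds to the monthly-averaged observations used as hard data later in the paper.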
Fig. 3. Histogram and summary statistics of the GWR model variables for the four months during which PM2.5 monitoring took place (N = 3009 estimation points).
(Pop) were available from the Gridded Population of the World, Version 4 (GPWv4) (http://sedac.ciesin.columbia.edu/data/collection/gpw-v4). Elevations and populations were subsequently averaged at each 3 × 3 km² grid cell.

2.2.7. Data integration
For data integration purposes, at the data pre-processing stage: (a) All meteorological data at a 3 km scale were interpolated using a spatial interpolation software for climatic data (ANUSPLIN), whereas for the NO2 and CO data at the 3 km scale (i.e., matching the AOD grid size) the standard inverse distance weighted (IDW) interpolation software (ArcGIS 10.3) was used. (b) All these data were integrated into records appropriate for model fitting, validation and mapping purposes. (c) The collected data were re-projected onto the Asia Lambert Conformal Conic coordinate system. (d) Two square grids with 3 km and 30 km spatial resolution were constructed, which consisted of 357,997 and 3582 grid cells, respectively.

The 30 × 30 km² grid data served as the soft information for BME-GWR modeling purposes (for more details on the term “soft information”, i.e., secondary information bases of various levels of uncertainty, see sub-section 2.3.2 below). On the other hand, the 3 × 3 km² grid (AOD, NO2, CO and meteorological) data were used, together with the 3 × 3 km² grid (land-cover, road length, elevation and population) data, to obtain the complete 3 km resolution dataset. The PM2.5 value at the center of each cell of the 30 × 30 km² grid was calculated as the average of the values on the 3 × 3 km² grid nodes that fall within the 30 × 30 km² grid cell. Finally, the complete PM2.5 sites dataset was extracted based on all 3 × 3 km² grid data in ArcGIS 10.3 (see, also, Xiao et al., 2017). Note that, due to the large number and variety of environmental variables, they were standardized by using the z-score method following multiple linear regression (MLR). The purpose of MLR is to select the significant variables and eliminate any collinearity between variables. MLR is the most common form of linear regression analysis. As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables. Each independent variable is given a computed VIF and Tolerance value. When the VIF value is large (> 10, for example) and Tolerance < 0.1, collinearity is a problem and the offending variables should be removed from the model.

2.3. Methods

2.3.1. The GWR model
Ordinary least squares (OLS) is a statistical technique for estimating the unknown parameters of a linear regression model subject to the condition of minimizing the sum of squares of the differences between the observed responses (values of the variable being predicted) in the given dataset vs. those predicted by a linear function of a set of
Fig. 4. The spatiotemporal empirical covariance (in (μg/m³)²) and the fitted theoretical model used by the BME-GWR technique.
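For readers who want to evaluate the fitted theoretical model plotted in Fig. 4, the following is a minimal sketch of the form given in Section 3.2 as Eq. (4): a product of spherical models in space and time plus a space-time exponential term. The parameter values are the ones reported in the text. Setting the spherical term to zero beyond its range is the usual convention for spherical models and is an assumption here, as are the function names.

```python
import math

# Parameter values as reported in the text for Eq. (4).
C1, C2 = 0.8, 0.2
A_S1, A_S2 = 1.0, 14.0  # spatial ranges
A_T1, A_T2 = 3.8, 13.0  # temporal ranges (months)


def spherical(lag, rng):
    """Unit-sill spherical model; assumed to be zero at and beyond the range."""
    if lag >= rng:
        return 0.0
    r = lag / rng
    return 1.0 - 1.5 * r + 0.5 * r ** 3


def cov_pm25(h, tau):
    """Nested model: spherical(space) * spherical(time) + space-time exponential."""
    nested1 = C1 * spherical(h, A_S1) * spherical(tau, A_T1)
    nested2 = C2 * math.exp(-3.0 * (h / A_S2 + tau / A_T2))
    return nested1 + nested2
```

Evaluating `cov_pm25` on a grid of spatial lags `h` and temporal lags `tau` reproduces the qualitative behavior described in Section 3.2: a fast drop along the spatial axis and a slow decline along the time axis.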
2.3.2. The BME technique
Bayesian Maximum Entropy (BME, Christakos, 1990, 2000) is a spatiotemporal modeling and prediction theory with very general features (e.g., it provides non-linear estimators and allows non-Gaussian probability laws, and it incorporates information from many different sources, as well as core and site-specific knowledge bases, KB). Its implementation is made possible in practice by various software libraries, like the one used here, namely, the Spatiotemporal Epistemic Knowledge Synthesis Graphical User Interface software library (SEKS-GUI, Yu et al., 2007). The basic set of BME equations of space-time PM2.5 estimation used in the present study is (Christakos, 2000, 2010),

$$\int dG\,(g - \bar{g})\,e^{\mu \cdot g} = 0, \qquad \int dS\,e^{\mu \cdot g} - A\,f_{PM2.5} = 0 \tag{3a-b}$$

where g is a vector of functions expressing mathematically the available core G-KB, including theoretical space-time covariance models, population exposure laws, and scientific theories, including the GWR model; $\bar{g}$ denotes the mean value of g; S denotes the available site-specific KB about the pollutant in the specific study region as described earlier (S-KB may include AOD, meteorological monitoring data, road network information, land-use data, DEM and population); μ is a vector of coefficients representing the relative importance of each g-function (μ⋅g denotes the inner product of the vectors g and μ, which are both functions of space-time); fPM2.5 is the probability law of the PM2.5 distribution in space-time, where the distribution is considered as a random field model (Christakos, 2017); and A is a normalization parameter. Technically, S-KB may represent both hard and soft data: hard data include site-specific measurements with negligible or no uncertainty associated with them, whereas soft data include site-specific information in the form of uncertain observations, auxiliary variables, and probabilistic assessments (e.g., intervals of possible values and probability distribution functions of any shape; He and Kolovos, 2017). This allows BME to rigorously integrate any non-Gaussian soft data, such as soft data with a truncated Gaussian distribution (Reyes and Serre, 2014).

Eq (3a-b) can be solved with respect to the PM2.5 probability law fPM2.5 at all pollutant mapping points (i.e., space-time points at which predictions of the PM2.5 concentrations are sought). Also, it has been proven in theory that Kriging is a special case of BME under limiting conditions (linear estimation, Gaussian distribution, and only hard site-specific data considered) (Christakos, 2000). More technical details concerning the BME approach above can be found in the relevant literature. Software libraries have been developed dealing with the solution of Eq (3a-b) in real-world conditions, including BMElib, SEKS-GUI, QuantumBME, and StarBME (Yu et al., 2007).

2.3.3. The combined (integrative) BME-GWR technique
In this work, we developed a combined (integrative) BME-GWR technique to estimate space-time PM2.5 concentrations in the most economically developed area of China. The role of GWR in this integration was to generate soft data. The monthly GWR models involve different variables after they are filtered by OLS (Table 2), where Eq (2) describes the general multi-variable structure of monthly GWR. The role of BME is to use these soft data together with hard data to produce space-time maps of PM2.5 distribution during four months (3 × 3 km² resolution).

Specifically, the hard data that served as input to the BME-GWR technique included the measured PM2.5 concentration values at monitoring stations for all eligible station-days during the period November 1, 2015 to February 28, 2016 (all predictions are about monthly-averaged data). On the other hand, the soft data generated by the GWR model and used in the GWR-BME technique consisted of PM2.5 estimates and an associated variance at the center of each 30 × 30 km² grid cell. The soft data obtained by the GWR model (in the form of probability distributions with mean and variance estimated by GWR) serve as useful auxiliary information that can improve the accuracy of the predictions generated by BME at the unsampled points of the space-time grid. Note that four separate GWR models were fitted, one for each month, and the BME-GWR method was applied for the entire four-month period. An outline of the combined BME-GWR technique is shown in Fig. 2.

2.3.4. 10-Fold cross-validation between spatiotemporal estimates and ground observations
In order to assess the performance of the combined BME-GWR technique, the coefficient of determination (R²), the mean prediction error (MPE), the mean error (ME), the mean absolute prediction error (MAE), and the root mean squared prediction error (RMSE) were the accuracy indicators calculated in terms of the differences between the space-time PM2.5 predictions generated by the combined BME-GWR technique and the ground-level PM2.5 observations at a set of control points.

Since the sample-based, 10-fold cross validation (CV) procedure (Diego Rodriguez et al., 2010; Lee et al., 2011; Hu et al., 2014; Ma et al., 2014) has been more widely tested in previous PM2.5-AOD modeling studies than other procedures (such as, e.g., the site-based CV technique, Chang et al., 2014), in this study we also chose the 10-fold CV method to test the potential over-fitting of the BME-GWR technique. Previous GWR studies that used the 10-fold cross validation method include Hu et al. (2013), Ma et al. (2014), Fang et al. (2016), You et al. (2016b), and Zou et al. (2016). In the present study, the data set was divided into k = 10 folds, a classifier was learned using k − 1 = 9 folds, and an error value was calculated by testing the classifier on the remaining fold. Then, the k-fold cross-validation classification (k-cv) error estimator was calculated as the average value of the errors committed in each fold. Note that the k-cv error estimator depends on two factors: the training set and the partition into folds (Diego Rodriguez et al., 2010).
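The 10-fold procedure of Section 2.3.4 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the stand-in model `fit_line` is a simple least-squares line rather than the BME-GWR estimator, errors are pooled over all held-out samples rather than averaged per fold, R² is computed as 1 − SSres/SStot, ME is taken as the mean of (prediction − observation), and MPE is omitted because its exact definition is not given in the text. All function names are hypothetical.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, k=10, seed=0):
    """Return one out-of-fold prediction per sample.

    `fit(train_xs, train_ys)` must return a callable predictor.
    """
    preds = [None] * len(ys)
    for fold in k_fold_indices(len(ys), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(ys)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])
    return preds


def accuracy(obs, pred):
    """R2 (1 - SSres/SStot), RMSE, MAE and ME (mean of pred - obs)."""
    n = len(obs)
    err = [p - o for o, p in zip(obs, pred)]
    obs_mean = sum(obs) / n
    ss_res = sum(e * e for e in err)
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in err) / n,
        "ME": sum(err) / n,
    }


def fit_line(train_xs, train_ys):
    """Stand-in model: least-squares line (NOT the BME-GWR estimator)."""
    n = len(train_xs)
    mx, my = sum(train_xs) / n, sum(train_ys) / n
    sxx = sum((x - mx) ** 2 for x in train_xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(train_xs, train_ys)) / sxx
    return lambda x: my + slope * (x - mx)
```

Calling `accuracy(ys, cross_validate(xs, ys, fit_line))` returns the indicators for the stand-in model; replacing `fit_line` with any other fitting routine that returns a predictor leaves the validation loop unchanged.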
Fig. 5. Scatter plots of model fitting and validation results. The solid line denotes the trend line: (a) CV results for the BME-GWR technique (N = 3009 estimation points, t = 4 months); (b)–(e) GWR model fitting results during the four months considered.
3. Results

3.1. Descriptive statistics

Various simulations of the 16-variable model were initially generated using the GWR and OLS techniques. The histograms and summary statistics of GWR variable fitting for the four months (Nov–Dec 2015, Jan–Feb 2016) are plotted in Fig. 3. These plots show that all the variables are roughly lognormally and unimodally distributed. The geometric mean, standard deviation, maximum, and minimum for all variables for all days are also presented in Fig. 3. The mean PM2.5 concentration over all monitoring sites is 63.75 μg/m³, and the overall mean of the MODIS generated AOD is 0.41.

On the basis of the OLS results, the Koenker (BP) Statistic tests were found to be statistically significant (p < 0.01) during the four months considered in the present study, indicating that there is a non-stationary relationship between dependent variables and independent variables, that is, the local sublinearity between environmental variables and PM2.5 concentrations in a wide geographic area can be better interpreted by the GWR model. This result is consistent with previous studies (Hu et al., 2013; Song et al., 2014; Zou et al., 2016).

Table 1 provides a summary of the parameters of the fitted GWR and OLS models. The overall mean adjusted R² = 0.75 (GWR model) is greater than the overall mean adjusted R² = 0.67 (OLS model). This result confirms the superiority of the GWR model over the OLS model in simulating site-based PM2.5 concentrations. These findings are consistent with previous GWR applications (Hu et al., 2013; Zou et al., 2016). The GWR result for December was the most accurate among the GWR results for the four months considered, with the highest R² (= 0.828) and the lowest RMSE (= 12.25 μg/m³). As regards the OLS results, the model with the highest R² (= 0.77) corresponds to December. Different monthly OLS models have different variables, and the same variable has different relationships (positive or negative) with PM2.5. For example, the AOD coefficients for December, November, January, and February are, respectively, −3.52, 2.12, 0.58, and 5.96.

3.2. Covariance model fitting

The covariance function (Kolovos et al., 2004; Ma, 2008) is a statistical tool that offers information about the variation of PM2.5 concentrations across space-time, when PM2.5 is mathematically represented by a spatiotemporal random field (Christakos, 2017). Concerning the empirical space-time covariance presented in Fig. 4, we noticed that the PM2.5 covariance used by the combined BME-GWR technique had a linear shape at the space-time origin (in random field modeling terms, this covariance shape is interpreted as characterizing considerable PM2.5 concentration changes both in space and in time), followed by a sharp PM2.5 covariance drop along space, and a very slow decline along the time axis (this dual behavior indicates a short PM2.5 correlation range in space and a long PM2.5 correlation range in time). Given that four months are considered in this study, the temporal correlation of adjacent months is strong.

In light of these empirical PM2.5 covariance features, the following theoretical covariance model was fitted to the empirical one of Fig. 4,

$$c_{PM2.5}(h,\tau) = c_1\left(1-\frac{3h}{2a_{s1}}+\frac{h^3}{2a_{s1}^3}\right)\left(1-\frac{3\tau}{2a_{t1}}+\frac{\tau^3}{2a_{t1}^3}\right) + c_2\,e^{-3\left(\frac{h}{a_{s2}}+\frac{\tau}{a_{t2}}\right)} \tag{4}$$

where c1 = 0.8, c2 = 0.2, a_s1 = 1, a_s2 = 14 (in °C), and a_t1 = 3.8, a_t2 = 13 (in months). Using the above covariance model, we obtained BME estimates of PM2.5 that are representative of the actual distribution.

3.3. Cross validation results

Fig. 5 and Table 2 show the cross-validation (CV) results of the BME-GWR estimation technique. The accuracy indicators of the technique, including RMSE, MPE, MAE, and ME, are listed in Table 2. For model validation purposes, the R², RMSE, MPE, MAE and ME values of the BME-GWR technique were 0.883, 11.39, −0.067, 7.90, and −1.065 μg/m³, respectively.

Table 2
10-fold cross-validation results of space-time BME-GWR estimation of monthly PM2.5 averages (k = 1408 monitoring stations, N = 3009 estimated points, t = 4 months).

Method     R²      RMSE (μg/m³)   MPE (μg/m³)   MAE (μg/m³)   ME (μg/m³)
BME-GWR    0.883   11.39          −0.067        7.90          −1.065

3.4. Spatial distributions of PM2.5 predictions

Fig. 6 presents ground-level PM2.5 measurements and the corresponding monthly-averaged PM2.5 predictions using the space-time BME-GWR technique. The first observation is that the clustering of most PM2.5 monitoring sites in urban areas (rural areas have little coverage in China) may impact the performance of a space-time prediction technique that covers all of China. Provinces (such as Xinjiang, Tibet, Qinghai, Mongolia and Heilongjiang) with sparse and unevenly distributed PM2.5 monitoring sites were removed from the present study. Also, the considerable number of missing PM2.5 data in the Tibet and Xinjiang provinces can lead to substantial errors in PM2.5 concentration prediction (Ma et al., 2014; Fang et al., 2016). These provinces, also, were not included in the present study. The second observation is that, while the Beijing, Tianjin and Hebei provinces were always highly polluted areas, in southern provinces the air quality has been much better. The BME-GWR technique provided considerably detailed maps of the space-time pollutant distribution due to its incorporation of soft data.

As the maps of Fig. 6 reveal, the PM2.5 concentrations were high in the northern part of the study area, especially in the Beijing, Tianjin, Hebei and Shandong provinces. The spatial gradients of PM2.5 concentrations showed a significant change during the four months considered, and the overall PM2.5 concentration trend was high in the north and low in the south part of the study region. The temporal PM2.5 concentrations in the study region showed a clear monthly variation, and there was a PM2.5 concentration trend from the northern coast toward inland during the period November 2015 to February 2016. A possible explanation may be the monsoon weather conditions that caused the dispersion of the coastal PM2.5 pollution towards inland China (Ainslie et al., 2008). The average PM2.5 concentration during December 2015 (reaching 76.1 μg/m³) was significantly higher than that during the other three months, while the most widely distributed PM2.5 concentrations were observed during January 2016 (with spatially distributed values that exceeded the 85 μg/m³ level).

4. Discussion

From a methodological perspective, this work presented a combined BME-GWR technique of space-time pollutant modeling and mapping. In this setting, the role of GWR was to generate “soft” data, and that of BME to use these data together with the available “hard” data to produce space-time maps of PM2.5 distribution over China during a four-month period. In particular:

(a) The BME approach is able to integrate general or core KBs (theoretical covariance models, trend functions, physical laws, and scientific relationships) about the attribute of interest with site-specific KBs (hard or accurate data, such as monthly-averaged PM2.5 concentrations, and soft data in the form of uncertain measurements, probabilistic assessments and auxiliary information) to generate space-time PM2.5 prediction maps and the associated prediction accuracy maps.
Table 3
Summary of PM2.5 results from previous GWR, improved GWR, and Bayesian models in China.
(b) The GWR model, on the other hand, explores spatial heterogeneity by means of local regression. Instead of estimating global parameter values, by estimating the parameters at each location GWR generates a continuous surface of spatially varying parameter values. Also, GWR can assess the influence of independent variables on dependent variables in terms of location changes and spatial heterogeneity of the relationship between an independent and a dependent variable.

As regards (b), many earlier studies have used various versions of GWR-based models, as well as Bayesian models. Each one of these studies exhibited different characteristics because of the different conditions, PM2.5 concentration levels, and geological and meteorological environments considered (Table 3). By accounting for cross space-time correlations, allowing more general assumptions (non-linear and non-Gaussian predictors), and incorporating a sufficient number of key environmental factors (capable of explaining a large proportion of the PM2.5 concentration variance), the model used in this work performed better than the earlier ones, because of its advantages as outlined in the Introduction section and in other parts of the paper. In particular, it provided the best 10-fold cross-validation results (R² = 0.883, RMSE = 11.39).

As regards (a), it was shown that the combination of the BME technique with the GWR model offers certain improvements in the space-time prediction (estimation) of PM2.5 concentrations over China. In particular, we compared the combined BME-GWR technique proposed in this work with seven previous studies of satellite-derived PM2.5 concentrations in China. The combined BME-GWR technique generated realistic maps of space-time PM2.5 distribution, and its performance was superior to these techniques in terms of prediction accuracy.

Other important differences and advantages of BME-GWR over the previous techniques include the following. First, due to sparse sites and AOD missing data, Ma et al. (2014) obtained higher PM2.5 concentration estimates in rural regions and slightly lower concentration estimates in the Sichuan basin. The study by Fang et al. (2016) was characterized by a considerable number of missing PM2.5 data in the Tibet and Xinjiang provinces, which generated substantial errors in the PM2.5 concentration estimates they derived. Second, in the present study we collected and processed more data of environmental factors than in previous studies, such as 3 km AOD, pollutant emissions, meteorological data and land use. Furthermore, for the first time, we incorporated a GWR model into the BME estimation technique in a high spatial/temporal resolution setting. The calculated accuracy indicators of the combined GWR-BME technique showed a considerable improvement compared to the existing techniques. As was mentioned throughout the paper, the limited number of monitoring sites is a limitation of this work, as well as the lack of information about other environmental variables. When more monitoring stations are established in certain rural areas, the BME-GWR can be used to improve the study of air pollution in these areas. We also hope that data about more environmental variables, such as wind directions, will become available in this case.

The findings of this study are useful for exposure assessment and health risk management purposes, as well as for air pollution control strategies and environmental protection related studies. Accordingly, future work on the combined GWR-BME technique should consider (i) adding higher resolution data sources, such as 1 km AOD data (Lin et al., 2015a; Wu et al., 2016a,b), (ii) focusing on developing satellite-based models for the prediction of historical PM2.5 and other air pollutants, and (iii) visualizing exposure estimates in ArcEngine 10 in order to properly assess the long-term effects of PM2.5 exposure on human health and facilitate better prevention and control of PM2.5 exposure in China.

Acknowledgments

This research was supported by the National Science Foundation of China (Grant No. 41671399).

References

Ainslie, B., Steyn, D.G., Su, J., Buzzelli, M., Brauer, M., Larson, T., Rucker, M., 2008. A source area model incorporating simplified atmospheric dispersion and advection at fine scale for population air pollutant exposure assessment. Atmos. Environ. 42, 2394–2404.
Bagheri, N., Holt, A., Benwell, G.L., 2009. Using geographically weighted regression to validate approaches for modelling accessibility to primary health care. Appl. Spatial Anal. Pol. 2, 177.
Bekara, M., Hafidi, B., Fleury, G., Ieee, 2005. Smoothing Parameter Selection in Nonparametric Regression Using an Improved Kullback Information Criterion.
Brook, R.D., Rajagopalan, S., Pope, C.A., Brook, J.R., Bhatnagar, A., Diez-Roux, A.V., Holguin, F., Hong, Y., Luepker, R.V., Mittleman, M.A., Peters, A., Siscovick, D., Smith, S.C., Whitsel, L., Kaufman, J.D., 2010. Particulate matter air pollution and cardiovascular disease an update to the scientific statement from the American Heart Association. Circulation 121, 2331–2378.
Chang, H.H., Hu, X., Liu, Y., 2014. Calibrating MODIS aerosol optical depth for predicting daily PM2.5 concentrations via statistical downscaling. J. Expo. Sci. Environ. Epidemiol. 24, 398–404.
Christakos, G., 1990. A Bayesian maximum entropy view to the spatial estimation problem. Math. Geol. 22, 763–777.
Christakos, G., 2000. Modern Spatiotemporal Geostatistics. Oxford University Press, New York.
Christakos, G., 2010. Integrative Problem-Solving in a Time of Decadence. Springer, New York.
Christakos, G., 2017. Spatiotemporal Random Fields: Theory and Applications. Elsevier, Cambridge, MA.
Christakos, G., Hristopulos, D.T., 1998. Spatiotemporal Environmental Health Modelling: a Tractatus Stochasticus. Kluwer Academic Publ., Boston, MA.
Christakos, G., Serre, M.L., 2000. BME analysis of spatiotemporal particulate matter distributions in North Carolina. Atmos. Environ. 34 (20), 3393–3406.
Christakos, G., Serre, M.L., Kovitz, J., 2001. BME representation of particulate matter distributions in the state of California on the basis of uncertain measurements. J. Geophys. Res. 106 (D9), 9717–9731.
Chu, Y., Liu, Y., Li, X., Liu, Z., Lu, H., Lu, Y., Mao, Z., Chen, X., Li, N., Ren, M., Liu, F., Tian, L., Zhu, Z., Xiang, H., 2016. A review on predicting ground PM2.5 concentration using satellite aerosol optical depth. Atmosphere 7 (10), 129. http://dx.doi.org/10.3390/atmos7100129.
Dhondt, S., Quynh Le, X., Hieu Vu, V., Luc, H., 2011. Environmental health impacts of mobility and transport in Hai Phong, Vietnam. Stoch. Environ. Res. Risk Assess. 25,