
Focus Soft Computing 7 (2003) 370–385 © Springer-Verlag 2003

DOI 10.1007/s00500-002-0226-2

Automatic fuzzy-rule assessment and its application to the modelling of nitrogen leaching for large regions
A. Bárdossy, U. Haberlandt, V. Krysanova

Abstract Diffuse nutrient emissions from agricultural land are one of the major sources of pollution for groundwater, rivers and coastal waters. The quantification of pollutant loads requires mathematical modelling of water and nutrient cycles. The deterministic simulation of nitrogen dynamics, represented by complicated highly non-linear processes, requires the application of detailed models with many parameters and large associated data bases. The operation of those models within integrated assessment tools or decision support systems for large regions is often not feasible. Fuzzy rule based modelling provides a fast, transparent and parameter parsimonious alternative. Besides, it allows regionalisation and integration of results from different models and measurements at a higher generalised level and enables explicit consideration of expert knowledge. In this paper an algorithm for the assessment of fuzzy rules for fuzzy modelling using simulated annealing is presented. The fuzzy rule system is applied to simulate nitrogen leaching for selected agricultural soils within the 23,687 km² Saale River Basin. The fuzzy rules are defined and calibrated using results from simulation experiments carried out with the deterministic modelling system SWIM. Monthly aggregated time series of simulated water balance components (e.g. percolation and evapotranspiration), fertilization amounts, resulting nitrogen leaching and crop parameters are used for the derivation of the fuzzy rules. The 30-year simulation period was divided into 20 years for training and 10 years for validation, with the latter taken from the middle part of the period. Three specific fuzzy rule systems were created from the simulation experiments, one for each selected soil profile. Each rule system includes 15 rules as well as one prescribed rule from expert knowledge and 7 input variables. The performance of the fuzzy rule system is satisfactory for the assessment of nitrate leaching on annual to long term time steps. The approach allows rapid scenario analysis for large regions and has the potential to become part of decision support systems for generalised integrated assessment of water and nutrients in macroscale regions.

Keywords Fuzzy rules, Environmental modelling, Simulated annealing

A. Bárdossy (✉)
Universität Stuttgart
Pfaffenwaldring 61, 70550 Stuttgart, Germany
E-mail: bardossy@iws.uni-stuttgart.de

U. Haberlandt, V. Krysanova
Potsdam Institut für Klimafolgenforschung

1
Introduction
Environmental problems are often related to complicated physical and chemical processes. Their quantitative modelling requires a great number of parameters, which can be determined under laboratory circumstances for selected areas. Environmental decisions, however, require the prediction of large scale effects. The parameters for the modelling on the large scale are usually not available. Furthermore, the computational effort for using detailed models is enormous.

Nitrogen (N) leaching to groundwater in rural landscapes represents one of the most significant sources of water pollution in Europe. Therefore quantification of the effects of natural conditions (soil, climate, and topography) and human impacts (land management practices, like fertilization rates and crop rotations) is very important for defining appropriate land management policies and building the corresponding Decision Support Systems (DSS). For deterministic simulation of nitrogen dynamics, complicated highly non-linear processes have to be considered, often requiring the application of detailed models with many parameters and large associated data bases. The operation of those models within integrated assessment tools or decision support systems for large regions is not always the optimal solution. Fuzzy rule based modelling provides a fast, transparent and parameter parsimonious alternative. Besides, it allows regionalisation and integration of results from different models and measurements at a higher generalised level and enables explicit consideration of expert knowledge. The rule systems can be derived from deterministic models, and can be used subsequently as a replacement of the complicated physically based model. Their use is advantageous as fuzzy rules are case dependent, and use only those arguments which are essential for the given situation. Further, they can describe very complicated non-linear, non-monotonic relationships between variables. Once a rule system is assessed, its application is extremely fast and robust.

Fuzzy rule assessment from experimental data and/or expert knowledge is a central research topic. In their paper [1] Pedrycz and Reformat suggest a fuzzy partitioning algorithm for the description of rule systems. Neuro-fuzzy approaches are also used for this purpose [2]. In [3] a learning algorithm using genetic optimisation is presented.
In this paper a simulated annealing based algorithm is used to identify rules in a rule system. The rule performance, defined as a distance between observations and rule response, is optimised.

The paper is organized as follows: after the introduction a methodology for the assessment of fuzzy rules is presented. Sections 3 and 4 deal with modelling of nitrogen leaching to groundwater from arable land using
(1) the process-based deterministic ecohydrological model SWIM [16], and
(2) a fuzzy rule based model, which uses the outputs from SWIM as the training and validation sets.
A section with discussion and conclusions completes the paper.

2
Fuzzy rules and their assessment
The most popular fuzzy direct reasoning technique is Mamdani's method [4]. It is composed of direct IF-THEN statements with fuzzy premises and fuzzy conclusions. It is widely applied in fuzzy control [5]; however, its use in fuzzy modelling is still restricted to a few specific case studies [6]. Mamdani's method is very well suited to describe the case dependent nature of complex environmental systems.

The most important task for the application of a fuzzy rule system is the assessment of the suitable rules. For the present study rules are derived from datasets obtained from a limited number of runs of a complex environmental model.

Given a dataset $\mathcal{T}$ the goal is to describe the relationship between the variables $x_1, \ldots, x_J$ and $y$ using fuzzy rules.

$$\mathcal{T} = \{ (x_1(t), \ldots, x_J(t), y(t)) ;\ t = 1, \ldots, T \} \tag{1}$$

The rule system consisting of $I$ rules should deliver results (after a combination and defuzzification) such that the rule response $R$ for each $t$ should be close to the observed value:

$$R(x_1(t), \ldots, x_J(t)) \approx y(t) \tag{2}$$

The fuzzy rules are formulated using predefined fuzzy sets for each variable $j$: $\{A_{j,1}, \ldots, A_{j,K_j}\}$. The fuzzy rules then are described in the form:

$$\text{If } x_1 \text{ is } A_{1,k_{i,1}} \text{ and } \ldots\ x_J \text{ is } A_{J,k_{i,J}} \text{ then } y \text{ is } B_{l_i} \tag{3}$$

Here $i$ is the index of the rule.

2.1
Definition of the rule system
The rule system can thus be described in the form of a matrix consisting of natural numbers $k_{i,j}$

$$R = \begin{pmatrix} k_{1,1} & \cdots & k_{1,J} & l_1 \\ \vdots & \ddots & \vdots & \vdots \\ k_{I,1} & \cdots & k_{I,J} & l_I \end{pmatrix} \tag{4}$$

where

$$1 \le k_{i,j} \le K_j, \qquad 1 \le l_i \le L$$

with $K_j$ being the number of possible membership functions for rule argument $j$ and $L$ being the number of possible individual rule responses.

The goal is to find the best matrix $R$. It is assumed that the rules are applied with a product inference and a weighted linear combination of the results. This means that for each vector $(x_1, \ldots, x_J)$ the defuzzified response is calculated as:

$$\hat{y} = \frac{\sum_i \nu_i(x_1, \ldots, x_J)\, M(B_i)}{\sum_i \nu_i(x_1, \ldots, x_J)} = \frac{\sum_i \prod_j \mu_{A_{j,i}}(x_j)\, M(B_i)}{\sum_i \prod_j \mu_{A_{j,i}}(x_j)} \tag{5}$$

where $\nu_i$ is the degree of fulfilment (DOF) of rule $i$, $\mu_{A_{j,i}}$ is the membership function of argument $j$ in rule $i$, and $M$ denotes the center of gravity of a fuzzy set. These calculations are done for each element of the training set. Then the results are compared to the observed $y(t)$ values. The performance $P$ of the rule system is calculated using the observed and calculated values:

$$P = \sum_t F(\hat{y}(t), y(t)) \tag{6}$$

Typically $F$ can be chosen as an $\ell_p$ measure:

$$F(\hat{y}(t), y(t)) = |\hat{y}(t) - y(t)|^p \tag{7}$$

Other performance functions such as a likelihood type measure, a geometric distance or a performance related to proportional errors can also be formulated. Once one has a measure of performance, an automatic assessment of the rules can be established. This means that the goal is to find the rule matrix $R$ for which the performance is the best:

$$P(R) \to \min \tag{8}$$

The number of possible rules is $\prod_j K_j \cdot L$. This means that the number of possible rule matrices is:

$$\binom{\prod_j K_j \cdot L}{I}$$

which is usually a very large number. For example, in the case of a small rule system with $J = 3$ arguments with $K_j = 6$ possible membership functions for each of them and considering a rule response with 5 possibilities, the number of rule sets consisting of $I = 5$ rules is $\binom{6^3 \cdot 5}{5} \approx 1.2 \cdot 10^{13}$. Thus it is not possible to try out each rule combination to find the best. Therefore discrete optimisation methods have to be used to find "good" rule systems. Genetic algorithms or simulated annealing are possible candidates for this task.

2.2
Assessment of the rule system
The problem is to find the rule system $R$ with optimal performance $P(R)$. The method chosen here is simulated annealing using the Metropolis algorithm. The algorithm is as follows:
(1) The possible fuzzy sets for the arguments $A_{j,k}$ and the responses $B_l$ are defined.
(2) A rule system $R$ is generated at random.
(3) The performance of the rule system $P(R)$ is calculated.
(4) An initial annealing temperature $t_a$ is selected.
(5) An element of the rule system is picked at random. Suppose the index of this element is $(i, h)$.
(6) If $h \le J$, an index $1 \le h^* \le K_h$ is chosen at random and a new rule system $R^*$ with $k^*_{i,h} = h^*$ replacing $k_{i,h}$ is considered.
(7) If $h > J$, an index $1 \le h^* \le L$ is chosen at random and a new rule system $R^*$ with $l^*_i = h^*$ replacing $l_i$ is considered.
(8) In both cases the performance of the new rule system $P(R^*)$ is evaluated.
(9) If $P(R^*) < P(R)$, then $R^*$ replaces $R$ (positive changes).
(10) If $P(R^*) \ge P(R)$ then the quantity
$$p = \exp\left(\frac{P(R) - P(R^*)}{t_a}\right)$$
is calculated. With the probability $p$ the rule system $R^*$ replaces $R$ (negative changes).
(11) Steps 5–10 are repeated $NN$ times.
(12) The annealing temperature $t_a$ is reduced.
(13) Steps 11–12 are repeated until the proportion of positive changes becomes less than a threshold $\varepsilon > 0$.

The above algorithm yields a rule system with "optimal" performance. However, the rules obtained might reflect some specific features corresponding to a small number of cases in the data set. To avoid deriving rules from too few cases, the performance function is modified. The insufficient generality of a rule can be recognized from the number of cases for which it is applied. As an alternative, the degree of fulfilment (DOF) of the rules can also be considered. In order to ensure the transferability of the rules, the performance of the rule system is modified by taking the sum of the DOFs into account.

$$P'(R) = P(R) \prod_i \left[ 1 + \left( \frac{\nu' - \sum_t \nu_i(x_1(t), \ldots, x_J(t))}{\nu'} \right)_+ \right] \tag{9}$$

Here $(\cdot)_+$ is the positive part function:

$$x_+ = \begin{cases} x & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$

$\nu'$ is the desired lower limit for the applicability of the rules, in this case expressed by the sum of DOFs. If $P'$ is used in the optimisation procedure then rules which are based on a few cases and are seldom used are penalized. The degree of penalty depends on the grade to which the desired limit $\nu'$ exceeds the actual sum of DOFs for a selected rule.

2.3
Calculation of the changes
The evaluation of the performance $P$ (or $P'$) requires the calculation of the DOFs for each element of the training set. This means that for each rule argument the membership function has to be evaluated. Thus in total $T \cdot I \cdot J$ membership functions have to be evaluated. In the case of a big training set, when $T$ is large, this means a lot of computations. This can be reduced considerably if one only calculates the DOF of the changed rule $i$, as none of the others will be altered. In the case of rule responses they can be calculated directly.

2.4
Selection of the possible fuzzy sets for the rule arguments
The possible fuzzy sets for the rule arguments have to be fixed before the optimisation is done. Expert knowledge can be used for this purpose. Another alternative is the definition by using a uniform coverage of the possible ranges of the arguments.

One of the most important things in assessing rules is the recognition that not all arguments play an important role for each case. The role of arguments varies according to the actual state of the other arguments. This means that one has to consider among the selected possibilities for each argument the membership function $\mu(x) = 1$. This makes it possible to formulate rules using only some of the arguments. This is an advantage of the fuzzy rule systems, as, for example, a linear or polynomial representation always considers all arguments.

Partial knowledge can be incorporated into the rule system by fixing some elements of the matrix $R$. These fixed elements are not altered randomly in the algorithm. Thus rules can be fixed, or rules of given structure can be identified.

Missing data or fuzzy data in the training set can also be considered. For each missing value a membership function identical to 1 is chosen. For fuzzy data the corresponding membership function is considered. The DOF is evaluated using the fuzzy input as

$$\mu_{A_{i,k}}(\hat{x}_k) = \max_x \min\left( \mu_{A_{i,k}}(x),\ \mu_{\hat{x}_k}(x) \right) \tag{10}$$

This means that for missing data the membership 1 is assumed for each corresponding argument.

2.5
Choice of the parameters of the algorithm
In order to assess the rule system one has to select the parameters $t_a$, $NN$, $\nu'$ of the performance function and the algorithm. There are no specific guidelines but some practical hints can be given. The initial annealing temperature $t_a$ should be selected in such a way that at the beginning of the algorithm about 50–70% of the randomly generated changes are accepted (i.e. the rule matrix is modified). A very low $t_a$ leads to few accepted negative changes and risks ending in a local optimum. On the other hand, a too high $t_a$ means a fully random search at the beginning, thus a long computation before the number of changes is reduced below the acceptable limit $\varepsilon$.

The limit $\nu'$ for the sum of DOFs should be selected as a function of the number of observations $T$ and the number of rules. It reflects the minimum number of cases for which each rule has to be applied. Reasonable choices are around:

$$\nu' = 0.1\,\frac{T}{I}$$

The number of iterations $NN$ should be selected in a way that for each temperature enough evaluations are carried
out. The value should be proportional to the number of rules and the number of arguments + number of responses. Values around

$$NN = 5\, I\, (J + 1)$$

usually give reasonable results.

2.6
Validation of the rule systems
Rule systems are obtained using observed data, and thus have to be validated on independent observations. For this purpose two methods can be applied: split sampling and cross validation.

In the case of a sufficiently large sample one can divide the data into two sets: the training set $\mathcal{T}$ and the validation set $\mathcal{V}$. The rules are assessed from $\mathcal{T}$ and evaluated on $\mathcal{V}$. The real performance of the rule system is that on the validation set.

3
Modelling nitrogen leaching with the process-based model
For the selected application, first a training set $\mathcal{T}$ has to be found. As there are no representative measurements available for this large scale, a detailed process-based model can be used to generate the training set. Usually, the detailed dynamic models, due to their complexity and data requirements, are not best suited for large-scale applications aimed at policy support.

Simulation experiments described here were performed using the model SWIM for a restricted set of natural and land management conditions, which are representative for the Saale River basin (drainage area 23,687 km²), a tributary of the Elbe River. This section (a) gives a short description of SWIM in Sect. 3.1, (b) describes how the simulation experiments were designed in Sect. 3.2, and (c) analyses the results to be used in Sect. 4 as an input to the fuzzy rule based model (Sect. 3.3).

3.1
The process-based model
The modelling system SWIM (Soil and Water Integrated Model) includes as its kernel a continuous-time spatially-distributed model, which integrates hydrology, vegetation, and nutrient (nitrogen, N and phosphorus, P) and sediment fluxes at the river basin scale. In addition, it includes an interface to the Geographic Information System GRASS, which allows the extraction of spatially distributed parameters including elevation, land use, soil types, and the routing structure for the model initialisation. The model can be applied for mesoscale river basins (or regions) with an area up to 25,000 km², or, after validation in representative sub-basins, for regions of similar size.

The simulated hydrological system consists of four control volumes: the soil surface, the root zone, the shallow aquifer, and the deep aquifer. The soil column is subdivided into several layers in accordance with the soil database. The water balance for the soil column includes precipitation, surface runoff, evapotranspiration, percolation and subsurface runoff. The water balance for the shallow aquifer includes groundwater recharge, capillary rise to the soil profile, lateral flow, and percolation to the deep aquifer.

The module representing crops and natural vegetation is an important interface between hydrology and nutrients. A simplified EPIC approach [7] is included in SWIM for simulating arable crops (like wheat, barley, rye, maize, potatoes) and aggregated vegetation types (like 'pasture', 'evergreen forest', 'mixed forest'), using specific parameter values for each crop/vegetation type. Vegetation in the model affects the hydrological cycle by the cover-specific retention coefficient, which influences runoff, and, indirectly, the amount of evapotranspiration, which is simulated as a function of potential evapotranspiration and LAI.

The nitrogen and phosphorus modules include the following pools: nitrate nitrogen, active and stable organic nitrogen, organic nitrogen in the plant residue, labile phosphorus, active and stable mineral phosphorus, organic phosphorus, and phosphorus in the plant residue, and the flows: fertilization, input with precipitation, mineralisation, denitrification, plant uptake, leaching to groundwater, losses with surface runoff and erosion. Regarding the lateral transport, the runoff and leaching are more important for nitrogen than for phosphorus. The latter is mainly transported with erosion. The interaction between nutrient supply and vegetation is modelled by the plant consumption of nutrients and using nitrogen and phosphorus stress functions, which affect the plant growth.

Prior to this study SWIM was tested and validated sequentially for hydrology in several mesoscale river basins, for nitrogen dynamics, crop growth, and erosion. The validation of nitrogen dynamics was performed for two mesoscale sub-basins of the Elbe: the Stepenitz and the Zschopau [8] [9]. Before modelling nitrogen fluxes for the Saale basin, a successful hydrological validation for this basin (gauge Calbe-Grizehne located close to the river mouth) was completed.

3.2
Design of modelling experiments and data
The natural conditions incorporated in the simulation runs included four climates (represented by climate stations), three different soils (according to soil map BÜK-1000), and five elevation classes (represented by slope steepness). The land management conditions included three crop rotations and three fertilization schemes. Simulation runs were performed for the 30-year period 1961–1990 for all possible combinations of four climate zones, three soil classes, three rotation schemes, three fertilization schemes, and five elevation zones, which produced 4 × 3 × 3 × 3 × 5 = 540 time series with daily time step.

Agricultural land occupies about 70% of the Saale River basin. Four climate stations were chosen as representative for cropland in the region, based on a classification taking into account the long-term average annual precipitation and average temperature for the stations, and considering the availability of continuous data during the period 1961–1990. The four climate zones represented by climate stations are listed in Table 1. They are ordered according to the annual precipitation ranging from 460 to 732 mm. The altitude of the stations varies from 164 m a.m.s.l. in Artern
Table 1. Classification of natural conditions and agricultural land management in the Saale River basin used for the simulation experiments with the SWIM model

I Climate zones, represented by climate stations

| No | Station name | Station number | Altitude (m a.m.s.l.) | Average annual precipitation (mm) | Average annual temperature (°C) |
|---|---|---|---|---|---|
| 1 | Artern | 3402 | 164 | 460 | 8.64 |
| 2 | Erfurt | 4200 | 316 | 503 | 8.01 |
| 3 | Gera-Leumnitz | 4406 | 311 | 611 | 8.02 |
| 4 | Hof-Hohensaas | 4027 | 567 | 732 | 6.46 |

II Soil classes, represented by soil types (BÜK-1000)

| No | Name of soil type | Occurrence in arable land (%) (a) | Field capacity in root zone (Vol. %) | Saturated conductivity in root zone (mm/h) |
|---|---|---|---|---|
| 36 | Tschernosem from loess | 13.8 | 39.7 | 9.9 |
| 56 | Braunerde from loess and weathering products | 6.4 | 37.5 | 9.0 |
| 55 | Braunerde from sour rocks | 4.3 | 32.3 | 41.1 |

III Rotation schemes: crop sequence by year (b)

| No | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | po | ww | sb | wr | set-aside | ww | wb | po | ww | ma |
| 2 | po | wb | ma | ww | wr | ww | ma | po | ww | ma |
| 3 | po | set-aside | sb | wr | set-aside | ww | wb | po | ww | set-aside |

IV Fertilisation schemes: fertilisation rate (kg ha⁻¹)

| No | Winter wheat | Winter barley | Winter rye | Spring barley | Potatoes | Maize | Set-aside |
|---|---|---|---|---|---|---|---|
| 1 | 30+60+30 N (c) | 20+60+20 N | 20+60+20 N | 60+20+20 N | 140 N | 180 N | 0 N |
|   | 30+30 org N (d) | 30+30 org N | 30+30 org N | 30+30 org N | 30+30 org N | 30+30 org N | 0 org N |
| 2 | 45+90+45 N | 30+90+30 N | 30+90+30 N | 90+30+30 N | 210 N | 270 N | 0 N |
|   | 45+45 org N | 45+45 org N | 45+45 org N | 45+45 org N | 45+45 org N | 45+45 org N | 0 org N |
| 3 | 15+30+15 N | 10+30+10 N | 10+30+10 N | 30+10+10 N | 70 N | 90 N | 0 N |
|   | 15+15 org N | 15+15 org N | 15+15 org N | 15+15 org N | 15+15 org N | 15+15 org N | 0 org N |

(a) occurrence of the main soil plus occurrence of other soils;
(b) po – potatoes, ww – winter wheat, wb – winter barley, wr – winter rye, sb – spring barley, ma – silage maize;
(c) fertilisation by mineral N;
(d) fertilisation by organic N
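
As a rough illustration of the factorial design summarised in Table 1, the following Python sketch enumerates the simulation runs and derives the winter wheat mineral N doses of fertilisation schemes 2 and 3 from scheme 1. It is not part of SWIM: the variable names, the integer labels for the five elevation classes (the individual slope values are not listed in the table) and the helper `mineral_doses` are assumptions made for this example only.

```python
from itertools import product

# Factor levels taken from Table 1; elevation classes are given
# hypothetical labels 1..5 because their slope values are not tabulated.
climate_stations = ["Artern", "Erfurt", "Gera-Leumnitz", "Hof-Hohensaas"]
soils = [36, 56, 55]
rotation_schemes = [1, 2, 3]
fertilisation_schemes = [1, 2, 3]
elevation_classes = [1, 2, 3, 4, 5]  # assumed labels

# Full factorial design: every combination is one simulated time series.
runs = list(product(climate_stations, soils, rotation_schemes,
                    fertilisation_schemes, elevation_classes))
print(len(runs))  # 4 * 3 * 3 * 3 * 5 combinations

# Schemes 2 and 3 scale the rates of scheme 1 by +50% and -50%.
scale = {1: 1.0, 2: 1.5, 3: 0.5}
winter_wheat_mineral_n = [30, 60, 30]  # kg/ha, scheme 1 (Table 1)


def mineral_doses(scheme):
    """Return the three winter wheat mineral N doses for a scheme."""
    return [dose * scale[scheme] for dose in winter_wheat_mineral_n]


for scheme in fertilisation_schemes:
    print(scheme, mineral_doses(scheme))
```

The scaled doses can be checked against the winter wheat column of Table 1 (45+90+45 for scheme 2 and 15+30+15 for scheme 3).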

to 567 m a.m.s.l. in Hof-Hohensaas. The higher altitudes in the basin (up to 1100 m) were ignored, because most of the agricultural land is at lower altitudes, namely: about 40% of agricultural land is located in areas lower than 200 m, 79% lower than 400 m, and 96.4% lower than 600 m. Though climate stations 1 and 2 have similar average annual precipitation, they both were considered in order to have better representation of the climate conditions and elevation for the basin. Five elevation classes represented by topographic slope ranging from 0.2 to 10% were considered.

Loess soils or soils from loess mixed with weathering products are the dominant soil types in the basin; rocky soils occur only in the mountain areas. Altogether, the cropland contains 34 soil types (BÜK-1000) [10], 23 of which occupy 96.7% of the total cropland area. They can be roughly represented by three different soils: 'Tschernosem from loess' (soil 36), 'Braunerde from loess and weathering products' (soil 56), and 'Braunerde from sour rocks' (soil 55), listed in Table 1. These three soils occupy 24.5% of arable land. Soil 36 is loamy; it has a low saturated hydraulic conductivity and the highest field capacity among the three chosen soils. Soil 56 is also loamy, with a loamy-sandy layer below 85 cm; it has low saturated hydraulic conductivity and a medium field capacity. Soil 55 is sandy-loamy, with a loamy-sandy layer below 60 cm; it has the lowest field capacity and the highest saturated hydraulic conductivity among the three chosen soils. The number of soils was restricted in this case for simplification purposes, though other soil classifications are possible for better representation of soil characteristics in the region [11].

Table 1 also includes three rotation schemes, 10 years each, and three fertilization schemes, which represent a range of agricultural management practices. The basic rotation and fertilization schemes (schemes 1) were derived from the typical practices in the area [12] and [13]. Two additional rotation schemes were applied: a more intensive rotation (scheme 2), which included two additional years of silage maize instead of one year of spring barley and one year of set-aside, and a less intensive rotation (scheme 3), which included two additional years of set-aside instead of one year of silage maize and one year of winter wheat. Winter crops are indicated in Table 1 for the year when they are harvested.

The rates of fertilization used in the model application are crop-specific (Table 1). The cereals (winter wheat,
winter barley, winter rye, and spring barley) are fertilised by mineral N three times, including once in autumn for winter crops, while for silage maize and potatoes the total annual amount is applied only once, during the sowing [12]. In addition, organic fertilizer is applied using the following rules: for winter crops 30 kg ha⁻¹ of organic N is applied 23–43 days before sowing (depending on the previous crop) and 30 kg ha⁻¹ is applied at the beginning of March; and for summer crops 30 kg ha⁻¹ is applied at the end of October of the previous year and 30 kg ha⁻¹ is applied six weeks after sowing in spring. Two other fertilization schemes were used in the model by increasing (scheme 2) or decreasing (scheme 3) the fertilization application rates by 50% without changing the time of application. The three rotation schemes and three fertilization schemes were combined with one another.

3.3
Analysis of factors affecting nitrogen fluxes
The 540 simulated time series of daily water fluxes (direct runoff, interflow, groundwater recharge, and evapotranspiration) and daily N fluxes (N washoff with direct runoff and interflow, N leaching to groundwater, N uptake by plants, denitrification, and mineralisation) were aggregated to monthly, annual, and average annual values and then analysed with respect to the different natural conditions and management practices. The results of the analysis are presented in Figs. 1–5.

Figure 1 shows the effect of climate on the monthly N losses with water (with direct runoff, interflow and leaching to groundwater) for soil 56 for the whole simulation period of 30 years. The increasing trend in wetter conditions is evident: when precipitation increases from

Fig. 1. Monthly dynamics of total N losses with water (with direct runoff, interflow and leaching to groundwater) for soil 56 for the 30 years simulation period in different climate conditions (climate 1, 3 and 4)

Fig. 2. Annual dynamics of N losses with water for three investigated soils for climate station 3 (a), and the same dynamics of N losses with water, and annual precipitation after sorting the years in accordance with the precipitation gradient (b)

460 through 611 to 732 mm on average for climate stations 1, 3 and 4, nitrogen losses from soil with water fluxes increase as well. In drier climate conditions (station 1) there are some years with very low or even zero N losses with water from this soil.

Figure 2 (a) shows annual dynamics of N losses with water for three investigated soils for climate station 3. In this example rotation scheme 1 and fertilization scheme 1 were applied. As one can see, soil 36 has the lowest, and soil 55 the highest losses with water. Figure 2 (b) compares the same dynamics of N losses with water for three soils, and annual precipitation after sorting the years in accordance with the precipitation gradient: from the smallest to the largest. It is clear that though there is some positive trend for N losses following the trend in precipitation, there are some years (e.g. years 13, 12, 28) with rather low or medium precipitation, but rather high N losses, and, conversely, there are some years (e.g. years 10, 5) with high precipitation, but rather low N losses. This is due to the importance of seasonal distribution of precipitation and N dynamics, e.g. high precipitation in summer or early autumn does not cause washoff of N, because usually arable cropland is poor in nutrients at that time. On the other hand, high precipitation in winter and early spring may cause high losses of N from soil with water.

Figure 3 allows a comparison of N losses with the three above-mentioned water flows in different climate conditions. It shows the combined effects of climate, soil and elevation zones on N losses with surface runoff, interflow and leaching to groundwater. Only the basic rotation and fertilization schemes were considered for these graphs. Nitrogen losses with surface runoff are very small. The sum of N losses with water fluxes is practically independent of elevation; however, the elevation affects the redistribution of fluxes (N losses with interflow increase at higher elevation, and leaching to groundwater decreases), while the total

Fig. 3. Combined effects of climate (cl1, ..., cl4), soils (s36, s56, s55), and elevation (SS = 0.002, SS = 0.05, SS = 0.10, where SS is the topographic slope) on modelled long-term average annual N losses, considering fluxes with direct runoff (n-qd), interflow (n-inter), and groundwater recharge (n-gw)

amount remains practically the same. Consequently, the higher slopes were excluded from further analysis. This provides a sort of maximum estimate of N leaching to groundwater for all soils.

Soil 36 has the smallest N losses with water, and soil 55 the highest (Fig. 3) for all climates. There is an increasing trend for all soils with increasing precipitation (from climate zone 1 to climate zone 4), though the results for climates 1 and 2 are similar, and for soil 55 the losses are even smallest for climate 2.

The effect of land management practices represented by rotation and fertilization schemes on N uptake by plants (upt), denitrification (den), and N potential leaching to groundwater (lea) is shown in Fig. 4 for three investigated soils. The effect of higher fertilization rates is obvious: all losses and uptake are higher when fertilization scheme 2 is applied. Also, changing the crop rotation affects N plant uptake and N losses: both are highest for rotation scheme 2, and lowest for rotation scheme 3.

In Fig. 5 the effect of the fertilization rates on N mineralisation, N uptake by plants, N denitrification, and N leaching is depicted for three investigated soils. The modelling results for the four climate zones were averaged for this example to exclude the climate effects. Here one can see the combined effect of soil types and fertilization rates. The following tendencies are observed: the highest N

Fig. 4. Effect of rotation and fertilisation schemes (r1f1, ..., r3f3) on simulated long-term average annual N fluxes: N uptake by plants (upt), gaseous losses (den), and N losses with water (lea+)

mineralisation and N uptake in soil 36, high N losses with denitrification in soil 36, and the highest losses with water in the more sandy soil 55. The denitrification pattern is attributed to a high field capacity and low saturated hydraulic conductivity of soil 36, which enhances denitrification and assures its low permeability for water and associated fluxes. In contrast, leaching to groundwater is much higher in soil 55, which has zero denitrification. This is due to the larger proportion of sand and higher saturated hydraulic conductivity in this soil.

An indirect validation of nitrogen balance components obtained for the Saale region was performed using regional data on nitrogen balance components for northern and central Germany obtained from literature [14] [15] and others. In general, the modelling results fit into the ranges obtained from the literature for different conditions, and differences between soils and climates are plausible. The results of the simulation experiments with SWIM provide a basis for the fuzzy rule based model, which is described in Sect. 4.

4
Application of the fuzzy rule model
The fuzzy rule approach is applied to the estimation of nitrogen (N) leaching from three different soils (s36, s56, s55) using the results obtained from simulation experiments with the deterministic SWIM model (see Sect. 3). This section describes some of the data pre-processing, the building of the fuzzy model and the validation of the approach.

4.1
Pre-processing
Although for prospective applications of the fuzzy model in environmental or agricultural management the annual to long term behaviour of N-leaching is the most sought target scale, a monthly time step has been chosen here. This makes it possible to consider important seasonal climate and management effects and is a compromise between the modelling demands resulting from high process dynamics and the required computational efficiency of the solution.

One of the most important steps for the identification of the fuzzy rule systems is the selection and compilation of the input variables. They should have significant impact on the nitrogen dynamics and be easily available for future scenario analysis without the need for running a nitrogen process model. Table 2 lists some important variables, which are evaluated amongst several others regarding their use for the fuzzy model. The list mainly comprises climate observations, water fluxes from SWIM simulations, fertilization amounts and average crop uptake values. The latter can be considered as fixed plant specific parameters (12

Fig. 5. Effect of the fertilisation rates (scheme 1: ref, scheme 2: +50%, scheme 3: −50%) on N mineralisation, N uptake by plants, N denitrification, and N losses with water for three investigated soils

Table 2. Important variables used for the identification of the fuzzy rule systems

No.  Symbol    Variable                                                                                                  Units
1    Nout      nitrogen leaching (3, 4)                                                                                  kg N/(ha mon)
2    PCP       precipitation                                                                                             mm/mon
3    TAV       air temperature                                                                                           °C
4    Q         total water outflow (3), current month                                                                    mm/mon
5    Q−1       total water outflow (3), last month                                                                       mm/mon
6    ETR       evapotranspiration                                                                                        mm/mon
7    FERT      fertilization, current month                                                                              kg N/(ha mon)
8    FERT12    average monthly fertilization over last 12 months (including current month)                               kg N/(ha mon)
9    Q12       average water outflow (3) over last 12 months (excluding current month)                                   mm/mon
10   NUPT12    specific (2) nitrogen crop uptake averaged over the last 12 months (including current month)              kg N/(ha mon)
11   Qsald     difference between total precipitation sum and total water outflow (3) over whole simulation period for each variant (1)   mm/yr
12   FERTsald  difference between total fertilization sum and specific (2) nitrogen crop uptake averaged over the whole simulation period for each variant (1)   kg N/yr

(1) 36 variants per soil (combination of "climates", fertilizations and crop rotations)
(2) calculated from monthly crop specific values (12 values for each crop, one per month), compiled from simulation results with SWIM
(3) sum over all horizontal and vertical components leaving the soil column: surface, interflow, groundwater-percolation
(4) target variable (model response)
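The lagged and aggregated inputs of Table 2 can be illustrated with a minimal Python sketch. It assumes plain monthly lists as input; the function name `lagged_inputs`, the key `Q_prev` (standing for Q−1) and the sample series are hypothetical and only serve to show the window conventions (Q12 excludes the current month, FERT12 includes it).

```python
from statistics import mean

def lagged_inputs(q, fert, window=12):
    """Build Q-1, Q12 and FERT12 from monthly series (lists of floats).

    Months without a full 12-month history are returned as None.
    """
    rows = []
    for t in range(len(q)):
        if t < window:
            rows.append(None)  # not enough history yet
            continue
        rows.append({
            "Q": q[t],
            "Q_prev": q[t - 1],                          # Q-1: outflow of the last month
            "Q12": mean(q[t - window:t]),                # last 12 months, excluding current
            "FERT12": mean(fert[t - window + 1:t + 1]),  # last 12 months, including current
        })
    return rows

# Hypothetical data: two years of outflow (mm/mon) and one
# fertilizer application of 60 kg N/ha in March of each year.
q = [float(m) for m in range(1, 25)]
fert = ([0.0, 0.0, 60.0] + [0.0] * 9) * 2
rows = lagged_inputs(q, fert)
print(rows[12])
```

The first usable record here is month 13, because Q12 needs twelve preceding months.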
values per crop, one for each month), assessed from long-term SWIM simulations. To consider a certain hydrological memory in the fuzzy rule systems, several lagged and aggregated variables are introduced (e.g. average fertilization and outflow over the last 12 months). Besides, some total balances are used to distinguish between general site conditions represented by each of the variants (climate + fertilization + crop rotation). All dynamic process variables describing the nitrogen cycle, like N content in soil, denitrification, mineralization etc., have been avoided. However, hydrological fluxes like water outflow and evapotranspiration are included, assuming they can be obtained in a future operational case from a simple conceptual water balance model.
As an initial assessment of the value of the input variables, a correlation analysis is carried out. Fig. 6 shows simplified scatterplots between monthly N-leaching and several explanatory variables. It becomes obvious that the relationship between N-leaching and the input variables is quite non-linear and that there is no single variable with a dominating effect on Nout. Actual N-leaching increases with increasing Q, FERT12, FERTsald and with decreasing Q−1, Q12, ETR, FERT. Regarding NUPT12, PCP and TAV the potential for leaching is highest for values in the middle of the input range. A comparison for different months shows higher leaching values during the winter season. Further analysis is required to decide about an optimal number and combination of input variables (see next Section).

4.2
Model building
The building of the fuzzy model involves basically the following three tasks, which are discussed in this Section:
• the definition of the structure of the model (variables, number of systems and rules),
• the definition of fuzzy numbers on input variables and on the response and
• the assessment of the fuzzy rules (optimization of the arguments for the rules).

For simplicity, three separate fuzzy rule systems are built here, one for each soil profile, operating on data from all combinations of climates, crop rotations and fertilizations (36 variants per soil). For the whole identification process the data set is divided into two samples, one for training and one for validation. This is done by splitting the 30-year simulation period (1961 to 1990) into 20 years for training (1961–1970 & 1981–1990) and 10 years for validation (1971–1980). The validation set is taken from the middle part of the whole period to avoid inconsistencies between the training and validation sets caused by trends. Altogether, the training and validation data sets consist of 8640 and 4320 records of monthly data, respectively (1 soil × 4 climates × 3 crop rotations × 3 fertilization schemes × 20/10 years × 12 months).
A decision about the number of variables and the number of rules has to be made before the assessment of the rule systems (see also Sect. 2). In addition to the information gained from the correlation analysis, further trial and error testing is carried out here. Several fuzzy rule systems with different numbers and combinations of input variables are assessed. Figure 7 shows some of the results for the three soils using rule systems with 15 rules each. Comparing the standard errors, conclusions about the relevance of the number and selection of specific variables can be drawn. First, it can be seen that using percolation (Q) and evapotranspiration (ETR) instead of precipitation (PCP) and air

Fig. 6. Simplified scatterplots between N-leaching (Nout) and several explanatory variables on a monthly basis for the soil profile s56 (training period: 1961–1970 & 1981–1990)
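The split into training and validation periods used for Fig. 6 and the following analyses (Sect. 4.2) can be sketched in a few lines of Python. The record layout `(year, month, climate, rotation, fertilization)` and the function name `split_records` are assumptions made for illustration; only the year ranges and the variant counts come from the text.

```python
# Training: 1961-1970 and 1981-1990; validation: 1971-1980 (middle decade)
TRAIN_YEARS = set(range(1961, 1971)) | set(range(1981, 1991))
VALID_YEARS = set(range(1971, 1981))

def split_records(records):
    """Split records whose first field is the year into training and validation lists."""
    train = [r for r in records if r[0] in TRAIN_YEARS]
    valid = [r for r in records if r[0] in VALID_YEARS]
    return train, valid

# One soil: 4 climates x 3 crop rotations x 3 fertilization schemes = 36 variants,
# each with 30 years x 12 months of monthly records.
records = [
    (year, month, c, r, f)
    for c in range(1, 5) for r in range(1, 4) for f in range(1, 4)
    for year in range(1961, 1991) for month in range(1, 13)
]
train, valid = split_records(records)
print(len(train), len(valid))  # 8640 4320
```

The counts reproduce the 8640 training and 4320 validation records per soil quoted in Sect. 4.2.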
temperature (TAV) decreases the errors significantly. Adding actual fertilization (FERT) as input variable does not improve the results, but using the average fertilization over the last 12 months (FERT12) does. Further improvements are made by inclusion of the average outflow over the last 12 months (Q12). The specific plant uptake over the last 12 months (NUPT12) as a single variable has no effect, but utilized as a component in the fertilization saldo (FERTsald) it seems quite useful. Also, the incorporation of the flow saldo (Qsald) is helpful. Generally, the differences between the various combinations of variables are more pronounced the higher the leaching potential of the soil is. Concluding from those experiments, 7 input variables have been chosen in all three fuzzy rule systems: Q, Q−1, ETR, FERT12, Q12, FERTsald and Qsald.
Regarding the optimal number of rules, similar tests are run. In this case the problem of over-learning can be serious. Figure 8 shows this effect by comparing standard errors, averaged for the three soils and calculated for the training and the validation period, against the number of rules in the fuzzy rule systems. While the training performance becomes continually better with an increasing number of rules, the validation performance does not follow this pattern; it even decreases again with higher number of rules (> 0). For a robust approach, simulation results for both training and validation periods should show similarly low errors. This is the case for a moderate number of rules between 10 and 20. So, for subsequent modelling the number of rules is set to 15 for all three rule systems.
Another important issue is the adequate definition of fuzzy numbers for the rule arguments (Aj,k) and on the response Bi. After some tests an automatic procedure is used considering the range of the variables and the frequency of the values. As mentioned before only triangular

Fig. 7. Effect of number and combination of input variables on the training performance of the fuzzy rule systems (training period: 1961–1970 & 1981–1990)

Fig. 8. Standard errors from simulations with the fuzzy rule systems averaged for the three soils for training and validation periods using different number of rules
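The triangular fuzzy numbers used for all rule arguments and responses can be illustrated with a short Python sketch. The membership function is the standard triangular form; the helper `equal_partition` is only one plausible reading of "three symmetric numbers equally partitioning the observed range" and is an assumption, as are the function names and the example range for Q.

```python
def tri(x, a, b, c):
    """Membership of x in the triangular fuzzy number (a, b, c) with peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def equal_partition(lo, hi, n=3):
    """n symmetric triangles whose peaks split [lo, hi] into n + 1 equal parts.

    Assumed layout: each support reaches to the neighbouring peaks,
    so adjacent numbers overlap and memberships sum to 1 in between.
    """
    step = (hi - lo) / (n + 1)
    return [(lo + (k - 1) * step, lo + k * step, lo + (k + 1) * step)
            for k in range(1, n + 1)]

# Hypothetical observed range of percolation Q in mm/mon
numbers = equal_partition(0.0, 80.0)
memberships = [tri(30.0, *num) for num in numbers]
print(numbers)
print(memberships)
```

A value of 30 mm/mon then belongs with degree 0.5 to each of the two lower fuzzy numbers and with degree 0 to the third.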
fuzzy numbers are employed. Figure 9 shows, as an example, the fuzzy numbers defined on the input variable percolation (Q). Given any maximum number K1, the procedure assigns one fuzzy number for the infinite case, one for the zero case, three symmetric numbers equally partitioning the observed range of the variable and the remaining (K1 − 5) according to the frequency of the values, with the largest fuzzy number extending to infinity. K1 is set here to values between 6 and 12, with 12 for the response variable and a decreasing number according to the relevance of the input variables. Finally, the fuzzy numbers are ordered, the distance between adjacent numbers is calculated and for very similar numbers only one is kept. In order to account for values outside the observed range in future applications, the smallest and the largest fuzzy number (i.e. xi for μ(xi) = 1) are shifted by 50% beyond the minimum and the maximum values observed, respectively.

Fig. 9. Fuzzy numbers defined on percolation (Q); solid lines: zero and infinite cases, dashed lines: equal range partitioning, dotted lines: frequency partitioning

For the assessment of the fuzzy rules simulated annealing is used as an automatic procedure (see also Sect. 2), but with the possibility to include some rules a priori. Here only one fixed rule is prescribed: "If percolation is zero then N-leaching is zero too." This rule is implemented as an exclusive crisp rule disabling the remaining fuzzy rules, since simultaneous fulfilment is undesirable in this case. As objective function the sum of mean squared errors between SWIM simulated and fuzzy rule simulated monthly N-leaching values is taken. The iteration is stopped if the sum of positive changes remains smaller than 3 during the last three temperature loops. Additionally, the parameter m0 is introduced, which leads to a penalty on the objective function value if the sum of the degrees of fulfilment over all time steps for a certain rule becomes smaller than m0. This forces the optimisation procedure towards generally applicable rules and prevents too strong an emphasis on very rare events.

4.3
Results and validation
The three soil-specific fuzzy rule systems consisting of 15 rules and 7 input variables each are assessed and validated considering the above defined prerequisites and data sets for training and validation. Table 3 shows some performance criteria for the fuzzy model, listed separately for the training and validation periods as well as for different time scales. The performance improves with increasing time scale. The simulation quality for the training period is better than for the validation period, but the differences are moderate, which demonstrates the robustness of the approach. For long term values, which are the most important scale for potential future assessment exercises, quite good results were reached. In Fig. 10 comparisons between the SWIM and fuzzy rule based simulations of the long term average N-losses for all soils and the training and validation periods are presented, which support this conclusion. Although a drop in performance can be noticed when comparing the annual criteria in Table 3 with the long term values, the former are still satisfactory. In Fig. 11 the annual time series simulated by SWIM and the fuzzy model for both the training and validation periods for soil s56 are presented. It can be seen that the correspondence is acceptable; however, the extreme values are not modelled very well by the fuzzy rule approach.
Finally, the fuzzy rule approach is compared with a multiple linear regression model to assess the differences between a conventional linear and a non-linear approach. As an example the soil profile s56 is chosen here, too. The same 7 input variables as utilised in the fuzzy model are used here as independent variables in the multiple regression approach. Also, all records with zero N-losses are excluded to account for the crisp rule "If percolation is zero then N-leaching is zero, too." The multiple regression model is fitted for the training period and then applied for the validation period using the monthly data. One problem is the occurrence of negative values for some of the responses. This could have been avoided by a suitable variable transformation. However, for simplicity and to keep a pure linear approach, the remaining small number of negative responses after aggregation to annual values (2% negative values left) has been set to zero. Figure 12

Table 3. Performance of the fuzzy rule system for the 3 soil profiles

                          Monthly                    Annual          Long-term
Soil  Avg N-loss (1)  bias (2)  SE/Avg  r        SE/Avg  r       SE/Avg  r
Training (1961–1970 & 1981–1990)
s36    9.2            -0.02     2.20    0.80     0.74    0.86    0.19    0.97
s56   30.1             0.01     1.78    0.84     0.61    0.88    0.18    0.97
s55   42.9            -0.09     1.33    0.86     0.49    0.91    0.16    0.97
Validation (1971–1980)
s36    7.2             0.25     2.67    0.78     1.04    0.82    0.58    0.97
s56   30.5            -0.07     2.15    0.70     0.76    0.75    0.29    0.91
s55   49.4            -0.79     1.60    0.81     0.64    0.79    0.32    0.94

Avg = average, SE = standard error, r = correlation
(1) kg ha^-1 year^-1
(2) kg ha^-1 month^-1
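The criteria reported in Table 3 can be computed along the following lines. This is a minimal sketch that assumes the usual definitions (bias as mean simulated minus mean reference value, SE as root mean squared error, r as Pearson correlation); the paper does not spell these formulas out in this section, and the function names and sample values are hypothetical.

```python
from math import sqrt

def criteria(obs, sim):
    """Bias, relative standard error (SE/Avg) and correlation r between
    a reference series (e.g. SWIM) and a simulated series (e.g. fuzzy model)."""
    n = len(obs)
    avg_obs = sum(obs) / n
    avg_sim = sum(sim) / n
    se = sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / n)
    cov = sum((o - avg_obs) * (s - avg_sim) for o, s in zip(obs, sim))
    var_o = sum((o - avg_obs) ** 2 for o in obs)
    var_s = sum((s - avg_sim) ** 2 for s in sim)
    return {"bias": avg_sim - avg_obs,
            "se_rel": se / avg_obs,
            "r": cov / sqrt(var_o * var_s)}

def aggregate(values, block=12):
    """Sum consecutive blocks, e.g. 12 monthly values into one annual value."""
    return [sum(values[i:i + block]) for i in range(0, len(values), block)]

# Hypothetical example values
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
result = criteria(obs, sim)
print(result)
```

Applying `criteria` to `aggregate(obs)` and `aggregate(sim)` gives the annual-scale values; monthly errors partly cancel in the sums, which is consistent with the improvement at coarser time scales seen in the table.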

Fig. 10. Comparison of long term N-losses simulated by SWIM and by the fuzzy model for all three soils and both training and validation periods (36 variants per graph: 4 "climates" × 3 rotations × 3 fertilizations) (for performance criteria see Table 3)
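The mismatch between SWIM and fuzzy-model results shown in Fig. 10 is also what the objective function of the rule assessment (Sect. 4.2) measures, extended by the m0 penalty. A minimal sketch follows; the text only states that a penalty is applied when a rule's summed degree of fulfilment falls below m0, so the additive linear penalty, the `weight` parameter and the function name are assumptions.

```python
def penalised_objective(obs, sim, dof, m0, weight=1.0):
    """Squared-error objective with a penalty for rarely used rules.

    obs, sim : reference (SWIM) and fuzzy-rule simulated monthly N-leaching
    dof      : dof[i][t] is the degree of fulfilment of rule i at time step t
    m0       : minimum summed degree of fulfilment required per rule
    """
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    penalty = 0.0
    for rule_dof in dof:
        total = sum(rule_dof)
        if total < m0:
            # Hypothetical penalty form: grows with the shortfall below m0
            penalty += weight * (m0 - total)
    return sse + penalty

# Hypothetical example: rule 2 is almost never fulfilled and is penalised
obs = [0.0, 1.0, 2.0]
sim = [0.0, 1.0, 3.0]
dof = [[0.5, 0.5, 0.5], [0.0, 0.1, 0.0]]
value = penalised_objective(obs, sim, dof, m0=1.0)
print(value)
```

In a simulated annealing loop, a candidate rule set that fits a few rare events with an otherwise unused rule would then score worse than one with generally applicable rules.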

shows the aggregated annual time series simulated by SWIM and by the regression model. The simulation performance of the multiple regression approach is worse compared to the fuzzy model (cp. Fig. 11), with an increase of the standard error by 36%. Although it is certainly possible to construct better and quasi non-linear regression models by special transformations on the variables, it would be difficult to achieve the performance of the fuzzy model, which is not forced a priori to any internal model structure.

5
Discussion and conclusions
Environmental systems are often very complex and highly non-linear. The variables controlling the processes vary with the actual dominating processes. As the importance of the processes depends on the actual state, fuzzy rules can provide a natural case specific description. This means that models are built where only the variables important for the given state are considered. As no structure of the model is imposed, fuzzy rules can be flexibly adjusted to the problem considered.

• Fuzzy rules can be derived from datasets using simulated annealing.
• Expert knowledge can be incorporated via fixed rules.
• Robustness and transferability can be ensured by limiting the number of rules and the number of cases a rule is applied.
• Incomplete datasets can be used.
• Nitrate leaching as a complicated non-linear problem can be modelled effectively by fuzzy rule systems.
• Deterministic models can serve as a basis for the derivation of fuzzy rule based replacement models.
• The fuzzy models are fast, robust and transparent and can be applied in many cases.

Fig. 11. Comparison of annual N-loss time series for soil 56 simulated by SWIM and by the fuzzy model (FuzRul) for 10 years training (upper figure) and 10 years validation (lower figure). The figures show sequences of 36 × 10 years simulations comprising all variants with crop rotations (r1...r3) cycling fastest, then fertilizations (f1...f3) and finally "climates" (c1...c4) from left to right (e.g. r1f1c1, r2f1c1, r3f1c1 for the first 30 years etc.)

Fig. 12. Comparison of annual N-loss time series for soil 56 simulated by SWIM and by the multiple regression model (MREG) for the validation period. The figure shows a sequence of 36 × 10 years simulations comprising all variants with crop rotations (r1...r3) cycling fastest, then fertilizations (f1...f3) and finally "climates" (c1...c4) from left to right (e.g. r1f1c1, r2f1c1, r3f1c1 for the first 30 years etc.)
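The regression benchmark behind Fig. 12 (fit on monthly data, aggregate to annual values, set remaining negative responses to zero) can be sketched without external libraries. The solver below is a generic ordinary least squares fit via the normal equations, not the authors' implementation; function names and the small example data set are hypothetical.

```python
def fit_linear(X, y):
    """Ordinary least squares with intercept, solved from the normal
    equations by Gauss-Jordan elimination with partial pivoting."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(p):
            if k != col:
                factor = M[k][col] / M[col][col]
                M[k] = [a - factor * b for a, b in zip(M[k], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(coef, x):
    """Linear prediction: intercept plus weighted inputs."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def annual_clipped(monthly_pred):
    """Aggregate monthly predictions to annual sums, then set negatives to zero."""
    years = [sum(monthly_pred[i:i + 12]) for i in range(0, len(monthly_pred), 12)]
    return [max(0.0, v) for v in years]

# Hypothetical training data generated from y = 1 + 2*x1 - 3*x2
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
coef = fit_linear(X, y)
print(coef)
print(annual_clipped([-1.0] * 12 + [2.0] * 12))
```

Clipping after aggregation, as described in the text, keeps the monthly model purely linear and only corrects the physically impossible negative annual sums.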

The simplicity of the fuzzy rules makes their incorporation into Geographical Information Systems possible and can offer a flexible, easy to use modelling system for decision support.

References
1. Pedricz W, Reformat M (1997) Rule-based models of multivariable functions. Fuzzy Sets and Systems 90: 235–253
2. Yan Shi, Masaharu Mizumoto (2000) A new approach of neuro-fuzzy learning algorithm for tuning fuzzy rules, Fuzzy Sets and Systems 112: 99–116
3. Herrera F, Lozano M, Verdegay JL (1998) A learning process for fuzzy control rules using genetic algorithms, Fuzzy Sets and Systems 100: 143–158
4. Mamdani EH (1974) Applications of fuzzy algorithms for control of a simple dynamic plant, Proceedings of IEEE 121(12): 1585–1588
5. Tanaka K (1997) An Introduction to Fuzzy Logic for Practical Applications, Springer Verlag, New York, Berlin, Heidelberg
6. Bárdossy A, Disse M (1993) Fuzzy Rule-based Models for Infiltration. Water Resources Research 29: 373–382
7. Williams JR, Renard KG, Dyke PT (1984) EPIC – a new model for assessing erosion's effect on soil productivity, Journal of Soil and Water Conservation 38(5): 381–383
8. Krysanova V, Becker A (1999a) Integrated Modelling of Hydrological Processes and Nutrient Dynamics at the River Basins Scale. Hydrobiologia 410: 131–138
9. Krysanova V, Gerten D, Klöcking B, Becker A (1999b) Factors affecting nitrogen export from diffuse sources: A modelling study in the Elbe basin. In: Heathwaite L (ed.) Impact of Land-Use Change on Nutrient Loads from Diffuse Sources, IAHS Publ. No. 257; pp. 201–212
10. Hartwich R, Behrens J, Eckelmann W, Haase G, Richter A, Roeschmann G, Schmidt R (1995) Bodenübersichtskarte der Bundesrepublik Deutschland 1:1000000, Hannover, Germany
11. Krysanova V, Haberlandt U, Österle H, Hattermann F (2001a) Effects of natural and anthropogenic factors on nitrogen fluxes in agricultural soils: a modelling study in the Saale River basin (central Europe). In: Impact of Human Activity on Groundwater Dynamics, IAHS Publ. No. 269; pp. 331–338
12. Roth D, Knoblauch S, Pfleger I, Herold L (1998) Nitratgehalte im Sickerwasser und N-Austrag aus unterschiedlichen Agrarstandorten Thüringens, Thüringer Landesamt für Landwirtschaft, Jena, Germany
13. Krönert R, Franko U, Haferkorn U, Hülsbergen K-J, Abraham J, Biermann S, Hirt U, Mellenthin U, Ramsbeck-Ullmann M, Steinhardt U (1999) Gebietswasserhaushalt und Stoffhaushalt in der Lößregion des Elbegebietes als Grundlage für die Durchsetzung einer nachhaltigen Landnutzung, UFZ Leipzig-Halle GmbH, Leipzig, Germany
14. DVWK, Deutscher Verband für Wasserwirtschaft und Kulturbau e.V (1984) Bodennutzung und Nitrataustrag Literaturauswertung über die Situation bis 1984 in der Bundesrepublik Deutschland (1985), Hamburg, Berlin, Germany
15. Scheffer F, Schachtschabel P (1984) Lehrbuch der Bodenkunde. 11 Edition. Enke, Stuttgart, Germany
16. Krysanova V, Müller-Wohlfeil DI, Becker A (1998) Development and test of a spatially distributed hydrological/water quality model for mesoscale watersheds, Ecological Modelling, 106: 261–289
