© 2004 Kluwer Academic Publishers. Manufactured in The United States.
Abstract. In a collaborative (distributed) Case-Based Reasoning (CBR) environment, an input query case can be
compared with the old cases that reside in many different CBR agents in the network. Obtaining the best
solution effectively and efficiently from this distributed CBR network depends on a carefully designed
query dispatching strategy. In this paper, we propose a fuzzy integral based approach to measure the competence
of different CBR agents in the network and suggest three query dispatching policies that can be used for
this task: the To-Top policy, the Strong-Strong policy and the Best-Committee policy. The experimental results show
that our proposed policies perform better than the existing ones developed by Plaza and Ontañón.
first one is proposed by Smyth and Keane [5] while the second one is proposed by the authors. The CBR agents' competence computation and ranking policy are given in Section 3. In Section 4, three policies for dispatching a new query case are proposed. Each of these policies is based on a different assumption of how to obtain the best solution. An experimental comparison of our approaches to the existing ones is provided in Section 5. Finally, the conclusion is given in Section 6.

2. Case-Base Competence Modeling

The concept of case-base competence was first proposed by Smyth and Keane [5] (referred to as the S-K model in this paper), and it has subsequently been developed into a whole range of concepts that are useful for measuring the problem-solving ability of case-bases. In the S-K model, many statistical properties of a case base, such as the size and density of cases, are used as input parameters for measuring competence. However, this model assumes that there is no overlap among different groups of cases (feature interaction [6], for example, is a common cause of overlaps). Therefore, if the group competence is simply taken as the sum of the individual case competences, and each individual case competence is computed independently without considering the overlapping effects, the resulting group competence may be over- or under-estimated. This feature overlap problem has been tackled by Shiu et al. [7] using the fuzzy integral (referred to as the S-L model in this paper). These two models will be used as the basis for developing our query case dispatching strategies. Details are explained in the following sections.

2.1. The S-K Model

In this model, two fundamental concepts are used: coverage and reachability. Coverage of a case refers to the set of problems that the case can solve. Reachability of a case is the set of cases that can be used to provide solutions to that case. Furthermore, the competence of a group of cases G (i.e. the group coverage of G) depends on the number of cases in the group and their density (see Eq. (1)).

\[ \mathrm{GroupCoverage}(G) = 1 + |G| \cdot (1 - \mathrm{GroupDensity}(G)) \tag{1} \]

where GroupDensity is defined as the average CaseDensity of the group (see Eq. (2)), and |G| is the size of the competence group G, i.e. the number of cases in the group G.

\[ \mathrm{GroupDensity}(G) = \sum_{c \in G} \mathrm{CaseDensity}(c, G) \,/\, |G| \tag{2} \]

where CaseDensity is defined by Eq. (3)

\[ \mathrm{CaseDensity}(c, G) = \sum_{c^* \in G - \{c\}} \mathrm{Similarity}(c, c^*) \,/\, (|G| - 1). \tag{3} \]

Different ways of computing Similarity can be used depending on the problem at hand. For a given case-base with competence groups $G = \{G_1, G_2, \ldots, G_n\}$, $n = 1, 2, \ldots$, the total coverage is defined by Eq. (4).

\[ \mathrm{Coverage}(G) = \sum_{G_i \in G} \mathrm{GroupCoverage}(G_i) \tag{4} \]

2.2. Problem of the S-K Model

Suppose that in some problem domain we have a group of non-uniformly distributed cases, as depicted in Fig. 1. It can be shown that the S-K model is not a good predictor of the group competence, because this model assumes that the cases are distributed uniformly, such as those shown in Fig. 2. Assume that Size(G) = Size(G′), i.e. |G| = |G′|, and GroupDensity(G) = GroupDensity(G′). Then, from Eq. (1), we have GroupCoverage(G) = GroupCoverage(G′), where G is an arbitrary competence group in a case-base, so the same result is obtained for two case-bases, one of which has its cases non-uniformly distributed and the other uniformly distributed.

However, from Figs. 1 and 2, it is obvious that the coverage of the two competence groups cannot possibly be the same. There are coverage holes in Fig. 1 compared with Fig. 2. If we calculate the competence of the groups in Fig. 1 using the S-K model, the actual competence will be over-estimated. This is because the S-K model considers only the group density and ignores how the cases are distributed. Cases can be distributed in many possible ways, so a more accurate way of modeling case-base competence is required.

To further illustrate this point, we use the following example:
A Fuzzy Integral Based Query Dispatching Model 303
Figures 1 and 2. (1) Non-uniformly distributed case base. (2) Uniformly distributed case base.
Suppose that the densities of the groups $G_1$ and $G_2$ in Fig. 1 are both 0.8 and that they are assumed to have uniform distribution (i.e. the case density of each case, in either $G_1$ or $G_2$, is 0.8). The density of the whole group is 0.2, and the coverage of $c^*$ is three cases. The overlap between the coverage of $c^*$ and that of $G_1 \cup G_2$ is two cases, and $c^*$ is a pivotal case. It is rather straightforward to get the coverage of the whole competence group G as follows (note that GroupCoverage1 means the computed group coverage in Fig. 1, while GroupCoverage2 means the computed group coverage in Fig. 2):

\begin{align}
\mathrm{GroupCoverage1}(G) &= \mathrm{GroupCoverage}(G_1) + \mathrm{GroupCoverage}(G_2) \notag \\
&\quad + [\mathrm{coverage}(c^*) - \mathrm{coverage}(c^*) \cap \mathrm{Coverage}(G_1 \cup G_2)] \notag \\
&= 1 + [|G_1| (1 - \mathrm{GroupDensity}(G_1))] + 1 + [|G_2| (1 - \mathrm{GroupDensity}(G_2))] + 1 \notag \\
&= 1 + 5(1 - 0.8) + 1 + 7(1 - 0.8) + 1 \notag \\
&= 5.4 \notag
\end{align}

However, according to the S-K competence model (Eq. (1)), the result becomes:

[…] also resulted with a computing error that cannot be ignored.

For case distributions such as that of Fig. 1, the difference between GroupCoverage1 and GroupCoverage2 can be further investigated as follows:

\begin{align}
&\mathrm{GroupCoverage2} - \mathrm{GroupCoverage1} \notag \\
&\quad = \{1 + [|G| \cdot (1 - \mathrm{GroupDensity}(G))]\} \notag \\
&\qquad - \{1 + [|G_1| \cdot (1 - \mathrm{GroupDensity}(G_1))] + 1 + [|G_2| \cdot (1 - \mathrm{GroupDensity}(G_2))] + 1\} \notag \\
&\quad = |G| \cdot [\mathrm{GroupDensity}(G_1) - \mathrm{GroupDensity}(G)] - 1 - \mathrm{GroupDensity}(G_1) \tag{5} \\
&\quad \ge |G| \cdot [\mathrm{GroupDensity}(G_1) - \mathrm{GroupDensity}(G)] - 2 \tag{6}
\end{align}

Here Eq. (5) takes $\mathrm{GroupDensity}(G_1) = \mathrm{GroupDensity}(G_2)$ and $|G_1| + |G_2| = |G| - 1$, and Eq. (6) follows because a density is at most 1.

Given Eqs. (5) and (6), if the number of cases increases and tends to ∞ (in the extreme case), the value $[\mathrm{GroupDensity}(G_1) - \mathrm{GroupDensity}(G)]$ also increases at the same time. As a result, the computing error also tends to ∞. Therefore, the error of Smyth and Keane's competence model can be arbitrarily large in some case-bases.
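The S-K computation of Eqs. (1) to (3) and the arithmetic of the example can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names are invented here, the similarity function is supplied by the caller, and the whole-group size |G| = 13 (5 + 7 cases plus the pivotal case c∗) is an assumption inferred from the example rather than a figure stated in the text.

```python
def case_density(case, group, similarity):
    """Eq. (3): average similarity between `case` and the other cases.

    The group must contain at least two cases.
    """
    others = [c for c in group if c is not case]
    return sum(similarity(case, c) for c in others) / (len(group) - 1)


def group_density(group, similarity):
    """Eq. (2): average case density over the group."""
    return sum(case_density(c, group, similarity) for c in group) / len(group)


def group_coverage(size, density):
    """Eq. (1): S-K coverage of a group with `size` cases and given density."""
    return 1 + size * (1 - density)


# Example from the text: G1 (5 cases) and G2 (7 cases), both with density 0.8,
# linked by a pivotal case c* covering 3 cases, 2 of which are already covered.
relative_coverage_c_star = 3 - 2
coverage_by_parts = (group_coverage(5, 0.8) + group_coverage(7, 0.8)
                     + relative_coverage_c_star)

# S-K model applied to the whole group at once (density 0.2).
# ASSUMPTION: |G| = 5 + 7 + 1 = 13; the text does not state |G| explicitly.
coverage_whole = group_coverage(13, 0.2)

print(round(coverage_by_parts, 2))  # 5.4
print(round(coverage_whole, 2))     # 11.4
```

Under these assumptions the whole-group figure is roughly twice the part-wise one, and their difference of 6.0 agrees with evaluating 13 · (0.8 − 0.2) − 1 − 0.8.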
2.3.2. Computing the Overall Coverage Using Fuzzy Integral. After detecting the weak links in a competence group G, several new competence groups $G_1, \ldots, G_n$ ($n \ge 1$) are produced. According to the definition of a weak link, each newly produced group is said to be quasi-uniformly distributed. The next task is to compute the overall coverage of G. In the example described in Fig. 1, we simply sum the competence of $G_i$ ($1 \le i \le n$) and the relative coverage of $c^*$, but this method is only suitable in simple situations. There are more complicated scenarios, such as the one given in Fig. 3. There it is difficult to identify clearly the contribution of each weak link (i.e. it is difficult to tell whether $c^*$ has more influence on the coverage of G than $c^{**}$ or vice versa). To describe this complex relationship, we apply a powerful tool called the fuzzy integral (or non-linear integral) with respect to a fuzzy measure (a non-additive set function). Details are described in the next section.

2.3.3. Non-Additive Set Function. Let X be a nonempty set and $\mathcal{P}(X)$ be the power set of X. We use the symbol µ to denote a non-negative set function defined on $\mathcal{P}(X)$ with the property $\mu(\emptyset) = 0$. If $\mu(X) = 1$, µ is said to be regular. When X is finite, µ is usually called a fuzzy measure if it satisfies monotonicity, i.e., $A \subseteq B \Rightarrow \mu(A) \le \mu(B)$ for $A, B \in \mathcal{P}(X)$. For a non-negative set function µ, there are some associated concepts. µ is said to be additive if $\mu(A \cup B) = \mu(A) + \mu(B)$ for disjoint $A, B \in \mathcal{P}(X)$; sub-additive if $\mu(A \cup B) \le \mu(A) + \mu(B)$ for $A, B \in \mathcal{P}(X)$; and super-additive if $\mu(A \cup B) \ge \mu(A) + \mu(B)$ for $A, B \in \mathcal{P}(X)$.

Let $X = \{G_1, \ldots, G_n\}$ be the space of the new competence groups, and let A and B be two elements of the power set of X. Here, A and B can each be a single new group $G_i$ or the union of several groups. If we consider µ(A) as the importance of subset A, then additivity of the set function means that the joint importance of the groups is just the sum of their respective importance, which implies that there is no interaction among the competence groups. However, this is not true in the problem considered. In fact, most measures of importance are non-additive.

Sub-additivity and super-additivity are two special types of non-additivity. Super-additivity means that the joint importance of the two sets is greater than or equal to the sum of their respective importance, which indicates that the two sets are enhancing each other. In contrast, sub-additivity means that the joint importance of two sets is less than or equal to the sum of their respective importance, which indicates that the two sets are resisting each other.

In our problem, consider $X = \{G_1, \ldots, G_n\}$ as the factor space. There are weak links among the competence groups, linking them into one group G. Here, weak links such as $c^*$ and $c^{**}$ are enhancing the overall coverage of G. Hence, the importance measure µ defined on the power set $\mathcal{P}(X)$ is a super-additive measure. So here we have

\[ \mu(A \cup B) \ge \mu(A) + \mu(B) \quad \text{for } A, B \in \mathcal{P}(X). \]

For example, in Fig. 3, $c^*$ enhances the importance of $G_1 \cup G_2$, $c^{**}$ enhances the contribution of $G_2 \cup G_3$, and there is no case to enhance or reduce the contribution of $G_1 \cup G_3$, so we have

\begin{align}
\mu(G_1 \cup G_2) &\ge \mu(G_1) + \mu(G_2) \notag \\
\mu(G_2 \cup G_3) &\ge \mu(G_2) + \mu(G_3) \notag \\
\mu(G_1 \cup G_3) &= \mu(G_1) + \mu(G_3) \notag
\end{align}

When using the fuzzy integral to compute the overall coverage of the original competence group G, we should determine the importance measure µ in advance. However, for a factor space including n factors, there are $(2^n - 1)$ parameters to decide. In the situation of Fig. 3, seven values of the importance measure should be determined, which are:

\[ \mu(G_1),\ \mu(G_2),\ \mu(G_3),\ \mu(G_1 \cup G_2),\ \mu(G_1 \cup G_3),\ \mu(G_2 \cup G_3),\ \mu(G_1 \cup G_2 \cup G_3). \]

To reduce the load, we apply a kind of fuzzy measure called the λ-fuzzy measure, which takes the following form:

\[ \mu(A \cup B) = \mu(A) + \mu(B) + \lambda \cdot \mu(A) \cdot \mu(B), \qquad \lambda \in (-1, \infty) \]

If λ ≤ 0, µ is a sub-additive measure; if λ ≥ 0, µ is a super-additive measure; and µ is additive if and only if λ = 0. So here we have λ ≥ 0. When the λ-fuzzy measure is used to determine the importance measure µ, we simply need to determine the importance of each of the n single factors and λ.
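The saving offered by the λ-fuzzy measure can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions rather than the authors' code: the function names are invented, λ = 0.5 is a made-up value, and each group is given importance 1. The measure of any union of distinct groups is obtained by folding the λ rule over its members, so all 2^n − 1 subset values follow from n singleton values plus λ.

```python
from itertools import combinations


def lambda_measure(singleton_measures, lam):
    """Measure of a union of distinct groups under the lambda rule.

    Folds mu(A u B) = mu(A) + mu(B) + lam * mu(A) * mu(B) over the groups,
    starting from mu(empty set) = 0.
    """
    if lam <= -1:
        raise ValueError("lambda must lie in (-1, inf)")
    total = 0.0
    for g in singleton_measures:
        total = total + g + lam * total * g
    return total


def all_subset_measures(names, singleton_measures, lam):
    """Measure of every non-empty subset: 2**n - 1 values from n + 1 inputs."""
    table = {}
    for k in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), k):
            key = frozenset(names[i] for i in idx)
            table[key] = lambda_measure([singleton_measures[i] for i in idx], lam)
    return table


# Three groups as in Fig. 3, each with importance 1.
# ASSUMPTION: lam = 0.5 is an arbitrary illustrative value.
mu = all_subset_measures(["G1", "G2", "G3"], [1.0, 1.0, 1.0], lam=0.5)
print(len(mu))                             # 7
print(mu[frozenset({"G1", "G2"})])         # 2.5
print(mu[frozenset({"G1", "G2", "G3"})])   # 4.75
```

With λ = 0 the same function reduces to a plain sum, which is the additive case. Note that a single λ applies the same interaction strength to every pair of groups.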
306 Shiu, Li and Zhang
2.3.4. Determining the λ-fuzzy Measure µ. In our model, we consider that the importance of each competence group is equal to 1, i.e. $\mu(G_i) = 1$ ($1 \le i \le n$). This is a reasonable assumption because each group makes a unique contribution to the overall coverage, that is, the status of each group is considered to be equal.

The next task is to determine the parameter λ, which is critical for determining µ. It is obvious that the properties of the weak links between two groups are important factors for determining λ. In our model, coverage of a group refers to the area of the target problem space covered by the group. In this sense, the value of λ is closely related to the coverage of weak links and the density of their coverage sets. Consider two arbitrary new groups $G_i$ and $G_j$; the W-SET between them is $C^* = \{c_1^*, \ldots, c_h^*\}$. We define Coverage($C^*$) and Density($C^*$) as follows:

\[ \mathrm{Coverage}(C^*) = \sum_{i=1}^{h} \mathrm{RelativeCoverage}(c_i^*) \]

\[ \mathrm{Density}(C^*) = \sum_{i=1}^{h} \mathrm{GroupDensity}(\mathrm{Cov}(c_i^*)) \,/\, h \]

where $\mathrm{Cov}(c_i^*)$ is the coverage set of one of the weak links between $G_i$ and $G_j$.

The coverage contribution of $G_i \cup G_j$ must be directly proportional to Coverage($C^*$) and inversely proportional to Density($C^*$). With these assumptions, the parameter λ is given by the formula in Eq. (9).

\[ \lambda = \mathrm{Coverage}(C^*) \cdot (1 - \mathrm{Density}(C^*)) \tag{9} \]

[…]

where $F_\alpha = \{x \mid f(x) \ge \alpha\}$ for any $\alpha \in [0, \infty)$. When X is finite, the Choquet integral can also be defined in the same way with respect to a non-negative set function that is not necessarily monotone.

In our model, $X = \{G_1, \ldots, G_n\}$ is finite, $f_i = \mathrm{GroupCoverage}(G_i)$, and the importance measure µ satisfies:

\begin{align}
\mu(G_i) &= 1 \quad (1 \le i \le n); \notag \\
\mu(A \cup B) &= \mu(A) + \mu(B) + \lambda \cdot \mu(A) \cdot \mu(B) \quad (\lambda \ge 0), \notag
\end{align}

where λ is determined by Eq. (9).

The process of calculating the value of the Choquet integral is as follows:

(1) Rearrange $\{f_1, f_2, \ldots, f_n\}$ into non-decreasing order such that

\[ f_1^* \le f_2^* \le \cdots \le f_n^* \]

where $(f_1^*, f_2^*, \ldots, f_n^*)$ is a permutation of $(f_1, f_2, \ldots, f_n)$;

(2) Compute

\[ (c)\!\int f \, d\mu = \sum_{j=1}^{n} \left[ f_j^* - f_{j-1}^* \right] \cdot \mu(\{G_j^*, G_{j+1}^*, \ldots, G_n^*\}) \]

where $f_0^* = 0$.

The value of the Choquet integral is taken as the coverage of the competence group under consideration. Each competence group in the S-K model is treated in the same way, and the sum of all group coverages is the overall coverage of the given case-base.
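Steps (1) and (2) of the Choquet integral, together with the λ formula of Eq. (9), can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the function names are invented, the input numbers are hypothetical rather than taken from the experiments, and a single λ is assumed to hold for all sub-groups of G. Because every µ(G_i) equals 1, the measure of a set of groups depends only on how many groups it contains.

```python
def lambda_from_weak_links(relative_coverages, coverage_set_densities):
    """Eq. (9): lam = Coverage(C*) * (1 - Density(C*)) for one W-SET C*."""
    coverage = sum(relative_coverages)
    density = sum(coverage_set_densities) / len(coverage_set_densities)
    return coverage * (1 - density)


def measure_of_k_groups(k, lam):
    """mu of any k groups when every mu(G_i) = 1, folding the lambda rule."""
    total = 0.0
    for _ in range(k):
        total = total + 1.0 + lam * total
    return total


def choquet_coverage(group_coverages, lam):
    """Discrete Choquet integral of f_i = GroupCoverage(G_i) w.r.t. mu."""
    ordered = sorted(group_coverages)      # step (1): f*_1 <= ... <= f*_n
    n = len(ordered)
    total, previous = 0.0, 0.0             # f*_0 = 0
    for j, f in enumerate(ordered):        # step (2)
        remaining = n - j                  # size of {G*_j, ..., G*_n}
        total += (f - previous) * measure_of_k_groups(remaining, lam)
        previous = f
    return total


# HYPOTHETICAL inputs: one weak link with relative coverage 1 whose coverage
# set has density 0.5, and two sub-groups with S-K coverages 2.0 and 3.0.
lam = lambda_from_weak_links([1.0], [0.5])
print(lam)                                  # 0.5
print(choquet_coverage([2.0, 3.0], lam))    # 6.0
print(choquet_coverage([2.0, 3.0], 0.0))    # 5.0
```

With λ = 0 the integral collapses to the plain sum of the group coverages, while λ > 0 adds the extra coverage contributed by the weak links.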
can be regarded as n agents $A_1, A_2, \ldots, A_n$ for problem solving in a distributed manner. The corresponding case-bases are $CB_1, CB_2, \ldots, CB_n$, with competence groups $G_1, G_2, \ldots, G_n$ respectively.

Compute the Group Competence

In computing the competence, we define the similarity between two cases p and q by the following equation:

\[ SM_{pq} = 1 \Big/ \Big( 1 + \sum_{j=1}^{n} (x_{pj} - x_{qj})^2 \Big), \]

where $x_{ij}$ corresponds to the value of feature $F_j$ ($1 \le j \le n$), ($i = 1, \ldots, n$).

Step 1. Detect the weak links in the above competence group $G_i$ ($i = 1, 2, \ldots, n$):
If $\exists c^* \in G_i$ s.t. $\mathrm{CompetenceError}(c^*) \ge \alpha$, then the competence group $G_i$ is a non-uniformly distributed competence group. Otherwise, $G_i$ is a quasi-uniformly distributed competence group.

Step 2. Partition the CBR agents according to their competence:
1. CCBR1 ← ∅, CCBR2 ← ∅, i = |G|; where CCBR1 consists of those agents that have no feature interactions (or no overlaps among competence groups), while CCBR2 consists of those agents that have feature interactions (or overlaps among competence groups).
2. If (i ≠ 0), compute CompetenceError(c), ∀c ∈ G, i = i − 1;
3. If there is no weak link in G, then G is called a quasi-uniformly distributed competence group and is added to CCBR1; otherwise G is called a non-uniformly distributed competence group and is added to CCBR2; end;
4. For ($1 \le j \le n$), G ← $G_j$ ($j = 1, 2, \ldots, n$), repeat the above Steps 1 to 3.

Step 3. Compute the competence of each CBR agent in CCBR1 and CCBR2:
1. Compute the competence of each CBR agent in the CCBR1 group according to the S-K model. Since the cases are distributed uniformly in each CBR agent, we can get the competence using the S-K competence model [5] directly. The agents are then ranked in descending order according to their competence, denoted as $C_1^1, C_2^1, \ldots, C_{m_1}^1$.
2. Compute the competence of each CBR agent in the CCBR2 group according to the S-L model, and rank them as $C_1^2, C_2^2, \ldots, C_{m_2}^2$.

Step 4. Rank the CBR agents according to the respective competence of each CBR agent in CCBR1 and CCBR2:
According to the competence, rank the CBR agents in the CCBR1 and CCBR2 systems in descending order as $\{A_1^1, A_2^1, \ldots, A_{m_1}^1\}$, $\{A_1^2, A_2^2, \ldots, A_{m_2}^2\}$.

4. Query Dispatching Policies

Three query dispatching policies based on the competence of the CBR agents are proposed here: the To-Top policy, the Strong-Strong policy and the Best-Committee policy. They are described below.

4.1. To-Top Policy

The main idea of this policy is to choose the CBR agent which has the maximal competence in the corresponding CCBR system, i.e. $A_1^1$ or $A_1^2$ in the CCBR1 and CCBR2 systems respectively. CCBR1 is chosen as the problem-solving system if there are no feature interactions, otherwise CCBR2 is chosen. For example, in the travel-planning problem described in Section 5, the hotels are classified by the number of stars, so when the user specifies the type of accommodation (e.g. the number of stars), this limits the choice of available hotels. In this case, the features “accommodation” and “hotel” are interacting. The dispatching procedure is as follows: if agent $A_i$ receives an input query, it will try to solve it. When the solution is satisfactory (i.e. within a user-defined threshold of solution accuracy and efficiency), it becomes the answer to the input query. Otherwise, the agent will dispatch the problem to $A_1^1$ or $A_1^2$ for solving. If $A_i$ is one of the agents in CCBR1, then it dispatches the problem to the agent $A_1^1$; otherwise the problem goes to $A_1^2$ in CCBR2.

4.2. Strong-Strong Policy

In this policy, we assume that we cannot determine whether the features interact or not. Consider the travel-planning problem again: we are not sure whether there are feature interactions between the features “season” and “hotel” or between
the features “holiday duration” and “season”. Thus, it is better to ask more than one agent to suggest solutions. We will choose the most competent agents (i.e. one from each collaborative CBR system). That is, if the agent $A_i^j$ ($i \ne 1$, $j = 1, 2$) receives the problem and cannot solve it satisfactorily, it will ask the agent $A_1^1$ in CCBR1 and the agent $A_1^2$ in CCBR2 to solve it in parallel. One of these suggested solutions will be used, based on an earlier assessment of these two agents' ability. (Note that these two agents belong to the two CCBR systems, so the selection has already considered the feature interaction property.)

4.3. Best-Committee Policy

If time is not critical and getting a better solution is the main concern, then the user can ask more agents for suggested solutions. In general, the more agents are involved, the more accurate the answer will be. That is, if the agent $A_i^j$ ($i \ne 1$, $j = 1, 2$) receives the problem, it can follow the To-Top and Strong-Strong policies first for solving the problem. However, if the solution is not satisfactory, it can ask the agents $A_1^j, A_2^j, \ldots, A_{i-1}^j$ (i.e. those agents that are better in competence) to solve the problem. Each agent will offer a solution to the problem, and the final solution is chosen according to the user's preference, such as preferred accuracy. This policy gives the user a flexible choice: a user who wants the best solution can ask all the agents for suggestions. This policy is the same as the “Committee” policy proposed by Plaza and Ontañón.

[…] groups (i.e. each group represents one CBR agent, so if 4 agents are used, each of them consists of 200 cases, etc.). The feature “price” is chosen as the solution feature. The testing is based on the evaluation of the solution accuracy and the mean cost of solving (i.e. time consumption).

The objective of the experiment is to determine the “price” of each travel plan using our proposed policies. A comparison of our approach to some existing ones [4] is also carried out. The mean relative error (i.e. the difference between the actual result and the predicted result, divided by the actual result) is used to compute the accuracy. For example, in our experiment, if four agents are used to predict the “price” of a particular testing case (such as Case number 987), our three policies give the following results respectively: $4,708.25, $3,536.72, and $4,561.35. The accuracies are 84.94%, 86.43% and 88.26% respectively (see Fig. 4). The mean cost is the CPU time relative to that of the isolated agent; assuming the mean time cost of the isolated agent is ONE unit, the mean time costs of the collaborative policies are given in Fig. 5. We have conducted five testing runs, each with a different number of agents. These agents are formed by randomly re-organizing the 800 testing cases.

The results show that our policies use less time and can still achieve the same accuracy as the other existing ones. In detail, Fig. 4 shows that all six case dispatching policies are better than the isolated agent. The Strong-Strong policy is better than the To-Top policy, and the Best-Committee policy is the best one. The Strong-Strong policy has similar accuracy to the
5. Experimental Evaluation
Acknowledgment
References
6. Conclusions
between 1985 and 1990 in several business organizations in Hong Kong. His current research interests include Case-Based Reasoning, Machine Learning and Soft Computing. He has co-authored (with Professor Sankar K. Pal) a research monograph Foundations of Soft Case-Based Reasoning published by John Wiley in 2004. Dr. Shiu is a member of the British Computer Society and the IEEE.

the Department of Computing, the Hong Kong Polytechnic University. Her interests include fuzzy mathematics, case-based reasoning, rough sets theory and information retrieval.