Core classification theory: a reply to Szostak
Birger Hjørland
Royal School of Library and Information Science, Copenhagen, Denmark
Received 7 January 2008
Revised 9 January 2008
Accepted 14 January 2008
Abstract
Purpose – The purpose of this paper is to provide an answer to a critique put forward by Szostak
against a paper written by the present author.
Design/methodology/approach – The paper offers a literature-based conceptual analysis
of Hjørland and Nissen Pedersen (2005) and Szostak (2008). The main points in a core theory of classification
are outlined and Szostak’s criticism is examined and answered.
Findings – The paper demonstrates theoretical differences between the views adduced by Hjørland
and Nissen Pedersen on the one side and by Szostak on the other.
Downloaded by USP At 11:00 10 October 2017 (PT)
Practical implications – Theoretical clarification is important for the future development of the
field.
Originality/value – The paper should be seen as one among others developing an argument for a
theoretical foundation of classification informed by the theory of knowledge.
Keywords Classification, Knowledge organizations
Paper type Conceptual paper
Introduction
This paper is an answer to Szostak (2008), who discusses claims put forward in Hjørland
and Nissen Pedersen (2005). Szostak’s points of view correspond to what he has
expressed in other publications (Szostak, 2004; Szostak et al., 2007). One of his main
points is that disciplinary classifications are obsolete and should be replaced by
interdisciplinary classifications of phenomena, theory and method. We agree on
the value of classification (contrary to the view expressed by Sparck Jones (2005)). This
paper outlines the two positions and provides further arguments for my own position.
Among the claims made by Szostak (2008) are:
• That interdisciplinarity is important but ignored by Hjørland and Nissen
  Pedersen (2005).
• That the distinction between a positivist and a pragmatic approach to
  classification made by Hjørland and Nissen Pedersen is important, and that
  interdisciplinarity may be seen as the pragmatic objective, that can define
  classification criteria.
• Concepts should refer to specific phenomena or sets of phenomena, or theories or
  methods or components thereof.
• That Hjørland and Nissen Pedersen (2005) suggest an inductive methodology,
  but that a combination of both inductive and deductive methods should be used
  to develop classification.
• That scholarly works should be classified in terms of the phenomena studied and
  theory and method used by these works.
Journal of Documentation, Vol. 64 No. 3, 2008, pp. 333-342. © Emerald Group Publishing Limited, 0022-0418. DOI 10.1108/00220410810867560
understood as incomplete.
(3) Description (or every other kind of representation) of objects is both a reflection
of the thing described and of the subject doing the description. Descriptions are
more or less purposeful and theory-laden. They are made from a perspective,
whether or not this is recognized. Pharmacologists, for example, in their
description of chemicals, emphasize the medical effects of chemicals, whereas
“pure” chemists emphasize other things such as their structural properties.
(4) The selection of the properties of the objects to be classified must reflect the
purpose of the classification. There is thus no “neutral” or “objective” way to
select properties for classification. Example: Whether to classify by form or
color. The figures shown in Figure 1 may be classified according to color or
shape. Neither of those properties is “objectively” more important than the other.
For some purposes the two squares are most alike and should be classified
together. For other purposes the two black figures (a square and a triangle) are
most alike and should be classified together.
(5) The (false) belief that there exist objective criteria for classification may be
termed “empiricism” (or “positivism”), while the belief that classifications
always reflect a purpose may be termed “pragmatism”. Hjørland and Nissen
Pedersen (2005) is thus an argument for the pragmatist way of understanding
classification.
(6) We saw that different domains (chemistry and pharmacology) may need
different descriptions and classification of objects to serve their specific purpose
in the social division of labor in society. The criteria for classification are thus
generally domain-specific. Different domains develop specific languages (LSPs)
that are useful to describe, differentiate and classify objects in their respective
domain. As the pragmatic philosopher John Dewey wrote:
Cherry trees will be differently grouped by woodworkers, orchardists, artists,
scientists and merry-makers. To the execution of different purposes different ways of
acting and re-acting on the part of trees are important. Each classification may be
equally sound when the difference of ends is borne in mind (Dewey, 1948, pp. 151-4).
Figure 1. Classification criteria
(7) In every domain different theories, approaches, interests and “paradigms” exist,
which also tend to describe and classify the objects according to their views and
goals. For example, psychoanalysis and biological psychiatry disagree on how
mental illness should be classified and they disagree about the value of a
particular classification scheme such as the DSM-IV. (The documentation for this
claim is collected and continuously updated in the descriptions of the different
domains in the Epistemological Lifeboat (Hjørland and Nicolaisen, 2005)).
(8) Any work on any subject is always made from a point of view, which may be
uncovered by analysis (e.g. a feminist point of view or a “traditional” or an
eclectic point of view). The same is the case with any classification. Ørom (2003)
sense, is really needed for retrieval in many, or most, cases, or whether classification
in the general (i.e. default) retrieval context has a quite other interpretation. Relevance
feedback simply exploits term distribution information along with relevance
judgements on viewed documents in order to modify queries. In doing this it is
forming and using an implicit term classification for a particular user situation. As
classification the process is indirect and minimal. It indeed depends on what
properties are chosen as the basic data features, e.g. simple terms and, through
weighting, on the values they can take; but beyond that it assumes very little from the
point of view of classification. It is possible to argue that for at least the core retrieval
requirement, giving a user more of what they like, it is fine. Yet it is certainly not a big
deal as classification per se: in fact most of the mileage comes from weighting. And
how large that mileage can be is what retrieval research in the many experiments
done in the last decade have demonstrated, and web engines have taken on board
(Sparck Jones, 2005).
We agree that classification criteria are implicit in the literature to be retrieved,
as outlined above. Sparck Jones asks “whether classification in the conventional,
explicit sense, is really needed for retrieval”. Our answer to this question is that
no retrieval mechanism (and also no definition of “relevance”) is ever neutral;
each always favors some interests at the expense of others. To
make a distinction between such views is to make a kind of classification, which
is thus always necessary. To believe in a technical solution employing
“relevance feedback” is to relapse into the positivist failure. The vision of
automated feedback and value-free systems is tempting but rests on
problematic philosophical assumptions.
Suppose, for example, a person is searching for information about “Sweden”.
Some references are retrieved by using search terms (or otherwise). The user
indicates which references are relevant and the system is supposed to find
“more like this”. In a traditional classification, all Swedish place names may be
classified (e.g., Borås, Lund, Malmö, Stockholm). Can such a classification be
replaced by mechanisms providing relevance feedback? One problem might be
that the user does not know which place names are Swedish and which are not.
He may provide incorrect feedback (e.g. by stating that a reference
about “Bagsværd”, a Danish place name, is relevant). Users of systems based on
relevance feedback may therefore be unable to retrieve the relevant documents
and to avoid the non-relevant ones. In other words:
classification in the traditional sense is still needed.
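The worry in the Sweden example can be made concrete with a small sketch. It is only an illustration under stated assumptions: the five toy documents, the term-overlap scoring and the function names `more_like_this` and `by_classification` are hypothetical and are not taken from Sparck Jones (2005) or from any real retrieval system. Feedback is modelled as nothing more than collecting the terms of the documents a user marks as relevant and ranking the remaining documents by how many of those terms they share.

```python
# Toy "more like this" relevance feedback (hypothetical data and scoring).
DOCS = {
    "d1": {"malmö", "harbour", "city"},
    "d2": {"lund", "university", "city"},
    "d3": {"bagsværd", "lake", "rowing"},    # Bagsværd is in Denmark
    "d4": {"copenhagen", "lake", "rowing"},  # also Danish
    "d5": {"stockholm", "university", "city"},
}

# What a traditional classification records explicitly.
SWEDISH_PLACES = {"borås", "lund", "malmö", "stockholm"}


def more_like_this(marked_relevant, docs):
    """Rank unmarked documents by term overlap with the marked ones."""
    feedback_terms = set()
    for doc_id in marked_relevant:
        feedback_terms |= docs[doc_id]
    scores = {
        doc_id: len(terms & feedback_terms)
        for doc_id, terms in docs.items()
        if doc_id not in marked_relevant
    }
    # Highest overlap first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


def by_classification(docs, class_terms):
    """Retrieve documents that mention any place name in the class."""
    return sorted(d for d, terms in docs.items() if terms & class_terms)


correct_feedback = more_like_this({"d1"}, DOCS)
mistaken_feedback = more_like_this({"d1", "d3"}, DOCS)
classified = by_classification(DOCS, SWEDISH_PLACES)
```

In this toy setting, once the mistaken judgement about `d3` is included, the Danish document `d4` moves to the top of the ranking, whereas the explicit class of Swedish place names retrieves `d1`, `d2` and `d5` regardless of what the user happens to know.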
These 11 points constitute a fundamental theory of classification. We shall now
compare these principles with Szostak’s points of view.
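Before that comparison, point (4) can be illustrated with a minimal sketch. Figure 1 itself is not reproduced here, so the three figures below are an assumption based on the text, which mentions two squares and two black figures (a square and a triangle). The color of the second square is a guess made for illustration only, and `group_by` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical stand-ins for the figures discussed in point (4).
FIGURES = [
    {"name": "A", "shape": "square", "color": "black"},
    {"name": "B", "shape": "square", "color": "white"},  # color assumed
    {"name": "C", "shape": "triangle", "color": "black"},
]


def group_by(items, prop):
    """Group item names by the value of one chosen property."""
    groups = defaultdict(list)
    for item in items:
        groups[item[prop]].append(item["name"])
    return dict(groups)


by_shape = group_by(FIGURES, "shape")
by_color = group_by(FIGURES, "color")
```

Grouping by shape puts A and B together, while grouping by color puts A and C together. Nothing in the data decides between the two results; the choice of property is made by whoever calls `group_by`, which is the sense in which the selection of properties reflects a purpose.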
Discussion of the points of view put forward by Szostak (2008)
(1) Interdisciplinarity
Szostak opined that “Hjørland and Nissen Pedersen (2005) end their recent paper in this
Journal with a claim that ‘a theory of classification is especially connected to science
studies’ (594). They do not make a similar claim regarding ‘interdisciplinary studies’”.
There is, however, a very important difference between the way we believe Science
Studies are important for knowledge organization (KO) and the way Szostak believes
interdisciplinary studies are important to the same field. Different disciplines serve
different purposes and, according to our classification principles (3)-(6), different
classifications are developed to serve each discipline and reflect the particular needs
of each discipline. Szostak, on the other hand, seems to suggest that one overall
interdisciplinary classification is what is needed, in which case classifications would
not reflect their different purposes. This leads us to his next point.
(4) That Hjørland and Nissen Pedersen (2005) suggest an inductive methodology, but
that a combination of both inductive and deductive methods should be used to develop
classification
Szostak is right in claiming that a combination of inductive and deductive methods
should be used in classification. I shall not here trace how Szostak reached the
conclusion that Hjørland and Nissen Pedersen (2005) expressed a different view. What
is important is to clarify the methods of classification. I believe induction and
deduction are both necessary, but not sufficient, even in combination. The use of the
inductive method may be termed empiricism, the use of the deductive method may be
termed rationalism, and their combination may be termed positivism. My main
argument has for years been that more than these two methods are needed. Historicism
is a third method (e.g. recommended by Ereshefsky, 2000, in biological taxonomy). The
fourth needed method is pragmatic analysis (pragmatism), already mentioned. It is
important to realize that historicism and pragmatism are different kinds of necessary
methods. Many people, including Szostak, believe that only induction and deduction
are proper methods, that these two methods cover the whole field. I shall not try to
provide the technical argument here, but maintain that historicism and pragmatism
cannot be reduced to induction or deduction (see Table I).
It is also important to consider that classification is made in the sciences as well as
in information science/knowledge organization. The first kind of classification may be
termed “scientific classification”, the second “bibliographic classification”. Any theory
of bibliographic classification should reveal how it is related to scientific
classification. Today, principles of scientific classification are almost totally ignored
by information scientists. Szostak seems to believe that when we claim the importance
of scientific classification for bibliographic classification, we are suggesting an
“inductive method”.
(5) That scholarly works should be classified in terms of the phenomena studied and
theory and method used by these works
This principle is contrasted to classification by discipline. These two ways of
classification have a history in knowledge organization and a body of literature and
arguments. Classification by phenomenon has also been termed “entity” or “one place”
classification, while classification by discipline has also been termed “aspect”
Table I.

Empiricism (Observations and inductions)
   “Scientific classification”: Classification provided by statistical generalizations
   (e.g. factor analysis) based on “similarity”. Examples: Classifications of mental
   illness in psychiatry (DSM-IV); kinds of intelligence in psychology based on
   statistical analysis of test scores
   “Bibliographic classification”: Documents clustered on the basis of some kind of
   similarity, e.g. common terms in traditional IR or bibliographical coupling.
   Examples: “Atlas of science” and visualizations (White and McCain, 1998).
   “Research Fronts” in SCI and algorithms for information retrieval

Rationalism (Principles of pure reason. Deductions)
   “Scientific classification”: Classification based on logical, universal divisions.
   Examples: Frame based systems in Artificial Intelligence. Chomsky’s analysis of
   the deep structure in language
   “Bibliographic classification”: Facet analysis built on logical divisions and
   “eternal and unchangeable categories”. Examples: Ranganathan, Bliss II and
   Langridge. Semantic networks. According to Miksa (1998) the DDC has
   increasingly used this approach

Historicism
   “Scientific classification”: Classification based on historical or
   “Bibliographic classification”: Systems based on the study of the
disciplinary classification seems to have been the most useful alternative for scholarly
literature). The idea to classify by phenomena is certainly not new. What kind of new
evidence or arguments does Szostak put forward in favor of his preference? As far as I
can see, the answer is “none”, because increased interdisciplinarity has also formerly
been used as an argument.
We may also ask: Why choose between these two kinds of classification? Why not
produce both? They may supplement each other in a fruitful way. In the new digital
environment we do not have the same limitations as when the classical systems like the
DDC were developed. Toulmin (1972) differentiates between the content-knowledge of
a science and the institutional aspects of science, such as the professional forums. He
suggests that science is generally continuous because either the content or the
institution will remain stable while the other changes. In response then the first will
adapt, in an iterative process of constant change and constant stability. There is
continuity because each generation is always taught by the preceding generation of
scientists and also because the research questions in which a community is interested
are predicated on the current concepts they hold, even when the results of such
research might indicate that changes are needed to better adapt the concepts in
response to other concepts or other facts about nature. Toulmin’s differentiation
between content-knowledge and institutional aspects corresponds to Hjørland and
Hartel’s (2003) ontological versus social dimension of a domain. Both dimensions
should be studied by information scientists.
Conclusion
Szostak is arguing for an interdisciplinary classification of phenomena, theory and
method. I believe that implicit in this goal are points of view which are in disagreement
with the pragmatic theory of classification, which I find most fruitful for information
science.
Szostak’s position seems to be based on what we termed the positivist view: that
we can describe and list all properties of objects independent of our theories and
interests, that different purposes need the same classification and that
classifications of phenomena, theories and methods are classifications of independent
dimensions. I believe that the phenomena to be classified are discovered by, for
example, a science such as chemistry, which is why these phenomena cannot be and
should not be separated from the fields of human activity to which they belong. Why,
for example, biological species are classified the way they are can only be understood by
considering the development of biological systematics. Classification theory in Library
and Information Science (as well as in Philosophy) should be understood as a
metascience. One cannot study metascience while ignoring the sciences one is studying.
Metascience is an interpretation of what good science is, based on the history of
science. It cannot offer a priori principles on how science should be performed.
In spite of these theoretical differences, Szostak’s contributions are most welcome.
We need dedicated scholars to engage in this important area, and science advances
through discussion.
References
Dewey, J. (1948), Reconstruction in Philosophy, enlarged ed., Beacon, New York, NY, (originally
published 1920).
Dupré, J. (2006), “Scientific classification”, Theory, Culture & Society, Vol. 23 Nos 2-3, pp. 30-2.
Ereshefsky, M. (2000), The Poverty of the Linnaean Hierarchy: A Philosophical Study of Biological
Taxonomy, Cambridge University Press, Cambridge.
Gordon, A.D. (1987), “A review of hierarchical-classification”, Journal of the Royal Statistical
Society, A, Vol. 150, pp. 119-37.
Hjørland, B. (2003), “Fundamentals of knowledge organization”, Knowledge Organization, Vol. 30
No. 2, pp. 87-111.
Hjørland, B. and Hartel, J. (2003), “Afterward: ontological, epistemological and sociological
dimensions of domains”, Knowledge Organization, Vol. 30 Nos 3-4, pp. 239-45.
Hjørland, B. and Nicolaisen, J. (2005), “The epistemological lifeboat. Epistemology and
philosophy of science for information scientists”, available at: www.db.dk/jni/lifeboat/.
Hjørland, B. and Nissen Pedersen, K. (2005), “A substantive theory of classification for
information retrieval”, Journal of Documentation, Vol. 61 No. 5, pp. 582-97.
Miksa, F.L. (1998), The DDC, the Universe of Knowledge, and the Post-modern Library, Forest
Press, Albany, NY.
Mills, J. and Broughton, V. (1977), Bliss Bibliographic Classification, Second Edition. Introduction
and Auxiliary Schedules, Butterworths, London.
Ørom, A. (2003), “Knowledge organization in the domain of art studies – history, transition and
conceptual changes”, Knowledge Organization, Vol. 30 Nos 3/4, pp. 128-43.
Schneider, J. and Borlund, P. (2007), “Matrix comparison, part 1: motivation and important issues
for measuring the resemblance between proximity measures or ordination results”,
Journal of the American Society for Information Science and Technology, Vol. 58 No. 11,
pp. 1586-95.
Sparck Jones, K. (2005), “Revisiting classification for retrieval”, Journal of Documentation, Vol. 61
No. 5, pp. 598-601, (reply to Hjørland & Nissen Pedersen, 2005).
Szostak, R. (2004), Classifying Science, Phenomena, Data, Theory, Method, Practice, Springer,
Berlin.
Szostak, R. (2008), “Classification, interdisciplinarity, and the study of science”, Journal of
Documentation, Vol. 64 No. 3.
Szostak, R. et al. (2007), “The León manifesto”, paper presented at the 8th Conference of the ISKO
Spanish Chapter, available at: www.iskoi.org/ilc/leon.htm (accessed December 12, 2007).
Toulmin, S. (1972), Human Understanding: The Collective Use and Evolution of Human Concepts,
Princeton University Press, Princeton, NJ.
Wallerstein, I. (1996), Open the Social Sciences, report of the Gulbenkian Commission on the
Restructuring of the Social Sciences, Stanford University Press, Stanford, CA.
White, H.D. and McCain, K.W. (1998), “Visualizing a discipline: an author co-citation analysis of
information science, 1972-1995”, Journal of the American Society for Information Science,
Vol. 49 No. 4, pp. 327-55.
Further reading
Toulmin, S. (1977), Human Understanding: The Collective Use and Evolution of Human Concepts,
paperback ed., Princeton University Press, Princeton, NJ.
Corresponding author
Birger Hjørland can be contacted at: bh@db.dk