
Is the availability of more data always helpful in the production of knowledge? Explore this question with reference to two areas of knowledge.

Data is often used in the process of extracting, creating and growing knowledge. In
some areas of knowledge, like the ones analyzed in this essay, the more data is
acquired, the easier the production of knowledge will be. This means that the data
collected could be helpful in the production of knowledge. However, data has a different meaning in
different areas of knowledge. In the Natural Sciences data is information (statistical, quantitative or
factual) in its raw form that can be analyzed and used in order to gain knowledge or make decisions.
In History, data takes the form of past records. Data is only of use in the
production of knowledge if the human mind can process and comprehend it,
create the required links within a pool of data, and then obtain significant value from it. For example, as
historical accounts are written by people, the data available to historians tends to be subjective, and it is
up to historians to read through this data and come up with the most accurate version of the past. In
this essay, I will analyze the role of data in the production of knowledge.

The sciences are mostly an evidence-based area of knowledge, as opposed to history. This means
that acquiring data is very important in the production and development of
knowledge. The more data scientists manage to collect, the easier it will be for them to
develop and produce knowledge. Combining all of this data gives accurate results, especially in
complex subjects such as computer science. Artificial intelligence is the simulation of human
intelligence processes by machines, especially computer systems. Particular applications of AI include
expert systems, speech recognition, and machine vision. This is accomplished by using algorithms
that detect patterns and create meaningful knowledge from the data they
are presented with, which is then applied to future decisions, a procedure that avoids the
need to be programmed explicitly for every conceivable action. In other words, the
technology that the world is relying on right now (Artificial Intelligence) relies on the data that
is provided to it, which helps the machine to make future predictions (by learning outcomes and
patterns). As the data provided to the machines increases, the degree of uncertainty generally
decreases and the accuracy of the produced knowledge increases. For instance, one common
example of AI is Apple’s Siri, a voice assistant that recognizes
spoken language and gives answers limited by the information its developers have
made available to her. If we ask Siri “how is the weather?”, using the location of the phone and recorded
weather reports, Siri will most likely provide the user with an accurate answer. However, if we ask
Siri something which is not in her database (“how many lemons does it take to make a 300ml glass of
mildly sweet lemonade?”), she will most likely provide website links and say, “I have found this on
the internet”, which means that, due to a lack of data, the user is left to find the answer
themselves.

However, many people argue that, as science is an area of knowledge that changes over time, the
continuous gathering of data may complicate experiments and the theories being postulated, leading
many scientists down wrong or even redundant paths. Take the “Phlogiston Theory”: at
first, the data scientists had acquired wrongly led them to believe that a substance called
phlogiston did indeed exist. The data gathered in support of the Phlogiston Theory made
scientists believe something that wasn’t accurate. It was only later that other scientists falsified and
corrected this theory with newly found data.

I disagree with these people because the collection and acquisition of more data can and will
make the production of scientific knowledge a lot easier. It not only speeds the process
up, but also improves the accuracy and certainty of the knowledge acquired, making it more
reliable. It can also help to falsify and correct other theories, like the Phlogiston Theory I mentioned
before. However, this raises a question: what if, in a few years, when researchers secure
more information and data, they disprove the
oxygen theory or other theories that we now believe are accurate?

History, on the other hand, is a subject where there is no guarantee. Everything is real if it is in the
books and if you believe it. It is purely subjective. As a saying often attributed to Winston Churchill goes, ‘history is written by the
victors’. This raises the concern of whether the data is reliable. Having more data available
would be of no use if none of it is trustworthy. Take the example of the most widely used resource, the
internet: if one day Wikipedia announced that all the information it provides is fake, many students
across the globe would most probably struggle in their academics. We know that the
USA attacked Japan with atomic bombs because it is recorded in books, transcripts and videos,
authenticated by trustworthy organizations and published by known authorities (either on television or
in print).

However, subjective ambiguity is still not completely eliminated if we wish to acquire
knowledge of events from the distant past, due to different perspectives and versions of the same story. For
instance, if I were to search Google about the Second World War to see whether the USA did the right thing by attacking
Japan, I would most certainly get a mixture of data encompassing different accounts of the same event,
culminating in varied opinions. But what if none of what I find is true? What if all the perspectives are
biased and partial towards one side? Then trusting the resource would be completely pointless.

This leads us to the layered verification method: checking a variety of resources. We are often
told to verify news or information using different sets of resources, a variety of data types (statistical,
transcripts, textual), and different people’s past experience and first-hand knowledge (interviews,
surveys) in order to confirm the validity of the data that we have and authenticate it. Therefore, the more I
read about a topic, and the more I explore and exhaust my resources, the more deeply I can understand the scenario
and the better qualified I am to put forward my own perspective. This gives my argument a stronger foundation
and more precision, making the knowledge produced more accurate. Yet, if the reliability of the
resource remains unknown, the availability of more data does not help in the production of knowledge.

To conclude, I think that acquiring more data can make the production and development of
knowledge easier; however, this depends on the area of knowledge the data is being applied to. For
example, as I mentioned before, in history it is imperative for the resources to be authenticated and
trustworthy in order to produce accurate knowledge. If the reliability remains unknown, the availability
of data will not help in the production of knowledge. On the other hand, in the sciences, acquiring more data
is often helpful for the production of knowledge because it makes it
easier to prove theories or carry out operations in the field of data science (Artificial Intelligence and
Machine Learning). This also shows that these two areas of knowledge do not work in the same way:
history is mostly a subjective area of knowledge, whereas the
sciences are more of an evidence-based area of knowledge and need data to justify theories. The
usefulness of additional data in the production of knowledge
ultimately depends on the area of knowledge under consideration and how the data is applied
within it.

