
Research Methods II

Chapter 1: Introduction
What is NOT research?
o Just collecting facts or information with no clear purpose
o Reassembling and reordering facts or information without interpretation
o A term to get your product or idea noticed and respected

What is research?
o Data are collected systematically
o Data are interpreted systematically
o There is a clear purpose = to find things out

Definition: Something that people undertake in order to find out things in a systematic way,
thereby increasing their knowledge.

BUSINESS AND MANAGEMENT RESEARCH
o Transdisciplinary nature
o Development of ideas that are related to practice; requirement to have some practical consequence
o Personal or commercial advantages related to research
o Theory + practice (?)

In sum, research (also business and management research) should
o Collect data systematically
o Interpret data systematically
o Have a clear purpose: to find things out

Different types of research exist, depending on purpose and context.

FUNDAMENTAL RESEARCH
Purpose:
o Expand knowledge of processes
o Universal principles
o Findings of significance and value to society in general
o Find rules that might explain a theory (E.g.: How students are motivated)
Context:
o Universities (mostly academic context)
o Choice determined by researcher
o Flexible time scales

APPLIED RESEARCH
Purpose:
o Improve understanding of particular problem
o Results in solution to problem
o New knowledge limited to problem
o Findings of particular relevance
o To solve practical problems (E.g.: lack of motivation in a company)
Context:
o Organisations and universities (E.g.: Consultancy)
o Negotiation with originator
o Tight time scales

You develop a model that can be used in other situations.

These two types of research can interact.


Question:
A research team will investigate which location is most suited for the establishment of a new
Carrefour supermarket. This research project leans towards:
A) Fundamental research
B) Applied Research
Answer: A, because other companies can use the result as well.
A research study was carried out to see whether and why people notice web addresses on television
adverts. This study leans towards:
A) Fundamental research
B) Applied Research
Answer: A. It yields a principle that can be used by other companies.
A study was carried out to see how the rise of the Internet has changed consumers' buying behaviour.
This study leans towards:
A) Fundamental research
B) Applied Research
Answer:
Which of the following statements is wrong?
A) Research is either fundamental or applied
B) Fundamental research might be of practical relevance
C) The key outcomes of applied research are actionable results with practical impact
D) Research might be of both practical and theoretical relevance
Answer: A. Fundamental and applied research are the two extremes; a study can also be a mixture of both.

Wherever your research lies, whether fundamental or applied, or anywhere in between, you should
undertake it with rigour.
Pay careful attention to the research process!

Chapter 2: The research topic


THE RESEARCH PROCESS
o Formulate and clarify the research topic (What: formulate the question. Make a good/relevant question)
o Develop the research design (How am I going to carry out the research? Who is my target group?)
o Gather data
o Analyse and interpret the data
o Write the project report (Write up the research work with an answer to the question)

FORMULATE AND CLARIFY THE RESEARCH

Research topic
Why are you doing research?
Basically, because you are dealing with a certain problem or question; you want to find something out. There must be a problem before you can conduct research.
Why this particular problem or question?
o Intellectual reasons
o Practical reasons
o Personal reasons

Formulating Research Questions

Research question = The question you will try to answer by means of your research.
This question should give direction to your research.
No good research question = No good answer!!!
Example: "Was our recent advertising campaign successful?"
This is NOT a good question: what do we define as "successful", "recent" and "advertising campaign"?
We have to be as concrete as possible by defining all the elements, and narrow the question down as much as possible.
Make sure you critically evaluate your research question(s):
o Suitable focus?
o Am I able to investigate my research question(s)?
o Is it feasible to investigate my research question(s)?
Suitable Focus?
No simplistic inductive research question
o Entirely separate from previous research or theory
No omnivore-research question
o Involves as many elements as possible; wants to investigate everything
o So broad that any profundity becomes impossible
No theoretical research question
o Entirely separate from empiricism; not tuned into social reality
Every research project involves making choices:
o Not everything can be investigated, and what you investigate cannot be known to perfection
Make sure you make this clear:
o As a researcher, you should communicate the theoretical and practical boundaries/limitations of your research question(s) (and thus research results)

Am I able to investigate my research question(s)?
o Do I have the appropriate research skills?
o Is it ethical? (E.g.: Can young people buy the product?)
Is it feasible to investigate my research question(s)?
Time?
o Longitudinal research (measurements repeated several times over a long period). It enables us to see whether the problem changes over time.
o Cross-sectional research (investigate/compare different population groups at a single point in time)
Money?
Access to the research objects and their willingness to participate?
o Am I able to get access to the data?

Research Questions: Some examples

Problem statement:
The Flemish movie industry is growing but still has to cope with limited attendance, possibly caused by the general public not being familiar with such movies. It is not clear how this lack of familiarity could be countered.
Research questions?
o How many times do Flemish people go to the cinema to watch Flemish movies?

Problem statement:
Company X deals with significant financial problems. To solve these problems, turnover should increase by at least 10 percent during the next 18 months.
Research questions?
o What is the cause of the financial problem?
o How can company X increase turnover by 10% during the next 18 months?

Problem statement:
Previous academic research thoroughly investigated the shoplifting phenomenon as it negatively influences business, other consumers, and society more generally. Although many examined the socio-demographic profile of shoplifters, research about how to prevent shoplifting among consumers is rather scarce.
Research questions?
o Why do consumers shoplift?
o What is the most effective way to fight shoplifting?

THE NATURE OF YOUR RESEARCH


EXPLORATORY RESEARCH
Aims to discover what is happening and to gain insights about a topic of interest. It is particularly useful if you wish to clarify your understanding of a problem, for example when you are unsure of its precise nature. Time may be well spent on exploratory research, as it might show that the research is not worth pursuing! It has the advantage of being flexible and adaptable to change.
Typically qualitative. Suited to topics on which little research exists.

DESCRIPTIVE RESEARCH
o To gain an accurate profile of events, persons or situations
o May be a precursor or extension of exploratory and/or explanatory research
o Means to an end vs. an end in itself
o Descripto-explanatory studies (= precursor/predecessor to explanatory studies)

EXPLANATORY/CAUSAL RESEARCH
To establish relationships between variables
o E.g.: Quantitative study investigating whether certain colours in the shop lay-out result in higher levels of customer satisfaction
o E.g.: Qualitative study investigating whether Corporate Social Responsibility activities in a company influence employee involvement

[Figure: mediator. Gender (independent variable), Discipline (mediator), Study results (dependent variable). Hypothesis/expectation: females do better; discipline is the mediator.]

[Figure: moderator. Gender (independent variable), Results (dependent variable), Belgium vs. other countries (moderator).]

Research projects may serve more than one purpose

Cinite, I., Duxbury, L.E., & Higgins, C. (2009). Measurement of perceived organisational readiness for change in the public sector. British Journal of Management, 20(2), 265-277.
o Exploratory phase: To identify behaviours, based on their participants' experiences, of organisational change (interviews)
o Descriptive phase: Used as a forerunner for the next phase (web-based survey)
o Explanatory phase: To explain the relationship between organisational actions and readiness or unreadiness to implement change, based on employees' perceptions (web-based survey)

Question:
Which of the following statements is false?
A) Profiling KUL-students in terms of gender and age is an example of descriptive research.
B) Exploratory research may follow descriptive or causal research.
C) When little is known about the problem situation, it is desirable to start with exploratory
research.
D) Investigating whether and why a decrease in price influences sales and market share results
in descriptive research.
Answer: D

RESEARCH OBJECTIVES
Might be useful for complex research questions. Objectives come after the research question.
What?
Operationalise how you intend to conduct your research by providing a set of coherent and connected steps to answer your research question.
Why?
o Likely to lead to greater specificity compared to research questions
o Require more rigorous thinking
What kind of work do I need to do in order to answer my research question? What successive steps do I need to take in order to answer my research question?
Objectives are statements, not questions, and they are numbered in a list.
Example:
As a sales manager, you notice that your sales staff are becoming less and less motivated to sell the company's products. Therefore, you decide to investigate how you could increase the level of motivation among your sales staff.
1. To define the concept of motivation
2. To review key literature on the existing measures to motivate sales people
3. To identify the strengths and weaknesses of the identified measures
4. To determine which measures are most relevant to use in the context of my company
5. To carry out primary research in my company to measure the effectiveness of the selected measure

HYPOTHESES
A hypothesis is something you would like to test. It is usually, but not always, based on a theory (e.g., not when following an inductive approach).
Directional hypotheses:
o The direction of the relationship between the variables is indicated. E.g.: The greater the stress experienced in the job, the lower the job satisfaction of employees
o The difference between two groups on a variable is postulated. E.g.: Women are more motivated than men to lose weight
Non-directional hypotheses:
o Do postulate a relationship or difference, but offer no indication of the direction of these relationships or differences. E.g.: There is a relationship between age and job satisfaction

HYPOTHESES: SOME COMMON MISTAKES
Ambiguous formulation
o E.g.: Belgians consume much candy, Americans don't. (Avoid vague words like "much")
No point of reference
o E.g.: Adolescents consume much more alcohol. ("Much more" than whom?)
Unfounded (not based on literature; theory as fuel)
o However, this would not be problematic when following the inductive approach (see afterwards)

RESEARCH QUESTIONS VS. HYPOTHESES
Research questions
o E.g.: Which measure is most effective in preventing consumers from shoplifting?
Hypothesis
o A tentative, yet testable, statement which predicts what you expect to find in your (empirical) data. E.g.: Financial punishments are more effective in preventing consumers from shoplifting compared to social punishments.

Question:
Is the following hypothesis well formulated? Explain your answer.
"Belgian adolescents have a better self-image compared to French adolescents."
Answer: This is a good directional hypothesis: it is well formulated, contains no ambiguous words, and has a clear point of reference.

Question:
Research topic: Salespeople of Samsung and their preference for payment by commission vs. salary
Formulate a research question in line with the research topic above that results in descriptive
research and a research question that results in causal research.
Answer:
Descriptive Research: What percentage prefer payment by salary versus payment by commission?
Causal Research: How many products are sold by salespeople paid by salary in comparison to
salespeople paid by commission?

Chapter 3: The Research Design


The research design describes how you will conduct your research. At this stage you need to think of all the elements needed to carry out the research, and be aware of the advantages and disadvantages of each choice.
Definition of Research Design
A general framework or plan for conducting a research project; It details the procedures
necessary for obtaining the information needed to answer the research question.
Motivate the choices you make!
Exam: Give arguments for choice of design. No right/wrong answer.

RESEARCH PARADIGM
The development of your research design will be influenced by your research paradigm:
o A cluster of beliefs and dictates which, for scientists in a particular discipline, influence what should be studied, how research should be done and how results should be interpreted
o A basic belief system or worldview that guides the investigator, not only in choices of method but in ontologically and epistemologically fundamental ways

The difference between research paradigms is based on assumptions within three domains:
Ontology: What is reality? What does reality look like? Is there a reality external to humans? If yes, what does it look like?
Epistemology: How can we build knowledge about that reality? How do we know what we know? What counts as knowledge, and what doesn't? What is the relationship between researcher and subject?
Methodology: How can the researcher acquire knowledge about what he or she believes can be known? (Limited by ontological and epistemological viewpoints.)

POSITIVISM
o Explaining relationships
o Accumulating data
o Objective process
o Knowledgeable researcher, known subjects
o Theory verification
o Deduction: testing hypotheses
o Focus on quantitative research

CONSTRUCTIVISM
o Understanding subjects' meaning
o Constructing information
o Intersubjective process
o Researcher becomes involved with the subjects
o Theory building
o Induction: developing hypotheses
o Focus on qualitative research

There are many other research paradigms in between the extremes of positivism and constructivism!
A qualitative study can still be positivist.

Example:
Relationship between CSR activities within a company and employees' involvement.
How would a positivist deal with this topic?
He would test a theory. E.g.: By giving the employees a quantitative questionnaire.
How would a constructivist deal with this topic?
He wouldn't start with theory. Depending on what he finds in the field, he would develop a theory.

Question:
Which of the following statements is false?
A) Quantitative research can follow an inductive approach.
B) Qualitative research might be inductive as well as deductive.
C) In positivistic research, the researcher intervenes in the research process.
D) Examining people's motivation for luxury consumption can be done by quantitative as well as qualitative research.
Answer: C
B Might be true sometimes. But not always. Not typical. C Qualitative is more appropriate but we
can have a quantitative approach as well. (E.g. checking consumption)

RESEARCH APPROACH
The development of your research design will also be influenced by your research approach.
o Survey: leans more towards deduction
o Case study: leans more towards induction
However, induction and deduction can be combined within the same research project!
Induction: from data to theory.
Deduction: from theory to data.
Once you've collected data and built your theory, you can decide to test it again.

Deductive or Inductive?
1. If it rains, everything outside becomes wet. It rains. The car is outside.
Conclusion: The car will become wet.
Answer: Deduction
2. The first duck in the park is brown. The second duck in the park is brown. The third duck in the park is brown.
Conclusion: Every duck in the park is brown.
Answer: Induction
Some practical criteria:
o Emphasis of the research and nature of the research topic
o Wealth of literature
o Time available
o Risk
o Audience
An inductive approach is riskier: you might not be able to develop a good theory from your data.
Question:
Which of the following statements is true?
A) With deduction, data are collected and a theory developed as a result of the data analysis.
B) Research projects should include either the deductive or inductive research approach.
C) A research topic about which little literature exists, is more likely to result in an inductive
research approach than a deductive research approach.
D) The deductive research approach is less strict compared to the inductive research approach.
Answer: C
Patrick is a member of the Human Relations Research Group of KUL. He read about the large number of adolescents slipping into shoplifting behaviour and wonders how this behaviour could be prevented. Therefore, he runs a study in which he tests whether the Protection Motivation Theory is applicable to this particular issue. Patrick's study leans towards:
a) An inductive research approach
b) A deductive research approach
Answer: b
He develops some expectations/theory. Then he gathers data to test the theory.
In sum, you should be aware of the fact that research paradigm and research approach influence
your research design
Core elements of your research design are:
o Research choice
o Research strategy
o Time horizon

RESEARCH CHOICE: QUANTITATIVE, QUALITATIVE OR MULTIPLE METHODS RESEARCH DESIGN


How will you combine quantitative and qualitative data collection techniques and data analysis
procedures?
Quantitative
Often used as a synonym for data collection techniques/data analysis procedures that
generate or use numerical data.
Qualitative
Often used as a synonym for data collection techniques/data analysis procedures that
generate or use non-numerical data (such as text)
This distinction might be both problematic and narrow.
Why problematic?
Many research designs are likely to combine quantitative and qualitative elements
o E.g., a research design using a questionnaire in which respondents also have to answer some open questions in their own words
o E.g., qualitative research data may be analysed quantitatively (i.e., qualitative data being quantitised)
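
As a rough illustration of what "quantitising" qualitative data can look like, the sketch below counts how often each code occurs across coded interview fragments. The respondents, the codes and the function name `quantitise` are invented for the illustration; they do not come from the course material.

```python
from collections import Counter

# Hypothetical coded fragments: (respondent, code assigned by the researcher).
coded_fragments = [
    ("R1", "price"),
    ("R1", "service"),
    ("R2", "price"),
    ("R3", "service"),
    ("R3", "price"),
    ("R3", "location"),
]


def quantitise(fragments):
    """Count how many fragments received each qualitative code."""
    return Counter(code for _, code in fragments)


counts = quantitise(coded_fragments)
for code, n in counts.most_common():
    print(f"{code}: {n}")
```

The resulting frequencies are numerical data derived from text, which is why the quantitative/qualitative distinction can be problematic.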
Why narrow?
Reinterpret quantitative and qualitative methodologies through their associations to research
paradigms, research approaches and research strategies.
Quantitative research design
o Research paradigm: Positivism
o Research approach: Deduction
o Characteristics: Causal relationships, numbers, statistical analysis techniques, standardised, probability sampling, generalisability, independent researcher
o Research strategies: Experiments, surveys

Qualitative research design
o Research paradigm: Constructivism
o Research approach: Induction
o Characteristics: Meanings, text, interpretation, non-standardised, non-probability sampling, develop conceptual framework, researcher part of the research process
o Research strategies: Case study

Still, it is possible that a quantitative research design is more in line with induction, and that a qualitative research design is more in line with deduction. Many research designs are thus likely to combine quantitative and qualitative elements. No need to learn the figure by heart.

Triangulation is one of the advantages of using more than one data collection technique and analysis procedure.
Multiple methods may be used in order to combine data and ascertain whether the findings from one method corroborate the findings from the other method (to see whether we get the same result).

Whatever methods you use to collect and analyse data:
o Be explicit about the grounds on which multiple methods research is conducted!
o And do not forget that they must serve your research question!

RESEARCH STRATEGY:
There are different ways of conducting research, and different strategies can be combined within the same project.
Various research strategies exist:
o Experiment
o Survey
o Archival research
o Case study
o Ethnography
o Action research
o Grounded theory

The choice of research strategy (or strategies) is guided by, among other things, the research questions, research objectives, research paradigm, research approach and research purpose, as well as by more practical concerns (e.g., time resources, access to potential participants).

EXPERIMENT
The only way to investigate a causal relationship: to infer whether a change in one or more independent variables produces a change in one or more dependent variables.
E.g.: Independent variable: Mood (A), negative vs. positive. Dependent variable: Creativity (B).

There might be a problem: is it mood that affects creativity, or is it the other way round?
In an experiment we manipulate A (mood) and then measure creativity. A difference in creativity can then be attributed to an effect of mood on creativity.

CLASSIC EXPERIMENT
o Participants are randomly assigned to either the experimental group or the control group
o Each group should be similar in all aspects relevant to the research other than whether or not they are exposed to the planned intervention or manipulation
Experimental group: Some form of planned intervention/manipulation will be tested
Control group: No such intervention/manipulation is made

[Figure: pre-test measurement of purchasing behaviour, then "Buy two, get one free" promotion (yes or no), then post-test measurement of purchasing behaviour]

Example:
Independent variable: Promotion
Dependent variable: Purchasing behaviour
Control group: No promotion
Experimental group: Promotion

The promotion is a success if it increased purchasing behaviour.
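
The logic of the classic experiment (random assignment, pre-test, intervention, post-test) can be sketched in a few lines. This is only a minimal illustration: the participant labels, the purchase counts and the function names are hypothetical, and a real study would add a statistical test.

```python
import random
from statistics import mean


def assign_groups(participants, seed=0):
    """Randomly split participants into a control and an experimental group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(pre, post, group):
    """Average post-test minus pre-test score for the members of a group."""
    return mean(post[p] - pre[p] for p in group)


def promotion_effect(pre, post, control, experimental):
    """Change in the experimental group beyond the change in the control group."""
    return mean_change(pre, post, experimental) - mean_change(pre, post, control)


# Hypothetical purchases per week before and after the promotion period.
participants = ["p1", "p2", "p3", "p4", "p5", "p6"]
control, experimental = assign_groups(participants)
pre = {p: 4 for p in participants}
post = {p: (6 if p in experimental else 4) for p in participants}

print(promotion_effect(pre, post, control, experimental))
```

Subtracting the control group's change is what separates the effect of the promotion from anything else that happened between pre-test and post-test.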

Internal Validity:
The extent to which the findings can be attributed to the
interventions rather than any flaws in your research design.
External Validity:
Whether the cause-and-effect relationship(s) found in the
experiment can be generalised.

A field experiment generally offers better external validity than a lab experiment.

Question:
As a marketer, you are wondering whether rock versus pop music in supermarkets influences the time consumers spend in these supermarkets.
Design an experiment which would enable this marketer to find an answer to this problem.

Answer:
Independent variable: Music
Dependent variable: Time spent in supermarket
Control group: No music
Experimental groups:
1. Pop
2. Rock
The choice of supermarket is important. The supermarket, as well as the days, have to be the same, because on some days people might be happier or shop more.
We have to consider all the elements.

SURVEY
o Involves the structured collection of data from a sizeable population. E.g.: Questionnaire, structured observation, structured interviews
o Usually associated with the deductive research approach
o Popular and common research strategy in business and management research
o Most frequently used to answer "what", "who", "where", "how much" and "how many" questions

Advantages:
o Collection of standardised data from a sizeable population in a highly economical way, allowing easy comparison
o Perceived as authoritative by people in general
o Easy to explain and to understand
o When sampling (see next chapter) is used, it is possible to generate findings that are representative of the whole population at a lower cost than collecting the data for the whole population

Disadvantages:
o Data collected by the survey strategy are unlikely to be as wide-ranging as those collected by other research strategies
o Only a limited number of questions can be included when a questionnaire is used
o Capacity to do it badly (see later)

ARCHIVAL RESEARCH
o Analysis of administrative records and documents as principal source of data because they are products of day-to-day activities
o Recent as well as historical documents
o Secondary data analysis: Data are part of the reality being studied rather than having been collected originally as data for other (research) purposes
o Allows research questions which focus upon the past and changes over time to be answered
o Disadvantages might be the nature of the records and documents, missing data, and access to data (confidentiality)

CASE STUDY
o Empirical investigation of a particular contemporary phenomenon within its real-life context, using multiple sources of (data) evidence
o The boundaries between the phenomenon being studied and the context within which it is being studied are not clearly evident
  Experiment: Research undertaken in a highly controlled context
  Survey: Ability to explore and understand the context is limited by the number of variables for which data can be collected
o Relevant strategy if you wish to gain a rich understanding of the context
o Has considerable ability to generate answers to "why", "what" and "how" questions
o Likely to use multiple sources of data (interviews, observation, documentary analysis, questionnaires, ...)

TRIANGULATION
Example:
o Building high quality interaction and cooperation during organisational change (Grieten & Lambrechts, 2007, 2009)
o Problem definition: 2/3 of change processes fail, although it is known that these failures are often caused by relational aspects
o Research question: What makes relational practices of such a quality that they improve common progress during organisational change?
o Case selection: Two organisations with contrasting change processes in terms of results ("best practice" and "worst practice"), but similar in terms of relational approach
o Data collection methods: (participant) observation, in-depth interviews, focus groups, document analysis (Triangulation!)

ETHNOGRAPHY
o Used for studying people in groups, who interact with one another and share the same space (e.g., street level, work group, organisation, ...)
o Origins in (colonial) anthropology
o Focuses upon describing and interpreting the social world through first-hand field study
o Researchers live amongst those whom they study, to observe and talk to them in order to produce detailed cultural accounts of their shared beliefs, behaviours, interactions, language, rituals and the events that shaped their lives
o Ideas about this strategy are not unified!

ACTION RESEARCH
o An emergent and iterative process of inquiry that is designed to develop solutions to real organisational problems through a participative and collaborative approach, which uses different forms of knowledge, and which will have implications for participants and the organisation beyond the research project
o Research in action rather than research about action
o Demanding strategy in terms of the intensity involved and the resources and time required

GROUNDED THEORY
Uses an inductive research approach: we start with the data and develop a theory/a relevant model from it.
o Developed as a response to the extreme positivism of past social research
o Theory is developed through the systematic and simultaneous process of data collection and analysis, involving a mainly inductive approach, to generate theory grounded in your data
o A process of constant comparison, moving between inductive and deductive thinking
o Theoretical sampling until theoretical saturation is reached (= conceptual density, = conceptual saturation)
o Time-consuming, intensive and reflective
o Will something significant emerge?
o Will something emerge that is more than simply descriptive?

Example:
Nyilasy, G., & Reid, L.N. (2009). Agency practitioners' meta-theories of advertising. International Journal of Advertising, 28(4), 639-668.
o What do advertising agency practitioners think about how advertising works? This study's basic aim was to understand practitioners' thinking about the work of advertising in their own terms. As there was little substantive research on this perspective, a grounded theory approach to qualitative research was used.
o Semi-structured, in-depth interviews were used until theoretical saturation was achieved

TIME HORIZON:
Cross-sectional studies
o The study of a particular phenomenon or phenomena at a particular time, i.e. a "snapshot"
o Choice of moment may be important
Longitudinal studies
o The study of a particular phenomenon or phenomena over an extended period of time (different moments in time)
o Possible to study changes and developments
o Watch out for relevant changes in variables you do not take into account!
o E.g., Consumer Sentiment Index (University of Michigan)
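
The difference between the two time horizons can be sketched with a small data set. The satisfaction scores, the years and the function names below are made up purely for illustration.

```python
from statistics import mean

# Hypothetical satisfaction scores (0-10) per measurement wave.
waves = {
    "2021": [6, 7, 5, 6],
    "2022": [7, 7, 6, 8],
    "2023": [8, 7, 7, 8],
}


def cross_section(waves, moment):
    """Snapshot: the average score at one particular moment."""
    return mean(waves[moment])


def longitudinal(waves):
    """Development: the average score at every moment, in time order."""
    return [(moment, mean(scores)) for moment, scores in sorted(waves.items())]


print(cross_section(waves, "2022"))
print(longitudinal(waves))
```

A cross-sectional study only ever sees one of these waves, so the choice of moment matters; the longitudinal view makes changes and developments visible.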

When developing your research design, you should also consider the ethics and the quality of your research design.

ESTABLISHING THE QUALITY OF THE RESEARCH DESIGN

Reliability: Consistency in research. Do I get the same result when I replicate exactly the same study?
Validity: Measuring what you intend to measure.

An example by means of a scale as measurement instrument
1. 73 kg
2. 73 kg
3. 73 kg
4. 73 kg
Real weight = 78 kg
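
The scale example can be expressed as a small check. This is only a sketch under simplifying assumptions: "reliable" is taken to mean that repeated readings barely vary, "valid" that their average is close to the real value, and the `tolerance` of 0.5 kg is an arbitrary choice, not part of the course material.

```python
from statistics import mean, pstdev


def assess_scale(readings, true_value, tolerance=0.5):
    """Return (reliable, valid) for repeated readings of one known quantity."""
    spread = pstdev(readings)                # consistency across repetitions
    bias = abs(mean(readings) - true_value)  # distance from the real value
    return spread <= tolerance, bias <= tolerance


readings = [73, 73, 73, 73]  # the four weighings from the example
reliable, valid = assess_scale(readings, true_value=78)
print(reliable, valid)
```

With the readings above the scale comes out as reliable but not valid. Readings such as 70, 86, 74 and 82 kg would give the opposite picture: correct on average, but inconsistent.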

VALIDITY
The scale doesn't measure what it intends to measure (it shows 73 kg instead of 78 kg).

RELIABILITY
Reliability does not imply validity, and validity does not imply reliability!

Four combinations are possible:
o Not reliable, not valid
o Reliable, not valid
o Not reliable, valid
o Reliable, valid

Question:
1. The student administration department of HUB examines the extent to which HUB-students are satisfied with the teaching skills of the HUB-staff. By means of a questionnaire at Time 1, researcher X finds that the overall satisfaction is equal to 8.7 out of 10. Two weeks later (Time 2), researcher X conducts the same research (among the same respondents) and again finds that the overall satisfaction is equal to 8.7 out of 10. Consequently, researcher X's results are:
Valid: ? => There is no information to tell us whether they are valid or not.
Reliable: Yes
2. You developed a measurement instrument to examine employees' level of job autonomy perception (i.e., the extent to which they experience autonomy in their job). This measurement instrument seems to be sensitive to social desirability (i.e., respondents' tendency to give answers that may be desirable from a social standpoint: people answer according to what they think is expected of them and not according to their own opinion).
Question: What is the implication of social desirability for the quality of your measurement instrument?
Not valid: instead of measuring job autonomy, the instrument measures social desirability.
3. Which of the following statements is correct?
A) Experiments are more valid compared to surveys
B) If a study is reliable, it means that it measures what we think it should measure
C) External validity is about the extent to which the reliability of a study can be
generalised
D) An interviewer who writes down a wrong answer out of absent-mindedness threatens the reliability of his study
Answer: D => Just one instance of absent-mindedness will not influence the validity of
the research.

Chapter 4: Sampling

The full set of cases (the population) does not necessarily consist of people!
E.g.: What's the average price of chicken soup in Chinese restaurants located in Brussels?
Population: Chinese restaurants located in Brussels.
Sample: A sub-group within a population.
Research question: How many beers do Belgian adults drink on average each week?
You could collect and analyse data from every possible case in the population = CENSUS
However, there might be restrictions in terms of time, money, access, currency, speed, practice, accuracy, detail ...
Therefore, consider data from a subgroup rather than all possible cases or elements of the population = SAMPLE

Sampling is about selecting a number of elements from a population you would like to study, with the intention to derive characteristics of the population from characteristics of the sample.
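
A minimal sketch of the census/sample idea, using the beer question as inspiration. The population below is randomly generated and entirely made up, and the function name `estimate_mean` and the sample size of 200 are assumptions for the illustration only.

```python
import random
from statistics import mean


def estimate_mean(population, sample_size, seed=0):
    """Estimate the population mean from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return mean(sample)


# Hypothetical population: beers per week for 10,000 adults (made-up numbers).
rng = random.Random(42)
population = [rng.randint(0, 14) for _ in range(10_000)]

census_mean = mean(population)                # every case: a census
sample_mean = estimate_mean(population, 200)  # a subgroup: a sample
print(census_mean, sample_mean)
```

The sample estimate will normally differ somewhat from the census value; that difference is the price paid for collecting data from 200 cases instead of 10,000.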

THE SAMPLING PROCESS

DEFINE THE POPULATION


Depends on your research question!
E.g.: How satisfied are HUB-students with the teaching skills of the HUB-professors?
Defining the population is not always that straightforward
E.g.: Research project assessing consumer response to a new brand of men's moisturiser
Beware of population specification error = Consequence of not studying a specific part of the
target group
Question:
Define the population for the following research questions:
How do employees of Carrefour think the proposed introduction of compulsory Sunday
working will affect their working lives?
Population: Employees of Carrefour

What is the normal range in miles that can be travelled by electric cars in everyday use?
Population: Electric cars in everyday use

DETERMINE THE SAMPLE FRAME


A list of all elements in the population from which your sample will be drawn
Examples:
o Telephone book
o Company's customer database
o Membership lists
In some cases you will have to develop the sample frame yourself!
Sampling frame error
Sampling frame is not a perfect reproduction of the research population
= The variation between the population defined by the researcher and the population as
implied by the sampling frame used
Examples of causes of sampling frame errors:
o Not up to date
o Elements of sampling frame that are not part of the population
o Elements of population are not in sampling frame
o Elements that are included multiple times
E.g. of sampling frame errors
Telephone book:
Not up to date
Not everyone has telephone
Companies are in phone book as well

Checklist:
Are elements listed in the sampling frame relevant to your research question?
How recently was the sampling frame compiled, in particular is it up to date?
Does the sampling frame include all elements, in other words is it complete?
Does the sampling frame contain the correct information, in other words is it accurate?
Does the sampling frame exclude irrelevant cases, in other words is it precise?
For purchased lists and online panels, can you establish and control precisely how the
sample will be selected?
For an online panel, can you establish whether incentives will be used to enhance the likely
response and provide an assessment of the impact of this on respondent characteristics and
consequently responses?
You should not generalise beyond your sampling frame
E.g.: Sampling frame consists of all employees of an organisation => You can only generalise to
employees of that particular organisation
Sometimes not possible (or very hard) to develop a sampling frame!
Question:
Which sampling frame is suited for the following research questions?
How do employees of Carrefour think the proposed introduction of compulsory Sunday working will
affect their working lives?
Answer:
Which factors influence Belgian lawyers' decision to work in other European countries?
Answer:
SELECT SAMPLING TECHNIQUES
First of all you need to decide whether you will examine all elements of the population (= census) or
draw a sample
For populations fewer than 50, it is usually more sensible to collect data from the entire population.
Draw a sample => Conditions:
o Practical constraints
o Budget constraints
o Time constraints
o Access constraints
o Results need to be quickly available
o Testing involves destroying elements of the population (e.g.: establishing the actual duration of
long-life batteries)
Two types of sampling:
1. Probability Sampling
2. Non-probability Sampling

Probability Sampling Techniques


o Sampling techniques in which each element of the population has a fixed probabilistic
chance (usually an equal chance) of being selected for the sample.
o It becomes possible to answer research questions that require you to estimate statistically
the characteristics of the population from the sample (i.e., with a certain level of confidence,
you are able to generalise the findings to the population)
o Probability sampling is often associated with survey and experiment research strategies.
Non-Probability Sampling Techniques
o The probability of each case being selected from the total population is not known.
o It is impossible to answer research questions that require you to make statistical inferences
about the characteristics of the population.
Note: You may still be able to generalise from non-probability samples about the
population, but not on statistical grounds.

Question:
Which of the following statements is true?
A) With probability samples the chance, or probability, of each case being selected from the
population is unknown.
B) Generalizations about populations from data collected using any probability sample are
based on intuition.
C) Sampling provides a valid alternative to a census when it would be impracticable for you to
survey the entire population.
D) The sampling frame gives an overview of all the elements which will be included in your final
sample.
Answer: C
Probability Sampling Techniques
o Simple random sampling
o Systematic random sampling
o Stratified random sampling
o Cluster sampling
o Multi-stage sampling

Non-Probability Sampling Techniques


o Quota sampling
o Judgemental sampling
o Snowball sampling
o Self-selection sampling
o Convenience sampling

Probability Sampling Techniques


Simple Random Sampling
A probability sampling technique in which each element has a known and equal probability of
selection. Every element is selected independently of every other element, and the sample is drawn
by a random procedure from a sampling frame.
E.g.,
o Each element of the sampling frame is assigned a unique identification
number (0, 1, 2, ...)
o Random numbers are generated to determine which elements to include in
the sample (e.g., by means of a random number table) until the sample size
is reached

Example of random number table:

If the same number is read off a second time, it must be disregarded as you need different cases.
This means that you are not putting each case's number back into the sampling frame after it has
been selected. This is termed sampling without replacement.
If a number is selected that is outside the range of those in your sampling frame, you simply ignore it
and continue reading off numbers until your sample size is reached.
Disadvantages of this procedure:
o Time-consuming
o Requires adapted table with sufficient random numbers
Other random procedures:
o Computer-generated random numbers / online random number generator (instead of random
number tables)
o Random telephone numbers
- Often used when doing computer-aided telephone interviewing (CATI)
- Dialling telephone numbers at random from an existing database
- Or random digit dialling
Advantage: does not depend on the telephone book
Disadvantage: some households have more than one telephone number!
Simple random sampling:
o Sample without (systematic) bias
o Best used when you have an accurate and easily accessible sampling frame that lists the
entire population
Disadvantage: These lists are not always available!
o If your population covers a large geographical area, random selection means that selected
cases are likely to be dispersed throughout the area
Disadvantage: This sample is not suited if collecting data over a large geographical
area using a method that requires face-to-face contact (high travel costs)
Example:
Jemma was undertaking her work placement at a large supermarket, where 5011 of the
supermarket's customers used the supermarket's Internet purchase and delivery scheme. She was
asked to interview customers and find out why they used this scheme. As there was insufficient time
to interview all of them, she decided to interview a sample using the telephone. Her calculations
revealed that to obtain acceptable levels of confidence and accuracy she needed an actual sample
size of approximately 360 customers. She decided to select them using simple random sampling.
Having obtained a list of Internet customers and their telephone numbers, Jemma gave each of the
cases (customers) in this sampling frame a unique number. In order that each number was made up
in exactly the same way she used 5011 four-digit numbers starting with 0000 through 5010. So
customer 677 was given the number 0676.
She selected at random a first random number in the random number table. After that, she read off
the other random numbers in a regular and systematic manner. She continued in this manner until
360 different cases had been selected. These formed her random sample. Numbers selected that
were outside the range of those in her sampling frame (such as 8321, 5953 and 7932) were simply
ignored.
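
Jemma's procedure can also be carried out with computer-generated random numbers instead of a printed table. The following is a minimal Python sketch, assuming the sampling frame is simply the customer numbers 0 to 5010; the function name and the seed are illustrative and not part of the course material.

```python
import random

def simple_random_sample(frame_size, sample_size, seed=None):
    """Draw sample_size distinct case numbers from 0..frame_size-1.

    random.sample draws without replacement, so no case is selected
    twice and no number falls outside the sampling frame.
    """
    rng = random.Random(seed)
    return rng.sample(range(frame_size), sample_size)

# Jemma's example: 5011 Internet customers, sample of 360.
selected = simple_random_sample(5011, 360, seed=1)
print(len(selected), len(set(selected)))  # 360 360
```

Because the generator only produces numbers inside the frame and never repeats one, the two manual rules (ignore out-of-range numbers, disregard repeats) are handled automatically.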

Systematic Random Sampling


A probability sampling technique in which the sample is chosen by selecting a random starting point
and then picking every i-th element in succession from the sampling frame

Selecting the sample at regular intervals from the sampling frame

Similar to simple random sampling, but in a systematic order: we apply an interval for sample
selection.
Example:
Number each of the cases in your sampling frame with a unique number (0, 1, 2, ...)
=> 1500 patients: number each of these patients (0, 1, 2, ..., 1499). Sample of 300
participants
Calculate the sampling fraction (actual sample size / total population)
=> Sampling fraction: 300/1500 = 1/5
Select the first case using a random number (depends on sampling fraction)
=> Random starting point (i.e., random number between 0 and 4)
Select subsequent cases systematically (until sample size is reached) using the sampling fraction to
determine the frequency of selection.
=> Continue to select every fifth patient until the sample size of 300 patients is reached.
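
The patient example can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the frame size is a whole multiple of the sample size (as with 1500 and 300).

```python
import random

def systematic_sample(frame_size, sample_size, seed=None):
    """Select every k-th case after a random start, with k = frame_size // sample_size."""
    interval = frame_size // sample_size   # 1500 // 300 = 5, i.e. sampling fraction 1/5
    rng = random.Random(seed)
    start = rng.randrange(interval)        # random starting point between 0 and 4
    return list(range(start, frame_size, interval))[:sample_size]

# 1500 numbered patients (0..1499), sample of 300.
patients = systematic_sample(1500, 300, seed=7)
print(len(patients))  # 300
```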
Systematic random sampling:
o Sometimes not necessary to develop a sampling frame (e.g., every tenth visitor of a
website)
o Easy to understand and to explain
o Despite these advantages, be careful when using existing lists as sampling frames
- You need to ensure that the lists do not contain periodic patterns! (See the two examples below)
- Systematic random sampling is suitable for geographically dispersed cases only if
you do not require face-to-face contact when collecting data (as with simple random
sampling)
The impact of periodic patterns on systematic random sampling:
Consider the use of systematic random sampling to generate a sample of monthly sales from the
Harrods store in London. The sampling frame contains monthly sales for the last 60 years. A
sampling interval of 12 is chosen. Every selected case then falls in the same calendar month, so
seasonal variation in sales is missed entirely.
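
The effect can be checked with a small Python sketch. It assumes, purely for illustration, that the frame is ordered chronologically and starts in January; only the month label of each case is stored.

```python
import random

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Sampling frame: 60 years of monthly sales, listed chronologically.
frame = [MONTHS[i % 12] for i in range(60 * 12)]

start = random.randrange(12)   # random starting point within the first interval
sample = frame[start::12]      # every 12th case from the starting point

print(len(sample), set(sample))  # 60 cases, all from one calendar month
```

Whatever the starting point, the interval of 12 coincides with the yearly cycle in the list, which is exactly the kind of periodic pattern to avoid.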

A high street bank needs you to administer a questionnaire to a sample of individual customers with
joint bank accounts
Sampling fraction = 1/2 => you will need to select every second customer on the list
The names of the customer list, which you intend to use as the sampling frame, are arranged as
depicted below:

Stratified Random Sampling


You divide the population into two or more relevant strata based on one or a number of attributes
(e.g., gender, income, region: these attributes are relevant for your research).
In other words, your sampling frame is divided into a number of subsets.
A random (simple or systematic) sample is then drawn from each of the strata.
More concrete
o Choose the stratification variable(s)
- These variables need to be relevant for the research problem
- Stratification needs to result in homogeneity within each stratum with regard to the
stratification variable(s)
o Divide the sampling frame into the discrete strata
o Number each of the cases within each stratum with a unique number
o Select your sample using either simple random or systematic random sampling
Example:
Sarah worked for a major supplier of office supplies to public and private organisations. As part of
her research into her organisation's customers, she needed to ensure that both public and private
sector organisations were represented correctly. An important stratum was, therefore, the sector of
the organisation. Her sampling frame was thus divided into two discrete strata: public sector and
private sector. Within each stratum, the individual cases were then numbered.

She decided to select a systematic random sample. A sampling fraction of 1/4 meant that she
needed to select every fourth customer on the list. As indicated by the ticks, random numbers were
used to select the first case in the public sector (001) and private sector (003) strata. Subsequently,
every fourth customer in each stratum was selected.

Stratified random sampling:
o Dividing the population into a series of relevant strata means that the sample is more likely
to be representative, as you can ensure that each of the strata is represented proportionally
within your sample.
- Proportionate stratified random sampling = the sample sizes drawn from the strata
are proportionate to the strata's share of the total population
- Disproportionate stratified random sampling (oversampling enables separate analyses)
o Despite the advantages of proportionate and disproportionate sampling, there are some
disadvantages as well:
- Only possible if you can easily distinguish significant strata (in your sampling frame)
- Extra stage of sampling procedure
=> More time
=> More expensive
=> More difficult to explain compared to simple and systematic random
sampling
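
A minimal Python sketch of proportionate stratified random sampling follows. The strata sizes (200 public-sector and 600 private-sector customers) are hypothetical, and simple random sampling is used within each stratum, whereas Sarah's example uses systematic random sampling.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, cases in strata.items():
        n = round(len(cases) * fraction)   # stratum sample size, proportionate to its share
        sample[name] = rng.sample(cases, n)
    return sample

# Hypothetical sampling frame, already divided into two discrete strata.
strata = {
    "public": [f"pub{i:03d}" for i in range(200)],
    "private": [f"pri{i:03d}" for i in range(600)],
}
sample = proportionate_stratified_sample(strata, fraction=1 / 4, seed=3)
print({name: len(cases) for name, cases in sample.items()})
# {'public': 50, 'private': 150}
```

For a disproportionate design, the same idea applies with a separate fraction per stratum.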

Cluster Sampling
All elements of a number of randomly selected clusters are selected
More concrete:
o Choose the cluster grouping for your sampling frame
- Heterogeneity in clusters is important! Cluster = small universe
(e.g., Population=football lovers; Cluster=football stadium)
o Number each of the clusters with a unique number (0, 1, ...)
o Select your sample of clusters using some form of random sampling
o Select all elements of the selected clusters

Every cluster has an equal chance to be selected => Random sampling technique


Still, the technique normally results in samples that represent the total population less
accurately compared to stratified random sampling (Make sure that clusters are thus
heterogeneous!)
Advantage: Restricting the sample to a few relatively compact geographical sub-areas (clusters)
maximises the amount of data you can collect using face-to-face methods within the resources
available.
Example:
Abdel needed to select a sample of firms to undertake an interview-based survey about the use of
large multiple-purpose digital printer copiers. As he had limited resources with which to pay for
travel and other associated data collection costs, he decided to interview firms in four geographical
areas selected from a cluster grouping of local administrative areas. A list of all local administrative
areas formed his sampling frame. Each of the local administrative areas (clusters) was given a unique
number, the first being 0, the second 1 and so on. The four sample clusters were selected from this
sampling frame of local administrative areas using simple random sampling.
Abdel's sample was all firms within the selected clusters. He decided that the appropriate telephone
directories would probably provide a suitable list of all firms in each cluster.
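
Abdel's approach can be sketched in Python as follows. The frame of 20 areas with 30 firms each is hypothetical; the point is that whole clusters are drawn at random and every element in them is kept.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # simple random sample of cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical frame: 20 local administrative areas with 30 firms each.
areas = {f"area{a}": [f"firm{a}-{f}" for f in range(30)] for a in range(20)}
sample = cluster_sample(areas, n_clusters=4, seed=5)
print(len(sample), sum(len(firms) for firms in sample.values()))  # 4 120
```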

Stratified random sampling vs. Cluster sampling

Multi-stage sampling
Select clusters in stages and sample within the selected clusters.
o Modifying a cluster sample by adding at least one more stage of sampling that also involves
some form of random sampling
o Procedure:
- Choose the cluster grouping for your sampling frame
(Heterogeneity in clusters is important!)
- Number each of the clusters with a unique number (0, 1, ...)
- Randomly select a number of clusters
- Repeat the above steps (e.g., districts => cities => neighbourhoods => streets)
- Randomly select elements of the most recently selected clusters
Example:
Laura worked for a market research organisation that needed her to interview a sample of 400
households in England and Wales. She decided to use the electoral register as a sampling frame.
Laura knew that selecting 400 households using either systematic or simple random sampling was
likely to result in these 400 households being dispersed throughout England and Wales, resulting in
considerable amounts of time spent travelling between interviewees as well as high travel costs. By
using multi-stage sampling, Laura felt these problems could be overcome.
In her first stage, the geographical area (England and Wales) was split into discrete sub-areas
(counties). These formed her sampling frame. After numbering all the counties, Laura selected a
small number of counties using simple random sampling. Since each case (household) was located in
a county, each had an equal chance of being selected for the final sample.
As the counties selected were still too large, each was subdivided into smaller geographically
discrete areas (electoral wards). These formed the next sampling frame (stage 2). Laura selected
another simple random sample. This time she selected a larger number of wards to allow for likely
important variations in the nature of households between wards.
A sampling frame of the households in each of these wards was then generated using a combination
of the electoral register and the UK Royal Mail's postcode address file. Laura finally selected the
actual cases (households) that she would interview using systematic random sampling.
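
A rough Python sketch of Laura's three stages is given below. The frame (10 counties, 8 wards per county, 50 households per ward) and the numbers drawn at each stage are hypothetical, and simple random sampling is used at every stage, whereas Laura used systematic random sampling for the final selection of households.

```python
import random

def multi_stage_sample(frame, n_counties, n_wards, n_households, seed=None):
    """Sample counties, then wards within them, then households within those wards."""
    rng = random.Random(seed)
    households = []
    for county in rng.sample(sorted(frame), n_counties):          # stage 1: counties
        wards = frame[county]
        for ward in rng.sample(sorted(wards), n_wards):           # stage 2: wards
            households += rng.sample(wards[ward], n_households)   # stage 3: households
    return households

# Hypothetical frame: 10 counties x 8 wards x 50 households.
frame = {
    f"county{c}": {
        f"ward{c}-{w}": [f"hh{c}-{w}-{h}" for h in range(50)] for w in range(8)
    }
    for c in range(10)
}
sample = multi_stage_sample(frame, n_counties=4, n_wards=5, n_households=20, seed=2)
print(len(sample))  # 400
```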
Multi-stage sampling:
Advantages:
o Sampling a geographically dispersed population becomes possible at lower cost.
o Compared to normal cluster sampling, larger clusters with many cases are possible
Disadvantages:
o Selecting smaller and smaller subgroups might impact the representativeness of your sample
=> Can be solved through applying stratified random sampling techniques as well

Impact of various factors on choice of probability sampling techniques:


o Sampling frame required
o Size of sample needed
o Geographical area to which suited
o Necessity of personal contact with respondents
o Relative cost
o Easy to explain to support workers?
o Advantage compared with simple random sampling

Question:
o BNP Paribas Fortis has about 400 000 Benelux clients using their credit card. The credit card
application form contains common information such as name, address, age, telephone
number, educational level, etc.

BNP Paribas Fortis wants to examine whether there is a relationship between the way in
which credit cards are used (e.g., frequency of use) and the socio-economic profile of its
users.
Questions: Identify the population and the sampling frame. Consider the suitability
of the various probability sampling techniques in this situation.
Answer:

Non-Probability Sampling Techniques


Quota Sampling
Similar to stratified sampling, but the selection of cases within each group is not random (often
used for structured interviews as part of a survey strategy)
Procedure:
o Divide the population into specific subgroups (quota) based on relevant variables
o Calculate, based on relevant and available data, for each subgroup the number of elements
to be selected
o Give each researcher an assignment which states the number of cases in each quota from
which they must collect data
o Combine the data collected by interviewers to provide the full sample
E.g.: 2 quotas, Female and Male. Then within the group we select the elements.
Quotas:
o Are based on relevant and available data
o *Are usually relative to the proportions in which they occur in the population* (e.g., 48%
females in population => 480 females and 520 males being selected in a sample of 1000
participants)
o Without sensible and relevant quotas, data collected may be biased
*Precision control = Proportions in sample perfectly mirror the proportions in the population
Precision control example:
o Interest in consumption habits among residents aged 16+ in a medium-sized village
o Sample must be representative in terms of residence and age
o Population statistics: 24 420 residents aged 16+
o Sample: 1/12 of population => 2035 sample cases
o 3 districts and 4 age groups => 12 quotas

Frequency Control:
Representative in terms of criterion

Question:
An association has 750 members. In the table below, the distribution of these members is given in
terms of gender and age.

Draw a quota sample of 125 subjects, taking into account:


- Gender
- Age
- Gender & Age
Answer:
Gender:
Males : 125 * (367/750)= 61
Females: 125 * (383/750)= 64
Age:
18-25: 125 * (173/750) = 29
26-49: 125 * (379/750) = 63
50+: 125 * (198/750) = 33
Gender & Age:
(We now consider every single cell)
Males - 18-25: 125 * (98/750) = 16
Males - 26-49: 125 * (191/750) = 32
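
The quota calculations above all follow the same rule: sample size * (group size / population size), rounded to a whole number. A small Python sketch, using only the member counts that appear in the answer above (the function name is illustrative):

```python
def quota_allocation(counts, sample_size):
    """Allocate the sample to each group in proportion to its population share."""
    total = sum(counts.values())
    return {group: round(sample_size * n / total) for group, n in counts.items()}

gender = {"male": 367, "female": 383}
age = {"18-25": 173, "26-49": 379, "50+": 198}

print(quota_allocation(gender, 125))  # {'male': 61, 'female': 64}
print(quota_allocation(age, 125))     # {'18-25': 29, '26-49': 63, '50+': 33}
```

Note that rounding each quota separately does not guarantee in general that the quotas add up exactly to the sample size; here they happen to sum to 125 in both cases.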

Quota Sampling:
Advantages compared to probability sampling techniques
o Less costly
o Can be set up very quickly
o Does not require a sampling frame
Disadvantages
o Because the interviewer can choose within quota boundaries whom they interview, your
quota sample may be subject to bias (e.g., easily accessible respondents who appear to be
willing to answer the questions)
o As the sample is not probability based, you cannot measure the level of certainty or margins
of error

Judgemental Sampling
o = Purposive sampling
o You need to use your judgement to select cases that will best enable you to answer your
research question
o Often used when:
- Working with very small samples (such as in case study research or when you wish
to select cases that are particularly informative)
E.g., Industrial research among experts
- Doing qualitative research
- Doing exploratory research

Those samples cannot be considered to be statistically representative of the total population!


The more common judgemental sampling strategies:
o Extreme case or deviant sampling
o Heterogeneous or maximum variation sampling
o Homogeneous sampling (e.g., focus group discussions: putting similar people together
stimulates conversation, so they are more likely to debate)
o Critical case sampling
o Typical case sampling
o Theoretical sampling (cfr. Grounded Theory)

Snowball Sampling
Commonly used when it is difficult to identify members of the desired population
Procedure:
o Make contact with one or two cases in the population
o Ask these cases to identify further cases
o Ask these new cases to identify further new cases (and so on)
o Stop when either no new cases are given or the sample is large enough (or when theoretical
saturation is reached)
Main problem = Making initial contact
The risk of bias is high
o Respondents are most likely to identify other potential respondents who are similar to
themselves, resulting in a homogeneous sample

Self-selection Sampling
o It occurs when you allow each case, usually individuals, to identify their desire to take part in
the research
o You therefore:
- Publicise your need for cases either by advertising or by asking them to take part
- Collect data from those who respond
o Problem = representativeness
- Cases that self-select often do so because of their feelings or opinions about the
research question

E.g.: People posting on Facebook to ask people to participate in a survey.


Problem: Bias
Example:
Patrick's research was concerned with the impact of student loans on studying habits. He had
decided to administer his questionnaire using the Internet. He publicised his research on Facebook
in a number of groups' pages, using the associated description to invite people to self-select by
clicking on the link to the questionnaire. Those who self-selected by clicking on the hyperlink were
automatically taken to the online questionnaire he had developed using the Qualtrics online survey
software.

Convenience Sampling
o Involves selecting cases haphazardly only because they are easily available (or most
convenient) to obtain for your sample
- E.g., the person interviewed at random in a shopping centre for a television
programme
o Widely used
o Advantages:
- Cheap
- Quick (suited for exploratory research)
o However, prone to bias and influences that are beyond your control => Cases appear in the
sample only because of the ease of obtaining them. Bias decreases as the population
becomes more homogeneous

Impact of various factors on choice of non-probability sampling techniques


o Likelihood of sampling being representative
o Types of research in which useful (e.g., non-probability techniques often used in exploratory
research)
o Relative costs (Note: Non-probability techniques are often used as they often imply lower
costs compared to probability sampling techniques)
o Note: Where it is not possible to construct a sampling frame you will need to use non-probability sampling techniques

Question:
For the following research question, it has not been possible for you to obtain a sampling frame.
Suggest the most suitable sampling technique to obtain the necessary data, and motivate your
choice.
Research question: Would users of the tennis club be prepared to pay a 10 per cent increase in
subscriptions to help fund two extra tennis courts? You need the answer by tomorrow morning.
Answer:
Convenience sample (not much time)
But if we have time, probability sample technique
EXAM: If we have sampling frame => Probability technique
For many research projects, you will have to combine different sampling techniques

Question:
Is the following statement true or false? Motivate your answer
Stratified sampling can be seen as random quota sampling
Answer:

DETERMINE THE SAMPLE SIZE


Probability sampling techniques
The confidence interval approach
Normal Distribution

95% of the values lie between the mean - 1.96 * standard deviation and the mean + 1.96 * standard deviation

A z-score of 1.96 corresponds with a confidence level of 95%


Statistical Inference
Important in research is to calculate statistics, such as the sample mean and sample proportion, and
use them to estimate the corresponding true population values
Statistical inference: The process of generalising the sample results to a target population
Confidence Intervals
o We are thus interested in using the sample statistic (e.g., the sample mean) as an estimate
of the value in the population
o An approach to assessing the accuracy of the sample mean as an estimate of the mean in
the population is to calculate boundaries (confidence intervals) within which we believe the
true value of the mean will fall
o Typically, we look at 95% confidence intervals
o This means that 95% of the time, the true value of the population will fall within the
boundaries of the confidence interval
o In other words, if you collected 100 samples, calculated the mean and then calculated
a confidence interval for that mean, then for about 95 of these samples, the confidence intervals
we constructed would contain the true value of the mean in the population
X̄ = sample mean
μ = population mean
σ = standard deviation of population
n = sample size
z = z-value associated with the confidence level
Confidence interval for the mean: X̄ ± z * σ / √n
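
As an illustration, the standard confidence interval for a mean with known population standard deviation (sample mean ± z * σ / √n) can be computed as below. The input values are hypothetical and the function name is only illustrative.

```python
import math

def confidence_interval(sample_mean, sigma, n, z=1.96):
    """Return (lower, upper) bounds of sample_mean +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: sample mean 100, population standard deviation 15, n = 36.
lower, upper = confidence_interval(100, 15, 36)
print(round(lower, 1), round(upper, 1))  # 95.1 104.9
```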

Define the level of precision
D = the maximum permissible difference between the sample mean and the
population mean
D = z * σ / √n, so n = (z² * σ²) / D²
We already determined the level of precision (D)
But what about z and σ?
o Specifying z is about specifying the level of confidence
A 95% confidence level is desired => z = 1.96
o Determine σ (= the standard deviation of the population)
Secondary sources, pilot study or (max value - min value) / 6

The range of a normally distributed variable is approximately equal to +/- 3 standard deviations, and
one can thus estimate the standard deviation by dividing the range by 6.
Example:
Suppose a researcher wants to estimate the monthly household savings more precisely so that the
estimate will be within +/- 5 of the true population value. What should be the size of the sample?
1) The researcher should specify the level of precision. This is the
maximum permissible difference between the sample mean
and the population mean.
D=5
2) The researcher should specify the level of confidence and determine the z-value associated
with this confidence level
Confidence level=95% => z-value=1.96
3) The researcher should determine the standard deviation of the population.
Secondary sources indicate a standard deviation of 55 (σ = 55)
D = 5, z = 1.96, σ = 55
n = (1.96² * 55²) / 5² = 464.83 => 465 (rounded up to the next integer)

Sample size:
The larger the variability in the population, the larger the sample size
The higher the degree of confidence, the larger the sample size
The higher the degree of precision, the larger the sample size
The choice of sample size is thus governed by:
o The confidence you need to have in your data: The level of certainty that the characteristics
of the data collected will represent the characteristics of the total population
o The margin of error that you can tolerate: The accuracy you require for any estimates made
from your sample
o The variability in the population in terms of the variable(s) of interest

EXAM questions:
A big company wants to know how much money (in euros) each of its managers spends on lunches
per month. It is known that the maximum amount of money spent is 700 euros while the minimum
is 400 euros. The company wants the result to be accurate to within 5 euros and wants to make a
prediction with a confidence level of 95%.
How large should the sample size be?
Answer:
Level of precision: D = 5
Confidence 95%: z = 1.96
σ = (700 - 400) / 6 = 50
n = (1.96² * 50²) / 5² = 384.16 => 385
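
Both worked examples for means (household savings and lunch spending) use the squared form n = (z * σ / D)², rounded up to the next integer. A short Python sketch that reproduces them, with an illustrative function name:

```python
import math

def sample_size_for_mean(sigma, precision, z=1.96):
    """Minimum sample size so the sample mean is within +/- precision of the population mean."""
    return math.ceil((z * sigma / precision) ** 2)

# Household savings example: sigma = 55, D = 5.
print(sample_size_for_mean(55, 5))      # 465

# Lunch spending example: sigma estimated as range / 6.
sigma = (700 - 400) / 6                 # 50.0
print(sample_size_for_mean(sigma, 5))   # 385
```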

Sample size determination: Proportions

n = (z² * π(1 - π)) / D²
π = population proportion?
Secondary sources, pilot study, or
conservative (π = 0.5)

Example:
Suppose a researcher is interested in estimating the proportion of households in a particular region
that have bought clothes online. What should be the sample size?
1) The researcher should specify the level of precision. This is the maximum permissible
difference between the sample proportion and the population proportion.
D = 0.05
2) The researcher should specify the level of confidence and determine the z-value associated
with this confidence level.
Confidence level = 95% => z-value = 1.96
3) The researcher should determine the population proportion.
Secondary sources indicate a population proportion of 0.64 (π = 0.64)
D = 0.05, z = 1.96, π = 0.64
n = [1.96² * 0.64(1 - 0.64)] / 0.05² = 354.04 => 355

EXAM Question:
A researcher wants to know the percentage of households that have a loyalty card of a certain
supermarket. The researcher desires a precision level of 5 percentage points (and a 95% confidence level).
How large should the sample size be?
Level of precision: D = 5 (percentage points)
Confidence 95%: z = 1.96
π = 50 (in %; conservative estimate, as no prior information is given)
n = [1.96² * 50(100 - 50)] / 5² = 384.16 => 385
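
The two proportion examples can be reproduced with the squared form n = z² * π(1 - π) / D², rounded up. In this Python sketch the proportion and the precision are expressed as fractions (0.05 rather than 5 percentage points); the function name is illustrative.

```python
import math

def sample_size_for_proportion(pi, precision, z=1.96):
    """Minimum sample size so the sample proportion is within +/- precision of pi."""
    return math.ceil(z ** 2 * pi * (1 - pi) / precision ** 2)

print(sample_size_for_proportion(0.64, 0.05))  # 355 (online clothes shopping example)
print(sample_size_for_proportion(0.5, 0.05))   # 385 (loyalty card example, conservative pi)
```

The conservative choice π = 0.5 maximises π(1 - π), so it gives the largest sample size for a given precision and confidence level.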

Other factors influence the determination of the sample size:


o Time resources
o Financial resources
o Type of data analysis
o Access
o Expected response (see later response rate)

Non-probability sampling
o Formulas of probability sampling techniques
- Are based on the assumption that the sample cases are randomly selected
- Formulas are just guidelines
o Larger sample sizes do not necessarily lead to higher levels of confidence and precision
o However, take into account
- Variability in the target group
- Goal of sampling
- Importance of research for management/client
o Or you could consider sample sizes used in similar studies, for instance

Non-response and response rate


The non-response problem
o In reality, you are likely to have non-responses
o Possible causes of non-response
- Refusal to participate
- Ineligibility to respond
- Inability to locate respondent
- Respondent located but unable to make contact
o Possible consequences of non-response
- Lower confidence and precision levels due to smaller sample size
- Non-response bias: People who refuse differ from actual respondents
o As part of your research report, you will need to include the response rate:
Total response rate = Total number of responses / (Total number in sample - Ineligible)
Active response rate = Total number of responses / (Total number in sample - (Ineligible + Unreachable))

Total and active response rate: Example


Suzan has decided to administer a telephone questionnaire to people who had left her company
over the past five years. She obtained a list of the 1034 people who had left over this period (the
total population) and selected a 50 per cent sample. Unfortunately, she could obtain current
telephone numbers for only 311 of the 517 ex-employees who made up her total sample. Of these
311 people who were potentially reachable, she obtained a response from 147. In addition, her list
of people who had left her company was inaccurate, and 9 of those she contacted were ineligible
to respond, having left the company over five years earlier.
Total response rate = Total number of respondents / (Total number in sample - ineligible)
Total response rate = 147 / (517 - 9) = 28.9 %
Active response rate = Total number of respondents / (Total number in sample - (ineligible +
unreachable))
Active response rate = 147 / (311 - 9) = 48.7 %
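
Suzan's figures can be checked with a short Python sketch of the two response-rate definitions. The function names are illustrative; the number of unreachable people is taken as 517 - 311 = 206.

```python
def total_response_rate(responses, sample, ineligible):
    """Responses as a percentage of the sample minus ineligible cases."""
    return 100 * responses / (sample - ineligible)

def active_response_rate(responses, sample, ineligible, unreachable):
    """Responses as a percentage of the sample minus ineligible and unreachable cases."""
    return 100 * responses / (sample - (ineligible + unreachable))

unreachable = 517 - 311   # ex-employees without a current telephone number
print(round(total_response_rate(147, 517, 9), 1))                 # 28.9
print(round(active_response_rate(147, 517, 9, unreachable), 1))   # 48.7
```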
Estimating response rates and actual sample size required
o Non-response is a reality, so you should estimate the likely response rate and increase the
sample size accordingly
- First, determine the minimum sample size (taking into account the required
confidence and precision levels)
- Second, estimate the likely response rate
- Third, calculate the actual sample size you require:

na = n x 100 / re%

na = The actual sample size required
n = The minimum sample size
re% = The estimated response rate expressed as a percentage

Example:
Peter was a part-time student employed by a large manufacturing company. He had decided to send
a questionnaire to the company's customers and calculated that a minimum sample size of 439 was
required. From previous questionnaires that his company had used to collect data from customers,
Peter knew the likely response rate would be approximately 30 per cent. Using these data he could
calculate his actual sample size:
na = 439 x 100 / 30 = 43 900 / 30 = 1463
Peter's actual sample size, therefore, needed to be 1463 customers.
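As a quick check, Peter's calculation can be reproduced in a few lines. This is a sketch with an invented function name. The exact result is about 1463.33, which the example reports as 1463; rounding up to 1464 is a slightly more cautious alternative and is shown here as an assumption, not as part of the original example.

```python
import math


def actual_sample_size(minimum_sample_size, response_rate_pct):
    """Minimum sample size scaled up for the expected response rate (in per cent)."""
    return minimum_sample_size * 100 / response_rate_pct


exact = actual_sample_size(439, 30)

print(f"Exact value:      {exact:.2f}")
print(f"As in the notes:  {int(exact)}")        # fractional part dropped
print(f"Rounded up:       {math.ceil(exact)}")  # more cautious choice (assumption)
```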

How to estimate the response rate?


Consider the response rates achieved for similar research that has already been undertaken
Beware, response rates can vary considerably when collecting primary data!
- E.g., postal questionnaires: often lower than 50% (?)
- E.g., face-to-face contact: often higher (?)
- E.g., online questionnaires: often lower than 30% (?)
Alternatively, err on the side of caution (35-50 per cent reasonable?)

Possible consequences of non-response:


Lower confidence and precision levels due to smaller sample size
Increasing the actual sample size is useful in case non-response only results in less confidence and
precision.
However, increasing the actual sample size is no solution
o When doing longitudinal research in which the same respondents need to be re-examined
o If it is a matter of non-response bias
- Refusers differ on observable characteristics (e.g., gender, education) compared to
respondents
- Refusers might also differ on non-observable characteristics!
How to trace non-response bias?
o Comparing characteristics of respondents with refusers
- At the moment of refusal
- Afterwards by means of additional contact
o Comparing characteristics of respondents with population
o Still not the solution when there would be differences in terms of non-observable
characteristics
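Comparing respondents with the population can be as simple as setting the share of each group among the respondents against its known share in the population. The sketch below uses invented figures (90 women and 60 men among respondents, a 50/50 population) and a hypothetical helper name. It only covers observable characteristics, so it cannot reveal differences on non-observable ones.

```python
def share_gaps(respondent_counts, population_shares):
    """Share of each group among respondents minus its share in the population."""
    total = sum(respondent_counts.values())
    return {
        group: respondent_counts[group] / total - population_shares[group]
        for group in population_shares
    }


# Hypothetical figures, invented for illustration only
respondents = {"female": 90, "male": 60}
population = {"female": 0.50, "male": 0.50}

gaps = share_gaps(respondents, population)
for group, gap in gaps.items():
    # A positive gap means the group is over-represented among respondents
    print(f"{group}: {gap:+.1%}")
```

Large gaps on characteristics that matter for the research question are a warning sign of possible non-response bias.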
How to tackle non-response bias?
o Increasing the number of contacts
o Work with substitutes that are randomly selected, but which match on crucial characteristics
(e.g., gender)
- However, this measure is not able to solve the bias completely

EXECUTE SAMPLING PROCESS


VALIDATE SAMPLE
o Once data are collected from a sample, comparisons between the structure of the sample
and the structure of the population should be made
o If it is found that the structure of a sample does not match the target population (due to
population specification error, sampling frame error, sample selection bias, non-response
bias), a weighting scheme can be used
- A statistical procedure that attempts to account for these errors/biases by assigning
differential weights to the data depending on the response rates

Weighting

Each case in the database is assigned a weight


The effect of weighting is to increase or decrease the number of cases in the sample that
possess certain characteristics
Most widely used to make the sample data more representative of a target population on
specific characteristics
Also used to adjust the sample so that greater importance is attached to participants with
certain characteristics
Because it destroys the self-weighting nature of the sample design, this procedure should be
applied with caution! Do not forget to report this procedure!
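One common way to assign such weights is to divide each group's share in the population by its share in the sample. The sketch below illustrates that idea with invented figures. It is an assumption about how a weighting scheme might look, not the specific procedure prescribed in these notes, and the function name is hypothetical.

```python
def representation_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (sample_counts[group] / total)
        for group in sample_counts
    }


# Hypothetical figures, invented for illustration only
sample = {"female": 90, "male": 60}
population = {"female": 0.50, "male": 0.50}

weights = representation_weights(sample, population)
weighted_counts = {group: sample[group] * weights[group] for group in sample}

for group in sample:
    print(f"{group}: weight {weights[group]:.3f}, weighted count {weighted_counts[group]:.1f}")
```

Over-represented groups receive a weight below 1 and under-represented groups a weight above 1, so the weighted sample matches the population on the chosen characteristic while the total number of cases stays the same.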

Chapter 5: Using secondary data


Research questions might be answered using some combinations of primary and secondary data.
Secondary data can be more effective in terms of money and time.

TYPES OF SECONDARY DATA AND USES IN RESEARCH

May be both quantitative and qualitative data


May be raw data (received little if any processing) or compiled data (received some form of
selection or summarising)
Primarily used in descriptive and explanatory research (also possible in exploratory
research!)
Within business and management research, secondary data are most frequently used as
part of a case study or survey research strategy (also used as part of other research
strategies!)

Three main subgroups of secondary data

DOCUMENTARY SECONDARY DATA

Often used in research projects that also collect primary data (But you can also use them on
their own or with other sources of secondary data!)
Include text materials and non-text materials. (can be nice to create a background with text)
Can be analysed both quantitatively and qualitatively
Can be used to help to triangulate findings based on other data
Documentary sources you have available can depend on access issues as well as success in
locating these sources

SURVEY-BASED SECONDARY DATA

Data collected using a survey strategy (e.g., questionnaires) that have already been analysed
for their original purpose
Collected through one of three distinct subtypes of survey strategy:
o Censuses
- Usually carried out by governments Data are often:
clearly defined
well documented
of high quality
easily accessible
widely used
- Are unique as, unlike surveys, participation is obligatory Therefore,
they provide very good coverage of the population surveyed
- E.g., population and housing censuses
o Continuous and regular surveys
- Those surveys, excluding censuses, that are repeated over time
E.g., surveys where data are collected throughout the year
E.g., UK's General Lifestyle Survey (GLF)
E.g., surveys repeated at regular intervals
E.g., EU Labour Force Survey
Comparative data
- Also carried out by non-governmental bodies
E.g., market research surveys
Data often costly to obtain
- Also carried out by large organisations
E.g., employee attitude survey
Often difficult to gain access due to sensitive nature

o Ad hoc surveys (one-off surveys, carried out just once)
- = A general term normally used to describe the collection of data
that only occurs once due to the specificity of focus
- Usually one-off surveys
- Usually far more specific in their subject matter
- Because of their ad hoc nature, it will probably be more difficult to
discover relevant surveys

MULTIPLE-SOURCE SECONDARY DATA

= Secondary data created by combining two or more different data sets prior to the data
being accessed for the research. These data sets can be based entirely on documentary or
on survey data, or can be an amalgam of the two
E.g., Various compilations of company information
o E.g., Europe's 15,000 Largest Companies
Some methods of compilation
o Extract and combine selected comparable variables from a number of surveys or
from the same survey that has been repeated a number of times to provide
longitudinal data (time-series data)
o Data compiled from the same cases over time using a series of snapshots to form
cohort studies
o Secondary data from different sources can be combined, if they have the same
geographical basis, to form area-based data sets (E.g., Europe in Figures: Eurostat
Yearbook)

Question
The Facebook page of McDonald's is an example of:
a) Documentary secondary data
b) Survey secondary data
c) Multiple source secondary data
d) None of the above types of secondary data
Answer: A) Documentary
HOW TO LOCATE SECONDARY DATA?
Are the data you need available?
Requires you to:
STEP 1: Establish whether the sort of data you require are likely to be available as secondary data
STEP 2: Locate the precise data you require
STEP 1: ESTABLISHING THE LIKELY AVAILABILITY OF SECONDARY DATA
Literature review (Reference list)
Quality national newspapers
Subject-specific textbooks
Tertiary literature (e.g., indexes and catalogues)
Informal discussions
STEP 2: LOCATE SECONDARY DATA
Once you have ascertained that secondary data are likely to exist, you need to find their
precise location
o Relatively straightforward for secondary data held in online databases or held by
specialist libraries
o Data held by organisations are more difficult to locate (time consuming, quality?, ...)
o Once you have located a possible secondary data set, you need to be certain that it
will meet your needs

Advantages of Secondary Data


May have fewer resource requirements
Unobtrusive
Longitudinal studies may be feasible
Can provide comparative and
contextual data
Can result in unforeseen discoveries
Permanence of data

Disadvantages of Secondary Data


May be collected for a purpose that does not
match your need
Access may be difficult or costly
Aggregations and definitions may be unsuitable
No real control over data quality
Initial purpose may affect how data are
presented

EVALUATING SECONDARY DATA SOURCES


We need to make sure that the secondary data are valid.
Secondary data must be viewed with the same caution as any primary data!
You need to be sure that:
o They will enable you to answer your research question
o The benefits associated with their use will be greater than the costs
o You will be allowed access to the data
Most authors suggest a range of validity and reliability criteria against which potential
secondary data can be evaluated
These criteria can be incorporated into a three-stage process
1. Overall suitability
2. Precise suitability
3. Costs and benefits

1. OVERALL SUITABILITY
Measurement validity
o Do the measures used match those you need?
o E.g., A manufacturing organisation recording monthly sales whereas you are
interested in monthly orders
o E.g., Use minutes of company meetings as a proxy for what actually happened in
those meetings
Coverage and unmeasured variables
o Do secondary data cover the population about which you need data, for the time
period you need, and contain variables that will enable you to answer the research
question?
o Some secondary data sets may not include variables you have identified as
necessary for your analysis (i.e., unmeasured variables)
Checklist:
Does the data set contain the information you require to answer your research
question(s)?
Do the measures used match those you require?
Is the data set a proxy for the data you really need?
Does the data set cover the population that is the subject of your research?
Does the data set cover the geographical area that is the subject of your research?
Can data about the population that is the subject of your research be separated from
unwanted data?
Are the data for the right time period or sufficiently up to date?
Are data available for all the variables you require to answer your research question(s)?
Are the variables defined clearly?
2. PRECISE SUITABILITY
Reliability and validity
o Quick option: Assess the authority or reputation of the source
o In-depth assessment:
Who is responsible for the data?
Method used to collect the data? (we can contact the person who
conducted the research and ask about the methodology)
Context in which the data were collected?
How were data analysed and reported?
Measurement bias: Can occur for two reasons
1) Deliberate or intentional distortion of data
E.g.,
Purpose of study is to reach a predetermined conclusion
E.g.,
People responding to a structured interview adjusting their
responses to please the interviewer

Triangulation!
2) Changes in the way data were collected
Particularly important for longitudinal data sets!
Checklist:
How reliable is the data set you are thinking of using?
How credible is the data source?
Is it clear what the source of the data is?
Do the credentials of the source of the data (author, institution or organisation
sponsoring the data) suggest it is likely to be reliable?

Do the data have an associated copyright statement?


Do associated published documents exist?
Does the source contain contact details for obtaining further information about the
data?
Is the method described clearly?
If sampling was used, what was the procedure and what were the associated
sampling errors and response rates?
Who was responsible for collecting or recording the data?
(For surveys) Is a copy of the questionnaire or interview checklist included?
(For compiled data) Are you clear how the data were analysed and compiled?
Are the data likely to contain measurement bias?
What was the original purpose for which the data were collected?
Who was the target audience and what was its relationship to the data collector or
compiler (were there any vested interests)?
Have there been any documented changes in the way the data are measured or
recorded including definition changes?
How consistent are the data obtained from this source when compared with data
from other sources?
Have the data been recorded accurately?
3. COSTS AND BENEFITS
Checklist:
What are the financial and time costs of obtaining these data?
Can the data be downloaded into a spreadsheet, statistical analysis software or word
processor?
Do the overall benefits of using these secondary data sources outweigh the associated
costs?
Question:
Suppose you are undertaking a research project as part of your research methods course in which
you need to investigate the following research question:
How has Belgium's import and export trade with other countries altered since its entry into
the European Union?
Give one argument that you could use to convince the project leader of the suitability of using
secondary data to answer this research question.
Answer:
Time-wise, it is not feasible to gather primary data, as doing so would be very time consuming.
Question:
Which of the following statements is wrong?
A) Primary data become secondary data
B) Primary data are more reliable and valid compared to secondary data
C) Research projects might combine primary and secondary data.
D) Secondary data enable researchers to triangulate their primary research findings
Answer: B) Primary data are more reliable and valid compared to secondary data

Chapter 6: Collecting Primary Data through observation


If your research question is concerned with what people do, an obvious way in which to discover this
is to watch them do it
This is essentially what observation involves:
The systematic observation, recording, description, analysis and interpretation of
(people's) behaviour
Observation is very often used in marketing research. E.g.: Loyalty card: observing the buying
pattern of customers. Cookies online: browsing behaviour (Structured Observation)
Two types of observation are examined in this chapter
1. Participant observation
Qualitative
Emphasis is on discovering the meaning that people attach to their actions
2. Structured observation
Quantitative
Emphasis is on the frequency of actions
Observation can be used as either the main method of data collection or to supplement other
methods!!!
What can be observed?
Behavior and physical actions
Verbal behavior
Body language
Spatial aspects of relationships
Time patterns
Physical objects (products on shelves)
Activities from the past
...
Dimensions on which observation methods differ
Natural or Manipulated
Natural: observe people in their natural environment
Manipulated: In a lab setting
Personal or Mechanical
Personal: Human being observing
Mechanical: Done by an eye tracker (To see what attracts people more)
Hidden or Not Hidden
Concealed identity

Question:
The sellers of a multimedia store visit competitive stores and write down their prices. This
observation is:
a) Natural - Personal - Not hidden
b) Manipulated - Mechanical - Not hidden
c) Manipulated - Personal - Hidden
d) Natural - Personal - Hidden
Answer: d) Natural - Personal - Hidden

PARTICIPANT OBSERVATION

Qualitative
Emphasis is on discovering the meaning that people attach to their actions.

Observation in which the researcher attempts to participate fully in or closely observe the
lives and activities of the research subjects and thus becomes a member of the subjects'
group(s), organisation(s) or community
This enables researchers to share their experiences by not merely observing what is
happening but also feeling it
E.g., Street Corner Society by W.F. Whyte

TYPOLOGY OF PARTICIPANT OBSERVATION RESEARCHER ROLES

The time you have to conduct the research will determine the role you would take as a researcher.
Complete participant:
Preventing social desirability
Raises questions of ethics
Might lose sight of research purpose
Observer as participant:
Able to focus on the researcher role
Lose the emotional involvement

Participant as observer:
Not always easy to gain trust of the group you observe
FACTORS THAT WILL DETERMINE THE CHOICE OF PARTICIPANT OBSERVER ROLE
Purpose of your research
Which role suits your research question?
E.g.: A phenomenon about which the research informants would be naturally
defensive is one that lends itself to the complete participant role

The time you have to devote to your research


Some of the roles may be very time consuming
E.g.: A period of attachment might be necessary

The degree to which you feel suited to participant observation

Organisational access

Ethical considerations
The degree to which you reveal your identity as the researcher or adopt a covert
stance will be dictated by ethical considerations

Note making and recording data


Note making: Your notes are likely to be composed of different types of data:
o Primary observations
What happened? What was said?
o Secondary observations
Statements by observers about what happened or was said
o Experiential data
Perceptions and feelings as you experience the process you are researching
o Contextual data
Data related to the research setting and organisational structures
and communication patterns that will help you to interpret other
data
Data Collection

No formal interviews but informal discussions


Recording must take place on the same day as the fieldwork in order to not forget valuable
data

Data Analysis

Data from participant observation are analysed like other qualitative data (not part of this
course; see BBA3)
Data will start to be analysed at the time you collect them (i.e., data collection and data
analysis will be carried out simultaneously)
o Promising lines of enquiry that you wish to follow up in your continued observation
will emerge (cfr. analytic induction)

ISSUES RELATED TO RELIABILITY AND VALIDITY


Participant observation has high ecological validity as it involves studying social actors and
social phenomena in their natural settings
However, using participant observation may lead to a number of threats to reliability and
validity:
o Observer error
o Observer bias
o Observer effect
Observer Error
Lack of understanding of or overfamiliarity with setting may lead you to unintentionally
misinterpret what is happening
Observer Bias
The observer uses his or her own subjective view or disposition to interpret events in the
setting being observed
o Always question your own interpretations and conclusions
o Informant verification: Form of triangulation in which the researcher presents
written accounts to informants for them to verify the content
Observer Effect
By simply being present, the researcher may affect the behaviour of those being observed
o Covert observation vs. Ethics
o Minimal interaction (observer melts into the background)
o Habituation: The informants being observed become familiar with process of
observation so that they take it for granted and behave normally
Advantages of Participant Observation
Good at explaining what is going on in particular social settings
Heightens the researcher's awareness of significant social processes
Particularly useful for researchers working within their own organisations
Affords the opportunity for the researcher to experience for real the emotions of those
who are being researched
Virtually all data collected are useful
Disadvantages of Participant Observation
Can be very time-consuming
Can pose difficult ethical dilemmas for the researcher
Can be high levels of role conflict for the researcher (e.g., colleague vs. researcher)
Closeness of researcher to the situation being observed can lead to significant bias
Very demanding role to which not all researchers will be suited
Access to organisations may be difficult
Data recording is often very difficult for the researcher

STRUCTURED OBSERVATION

Quantitative
Emphasis is on the frequency of actions

It enables us to see relationships between variables. Helps for causal research.

What is structured observation?


In contrast to participant observation:
o A high level of predetermined structure
o Adopt a more detached stance
o Concern is to quantify behavior (How often do things happen?)
Example:
Mintzberg, H. (1973). The nature of managerial work. New York: Harper & Row.
o Mintzberg questioned whether managerial work is a rational process of planning,
controlling and directing
o Therefore, he studied what five chief executives actually did during one working week
for each executive
o He did this by direct observation and the recording of events on three
predetermined coding schedules (which were developed based on a period of
unstructured observation)
The Internet has widened the scope to conduct forms of structured observation
Internet may be used in real time to make virtual structured observations (e.g., count the
number of visitors to websites in a given period)
Internet behavior may also be tracked and analysed (e.g., search engines such as Google
regularly do research on the search behavior of their users; indirect observation)
Advantages of using the internet for structured observation:
o Non-intrusiveness
o Removal of possible observer bias
Using coding schedules to collect data
One of the key decisions you need to make before undertaking structured observation is
whether:
o to use an off-the-shelf coding schedule
o or to design your own coding schedule

Using off-the-shelf coding schedule


Often used in management and business to record interpersonal interactions in social
situations such as meetings or negotiations
Advantages of using off-the-shelf coding schedules:
o You save a lot of time
o Has been tried and tested, which supports reliability and validity
When choosing such an off-the-shelf coding schedule, you need to ask yourself a number of
questions.

Questions to ask when choosing an off-the-shelf coding schedule:


For what purpose was the coding schedule developed? Is it consistent with your research
question?
Is there overlap between the behaviors to be observed?
Are all behaviors in which you are interested covered by the schedules?
Are the behaviors sufficiently clearly specified so that all observers will place behaviors in
the same category?
Is any observer interpretation necessary?
Are codes to be used indicated on the recording form to avoid the necessity for
memorisation by the observer?
Will the behaviors to be observed be relevant to the inferences you make?
Have all sources of observer bias been eliminated?
Developing your own coding schedule: Checklist:
Are the meanings of the codes to be used transparent and have you written these down?
Have you ensured that the meanings of different codes do not overlap?
Are the codes you have developed flexible enough in practice to be applied across different
settings?
Are the codes you have developed strictly relevant for the behaviors that you wish to
observe and record?
Does the range of codes you have developed cover all the behaviors you wish to observe and
record?

Are the codes you have developed simple to understand and undemanding to apply so that
you will not need to memorise or check their meanings?
An alternative to the use of an off-the-shelf schedule or the development of your own may be a
combination of the two!
Data Analysis
The complexity of your analysis will depend on your research question
o It may be that you are using the coding schedule to establish the number of
interactions by category in order to relate the result to the output of the meeting.
Simple manual analysis may be sufficient for this purpose.
o Alternatively, you may be using the coding schedule to see what patterns emerge.
This level of analysis is more complex and will usually need statistical software.
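The simple case, counting interactions by category, can be sketched in a few lines. The coding schedule and the sequence of observed codes below are invented for illustration and do not come from any published schedule.

```python
from collections import Counter

# Hypothetical coding schedule: code -> behaviour category
CODES = {
    "P": "proposing",
    "S": "supporting",
    "D": "disagreeing",
    "Q": "seeking information",
}

# Hypothetical codes noted by an observer during one meeting, in order of occurrence
observed = ["P", "S", "Q", "P", "D", "S", "P", "Q", "S", "P"]

# Guard against codes that are not part of the schedule
unknown = set(observed) - set(CODES)
if unknown:
    raise ValueError(f"Codes not in schedule: {sorted(unknown)}")

tally = Counter(observed)
for code, label in CODES.items():
    count = tally[code]
    print(f"{label:<20} {count:>2}  ({count / len(observed):.0%})")
```

The resulting frequencies per category could then be related to the output of the meeting. Looking for patterns across many meetings or observers would normally call for statistical software.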
Issues related to validity and reliability
See participant observation as well
Informant error: Errors that occur when informants are observed in situations that are
inconsistent with their normal behaviour patterns, leading to atypical responses
o E.g., You want to observe the number of orders sales administrators process in a day
and choose administrators in a section that was short-staffed owing to illness
Time error: The time at which you conduct an observation provides data that are untypical
of the total time period in which you are interested
o E.g., The number of calls taken in a call centre is often higher in the hours
surrounding lunchtime in comparison to any other two-hour period
Advantages of Structured Observation
Can be used by anyone after suitable training in the use of the measuring instrument.
Therefore, you could delegate this extremely time-consuming task.
May be carried out simultaneously in different locations. This would present the opportunity
of comparison between locations.
Should yield highly reliable results by virtue of its replicability. The easier the observation
instrument is to use and understand, the more reliable the results will be.
Capable of more than simply observing the frequency of events. It is also possible to record
the relationship between events. For example, does a visit to a website lead to the
exploration of related pages and video recordings; does this lead to a decision to purchase?
Allows the collection of data at the time they occur in their natural setting. Therefore, there
is no need to depend on second-hand accounts of phenomena from participants who put
their own interpretation on events.
Secures data that most informants would ignore because to them these are too mundane or
irrelevant.
Disadvantages of Structured Observation
Unless virtual observation is used, the observer must be in the research setting when the
phenomena under study are taking place
Research results are limited to overt action or surface indicators from which the observer
must make inferences
Data are slow (and may be expensive) to collect

Question:
Which of the following statements is wrong?
A) Data based on observational studies can be used as secondary data
B) Structured observation might serve an explanatory research purpose
C) Collecting data through observation is part of qualitative research designs
D) Not revealing your researcher role in observational studies is advantageous as informants
probably behave in less socially desirable ways
Answer: C => Can also be quantitative. In exam explain why and with an example
A => Correct. But not very usual
B => Correct. We can investigate causal relationships between variables.

Chapter 7: Collecting primary data using Interviews


Research Interview
= A purposeful discussion between two or more people requiring the interviewer to
establish rapport, to ask concise and unambiguous questions and to listen attentively
Is a general term for several types of interview
o Structured interviews
o Semi-structured interviews
o Unstructured interviews
STRUCTURED INTERVIEWS
= Data collection technique in which an interviewer physically meets the respondent, reads
them the same set of questions in a predetermined order, and records his or her response to
each
Also called interviewer-administered questionnaires
(The interviewer asks the questions. This is the main difference from an online survey/questionnaire.)
Are used to collect quantifiable data
o Therefore, they are also referred to as quantitative research interviews
SEMI-STRUCTURED INTERVIEWS & UNSTRUCTURED INTERVIEWS
Semi-structured interview = Wide-ranging category of interview in which the interviewer
commences with a set of interview themes but is prepared to vary the order in which
questions are asked and to ask new questions in the context of the research situation
Unstructured interview = Loosely structured and informally conducted interview that may
commence with one or more themes to explore with participants but without a
predetermined list of questions to work through
Compared to structured interviews, those interviews are non-standardised
(They do not use the same questions for everyone and can deviate from a standardised list of questions)
Are often referred to as qualitative research interviews
Types of interview
According to the nature of interaction between the researcher and those who participate
o One-to-one versus One-to-many
o Face-to-face versus Telephone, Internet, intranet

In focus groups the questions might change depending on people's answers.


A research design may incorporate more than one type of interview!
Links to the purpose of research and research strategy
Structured interviews are normally used to gather data which will then be the subject of
quantitative analysis (e.g., as part of a survey strategy)
Semi-structured and unstructured interviews are normally used to gather data which will
then be the subject of qualitative analysis (e.g., as part of a case study strategy)

Next to the research purpose and the research strategy, which other factors determine when to use
qualitative research interviews?
The significance of establishing personal contact
The nature of the data collection questions
Length of time required and completeness of the process
The significance of establishing personal contact
People are more likely to agree to be interviewed rather than complete a questionnaire
An interview provides the opportunity to receive feedback and personal assurance about the
way in which information will be used
People may feel it is not appropriate to provide (sensitive and confidential) information to
someone they have never met
People may be reluctant to spend time providing written answers if the meaning of any
question is not entirely clear
More control over those who fill in the questionnaire

The nature of the questions


Qualitative research interviews will be the most advantageous approach to attempt to
obtain data in the following circumstances:
o Where there are a large number of questions to be answered
o Where the questions are either complex or open-ended
o Where the order and logic of questioning may need to be varied
Length of time required and completeness of the process
Length of time required
o In case gathering data will involve much time, people are more likely to agree to
participate in an interview rather than filling in a questionnaire
o Interview can be arranged at a time that suits the interviewee
Completeness of the process
o You are able to convince the participant to still answer certain questions
o You are able to form some indication of why a participant refuses to respond, and to
modify the question

In sum, a checklist to help you decide whether to use qualitative research interviews
Does the purpose of your research suggest using semi-structured and/or in-depth
interviews?
Will it help to seek personal contact in terms of gaining access to participants and their data?
Are your data collection questions large in number, complex or open-ended?
Will there be a need to vary the order and logic of questioning?
Will it help to be able to probe interviewees' responses to build on or seek explanation of
their answers?
Will the data collection process with each individual involve a relatively lengthy period?
Question:
Suppose you are an economist. You would like to do research among the CEOs of multinationals to
examine their opinion about opportunities and threats for the European economy during the next 10
years.
Question: What kind of interview would you undertake? Why?
Answer:
Qualitative, in order to explore more. Take a look at the target group. For a closed
questionnaire you won't get many responses and you won't be able to go into much depth.
Question:
Which of the following statements is wrong?
A) Structured interviews are suited for descriptive research purposes.
B) Qualitative research interviews allow you to check respondents' interpretation of questions.
C) Quantitative research interviews are more appropriate in case you have complex questions
compared to qualitative research interviews.
D) Unstructured interviews can be combined with quantitative data collection methods in the
same research project.
Answer: C => (Quantitative research interviews)
It's best to use qualitative interviews for complex questions because you can add more questions and expand on the topic.

Chapter 8: Collecting Primary Data using Questionnaires

Within business and management research, the greatest use of questionnaires is made
within the survey strategy. However, other research strategies (e.g., experiments) can make
use of these data collection methods as well.
Questionnaire = A general term to include all methods of data collection in which each
person is asked to respond to the same set of questions in a predetermined order
o Thus also includes, for instance, structured interviews (see Chapter 7), certain
telephone or online questionnaires, etc.

When to use questionnaire?


Questionnaires are usually not particularly good for exploratory or other research that
requires large numbers of open-ended questions
Questionnaires usually work best with descriptive or explanatory research using standardised
questions that you can be confident will be interpreted the same way by all respondents

How to choose a type of questionnaire?


Your choice of questionnaire will be influenced by a variety of factors:
Characteristics of the respondents from whom you wish to collect data
Importance of reaching a particular person as respondent
Importance of respondents' answers not being contaminated or distorted
Required sample size for your analysis, taking into account the likely response rate
Feasible length of questionnaire
Types of question you need to ask to collect your data
Number of questions you need to ask to collect your data
Time available to complete the data collection
Financial implications of data collection and entry
Availability of interviewers and field workers to assist
Use of automated data entry
Combining questionnaire types could be a problem in terms of validity. However, if it is done, we
have to report the different data collection methods.
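
The link between the required sample size and the likely response rate can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the target of 200 usable responses are assumptions, and the response rates are simply the "reasonable" figures quoted in this chapter for each questionnaire type.

```python
import math


def questionnaires_to_send(required_responses: int, response_rate: float) -> int:
    """Number of questionnaires to distribute so that, at the given
    response rate, about `required_responses` are expected back."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(required_responses / response_rate)


# Hypothetical target: 200 usable responses.
for label, rate in [("postal, 30%", 0.30),
                    ("telephone, 50%", 0.50),
                    ("Internet, 11%", 0.11)]:
    print(label, questionnaires_to_send(200, rate))
```

The lower the expected response rate, the more questionnaires have to be distributed, which feeds directly into the time and financial implications listed above.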

INTERNET- AND INTRANET-MEDIATED QUESTIONNAIRES

Population's characteristics for which suitable: Computer-literate individuals who can be
contacted by email, or accessed using the Internet or intranet
Confidence that right person has responded: High if using email
Likelihood of contamination or distortion of respondent's answer: Low
Size of sample: Large, can be geographically dispersed
Likely response rate: Variable, 30-50% reasonable within organisation/via intranet, 11% or
lower using Internet
Feasible length of questionnaire: Equivalent of 6-8 A4 pages, minimise scrolling down
Suitable types of question: Closed questions but not too complex; complicated sequencing
fine if uses software; must be of interest to respondent
Time taken to complete collection: 2-6 weeks from distribution (dependent on number of
follow-ups).
Main financial resource implications: If via a web page, web page design. Subscription to
online software.
Role of the interviewer/field worker: None
Data input: Automated

Online survey tools: Qualtrics, Snap Surveys, Sphinx, Survey Monkey

POSTAL QUESTIONNAIRES

Population's characteristics for which suitable: Literate individuals who can be contacted by
post; selected by name, household, organisation
Confidence that right person has responded: Low
Likelihood of contamination or distortion of respondent's answer: May be contaminated by
consultation with others
Size of sample: Large, can be geographically dispersed
Likely response rate: Variable, 30-50% reasonable
Feasible length of questionnaire: 6-8 A4 pages
Suitable types of question: Closed questions but not too complex; must be of interest to
respondent
Time taken to complete collection: 4-8 weeks from posting (dependent on number of follow-ups).
Main financial resource implications: Outward and return postage, photocopying, clerical
support, data entry.
Role of the interviewer/field worker: None
Data input: Closed questions can be designed so that responses may be entered using
optical mark readers after questionnaire has been returned

Optical mark reader:
Recognises and converts marks into data at rates often exceeding 200 pages a minute

DELIVERY AND COLLECTION QUESTIONNAIRES

Population's characteristics for which suitable: Literate individuals who can be contacted by
post; selected by name, household, organisation, etc.
Confidence that right person has responded: Low
Likelihood of contamination or distortion of respondent's answer: May be contaminated by
consultation with others
Size of sample: Dependent on number of field workers
Likely response rate: Variable, 30-50% reasonable
Feasible length of questionnaire: 6-8 A4 pages

Suitable types of question: Closed questions but not too complex; simple sequencing; must
be of interest to respondent
Time taken to complete collection: Dependent on sample size, number of field workers, etc.
Main financial resource implications: Field workers, travel, photocopying, clerical support,
data entry
Role of the interviewer/field worker: Delivery and collection of questionnaires, enhancing
respondent participation
Data input: Closed questions can be designed so that responses may be entered using
optical mark readers after questionnaire has been returned

TELEPHONE QUESTIONNAIRES

Population's characteristics for which suitable: Individuals who can be telephoned; selected
by name, household, organisation, etc.
Confidence that right person has responded: High
Likelihood of contamination or distortion of respondent's answer: Occasionally distorted or
invented by interviewer
Size of sample: Dependent on number of interviewers
Likely response rate: High, 50-70% reasonable
Feasible length of questionnaire: Up to half an hour
Suitable types of question: Open and closed questions, including complicated questions;
complicated sequencing fine
Time taken to complete collection: Dependent on sample size, number of interviewers, but
slower than self-completed for same sample size
Main financial resource implications: Interviewers, telephone call, clerical support;
photocopying and data entry if not using CATI; programming, software, and computers if
using CATI (= computer-aided/assisted telephone interviewing)
Role of the interviewer/field worker: Enhancing respondent participation; guiding the
respondent through the questionnaire; answering respondents' questions
Data input: Response to all questions entered at time of collection using CATI

STRUCTURED INTERVIEW

Population's characteristics for which suitable: Any; selected by name, household,
organisation, in the street
Confidence that right person has responded: High
Likelihood of contamination or distortion of respondent's answer: Occasionally
contaminated by consultation or distorted/invented by interviewer
Size of sample: Dependent on number of interviewers
Likely response rate: High, 50-70% reasonable
Feasible length of questionnaire: Variable depending on location
Suitable types of question: Open and closed questions, including complicated questions;
complicated sequencing fine
Time taken to complete collection: Dependent on sample size, number of interviewers, but
slower than self-completed for same sample size
Main financial resource implications: Interviewers, travel, clerical support; photocopying
and data entry if not using CAPI; programming, software and computers if using CAPI (=
computer-assisted/aided personal interviewing)
Role of the interviewer/field worker: Enhancing respondent participation; guiding the
respondent through the questionnaire; answering respondents' questions
Data input: Response to all questions entered at time of collection using CAPI

Many other factors to consider


Suitability for sensitive questions
Proneness to social desirability
Ability to show visual stimuli
Insight into reason for non-response
Amount of information that can be gathered
Possibility to guarantee anonymity
Interviewer effects
Possibility to control correctness of information
Controlling people's interpretation of questions
Possibility for clarification
Question:
Suppose you are the manager of a local multimedia store. Two weeks ago, you organised a special
opening weekend with unique promotions. You would like to know how many people in the near
environment have heard about this weekend.
Question: What kind of questionnaire would you choose? Why?
Answer:
Telephone questionnaire => more time efficient
Could also be postal. Problems: time/cost/response
Question:
Which of the following statements is correct?
A) Structured interviews can include more complicated questions than postal questionnaires.
B) The confidence about the right person responding to the questionnaire is higher for postal
questionnaires than telephone questionnaires.
C) In general, online questionnaires have higher response rates compared to structured
interviews.
D) The likelihood of answers being contaminated by consultation with others is higher for
online questionnaires than postal questionnaires.
Answer: A (an interviewer is present who can explain the question if the respondent does not understand.)
Once you have chosen a particular questionnaire type, you should decide what data need to be
collected
Unlike semi-structured and unstructured interviews, the questions you ask in questionnaires
need to be defined precisely prior to data collection
The questionnaire offers only one chance to collect data as it is often difficult to identify
respondents or to return to collect additional information
This means that the time you spend planning precisely what data you need to collect, how
you intend to analyse those data and how you should design your questionnaire to meet these
requirements is crucial if you are to answer your research questions.
Types of data variable that can be collected through questionnaires?
Three types:
1. Opinion variables
2. Behaviour variables
3. Attribute variables

OPINION VARIABLES
Record how respondents feel about something or what they think or believe is true or false, i.e. asking
for someone's opinion.
Example: How do you feel about the following statement? "Teachers at KUL should place their
students' interests before their own."
Strongly disagree
Mildly disagree
Neither agree nor disagree
Mildly agree
Strongly agree

BEHAVIOUR VARIABLES
Contain data on what people (or their organisations) did in the past, do now or will do in the future
(E.g.: What were you studying last year? What toothpaste do you use?)
Example: Have you ever been to Paris?
Yes
No

ATTRIBUTE VARIABLES
Contain data about the respondents' characteristics. Attributes are best thought of as things a
respondent possesses, rather than things a respondent does. They are used, among other things, to
explore how opinions and behaviour differ between respondents as well as to check that the data
collected are representative of the total population (see Chapter 4 Sampling)
Used to link to an opinion/behaviour variable. E.g.: Gender => Opinion about
Example: What is your gender?
Female
Male
Question:
What type of variable (opinion, behaviour or attribute) is measured by means of the following
question?
What is your marital status?
Single
Married or living in long-term relationship
Widowed
Divorced
Other (Please describe)
Answer: Attribute

Question:
What type of variable (opinion, behaviour or attribute) is measured by means of the following
question?
Do you agree or disagree with the right to tourism?
Strongly agree
Agree
Neither agree, nor disagree
Disagree
Strongly disagree
Answer: Opinion
Deciding what data need to be collected
For most business and management research, the data you collect using questionnaires will
be used for either:
Descriptive purposes
o It is important that you select the appropriate characteristics to answer your
research question. Therefore, you will need to have:
- Reviewed the literature carefully
- Discussed your ideas with colleagues, project tutor and other
interested parties
Explanatory purposes
o You need to be clear about which relationships you think are likely to exist
between variables:
o Dependent variable(s)? Independent variable(s)? Mediating variable(s)?
Moderating variable(s)?

Independent variable (1): Variable that causes changes to a dependent variable or variables
Dependent variable (3): Variable that changes in response to changes in other variables
Mediating variable (2): A variable that transmits the effect of an independent variable to a
dependent variable
Moderating variable (4): A variable that affects the relationship between an independent
variable and a dependent variable

A mediator can act as both: it is a dependent variable with respect to the independent variable, and an independent variable with respect to the dependent variable.

You need to ensure that essential data are collected


= Collecting data that enables you to answer the research question(s)
One way to do this is to create a data requirements table
o Step 1: Decide whether the main outcome of your research is descriptive or
explanatory.
o Step 2: Subdivide each research question into more specific investigative questions
about which you need to gather data.
o Step 3: Repeat the second stage if you feel that the investigative questions are not
sufficiently precise.
o Step 4: Identify the variables about which you will need to collect data to answer
each investigative question.
o Step 5: Establish the level of detail required from the data for each variable.
o Step 6: Develop measurement questions to capture the data at the level of data
required for each variable.
An example of this six-step process:
Research question/objective:
To establish customers' attitudes to the outside smoking area at restaurants and bars.
Step 1: Decide whether the main outcome of your research is descriptive or explanatory.
Research question/objective:
To establish customers' attitudes to the outside smoking area at restaurants and bars. (Descriptive)
(If we wanted to analyse gender and/or age and opinion about smoking, it would have been
explanatory)
Type of research:
Predominantly descriptive, although wish to examine differences between restaurants and bars, and
between different groups of customers
Step 2: Subdivide each research question into more specific investigative questions about which
you need to gather data.
Some investigative questions
o Do customers feel that they should have an outside smoking area at restaurants and
bars as a right? (opinion)
o Do customers' opinions differ depending on age? (attribute)
o Do customers' opinions differ depending on whether or not they smoke? (behaviour)
Step 3: Repeat the second stage if you feel that the investigative questions are not sufficiently
precise.
Step 4: Identify the variables about which you will need to collect data to answer each
investigative question.
Investigative question
= Do customers feel that they should have an outside smoking area at restaurants
and bars as a right? (Opinion)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right
Investigative question
= Do customers' opinions differ depending on age? (Attribute)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right (dependent variable)
= Age of customer (independent variable)
Investigative question
= Do customers' opinions differ depending on whether or not they smoke? (Behaviour)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right (dependent variable)
= Smoking behaviour (independent variable)

Step 5: Establish the level of detail required from the data for each variable.
Details are important as they influence the statistical technique of the research.
Investigative question
= Do customers feel that they should have an outside smoking area at restaurants
and bars as a right? (Opinion)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right
Detail in which data measured
= Feel: should be a right, should not be a right, no strong feelings (NB will need
separate questions for restaurants and for bars)
Investigative question
= Do customers' opinions differ depending on age? (Attribute)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right
= Age of customer
Detail in which data measured
= Feel: should be a right, should not be a right, no strong feelings
= To nearest 5-year band (youngest 16, oldest 65+)
Investigative question
= Do customers' opinions differ depending on whether or not they smoke?
(Behaviour)
Variable required
= Opinion of customers on restaurants and bars providing an outside smoking area
as a right
= Smoking behaviour
Detail in which data measured
= Feel: should be a right, should not be a right, no strong feelings
= Non-smoker, smokes but not in own home, smokes in own home
Step 6: Develop measurement questions to capture the data at the level of data required for each
variable.
Detail in which data measured
= Feel: should be a right, should not be a right, no strong feelings (NB will need
separate questions for restaurants and for bars)
Measurement question:
Do you feel that an outside smoking area at restaurants:
Should be a right
Should not be a right
No strong feelings
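
The six-step process above can also be kept as a small data structure, which makes it easy to see which variables still lack a measurement question. This is only a sketch under assumptions: the class name, field names and the "Q1" label are hypothetical, and the rows paraphrase the smoking-area example from these notes.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """One row of a data requirements table."""
    investigative_question: str      # step 2
    variable: str                    # step 4
    variable_type: str               # "opinion", "behaviour" or "attribute"
    detail: str                      # step 5: level of detail required
    measurement_question: str = ""   # step 6: empty until drafted


table = [
    Requirement(
        "Do customers feel they should have an outside smoking area as a right?",
        "Opinion on outside smoking area as a right",
        "opinion",
        "should be a right / should not be a right / no strong feelings",
        "Q1",  # hypothetical label of the drafted measurement question
    ),
    Requirement(
        "Do customers' opinions differ depending on age?",
        "Age of customer",
        "attribute",
        "nearest 5-year band (youngest 16, oldest 65+)",
    ),
    Requirement(
        "Do customers' opinions differ depending on whether or not they smoke?",
        "Smoking behaviour",
        "behaviour",
        "non-smoker / smokes but not in own home / smokes in own home",
    ),
]


def variables_without_question(rows):
    """Variables for which step 6 has not been completed yet."""
    return [row.variable for row in rows if not row.measurement_question]


print(variables_without_question(table))
```

Here only the opinion variable has a measurement question so far, so the two remaining variables are listed as still to be covered.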

Question:
Develop a data requirements table for the following research objective:
To establish KUL students' opinions about vegetarian meals in student restaurants.
Data requirements table
So every question in your questionnaire is essential (i.e., each question should be relevant for your
research)
There are, however, some exceptions:
o First question
o Cushion questions (for threatening questions), e.g.: when asking people about sensitive subjects like
salary, start by asking general questions about the topic.
o Control questions: questions to make sure that the respondent is really reading the
questions. E.g.: "Do not tick any box." If some people did tick a box, you can exclude
them from the research. This should be reported. Control questions help to:
- Trace inconsistency
- Detect interviewer fraud
o Filler items (to conceal the true nature of the research): questions completely unrelated to the research. In doing
this, the participant is kept unaware of the topic of the research.
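
As an illustration of how a control question might be used at the analysis stage, the sketch below separates respondents who ticked a box on a "Do not tick any box" item from the rest. The field name `control_ticked`, the function name and the three example records are hypothetical.

```python
def split_on_control(responses, control_key="control_ticked"):
    """Split responses into (kept, excluded) using a control question.

    A respondent who ticked a box on the control item is assumed not to
    have read the questions carefully and is excluded.
    """
    kept = [r for r in responses if not r[control_key]]
    excluded = [r for r in responses if r[control_key]]
    return kept, excluded


# Hypothetical data: respondent 2 ticked the control item.
responses = [
    {"id": 1, "control_ticked": False},
    {"id": 2, "control_ticked": True},
    {"id": 3, "control_ticked": False},
]

kept, excluded = split_on_control(responses)
print(f"kept {len(kept)}, excluded {len(excluded)}")
```

The number of excluded respondents is exactly what should be reported alongside the results.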
Ok, you have an idea about the data you should collect by means of your questionnaire. In other
words, you have an idea about the data you should collect to, in the end, be able to answer your
research question(s).
=> Now, you should consider how to design those individual questions
Designing individual questions
The reliability and validity of the data you collect depend, to a large extent, on the design of
your questions
A valid questionnaire will enable accurate data that actually measure the concepts you are
interested in to be collected, whilst one that is reliable will mean that these data are
collected consistently.
The design of each question should be determined by the data you need to collect (see slide
about data requirements table)
When designing individual questions, researchers do one of three things:
- Adopt questions used in other questionnaires
- Adapt questions used in other questionnaires
- Develop their own questions
Adopting and adapting questions used in other questionnaires

May be necessary if you wish to replicate or to compare your findings with another
study (reliability can be assessed)
More efficient than developing your own questions
Questions might be validated already

Beware, there are poor questions in circulation as well


Copyright issues
Not always appropriate for own research goal

When writing individual questions, keep them neutral so that participants are not guided in a
certain direction.

Question types
The population or the background of the people can influence the type of question.
Open or open-ended questions allow respondents to give answers in their own way
Closed or closed-ended or forced-choice questions provide a number of alternative answers
from which the respondent is instructed to choose
Open questions: An example
Please list up to three things that you like about KUL:
1
2
3
OPEN QUESTIONS
Are widely used in unstructured and semi-structured interviews (see Chapter 7)
In questionnaires they are useful:
o If you are unsure of the response (e.g., exploratory research)
o When you require a detailed answer
o When you want to find out what is uppermost in the respondent's mind
The precise wording of the question and the amount of space partially determine the length
and fullness of the response
When questionnaires are returned by large numbers of respondents, responses to open
questions are extremely time-consuming to code (keep the use of open questions to a
minimum)
CLOSED QUESTIONS
Compared to open questions:
o Usually quicker and easier to answer as they require minimal writing
o Responses are easier to compare as they have been predetermined (If these
responses cannot be easily interpreted, then these benefits are marginal)
We discuss six types of closed question:
1. List questions
2. Category questions
3. Ranking questions
4. Rating questions
5. Quantity questions
6. Matrix questions
1. List Questions
The respondent is offered a list of items, of which they can choose one or more items
Example: Which benefits do you receive in your job next to your salary? Please tick the
appropriate box(es).
Mobile phone
13th month
Laptop
Car
Free internet at home
Food cheques
Public transport
Other (Please say:..)
Don't know

Useful when you need to be sure that the respondent has considered all possible responses
(However, the list of responses must be defined clearly and meaningfully to the respondent)

For structured interviews, it is often helpful to present the respondent with a prompt card
listing all responses
Extra items can be added
o Does not apply
o Don't know - Not sure
o Other
What in case of unmarked response?
o Are often inferred as negative responses
o Non-response could also indicate uncertainty or, for some questions, that an item
does not apply

2. Category Questions
Where only one response can be selected from a given set of categories
Example: Examine the pictures portraying several male celebrities below. Which celebrity do
you think is the most attractive?
Brad Pitt
Orlando Bloom
Chris Martin
Are particularly useful if you need to collect data about behaviour or attributes
The number of categories that you can include without affecting the accuracy of responses is
dependent on the type of questionnaire (self-completed and telephone questionnaires <
structured interviews using a prompt card)
Arrange responses in a logical order so that it is easy to locate the response category that
corresponds to the respondent's answer
Categories should be mutually exclusive and should cover all possible responses
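
For numeric categories such as age bands, the requirement that categories are mutually exclusive and cover all possible responses can be checked mechanically. The sketch below is one possible check, assuming inclusive integer bands; the function name and the example bands are hypothetical.

```python
def check_bands(bands, lowest, highest):
    """Check inclusive integer bands against the range lowest..highest.

    Returns a list of ("gap" | "overlap", start, end) tuples; an empty
    list means the bands are mutually exclusive and exhaustive.
    """
    problems = []
    expected = lowest  # next value that should be covered
    for low, high in sorted(bands):
        if low > expected:
            problems.append(("gap", expected, low - 1))
        elif low < expected:
            problems.append(("overlap", low, min(high, expected - 1)))
        expected = max(expected, high + 1)
    if expected <= highest:
        problems.append(("gap", expected, highest))
    return problems


# Hypothetical age categories: "under 20", "21-40", "41-60", "over 60".
print(check_bands([(0, 19), (21, 40), (41, 60), (61, 120)], 0, 120))
# Hypothetical categories sharing a boundary: "0-20" and "20-40".
print(check_bands([(0, 20), (20, 40)], 0, 40))
```

In the first set a respondent aged exactly 20 has no category to tick; in the second the same respondent could tick two.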
3. Ranking Questions
Where the respondent is asked to place something in rank order (you can discover the
relative importance to the respondent)
Example: Please number each of the factors listed below in order of importance to you in
your choice of a new car. Number the most important 1, the next 2 and so on.
Factor
Importance
Carbon dioxide emissions
Boot size
Depreciation
Safety features
Fuel economy
Price
Driving enjoyment
You need to make sure that the instructions are clear and will be understood by the
respondent
In general, ranking more than seven items takes too much effort and reduces motivation to
complete the questionnaire
Respondents can rank accurately only when they can see or remember all items.
o This can be overcome with face-to-face questionnaires by using a prompt card.
o Telephone questionnaires should ask respondents to rank fewer items, as the
respondent will need to rely on their memory
Can be combined with a list question: List-ranking question
Example: Please number each of the factors listed below in order of importance to you in
your choice of a new car. Number the most important 1, the next 2 and so on. If a factor has
no importance at all, please leave blank.

Factor
Importance
Carbon dioxide emissions
Boot size
Depreciation
Safety features
Fuel economy
Price
Driving enjoyment
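
When ranking responses are entered, it can help to check that each rank from 1 upwards was used exactly once, and, for a list-ranking question, that blanks are accepted. The sketch below shows one way to do this; the function name and the example answers are hypothetical, with blanks represented as `None`.

```python
def is_valid_ranking(ranks, allow_blank=False):
    """Return True if the given ranks are 1, 2, 3, ... with no ties or gaps.

    ranks: dict mapping each factor to an int rank, or None if left blank.
    allow_blank: set to True for a list-ranking question, where factors
    of no importance may be left blank.
    """
    given = [r for r in ranks.values() if r is not None]
    if not allow_blank and len(given) < len(ranks):
        return False
    return sorted(given) == list(range(1, len(given) + 1))


# Hypothetical answers to the car-choice question (three factors only).
complete = {"Price": 1, "Safety features": 2, "Fuel economy": 3}
tied = {"Price": 1, "Safety features": 1, "Fuel economy": 2}
partial = {"Price": 1, "Safety features": None, "Fuel economy": 2}

print(is_valid_ranking(complete))
print(is_valid_ranking(tied))
print(is_valid_ranking(partial, allow_blank=True))
```

A tied or incomplete ranking on a plain ranking question usually signals that the instructions were not clear enough to the respondent.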

4. Rating Questions
In which a rating device is used to record responses
Often used to collect opinion data
Three types of rating question that are often used:
o Likert-style rating question
o Numeric rating question
o Semantic differential rating question
Likert-style rating question
Allows the respondent to indicate how strongly she or he agrees or disagrees with a
statement
Often used to examine attitudes, importance, intentions
Usually on a four-, five-, six- or seven-point rating scale
Points are accompanied by a verbal description (sometimes also a number)
An even number of points forces the respondent to choose; an odd number of points allows the respondent to
choose the middle "not sure" category

Numeric rating question
Uses numbers as response options to identify and record the respondent's response. The
end response options, and sometimes the middle, are labelled
Graphics may also be used to reflect the rating scale visually
An additional category of "not sure" or "don't know" may be added and should be separated
slightly from the rating question
Example:
For the following statement, circle the number that matches your view most closely.
This concert was
Poor value for money 1 2 3 4 5 6 7 8 9 10 Good value for money
Semantic differential rating question
Allows the respondent to indicate his or her attitude to a concept defined by opposite
adjectives or phrases (bipolar rating scale)
In case you have several bipolar rating scales, you should vary the position of positive and
negative adjectives from left to right to reduce the tendency to read only the adjective on
the left

Rating questions => Scale

Rating questions have been combined to measure a wide variety of concepts (e.g., customer
loyalty, service quality, job satisfaction, etc.)
In other words, different rating questions are combined to form a scale
Each question is then referred to as a scale item
The resultant scale is represented by a scale score created by combining the scores for each
of the rating questions
An example: A scale to measure relationship commitment (De Wulf et al., 2001)
To which extent do you agree with the following statements?
(5-point Likert-scale going from totally disagree to totally agree)
1. I am willing to go the extra mile to remain a customer of this store.
2. I feel loyal towards this store.
3. Even if this store was more difficult to reach, I would still keep buying there.
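
To make the idea of a scale score concrete, the sketch below combines one respondent's answers to the three items into a single score. Averaging the items is an assumption made here for illustration (summing is another common way of combining them); the function name and the example answers are hypothetical.

```python
from statistics import mean


def scale_score(item_scores, points=5):
    """Combine the scores of the scale items into one scale score.

    item_scores: one score per scale item, each between 1 and `points`
    (1 = totally disagree, `points` = totally agree).
    """
    if any(not 1 <= score <= points for score in item_scores):
        raise ValueError("each item score must lie between 1 and `points`")
    return mean(item_scores)


# Hypothetical respondent answering the three commitment items.
print(scale_score([4, 5, 3]))
```

Averaging keeps the scale score on the same 1 to 5 range as the individual items, which makes it easier to interpret.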
Scales
Since scaling techniques were first used in the 1930s, literally thousands of scales have been
developed to measure attitudes and personality dimensions and to assess skills and abilities.
These scales can be used in your own research providing they:
o Measure what you are interested in
o Have been empirically tested and validated
o Were designed for a reasonably similar group of respondents
You should only make amendments to existing scales where absolutely necessary as
significant changes could impact upon the validity of the scale and thus the results!
Copyright issues
To save time and money, we can use a scale that has already been developed. However, it has
to be reliable and designed for a similar group of respondents. E.g.: If a scale was developed for adults, we cannot simply use it for children, or the other way
round.
Be careful about copyright!!!
5. Quantity Questions
The response is a number giving the amount of a characteristic
Such questions tend to be used to collect behaviour or attribute data
Example: What is your year of birth? 1 9 _ _
Because the data collected by the above question could be entered into the computer
without coding (see later), the question can also be termed a self-coded question, that is one
which each respondent codes her or himself

6. Matrix Questions
Grid questions; Enable you to record the responses to two or more (similar) questions at the
same time using the same grid
Saves space
However, it is suggested that respondents may have difficulties comprehending these
designs and that they are a barrier to response. Clear instructions might be a solution.

More creative question types

Thermometer Scale
Please indicate how much you like McDonald's hamburgers by coloring in the thermometer. Start at
the bottom and color up to the temperature level that best indicates how strong your preference is.

Question:
What type of question is the question below?
Please describe what you think is the main reason why students study at KUL
Answer: Open Question
Question:
What type of question is the question below?
What is your age?
Less than 20 years
21-40 years
41-60 years
More than 60 years
Answer: Category question (you can only select one)
Question:
What type of question is the question below?
How many computers do you have at home?
____ computers
Answer: Quantity question (self-coded question)
DESIGNING INDIVIDUAL QUESTIONS
When designing your individual questions, you should consider several important
questions!
o Will respondents have the necessary knowledge to answer your question? Does the
respondent have the requested information?
o Does your question appear to talk down to respondents? Are there any words in
your question that might cause offence? Is your question likely to embarrass the
respondent?
o Are the words used in your question familiar to the respondents? Will all
respondents understand the questions in the same way? Do you use vague words
that are open to all sorts of interpretation (e.g., sometimes)?
o Are there any words that sound familiar and might be confused with those used in
your question? Are there any words that look similar and might be confused if your
question is read quickly?
o Can your question be shortened?
o Are you asking more than one question at the same time?
o Does your question include a negative or double negative?
o Is your question unambiguous?
o Does your question imply that a certain answer is correct? (leading questions)
Central tendency effect
During last week, how many hours have you watched television per day (on average)?
o Less than 1 hour
o Between 1 hour and less than 2 hours
o Between 2 hours and less than 3 hours
o Between 3 hours and less than 4 hours
o 4 hours or more
People tend to choose the average answer. Social desirability problem. (Validity)

Designing individual questions


Does your question prevent certain answers from being given? (e.g., Is this the first time you
have pretended to be sick?)

Are the instructions on how to answer and interpret each question clear?
In case of a structured interview, include instructions for the interviewer!
(e.g., Read aloud all possible responses)
Introduce new topics (e.g., The following questions refer to)
Does the respondent have to put lots of effort into providing the information?
Example: Van Kenhove (1989) 152 students: How many cups of coffee did you have on
average each day during last week?

People tend to avoid answering a question if they have to think too much.
Does the respondent remember the requested information?
o The extent to which respondents remember facts from the past depends on:
- The importance of the event: Routine and involvement
- The time between the event and the question
- The extent to which the respondent's memory is stimulated

SOLUTIONS FOR THREATENING QUESTIONS


Not at the beginning of the questionnaire (always ask personal questions like
name, address, email, etc. at the end of the questionnaire)
Sufficiently broad response categories
Graduality
Indirect
Different places in questionnaire in slightly different wording
Cushion questions
Frame the behavior as normal or frequent
Projective techniques
Guarantee anonymity
ORDER EFFECTS
Order effects within one multiple choice question
Primacy effect
Recency effect

Question:
What is wrong with the following question?
1. Would you rather not eat non-vegetarian dishes?
2. When did you stop smoking?
3. Many students believe their advisors are not helping them with their master theses. Do you
believe this to be the case?
4. What is the income of your parents?
5. Do you have a good relationship with your mother and father?
6. Does the government should prohibit promotion of tobacco products?
Answer:
1. Non-vegetarian dishes => Double negative ("rather not" and "non-vegetarian")
2. Stop smoking => Should indicate year
3. Advisor => Leading question. Most respondents would say yes
4. Income => might not have the knowledge
5. Relationship mother and father => 2 questions in one
6. Tobacco => Leading question
Ok, you have designed your individual questions and the wording of each of these questions is
correct.
Now, you should consider how to construct the questionnaire

CONSTRUCTING THE QUESTIONNAIRE

Consider the order and the flow of the questionnaire


o Should be logical to the respondent (and interviewer)
o To assist the flow of the questions it may be necessary to include filter questions,
but beware of using more than two or three of these in paper and pencil
questionnaires

Sometimes it is necessary to deviate from the main principle order above. E.g.: Quota sample
Also,
Make the questions simple and not leading.
Start with simple questions followed by more difficult ones.
Least threatening to most threatening.
CONSTRUCTING THE QUESTIONNAIRE : ORDER AND FLOW
Are questions at the beginning of your questionnaire more straightforward and ones the
respondent will enjoy answering? (introduce the questionnaire with some introductory and
easy questions) Are questions at the beginning of the questionnaire obviously relevant to
the stated purpose of your questionnaire?
Are questions and topics that are more complex placed towards the middle of your
questionnaire?
Are personal and sensitive questions towards the end of your questionnaire, and is their
purpose clearly explained?
Are questions grouped into obvious sections that will make sense to the respondent?
ORDER EFFECTS
The effect of previously asked questions on the responses of subsequent questions
o Example:
Have you passed your driving test?
Did you have a good driving instructor?
Funnelling = General -> specific questions
CONSTRUCTING THE QUESTIONNAIRE : ORDER AND FLOW
Consider the layout of the questionnaire. This is important for both interviewer- and self-completed questionnaires:
o Make reading questions easy
o Make filling in responses easy
o Make it attractive, to encourage respondents to fill it in and return it
o Not appear too long (see previous class / use of matrix questions)
o Keep the visual appearance simple
When constructing your questionnaire, do not forget to explain the purpose of the
questionnaire. Next to developing a covering letter or email (affects the response rate), you
should:
o Introduce the questionnaire
Why you want the respondent to complete the survey
Interviewer-administered questionnaire: Short introduction that interviewer
reads to respondent
Prepare answers to some questions the respondent might ask you
o Close the questionnaire
Thanking respondent
Providing contact details for any queries
When/How/Where to return
Ok, you have constructed the questionnaire.
Prior to administering your questionnaire to collect data, it should be tested first.

TEST YOUR QUESTIONNAIRE

The purpose is to refine the questionnaire so that respondents will have no problems in
answering the questionnaire. In addition, it will enable you to obtain some assessment of
reliability and validity.
o Expert(s) to comment on your questionnaire
o Pilot test (trial run) among a smaller group (similar to population)
The number of people and pilot tests depends on research question, size of
research project, time and money resources and quality of initial
questionnaire design
You should find out:
o How long the questionnaire took to complete
o The clarity of instructions
o Which, if any, questions were unclear or ambiguous
o Which, if any, questions the respondent felt uneasy about answering
o Whether in their opinion there were any major topic omissions
o Whether the layout was clear and attractive
o Any other comments
In case of interviewer-completed questionnaires, you should find out whether:
o There are any questions for which visual aids should have been provided
o They have difficulty in finding their way through the questionnaire
o They are recording answers correctly

ONCE YOU HAVE COLLECTED YOUR DATA, YOU CAN START ANALYSING


However, in some cases (e.g., some paper and pencil questionnaires) you will have to code
the answers yourself and enter these into a data matrix (e.g., SPSS data set)
o Quantity question: Actual numbers = Codes
o Other question types:
Design a coding scheme yourself
Or use an existing coding scheme (enables comparison)
QUESTION CODING
Pre-coding: The process of incorporating coding schemes in questions prior to a
questionnaire's administration
Is the quality of teaching you get:
Excellent   Good   Reasonable   Poor   Awful
    5         4        3          2      1
Online form (questionnaire) using online software tools (e.g. Qualtrics - returns the data
electronically in a variety of formats such as SPSS and Excel)

Open questions = More complex coding

For office use only

What is your full job title?

So now the data are entered into a data matrix (e.g., SPSS data set) either manually (after coding) or
electronically/automatically (e.g., by using Qualtrics). You thus have raw data to be analysed.
In general, each column of this data matrix represents a variable, while each row represents a case.

          ID   Variable 1   Variable 2   Variable 3
Case 1    1    5            1            0
Case 2    2    6            2            1
Case 3    3    4            1            1

Details on how SPSS works on slide


MISSING DATA
Four main reasons for missing data:
1. Data not required from the respondent
2. The respondent refused to answer the question (non-response)
3. The respondent did not know the answer or did not have an opinion
4. The respondent may have missed a question by mistake, or the respondent's answer may
be unclear
Before presenting, summarising and/or analysing your data, you should check for errors. Main
methods:
Illegitimate codes (e.g., letter o instead of zero)
Illogical relationships (e.g., higher managerial occupations and manual work)
Check that rules in filter questions are followed. Certain responses to filter
questions mean that other variables should be coded as missing values. If
this has not happened, there has been an error.
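
The checks above can be sketched in Python. This is only a minimal illustration on an invented data set: the variable names, the legitimate codes for `satisfaction`, the filter question `owns_car` (1 = yes, 0 = no) and the missing-value code -9 are all assumptions made for the example, not part of the course material.

```python
# Hypothetical coded survey data: one dict per case (row of the data matrix).
MISSING = -9  # assumed missing-value code

cases = [
    {"id": 1, "satisfaction": 5, "owns_car": 1, "car_age": 3},
    {"id": 2, "satisfaction": "o", "owns_car": 0, "car_age": MISSING},  # letter o, not zero
    {"id": 3, "satisfaction": 2, "owns_car": 0, "car_age": 4},  # filter rule broken
]

LEGITIMATE_SATISFACTION = {1, 2, 3, 4, 5}  # assumed 5-point rating


def find_errors(rows):
    """Return a list of (case id, problem) tuples."""
    found = []
    for row in rows:
        # Illegitimate code: the value is not one of the allowed codes.
        if row["satisfaction"] not in LEGITIMATE_SATISFACTION:
            found.append((row["id"], "illegitimate code"))
        # Filter rule: no car means car_age must be coded as missing.
        if row["owns_car"] == 0 and row["car_age"] != MISSING:
            found.append((row["id"], "filter rule not followed"))
    return found


errors = find_errors(cases)
print(errors)
```

Each flagged case would then be looked up in the original questionnaire and corrected by hand.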
In some cases, you will have to transform your data. For instance, recode and compute (see next
slides)
RECODE
Example: Three items to measure relationship commitment (7-point Likert-scale from
disagree to agree)
1. I am willing to go the extra mile to remain a customer of this store
2. I don't feel loyal toward this store
3. Even if this store was more difficult to reach, I would still keep buying there
Items 1 and 3 are worded positively, whereas item 2 is worded negatively, so a high score on item 2
means low commitment. Before computing a scale score, item 2 has to be reverse-coded: the order
1 -> 7 becomes 7 -> 1 (1 becomes 7, 2 becomes 6, ..., 7 becomes 1). Then the three items can be
combined.
Steps:
1) Select variables that have to be recoded
2) Define new name & label, then click Change
3) Click old and new values
4) Specify the old values, the new values and click on add
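
Outside SPSS, the same recode can be sketched in a few lines of Python. This is only an illustration with made-up answers. On a 7-point scale the reversed value is 8 minus the original value. Averaging the three items into one commitment score is an assumption about how the compute step is done.

```python
SCALE_MAX = 7  # 7-point Likert scale


def reverse_code(value, scale_max=SCALE_MAX):
    """Reverse a Likert score: 1 -> 7, 2 -> 6, ..., 7 -> 1."""
    return scale_max + 1 - value


# Made-up answers of one respondent to the three commitment items.
answers = {"item1": 6, "item2": 2, "item3": 7}

# Recode: only item 2 is negatively worded, so only item 2 is reversed.
recoded = dict(answers)
recoded["item2"] = reverse_code(answers["item2"])

# Compute: the scale score is assumed here to be the mean of the three items.
commitment = sum(recoded.values()) / len(recoded)
print(recoded["item2"], round(commitment, 2))
```

Keeping the original answers and storing the reversed value in a copy mirrors the SPSS advice to recode into a new variable rather than overwrite the old one.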

Chapter 9: Experiments
Commonly used to infer causal relationships
Simply stated, to infer whether a change in one or more independent variables produces
a change in one or more dependent variables.

CLASSIC EXPERIMENT (PRETEST-POST-TEST CONTROL GROUP DESIGN)

Participants randomly assigned to either the experimental group or control group


Each group should be similar in all aspects relevant to the research other than whether or
not they are exposed to the planned intervention or manipulation
o Experimental group:
Some form of planned intervention/manipulation will be tested
o Control group:
No such intervention/manipulation is made

Pre-test measurement of purchasing behaviour -> "Buy two, get one free" promotion: yes or no -> Post-test measurement of purchasing behaviour

For causal relationships, an experiment is ideal.

Example: Marketing Research
X (independent variable): Mood (positive or negative) -> Y (dependent variable): Attitude towards promotion

The attitude change is due to the independent variable.


Random assignment is important to rule out the effect of third variables (in both the control and the
experimental group).
Identical measurement scales have to be used to assess the experiment.

CLASSIC EXPERIMENT STATED OTHERWISE


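R = Random assignment
E = Experimental group
C = Control group
O = Observation (measurement of the dependent variable)
O21 => 2: group number (1 = experimental group, 2 = control group), 1: measurement number (1 = pre-test, 2 = post-test)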
Conditions for causality
Before making causal inferences or assuming causality, three conditions must be satisfied:
1. There must be a statistical relationship between A (independent variable or cause) and B
(dependent variable or effect)
2. A must occur before B
3. The relationship under investigation is not the result of some third variable(s)
THIRD VARIABLES:
Extraneous or Confounding variables = Variables, other than independent and dependent variables,
which may influence the results of the experiment.
They are a great threat to the experiment. It is something else that was going on during the
experiment that you were not controlling. May lead to a problem of validity.
We can estimate the effect of the third variables from the control group.

If there is a difference between O22 and O21, this is caused by extraneous variables
The difference between O12 and O11 is also influenced by these extraneous variables, but also
by the intervention
Consequently, the effect of the independent variable:
= (O12 - O11) - (O22 - O21)
= Change in experimental group - Change in control group
= (Manipulation + third variable) - Third variable = Manipulation
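
A small worked example of this calculation, written as a Python sketch. The four numbers are made-up average purchases per week, and the function name `treatment_effect` is introduced only for this illustration.

```python
def treatment_effect(o11, o12, o21, o22):
    """Effect of the independent variable: (O12 - O11) - (O22 - O21)."""
    change_experimental = o12 - o11  # manipulation + third variables
    change_control = o22 - o21       # third variables only
    return change_experimental - change_control


# Made-up data: experimental group goes from 10 to 16, control group from 10 to 12.
effect = treatment_effect(o11=10, o12=16, o21=10, o22=12)
print(effect)  # 4
```

Of the 6-unit increase in the experimental group, 2 units also appear in the control group and are therefore attributed to third variables. The remaining 4 units are attributed to the manipulation.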

CLASSIC EXPERIMENT

Is easier to set up in a laboratory environment (= an artificial one that the researcher
constructs with the desired conditions specific to the experiment)
However, it is also possible to set up experiments in a field environment => Field
experiments

Lab => Highly controlled, so internal validity is good.
Field => Little control, but external validity is high (real environment)
External validity = The extent to which the findings of a study can be generalised to the population at large.
Will the result hold for different kinds of people? (Can I generalise my result?)
Internal validity = The degree to which we are successful in eliminating third variables within the study
itself. We are really measuring what we intended to measure.

Laboratory vs. Field experiments

Reactive error:
The artificiality of the lab environment may cause reactive error in that participants react to the
situation itself rather than to the independent variable
Demand artefacts:
A phenomenon in which participants attempt to guess the purpose of the experiment and respond
accordingly.
For example, while viewing a film clip, participants may recall pre-treatment questions about the
brand and guess that the advertisement is trying to change their attitudes towards the brand.
Internal validity:
To draw valid conclusions about the effects of independent variables on the study group.
External validity:
To make valid generalisations
INTERNAL VALIDITY
Refers to whether the manipulation or treatment of the independent variables actually
caused the observed effects on the dependent variables. Refers to whether the observed
effects on the test units could have been caused by variables other than the treatment.
If the observed effects are influenced or confounded by extraneous variables, it is difficult to
draw valid inferences about the causal relationship between the independent and
dependent variables.
Internal validity is the basic minimum that must be present in an experiment before any
conclusion about treatment effects can be made.
Control of extraneous variables is a necessary condition for establishing internal validity.
EXTERNAL VALIDITY
Refers to whether the cause-and-effect relationships found in the experiment can be
generalised. Can the results be generalised beyond the experimental situation and, if so, to
what populations, settings, times, independent and dependent variables can the results be
projected?

Threats to external validity arise when the specific set of experimental conditions does not
realistically take into account the interactions of other relevant variables in the real world.
To control for extraneous variables, a researcher may conduct an experiment in an artificial
environment. This enhances internal validity, but may limit external validity.
Regardless of the deterrents to external validity, if an experiment lacks internal validity, it
may not be meaningful to generalise the results.

THREATS TO VALIDITY
Factors that threaten internal validity may also threaten external validity, the most serious
of these being extraneous or confounding variables as they offer alternative explanations.
Some examples of extraneous variables (which can occur jointly and also interact with each
other):
o History
o Maturation
o Testing effects
o Instrumentation
o Selection bias
o Mortality
Threats to validity: History
= Specific events that are external to the experiment but that occur at the same time as the
experiment
The longer the time interval between observations, the greater the possibility that history
will confound the results.
Example:
o You measure ticket sales for a Christmas market before and after a new promotional
campaign. You find no difference between the ticket sales.
o Was the promotional campaign ineffective? What if, for instance, general economic
conditions declined during the experiment and the local area was particularly hard
hit by redundancies through several employers closing down their operations?
Threats to validity: Maturation
= An extraneous variable attributable to changes in the test units themselves that occur with the
passage of time
In an experiment involving people, maturation takes place as people become older, more
experienced, tired, bored or uninterested. (they might also think differently)
Studies that span several months are vulnerable to maturation, since it is difficult to know
how participants are changing over time.
Maturation effects also extend to test units other than people. For instance, travel
companies change over time in terms of personnel, physical layout, decoration, ...
THREATS TO VALIDITY : TESTING EFFECTS
= Effects caused by the process of experimentation
An example
An experiment measuring the effect of advertising on attitudes towards taking a holiday in Egypt.
You measure the attitudes before and after the advertisement. If there is no difference between the
pre- and post-treatment attitude, it might be that the participants tried to maintain consistency
among these two attitudes.

There could be an interaction effect between the before-measurement and the
manipulation. In other words, the manipulation combined with the before-measurement
has another effect compared to the manipulation without the before-measurement.
o E.g., The sensitivity to the content of an information campaign increases by a before-measurement
of certain knowledge and attitudes related to the content of the information campaign
How to change the above design to examine the presence of such interaction effects?
o Solomon four-group design
o Another possibility is to rule out the effect of the pre-test measurement: Post-test-only control
group design


THREATS TO VALIDITY : INSTRUMENTATION
= An extraneous variable involving changes in the measuring instrument or the observers

E.g., Measurement instruments modified during the course of an experiment.

E.g., Using different observers or interviewers, observer boredom, fatigue, anticipation of
results can distort the results of separate observations.

One way to solve this is by means of a double-blind experiment: Neither the test units nor the
researcher conducting the experiment is aware of the goal of the experiment
THREATS TO VALIDITY : SELECTION BIAS
= An extraneous variable attributable to the improper assignment of participants to treatment
conditions
This bias occurs, among others, when the selection or assignment of participants results in
treatment groups that differ on the dependent variable before the exposure to the
treatment condition.
Example:
o Suppose that age influences health and that all older people in your study are assigned to
the placebo condition (i.e., the control condition) and not to the medicine condition.
o Suppose that health is poorer in the control condition compared to the medicine condition.
Is this result due to the placebo or rather due to the age variable?
o In other words, make sure that age is equally distributed among the different
experimental conditions (i.e., the control condition and the medicine condition)

THREATS TO VALIDITY : MORTALITY
= An extraneous variable attributable to the loss of participants while the experiment is in
progress
This happens for many reasons, such as participants refusing to continue in the experiment.
Mortality confounds results because it is difficult to determine whether the lost participants
would respond in the same manner to the treatments as those that remain.

These extraneous variables represent alternative explanations of experimental results.


Consequently, you should control for these confounding variables!
CONTROLLING EXTRANEOUS VARIABLES
There are some ways of controlling extraneous or confounding variables, such as:
o Randomisation
o Matching
o Statistical control
o Manipulation (check)
o Design control
Randomisation
= A method of controlling extraneous variables that involves randomly assigning participants to
experimental conditions.
As a result of random assignment, extraneous factors can be represented equally in each
treatment condition. Randomisation is the preferred procedure for ensuring the prior
equality of experimental groups.
It is possible to check whether randomisation has been effective by measuring the possible
extraneous variables and comparing them across the experimental groups.
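
Random assignment and the check described above can be sketched as follows. This is a minimal Python illustration on invented participants: the group size of 40, the `age` variable as the possible extraneous variable and the fixed seed are all assumptions made for the example.

```python
import random
from statistics import mean

# Hypothetical participants with one possible extraneous variable (age).
participants = [{"id": i, "age": 18 + (i * 7) % 40} for i in range(40)]

rng = random.Random(42)  # fixed seed so the assignment can be reproduced
shuffled = participants[:]
rng.shuffle(shuffled)

# Random assignment: first half to the experimental group, second half to control.
half = len(shuffled) // 2
groups = {
    "experimental": shuffled[:half],
    "control": shuffled[half:],
}

# Randomisation check: compare the extraneous variable across the groups.
mean_age = {name: mean(p["age"] for p in members) for name, members in groups.items()}
print(mean_age)
```

If the two means differ strongly, randomisation has not balanced age in this sample. In practice the comparison would be made with a statistical test rather than by eye.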
Randomisation is not always possible
Manipulation of fixed characteristics (such as race, gender, age, ...)
o E.g., Examining the effect of gender on the development of social skills => It is not
possible to randomly assign people to one of the two gender categories
o E.g., Examining the effect of social status of defendant on penalty => It is not
possible to change the social status and replace it by a randomly assigned social
status
Ethical issues
o E.g., Not treating certain people with an effective medicine just for experimental
reasons

If you are not allowed to or if you are not able to apply randomisation, you can ensure similarity of
both experimental and control group by means of the matching procedure
Matching
= A method of controlling extraneous variables that involves matching participants on a set of key
background variables before assigning them to the treatment conditions.
This method has some drawbacks:
1. Participants can be matched on only a few characteristics, so the participants may be
similar on the variables selected but unequal on others.
2. If the matched characteristics are irrelevant to the dependent variable, then the
matching effort has been futile.
3. Matching is only possible in case you have all relevant information for all test units
Statistical control
= A method of controlling extraneous variables by measuring the extraneous variables and
adjusting for their effects through statistical methods
E.g., ANCOVA = Analysis of Covariance
Manipulation (check)
Suppose you want to examine the influence of a negative versus a positive mood on the
attitude towards an advertisement. You run a laboratory experiment. Respondents are
confronted with one of two film clips aimed at inducing either a negative or positive mood
state (i.e., two experimental groups, no control group).
In this experiment, there must be a manipulation check to test whether respondents
confronted with the negative mood film clip indeed are in a negative mood, while
respondents confronted with the positive mood film clip indeed are in a positive
mood. If not, this is a threat to the internal validity of your experiment!
Design control
= A method of controlling extraneous variables that involves using specific experimental designs
Each experimental design has its own advantages and disadvantages, also in relation to
confounding variables
However, the goal of this chapter is not to explain all the possible experimental designs
EXPERIMENTAL DESIGNS
Up until now, we talked about some experimental designs in which the influence of one
particular manipulation/intervention/independent variable with one level is examined.
However, there are also experimental designs in which the effect of more than one level
and/or more than one independent variable is examined. (Some of these designs also do not
include a before-measurement)
COMPLETELY RANDOMISED DESIGN
The effect of one independent variable with different levels is examined (i.e., different
experimental groups are assigned to different experimental treatments with regard to
one variable)

Example: The effect of three different promotional campaigns (coupon, gift, extra amount of
product) on sales
o At random, you select 27 similar supermarkets that sell Snickers
o At random, these supermarkets are divided into three groups of 9 supermarkets (9
coupon, 9 gift, 9 extra amount)
o After one month, the actual sales are measured and compared between the three
experimental groups

FACTORIAL DESIGNS
The effect of more than one independent variable with each variable having more than
one level is examined
Example: The effect of two different promotional campaigns (coupon, gift) and two package
colours (red, yellow) on sales
o = 2x2 between-subjects design (i.e., 2 factors with two levels each). In other
words, there are 4 experimental conditions. Respondents are randomly assigned to
one condition.

Type of campaign (2 levels) and Colour (2 levels) -> Sales
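
The four conditions of this 2x2 design can be listed mechanically by crossing the levels of the two factors. The Python sketch below does this and then assigns eight hypothetical respondents (R1 to R8) so that each one ends up in exactly one condition. The respondent labels, the seed and the equal cell sizes are assumptions made for the illustration.

```python
import random
from collections import Counter
from itertools import product

campaigns = ["coupon", "gift"]
colours = ["red", "yellow"]

# Every combination of factor levels is one experimental condition.
conditions = list(product(campaigns, colours))
print(len(conditions))  # 4

# Between-subjects: shuffle the respondents, then deal them out over the
# conditions in turn, so each respondent gets exactly one condition.
rng = random.Random(1)
respondents = [f"R{i}" for i in range(1, 9)]
rng.shuffle(respondents)
assignment = {r: conditions[i % len(conditions)] for i, r in enumerate(respondents)}

cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

Adding a third factor or a third level only changes the input lists. The number of conditions is always the product of the numbers of levels.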
BETWEEN-SUBJECTS DESIGN VS . WITHIN-SUBJECTS DESIGN
Between-subjects design: Participants belong to one and only one experimental condition.
For example

Between-subjects design
o Advantages
Different levels of the independent variable will not influence each other
o Disadvantages
Test units in the different conditions can be different in terms of relevant
variables

Within-subjects design: Each test unit is assigned to each experimental condition. For this reason,
this approach is known as repeated measures.
For example

Within-subjects design
o Advantages
May be more practical than a between-subjects design because it requires
fewer participants
No differences between the different groups in your design
o Disadvantages
May lead to carryover effects where familiarity or fatigue with the process
distorts the validity of the findings

Question:
Suzan is a marketing researcher. She is interested in examining the impact of advertisements
depicting celebrities on consumer behavior. Based on an extensive literature review, she formulates
the following hypothesis:
Intentions to buy sporting clothes are higher when endorsed by male celebrities compared to female
celebrities, while intentions to buy healthy food are higher when endorsed by female celebrities
compared to male celebrities.
a) What is the dependent variable in Suzan's research?
b) Draw a bar chart that represents the hypothesis of this research. Define each element of this
chart.
c) How many experimental conditions are included in Suzan's experimental design? Explain
your answer.
d) Suppose that Suzan opts for a between-subjects design, while her colleague opts for a
within-subjects design. What is the difference between those two experiments?
e) Develop a rating question that enables Suzan to measure the dependent variable
Answer:
Independent variables: Gender of celebrities (male, female) and Type of product (sportswear, healthy food)
Dependent variable: Intention to buy
[Bar chart: y-axis = Intention to buy (0 to 4); x-axis = Gender of celebrities (male celebrities, female celebrities); bars = Sportswear and Healthy food]
How to measure intention to buy?


Ask: To what extent would you be willing to buy?
Rating method: 1-7 or 0, 1 or Yes, No
* => we can see how big the difference is. p < 0.05
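We have 4 experimental conditions (2 genders of celebrity x 2 types of product)
Between-subjects => each participant is exposed to 1 condition
Within-subjects => each participant is exposed to all 4 conditions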

Question:
You are doing research on the influence of specific positive emotions (pride versus happiness) on the
extent to which people behave ethically. In other words, you are wondering whether people behave
ethically to a different extent when they feel pride compared to when they feel happy.
Design a lab experiment that would enable you to examine this research topic. How would you
measure the independent and dependent variables?
Answer:
