(iii) click on the download link to RAndFriends, and download the file to a
temporary location;
(iv) you will require Administrator access to your computer in order for Excel and
R to communicate;
(v) make sure you have closed Excel and any previous version of R;
(vi)
(vii)
[Troubleshooting]
Occasionally, the installer is unable to locate Excel and this affects the
communication between Excel and R. To resolve this:
(a) start R by double-clicking on the R icon;
(b) at the R prompt, enter
> library(RExcelInstaller)
> installRExcel()
Verify the installation by double-clicking on the RExcel2007 icon on your desktop,
and going to the Add-Ins tab at the top of the screen.
You should be able to see the series of drop-down menus for performing various
statistical analyses.
3. GETTING STARTED
We will explore the use of RExcel through a series of guided worked examples,
which will include data management, basic statistical analysis and exploratory data
analysis. Double-click on the RExcel2007 icon on your desktop to launch both R
and Excel.
3.1 Height data for Children
Three groups of 10 children each have been identified in a survey of childhood
puberty, and the height data for the 30 children are shown below.
Group 1:   93   95  101  103  108  111  114  115  115  117
Group 2:  105  107  110  110  115  118  120  120  123  126
Group 3:  100  101  103  107  111  113  115  115  118  125
(a) Enter the data into an Excel spreadsheet or SPSS such that each row contains
the data for a unique individual (thus, you should have the data in thirty
rows and two columns, one for the height data and one to indicate which
group the child was from).
(b) Produce numerical summary statistics for the height of all children,
irrespective of the groupings. Interpret your results.
(c) Explore the data, stratified by the groupings. Interpret your results.
(d) Produce informative figures that will aid the understanding of the dataset. Did
you produce a histogram? Was it useful for understanding the distribution of
the height data?
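For readers who prefer to work at the R prompt rather than through the menus, the steps above can be sketched roughly as follows. This is only one possible approach, using base R functions; the object name `children` and the column names `height` and `group` are our own choices and do not come from the exercise.

```r
# Heights of the 30 children, copied from the data listed above.
group1 <- c(93, 95, 101, 103, 108, 111, 114, 115, 115, 117)
group2 <- c(105, 107, 110, 110, 115, 118, 120, 120, 123, 126)
group3 <- c(100, 101, 103, 107, 111, 113, 115, 115, 118, 125)

# (a) Long format: one row per child, one column for the height and
# one column for the group. The names are assumptions for illustration.
children <- data.frame(
  height = c(group1, group2, group3),
  group  = factor(rep(1:3, each = 10))
)

# (b) Numerical summaries for all children, ignoring the groups.
summary(children$height)
sd(children$height)

# (c) The same summaries, computed separately within each group.
tapply(children$height, children$group, summary)
tapply(children$height, children$group, sd)

# (d) A histogram of all heights, and side-by-side boxplots by group.
hist(children$height, main = "Height of all children", xlab = "Height")
boxplot(height ~ group, data = children, xlab = "Group", ylab = "Height")
```

Comparing the histogram with the boxplots is a quick way to judge which figure says more about a sample of this size.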
3.2 Mathematical ability and omega 3 consumption
The mathematics.xls dataset describes the data from an artificial study into the
effect of omega 3 consumption on the marks of the mock Secondary 4 exams in
Additional Mathematics from 3 schools. The dataset can be downloaded from
http://www.statistics.nus.edu.sg/~statyy/ST1232/bin/mathematics.xls.
(a)
(b) Through the use of appropriate figures, identify the problematic data and
remove them from further analyses.
(c) Through the use of an appropriate graphical summary, explore whether
there is any graphical evidence to suggest that the mathematics scores
before starting the omega 3 treatment differ significantly between males
(coded with sex = 1) and females (coded with sex = 2).
(d) Produce a scatterplot of the scores before and after the omega 3 treatment,
and comment on the relationship between the two variables.
(e) Calculate the empirical correlation between the scores before and after
the omega 3 trial (explore how to do this in either RExcel or SPSS yourself; try
looking through the drop-down menus).
(f) Introduce a new variable, which is defined as the difference in the scores
before and after the omega 3 trial (explore how to do this in either RExcel or
SPSS). Produce a scatterplot of the difference in scores against omega 3
consumption. Comment on the figure obtained.
(g) Produce a cross-tabulation table of school against sex, including the
frequencies of students in each school by sex. Comment qualitatively on whether
there exists any difference in the frequencies of the students from the different
schools between the different genders.
(h) Finally, investigate graphically with the use of a boxplot whether there is any
evidence to suggest a difference in daily omega 3 consumption between
the schools. Is this plot informative?
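The correlation, derived-variable, cross-tabulation and boxplot steps in this exercise can also be tried at the R prompt. The sketch below is illustrative only: the small data frame is made-up placeholder data, not values from mathematics.xls, and the column names `school`, `sex`, `before`, `after` and `omega3` are hypothetical, so check them against the actual file. Taking the difference as "after minus before" is likewise an assumption.

```r
# Placeholder data, NOT taken from mathematics.xls. The values and the
# column names (school, sex, before, after, omega3) are hypothetical.
scores <- data.frame(
  school = factor(c(1, 1, 2, 2, 3, 3)),
  sex    = factor(c(1, 2, 1, 2, 1, 2)),
  before = c(50, 62, 45, 70, 58, 66),
  after  = c(55, 60, 52, 74, 63, 65),
  omega3 = c(1.0, 0.5, 1.5, 1.0, 2.0, 0.0)
)

# Empirical (Pearson) correlation between the before and after scores.
cor(scores$before, scores$after)

# New variable: change in score, assumed here to be after minus before.
scores$diff <- scores$after - scores$before

# Scatterplot of the change in score against omega 3 consumption.
plot(scores$omega3, scores$diff,
     xlab = "Omega 3 consumption", ylab = "Difference in score")

# Cross-tabulation of school against sex (counts of students).
table(school = scores$school, sex = scores$sex)

# Boxplots of omega 3 consumption, one box per school.
boxplot(omega3 ~ school, data = scores,
        xlab = "School", ylab = "Omega 3 consumption")
```

With the real dataset loaded into a data frame, the same calls should apply once the column names are replaced with those used in the file.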