the temperature, impeller speed, duration, and whether or not the reactor has baffles. There are 5 variables in the dataset:
temperature: measured in Celsius
duration: duration of the experiment, measured in minutes
speed: speed of the impeller, measured in RPM
baffles: a categorical Yes or No variable
yield: the percentage yield at the end of the batch
Data source: Simulated data
Data shape: 14 rows and 5 columns
Cheddar cheese
Description: Concentrations of acetic acid, H2S, and lactic acid in 30 samples of mature cheddar cheese. A subjective taste value is also provided.
Data source: David Moore and George McCabe (1989). Introduction to the Practice of Statistics. Also see http://lib.stat.cmu.edu/DASL/Datafiles/Cheese.html
Data shape: 30 rows and 4 columns
Blender efficiency
Description: The effect of 4 factors on blending efficiency: particle size, mixer diameter, mixer rotational speed, and blending time.
Data source: Modified based on Box, Hunter, and Hunter, Statistics for Experimenters, 2nd edition, page 486.
Data shape: 18 rows and 5 columns
Class grades
Description: Grades from a Chemical Engineering course at McMaster University. The recorded values are the average of subcomponents: e.g. the Tutorial variable is the average of all tutorials, and the Final exam variable is the average of all questions in the final, written exam. The Prefix column is the year in which the student first enrolled at the university and is a crude approximation of the student's age (maturity). This particular course permitted students to work in groups for assignments, tutorials and the take-home exam. The groups were self-selected, and varied during the semester. Of interest is whether the assignments, tutorials, midterms or take-home exam are good predictors of the student's performance in the final exam. Also, does the Prefix variable show any promise as a prediction variable?
Data shape: 99 rows and 6 columns
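One simple first look at the prediction question raised for the class grades data is to rank each candidate column by the strength of its linear correlation with the final exam. The sketch below is only an illustration under stated assumptions: the helper names `pearson` and `rank_predictors` are hypothetical, the column labels `Midterm` and `Final` are assumed rather than taken from the data file, and the numbers in `toy` are made-up placeholders, not the real grades. A correlation ranking is a screening step and is not a substitute for fitting and validating a regression model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of the deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / sqrt(sxx * syy)


def rank_predictors(columns, target):
    """Return (name, r) pairs for every non-target column,
    sorted by absolute correlation with the target, strongest first."""
    y = columns[target]
    scores = [
        (name, pearson(x, y))
        for name, x in columns.items()
        if name != target
    ]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Made-up placeholder values, NOT the real class grades data.
# Column labels other than "Tutorial" are assumptions for illustration.
toy = {
    "Tutorial": [60, 70, 80, 90],
    "Midterm": [55, 65, 60, 85],
    "Final": [50, 60, 70, 80],
}

for name, r in rank_predictors(toy, "Final"):
    print(f"{name}: r = {r:.3f}")
```

With the real 99-row data, the same function could be applied after reading each column into a list of numbers. Because group work was permitted for assignments, tutorials and the take-home exam, a weak correlation for those columns would not be surprising, but that is something to check against the data rather than assume.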