
Modeling Assessment Activity

Kathleen Hawkey Emily Liu

Functions, Statistics, and Trigonometry Mrs. Kincaid Dewey October 10, 2012

Section 1: Chemistry

Water solubility is a physical property, often expressed as the mass of solute that dissolves in 100 g of water at a given temperature. An experiment was conducted, and data were collected on the temperatures at which different amounts of potassium nitrate (KNO3) completely dissolved. The goal is to find the model that best fits the data: linear, quadratic, or exponential.

Table 1
Solubility at Different Temperatures Data

Trial   Solubility (g per 100 g H2O)   Temperature (°C)
1       40                             25.5
2       47                             30.1
3       65                             42
4       79                             47.2
5       93                             53.2
6       120                            64
7       139                            70
8       160                            76.4

Table 1 shows the data gathered in the experiment on solubility at different temperatures. For this experiment, the independent variable is the temperature and the dependent variable is the solubility in grams per 100 g H2O. The data were graphed, and three types of models were compared (linear, quadratic, and exponential).

Figure 1. Linear Model Fit to Data

Figure 1 shows how a linear model fits the data. According to the graph, the observed values do not stray far from the predicted values.


Figure 2. Quadratic Model Fit to Data

Figure 2 shows the quadratic model fit to the data. The observed values do not seem to stray far from the predicted values, and the quadratic appears to be a better fit than the linear.

Figure 3. Exponential Model Fit to Data

Figure 3 shows the exponential model fit to the data. Like the quadratic model, most of the data points seem close to the predicted values, and this model also seems a better fit than the linear.

A residual is the difference between an observed value and the corresponding predicted value, that is, the error in the model compared to the data. Observed values are the actual data from the experiment, and predicted values are the values predicted by a model. The sum of squared residuals measures lack of fit and helps to determine the best model. It is found by squaring each residual and then adding the squared values. Since residuals are the errors, a model with a smaller sum of squared residuals has less error than one with a larger sum. Since the goal is to find the model that fits the data best, one way to find it is to choose the model with the least sum of squared residuals.

Figure 4. Finding the Sum of Squared Residuals for the Linear Model

Figure 4 shows how the sum of squared residuals was found for the linear model. The list solubility contains the observed values, and the predicted values are found with the linear model 2.34709*temp - 26.9439. List a holds the residuals. List b squares the residuals in list a, and list c sums the values in list b, giving about 224.386 units.
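The list-by-list calculation described for Figure 4 can be sketched in a few lines of Python. This is only an illustration of the same arithmetic, not the calculator procedure used in the report. The names `linear_model` and `sum_squared_residuals` are made up for this sketch, and the coefficients are the rounded ones quoted in the text.

```python
# Data from Table 1.
temperature = [25.5, 30.1, 42, 47.2, 53.2, 64, 70, 76.4]  # degrees C (independent)
solubility = [40, 47, 65, 79, 93, 120, 139, 160]          # g per 100 g H2O (dependent)


def linear_model(temp):
    """Linear model quoted in the text (rounded coefficients)."""
    return 2.34709 * temp - 26.9439


def sum_squared_residuals(model, xs, ys):
    """Square each residual (observed - predicted) and add the squares."""
    residuals = [y - model(x) for x, y in zip(xs, ys)]  # plays the role of list a
    squared = [r ** 2 for r in residuals]               # plays the role of list b
    return sum(squared)                                 # plays the role of list c


ssr_linear = sum_squared_residuals(linear_model, temperature, solubility)
print(round(ssr_linear, 3))
```

Because the coefficients are rounded, the printed value should agree with the reported 224.386 only to within a small rounding difference.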

Figure 5. Finding the Sum of Squared Residuals for the Quadratic Model

Figure 5 shows how the sum of squared residuals was found for the quadratic model. Observed values are in the list solubility, and predicted values are found with the model 0.02088*temp^2 + 0.227532*temp + 20.6769. List d finds the residuals. List e squares the residuals, and list f sums them, giving a sum of squared residuals of about 7.57186 units.

Figure 6. Finding the Sum of Squared Residuals for the Exponential Model

Figure 6 shows how the sum of squared residuals was found for the exponential model. The list solubility contains the observed values, and predicted values are found with the model 20.8028*(1.02763)^temp. List g finds the residuals by subtracting the predicted values from the observed values. List h squares the residuals, and list i finds that the sum of squared residuals is about 85.2361 units.

Now that the sum of squared residuals has been calculated for all three models, the model with the least sum can be determined. The sum for the linear model is about 224.386 units, the sum for the quadratic about 7.57186 units, and the sum for the exponential about 85.2361 units. The quadratic model has the least sum of squared residuals, meaning it has the least error of the three.

The correlation coefficient (often represented by the letter r) measures the strength of the linear relation between two variables. It is always a number between -1 and 1, no matter what the data are. When r is -1 or 1, it indicates a perfect linear relation, meaning that all the data points lie on the line; this is sometimes called a perfect correlation.
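As an illustration of the definition above, the correlation coefficient can be computed directly from the Table 1 data. The sketch below uses the standard Pearson formula. The function name `correlation_coefficient` is chosen here for illustration and is not taken from the report, which obtained r from a calculator.

```python
from math import sqrt

# Data from Table 1.
temperature = [25.5, 30.1, 42, 47.2, 53.2, 64, 70, 76.4]  # degrees C
solubility = [40, 47, 65, 79, 93, 120, 139, 160]          # g per 100 g H2O


def correlation_coefficient(xs, ys):
    """Pearson r: co-variation of x and y scaled by their individual spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation_coefficient(temperature, solubility)
print(round(r, 6))
```

The result should be close to the value of about 0.991489 that the report gives for the linear model.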


Figure 7. Correlation Coefficient for the Linear Model

Figure 7 shows the correlation coefficient for the linear model as about 0.991489. This number is very close to 1, meaning that the data have a strong positive linear correlation and most observed data points are on or close to the model.

Figure 8. Linear Model Residual Plot

Figure 8 shows the residual plot for the linear model. The residuals are never far from zero, but they seem to follow a somewhat parabolic pattern, suggesting that a model other than the linear may fit this set of data better.


Figure 9. Quadratic Model Residual Plot

Figure 9 shows the residual plot for the quadratic model. The residuals stay close to zero, implying that the quadratic model is a good fit for the data.

Figure 10. Exponential Model Residual Plot

Figure 10 shows the residual plot for the exponential model. The range of the residuals is larger than the quadratic model's but smaller than the linear model's. This implies that the exponential model fits better than the linear but slightly worse than the quadratic.

After comparing the scatterplots, the sums of squared residuals, and the residual plots of the three models, the best-fit model can be chosen. The graph of the quadratic model has the data points fairly close to the model, seemingly closer than either of the other two models, meaning it is the better fit graphically. The sum of squared residuals of the quadratic model was shown previously to be the least of the three, at about 7.57186 units, while the linear model has 224.386 units and the exponential has 85.2361 units. Finally, the residual plots also support the statement that the quadratic model is the best fit. The residual plot of the linear model showed that the range of the residuals was wide and that there was a possible parabolic pattern, which suggests that there may be a better model. The residual plot of the exponential model had a smaller range than the linear model's, but as the x value (temperature) increased, the residuals moved farther from zero and may have continued to do so. The residual plot of the quadratic model, however, had the smallest range of residuals and stayed close to zero. From this reasoning, it can be concluded that the quadratic model is the best-fit model for this set of data.

The quadratic model equation is y = 0.02088*temp^2 + 0.227532*temp + 20.6769. When the temperature is 50 °C, the mass of potassium nitrate expected to dissolve is found by substituting 50 for temp in the quadratic model equation. The result is about 84.2535 grams. This prediction is reliable because 50 °C is an interpolation: in the data table, the nearest temperatures below and above 50 °C are 47.2 °C and 53.2 °C, where the observed amounts dissolved were 79 grams and 93 grams, respectively. The predicted 84.2535 grams lies between those values.

But what happens when the independent and dependent variables are switched, so that the solubility in grams per 100 g H2O is now the independent variable and the temperature is the dependent variable? Is the quadratic model still the best model for the data?
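The two numerical claims in the conclusion for the original variables (the ranking of the three models by sum of squared residuals, and the prediction at 50 °C) can be re-checked with a short Python sketch. It uses the rounded model equations quoted in the report, so small rounding differences from the reported sums are expected. The dictionary and variable names are illustrative choices for this sketch only.

```python
# Data from Table 1.
temperature = [25.5, 30.1, 42, 47.2, 53.2, 64, 70, 76.4]  # degrees C
solubility = [40, 47, 65, 79, 93, 120, 139, 160]          # g per 100 g H2O

# The three models as quoted in the report (rounded coefficients).
models = {
    "linear": lambda t: 2.34709 * t - 26.9439,
    "quadratic": lambda t: 0.02088 * t ** 2 + 0.227532 * t + 20.6769,
    "exponential": lambda t: 20.8028 * 1.02763 ** t,
}


def sum_squared_residuals(model, xs, ys):
    """Sum of (observed - predicted)^2 over all data points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


ssr = {name: sum_squared_residuals(f, temperature, solubility)
       for name, f in models.items()}
best = min(ssr, key=ssr.get)  # model with the least sum of squared residuals

# Prediction at 50 degrees C with the quadratic model.
prediction = models["quadratic"](50)

# 50 lies inside the observed temperature range, so this is an interpolation.
is_interpolation = min(temperature) <= 50 <= max(temperature)

for name, value in ssr.items():
    print(f"{name}: {value:.3f}")
print("best:", best)
print("prediction at 50 C:", round(prediction, 4))
print("interpolation:", is_interpolation)
```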


Figure 11. Linear, Exponential, and Quadratic Model When Variables are Switched (panels: Linear, Quadratic, Exponential)

Figure 11 shows how the three models fit the data when the variables are switched. When the models are compared, the quadratic model appears to have a smaller sum of squared residuals than the others, and more of its observed values lie close to where the residual is zero. Based on the residuals, the model that best fits the data when the variables are switched is still the quadratic model.
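One way to check the switched-variable comparison numerically is to refit all three model types with solubility as x and temperature as y, then compare the sums of squared residuals. The sketch below is an assumption-laden illustration, not the report's procedure: it fits polynomials by solving the least-squares normal equations, and it fits the exponential by a straight-line fit to ln(y), which is a common calculator convention but is not confirmed by the report. All function names are invented for this sketch.

```python
from math import exp, log

# Data from Table 1.
temperature = [25.5, 30.1, 42, 47.2, 53.2, 64, 70, 76.4]  # degrees C
solubility = [40, 47, 65, 79, 93, 120, 139, 160]          # g per 100 g H2O


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...], constant term first."""
    m = degree + 1
    # Augmented normal equations: sum(x^(i+j)) on the left, sum(y * x^i) on the right.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m + 1):
                a[row][k] -= factor * a[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (a[i][m] - tail) / a[i][i]
    return coeffs


def polynomial(coeffs):
    """Turn a coefficient list into a callable model."""
    return lambda x: sum(c * x ** i for i, c in enumerate(coeffs))


def fit_exponential(xs, ys):
    """Fit y = a * b**x via a line fit to ln(y); assumes every y is positive."""
    ln_a, ln_b = fit_polynomial(xs, [log(y) for y in ys], 1)
    a, b = exp(ln_a), exp(ln_b)
    return lambda x: a * b ** x


def sum_squared_residuals(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Switched roles: solubility is now the independent variable.
xs, ys = solubility, temperature
switched_models = {
    "linear": polynomial(fit_polynomial(xs, ys, 1)),
    "quadratic": polynomial(fit_polynomial(xs, ys, 2)),
    "exponential": fit_exponential(xs, ys),
}
switched_ssr = {name: sum_squared_residuals(f, xs, ys)
                for name, f in switched_models.items()}
for name, value in sorted(switched_ssr.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.3f}")
```

The models are printed from smallest to largest sum of squared residuals, so the first line names the best fit under this criterion. As a sanity check on the fitting routine, calling `fit_polynomial(temperature, solubility, 2)` with the original orientation should give coefficients close to those of the quadratic model quoted earlier.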

Scenario 2: Wallops Balloon Experience

Mrs. Duddles, Mrs. Dewey, and Mr. McMillan traveled to Palestine, Texas, to take part in the Wallops Balloon Experience for Educators this past summer. In learning about high-altitude balloon science, they built a scientific payload that measured pressure, temperature, and humidity. Each individual payload was started, then combined with the remaining payloads and attached to the balloon. The string of payloads was then walked out to the launch area, where the balloon was filled to a predetermined amount with compressed helium and then launched. The balloon rose rapidly and reached a maximum height of 88,000 ft, where it burst prematurely and plummeted to earth. It was deduced that the attached parachute did not deploy because it was entangled with the balloon. Half of the payloads, including the camera that recorded the entire process, broke off and are theorized to have landed in an area cordoned off by a state penitentiary. After a period of time, the remaining payloads were recovered from a height of 40 ft in a tree on private property.

Figure 12. Pressure of the Atmosphere as Time Increases After Balloon Launch

Figure 12 shows the pressure of the atmosphere in millibars as time increases after the balloon's launch. As the graph indicates, there was no big change in pressure until about the 235 mark on the graph (12.4 minutes after the balloon's launch, according to the data), after which the pressure began decreasing at a constant rate. Around the 487 mark (13.43 minutes), the pressure of the atmosphere starts to increase again at a constant rate until about the 603 mark, which is 14.12 minutes.


Figure 13. Linear Regression for the Ascent of the Balloon Data

Figure 13 takes the portion of the data that represents the ascent of the balloon, where the graph decreases at a constant rate. When a linear regression is applied, the residual plot shows that this model does not fit the data well: the residuals show a pattern, implying that there may be a better model, and they sometimes stray far from zero.

Figure 14. Exponential Regression for the Ascent of the Balloon Data

Figure 14 shows the exponential regression for the portion of the data representing the ascent of the balloon. According to the residual plot, this is not a very good model: the residuals wander to over 300 units and do not form an overall horizontal band around zero.


Figure 15. Quadratic Regression for Ascent of Balloon Data

Figure 15 shows the quadratic regression for the portion of the data that represents the ascent of the balloon. Looking at the residual plot, the quadratic may not be the best model, since the residuals do not form a horizontal band and, near the end of the data, start to move farther from zero.

Figure 16. Inverse Variation Model Compared to Ascent of Balloon Data

Figure 16 shows the inverse variation model for the ascent of the balloon data, along with its residual plot. The parent function of inverse variation is y = k/x, where k is the constant. To find the constant, a known point on the model (the center of mass) was found by taking the mean of the x values and the mean of the y values. Those means were substituted for x and y in the parent function to solve for k. The constant k was found to be about 101703.578947 for this set of data, so the inverse variation model is y = 101703.578947/x. According to the residual plot, this is not a good model for the data, since the residuals do not form a roughly horizontal band around zero.
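The center-of-mass method for finding k can be sketched as follows. The balloon data are not listed in the report, so the sample points below are placeholders invented purely to show the arithmetic. The function names are likewise hypothetical.

```python
def inverse_variation_constant(xs, ys):
    """Choose k so that y = k / x passes through (mean of x, mean of y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Substituting the means into y = k / x gives k = mean_x * mean_y.
    return mean_x * mean_y


def inverse_variation_model(k):
    """Return the model y = k / x as a callable."""
    return lambda x: k / x


# Placeholder points for illustration only; these are NOT the balloon data.
sample_x = [1.0, 2.0, 3.0, 4.0]
sample_y = [12.0, 6.0, 4.0, 3.0]

k = inverse_variation_constant(sample_x, sample_y)
model = inverse_variation_model(k)
residuals = [y - model(x) for x, y in zip(sample_x, sample_y)]

print("k =", k)
print("residuals:", [round(r, 3) for r in residuals])
```

The placeholder points lie exactly on y = 12/x, yet this method returns a different k, because the center of mass of points on an inverse-variation curve generally does not lie on that curve. The method is therefore a quick estimate rather than a least-squares fit, which may contribute to residuals that drift away from zero.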

Figure 17. Inverse Squared Variation Model for Ascent of Balloon Data

Figure 17 shows the inverse squared variation model for the ascent of the balloon data with its corresponding residual plot. The residual plot indicates that this is not a very good model for the data, since the residuals wander far from zero and do not form a horizontal band around it.

After analyzing all the mathematical models fit to the data, it can be concluded that none of them is appropriate for the portion of the data that represents the ascent of the balloon. Neither the linear, exponential, quadratic, inverse variation, nor inverse squared variation model was shown to be a good model, because of how far the residuals were from zero.

According to the graph in Figure 12, pressure decreased as time progressed and then began to increase at a constant rate. The pressure likely decreased because the balloon's height increased as time progressed, and pressure decreases with altitude. The balloon burst prematurely at 88,000 feet and began to drop to earth, which is why the graph began to increase: as the balloon neared the earth, the pressure became greater.

Bibliography

Vertical Structure of the Atmosphere. www.okfirst.mesonet.org. OK-First. 3 June 2009. Web. 8 October 2012. <http://okfirst.mesonet.org/train/meteorology/VertStructure2.html>

Air Pressure and Altitude above Sea Level. www.engineeringtoolbox.com. EngineeringToolbox. 17 September 2012. Web. 8 October 2012. <http://www.engineeringtoolbox.com/air-altitude-pressure-d_462.html>
