Linear Modeling Project
The project requires collecting a table of data in which the x-value represents time and the y-value represents some quantity that changes roughly linearly. You will first find a linear model by choosing two appropriate points, then a second model by manually adjusting sliders, and finally a third model using linear regression with the Desmos graphing calculator. You will then compare the three results in your final write-up.
Please see the Rubric that will be used to grade the assignment.
STEP 1: Find a real-life table of values x and y whose relationship is roughly linear.
This means that as the x-values increase, the y-values increase or decrease at a roughly steady rate. Once you graph these points in a scatter plot, you will see whether the pattern is roughly linear. Below are several options to choose from, but I urge you to think of a project idea that interests you; you do not need to use any of the options below, and you may modify them as you like.
Possible websites to look for data:
Now develop a question that pertains to your research. Here are some questions to choose from:
- Has crime increased in a particular city over time?
- Does median household income predict the quality of education in Los Angeles?
- Does race play a role in determining median household income?
- Does race play a role in determining the academic performance index in school?
- Does median household income determine the damaging concentration of particulate matter in the air?
- Is there a relationship between race and the quality of air (Air Quality Index) in Los Angeles?
- Is there a correlation between per capita crime and the quality of air?
- Is the infant mortality rate higher in areas with polluted air?
Example Project Idea 1: Time and Crime Rate in a particular City
Here x represents time (usually the number of years after a given starting year), and y represents a crime statistic for that year. I recommend using City-data.com to find this data easily. For example, for the city of Burbank, CA, you could look at the crime statistics and choose a table that looks roughly linear.
Example Project Idea 2: Race and Academic Performance Index in a particular City
In this case the x-values represent the proportion of a particular race living in a city, and the y-values represent the Academic Performance Index (API) of a school in the corresponding city. City-data.com generally lists the racial makeup of towns in the Los Angeles area:
For each city you would need to look up a corresponding high school or middle school (your choice) using, e.g., Google Maps, then look up the Academic Performance Index score of that school on the California Department of Education website.
You could, for example, collect a table of values where the first column represents the percentage of Hispanic or Latino people living in a particular city, and the second column represents the Weighted 3-Year Average API score of a school in that city.
You would do this for several cities; I recommend collecting data for at least ten. Collect the data in an MS Word table, and label the columns accordingly. Below you will find an example of a nearly completed Project Idea 2. Keep in mind that the example is incomplete, and that you would need to produce one that is unique to your project.
STEP 2: Use desmos.com to plot a scatter-plot of the points you have collected. Be sure to adjust the window so that the scatter-plot is visible. Print out a copy of the scatter-plot to include it as a diagram in your project.
Enter the data from your Microsoft Word table into a Desmos table. Note below that the percentage signs have been removed. Also, notice that x1 represents percentages, and y1 represents Academic Performance Index scores.
As you create the table, Desmos automatically creates a corresponding scatter plot. If you do not see the plot, you need to adjust the window.
STEP 3: Obtaining the Two-Point Model: Pick two points and use the point-slope formula $y-y_1=m(x-x_1)$ to find the equation of the line that passes through these two points. Graph this line along with the scatter plot in Desmos, and print a copy of this picture as well. This is the Two-Point Model.
Suppose we pick the points $(1.8, 788)$ and $(42.9, 582)$. Using the slope formula, we find the slope as follows: $m=\frac{582-788}{42.9-1.8}\approx -5.01217$
Substituting this slope and the point $(1.8, 788)$ into the point-slope formula, we obtain $y-788=-5.01217(x-1.8)$, which we enter into Desmos to obtain our first model.
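If you would like to double-check the arithmetic of the Two-Point Model outside of Desmos, the short Python sketch below recomputes the slope from the two example points and also rewrites the line in slope-intercept form. The function name `two_point_model` is made up for this illustration and is not part of the assignment.

```python
def two_point_model(p1, p2):
    """Return (slope, y_intercept) of the line through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("The two points must have different x-values.")
    m = (y2 - y1) / (x2 - x1)   # slope = rise / run
    b = y1 - m * x1             # from y - y1 = m(x - x1), solved for y at x = 0
    return m, b


# The two example points chosen in Step 3.
m, b = two_point_model((1.8, 788), (42.9, 582))
print(f"slope m = {m:.5f}")        # slope m = -5.01217
print(f"y-intercept b = {b:.2f}")  # y-intercept b = 797.02
```

The slope-intercept form $y=-5.01217x+797.02$ describes the same line as the point-slope equation, so either version can be entered into Desmos.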
Include the graph in your MS Word project write-up.
STEP 4: Obtaining the Manual Model: Type y=mx+b, and use the sliders m and b to find the line that best fits the points.
Be sure to turn off the Two-Point Model before exploring the Manual Model. You also need to adjust the range of the sliders to ensure that the line passes through the scatter plot.
What are the slope and y-intercept of the Manual Model? Record these in your report. Also, be sure you can interpret the slope and the y-intercept. In the graph above, the y-intercept is the point $(0, 776)$, which corresponds to the theoretical possibility that in a town with a 0% African American population, the average API score would be 776. The slope of $-4.8=\frac{-4.8 \text{ points}}{1\%}$ (remember that slope is $\frac{\text{rise}}{\text{run}}$) represents the trend that with every one-percentage-point increase in the proportion of African Americans in a city, there is a corresponding decrease of 4.8 points in the average API score.
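The Manual Model is fitted by eye, but if you want a number to help you decide between two slider settings, one common choice is the sum of squared residuals (smaller means a closer fit). The sketch below is optional and not required by the assignment. It uses only the three (x, y) pairs that are quoted in this write-up, under the assumption that they all come from the same example table; the second slider setting is a hypothetical alternative for comparison. Replace the lists with your own data.

```python
def sum_squared_residuals(xs, ys, m, b):
    """Sum of squared vertical distances between the points and y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Only the three (x, y) pairs quoted in this write-up (assumed to belong to the
# same example table). The full table is not reproduced here; use your own data.
xs = [1.8, 13, 42.9]
ys = [788, 645, 582]

# Slider setting from the Manual Model example: m = -4.8, b = 776.
manual = sum_squared_residuals(xs, ys, m=-4.8, b=776)

# A hypothetical alternative slider setting, for comparison only.
alternative = sum_squared_residuals(xs, ys, m=-4.0, b=776)

print(f"m = -4.8, b = 776: {manual:.2f}")
print(f"m = -4.0, b = 776: {alternative:.2f}")
```

Whichever setting gives the smaller value lies closer to the points overall, which is the same criterion the regression line in Step 5 optimizes.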
STEP 5: Find the Regression Line Model for the plot by typing y1~a1x1+a0. Graph this line, and record its slope.
Note that when graphing the regression line, $a_1$ represents the slope, while $a_0$ represents the y-intercept of the function. This is the best-fitting line for the data in the least-squares sense.
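To see what a linear regression computes, here is a minimal Python sketch of the standard least-squares formulas for the slope $a_1$, the y-intercept $a_0$, and the correlation coefficient r. It is an illustration only, not a description of how Desmos works internally, and the function name `linear_regression` is made up for this example. The sample data are just the three (x, y) pairs quoted in this write-up, so the result will not match the regression line for a full table of ten or more cities.

```python
from math import sqrt


def linear_regression(xs, ys):
    """Return (a1, a0, r): least-squares slope, y-intercept, and correlation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("Need two lists of equal length with at least 2 points.")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x-values and y-values must each vary.")
    a1 = sxy / sxx              # slope
    a0 = mean_y - a1 * mean_x   # y-intercept
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    return a1, a0, r


# Only the three pairs quoted in this write-up; substitute your own table.
xs = [1.8, 13, 42.9]
ys = [788, 645, 582]
a1, a0, r = linear_regression(xs, ys)
print(f"a1 = {a1:.3f}, a0 = {a0:.3f}, r = {r:.3f}")
```

When you enter your own data, the values of a1, a0, and r should agree with what Desmos reports, up to rounding.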
Desmos will produce the Regression Line:
STEP 6: Compare and contrast the three different models that you found. How do the slopes and y-intercepts differ? Interpret the slope of the Regression Line model.
At this point you should have three different linear equations: (1) the Two-Point Model, (2) the Manual Model, and (3) the Regression Line Model. Which of these has the largest slope? Provide an interpretation of the Regression Line Model just as you did for the Manual Model above.
STEP 7: In the write-up of your project discuss the following questions:
- Are there any points with large residuals (points which are far away from the models)? Which points are these? What do they represent? Why do you think these specific cases involve large residuals?
- How large is the correlation coefficient r? Do you think there is a causal relationship between the explanatory variable x, and the response variable y?
- Pick an x-value and use the Regression Line Model to predict the corresponding y-value. Write a sentence describing the meaning of this pair of x and y values. For example, using the Regression Line Model example above, and approximating the slope and y-intercept to the nearest tenth, we get $y=-5.5x+778$. We pick the x-value $x=13\%$. According to the theoretical Regression Line Model, $y=-5.5(13)+778=706.5$, which means that theoretically a school in a city with that proportion should have an API score of 706.5. In reality, the school that corresponds to that proportion is San Gorgonio High School, whose API score (645) is lower than the model predicts.
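The prediction and the residual for the example in the last bullet can be checked with a few lines of Python. This sketch uses the rounded model $y=-5.5x+778$ (note the negative slope) and the observed API score of 645; the function name `predict` is made up for this illustration.

```python
def predict(x, slope=-5.5, intercept=778):
    """Predicted y-value from the rounded Regression Line Model y = -5.5x + 778."""
    return slope * x + intercept


x = 13          # percentage chosen in the example
observed = 645  # API score of San Gorgonio High School, from the example

predicted = predict(x)
residual = observed - predicted  # negative: the actual score is below the line

print(f"predicted y = {predicted}")  # predicted y = 706.5
print(f"residual = {residual}")      # residual = -61.5
```

A residual of -61.5 means the observed score lies 61.5 points below the regression line, which is the kind of gap the first bullet asks you to discuss.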