Introduction to Linear Functions


This page is in the form of a worksheet.  It leads you through the process of creating a linear model for the effect of weights hung from a vertically suspended door spring.  It constitutes both a set of notes and a homework assignment.

If you do not have a spring you may do the first part, Orientation by Prediction, then see Class Notes #07 for position vs. load data.


Understanding the characteristics of linear functions by observing and analyzing a door spring hung from the ceiling as it stretches with added weight.
Predicting, by sketching a graph, the way the spring stretches with added weight.
Observing the behavior of the spring.
Creating a table and graph.
Deciding on a linear vs. a quadratic model.
Selecting data points for the model.
Determining slope and y-intercept; using linear regression.
Determining y from x and x from y; interpretation of results; interpretation of slope.

Activity 1: Vertical position of a hanging door spring vs. weight on the spring

Orientation by Predicting Results

Suppose that we suspend a standard screen door spring by one end so that it hangs vertically with one end in front of us. If we gradually add weight to the lower end of the spring, the spring will at some point begin stretching, and will continue stretching as long as weight is added, until it reaches the floor.

We can measure the height of the lower end of the spring from the floor each time we add weight. In this way we can obtain a table of height vs. weight. This table can be graphed.

Suppose that the lower end of the spring starts off at a height of 1.5 meters (about 5 feet) from the floor. Suppose furthermore that the spring first begins stretching when 2 kilograms of mass is hung from the end (a kilogram is very close to the mass of a 1-liter diet soft drink; the drink would have a mass of precisely 1 kg if it was filled with water), and that a total of 25 kilograms are required to stretch the lower end all the way to the floor. Sketch a graph representing the predicted height of the low end vs. the weight on the spring.

Let the shape of the graph reflect your opinion regarding the way the spring will stretch out. Do you think it will stretch the same amount for every added kilogram? Or will it stretch more for an added kilogram at the beginning, when there isn't much weight on the spring, or at the end, when the spring is already pretty stretched out? State your opinion clearly and make your graph consistent with your opinion.
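As one possible prediction, the sketch below encodes the opinion that the spring stretches the same amount for every added kilogram once it starts stretching. The function name predicted_height and the piecewise-linear shape are assumptions made for illustration; the values 1.5 m, 2 kg and 25 kg come from the scenario above. If your opinion is different, your graph should curve between the same two endpoints instead.

```python
def predicted_height(mass_kg, start_height=1.5, first_stretch=2.0, floor_mass=25.0):
    """Height (m) of the spring's lower end, assuming equal stretch per kilogram."""
    if mass_kg <= first_stretch:
        return start_height          # spring has not begun to stretch
    if mass_kg >= floor_mass:
        return 0.0                   # lower end rests on the floor
    # Straight line from (first_stretch, start_height) to (floor_mass, 0).
    slope = -start_height / (floor_mass - first_stretch)
    return start_height + slope * (mass_kg - first_stretch)

for mass in (0, 2, 13.5, 25):
    print(mass, "kg ->", round(predicted_height(mass), 3), "m")
```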

 

Making observations

Use a door spring and a measuring tape or measuring stick to perform this experiment. You will also need a bucket or a couple of 1-gallon milk jugs, a supply of water, a measuring cup or an average-sized coffee mug, and some string strong enough to support the bucket or jugs when they are full of water.

Suspend the spring by one end from a hook, a tree branch, or whatever is handy. Add water to the bucket or milk jugs a cupful at a time, counting the cups, and measure the height of the low end of the spring after every cup. Note the number of cups at which the spring first begins stretching, and from that point on record the height with every added cup.

You will end up with a set of end heights and corresponding numbers of cups of added water.

Organizing Data

Organize your data into a table of end heights vs. number of cups of water supported by the spring. Then make a graph of this data set. Use an appropriate scale for your horizontal and vertical axes.
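If you keep your data on a computer, a table like the one described above could be laid out as in the following minimal sketch. The numbers are made-up placeholder values, not measurements, and the names observations and format_table are hypothetical; substitute your own cups and heights.

```python
# Placeholder (cups, height in meters) pairs; replace with your own measurements.
observations = [(5, 1.46), (6, 1.41), (7, 1.37), (8, 1.32)]

def format_table(rows):
    """Return the data as lines of text with aligned columns."""
    lines = [f"{'cups':>6} {'height (m)':>12}"]
    for cups, height in rows:
        lines.append(f"{cups:>6} {height:>12.2f}")
    return lines

table = format_table(observations)
print("\n".join(table))
```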

Postulating a model

Does the graph seem to follow a straight line, or is there a tendency to curve, possibly as part of a parabola?

If the graph seems to follow a straight line, or if there is no obvious tendency to curve, we generally begin with a straight-line or linear model.

We recall that the general form of a linear function is y = mx + b. There are only two parameters, m and b, to be determined.

We postulate a linear model for this data.

Selecting data points

There are two primary techniques for finding the parameters m and b. We can choose two points and solve a pair of simultaneous linear equations, or we can use a technique called linear regression which uses all the points in our data set, and which is usually applied using a computer or calculator.

We first use the familiar method of selecting two points from a curve that seems to fit the data. In the case of a linear model, the curve is actually a straight line.

To approximate the best straight line for the given data, you will use a piece of string or a thread, or possibly a thin straight stick or metal rod. Whatever you use, be sure that it is thin enough that it doesn't obscure many data points when placed over the graph.

Stretch out the string, thread or rod over your graph of the data points, trying to come as close as possible, on the average, to those points. Don't even look at whether the line you form will go through any actual data points. Mark the positions of the ends. Then take a straightedge and draw a straight line through those endpoints. Note any systematic deviation of the data from the straight line (e.g., notice if the points start out above the line, then mostly dip below, then come back above, as would be the case if the actual data were in fact quadratic).

The line you just drew will be your linear model. Select two points on the line, lying near the left-hand and right-hand extremes of your data set (you don't want the points close together because such a choice would magnify the effect of small errors in your estimate of the coordinates, and result in a less accurate model). Estimate as accurately as you can the x and y coordinates of these points.
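The effect of point spacing on accuracy can be illustrated with a rough worst-case estimate. The sketch below assumes the x coordinates are read exactly and each y coordinate may be off by the same amount; the function name slope_uncertainty and the 0.01 reading error are hypothetical.

```python
def slope_uncertainty(run, coordinate_error):
    """Worst-case error in slope when each y estimate may be off by coordinate_error."""
    # The rise is a difference of two y values, so its error can reach twice the
    # error in one coordinate; dividing by the run gives the slope error.
    return 2 * coordinate_error / run

for run in (1, 5, 20):
    print("run =", run, "-> slope may be off by", slope_uncertainty(run, 0.01))
```

The same reading error produces a slope error twenty times larger for a run of 1 than for a run of 20, which is why widely separated points are preferred.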

Substitute these coordinates into the general linear form y = mx + b. You will obtain two simultaneous linear equations whose unknowns are the parameters m and b. Solve these equations for m and b.

Substitute these parameters into the form y = mx + b and obtain your linear model.
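The two-point procedure can be sketched in a few lines. The function name line_through and the two sample points are hypothetical; the points stand in for coordinates you would read from your own hand-drawn line.

```python
def line_through(p1, p2):
    """Solve y1 = m*x1 + b and y2 = m*x2 + b for m and b."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points need different x coordinates")
    m = (y2 - y1) / (x2 - x1)   # subtract the equations to eliminate b
    b = y1 - m * x1             # substitute m back into the first equation
    return m, b

# Hypothetical points read from a hand-drawn line (cups, meters).
m, b = line_through((5.0, 1.45), (25.0, 0.55))
print(f"y = {m:.3f} x + {b:.3f}")
```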

Alternative ways of finding a linear model

The process you just used is simple enough, especially when compared with the more computation-intensive process of fitting three points to a quadratic. There are in fact two other ways to obtain a model: one is even faster than the preceding method and about as accurate; the other gives the best-fitting line in a well-defined statistical sense but is usually implemented using a computer or calculator.

First comes the quickest and easiest method for obtaining a reasonable linear model. All this method requires is a good estimate of the y-intercept and the slope of the graph.

If your straight line didn't reach the y axis, extend it so it does. Estimate the y coordinate at the y axis. This is the x = 0 point; at x = 0 we clearly have y = m(0) + b = b. So you just found b.

Now estimate the slope m. Choose a point on the line near the left-hand side of your graph. It is usually convenient, though not necessary, to choose a point on one of the vertical gridlines so you know the x coordinate accurately. Estimate the y coordinate as accurately as you can.

Then choose a convenient 'run' that spans most of the graph. Since you will be dividing by the run, it is helpful to choose a number you can easily divide by. Numbers like 1, 10, 100, etc. are easiest, but might not fit the graph right; numbers like 2 or 5, 20 or 50, etc., are next-best if you are going to do the calculation mentally. In any case a run that takes you to another vertical gridline is usually the best choice. Move the appropriate distance to the right, then up or down to the graph and estimate the y coordinate.

Use the two y coordinates to obtain an approximate value of the rise. Use the rise and run to get the slope.

Plug your slope m and y intercept b into the form y = mx + b. How does this model compare with the one you obtained previously from two points?
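As a rough illustration of the intercept-and-slope method, the sketch below builds the model from a y-intercept estimate and a rise measured over a chosen run. The function name model_from_graph and all the readings are hypothetical.

```python
def model_from_graph(b, y_left, y_right, run):
    """Build (m, b) from a y-intercept estimate and a rise over a chosen run."""
    rise = y_right - y_left     # negative when the line slopes downward
    m = rise / run
    return m, b

# Hypothetical readings: intercept 1.68 m; y = 1.45 at x = 5 and y = 0.55 at x = 25.
m, b = model_from_graph(b=1.68, y_left=1.45, y_right=0.55, run=20)
print(f"y = {m:.3f} x + {b:.2f}")
```

Choosing a run of 20 here makes the division easy to do mentally, as suggested above.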

The other way to obtain a linear model is to use a computer or calculator. You will place your data in the machine, then follow the appropriate steps to obtain your model. The actual steps vary according to the calculator or computer program. In this class the standard method is to use the computer program DERIVE.

If you haven't yet completed the introduction to DERIVE, defer the DERIVE steps below (from authoring your data set through plotting the fitted equation) until you have.

Begin by authoring your data set. Recall that the format is as illustrated by the example [ [1, 3], [2, 2.5], [3, 2.3], [4, 1.9] ]. Each data point is enclosed in square brackets [ ], data points are separated by commas, and the entire set of data points is enclosed in its own set of square brackets.

Plot your data set (choose the Plot window and give the Plot command). Set the range (select Range in the Plot window) so that your plot window is appropriate to your data.

Return to the Algebra window and author the Fit command Fit([x,mx+b],#&&), where && is the line number of your data set (alternatively highlight the data set and use F3 to insert it at the appropriate point).

Simplify the Fit command and write down the resulting equation.

Plot this equation to see how well it fits the data. Notice if there is any compelling pattern to the way the points deviate from the line.

The equation you obtain from the computer or calculator is called the Linear Regression Equation, or the Least-Squares line. It is the line for which the total of the squared vertical distances of the points from the line is smallest. This line is in a well-defined statistical sense the best possible linear model for the data.
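For readers without DERIVE, the least-squares line can also be computed directly. The following is a minimal sketch using the standard closed-form formulas for slope and intercept; it is not a description of how DERIVE's Fit command works internally, and the function name least_squares_line is hypothetical. The data are the sample set from the DERIVE example above.

```python
def least_squares_line(points):
    """Return (m, b) minimizing the sum of squared vertical distances."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    m = sxy / sxx
    b = mean_y - m * mean_x     # the least-squares line passes through the mean point
    return m, b

data = [[1, 3], [2, 2.5], [3, 2.3], [4, 1.9]]   # sample set from the DERIVE example
m, b = least_squares_line(data)
print(f"y = {m:.2f} x + {b:.2f}")
```

For this sample set the slope works out to -0.35 and the intercept to 3.3.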

When you have obtained the linear regression model for your data, compare it with the two models you obtained by hand above.

Quantifying the quality of the fit

For each x value (recall that x is the number of cups of added water) find the predicted y value (the height of the spring end), and find the deviation of the observed height from this prediction. Average the magnitudes (absolute values) of these deviations, so that deviations above and below the line do not cancel. This is a good measure of how closely the line fits the data.

From a statistical point of view an even better measure of the fit is obtained by first averaging the squares of the deviations then taking the square root. This is called the "root-mean-square" deviation. Do so and compare with the previous average.
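Both measures can be computed as in the sketch below. It treats each deviation as a magnitude when averaging, so that positive and negative deviations do not cancel; that reading of "average" is an assumption. The function name fit_quality is hypothetical. The data are the sample set from the DERIVE example, and the line y = -0.35x + 3.3 is the least-squares line for that set.

```python
import math

def fit_quality(points, m, b):
    """Return (mean absolute deviation, root-mean-square deviation) from y = m*x + b."""
    deviations = [y - (m * x + b) for x, y in points]
    mean_abs = sum(abs(d) for d in deviations) / len(deviations)
    rms = math.sqrt(sum(d * d for d in deviations) / len(deviations))
    return mean_abs, rms

data = [(1, 3), (2, 2.5), (3, 2.3), (4, 1.9)]   # sample set from the DERIVE example
mean_abs, rms = fit_quality(data, m=-0.35, b=3.3)
print(round(mean_abs, 4), round(rms, 4))
```

The root-mean-square value is never smaller than the mean absolute deviation, because squaring gives extra weight to the largest deviations.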

Note that the quality of the fit is not related only to an average of deviations, in the sense of either of the above calculations. If there is a systematic pattern to the deviations above and below the line, we might still have a small average deviation, despite the fact that the pattern tells us that we probably don't have the right model.

Pose and Answer Questions

We now have a model y = f(x) = mx + b for y = f(x) = height of the spring's end vs. x = number of cups of water.

We can use our model to predict the spring's end height f(x) for a given number x of cups of water, or we can use the spring's end height f(x) to predict the number x of cups of water in the container(s).
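Both directions of prediction can be sketched as follows. The parameters m = -0.045 and b = 1.675 are hypothetical, not a measured model, and the function names are made up for illustration; use the m and b from your own model.

```python
def height_for_cups(x, m, b):
    """Predicted end height f(x) = m*x + b for x cups of water."""
    return m * x + b

def cups_for_height(y, m, b):
    """Invert y = m*x + b to get the number of cups that gives height y."""
    if m == 0:
        raise ValueError("a horizontal line cannot be solved for x")
    return (y - b) / m

m, b = -0.045, 1.675    # hypothetical parameters, not a measured model
x = cups_for_height(1.0, m, b)
print(x, height_for_cups(x, m, b))
```

Solving for x is just the algebra of y = mx + b run backward: subtract b, then divide by m.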

For practice:

Determine the predicted height of the spring's end for 12.4 cups of water.

Determine the predicted number of cups of water necessary to bring the spring end to the floor or the ground.

Determine the number of cups of water necessary to change the spring's end height from 1 meter to 0.5 meter (or, if you had to measure heights in inches and feet, from 1 yard to 0.5 yard). Then use this information to determine the average rate at which the spring's end height changes, in meters (or yards) per cup of water. What is the significance of this quantity for the graph of the model?

Determine the change in the spring's end height that would result from adding 8 cups of water. Would your answer change depending on how many cups of water there were before the 8 cups were added?

Precisely what is the meaning of the slope of your graph? What does it mean in terms of the spring and water that the slope is unchanging?