Fitting a Straight Line to Data


As on all forms, be sure you have your data backed up in another document, and in your lab notebook.

Your course (e.g., Mth 151, Mth 173, Phy 121, Phy 232, etc.):

Enter your access code in the boxes below. 

Remember that it is crucial to enter your access code correctly.  As instructed, you need to copy the access code from another document rather than typing it.

Access Code:
Confirm Access Code:

Your Name:

First Name
Last Name

Your VCCS email address.  This is the address you were instructed in Step 1 to obtain.  If you were not able to obtain that address, indicate this below.

You may enter any message or comment you wish in the box below:

Copy this document, from this point to the end, into a word processor or text editor. 

Note that the data program is in a continual state of revision and should be downloaded with every lab.

Most students report spending between 1 and 2 hours on this lab exercise.  A few report as little as 40 minutes.  Some report significantly longer times.  If you are adept with graphs you will probably tend toward the shorter times; if the exercise takes you longer, there's a good chance you need the practice.  This topic is important, and its importance goes well beyond your physics class.

The figure below depicts a series of six data points and a straight line segment.

The line segment is part of the 'best-fit line' for the data. 

You may think of this as the line that comes as close as possible, on the average, to the data points.  This definition can be improved upon--it doesn't specify exactly how 'closeness' is defined or how 'closeness' is averaged--but it gives you the general idea, and it is sufficient for 'eyeball' estimates of the best-fit line.

Each data point lies either above or below the line, at some specific vertical distance from the line.  This vertical distance is the deviation of that point from the line.  The best-fit line is the straight line that minimizes the sum of the squared deviations--there is no other line with a smaller sum of squared deviations. 
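
The idea of minimizing the sum of squared deviations can be illustrated with a short Python sketch.  The data points below are hypothetical (they are not the points in the figure), and the function name is chosen only for this illustration.  For this made-up data the least-squares line works out to y = 0.9 x + 1.3; the sketch compares its sum of squared deviations with that of a raised line and a steeper line.

```python
def sum_squared_deviations(points, m, b):
    """Sum of squared vertical deviations of the points from y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in points)

# Hypothetical data, used only for illustration (not the points in the figure).
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]

# For this made-up data the least-squares line is y = 0.9 x + 1.3.
best = sum_squared_deviations(points, 0.9, 1.3)

# Same slope, line raised by 0.5.
raised = sum_squared_deviations(points, 0.9, 1.8)

# Same intercept, steeper slope.
steeper = sum_squared_deviations(points, 1.1, 1.3)

print(best, raised, steeper)
```

Both the raised line and the steeper line give a larger sum than the best-fit line, which is what you are judging by eye when you move the string.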

The exercise here will be to obtain eyeball estimates of the best-fit line.

Any graphing calculator or computer spreadsheet will calculate the best-fit line for a given set of data, but neither calculator nor spreadsheet should be used in this exercise.

Take a piece of string and stretch it over the line (you may do this on the screen or you may use a printout).  Then raise and lower the string, and see how the distances to the various points change.  Vary the slope of the string.

Can you see that if the line in the figure below is raised or lowered a significant amount, and/or if its slope is changed significantly, the average distance of the points from the line will change?

Answer this question below and indicate also by how much the slope of the string needs to change before the increased average distance between the string and the points is apparent.

 

 

#$&*

Give in the first line below the horizontal and vertical coordinates, in comma-delimited form and in that order, of the rightmost point on the line, as best you can estimate them.

In the second line give the same information for the leftmost point on the line.

Starting in the next line give a brief statement telling in your own words what the numbers you have entered mean.

 

 

#$&*

As you move from the leftmost point to the rightmost, by how much does your horizontal coordinate change, and by how much does the vertical coordinate change?  Indicate in the first line using comma-delimited format.  Use + or - with each result to indicate whether the change in the coordinate is positive or negative.

Starting in the next line give a brief statement telling in your own words what these numbers mean and how they were obtained.

 

 

 

 

#$&*
 

The change in horizontal coordinate is called the 'run', and the change in vertical coordinate is called the 'rise', between the two points.

The slope between the two points is the rise divided by the run:  slope = rise / run.
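
As a check on the arithmetic, the run, rise and slope can be computed as in the following minimal Python sketch.  The two points are hypothetical and are not read from the figure; the function name is an assumption made for this example.

```python
def rise_run_slope(left, right):
    """Return (rise, run, slope) moving from the left point to the right point."""
    run = right[0] - left[0]    # change in horizontal coordinate
    rise = right[1] - left[1]   # change in vertical coordinate
    return rise, run, rise / run

# Hypothetical points (x, y), used only for illustration.
left_point = (2.0, 3.0)
right_point = (10.0, 7.0)

rise, run, slope = rise_run_slope(left_point, right_point)
print(rise, run, slope)   # prints 4.0 8.0 0.5
```

Note that the changes are always taken in the same order (rightmost minus leftmost), so that the signs of rise and run are consistent.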

In the first line give the run and the rise, delimited by commas.

In the second line give the slope.

Starting in the next line give a brief statement telling in your own words what these numbers mean and how they were obtained.

 

 

 

 

#$&*
 

The vertical axis of a graph occurs at horizontal coordinate 0.  If a graph doesn't show an axis through horizontal coordinate zero then it doesn't show the vertical axis.  Not all graphs will show the vertical axis.  For example, if the data are clustered closely together at a great distance from the vertical axis, a graph would have to be extended through a lot of 'blank space' to get to that axis, and the detailed behavior of the data would be obscured.

If the straight line is extended until it intercepts the vertical axis then the point where this occurs is called the vertical intercept of the graph. 

Answer these questions in the space below.  Answer each in a concise yet complete sentence that includes the reasons for your answer.

 

 

 

 

#$&*
 

The equation of a straight line in the x-y plane can be expressed in the common form y = m x + b, where m is the slope of the line and b the y-intercept.

Substituting your estimates of the slope and vertical intercept for m and b in the form y = m x + b, you obtain an estimate of the equation of the best-fit line.

Give your estimated equation for this line below.  Add a brief statement about the meaning of your answer and how it is related to previous answers.

 

 

 

 

#$&*
 

In many cases it is not appropriate to include x = 0 in the scale of the graph, in which case it is not possible to extend the line to the actual y axis in order to estimate the y intercept.  If this is the case we can still estimate the equation, based on the coordinates of our two points.

You have already estimated the slope of the line.  Leaving x, y and b in symbolic form, substitute m into the form y = m x + b.  What equation do you get?

 

 

 

 

#$&*
 

Now substitute the x and y coordinates of either of your two points.  It doesn't matter which point you choose.  Choose a point, substitute its x coordinate for x and its y coordinate for y.  What is your equation?  What symbol(s) remain in the equation?

 

 

 

 

#$&*
 

You can solve this equation for the one remaining symbol b.  Do so, and indicate your solution in the first line below.  Starting in the second line show and explain the steps in your solution, in detail.
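
Solving y = m x + b for b gives b = y - m x.  The minimal Python sketch below applies this to a hypothetical slope and two hypothetical points on the same line (none of these values come from the figure), and shows that either point leads to the same b.

```python
def intercept_from_point(m, point):
    """Solve y = m*x + b for b, given the slope m and one point (x, y) on the line."""
    x, y = point
    return y - m * x

# Hypothetical values, used only for illustration.
m = 0.5
point_1 = (2.0, 3.0)
point_2 = (10.0, 7.0)

b_1 = intercept_from_point(m, point_1)
b_2 = intercept_from_point(m, point_2)
print(b_1, b_2)   # prints 2.0 2.0
```

The two results agree here because the slope was computed from these same two points.  With your own estimates, small rounding differences between the two values of b are to be expected.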

 

 

 

 

#$&*
 

How does the value of b compare with the y-intercept you estimated earlier?  How far, in actual distance on the y axis, is this value of b from the vertical intercept you estimated earlier?  Why can we not expect the two values you obtained to match exactly?

Answer these questions in complete statements that indicate your answers as well as the meaning and basis of your answers.

 

 

 

 

#$&*
 

The actual best-fit equation for the above data set is y = -2.2032x + 18.982.

By what percent of the actual value -2.2032 of m in the above equation did your estimate differ from that value?

By what percent of the actual value 18.982 of b in the above equation did your estimate differ from that value?

Give your two results, delimited by commas, in the first line below.

Starting in the next line give a brief statement telling in your own words what these numbers mean and how they were obtained.
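
The percent difference is the size of the difference between estimate and actual value, divided by the size of the actual value, times 100.  A minimal Python sketch follows.  The actual values are those given above; the estimates -2.0 and 18.0 are hypothetical, and the function name is an assumption made for this example.

```python
def percent_difference(estimate, actual):
    """Difference between estimate and actual, as a percent of the actual value."""
    return abs(estimate - actual) / abs(actual) * 100

# Actual best-fit values quoted in this exercise.
m_actual = -2.2032
b_actual = 18.982

# Hypothetical estimates, used only for illustration.
m_estimate = -2.0
b_estimate = 18.0

m_percent = percent_difference(m_estimate, m_actual)
b_percent = percent_difference(b_estimate, b_actual)
print(round(m_percent, 1), round(b_percent, 1))
```

With these hypothetical estimates the slope differs from the actual value by roughly 9 percent and the intercept by roughly 5 percent.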

 

 

 

 

#$&*
 

Now stretch your string out to form the line you think best fits the data in the figure below. 

Estimate and record the coordinates of two widely-separated points on your line, but do not include the y-intercept among these two points.

Estimate and record the y-intercept of your line, if the graph does indeed pass through the true y axis.  If this is not the case, indicate it.

Starting in the next line give a brief statement telling in your own words what these numbers mean and how they were obtained.

 

 

 

 

#$&*
 

Give the rise, run and slope between the two points on your line.  Give these results in three lines, one number in each line.  Starting in the fourth line explain how you obtained the rise, run and slope.

 

 

 

 

#$&*
 

Give in the first line the equation of your line, based on the slope and the y intercept.  Starting in the second line give a complete explanation of how you obtained your equation.

 

 

 

 

#$&*
 

Give the equation of your line, based on your two points (as before, substitute the slope into the y = m x + b form, substitute the x and y coordinates of one of the two points, and solve for b).  Give the equation in the first line, then starting in the second line explain how you obtained your result for b.

 

 

 

 

#$&*
 

Repeat this exercise for the graph below:

 

 

 

 

 

#$&*
 

Give the rise, run and slope between the two points on your line.  Give these results in three lines, one number in each line.

 

 

 

#$&*
 

Give the equation of your line, based on the slope and the y intercept.

 

 

 

 

 

#$&*
 

Give the equation of your line, based on your two points (as before, substitute the slope into the y = m x + b form, substitute the x and y coordinates of one of the two points, and solve for b).  Give the equation in the first line.  A subsequent explanation is optional, provided you feel you understand the process.

 

 

 

 

 

#$&*
 

The figure below shows the best-fit line for the second of the three graphs shown so far.

How did your estimates compare with this result?  Summarize briefly below:

 

 

 

 

 

#$&*
 

The figure below shows the best-fit line for the third of the graphs shown previously.

How did your estimates compare with this result?  Summarize briefly below:

 

 

 

 

 

#$&*
 

Stretch your string over your estimated best-fit line for the graph below.  You don't need to write down any data.

How close do you think you can come to the actual best-fit line? 

In the space below, indicate how close you think you could come to the correct vertical intercept, and how close you could stay (in the vertical direction) to the actual best-fit line.

 

 

 

 

 

#$&*
 

The two figures below show the actual best-fit line.  The second figure also includes the equation of that line.

Look at these figures and in the box below either give revised estimates to your previous answers, or state why you think that no revision is necessary:

 

 

 

 

 

#$&*
 

Repeat the above exercise for the graph below.

How closely do you think you can come to the actual best-fit line? 

In the space below, indicate how close you think you could come to the correct vertical intercept, and how close you could stay (in the vertical direction) to the actual best-fit line.

 

 

 

 

 

#$&*
 

The points of this graph are clearly more scattered from any straight line than in any of the preceding graphs.  It is clearly harder to tell exactly where the best-fit line should be.  That is, there is more uncertainty than before when attempting to eyeball a best fit to this data.

Using your string, estimate how much uncertainty there might be in the y intercept of your straight line:

Give these two numbers in the first line, separated by a comma.

 

 

 

 

 

#$&*
 

The lines x = 6, at the right of the graph region, and y = 1.8, at the top of the graph region, form the top and the right-hand sides of what we will call the graph rectangle.

It is likely that some of your estimated best-fit straight lines will pass through the top of this rectangle, and some through the right-hand side.

Two of your answers should be numbers, and two should be 'none'.
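
Whether a given line leaves the graph rectangle through its top or through its right-hand side can be checked as in the Python sketch below.  The sketch assumes the rectangle runs from x = 0 to x = 6 and from y = 0 to y = 1.8 (the lower and left-hand limits are an assumption, not stated above), and the slopes and intercepts used are hypothetical.

```python
def rectangle_crossings(m, b, x_right=6.0, y_top=1.8):
    """For the line y = m*x + b, return (y_on_right_side, x_on_top_side).

    Each entry is None if the line does not cross that side.
    Assumes the rectangle spans 0 <= x <= x_right and 0 <= y <= y_top.
    """
    # Height of the line where it reaches the right-hand side x = x_right.
    y_at_right = m * x_right + b
    y_on_right = y_at_right if 0 <= y_at_right <= y_top else None

    # Horizontal position where the line reaches the top y = y_top.
    x_on_top = None
    if m != 0:
        x_at_top = (y_top - b) / m
        if 0 <= x_at_top <= x_right:
            x_on_top = x_at_top

    return y_on_right, x_on_top

# Hypothetical lines, used only for illustration.
shallow = rectangle_crossings(0.2, 0.3)   # exits through the right-hand side
steep = rectangle_crossings(0.4, 0.3)     # exits through the top
print(shallow, steep)
```

In this sketch the shallower line gives a number for the right-hand side and None for the top, and the steeper line gives the reverse.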

 

 

 

 

 

#$&*
 

The actual best-fit line is depicted in the figure below.  My best guess had the line a little steeper and a little lower on the y axis.  How did yours compare?

 

 

 

 

 

#$&*
 

The figure below includes the best-fit line and its equation.

See where you think the best-fit line for each of the two graphs below intercepts the y axis, and estimate the y coordinate of the point at which it passes through the right-hand side of the graph rectangle.  Jot down your answers.  Don't look at the pictures below these.  Just see what you think and see how you do.  There are no 'wrong' answers and when I myself do this my own estimates aren't that great, so there's certainly margin for error, but try to be as accurate as you can.


 

Give your results in the space below, with the y-intercept and the y coordinate of the right-hand point in the first line separated by commas, and the same information for the second graph in the second line of the space below:

 

 

 

 

 

#$&*
 

Here are the actual best-fit lines, for reference and comparison.

If you have Excel, the best-fit line can be created from a scatter graph by right-clicking on a data point and choosing 'Add Trendline'.  To get a straight line, choose 'Linear'.  To see the equation on the graph, set Options to Display Equation.
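
If you do not have a spreadsheet, the same least-squares line can be computed directly from the standard formulas m = (n Σxy - Σx Σy) / (n Σx² - (Σx)²) and b = (Σy - m Σx) / n.  The Python sketch below is one way to do this; the data set is hypothetical and the function name is an assumption made for this example.  Like the trendline, it is meant for checking your eyeball estimates afterward, not for use in the exercise itself.

```python
def best_fit_line(points):
    """Return (m, b) of the least-squares line y = m*x + b through the points."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)

    # Standard least-squares formulas for slope and intercept.
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b

# Hypothetical data, used only for illustration.
data = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]

m_fit, b_fit = best_fit_line(data)
print(round(m_fit, 4), round(b_fit, 4))   # prints 0.9 1.3
```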

Your instructor is trying to gauge the typical time spent by students on these experiments.  Please answer the following question as accurately as you can, understanding that your answer will be used only for the stated purpose and has no bearing on your grades: 


Please copy your completed document into the box below and submit. 


Revised: 06 Aug 2012 00:11:33 -0400