Error Analysis II, Using the Data Analysis Program


As on all forms, be sure you have your data backed up in another document, and in your lab notebook.

Your course (e.g., Mth 151, Mth 173, Phy 121, Phy 232, etc. ):

Enter your access code in the boxes below. 

Remember that it is crucial to enter your access code correctly.  As instructed, you need to copy the access code from another document rather than typing it.

Access Code:
Confirm Access Code:

Your Name:

First Name
Last Name

Your VCCS email address.  This is the address you were instructed in Step 1 to obtain.  If you were not able to obtain that address, indicate this below.

You may enter any message or comment you wish in the box below:

Copy this document, starting in the line following these instructions, into a word processor or text editor. 

In the Error Analysis I experiment you were to have observed 30 swings of a certain pendulum. 

In the first error analysis lab (entitled Error Analysis Part I; if you don't clearly remember this activity you should review your results as posted at your access site) you compared the mean and standard deviation of a (not-quite-random) sample of five of these intervals with the mean interval for the entire sequence of observations.

In this exercise you will perform additional analysis of your data. This analysis can be done easily enough using a spreadsheet, but one of the purposes of this lab is to introduce you to a rudimentary analysis program designed to perform certain very common operations on certain common types of data.

In this exercise you will also learn more about the normal distribution and its application to experimental data.

The average time reported to complete this experiment is about 2 hours, with times pretty evenly distributed between 1 and 3 hours.  A few students report under 1 hour, and a few report over 3 hours.

The data program should save you several hours in analyzing some of the subsequent labs.

The program can be obtained by clicking on the link data program.  In case this link doesn't work the program is located at

or can be accessed by going to the Access Site using a path similar to the one used to access your site:

Analyze 30-interval data

In the space below include a copy of your data from the Error Analysis I experiment.  Then state the mean of your 30 time intervals, the mean and standard deviation of your sample of 6 intervals, and whether the difference of the means is less than the standard deviation of your sample divided by sqrt(6).

Your response (start in the next line):

data from Error Analysis I:

 

#$&*
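
The comparison requested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are made up, the variable names are hypothetical, and the n - 1 form of the standard deviation is assumed (the data program may use a different convention).

```python
import math
import statistics

# Hypothetical time intervals in seconds; substitute your own data.
all_intervals = [1.0, 1.2, 0.9, 1.1, 1.0, 1.0, 1.3, 1.0, 0.9, 1.1]
sample = [1.2, 0.9, 1.0, 1.3, 1.1]

mean_all = statistics.mean(all_intervals)
mean_sample = statistics.mean(sample)

# statistics.stdev uses the n - 1 (sample) formula; this is an assumption
# about which formula is wanted.
std_sample = statistics.stdev(sample)

# Standard deviation of the sample divided by sqrt(sample size).
threshold = std_sample / math.sqrt(len(sample))
difference = abs(mean_sample - mean_all)

print(f"difference of means: {difference:.4f}")
print(f"std dev / sqrt(n):   {threshold:.4f}")
print("difference is smaller" if difference < threshold else "difference is not smaller")
```

Using len(sample) rather than a fixed number keeps the sketch valid for whatever sample size you actually used.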

You may download the data program or run it directly from the site.  Note the following:

Run the program and click through any extraneous messages.  (If necessary you might need to click on the maximize button to maximize the size of the form and make all the buttons visible, but this should not be an issue.)

What are the mean and standard deviation of your 30 time intervals, as reported by the program? Report below, using two tab-delimited numbers in the first line.  Starting in the next line give a brief explanation of what your numbers mean and how you obtained them.  After that explanation, include a copy of your data set for reference.

Your response (start in the next line):

mean of 30 intervals:          standard deviation:

how obtained:

#$&*
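
If you want to check the program's report independently, the following Python sketch computes a mean and both common forms of the standard deviation. The data are made up, and which form the data program reports is not stated here, so compare your result with both.

```python
import statistics

# Hypothetical time intervals in seconds, one per line in the data box.
intervals = [0.5, 0.7, 0.6, 0.8, 0.4]

mean = statistics.mean(intervals)
sample_std = statistics.stdev(intervals)       # divides by n - 1
population_std = statistics.pstdev(intervals)  # divides by n

# Tab-delimited mean and standard deviation, as the form requests.
print(f"{mean:.4f}\t{sample_std:.4f}")
print(f"population form of the standard deviation: {population_std:.4f}")
```

For small data sets the two standard deviations differ noticeably; for 30 values the difference is only a few percent.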

Investigate 'first differences' of 30-interval data

Now restore your original 30 time intervals to the box. You will have to do this manually, clearing the contents of the box and then copying and pasting the data from the text editor or word processor where you stored it before. Make sure your data also stays in that location, because you'll need it at least once again.

Click on the First Difference button. You will see a report of the differences between your successive time intervals.

Are the differences between your time intervals all different, or do some occur more than once?

Where have you seen this information before and what does it mean?

Your response (start in the next line):

first three differences of clock times:

are differences all different:

where seen before:

#$&*
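
A 'first difference' operation of the kind described above can be sketched as follows. The function name first_differences and the data are hypothetical, and the data program's exact output format may differ.

```python
def first_differences(values):
    """Return the differences between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical time intervals in seconds.
intervals = [0.5, 0.75, 0.5, 1.0]
differences = first_differences(intervals)
print(differences)  # [0.25, -0.25, 0.5]
```

Note that a list of n numbers has only n - 1 first differences.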

Sum your 30 time intervals and speculate on meaning

Restore your original 30 intervals to the box. Click on the Running Sum button.

Your response (start in the next line):

running sum of intervals, first three sums:

explanation of how running sums were obtained:

#$&*
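
A running sum can be sketched with the standard library as shown below. The data are made up, and this is an assumption about what the Running Sum button does rather than a description of the program itself.

```python
from itertools import accumulate

# Hypothetical time intervals in seconds.
intervals = [0.5, 0.75, 0.5, 1.0]

# Each entry is the total of all intervals up to and including that one.
running_sums = list(accumulate(intervals))
print(running_sums)  # [0.5, 1.25, 1.75, 2.75]
```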

Analyze the first difference of the running sums, and the first difference of this result

Delete everything but the single-column report of the running sums, so the data box contains just the running sums with one sum on each line, and click on the 'first difference' button.

Your response (start in the next line):

first difference of running sums, first three numbers:

description and meaning:

explanation of how numbers were calculated:

#$&*

Again isolate only the single-column report and again click on First Differences.

Your response (start in the next line):

first three numbers from first difference:

how calculated:

#$&*
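
The two steps above (first difference of the running sums, then first difference of that result) can be sketched as follows, using made-up data and a hypothetical helper named first_differences. Compare the pattern with what the program reports for your own data.

```python
from itertools import accumulate


def first_differences(values):
    """Return the differences between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical time intervals in seconds.
intervals = [0.5, 0.75, 0.5, 1.0]

running_sums = list(accumulate(intervals))      # [0.5, 1.25, 1.75, 2.75]
first = first_differences(running_sums)         # [0.75, 0.5, 1.0]
second = first_differences(first)               # [-0.25, 0.5]

print(first)
print(second)
```

In this sketch the first difference of the running sums reproduces the original intervals after the first one, and the second list matches the first differences of the intervals themselves.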

Find difference quotients for a new set of data and speculate on the meaning of the difference quotient

Clear the box then copy the following 4 lines into the textbox:

0, 0

10,10

20,25

30,45

Your response (start in the next line):

difference quotient from sample data:

how you think the results were calculated:

#$&*

The information in the table

0, 0

10,10

20,25

30,45

represents the position of an object rolling down an incline vs. clock time, with position in meters and clock time in seconds. Recall that according to our 'y vs. x' convention, in a position vs. clock time table the clock time is in the first column.

Your response (start in the next line):

distance during   first interval:            elapsed time while traveling this distance:            average speed during interval:

distance during second interval:            elapsed time while traveling this distance:            average speed during interval:

distance during third interval:            elapsed time while traveling this distance:            average speed during interval:

explanation:

explanation of 'difference quotient' operation:

#$&*
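
One plausible reading of the 'difference quotient' operation is the change in the second column divided by the change in the first column, for each pair of successive rows. The sketch below applies that assumed definition to the table given above; the function name is hypothetical.

```python
def difference_quotients(points):
    """Change in y divided by change in x for each pair of successive points."""
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


# (clock time in seconds, position in meters), from the table above.
points = [(0, 0), (10, 10), (20, 25), (30, 45)]
quotients = difference_quotients(points)
print(quotients)  # [1.0, 1.5, 2.0]
```

Under this reading each result is a distance divided by the time required to travel it, in meters per second.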

Select and analyze 5 random intervals from 30-interval data, using the data program to find mean and standard deviation

Using a coin according to the following instructions, you will now select 5 intervals randomly from your 30-interval data. You will do this by generating 5 numbers corresponding to the numbers of your data points. The process should take only a couple of minutes:

Using the coin you will generate a series of numbers from 0 to 31. Counting both 0 and 31 there are 32 such numbers, so this process can generate 32 possible numbers.

If you generate a number you have generated before you will discard it and generate an alternative.

If you generate a number that does not correspond to one of your intervals (probably 1-30 or 1-29) you will discard that number.

You will continue until you have generated 5 numbers that haven't been discarded.

Generating each number requires 5 flips of your coin. For each flip you will write down one number, so you will write down 5 numbers in all.

Your first flip is worth 1. Flip the coin. If you get Heads write down the number 1. If you get tails write down 0. Whichever number you write down will be at the top of a column.

Your second flip is worth 2. Flip the coin a second time. If you get Heads write down the number 2. If you get tails write down 0. This number goes in the column below the previous.

The third, fourth, and fifth flips are respectively worth 4, 8 and 16 on Heads, 0 if you get Tails.

You should now have five numbers in your column. Add them up.

The result will be not less than 0 + 0 + 0 + 0 + 0 and not more than 1 + 2 + 4 + 8 + 16 = 31.

Go ahead and generate your first number according to these instructions. If the number is between 1 and the number of intervals you observed (e.g., between 1 and 30, or between 1 and 29), circle the number.

Now generate another number, using the same procedure with 5 flips of the coin. If this number is between 1 and your number of intervals (e.g., between 1 and 30), and if it does not duplicate the first number you generated, circle it.

Continue this process, generating totals between 0 and 31 and circling those that lie in the correct range and do not duplicate any of your previous numbers. Stop when you have generated 5 distinct numbers within the appropriate range.

Now select the time intervals corresponding to the numbers you have generated (e.g., if you had a 30-interval set and your numbers were 23, 8, 11, 19 and 5 you would select the 23rd, 8th, 11th, 19th and 5th time intervals).

In the first line below, report the five random numbers you generated, in comma delimited format.

In the second line below, report the five time intervals you put into the box, in comma delimited format.

In the third line, report the mean and the standard deviation in comma-delimited format.

Starting in the fourth line give a brief explanation of what your numbers mean and how they were obtained.  Optional comments may be added.

Your response (start in the next line):

your five random numbers:

your five intervals:

mean and standard deviation:

explanation:

#$&*
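
The coin-flip selection procedure described above can be simulated as a check on your understanding; you should still use a real coin for the lab. In this sketch the function names are hypothetical, a pseudorandom generator stands in for the coin, and 30 intervals are assumed.

```python
import random


def flip_number(rng):
    """Simulate 5 flips worth 1, 2, 4, 8 and 16 on Heads, 0 on Tails."""
    return sum(worth for worth in (1, 2, 4, 8, 16) if rng.random() < 0.5)


def pick_indices(n_intervals, how_many, rng):
    """Keep generating numbers until how_many distinct ones lie in 1..n_intervals."""
    chosen = []
    while len(chosen) < how_many:
        number = flip_number(rng)
        # Discard numbers out of range (0 or greater than n_intervals) and repeats.
        if 1 <= number <= n_intervals and number not in chosen:
            chosen.append(number)
    return chosen


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
picks = pick_indices(30, 5, rng)
print(picks)
```

Because each of the 32 totals from 0 to 31 is equally likely, every interval number from 1 to 30 has the same chance of being selected.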

In three lines report the following numbers:

By how much does the mean of your 5-interval sample differ from the mean of the entire data set of 30 intervals?

What is the standard deviation of the 30-interval set?

What is the first number you reported as a percent of the second? That is, what is the difference between your sample and the entire data set, as a percent of the standard deviation of the data set?

Starting in the fourth line give a brief explanation of what your numbers mean and how you obtained them.

Your response (start in the next line):

difference of means:

standard deviation of 30-interval set:

first number as percent of second:

explanation and meanings:

#$&*
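
The percent calculation requested above can be sketched as follows. The data are made up and the n - 1 form of the standard deviation is assumed; use whatever values the data program reported for your own data.

```python
import statistics

# Hypothetical data: a full data set and a sample drawn from it.
all_intervals = [1.0, 2.0, 3.0, 4.0, 5.0]
sample = [2.0, 3.0, 5.0]

difference = abs(statistics.mean(sample) - statistics.mean(all_intervals))
std_all = statistics.stdev(all_intervals)  # n - 1 form, an assumption

# Difference of the means as a percent of the full set's standard deviation.
percent = 100 * difference / std_all
print(f"{difference:.4f}, {std_all:.4f}, {percent:.1f}%")
```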

Analyze a set of 'made-up' time intervals and look at their distribution

The set of numbers given below represents a set of 30 'made-up' quick-click time intervals. You will answer a few questions about this data set, including the mean and standard deviation of a 5-interval random sample. Later the results of all students will be compiled and used to demonstrate the 'sample standard deviation', which is an important statistical characteristic of a sample and very relevant to interpretation of experimental results.

.1752

.172

.1979

.1991

.176

.1711

.1664

.1665

.1858

.1764

.1765

.1885

.173

.1853

.1683

.1674

.1833

.1632

.1783

.1962

.1704

.1914

.1751

.1715

.1967

.1852

.1851

.1771

.1639

.1824

.1877

Copy these numbers into a cleared textbox, click on Mean and Standard Deviation, and report their mean and standard deviation in comma-delimited format in the first line below. Starting in the next line give a brief explanation of what your numbers mean and how you obtained them.

Your response (start in the next line):

mean and standard deviation of given data:

explanation and meanings:

#$&*


In the space below, enter the following numbers, one to a line, in the given order:

Starting in the next line give a brief explanation of what your numbers mean and how you obtained them.

Your response (start in the next line):

two standard deviations less than the mean:

one standard deviation less than the mean:

the mean:

one standard deviation more than the mean:

two standard deviations more than the mean:

explanation and meanings:

#$&*
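
The five numbers requested above are the mean plus -2, -1, 0, 1 and 2 standard deviations. A minimal sketch, using made-up data chosen so the arithmetic is easy to follow, and assuming the divide-by-n form of the standard deviation (the data program may use n - 1, which would change the values slightly):

```python
import statistics

# Hypothetical data with mean 5 and divide-by-n standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)
std = statistics.pstdev(data)  # divide-by-n form, an assumption

# mean - 2 std, mean - std, mean, mean + std, mean + 2 std
boundaries = [mean + k * std for k in (-2, -1, 0, 1, 2)]
for value in boundaries:
    print(f"{value:.4f}")
```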


In the space below, report each of the following numbers, one number to each line:

Your response (start in the next line):

number of given intervals which are less than the number which is two standard deviations less than the mean:

number of given intervals which are between two standard deviations less than the mean and one standard deviation less than the mean:

number of given intervals which are between one standard deviation less than the mean and the mean:

number of given intervals which are between the mean and one standard deviation more than the mean:

number of given intervals which are between one standard deviation more than the mean and two standard deviations more than the mean:

number of given intervals which are greater than the number which is two standard deviations more than the mean:

explanation and comments:

#$&*


In the space below, report each of the numbers you reported above, but expressed as a percent of the 30 intervals (rounded to the nearest percent). For example, the number 10 would be 33% of 30.  Include a brief explanation of what your numbers mean and how you obtained them.

Your response (start in the next line):

percent of given intervals which are less than the number which is two standard deviations less than the mean:

percent of given intervals which are between two standard deviations less than the mean and one standard deviation less than the mean:

percent of given intervals which are between one standard deviation less than the mean and the mean:

percent of given intervals which are between the mean and one standard deviation more than the mean:

percent of given intervals which are between one standard deviation more than the mean and two standard deviations more than the mean:

percent of given intervals which are greater than the number which is two standard deviations more than the mean:

explanation and comments:

#$&*
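
Counting by hand is part of the exercise, but the counts and percents for the six ranges can be cross-checked with a sketch like the one below. The function name and data are hypothetical, the divide-by-n standard deviation is assumed, and a value exactly equal to a boundary is placed in the lower range, which is an arbitrary choice.

```python
import statistics


def count_by_range(data, mean, std):
    """Count values in the six ranges bounded by mean +/- 1 and 2 std."""
    edges = [mean - 2 * std, mean - std, mean, mean + std, mean + 2 * std]
    counts = [0] * 6
    for x in data:
        # The number of boundaries strictly below x selects the range.
        index = sum(1 for edge in edges if x > edge)
        counts[index] += 1
    return counts


# Hypothetical data with mean 5 and divide-by-n standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
counts = count_by_range(data, statistics.mean(data), statistics.pstdev(data))

# Python's round() sends exact halves to the nearest even integer.
percents = [round(100 * c / len(data)) for c in counts]

print(counts)  # [0, 1, 5, 1, 1, 0]
print(percents)
```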

Perform a similar analysis with your 30-interval data

Return to your own 30 time intervals. Count the numbers in each range (less than mean - 2 std dev, between mean - 2 std dev and mean - 1 std dev, between mean - 1 std dev and mean, etc.), using the mean and standard deviation of that data set.

Report each number, and that number as a percent of your total number of intervals, one to each of the first six lines below.  Starting in the 7th line give a brief explanation of what your numbers mean and how you obtained them.

Your response (start in the next line):

number and percent of given intervals which are less than the number which is two standard deviations less than the mean:

number and percent of given intervals which are between two standard deviations less than the mean and one standard deviation less than the mean:

number and percent of given intervals which are between one standard deviation less than the mean and the mean:

number and percent of given intervals which are between the mean and one standard deviation more than the mean:

number and percent of given intervals which are between one standard deviation more than the mean and two standard deviations more than the mean:

number and percent of given intervals which are greater than the number which is two standard deviations more than the mean:

explanation and comments:

#$&*

In a standard 'normal' distribution, we expect that the respective percents in the six ranges will be about 2%, 14%, 34%, 34%, 14% and 2%. In a very large sample of data (say, at least tens of thousands of data points), if the data are in fact distributed normally, we expect actual results to very nearly reflect this distribution. If a large distribution does not closely match the expected results, we suspect that something in the system or in our observation process in fact deviates from the 'standard normal' expectation. Not everything we observe does in fact follow the standard normal pattern. Your 30-interval results might or might not be expected to follow a standard normal distribution.

If the data sample is not very large, the chance fluctuations in the distribution could have a significant effect on the percents, which in that case may not be all that close to the expected distribution. However, in a medium-sized sample of 30 or so, we definitely expect more observations to lie in the middle two ranges than in either of the outer ranges, and we aren't too surprised if no results at all appear in the outermost ranges (more than 2 standard deviations from the mean).
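
The expected percents quoted above can be checked against the standard normal distribution itself. This sketch uses Python's statistics.NormalDist (available in Python 3.8 and later); the variable names are only illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Cumulative probability at z = -2, -1, 0, 1, 2, padded with 0 and 1
# so that successive differences give the six ranges.
cumulative = [0.0] + [standard_normal.cdf(z) for z in (-2, -1, 0, 1, 2)] + [1.0]
percents = [100 * (high - low) for low, high in zip(cumulative, cumulative[1:])]

print([round(p, 1) for p in percents])  # [2.3, 13.6, 34.1, 34.1, 13.6, 2.3]
```

Rounded to the nearest percent these agree with the 2%, 14%, 34%, 34%, 14% and 2% figures given above.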

Based on the percents you reported and the percents quoted above, by how much would you say your actual 30-interval results deviated from the standard normal distribution? Did your results deviate enough to make you suspect that your clicks were not normally distributed about their mean?

Your response (start in the next line):

 

 

#$&*

Answer the same question for the 30 made-up time intervals given earlier.

Your response (start in the next line):

 

 

#$&*

Compare your distribution with the standard normal distribution

We will in a subsequent exercise learn to sketch a standard normal curve, and to represent our information using this sketch.

For the present, simply copy this figure below and label it as indicated below:

There are five vertical lines on the graph, representing respectively

Label the x axis with the z numbers -2, -1, 0, 1 and 2.


In the space below:

Your response (start in the next line):

numbers representing percents in each of six regions:

x-axis labels for your data:

explanation:

#$&*

Your instructor is trying to gauge the typical time spent by students on these experiments.  Please answer the following question as accurately as you can, understanding that your answer will be used only for the stated purpose and has no bearing on your grades: 


You may add optional comments and/or questions in the box below.


Copyright © 1999 [OrganizationName]. All rights reserved.
Revised: 05 May 2014 21:49:09 -0400