# Statistics Project

**Part 1:** Assessing normality. Is it normal? The objective of this project is
to identify a data set that is appropriate to test for a normal distribution and
then to apply the three checks in items A through C below (histogram, outlier
count, and normal probability plot). Find a data set on the internet, use one of
the data sets provided in LEO, or collect your own data measuring some
quantitative variable from a population sample. Your data set should have at
least 30 values. Include a summary of your data, as in the example below, and a
table with your first 30-50 data values. You can replace the sample data below
with your own.

**Data Title:**
Eruption Interval of Old Faithful in 1990

**Source:**
Self-collected (include the entire data set at the end of the assignment), from
LEO (indicate which one), or from the internet (provide a link).

| Statistic | Value |
|---|---|
| # of Data Points | 272 |
| Mean | 70.9 |
| Stdev | 13.6 |
| cumulative | TRUE |
| Median | 76 |
| Minimum | 43 |
| Maximum | 96 |
| Q1 | 58 |
| Q3 | 82 |
| IQR | 24 |
| > Outlier (Q3 + 1.5 × IQR) | 118 |
| < Outlier (Q1 - 1.5 × IQR) | 22 |
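
The two outlier limits in the summary follow directly from Q1, Q3, and the IQR. The short sketch below reproduces that arithmetic using only the numbers listed in the summary; the variable names are illustrative and not part of the assignment.

```python
# Values copied from the example summary (Old Faithful eruption intervals).
q1, q3 = 58, 82
minimum, maximum = 43, 96

iqr = q3 - q1                  # interquartile range
upper_fence = q3 + 1.5 * iqr   # values above this count as high outliers
lower_fence = q1 - 1.5 * iqr   # values below this count as low outliers

print(iqr, upper_fence, lower_fence)  # 24 118.0 22.0

# If both the minimum and the maximum sit inside the fences,
# no value in the data set can be an outliers by this rule.
has_outliers = minimum < lower_fence or maximum > upper_fence
print(has_outliers)  # False
```

Because the minimum (43) and maximum (96) both fall between 22 and 118, the example data contain no outliers under the 1.5 × IQR rule.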

| Interval (minutes) |
|---|
| 43 |
| 45 |
| 45 |
| 45 |
| 46 |
| 46 |
| 46 |
| 46 |
| 46 |
| 47 |
| 47 |
| 47 |
| 47 |
| 48 |
| 48 |
| 48 |
| 49 |
| 49 |
| 49 |
| 49 |
| 49 |
| 50 |
| 50 |
| 50 |
| 50 |
| 50 |
| 51 |
| 51 |
| 51 |
| 51 |

A. Histogram: Does it appear normal, or is it skewed right or left, or is it bi-modal? Place your histogram below.

B. How many outliers are greater than Q3 + 1.5 times the IQR, and how many are less than Q1 - 1.5 times the IQR?

C. Is the normal probability plot roughly linear? Place your graph below.

D. After reviewing A, B, and C, do the data come from a population that is normally distributed? Explain.

**Part 2:** If your answer for Part 1 was that your data are normally
distributed, you can use that data for Part 2; otherwise, you need to find
another data set that is normally distributed. Create a cumulative probability
distribution for your data and include a summary of your data (just # of data
points, median, standard deviation) as well as a table with your first 30-50
data values (if it is not the same data as in Part 1).

**Data Title:**

**Source:** Same as in Part 1 or, if not the same as in Part 1, self-collected
(include the entire data set at the end of the assignment), from LEO (indicate
which one), or from the internet (provide a link).

A. What is your random variable?

B. Who or what is your random variable about (population)?

C. What is the probability that a (insert population here) has a (insert random variable here) greater than (insert a test value of your choice here)?

D. What is the probability that a (insert population here) has a (insert random variable here) less than (insert a test value of your choice here)?

E. What is the probability that a (insert population here) has a (insert random variable here) between (insert larger test value of your choice here) and (insert smaller test value here)?

F. Is it unusual for a (insert individual in the population here) to have a (insert random variable from part A here) greater than (test value from part C)?

G. What (insert random variable here) do 90% of all individuals in/of (insert population here) have less than?

H. What (insert random variable here) do 5% of all individuals in/of (insert population here) have less than?
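
Questions C through H can all be answered from a cumulative normal distribution. As a hedged sketch, the code below uses `statistics.NormalDist` from Python's standard library with the mean (70.9) and standard deviation (13.6) from the Part 1 example summary, purely to show the mechanics; whether those data are actually normal is what Part 1 asks you to decide. The test values 90 and 60 and the 0.05 cutoff for "unusual" are assumptions chosen for illustration, so substitute your own values and whatever cutoff your course uses.

```python
from statistics import NormalDist

# Mean and standard deviation taken from the Part 1 example summary.
model = NormalDist(mu=70.9, sigma=13.6)

# Hypothetical test values; replace with your own.
test_high, test_low = 90, 60

p_greater = 1 - model.cdf(test_high)                    # C: P(X > 90)
p_less = model.cdf(test_low)                            # D: P(X < 60)
p_between = model.cdf(test_high) - model.cdf(test_low)  # E: P(60 < X < 90)

# F: assumed convention that a probability below 0.05 counts as unusual.
is_unusual = p_greater < 0.05

p90 = model.inv_cdf(0.90)  # G: value that 90% of individuals fall below
p05 = model.inv_cdf(0.05)  # H: value that 5% of individuals fall below

print(round(p_greater, 4), round(p_less, 4), round(p_between, 4))
print(is_unusual, round(p90, 1), round(p05, 1))
```

Here `model.cdf(x)` gives the cumulative probability P(X < x), and `model.inv_cdf(p)` goes the other way, from a cumulative probability back to a value.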