Math 104

February 9, 2005

What’s on Exam #1?

 

The closed-book, in-class exam on February 18 covers the following material in the text:

 

            Chapter 3, sections 1, 2, 3, 4  (and related exercises)

            Chapter 4, all sections

            Chapter 5, all sections

            Chapter 7, to the extent necessary

            Chapter 8, all sections

            Chapter 9, TBD…at most, sections 1, 3, 5.

 

It does not cover the computer assignments.  I’ll try to limit the exam to material we have covered in class, but I can’t guarantee that absolutely.  Calculators are encouraged.  Rulers and extra paper are allowed. 

 

The homework is the best guide for studying.  The following checklist is just advisory.

 

Checklist

 

Qualitative (labels) vs. quantitative (numerical) variables

            Also, for qualitative variables, ordinal vs. nominal

 

Interpreting histograms…

            Given a reasonably well labeled histogram,

                        Estimate the percentile associated with a particular value.

                        Estimate the value associated with a particular percentile.

 

                        Estimate the percentage of cases in any range of values

 

                        If you know the total number of cases, estimate the number of cases in any

                                    bar of the histogram or any range of values
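
For checking your by-eye estimates, here is a rough Python sketch of the same reasoning. The histogram below is made up for illustration (it is not from the text), heights are in percent per unit, and the sketch assumes cases are spread evenly within each bar. The function names are my own.

```python
# Hypothetical histogram: (low, high, height in percent per unit).
bars = [(0, 10, 2.0), (10, 20, 5.0), (20, 50, 1.0)]

def percent_in_range(bars, a, b):
    """Estimated percent of cases between a and b (area under the boxes),
    assuming cases are spread evenly within each bar."""
    total = 0.0
    for low, high, height in bars:
        overlap = min(b, high) - max(a, low)
        if overlap > 0:
            total += height * overlap
    return total

def percentile_of(bars, value):
    """Estimated percentile of a value: percent of cases below it."""
    return percent_in_range(bars, bars[0][0], value)

def value_at_percentile(bars, pct):
    """Walk the bars left to right until the accumulated area reaches pct."""
    running = 0.0
    for low, high, height in bars:
        area = height * (high - low)
        if height > 0 and running + area >= pct:
            return low + (pct - running) / height
        running += area
    return bars[-1][1]

pct_5_to_30 = percent_in_range(bars, 5, 30)   # 10 + 50 + 10 = 70 percent
pctile_15 = percentile_of(bars, 15)           # 20 + 25 = 45th percentile
value_45 = value_at_percentile(bars, 45)      # back to the value 15

total_cases = 200                             # hypothetical total
cases_5_to_30 = total_cases * pct_5_to_30 / 100

print(pct_5_to_30, pctile_15, value_45, cases_5_to_30)
```

On the exam you would do this arithmetic by hand from the picture; the point is only that percentage equals area.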

 

Constructing a histogram…

            Understand the “equal area principle”: equal areas in the histogram

                        should represent equal numbers (or percentages) of observations.

 

            You won’t be required to construct a vertical scale on the exam, but you may

                        need to do it anyway to get the box heights right.
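
A minimal sketch of the equal area principle in Python. The class intervals and percentages are invented for illustration. Each box height is the percentage in the interval divided by the interval's width, so that area = percentage even when the intervals have unequal widths.

```python
# Hypothetical class intervals (low, high) and the percent of cases in each.
intervals = [(0, 10), (10, 20), (20, 50)]
percents = [20.0, 50.0, 30.0]

def box_heights(intervals, percents):
    """Height of each box in percent per unit, so that area = percent."""
    return [p / (high - low) for (low, high), p in zip(intervals, percents)]

heights = box_heights(intervals, percents)

for (low, high), h in zip(intervals, heights):
    print(f"{low} to {high}: height {h:.1f} percent per unit")
```

Note that the widest interval (20 to 50) gets the shortest box even though it holds more cases than the first one.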

 

Describing a distribution…

            Unimodal / bimodal / multimodal

            Symmetrical / skewed left / skewed right

            Outliers

 

Measures of “center”…

            Given a pile of numbers, be able to compute…

                        Average (same as mean or “arithmetic mean”)

                        Median

                        Mode (if it makes sense for your data)

                        “RMS” average  (square root of the average of the squares)

 

                        Any given percentile (might be called, say, “10th percentile” or

                                    “0.10 percentile”)

                        Q1, Q3  (same as 25th, 75th percentile respectively)
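
If you want to check hand calculations, here is a rough sketch using Python's standard `statistics` module. The data are made up. One caution: there are several conventions for computing percentiles from a short list, and the `"inclusive"` method used here is an assumption on my part; it may differ slightly from the rule used in class or in the text.

```python
import math
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical pile of numbers

average = statistics.mean(data)      # sum divided by count: 30 / 6
median = statistics.median(data)     # middle of the sorted list: (3 + 5) / 2
mode = statistics.mode(data)         # most common value: 3

# RMS average: square root of the average of the squares.
rms = math.sqrt(statistics.mean([x * x for x in data]))

# Quartiles under one common convention (other conventions exist).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(average, median, mode, round(rms, 3), q1, q3)
```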

 

            Which measures of center are best for what purposes?

 

            If you add the same number to all values, what happens to the average?  Median?

                        Standard deviation?

 

            If you multiply all of the values by the same number, what happens to the

                        average?  Median?  Standard deviation?
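
One way to test your answers to the two questions above is to try them on a small example. This sketch uses made-up data and the divide-by-n SD (`statistics.pstdev`); work out your own answers first, then run it.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical pile of numbers

def summary(values):
    """Return (average, median, SD), with the SD computed by dividing by n."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.pstdev(values))

original = summary(data)
shifted = summary([x + 100 for x in data])   # add the same number to all values
scaled = summary([-2 * x for x in data])     # multiply all values by a negative number

print("original:", original)
print("add 100: ", shifted)
print("times -2:", scaled)
```

Trying a negative multiplier is worthwhile because the SD behaves differently from the average and median in that case.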

 

Measures of spread…

            Given a pile of numbers, be able to compute…

                        The deviations from the average

                        SD

                        IQR (= interquartile range, same as Q3 – Q1)

 

            If a book or computer program computes a standard deviation that is slightly

                        larger than yours, is one of you necessarily wrong?
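
A rough sketch of these computations in Python, on made-up data. It also shows one common reason two standard deviations can disagree: some books and programs divide by n - 1 instead of n. Whether that is the explanation in any particular case is something you would have to check. The quartile convention (`"inclusive"`) is an assumption and may not match the one used in class.

```python
import math
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical pile of numbers

avg = statistics.mean(data)
deviations = [x - avg for x in data]   # these always add up to zero

# SD as the RMS size of the deviations (divide by n).
sd = math.sqrt(statistics.mean([d * d for d in deviations]))

# The divide-by-(n - 1) version, which comes out slightly larger.
sd_n_minus_1 = statistics.stdev(data)

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(deviations, round(sd, 3), round(sd_n_minus_1, 3), iqr)
```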

 

Robust statistics…

            What does that mean? 

            “Average” and “SD” aren’t very robust.  Median and IQR are more robust.

 

Normal distributions…

            If a distribution isn’t unimodal and reasonably symmetrical, then it can’t very well be normal…

 

            Given that your data are normally distributed…

 

            Given average and SD, convert original values to standard units and vice versa

 

            Given average and SD, what fraction of cases are between __ and __ ?

                        Or, above ___ ?

                        Or, below ___ ?   (OK to use either the book’s method, or the

                                    method we used in class.  Both tables will be part of the exam.)

 

            Given average and SD…

                        Estimate the value corresponding to a given percentile.

                        Estimate the percentile corresponding to a given value.

 

            Given some percentiles, have at least a fighting chance to calculate the average and SD.
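
The exam uses tables, but for checking homework answers here is a sketch using `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The average of 70 and SD of 3 are made up, and the helper function names are my own. The last function shows one way to get the average and SD from two percentiles, assuming the data really do follow the normal curve.

```python
from statistics import NormalDist

avg, sd = 70.0, 3.0  # hypothetical average and SD

def to_standard_units(value, avg, sd):
    """How many SDs above (+) or below (-) the average the value is."""
    return (value - avg) / sd

def from_standard_units(z, avg, sd):
    return avg + z * sd

curve = NormalDist(mu=avg, sigma=sd)

frac_between = curve.cdf(73) - curve.cdf(67)   # within 1 SD of the average
frac_above = 1 - curve.cdf(76)                 # more than 2 SDs above
value_at_90th = curve.inv_cdf(0.90)            # value at the 90th percentile
percentile_of_73 = 100 * curve.cdf(73)         # percentile of the value 73

def avg_sd_from_percentiles(p1, v1, p2, v2):
    """Solve v = avg + z * sd for avg and sd, given two (fraction, value) pairs."""
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    sd = (v2 - v1) / (z2 - z1)
    return v1 - z1 * sd, sd

recovered = avg_sd_from_percentiles(0.25, curve.inv_cdf(0.25),
                                    0.90, curve.inv_cdf(0.90))

print(round(frac_between, 4), round(frac_above, 4),
      round(value_at_90th, 2), round(percentile_of_73, 1), recovered)
```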

 

Scatterplots…

            Given two quantitative (= numerical) variables (for the same list of cases),

                        construct a scatterplot with one point for each case.

 

            From a scatterplot, estimate the average and SD for each variable separately.

 

            From a scatterplot, be able to guess the correlation coefficient (r).

                        (At least, be able to recognize the differences between

                        -0.95, -0.20, 0.00, +0.20, +0.95)

 

            Recognize danger signs…

                        Cases are really in two or more separate groups

                        Outliers are present

 

            Describe the relationship between two variables:

                        Strong or weak?

                        Positive or negative?  (Or too complicated to be either?)

                        Linear or non-linear?

 

            Construct the “SD line.”
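
A minimal sketch of constructing the SD line, on made-up data. It assumes the usual definition: the line through the point of averages with slope (SD of y) / (SD of x), taken as positive when the association is positive and negative when it is negative. SDs here divide by n, and the function name is my own.

```python
import statistics

x = [1, 2, 3, 4, 5]   # hypothetical values of the first variable
y = [2, 1, 4, 3, 5]   # hypothetical values of the second variable

def sd_line(x, y):
    """Return (slope, intercept) of the SD line."""
    avg_x, avg_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    # The sign of this sum is the sign of the correlation.
    # (If it is exactly zero, the direction is ambiguous; this sketch picks +.)
    cross = sum((a - avg_x) * (b - avg_y) for a, b in zip(x, y))
    sign = 1 if cross >= 0 else -1
    slope = sign * sd_y / sd_x
    intercept = avg_y - slope * avg_x   # forces the line through (avg_x, avg_y)
    return slope, intercept

slope, intercept = sd_line(x, y)
print(slope, intercept)
```
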

                       

Correlation coefficients…

            Given two numerical variables, be able to compute the correlation coefficient (r).

 

            What does it mean for the correlation coefficient to be…

                        Positive vs. negative?

                        Small (close to 0) vs. large (close to +1 or -1) ?

                        Exactly +1 ?

                        Exactly -1 ?

 

            What happens to r when…

                        You add the same number to all values of one of the variables

                        You multiply all values of one variable by the same number

                                    (does it matter if the number is negative?)
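
Here is a rough sketch of computing r as the average of the products of the two variables in standard units, with SDs that divide by n. The data are made up. It also lets you test your answers to the "what happens to r" questions; work them out first, then run it.

```python
import statistics

x = [1, 2, 3, 4, 5]   # hypothetical values of the first variable
y = [2, 1, 4, 3, 5]   # hypothetical values of the second variable

def correlation(x, y):
    """r = average of (x in standard units) times (y in standard units)."""
    avg_x, avg_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    zx = [(a - avg_x) / sd_x for a in x]
    zy = [(b - avg_y) / sd_y for b in y]
    return statistics.mean([a * b for a, b in zip(zx, zy)])

r = correlation(x, y)
r_shifted = correlation([a + 10 for a in x], y)    # add the same number to x
r_scaled = correlation([3 * a for a in x], y)      # multiply x by a positive number
r_negated = correlation([-3 * a for a in x], y)    # multiply x by a negative number

print(round(r, 3), round(r_shifted, 3), round(r_scaled, 3), round(r_negated, 3))
```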

 

Combining subgroup averages…

            Given enough information, get an overall average from subgroup averages

                        or vice versa.
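
A short sketch of both directions, with made-up group sizes and averages. The key point is that subgroup averages have to be weighted by subgroup size. The function names are my own.

```python
def overall_average(groups):
    """groups is a list of (number of cases, subgroup average) pairs."""
    total = sum(n * avg for n, avg in groups)
    count = sum(n for n, _ in groups)
    return total / count

def missing_subgroup_average(overall, total_count, known_groups):
    """Work backwards: find the average of the one subgroup not listed."""
    known_total = sum(n * avg for n, avg in known_groups)
    known_count = sum(n for n, _ in known_groups)
    return (overall * total_count - known_total) / (total_count - known_count)

# Hypothetical: 30 cases averaging 80 and 20 cases averaging 70.
combined = overall_average([(30, 80.0), (20, 70.0)])

# Vice versa: overall average 76 for 50 cases, and 30 of them average 80.
other = missing_subgroup_average(76.0, 50, [(30, 80.0)])

print(combined, other)
```

The overall average (76) is not halfway between 70 and 80, because the larger group pulls it toward 80.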

 

(end, for now)