To reach Mrs. Linner by email, please click here.
Come by before school (8:00) for extra help if you need it, but don't wait until the last minute!
Syllabus for APStat.
Standards for AP Statistics.
Posted at 06:55 PM | Permalink
Hey, we're finally going to do confidence intervals for proportions, that subject we started with back in Aug/Sept, again. This time should be a breeze for you! We'll start with part of section 10.1, then do section 10.3 (proportions), and head back to pick up the parts we missed (10.1 and 10.2). Our Chapter test will be February 9.
First, here are the results of the South Carolina Republican primary for president 2012. (Source: http://www.guardian.co.uk/news/datablog/2012/jan/22/south-carolina-primary-data-mapped) What seems so interesting is that, although Speaker Gingrich amassed 40% of the vote state-wide, the individual counties had widely different results. The z-scores computed use 40% as p and 60% as q. Check out how different the values are for different counties and remember that (1) anyone can vote as a Republican in the open primary system, and (2) low z-values for Speaker Gingrich mean that there must have been lots of votes for his Republican opponents, Governor Romney, Representative Paul, and Senator Santorum in particular. The votes for the other candidates have been dropped from this report so it would fit on the screen.
Now, the first HW assignment of the chapter:
10.1, 10.2, 10.3, 10.7, 10.8
10.45-10.47, 10.49
These should be completed by Friday.
------
Friday, January 27, 2012
We've learned how to create confidence intervals for the population proportion given a sample proportion. The four sections of detail that we have to provide for a complete answer are the SETUP (all the information given, plus the definition of the parameter of interest), the CONDITIONS (where we check to see if we can perform inference, if the p-hats will be Normally distributed, and if the short formula for the standard deviation can be used), ARITHMETIC (including the formula, the numbers inserted, and the result), and the DECISION IN THE CONTEXT OF THE PROBLEM.
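For anyone who wants to check the ARITHMETIC quadrant on a computer, here is a minimal Python sketch of the one-proportion z-interval. The function name and the sample (212 successes in 500 tries) are made up for illustration, and the condition checked is the "at least 10 successes and 10 failures" version of the Normality check. It does not replace the SETUP, CONDITIONS, and DECISION quadrants.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """One-proportion z-interval: p-hat +/- z* * sqrt(p-hat * q-hat / n)."""
    # Normality condition: at least 10 successes and 10 failures in the sample.
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 212 successes out of 500.
low, high = proportion_interval(212, 500)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

You would still have to state the interval in the context of the problem.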
Regarding context, if you can't tell the nature of the problem (subject, goal) from what you wrote in a "quadrant," then you need to insert context.
For Monday, finish the book State of Fear, and bring the homework problems that were due today.
The University of St Andrews in Scotland (where Prince William went to college) has provided the following biography of W. S. Gosset: http://www-history.mcs.st-andrews.ac.uk/Biographies/Gosset.html
Also, a webpage has been created by fans of statistics to highlight the accomplishments of Sir Francis Galton. You may recognize him from the endnotes of State of Fear.
Posted at 05:31 PM | Permalink
Welcome back!
CHAPTER 9 TEST TUESDAY, JANUARY 24
Today (January 10th) we picked up the syllabus for the new semester, contributed more data to our collection of average penny ages, and investigated a problem related to Simpson's Paradox. We defined Simpson's Paradox as what happens when the "direction" of an association is reversed upon disaggregation of data. We paraphrased: When you break a set of data into components based on a lurking variable and the direction of the association switches you have a Simpson's Paradox.
Parents, ask your kids to show you the data. The situation is mind-blowing.
As an example, consider two students shooting baskets. Student A makes 40% of her two point shots and 30% of her three point shots. Student B makes 1/3 of her two point shots and 1/4 of her three point shots. They each attempt 1000 shots over the course of a season. Student A makes a higher percentage of her shots, right?
If student A shoots 40 two-pointers and 960 three pointers, then she makes 30.4% of her shots.
If student B shoots 900 two pointers and only 100 three pointers, then she makes 32.5% of her shots.
When the data are aggregated, student B has a higher success rate. When the shots are disaggregated by type of shot, student A has a higher success rate for each type of shot.
Behold, Simpson's Paradox.
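If you want to verify the basketball numbers yourself, here is a short Python sketch. The shot counts and success rates come straight from the example above; the dictionary layout and the function name are just one way to organize them.

```python
# (attempts, success rate) for each type of shot, from the example above.
shots = {
    "A": {"two": (40, 0.40), "three": (960, 0.30)},
    "B": {"two": (900, 1 / 3), "three": (100, 1 / 4)},
}

def overall_rate(player):
    """Aggregate success rate: total shots made / total shots attempted."""
    made = sum(attempts * rate for attempts, rate in shots[player].values())
    attempted = sum(attempts for attempts, _ in shots[player].values())
    return made / attempted

for player in shots:
    print(player, f"{overall_rate(player):.1%}")
```

Student A wins within each type of shot, yet the aggregated rates come out 30.4% for A and 32.5% for B, because A takes almost all of her shots from three-point range.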
Sampling Distributions Goals for the chapter:
Sample proportion problems:
Section 9.1: 9.2, 9.5 a-b, 9.6, 9.7, 9.10, 9.14, SHOULD BE DONE BY 1/13
Section 9.2: 9.20, 9.21, 9.22, 9.23, 9.25, 9.26, 9.28, 9.29, 9.30 SHOULD BE DONE BY 1/17
Do not neglect the vocabulary. You should see that your understanding of the vocabulary affects your efficiency solving the problems above.
Sample mean problems will follow.
January 12, 2012
Yesterday and today we explored an example of statistical process control. We used JMP software to review the results of our data collection and saw how quality control people in the workplace would interpret the results.
Our next aspect to investigate is sampling distributions of proportions. Today we determined that the mean of the sample proportions is the true population proportion, and the standard deviation of the sample proportions is sqrt (pq/n). This will be approximately Normal when the conditions would make x, the number of successes, approximately Normal: np>= 10 and nq>=10. We also want the sample to be a small portion of the population, generally less than 10%. Enough of this theoretical stuff. . .we need to investigate!
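Here is a minimal Python sketch of those facts about the sampling distribution of p-hat. The function name is my own, and the values p = 0.5 and n = 50 are assumptions chosen only to have something to plug in.

```python
from math import sqrt

def phat_sampling_distribution(p, n, population_size=None):
    """Mean, standard deviation, and condition checks for sample proportions."""
    q = 1 - p
    return {
        "mean": p,                       # center of the p-hats is the true p
        "sd": sqrt(p * q / n),           # sqrt(pq/n)
        "normal_ok": n * p >= 10 and n * q >= 10,
        # 10% condition; None when the population size is not supplied.
        "ten_percent_ok": (None if population_size is None
                           else n < 0.10 * population_size),
    }

# Assumed values for illustration only.
info = phat_sampling_distribution(p=0.5, n=50)
print(info)
```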
Special HW due Friday: Spin a post 1970 penny on a flat surface until it stops naturally. Record whether it landed heads up. Repeat until you have 50 observations. Bring the number of heads up (successes) you have in the 50 tries. It will most likely NOT be 25.
Penny update: We have been calculating the average ages of samples of pennies in the classroom since last semester. The formula we should be using is age = 2011 minus the year of the penny. When we realized that many people had been calculating their average ages of the samples of pennies using the base year of 2012, we talked about how that would introduce bias into our results--the estimated age would be a year off from the way we calculated the first set of numbers, and the average sample ages would be bad estimates for the average age as of November 2011. We set aside the flawed data and started anew.
Tuesday, January 17, 2012
You should now understand the conditions under which the distribution of sample proportions will be approximately Normal and be able to compute the areas associated with related inequalities (like P(p-hat is less than .64)).
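As a quick check on a problem like P(p-hat is less than .64), here is a Python sketch. The true proportion p = 0.60 and sample size n = 100 are assumptions for illustration; substitute the values from the problem you are working.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.60, 100                 # assumed values, not from a specific problem
sd = sqrt(p * (1 - p) / n)       # standard deviation of the p-hats

# Conditions: np and nq are both at least 10, so the Normal model is reasonable.
assert n * p >= 10 and n * (1 - p) >= 10

prob = NormalDist(mu=p, sigma=sd).cdf(0.64)
print(f"P(p-hat < 0.64) is about {prob:.3f}")
```

With these assumed values the z-score is about 0.82, so the area should come out near 0.79.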
Wednesday we explore the Central Limit Theorem.
HW for section 9.3: 9.32, 9.33, 9.34, 9.38, 9.39, 9.40, 9.41, 9.45 DUE Thursday
HW for the Chapter review: 9.47, 9.49, 9.52, 9.54, 9.55, 9.58. DUE Friday
HW: find the loose pennies in your household, and create a histogram of their ages.
Last chapter test of the semester: Chapter 8, Thursday, December 15.
problems due Tuesday, December 13:
8.1-8.7, 8.9, 8.13, 8.10, 8.17, 8.21, 8.24, 8.32, 8.34, 8.38, 8.40
8.41, 8.42, 8.48, 8.50, 8.58
8.59, 8.60, 8.63, 8.68
They are in a funny order so you can work related problems back-to-back.
Friday, December 9, 2011
We looked at Bernoulli random variables and saw that Binomial random variables were just combinations of Bernoulli RVs. The mean (E[X]) of the binomial random variable x is np. The standard deviation is sqrt (n*p*(1-p)).
We also computed the probabilities associated with X~binom(n, p): P(x=k) = nCk * p^k * (1-p)^(n-k).
Your calculator has some functions that will calculate these probabilities for you without much trouble, but you should know how to use the formula, too.
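The same formulas can be sketched in Python as a check on your hand work. The values n = 10 and p = 0.5 are assumed for illustration, and the function name is my own.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) = nCk * p^k * (1-p)^(n-k) for X ~ binom(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5                    # assumed values for illustration
mean = n * p                      # E[X] = np
sd = sqrt(n * p * (1 - p))        # sqrt(n*p*(1-p))

print(f"mean = {mean}, sd = {sd:.4f}")
print(f"P(X = 3) = {binom_pmf(3, n, p):.4f}")
```

A useful sanity check is that the probabilities for k = 0 through n add up to 1.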
Now, about the OLD stuff. There were still FAR too many people who refused to learn the real meaning of R-squared. You need to copy the interpretation given in class 10 times to make it stick in your memory. No more bogus answers!!! I'll start taking off 25 points out of 4.
If you are currently failing, you need to create 2-3 page chapter summaries for each chapter covered this semester (1-8) to demonstrate minimal understanding of the material and prepare for the final exam.
Posted at 03:32 PM | Permalink
HW for Chapter 7:
7.3, 7.6, 7.7, 7.8, 7.12, 7.15, 7.24, 7.28, 7.32, 7.37, 7.40, 7.42, 7.43, 7.47, 7.49, 7.51, 7.52, 7.54, 7.55, 7.56, 7.57, 7.58, 7.64 DUE December 6th
Friday, December 2, 2011
We took a step backward and revisited independence today. Please complete the attached worksheet to refine your understanding of mathematical independence.
Download Venn template answers
Percentages cartoon
Monday, December 5, 2011
We worked through the attached worksheet (or a similar one), calculating means and variances from basic principles and also using the formulas from this chapter.
Random variables worksheet answers
You have to be able to work problems using these methods without assistance--no partially-filled in tables, no headers on columns, etc. You have to KNOW what to do. Practice. Check your work against the answers in the back of the book. Check your answers using the special functions on the calculator.
Wednesday, December 7, 2011
You had a quiz today. The first page is available for download. The second page consisted of problems from the homework.
Answers to second page:
mu = 2.78,
standard deviation = .8669
Mu of Total = 70 minutes
and 7.71%
Answers to the first page will vary according to the probabilities of A and B. Refer to the Venn Diagram worksheets above for help here.
Posted at 03:07 PM | Permalink
The test for this unit will be Thursday, December 1.
The assigned problems for the unit are as follows:
Simulation: 6.2, 6.4, 6.5, 6.6, 6.9, 6.10, 6.11, 6.17, 6.20
Models: 6.24, 6.26, 6.30, 6.36, 6.41, 6.44, 6.45, 6.46, 6.49, 6.50
Probability Rules: 6.67, 6.68, 6.69, 6.72, 6.75, 6.82, 6.83, 6.87, 6.90, 6.94
We will start with simulation on Friday, November 18th.
Our plan is to finish chapters 6, 7, and 8 before the end of this semester so we can start the new semester with sampling distributions.
-----------------------------------------
Tuesday, November 29, 2011
Monday and Tuesday we worked problems from old AP exams.
2004 #4
1999 #5
2001 #2 and #3
2008 #3
Wednesday we will work problems from 2011, #2 and part of #6.
You can access most of these problems online at apcentral.
_____________________________________________________
Wednesday, November 30, 2011
Today we worked the problems from the 2011 exam in groups.
The worksheet is attached here: Prob activity - FR2011
The solutions are attached here: Prob activity - FR2011 solution
The reminders about the topics on tomorrow's test are here: Chapter 6 test reminders
Posted at 08:56 AM | Permalink
HW due 11/16:
4.2, 4.3, 4.4, 4.5, 4.6, 4.10, 4.12, 4.13, 4.14, 4.41-4.48 (no calculations), 4.50, 4.51, 4.54, 4.57
Please note that there are a wide variety of school activities that will be taking students out of class this week. If you miss class, please make sure that you keep up by reviewing what we have on the blog and completing the homework.
The test will be on November 17th.
Friday, November 11, 2011
Today we learned the steps for transforming exponential data into linear form and then (ta da!!!) used the linear regression formula to model the curve of the original data.
First, we collected and entered the data. If you want to play along, please use these (x,y) pairs:
(0,50), (1,42), (3,29), (5,20), (7,14), (9,10)
Graph them and observe that they look like they follow an exponential decay pattern.
We determine that if the y values have an exponential relationship with the x values, then the ln y values will have a linear relationship with the x values.
Put the ln y values in L3. Graph L3 against L1 and confirm that the data look approximately linear.
Run the LSRL through L1 and L3. Confirm that the LSRL goes through the data.
Now, write down the equation the way the calculator wrote it.
y = 3.9099 - .1802 x
We used the REAL x values to create this equation, so leave the x alone.
We used the LN of y to create this equation, so change y to ln y. And add a hat to make it a predictor.
(ln y)hat = 3.9099 - .1802 x
Unfortunately, we can't enter this equation in this form to graph on our calculators. We have to solve for y.
y hat = e^3.9099 * (e^-.1802)^x, which is roughly equivalent to
y hat = 49.89 * (.8351)^x
This equation makes sense because the theoretical model would be y = 50 * (5/6)^x. This is pretty close!
Now, for the big finish. Graph this curvy equation through the original curvy data (L1, L2).
If you've done it according to this scheme, then your curve should go through the data pretty well.
You would follow up with an analysis of the residuals. . .
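If you want to replay the whole scheme away from the calculator, here is a Python sketch of the same steps. The data are the (x,y) pairs listed above; the list names mirror the calculator lists (with L2 assumed to hold the original y values), and the least_squares helper is my own stand-in for the calculator's LinReg command.

```python
from math import exp, log

def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

L1 = [0, 1, 3, 5, 7, 9]            # x values
L2 = [50, 42, 29, 20, 14, 10]      # original y values
L3 = [log(y) for y in L2]          # ln y, which should look linear against x

a, b = least_squares(L1, L3)       # (ln y)-hat = a + b*x
print(f"(ln y)-hat = {a:.4f} + ({b:.4f}) x")

# Solve for y: y-hat = e^a * (e^b)^x
print(f"y-hat = {exp(a):.2f} * ({exp(b):.4f})^x")
```

The output should agree with the equations above up to rounding.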
For power function models, follow the same procedure, but use the natural logs of BOTH the x and y values. You'll see this in practice on Monday.
__________________________________________________________________________________
An explanation break with a story.
My birthday was Saturday. My brother Rob, whose family is Chinese, wrote his birthday greeting for me on my Facebook wall in Chinese. I wanted to respond in Chinese, but, having no skill in this area, had to use Google Translate to compose my comment. I wrote my phrase in English, ran it through the translator, and then checked the answer by running the Chinese version back into English. It took a few tries before my response in Chinese resembled my English intentions. I posted my response, and my brother has not acknowledged my excellent command of another language!
This is similar to what we're doing with these non-linear data sets. We're recognizing that they are a different language (shape) from what we can work with (linear). We translate the data into our familiar form and make sure that the fit is good. We get our phrase (formula) and translate it back into the unfamiliar (non-linear) form.
谢谢你,我的第二个弟弟
__________________________________________________________________
Monday, November 14, 2011
Today we collected data again, this time related to a power function. Consider the following pairs of (circumference, volume) data for balls:
(20.1, 5/8), (31, 2), (22.6, 13/16), (13.6, 1/6), (11.1, 1/10), (21, 2/3)
Circumferences were measured in cm. Volumes were measured in cups.
Note that a ball of 0 circumference would have no volume. Then, take the natural log of both the circumference and the volume to straighten the data. Work with the transformed data: find the LSRL, check the residuals. Transform the LSRL back into a curve of the form y-hat = a * x^b.
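Here is a Python sketch of the power-function version, using the (circumference, volume) pairs above. The least_squares helper is my own stand-in for the calculator's LinReg command, and the variable names are assumptions. It does not check the residuals, which you still need to do.

```python
from math import exp, log

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

circumference_cm = [20.1, 31, 22.6, 13.6, 11.1, 21]
volume_cups = [5 / 8, 2, 13 / 16, 1 / 6, 1 / 10, 2 / 3]

# Power model: take the natural log of BOTH variables to straighten the data.
ln_x = [log(x) for x in circumference_cm]
ln_y = [log(y) for y in volume_cups]

intercept, b = least_squares(ln_x, ln_y)   # (ln y)-hat = intercept + b * ln x
a = exp(intercept)                         # back-transform: y-hat = a * x^b
print(f"y-hat = {a:.6f} * x^{b:.3f}")
```

Since the volume of a ball grows with the cube of its circumference, an exponent b close to 3 is a good sign that the measurements and the transformation behaved.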
_______________________
Quick hint for transformations: if the data would hit an axis if the curve were extended, then take the natural log of that variable. Exponentials: ln of y only. Power functions: ln of both x and y.
Be sure to get a start on the HW problems. There are too many to work in just one night.
Wikipedia's article for Archimedes has a nice explanation of the displacement concept described in class, as well as an animated graphic that shows a more likely method for answering the king's question.
_____________________________________________
Wednesday, November 16, 2011
The test is tomorrow. We checked HW today. If you did not have the homework (which was assigned Friday of last week!!!), then you have a lot of work to prepare for tomorrow's test.
Remember to read through section 4.3 about causation, common response, and confounding. Know the difference between these concepts. Realize that a lurking variable can influence both the presumed explanatory variable and the response variable or it can influence just the response variable. Once it is identified, the "lurking" nature is revealed and it becomes an explanatory variable!
You will not be given credit on the test for using expreg or powerreg functions from the calculator. Your curriculum requires you to straighten the data, so the test will be set up for you to demonstrate your skills.
And, speaking of skills, here's a copy of the worksheet from today. You need to be able to read the computer output to create a least squares regression line for the straightened data, then convert the line into the curved form that would match the original data.
Posted at 08:17 AM | Permalink
We're starting our formal pass through Chapter 3 on Friday, continuing on Monday and Wednesday, and testing on Thursday. Since we've already covered this material (August and September), we should be able to master these concepts quickly.
This chapter is all about the explanatory and response variables. Although there may be a relationship between the variables, we cannot determine whether one causes changes in the other without an experiment.
The most common way to depict bivariate data (x,y pairs) is through a scatterplot. When we describe a scatterplot we will include descriptions of the form of the data, the direction, the strength, and any unusual observations. Categorical elements can be added by using special markers or colors in the scatterplot.
Correlation is the measure of a linear association. Positive associations will have positive correlation coefficients. Negative associations will have negative correlation coefficients. The correlation coefficient can be thought of as an average product of the z-scores for the x and y components of all the points (if that is any help). Your calculator will compute the correlation for you if you turn Diagnostics On. Some valuable characteristics of correlation are posted on page 191 of the book. Examples of graphs with a variety of correlation coefficients are posted on page 192. These may be useful if you are a visual learner. Please pay attention to the four cautions on pages 192 and 193.
The least squares regression line is a line of best fit that minimizes the sum of the squared vertical distances between the observed points and the regression line. In statistics we usually use the formula y-hat = a + bx for this line, where y-hat is the predicted value of y for a particular value of x.
The slope, b, is interpreted as the average change in y that we would expect for each additional unit of increase in x. Of course, we would cram as much context as we were provided into the interpretation, for instance:
We expect the sales price of the house to increase by $0.55 for every additional dollar spent on the new kitchen.
And the slope is equal to the correlation coefficient times the std dev of y/the std dev of x.
b = r * sy/sx
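Here is a small Python sketch that ties these formulas together. The five (x, y) pairs are made up for illustration. It computes r from the z-score products (dividing by n - 1), then builds the slope from r, sy, and sx.

```python
from math import sqrt

# Made-up (x, y) data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))   # sample std dev of x
s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))   # sample std dev of y

# Correlation: sum of the products of the z-scores, divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

b = r * s_y / s_x          # slope of the least squares regression line
a = y_bar - b * x_bar      # the line passes through (x-bar, y-bar)
print(f"r = {r:.4f}, y-hat = {a:.3f} + {b:.3f} x")
```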
Extrapolation is the term used when you use the prediction formula with values of x that are outside the reasonable set of values--for instance using a model that predicts a child's height based on age with adult ages.
Residuals tell us how closely the line fits the data AND their pattern tells us whether the linear model is appropriate. Residuals are the difference between the observed value of y and the predicted value of y.
The coefficient of determination, r^2, is the square of the correlation coefficient. It measures the proportion of the variability in the y values (measured around the mean of y) that is eliminated by using the least squares predictions instead of the mean of y to predict a value when x is known. It can be thought of as the effectiveness of the x-value in predicting its y-value.
Section 3.3 is FULL of theoretical and helpful philosophical concepts that will help you get the big picture.
Problems from the book due Wednesday, November 9:
3.5, 3.10, 3.13, 3.17, 3.20, 3.24, 3.29, 3.33, 3.35, 3.37, 3.44, 3.47, 3.55, 3.65, 3.85, 3.86
Please note that most of these problems are odd, so their answers are in the back of the textbook. You can check your work as you go along.
Friday, November 4, 2011
We worked through the complete regression problem from data collection to linreg t-test for predicting the weight of a Fun-size bag of Skittles based on the number of candies within.
Monday, November 7, 2011
We went old school today. We measured the diameters and circumferences of a set of balls to find the relationship between the two variables. Our empirical values of the slope (an estimate for pi) ran from about 2.5 to 3.9. We identified all the steps required for a complete answer to the problem and revisited the linReg T test.
The linear regression t-test tells us how likely our experimental slope is if there is really no relationship between x and y. It looks at the ratio of the slope to the standard error of the slope. If the standard error of the slope is larger than the slope itself, then it is quite likely that there is not really a relationship between the two variables.
Interpreting the p-value of the LinReg T test:
If the p-value is very small (less than 5%), then it is unlikely that we would get a slope as "strong" as what we got if there is not really a relationship between the variables. This represents good evidence that the relationship between x and y is legitimately linear--not just accidental.
If the p-value is not very small (greater than 5%), then we do not have evidence that casts doubt on the null hypothesis of no relationship. We cannot be sure if there is or is not a relationship between the variables.
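To see where the t-statistic comes from, here is a Python sketch that computes the slope, the standard error of the slope, and their ratio. The diameter and circumference measurements are made up for illustration (they are not our class data), and the function name is my own. Python's standard library has no t-distribution, so the p-value still comes from the calculator or a t table with n - 2 degrees of freedom.

```python
from math import sqrt

def slope_t_statistic(xs, ys):
    """Return (slope, standard error of slope, t statistic, degrees of freedom)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual = observed y minus predicted y.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))          # standard deviation of the residuals
    se_b = s / sqrt(sxx)             # standard error of the slope
    return b, se_b, b / se_b, n - 2

# Made-up measurements in cm, for illustration only.
diameters = [6.4, 7.3, 9.7, 12.1, 21.8, 24.0]
circumferences = [20.3, 22.6, 31.0, 37.5, 68.9, 75.1]

b, se_b, t, df = slope_t_statistic(diameters, circumferences)
print(f"slope = {b:.3f}, SE = {se_b:.4f}, t = {t:.1f}, df = {df}")
```

A slope that is many standard errors away from zero gives a large t and a small p-value.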
You will have to perform and interpret the LinReg T-test on Thursday's test.
You will also have to interpret output from a computer program for regression. There was a homework problem that asked you to interpret the output. There are a lot of exam problems that ask you to do the same. Today I handed out three more examples in class.
Here's another example that might help you understand what to look for when interpreting the output.
http://www.jerrydallal.com/LHSP/slrout.htm
We are most interested in the values of the y-intercept and slope, the standard error of the slope, the t-statistic, and the p-value.
If you were given a partially-filled out table, could you find the rest of the numbers?
Wednesday, November 9, 2011
Today we reviewed for the test. A copy of some of our computer output is linked here:
Download EXAMPLES FROM CLASS 11-9
Please review the elements from our first test that applied to linear regression. It would be a shame to miss the same questions AGAIN!
There are some questions that mirror the questions from a recent AP exam near the bottom of the document above. Make sure that you can answer these questions.
Posted at 05:30 PM | Permalink
Please visit the website for the textbook to take a practice quiz before the test on Thursday, 11/3.
http://bcs.whfreeman.com/tps3e/default.asp?s=&n=&i=&v=&o=&ns=0&uid=0&rau=0
October 28, 2011
We began Chapter 2 today with an investigation into standardizing variables. To standardize we find the difference between the observed value and the mean, then we divide by the standard deviation. This gives us a z-score, which tells us how far in standard deviations a point is away from the mean (and the direction). Z-scores below -3 or above 3 are rare. Z-scores between -1.5 and 1.5 are very common.
We looked at the Normal Probability Plot on the calculator. If the graph on the NPP is relatively straight, then the data look like they are approximately Normally distributed. If the NPP is curvy, then we doubt that they are Normal.
Finally, to compare two measures that have different scales, we standardize the values. For instance, is a 32 on the ACT a worse score than a 1000 on the SAT??? After all, 32 is less than 1000! No, we would standardize the values or turn them into percentile scores to compare these two different measures.
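Here is a Python sketch of that comparison. The means and standard deviations below are placeholders I made up so the arithmetic has something to work with; they are NOT official ACT or SAT figures. The percentile step also assumes the scores are approximately Normal.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations the value sits above (+) or below (-) the mean."""
    return (value - mean) / sd

# Placeholder parameters, NOT official figures.
act_z = z_score(32, mean=21, sd=5)
sat_z = z_score(1000, mean=1000, sd=200)

# Convert to percentiles, assuming approximately Normal score distributions.
act_pct = NormalDist().cdf(act_z)
sat_pct = NormalDist().cdf(sat_z)

print(f"ACT 32:   z = {act_z:.2f}, percentile = {act_pct:.1%}")
print(f"SAT 1000: z = {sat_z:.2f}, percentile = {sat_pct:.1%}")
```

Once both scores are on the z scale, the larger z-score is the better performance, no matter how big the raw numbers are.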
HW due Tuesday: 2.3, 2.4, 2.7, 2.12, 2.13, 2.20, 2.29, 2.30, 2.32, 2.34, 2.40, 2.54, 2.55, 2.56
Key concepts: 68-95-99.7 rule (Empirical rule), Chebyshev's inequality, z-score, percentile
HW due Monday: problem 1 from the 2011 exam. It was handed out in class.
Tuesday, November 1, 2011
We had a not-a-quiz today over some of the skills from this chapter. Practice working problems like these and showing your work.
Download Not a quiz - Normal and density skills
This article shows the application of standardization in curving grades.
Download Score Normalization as a Fair Grading Practice
Read through Chapter 2. Look for key concepts and anything we haven't covered in class. Work some of the problems we did not assign. Answers to the odd-numbered problems are in the back of the book.
Download Chapter 2 Textbook Companion
Visit the textbook website to take the online quiz for Chapter 2.
Posted at 06:53 PM | Permalink
Online resources for the textbook-Chapter 1:
http://bcs.whfreeman.com/tps4e/#t_628644
(glossary, online quiz!!!!!)
October 14, 2011
We're starting a formal pass through Chapter 1. Today we looked at qualitative (categorical) and quantitative (numerical) data. We split the numerical data into ordinal, interval, and ratio variables. We also looked at a few graphs. HW due Monday 1.3, 1.6, 1.12, and 1.14. (or was it 1.16?)
For Wednesday the 19th have four recently-published graphs that are examples of a bar graph, a histogram, a cumulative frequency graph, and a scatterplot. Collect enough information so you can explain why that type of graph is appropriate and what the graph shows. Wikipedia graphs, graphs from professors' websites, and graphs from other "encyclopedic" sources are not acceptable. Instead, find articles that discuss data and include one of these types of graphs. Pick topics that are interesting. Follow the news, the economy, sports, medicine, or some other topic that you would be interested in studying.
Monday October 17, 2011
Some students did not do the homework assigned on Friday. If you are one of those, then you have the following homework in addition to the regular homework: 1.3, 1.6, 1.12, 1.14, 1.23, 1.31, 1.32, 1.33, 1.38.
The homework for everybody: 1.27, 1.28, and 1.34
About Minitab's stem and leaf graphs: (Page 47)
The first column represents a cumulative count. It is the count from the top or the bottom of the number of observations including that line. The line containing the median is the line with the count in parentheses.
Today we reviewed bar graphs and histograms, learned how to create relative frequency graphs and cumulative frequency (and cumulative relative frequency) plots.
We used the REAL formula for the mean of a population: Sum of (x * P(x))
We reviewed the REAL formula for the std dev of a population: SQRT(Sum of ((x - mu)^2 * P(x)))
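Here is a Python sketch of both formulas applied to a small probability distribution. The distribution is made up for illustration.

```python
from math import sqrt

# Made-up probability distribution: value x -> P(x).
dist = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

# The probabilities must add up to 1.
assert abs(sum(dist.values()) - 1) < 1e-9

mu = sum(x * p for x, p in dist.items())                       # Sum of (x * P(x))
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))  # SQRT(Sum of ((x - mu)^2 * P(x)))

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
```

For this made-up distribution the mean is 1.7 and the standard deviation is 0.9.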
If you have questions, your first move should be to read the textbook. Please don't waste your time waiting for something to be explained in class when you have an excellent resource in front of you!
Tuesday, October 18, 2011
You did a much better job completing the homework for today. Those of you who did not have homework either of the days will need to complete ALL of the assigned homework AND serve a homework detention next Tuesday afternoon from 3:30 to 4:30.
Some of you still have not taken the test from last Thursday. Your final opportunity will be Thursday afternoon from 3:30 to 4:25 in Room 214. Tests will not be returned until these make-ups are complete.
For Wednesday, remember to bring in your graphs of a bar graph, a histogram, a cumulative frequency plot, and a scatterplot. You will need to write about the graph, so make sure that you know what it means. You must choose graphs that come from REAL articles, not from wikipedia or any other encyclopedia-type documents.
For Monday, October 24, you need to complete the following problems: 1.39, 1.40, 1.43, 1.45, 1.46, 1.50, 1.54, 1.56, 1.57. Some of them are really quick. When it asks you to compute a mean and standard deviation using the formula, you have to show all the work. You cannot just use 1-var statistics.
The textbook is a valuable resource. Use it frequently. Read the sections prior to the questions. There are excellent examples that you may find useful. Bring it to class every day through Monday.
Tuesday's lesson covered calculating standard deviations, finding the mean and standard deviation of transformed variables (like finding the mean and standard deviation of y when you know the mean and standard deviation of x and the formula for y in terms of x), and the effect of outliers on the mean. We have, therefore, completed Chapter 1.
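The rules for a linear transformation y = a + b*x can be checked with a quick Python sketch. The data and the constants a = 32 and b = 1.8 are assumptions for illustration.

```python
from statistics import mean, stdev

xs = [2, 4, 4, 5, 7, 8]        # made-up data
a, b = 32, 1.8                 # assumed transformation: y = a + b*x
ys = [a + b * x for x in xs]

# Rules: the mean is shifted and scaled; the std dev is only scaled by |b|.
mean_y_rule = a + b * mean(xs)
sd_y_rule = abs(b) * stdev(xs)

print(f"mean of y: direct = {mean(ys):.4f}, rule = {mean_y_rule:.4f}")
print(f"sd of y:   direct = {stdev(ys):.4f}, rule = {sd_y_rule:.4f}")
```

Adding the constant a moves every value by the same amount, so it changes the mean but not the spread.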
I will plan to be at CiCi's on Sunday.
___________________________________________________________
Tuesday, October 25, 2011
Test on Chapter 1 is Thursday, October 27th. You've been preparing in class for this test for nearly two weeks. It's time to prepare outside of class.
Well, a lot of people did not do the homework assigned before I left--only 9 problems! We went over the linear transformations portion in class on Monday. No homework was assigned on Monday so everybody could get caught up. Based on many of the quiz grades from today, it looks like some people are waiting until Wednesday night to prepare for Thursday's test. Can you imagine?
The quiz today touched on the mechanics of summary statistics: calculating a sample standard deviation, linear transformations of random variables, and creating a modified box and whisker plot. I'll give them back on Wednesday. It was out of three points in the minor assessment category. People with excused absences will have an "exempt" for this quiz, but unexcused absences will result in a zero. Hey, the first period band kids were even here for the quiz after their late night at exhibition. You really need to be in class every day, folks.
Class today centered on interpretation of graphs. We looked at the problems from 2004 and 2008's exams, part of the assignment from last Thursday. The problems required students to create boxplots, compare and contrast them, interpret a linear transformation of one of the variables in a boxplot, and estimate where the mean would fall based on the position of the median and the shape of the distribution.
Friday's assignment from last week was a study guide to complete using the general topics from the chapter. By now you all should have completed your first pass through the terms and concepts and should be starting another. You should be re-doing the homeworks assigned and working problems from the chapter review.
All of these assignments and activities lead up to the expectations of the test. On the test you will need to create graphs, compute summary statistics, transform random variables, and perform other skills-based tasks. You will need to interpret graphs and summary statistics. The vocabulary of the chapter will be important. Use your words wisely. You will be asked to synthesize ideas. You are expected to use technology appropriately and to explain your work. You will show your work.
Thursday, October 27
The test was today. If you missed it, you need to make it up Tuesday morning at 7:15 in room 214.
HW: bring in the count of Facebook friends that you have. Next up: Measures of relative position and the Normal distribution.
Posted at 07:45 PM | Permalink
Wednesday, September 21
We're starting a unit on observational studies and experimental design. This will align with Chapter 5 in the text.
TEST October 13.
The concepts for today's lesson were
sampling
experiments
observational studies
sample / population
sampling method
probability sampling methods
Simple Random Sample
Stratified Random Sample
Cluster Sample
Elements of good experimental design: randomness, replication, control
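Here is a Python sketch of the three probability sampling methods listed above. The population of 40 students, with their grades and homerooms, is made up for illustration, and the seed is fixed only so the example is repeatable.

```python
import random

rng = random.Random(2011)  # fixed seed so the example is repeatable

# Made-up population: 40 students, 10 per grade, 8 per homeroom.
population = [
    {"id": i, "grade": 9 + i % 4, "homeroom": i % 5}
    for i in range(40)
]

# Simple Random Sample: every group of 8 students is equally likely.
srs = rng.sample(population, 8)

# Stratified Random Sample: an SRS of 2 students from each grade (the strata).
stratified = []
for grade in sorted({s["grade"] for s in population}):
    stratum = [s for s in population if s["grade"] == grade]
    stratified.extend(rng.sample(stratum, 2))

# Cluster Sample: randomly pick 2 whole homerooms and take everyone in them.
chosen_rooms = rng.sample(range(5), 2)
cluster = [s for s in population if s["homeroom"] in chosen_rooms]

print(len(srs), len(stratified), len(cluster))
```

Notice the difference: stratified sampling takes some students from every group, while cluster sampling takes every student from some groups.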
HW: problems 1-5 on the paper handed out in class today
Preview topic for Thursday here.
Need something to do since there's no football game this week? Moneyball hits theaters this weekend (PG-13 language). I have recommended this book for years. It is the story of how the Oakland A's used statistics instead of the usual scouting system to create a championship team.
Interesting article from Clearwater today about a highly unlikely event. (Warning--health related topic)
----------------------------------------------------------------------------------------------------------
Thursday, September 22
We looked at randomness and replication again today. Students picked up a handout with a discussion of the need for both. We also looked over the homework. Please be prepared with the homework every day. Some students did not do the homework and received 0/2 points today for preparedness.
HW: use the table of random digits provided and the example on the back side of that page to work problems 11-13.
Two online activities related to today's topics are http://www.nsa.gov/academia/_files/collected_learning/high_school/statistics/random_number_table.pdf
and http://www.mdk12.org/share/clgtoolkit/lessonplans/RandomDigitTables.pdf
You don't have to do all the sections in these activities, but the more you understand, the better prepared you will be.
--------------------------------------------------------------------------------------------------------
Friday, September 23
We looked at test scores today. Many students did very well. If you were one of those who didn't do well, let's make some plans for improvement. Have you come in before school for help? Have you done all your homework? Have you taken notes every day in class? What can we do to help you "get it"?
We also investigated blocking. We block an experiment when we believe that there is an existing characteristic of some of the experimental units that will influence the response significantly, possibly more significantly than the treatment itself!
Example: imagine a pool of seniors and second graders. We're going to divide them up to train them using one of two different reading programs. Complete randomization would yield results that had really high scores and really low scores because the people had such differing skills going into the experiment. It would be difficult to tell which program was better.
If we block on grade, then we are breaking the experiment into two smaller experiments. The range of results for the second graders will be narrower, so it should be easier to differentiate between the results of the two programs. This improves the "power" of the test. Similarly, the scores of the seniors should be closer together than the original combined set of scores, so the same should hold for them.
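To see why blocking helps, here is a small simulation sketch in Python. The score distributions are invented for illustration (second graders centered near 40, seniors near 85, each with a standard deviation of 5); they are not data from class.

```python
import random
import statistics

rng = random.Random(23)  # arbitrary seed so the run is repeatable

# Hypothetical reading scores for the two kinds of experimental units.
second_graders = [rng.gauss(40, 5) for _ in range(50)]
seniors = [rng.gauss(85, 5) for _ in range(50)]
pooled = second_graders + seniors  # what complete randomization has to work with

spread_pooled = statistics.stdev(pooled)
spread_second = statistics.stdev(second_graders)
spread_seniors = statistics.stdev(seniors)

print("SD, everyone mixed together:", round(spread_pooled, 1))
print("SD, second graders only:    ", round(spread_second, 1))
print("SD, seniors only:           ", round(spread_seniors, 1))
```

With the groups mixed together, the spread is dominated by the gap between grades, so a difference of a few points between the programs would be hard to spot. Inside each block the spread is much smaller.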
HW: problems 3 and 4 from the 2001 AP Exam. Email me if you need a copy.
__________________________________________________________________________________
September 26 & 27, 2011
We've been splitting our time between simulations and blocking. You should have completed problems 13-15 on the page of problems, plus the fish tank problem from the 1997 exam. Also, if southpaws (left-handed people) make up 10% of the population, use simulation (your calculator or the table of random digits, the TORD) to estimate how many people you would have to meet UNTIL you met a left-handed person. Repeat this 19 more times, for 20 trials in all. Formulate a reasonable theoretical estimate for this value.
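If you want to check your calculator or TORD work on the southpaw problem, here is one way the simulation could be sketched in Python. It mimics the table of random digits: each person is one random digit, and I am assuming the digit 0 stands for a left-hander (any single digit would do). The seed is arbitrary. The theoretical value shown is the standard mean of a geometric waiting time, 1/p.

```python
import random

def meetings_until_lefty(rng):
    """Count people met up to and including the first left-hander.

    Each person is one random digit 0-9; the digit 0 (a 10% chance)
    is assumed to represent a left-handed person.
    """
    count = 0
    while True:
        count += 1
        if rng.randrange(10) == 0:
            return count

rng = random.Random(10)  # arbitrary seed so the run is repeatable
trials = [meetings_until_lefty(rng) for _ in range(20)]  # 1 trial + 19 more
average = sum(trials) / len(trials)
theoretical = 1 / 0.10  # mean of a geometric distribution with p = 0.10

print("people met in each trial:", trials)
print("simulated average:", average)
print("theoretical mean:", theoretical)
```

Twenty trials is not many, so do not be surprised if your simulated average lands a few people away from the theoretical value.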
September 28 & 29, 2011
Thank goodness for the old books! We used the older version of the textbook in pairs to practice simulation. The problems included 54-59 and 78-82. If you need a copy of 78-82, please swing by the classroom and pick up a copy from the pocket on the door.
You will recall that the focus for this week has been split between experimental design and simulation. The two topics intersect in the descriptions of how to randomize.
Experimental design: must include a treatment applied to experimental units. Good experimental design incorporates control, randomness, and replication. Completely randomized designs have no blocking. Blocking MAY be implemented in order to reduce the variability in the results that stems from a pre-existing condition. After blocking into two (or more) groups that are DIFFERENT from each other but homogeneous within the group, the units within the blocks are randomly allocated to treatment groups.
2 blocks x 3 treatments means 6 different treatment groups.
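Here is a sketch in Python of how the random allocation inside blocks might be carried out. The unit labels, the treatment names T1 to T3, and the seed are hypothetical; the point is only that the shuffling happens separately within each block.

```python
import random

def randomized_block_design(blocks, treatments, rng):
    """Within each block, shuffle the units and deal them out to treatments."""
    groups = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomization happens inside the block only
        for position, unit in enumerate(shuffled):
            treatment = treatments[position % len(treatments)]
            groups.setdefault((block_name, treatment), []).append(unit)
    return groups

rng = random.Random(5)  # arbitrary seed so the run is repeatable
blocks = {
    "second graders": [f"G2-{i}" for i in range(1, 7)],   # made-up unit labels
    "seniors": [f"G12-{i}" for i in range(1, 7)],
}
treatments = ["T1", "T2", "T3"]

groups = randomized_block_design(blocks, treatments, rng)
for (block_name, treatment), members in sorted(groups.items()):
    print(block_name, treatment, members)
```

Two blocks and three treatments give six groups, and every unit lands in exactly one of them.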
The videos about mathemagic that we saw in class can be found online at http://www.math.hmc.edu/~benjamin/mathemagics/video.html
Be prepared for a quiz over experimental design at any time.
___________________________________________________________________________________
Tuesday, October 4, 2011
We got books today and covered them to protect them. I DO expect the books to remain covered, and that you will bring them to class every day for the next three weeks. We have a lot of catching up to do.
In the last two days we investigated the biases associated with survey design. Specifically, we looked at response and wording bias (both can lead to answers that do not represent the truths for the people surveyed--a validity problem), non-response bias (stemming from the chosen people's refusal to contribute, either actively or passively), undercoverage (when your sampling frame does not cover your population), and the garbage-in, garbage-out twins: voluntary response bias and convenience samples.
We took a moment to discuss validity and reliability. Like the precision and accuracy concepts from science class, these terms are not synonyms, but they are commonly addressed at the same time. Look online for more information about how these concerns are addressed in practice.
HW from the book :) due Wednesday: problems 5.33-5.36. The three pages prior to the questions have great explanations of the terms in the problems. Use them!
_____________________________________________
Friday and Monday, October 7 and 10
We considered two problems from previous AP exams, the shampoo problem, and the tai chi problem. Both of these questions deal with experimental design. You were supposed to write up good answers in the journals. I hope that you used your time wisely.
Today we looked at completely randomized designs and blocked designs for experiments and contrasted the methods with stratified sampling for surveys. Both blocking and stratifying break a large group into partitions based on a previously-existing condition because the responses may vary systematically between these groups. In experimental design, the purpose is to reduce the variability in the responses so you can differentiate between the treatment groups' responses. For survey design, you sample from each of the partitions (strata) and throw the selected participants into one group that should mimic the responses of the entire population. We stratify to make sure that each portion of the population is adequately represented in the sample.
Multistage sampling requires a series of random selections--for instance randomly picking a class period, then randomly picking 10 classes held that period, then randomly picking 10 kids from each of those 10 classes.
Systematic sampling starts with one random selection, then uses a fixed rule (for example, every 10th name on the list after the random starting point) to select the rest of the participants.
Simple random sampling is distinguished from these other sampling methods because every possible sample of size n has an equal chance of being selected. In the other methods there are possible combinations of participants that have NO chance of being selected.
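The difference between an SRS and a systematic sample can be counted directly. In this Python sketch the population of 12 numbered people and the "every 4th person" rule are made up for illustration.

```python
import math
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)
    return population[start::k]

population = list(range(1, 13))  # hypothetical people numbered 1 to 12
k = 4                            # every 4th person gives samples of size 3

# Every sample the systematic method could ever produce:
possible_systematic = [tuple(population[start::k]) for start in range(k)]

# Number of different samples of size 3 that an SRS could produce:
possible_srs = math.comb(len(population), 3)

rng = random.Random(7)  # arbitrary seed so the run is repeatable
sample = systematic_sample(population, k, rng)

print("systematic samples possible:", possible_systematic)
print("SRS samples possible:", possible_srs)
print("this run's systematic sample:", sample)
```

The systematic method can only ever give 4 different samples, so a group such as persons 1, 2, and 3 has no chance of being chosen together. An SRS of size 3 has 220 possible samples, each equally likely.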
Brush up on your methods for using a TORD to select a sample or randomly allocate experimental units to groups.
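As a reminder of the usual TORD procedure, here is a sketch in Python. I am assuming a population of 30 units labeled 01 to 30 and a sample of 5. The row of digits is made up for this example and is not from the table handed out in class. The steps are: read the digits two at a time, skip any label outside 01-30, skip repeats, and stop when the sample is full.

```python
def sample_from_digit_table(digits, population_size, n, width=2):
    """Select n distinct labels by reading a row of random digits.

    Labels run from 1 to population_size; groups of `width` digits that are
    out of range (including 00) or already chosen are skipped.
    """
    stream = "".join(ch for ch in digits if ch.isdigit())  # ignore the spaces
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        label = int(stream[i:i + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Made-up row of digits, for illustration only.
row = "07119 97336 71048 08178 77233 13916 47564 81056"
selected = sample_from_digit_table(row, population_size=30, n=5)
print(selected)  # [7, 11, 4, 23, 16]
```

Reading the row in pairs gives 07, 11, 99, 73, 36, 71, 04, 80, 81, 78, 77, 23, 31, 39, 16. Only 07, 11, 04, 23, and 16 are valid labels, so those five units make up the sample. To allocate units to treatment groups instead, the first labels chosen could go to one group and the rest to the other.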
HW: problems 5.15-5.20 AND at least 4 of problems 5.27-5.32. If you are weak at these problems, then you need to work MORE of them, not fewer of them!
You will have a test on Thursday over Chapter 5.
Posted at 02:56 PM | Permalink