OpenIntro Statistics exercise solutions

\(H_A : \mu_{2009} \ne \mu_{2004}\). In practice, we'd ask for the data to check the skew (which is not provided), but here we will move forward under the assumption that the skew is not extreme (there is some leeway in the skew for such large samples). a 90% confidence level, and we can use the point estimate \(\hat{p} = 0.52\) in the formula for SE. 6.29 (a) College grads: 23.7%. (b) The residuals will show a fan shape, with higher variability for smaller x. The sample size is at least 30. See the reasoning of 6.1(b). (b) Population: all 18-69 year olds diagnosed and currently treated for asthma. Plug in \(\bar{x}\), \(\bar{y}\), and \(b_1\), and solve for \(b_0\): 51. Z = 3.09. (e) \(\hat{y} = 40 \times (-0.537) + 55.34 = 33.86\), \(e = 40 - \hat{y} = 6.14\). Slope: For each additional centimeter in height, the model predicts the average weight to be 1.0176 additional kilograms (about 2.2 pounds). (d) \(0.3 \times 0.3 = 0.09\). There isn't a clear reason why this distribution would be skewed, and since the normal quantile plot looks reasonable, we can mark this condition as reasonably satisfied. (b) \(b_0 = 55.34\), \(b_1 = -0.537\). \(H_A : p_C \ne p_T\). 8.11 Nearly normal residuals: The normal probability plot shows a nearly normal distribution of the residuals; however, there are some minor irregularities at the tails. 2.19 (a) 162/248 = 0.65. 7.1 (a) The residual plot will show randomly distributed residuals around 0. (b) True. (b) No. 2.5 (a) \(0.5^{10} = 0.00098\). 4.45 (a) \(H_0 : p_{2009} = p_{2004}\). 4.19 (a) This claim does not seem plausible since 3 hours (180 minutes) is not in the interval. 2.33 (a) 13. (c) Second distribution has higher median. (b) The professor suspects students in a given section may have similar feelings about the course. 3.41 (a) Geometric: \((5/6)^4 \times (1/6) = 0.0804\). Only total_length has a positive association with a possum being from Victoria.
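The prediction and residual in parts (b) and (e) above can be checked with a few lines of Python. This is only a minimal sketch: the function name `predict` and the variable names are illustrative, while the coefficients \(b_0 = 55.34\), \(b_1 = -0.537\) and the observation at x = 40 with observed value 40 are taken from the solutions.

```python
# Coefficients of the fitted line as stated in part (b).
b0 = 55.34
b1 = -0.537


def predict(x):
    """Return the predicted response b0 + b1 * x (illustrative helper)."""
    return b0 + b1 * x


# Part (e): the observation has x = 40 and an observed response of 40.
x_obs, y_obs = 40, 40
y_hat = predict(x_obs)      # predicted value
residual = y_obs - y_hat    # residual = observed - predicted

print(round(y_hat, 2), round(residual, 2))  # 33.86 6.14
```

A positive residual means the observed value lies above the fitted line, so the model underestimates this observation.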
3.7 (a) \(Z = 1.2 \rightarrow 0.1151\). The data also indicate that fewer college grads say they "do not know" than noncollege grads (i.e. The data do not provide strong evidence of temperature warming in the continental US. (b) \(0.5^{10} = 0.00098\). (c) \(\mu = 1/0.471 = 2.12\), \(\sigma = \sqrt{2.38} = 1.54\). 5.37 \(H_0: \mu_1 = \mu_2 = \dots = \mu_6\). \(H_0 : \mu_{diff} = 0\). If using a normal model with a 0.5 correction, the probability would be calculated as 0.0017. 2.31 For 1 leggings (L) and 2 jeans (J), there are three possible orderings: LJJ, JLJ, and JJL. (b) There is a 19% difference between the pain reduction rates in the two groups. 7.19 (a) First calculate the slope: \(b_1 = R \times \frac{s_y}{s_x} = 0.636 \times \frac{113}{99} = 0.726\). It may also be possible to use a more advanced sampling method, such as stratified sampling, though the required analysis is beyond the scope of this course, and such a sampling method may be difficult in this context. (c-i) \(H_0\): Independence model. (c) Negative binomial: 0.0193. Subscript T means truck drivers. We are 95% confident that the proportion of Democrats who support the plan is 23% to 33% higher than the proportion of Independents who do. (b-ii) \(E_{row_2, col_2} = \frac{(\text{row 2 total}) \times (\text{col 2 total})}{\text{table total}} = \frac{150 \times 230}{300} = 115\). (d) \(Z = 1.60 \rightarrow\) p-value = 0.0548. There are potential outliers on the higher end. (c) The answers are very close because only the units were changed. Even when a patient tests positive for lupus, there is only a 7.14% chance that he actually has lupus. The sample size is at least 30. Furthermore, the data indicate that the direction of that difference is that accidents are lower on Friday the 6th relative to Friday the 13th. \(H_A\): Alternate model.
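The slope calculation in 7.19 (a) above uses only the correlation and the two standard deviations. A minimal Python sketch of that step follows; the function name `slope_from_summary` is hypothetical, and the inputs R = 0.636, \(s_x = 99\), \(s_y = 113\) are the values quoted in the solution.

```python
def slope_from_summary(r, s_x, s_y):
    """Least-squares slope from summary statistics: b1 = R * s_y / s_x."""
    return r * s_y / s_x


# Values quoted in 7.19 (a).
b1 = slope_from_summary(r=0.636, s_x=99, s_y=113)

print(round(b1, 3))  # 0.726
```

Because the slope scales with \(s_y / s_x\), rescaling either variable changes the slope even though the correlation R stays the same.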
(d) Even though the population distribution is not normal, the conditions for inference are reasonably satisfied, with the possible exception of skew. The data provide strong evidence that the true proportion of those who delete their spam email once a month or less frequently was higher in 2004 than in 2009. (c) Males are taller on average and they drive faster. We are 95% confident that the average body fat percentage in men is 11.32% to 10.88% lower than the average body fat percentage in women. 4.21 Independence: The sample is presumably a simple random sample, though we should verify that is the case. In practice, we would ask to see the data to check this condition, but here we will move forward under the assumption that it is not strongly skewed. The skew is strong, but the sample is very large so this is not a concern. However, since the sample sizes are extremely large, even extreme skew is acceptable. (c) Construct Z scores for the values from part (b) but using the supposed true distribution (i.e. 6.51 The subscript pr corresponds to provocative and con to conservative. (b) 6.94%. (b) Binomial: 0.0322. (Note: we would generally use a computer to perform these simulations.) (d) Invalid. (d) \(0.0024 + 0.0284 + 0.1323 = 0.1631\). (c) Expected values are the same, but the SDs differ. 4.35 The centers are the same in each plot, and each data set is from a nearly normal distribution (see Section 4.2.6), though the histograms may not look very normal since each represents only 100 data points. 8.13 (a) There are a few potential outliers, e.g. (c) The change would have shifted the confidence interval by 1 pound, yielding CI = (0.109, 5.891), which does not include 0. The data do not provide strong evidence that the rates of sleep deprivation are different for non-transportation workers and truck drivers. \(1.65 \sqrt{\frac{0.52(1 - 0.52)}{n}} \le 0.01\).
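The sample-size inequality at the end of the paragraph above can be solved for \(n\) by squaring both sides and rearranging to \(n \ge z^{*2} \, \hat{p}(1 - \hat{p}) / m^2\). The following Python sketch carries out that arithmetic; the function name `min_sample_size` is hypothetical, and the inputs (\(z^* = 1.65\), \(\hat{p} = 0.52\), margin of error 0.01) are the ones that appear in the inequality.

```python
import math


def min_sample_size(z_star, p_hat, margin):
    """Smallest integer n with z_star * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    n_exact = z_star ** 2 * p_hat * (1 - p_hat) / margin ** 2
    return math.ceil(n_exact)  # round up so the margin is not exceeded


n = min_sample_size(z_star=1.65, p_hat=0.52, margin=0.01)
print(n)  # 6796
```

The result is rounded up rather than to the nearest integer, because rounding down would leave the margin of error slightly above the target.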
(b) Let \(p_{CG}\) and \(p_{NCG}\) represent the proportion of college graduates and noncollege graduates who responded "do not know". (b) 0.200. There is one slightly distant observation on the lower end, but it is not extreme. (b) df = 21 - 1 = 20, \(t^*_{20} = 2.53\) (column with two tails of 0.02, row with df = 20). (d) Yes, since the hypothesis of no difference was not rejected in part (c). \(H_A : \mu > 15\) (The average amount of company time each employee spends not working is greater than 15 minutes for March Madness.). (e) 1 - 0.47 = 0.53. (c) Changing units doesn't affect correlation: R = 0.636. (d) \(\widehat{\text{travel time}} = 51 + 0.726 \times \text{distance} = 51 + 0.726 \times 103 \approx 126\) minutes. 3.25 (a) \(0.875^2 \times 0.125 = 0.096\). 7.15 In each part, we may write the husband ages as a linear function of the wife ages: (a) \(age_H = age_W + 3\); (b) \(age_H = age_W - 2\); and (c) \(age_H = age_W/2\). Since the p-value is low, we reject \(H_0\). 1.19 (a) Experiment, as the treatment was assigned to each patient. The students are not a random sample. Variance: 8.95. 2.3 (a) 10 tosses. (b) True. The p-value for the two-sided alternative hypothesis (\(\beta_1 \ne 0\)) is incredibly small, so the p-value for the one-sided hypothesis will be even smaller. \(H_A : \mu \ne 130\). (c) We can predict spending for a given number of tourists using a regression line. However, it would be more appropriate to use the point estimate of the sample. (b-i) 0.21. So the three independent games are lower risk, but in this context it just means we are likely to lose a more stable amount since the expected value is still negative. 4.33 (a) The distribution is unimodal and strongly right skewed with a median between 5 and 10 years old. Exercise 8.44.
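The product in 3.25 (a) above, \(0.875^2 \times 0.125\), has the form of a geometric probability: two failures followed by a first success on the third trial. A small Python sketch of that pattern is given below; the function name `geometric_pmf` is hypothetical, and reading 0.125 as the per-trial success probability is an assumption based on the form of the expression.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1 - p) ** (k - 1) * p


# Assumed reading of 3.25 (a): success probability 0.125, first success on trial 3.
prob = geometric_pmf(k=3, p=0.125)
print(round(prob, 3))  # 0.096
```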
Additionally, there are only 5 interventions under the provocative scenario, so the success-failure condition does not hold. Non-smoker: \(123.05 - 8.94 \times 0 = 123.05\) ounces. Since the p-value is very small, we reject \(H_0\). This would suggest that rosiglitazone increases the risk of such problems. 7.27 (a) There is a negative, moderate-to-strong, somewhat linear relationship between percent of families who own their home and the percent of the population living in urban areas in 2010.