The children at two large schools, \(P\) and \(Q\), are all given the same puzzle to solve. A random sample of size 10 is taken from the children at school \(P\). Their individual times to complete the puzzle give a sample mean of 9.12 minutes and an unbiased variance estimate of 2.16 minutes\(^{2}\). A random sample of size 12 is taken from the children at school \(Q\). Their individual times, \(x\) minutes, to complete the puzzle are summarised by
\(\sum x=99.6 \quad \sum(x-\bar{x})^{2}=21.5\)
where \(\bar{x}\) is the sample mean. Times to complete the puzzle are assumed to be normally distributed with the same population variance.
Test at the \(5 \%\) significance level whether the population mean time taken to complete the puzzle by children at school \(P\) is greater than the population mean time taken to complete the puzzle by children at school \(Q\).
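The test statistic for this question can be checked with a short calculation. The sketch below assumes the usual pooled-variance two-sample \(t\)-test with \(H_0: \mu_P = \mu_Q\) against \(H_1: \mu_P > \mu_Q\); the variable names are illustrative, and the critical value is an assumed table value for 20 degrees of freedom rather than something computed here.

```python
import math

# School P: sample size, sample mean, unbiased variance estimate (from the question)
n_p, mean_p, s2_p = 10, 9.12, 2.16
# School Q: sample size, sum of times, sum of squared deviations (from the question)
n_q, sum_q, ss_q = 12, 99.6, 21.5

mean_q = sum_q / n_q

# Pooled estimate of the common population variance
pooled_var = ((n_p - 1) * s2_p + ss_q) / (n_p + n_q - 2)

# Test statistic for H0: mu_P = mu_Q against H1: mu_P > mu_Q
t_stat = (mean_p - mean_q) / math.sqrt(pooled_var * (1 / n_p + 1 / n_q))

# Assumed table value: upper 5% point of the t-distribution with 20 degrees of freedom
T_CRIT = 1.725
reject_h0 = t_stat > T_CRIT

print(f"pooled variance = {pooled_var:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic falls below the critical value, so the sample would not provide sufficient evidence that the mean time at school \(P\) is greater.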
An inspector is checking the lengths of metal rods produced by two machines, \(X\) and \(Y\). These rods should be of the same length, but the inspector suspects that those made by machine \(X\) are shorter, on average, than those made by machine \(Y\). The inspector chooses a random sample of 80 rods made by machine \(X\) and a random sample of 60 rods made by machine \(Y\). The lengths of these rods are \(x \mathrm{~cm}\) and \(y \mathrm{~cm}\) respectively. Her results are summarised as follows.
\(\sum x=164.0 \quad \sum x^{2}=338.1 \quad \sum y=124.8 \quad \sum y^{2}=261.1\)
(a) Test at the \(10\%\) significance level whether the data supports the inspector's suspicion.
(b) Give a reason why it is not necessary to make any assumption about the distributions of the lengths of the rods.
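With samples this large, part (a) is typically handled with a two-sample \(z\)-test using unbiased variance estimates. The following is a minimal sketch under that assumption; the helper `mean_and_var` is a hypothetical name introduced for illustration, and the critical value is an assumed normal-table value.

```python
import math

def mean_and_var(n, total, total_sq):
    """Return the sample mean and unbiased variance estimate from summary sums."""
    mean = total / n
    var = (total_sq - total**2 / n) / (n - 1)
    return mean, var

# Summary data from the question
n_x, n_y = 80, 60
mean_x, var_x = mean_and_var(n_x, 164.0, 338.1)
mean_y, var_y = mean_and_var(n_y, 124.8, 261.1)

# Test statistic for H0: mu_X = mu_Y against H1: mu_X < mu_Y
z_stat = (mean_x - mean_y) / math.sqrt(var_x / n_x + var_y / n_y)

# Assumed table value: lower 10% point of the standard normal distribution
Z_CRIT = -1.282
reject_h0 = z_stat < Z_CRIT

print(f"z = {z_stat:.3f}, reject H0: {reject_h0}")
```

Because the statistic is not below the critical value, this calculation would not support the inspector's suspicion at the 10% level.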
A company has two machines, \(A\) and \(B\), which independently fill small bottles with a liquid. The volumes of liquid per bottle, in suitable units, filled by machines \(A\) and \(B\) are denoted by \(x\) and \(y\) respectively. A scientist at the company takes a random sample of 40 bottles filled by machine \(A\) and a random sample of 50 bottles filled by machine \(B\). The results are summarised as follows.
\(\sum x=1120 \quad \sum x^{2}=31400 \quad \sum y=1370 \quad \sum y^{2}=37600\)
The population means of the volumes of liquid in the bottles filled by machines \(A\) and \(B\) are denoted by \(\mu_{A}\) and \(\mu_{B}\).
(a) Test at the 2% significance level whether there is any difference between \(\mu_{A}\) and \(\mu_{B}\).
(b) Find the set of values of \(\alpha\) for which there would be evidence at the \(\alpha \%\) significance level that \(\mu_{A}-\mu_{B}\) is greater than 0.25.
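Both parts rest on the same standard error, so they can be sketched together. The code below assumes large-sample \(z\)-tests with unbiased variance estimates; the helper `normal_cdf` is a hypothetical name built on `math.erf`, and the two-tailed critical value is an assumed table value.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Summary data from the question
n_a, sum_a, sum_sq_a = 40, 1120, 31400
n_b, sum_b, sum_sq_b = 50, 1370, 37600

mean_a, mean_b = sum_a / n_a, sum_b / n_b
var_a = (sum_sq_a - sum_a**2 / n_a) / (n_a - 1)
var_b = (sum_sq_b - sum_b**2 / n_b) / (n_b - 1)
std_err = math.sqrt(var_a / n_a + var_b / n_b)

# Part (a): two-tailed test of H0: mu_A = mu_B at the 2% level
z_diff = (mean_a - mean_b) / std_err
Z_CRIT_TWO_TAILED = 2.326  # assumed table value for 1% in each tail
reject_equal_means = abs(z_diff) > Z_CRIT_TWO_TAILED

# Part (b): one-tailed test of H0: mu_A - mu_B = 0.25 against H1: mu_A - mu_B > 0.25
z_shift = (mean_a - mean_b - 0.25) / std_err
alpha_min = 100 * (1 - normal_cdf(z_shift))  # smallest significance level, in %

print(f"(a) z = {z_diff:.3f}, reject H0: {reject_equal_means}")
print(f"(b) z = {z_shift:.3f}, evidence for alpha above about {alpha_min:.2f}")
```

Under these assumptions, part (b) amounts to finding the one-tailed \(p\)-value and reporting all \(\alpha\) at or above it.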
A company manufactures copper pipes. The pipes are produced by two different machines, \(A\) and \(B\). An inspector claims that the mean diameter of the pipes produced by machine \(A\) is greater than the mean diameter of the pipes produced by machine \(B\). He takes a random sample of 12 pipes produced by machine \(A\) and measures their diameters, \(x\) cm. His results are summarised as follows.
\(\sum x=6.24 \quad \sum x^2=3.26\)
He also takes a random sample of 10 pipes produced by machine \(B\) and measures their diameters in cm. His results are as follows.
| 0.48 | 0.53 | 0.47 | 0.54 | 0.54 | 0.55 | 0.46 | 0.55 | 0.50 | 0.48 |
The diameters of the pipes produced by each machine are assumed to be normally distributed with equal population variances.
Test at the 2.5% significance level whether the data supports the inspector's claim.
A scientist is investigating the masses of birds of a certain species in country \(X\) and country \(Y\). She takes a random sample of 50 birds from country \(X\) and a random sample of 80 birds from country \(Y\). She records their masses in kg, \(x\) and \(y\), respectively. Her results are summarised as follows.
\(\sum x=75.5 \quad \sum x^2=115.2 \quad \sum y=116.8 \quad \sum y^2=172.6\)
The population mean masses of these birds in countries \(X\) and \(Y\) are \(\mu_x\) kg and \(\mu_y\) kg respectively.
Test, at the 5% significance level, the null hypothesis \(\mu_x=\mu_y\) against the alternative hypothesis \(\mu_x\gt \mu_y\). State your conclusion in the context of the question.
A scientist is investigating the lengths of the leaves of birch trees in different regions. He takes a random sample of 50 leaves from birch trees in region \(A\) and a random sample of 60 leaves from birch trees in region \(B\). He records their lengths in cm, \(x\) and \(y\), respectively. His results are summarised as follows.
\(\sum x=282 \quad \sum x^{2}=1596 \quad \sum y=328 \quad \sum y^{2}=1808\)
The population mean lengths of leaves from birch trees in regions \(A\) and \(B\) are \(\mu_{A} \mathrm{~cm}\) and \(\mu_{B} \mathrm{~cm}\) respectively.
Carry out a test at the \(5 \%\) significance level to test the null hypothesis \(\mu_{A}=\mu_{B}\) against the alternative hypothesis \(\mu_{A} \neq \mu_{B}\).
A scientist is investigating the masses of a particular type of fish found in lakes \(A\) and \(B\). He chooses a random sample of 10 fish of this type from lake \(A\) and records their masses, \(x \mathrm{~kg}\), as follows.
| 2.1 | 1.8 | 0.9 | 3.0 | 2.4 | 2.6 | 1.8 | 2.2 | 1.9 | 2.5 |
The scientist also chooses a random sample of 12 fish of this type from lake \(B\), but he only has a summary of their masses, \(y \mathrm{~kg}\), as follows.
\(\sum y=24.48 \quad \sum y^{2}=53.75\)
Test at the \(10 \%\) significance level whether the mean mass of fish of this type in lake \(A\) is greater than the mean mass of fish of this type in lake \(B\). You should state any assumptions that you need to make for the test to be valid.
A company has two different machines, \(X\) and \(Y\), each of which fills empty cups with coffee. The manager is investigating the volumes of coffee, \(x\) and \(y\), measured in appropriate units, in the cups filled by machines \(X\) and \(Y\) respectively. She chooses a random sample of 50 cups filled by machine \(X\) and a random sample of 40 cups filled by machine \(Y\). The volumes are summarised as follows.
\(\sum x=15.2 \quad \sum x^{2}=5.1 \quad \sum y=13.4 \quad \sum y^{2}=4.8\)
The manager claims that there is no difference between the mean volume of coffee in cups filled by machine \(X\) and the mean volume of coffee in cups filled by machine \(Y\).
Test the manager's claim at the \(10 \%\) significance level.
Students at two colleges, \(A\) and \(B\), are competing in a computer games challenge.
(a) The time taken for a randomly chosen student from college \(A\) to complete the challenge has a normal distribution with mean \(\mu\) minutes. The times taken, \(x\) minutes, are recorded for a random sample of 10 students chosen from college \(A\). The results are summarised as follows.
\(\sum x=828 \quad \sum x^{2}=68622\)
A test is carried out on the data at the \(5 \%\) significance level and the result supports the claim that \(\mu\gt k\).
Find the greatest possible value of \(k\).
(b) A random sample of 8 students is chosen from college \(B\). Their times to complete the same challenge give a sample mean of 79.8 minutes and an unbiased variance estimate of 9.966 minutes \(^{2}\).
Use a 2-sample test at the \(5 \%\) significance level to test whether the mean time for students at college \(B\) to complete the challenge is the same as the mean time for students at college \(A\) to complete the challenge. You should assume that the two distributions are normal and have the same population variance.
Question 11 OR alternative.
The times taken to run 200 metres at the beginning of the year and at the end of the year are recorded for each member of a large athletics club. The time taken, in seconds, at the beginning of the year is denoted by \(x\) and the time taken, in seconds, at the end of the year is denoted by \(y\). For a random sample of 8 members, the results are shown in the following table.
| Member | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| \(x\) | 24.2 | 23.8 | 22.8 | 25.1 | 24.5 | 24.0 | 23.8 | 22.8 |
| \(y\) | 23.9 | 23.6 | 22.8 | 24.5 | 24.2 | 23.5 | 23.6 | 22.7 |
\[ \left[\sum x=191,\quad \sum x^{2}=4564.46,\quad \sum y=188.8,\quad \sum y^{2}=4458.4,\quad \sum xy=4510.99\right] \]
(i) Find, showing all necessary working, the equation of the regression line of \(y\) on \(x\).
The athletics coach believes that, on average, the time taken by an athlete to run 200 metres decreases between the beginning and the end of the year by more than 0.2 seconds.
(ii) Stating suitable hypotheses and assuming a normal distribution, test the coach's belief at the \(10\%\) significance level.
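Part (ii) is naturally treated as a paired \(t\)-test on the differences \(d = x - y\). The sketch below assumes the hypotheses \(H_0: \mu_d = 0.2\) and \(H_1: \mu_d > 0.2\); the critical value is an assumed table value for 7 degrees of freedom.

```python
import math
import statistics

# Times in seconds at the beginning (x) and end (y) of the year, from the table
x = [24.2, 23.8, 22.8, 25.1, 24.5, 24.0, 23.8, 22.8]
y = [23.9, 23.6, 22.8, 24.5, 24.2, 23.5, 23.6, 22.7]

# Paired differences: positive values mean the time decreased
d = [xi - yi for xi, yi in zip(x, y)]
n = len(d)

mean_d = statistics.mean(d)
var_d = statistics.variance(d)  # unbiased estimate (divides by n - 1)

# Test statistic for H0: mu_d = 0.2 against H1: mu_d > 0.2
t_stat = (mean_d - 0.2) / math.sqrt(var_d / n)

# Assumed table value: upper 10% point of the t-distribution with 7 degrees of freedom
T_CRIT = 1.415
reject_h0 = t_stat > T_CRIT

print(f"mean difference = {mean_d:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With these figures the statistic does not exceed the critical value, so the data would not support the coach's belief at the 10% level.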
A large number of people attended a course to improve the speed of their logical thinking. The times taken to complete a particular type of logic puzzle at the beginning of the course and at the end of the course are recorded for each person. The time taken, in minutes, at the beginning of the course is denoted by \(x\) and the time taken, in minutes, at the end of the course is denoted by \(y\). For a random sample of 9 people, the results are summarised as follows.
\[\sum x=45.3,\quad \sum x^2=245.59,\quad \sum y=40.5,\quad \sum y^2=195.11,\quad \sum xy=218.72.\]
Ken attended the course, but his time to complete the puzzle at the beginning of the course was not recorded. His time to complete the puzzle at the end of the course was 4.2 minutes.
(i) By finding, showing all necessary working, the equation of a suitable regression line, find an estimate for the time that Ken would have taken to complete the puzzle at the beginning of the course.
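Since Ken's end-of-course time \(y\) is known and his starting time \(x\) is to be estimated, one reasonable reading of "suitable" is the regression line of \(x\) on \(y\). The sketch below works under that assumption, using only the summary sums given in the question.

```python
# Summary data from the question (n = 9 people)
n = 9
sum_x, sum_y = 45.3, 40.5
sum_yy, sum_xy = 195.11, 218.72

# Corrected sums needed for the regression line of x on y
s_xy = sum_xy - sum_x * sum_y / n
s_yy = sum_yy - sum_y**2 / n

# Line of x on y: x = x_bar + slope * (y - y_bar)
slope = s_xy / s_yy
x_bar, y_bar = sum_x / n, sum_y / n
intercept = x_bar - slope * y_bar

# Estimate of Ken's time at the beginning of the course, given y = 4.2
ken_y = 4.2
ken_x_estimate = intercept + slope * ken_y

print(f"x = {intercept:.3f} + {slope:.3f} y, estimate for Ken = {ken_x_estimate:.2f} minutes")
```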
The values of \(x-y\) for the sample of 9 people are as follows.
\[0.2,\quad 0.8,\quad 0.5,\quad 1.0,\quad 0.2,\quad 0.6,\quad 0.2,\quad 0.5,\quad 0.8.\]
The organiser of the course believes that, on average, the time taken to complete the puzzle decreases between the beginning and the end of the course by more than 0.3 minutes.
(ii) Stating suitable hypotheses and assuming a normal distribution, test the organiser's belief at the \(2\frac12\%\) significance level.
During the summer months, all members of a large swimming club take part in intensive training. The times taken to swim 50 metres at the beginning of the summer and at the end of the summer are recorded for each member of the club. The time taken, in seconds, at the beginning of the summer is denoted by \(x\) and the time taken at the end of the summer is denoted by \(y\). For a random sample of 9 members the results are shown in the following table.
| Member | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 38.5 | 40.2 | 32.3 | 35.1 | 36.2 | 41.4 | 32.0 | 38.2 | 38.2 |
| \(y\) | 37.4 | 38.1 | 31.6 | 34.7 | 34.2 | 38.6 | 31.8 | 36.3 | 36.8 |
The swimming coach believes that, on average, the time taken by a swimmer to swim 50 metres will decrease by more than one second as a result of the intensive training.
(i) Stating suitable hypotheses and assuming a normal distribution, test the coach's belief at the \(10\%\) significance level.
(ii) Find a \(95\%\) confidence interval for the population mean time taken to swim 50 metres after the intensive training, assuming a normal distribution.
Nine athletes in a club have a new coach. The coach adopts a new training programme which he believes will reduce the race times of these athletes. Each athlete completes a 1500 m time trial before and after completing the new training programme. Their times, in seconds (s), are recorded.
| Athlete | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
|---|---|---|---|---|---|---|---|---|---|
| Time before training (s) | \(250\) | \(251\) | \(252\) | \(267\) | \(276\) | \(291\) | \(310\) | \(320\) | \(335\) |
| Time after training (s) | \(245\) | \(251\) | \(253\) | \(261\) | \(275\) | \(293\) | \(302\) | \(313\) | \(320\) |
(a) Carry out a paired \(t\)-test at the \(5\%\) significance level to test the coach's belief.
Further research suggests that the effects of the training programme tend to reduce the times of the slower athletes by more than those of the faster athletes.
(b) Suggest a reason why the paired \(t\)-test used in part (a) may not have been an appropriate test in this case.
(c) Suggest a suitable alternative test that could have been used instead of a paired \(t\)-test.
| Pair | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| machine \(X\) | 65 | 73 | 58 | 61 | 72 | 79 | 64 | 65 | 69 | 71 |
| machine \(Y\) | 68 | 72 | 64 | 63 | 75 | 82 | 63 | 63 | 72 | 74 |
Jade is a swimming instructor at a sports college. She claims that, as a result of an intensive training course, the mean time taken by students to swim 50 metres has reduced by more than 1 second. She chooses a random sample of 10 students. The times taken, in seconds, before and after the training course are recorded in the table.
(a) Test, at the \(10 \%\) significance level, whether Jade's claim is justified.
(b) State an assumption that is necessary for this test to be valid.
Scientists are studying the effects of exercise on LDL blood cholesterol levels. Over a three-month period, a large group of people exercised for 20 minutes each day. For a randomly chosen sample of 10 of these people, the LDL blood cholesterol levels were measured at the beginning and the end of the three-month period. The results, measured in suitable units, are as follows.
| Person | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Beginning | 72 | 84 | 120 | 90 | 102 | 135 | 64 | 75 | 80 | 88 |
| End | 64 | 76 | 105 | 92 | 105 | 115 | 67 | 75 | 75 | 84 |
(a) Test, at the \(2.5\%\) significance level, whether there is evidence that the population mean LDL blood cholesterol level has reduced by more than 2 units after the three-month period.
(b) State any assumption that you have made in part (a).
A manager is investigating the times taken by employees to complete a particular task as a result of the introduction of new technology. He claims that the mean time taken to complete the task is reduced by more than 0.4 minutes. He chooses a random sample of 10 employees. The times taken, in minutes, before and after the introduction of the new technology are recorded in the table.
| Employee | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Time before new technology | 10.2 | 9.8 | 12.4 | 11.6 | 10.8 | 11.2 | 14.6 | 10.6 | 12.3 | 11.0 |
| Time after new technology | 9.6 | 8.5 | 12.4 | 10.9 | 10.2 | 10.6 | 12.8 | 10.8 | 12.5 | 10.6 |
(a) Test at the \(10\%\) significance level whether the manager's claim is justified.
(b) State an assumption that is necessary for this test to be valid.
A random sample of 9 members is taken from the large number of members of a sports club, and their heights are measured. The heights of all the members of the club are assumed to be normally distributed. A \(95\%\) confidence interval for the population mean height, \(\mu\) metres, is calculated from the data as \(1.65 \leqslant \mu \leqslant 1.85\).
(i) Find an unbiased estimate for the population variance.
(ii) Denoting the height of a member of the club by \(x\) metres, find \(\Sigma x^{2}\) for this sample of 9 members.
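Both parts follow from working the confidence interval formula backwards. The sketch below assumes the interval was built as \(\bar{x} \pm t\, s/\sqrt{n}\) with the \(t\)-distribution on 8 degrees of freedom; the critical value is an assumed table value.

```python
# Confidence interval from the question
lower, upper = 1.65, 1.85
n = 9

# Assumed table value: upper 2.5% point of the t-distribution with 8 degrees of freedom
T_CRIT = 2.306

# The interval is symmetric about the sample mean
x_bar = (lower + upper) / 2
half_width = (upper - lower) / 2

# half_width = T_CRIT * s / sqrt(n), so solve for the unbiased variance estimate s^2
s_squared = (half_width * n**0.5 / T_CRIT) ** 2

# s^2 = (sum of x^2 - n * x_bar^2) / (n - 1), so solve for the sum of x^2
sum_x_squared = (n - 1) * s_squared + n * x_bar**2

print(f"s^2 = {s_squared:.4f}, sum of x^2 = {sum_x_squared:.2f}")
```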
The heights, in metres, of a random sample of 8 trees of a particular type are as follows.
\[\begin{array}{llllllll}
14.2 & 11.3 & 10.8 & 8.4 & 12.8 & 11.5 & 12.1 & 9.2
\end{array}\]
Assuming that heights of trees of this type are normally distributed, calculate a \(95\%\) confidence interval for the mean height of trees of this type.
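The interval can be checked numerically. This is a minimal sketch assuming the standard \(t\)-based interval \(\bar{x} \pm t\, s/\sqrt{n}\); the critical value is an assumed table value for 7 degrees of freedom.

```python
import math
import statistics

# Tree heights in metres, from the question
heights = [14.2, 11.3, 10.8, 8.4, 12.8, 11.5, 12.1, 9.2]
n = len(heights)

x_bar = statistics.mean(heights)
s = statistics.stdev(heights)  # square root of the unbiased variance estimate

# Assumed table value: upper 2.5% point of the t-distribution with 7 degrees of freedom
T_CRIT = 2.365

half_width = T_CRIT * s / math.sqrt(n)
ci_lower, ci_upper = x_bar - half_width, x_bar + half_width

print(f"mean = {x_bar:.3f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```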