STAT1412 Data Analysis Laboratory
Assignment 3
Semester 2 2014
__________________________________________________________________________________
Due:
• This assignment must be submitted electronically using the Assignment 3 link on FLO, provided under Week 11, by 12 pm (noon) on Friday 24th October.
• The link will remain open until 12 pm (noon) on Thursday 30th October for late submissions (a late penalty applies, as described in the SAM).
• Hard copy submission or submission by email will not be accepted.
Weighting: This assignment (out of 40 marks) comprises a total of 3 questions and is worth 15% of your final assessment mark.
Instructions:
• You MUST comply with the Academic Integrity requirements indicated on the electronic submission. Please note that this is an INDIVIDUAL assignment, not a group assignment. Inappropriate collaboration will be penalized.
• Your submission should contain one (1) file in PDF format with size no bigger than 20 MB.
• You can update your submission an unlimited number of times before the due date.
• Refer to the “Statement of Assessment” pdf document on FLO regarding late assignment penalties.
• An extension on medical or compassionate grounds may be granted. Only applications with legitimate reasons will be considered.
• Keep a copy of your submission.
Writing Up Your Assignment:
• Answer all questions in this assignment. Questions should be answered in the order they appear.
• MS Word (or other typesetting software of your choice) may be used to prepare your assignment submission, which must then be converted to PDF.
• You may use any of the tools that you have been shown to assist with calculations. Answers must be written in clear English sentences with all appropriate working and/or supporting computer output shown. Raw computer output without explanatory text is unacceptable.
• All workings and intermediate answers must be clearly shown.
________________________________________________________________________
Question 1 [Total: 10 marks]
Adapted from Moore et al (2012) Exercise 10.54.
The dataset perch.xls contains the length (in cm), width (in cm) and weight (in grams) of 12 perch caught in a lake in Finland.
(a) Fit a least squares line using length to predict weight. Comment on this model.
[2 marks]
(b) To find a better model, use a transformation on length. One possibility is to use the square. Fit a least squares line using transformed length to predict weight. Write down the equation of the line (model). [3 marks]
(c) Compare the sample correlation coefficients for the models in (a) and (b). Comment.
[2 marks]
(d) Predict the weight of a perch when its length equals 35 cm and 60 cm, respectively.
[3 marks]
Marking Criteria: For full marks, you must provide relevant Excel or R output for your answer AND suitable explanatory text. Marks will be awarded based on the quality of your assessment of the data and how clearly that assessment is communicated.
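Illustration only: the assignment asks for Excel or R output, but the calculations behind Question 1 can be sketched in Python as below. The helper name least_squares and the length and weight values are hypothetical placeholders chosen for illustration. They are NOT the perch.xls data, so the numbers printed say nothing about the actual answers.

```python
from math import sqrt


def least_squares(x, y):
    """Return (intercept, slope, r) for the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # sample correlation coefficient
    return intercept, slope, r


# Hypothetical placeholder values, NOT the perch.xls data.
length = [10.0, 20.0, 30.0, 40.0]   # cm
weight = [50.0, 200.0, 450.0, 800.0]  # grams (here exactly 0.5 * length^2)

# Model as in part (a): weight predicted from length.
b0_lin, b1_lin, r_lin = least_squares(length, weight)

# Model as in part (b): weight predicted from length squared.
length_sq = [v ** 2 for v in length]
b0_sq, b1_sq, r_sq = least_squares(length_sq, weight)

# Prediction from the transformed model at a chosen length, as in part (d).
new_length = 35.0
predicted = b0_sq + b1_sq * new_length ** 2

print(f"linear:  weight = {b0_lin:.2f} + {b1_lin:.2f} * length, r = {r_lin:.4f}")
print(f"squared: weight = {b0_sq:.2f} + {b1_sq:.4f} * length^2, r = {r_sq:.4f}")
print(f"predicted weight at {new_length} cm: {predicted:.1f} g")
```

To use the sketch with real data, replace the two placeholder lists with the length and weight columns read from perch.xls. Written explanation is still required alongside any output.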
Question 2 Do you enjoy driving your car? [Total: 13 marks]
Adapted from Moore et al (2012) Exercise 8.26.
The Pew Research Center recently polled 1048 U.S. drivers and found that 69% enjoyed driving their cars.
(a) Construct a 90% confidence interval for the proportion of U.S. drivers who enjoy driving their cars. [3 marks]
(b) In 1991, a Gallup Poll reported this percent to be 79%. Using the data from this poll, test the claim that the percent of drivers who enjoy driving their cars has declined since 1991. Use all steps for hypothesis testing to make a conclusion at a significance level of 10%.
[5 marks]
(c) Explain the correspondence between the confidence interval in Question 2a and a test at a fixed 10% significance level of the hypotheses you listed in Question 2b. [2 marks]
(d) What assumptions or conditions have you relied on for the statistical inference in parts (a) and (b)? Are they satisfied? [3 marks]
Marking Criteria: You need to show all working. No marks will be awarded for a correct answer without working. For full marks, you need to demonstrate understanding of the statistical inference concepts involved in each part of this question and express your answer in an English sentence.
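Illustration only: the arithmetic behind Question 2 can be sketched in Python as below, assuming the usual large-sample (normal approximation) methods for one proportion. The variable names are hypothetical, and treating the 1991 figure of 79% as the null value with a one-sided "less than" alternative is one reading of part (b). It does not replace the written hypothesis-testing steps or the check of conditions in part (d).

```python
from math import sqrt
from statistics import NormalDist

n = 1048        # drivers polled (from the question)
p_hat = 0.69    # sample proportion who enjoy driving (from the question)
p_null = 0.79   # 1991 Gallup figure, assumed here to be the null value
alpha = 0.10    # significance level; also gives the 90% confidence level

std_normal = NormalDist()  # standard normal distribution

# Large-sample 90% confidence interval: p_hat +/- z* x SE(p_hat)
z_star = std_normal.inv_cdf(1 - alpha / 2)
se_ci = sqrt(p_hat * (1 - p_hat) / n)
ci_low = p_hat - z_star * se_ci
ci_high = p_hat + z_star * se_ci

# One-sided z test of H0: p = 0.79 against Ha: p < 0.79.
# The standard error uses the null value, not p_hat.
se_null = sqrt(p_null * (1 - p_null) / n)
z_stat = (p_hat - p_null) / se_null
p_value = std_normal.cdf(z_stat)  # lower-tail probability

print(f"90% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"z = {z_stat:.2f}, one-sided p-value = {p_value:.3g}")
print("reject H0" if p_value < alpha else "do not reject H0")
```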
Question 3 Podcast downloading [Total: 17 marks]
Adapted from Moore et al (2012) Exercise 8.52.
The Podcast Alley Web site recently reported that they have 53,501 podcasts available for downloading with 3,447,545 episodes. A Pew survey of Internet users described the results of two surveys about podcast downloading. The first was conducted between February and April 2006 and surveyed 2,822 Internet users. They found that 198 of these said that they had downloaded a podcast to listen to it or view it later at least once. In a more recent survey, conducted in May 2008, there were 1,553 Internet users. Of this total, 295 said that they had downloaded a podcast to listen to it or view it later.
(a) Find the estimate of the difference between the proportion of Internet users who had ever downloaded podcasts as of February to April 2006 and the proportion as of May 2008.
[2 marks]
(b) Is the large-sample confidence interval for the difference in two proportions appropriate to use in this setting? Explain your answer. [4 marks]
(c) Find the 95% confidence interval for the difference between the proportion of Internet users who had ever downloaded podcasts as of February to April 2006 and the proportion as of May 2008. [5 marks]
(d) Test the hypothesis that the two proportions are equal. Use a significance level of 5%. [6 marks]
Marking Criteria: You need to show all working. No marks will be awarded for a correct answer without working. For full marks, you need to demonstrate understanding of the statistical inference concepts involved in each part of this question and express your answer in an English sentence.
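Illustration only: the arithmetic behind Question 3 can be sketched in Python as below, assuming the large-sample methods for two independent proportions (unpooled standard error for the interval, pooled standard error for the test). The variable names are hypothetical, and the difference is taken as 2006 minus 2008, which is one possible ordering. Whether the large-sample interval is appropriate is the subject of part (b) and is not settled by this code.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the question text.
x1, n1 = 198, 2822   # February to April 2006: downloaders, sample size
x2, n2 = 295, 1553   # May 2008: downloaders, sample size

p1 = x1 / n1
p2 = x2 / n2
diff = p1 - p2       # estimated difference, 2006 minus 2008

std_normal = NormalDist()

# 95% large-sample confidence interval (unpooled standard error).
z_star = std_normal.inv_cdf(0.975)
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci_low = diff - z_star * se_diff
ci_high = diff + z_star * se_diff

# Two-sided z test of H0: p1 = p2 (pooled standard error).
p_pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_stat = diff / se_pooled
p_value = 2 * std_normal.cdf(-abs(z_stat))

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, difference = {diff:.4f}")
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"z = {z_stat:.2f}, two-sided p-value = {p_value:.3g}")
```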
End of Assignment 3