Description of this paper

Statistics Lab 6 Activity Solution




Question: The term sampling frame refers to the group that actually had a chance to get into the sample. Ideally, the sampling frame and the population are the same, but sometimes the two are different. In the following situation, describe the population, the sampling frame, the sample, the parameter of interest, and the statistic.

Suppose you want to estimate the mean greenhouse gas emissions from all facilities that produce this type of pollution in the United States. An estimated 85-90 percent of the total greenhouse gas emissions from the United States come from the over 8,000 facilities covered by the Greenhouse Gas Reporting Program (GGRP). Facilities covered by the GGRP generally emit 25,000 metric tons or more of carbon dioxide equivalent per year in the United States.

1. If your sampling frame is facilities that report data to the GGRP, what kinds of facilities are excluded?
2. How might the exclusion of these facilities affect your estimate of the mean emissions per facility in the United States?

Open the data set GGRP Random Sample. This data set is a random sample of facilities that report data to the GGRP. The data set lists the facility types and the carbon dioxide equivalent greenhouse gases released by those facilities in 2013.

3. Verify that the normal approximation to the binomial distribution is accurate for these data if you assume that the binomial distribution is valid for the number of facilities that are direct emitters of greenhouse gases.
4. In the 2013 GGRP data, the true proportion of direct emitters is 89.90%. Calculate a 95% one-sample proportion confidence interval for the proportion of facilities that are direct emitters in the entire population. (You can calculate a normal approximation interval or an exact interval.) Does your 95% confidence interval contain the true proportion?
5. Recalculate the interval with a 99% confidence level. Does your 99% confidence interval contain the true proportion?
6. Recalculate the interval with a 90% confidence level. Does your 90% confidence interval contain the true proportion?
7. In your own words, explain why the 99% confidence interval is wider than the 90% confidence interval.
8. In the 2013 GGRP data, the true mean emissions per facility per year is 403,437 metric tons of carbon dioxide equivalent. Calculate a 95% one-sample mean confidence interval based on the t-distribution for the mean metric tons of carbon dioxide equivalent in the entire population. Does your 95% confidence interval contain the true mean?
9. Write a sentence that explains what this confidence interval means.
10. The sample data show strong evidence of right-skewness and one outlier that dwarfs other observations that are also possible outliers according to the 1.5*IQR rule. Identify the row that includes the outlier.
11. Exclude the maximum from the sample and recalculate your 95% confidence interval. Does this confidence interval contain the true mean?
12. Explain the effect that excluding the outlier had on the width of the confidence interval.
13. If your goal is to estimate the mean of all of the facilities in the GGRP data, explain whether it would be better or worse to include the outlier in the calculation of the confidence interval.
14. In 2013, the GGRP contained data on 7,893 facilities. Using your preferred confidence interval, estimate high and low values for the total emissions from all of the GGRP facilities.
15. The GGRP data for 2013 was released on September 30th, 2014. Given that the entire population data will be available after roughly 10 months, explain whether you think there are any advantages to taking a random sample of 90 facilities to estimate emissions prior to that time.
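
Questions 3 through 7 rest on the normal-approximation interval for a proportion, so a minimal Python sketch of that arithmetic may help. It is an illustration under stated assumptions, not the lab's solution: the count of 80 direct emitters is hypothetical (the GGRP Random Sample data set is not reproduced here), the function names are invented for this example, and the threshold of 10 successes and 10 failures is one common rule of thumb that some textbooks set at 5 or 15. Only the sample size of 90 and the true proportion of 89.90% come from the questions.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_ok(successes: int, n: int, threshold: int = 10) -> bool:
    """Rule-of-thumb check for question 3: the observed successes and
    failures must both reach the threshold (10 is an assumed convention)."""
    return successes >= threshold and (n - successes) >= threshold


def proportion_ci(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


if __name__ == "__main__":
    n = 90              # sample size named in question 15
    successes = 80      # HYPOTHETICAL count of direct emitters, not real data
    true_p = 0.8990     # true proportion stated in question 4

    print("normal approximation reasonable:", normal_approx_ok(successes, n))
    for level in (0.90, 0.95, 0.99):
        low, high = proportion_ci(successes, n, level)
        covered = low <= true_p <= high
        print(f"{level:.0%} interval: ({low:.4f}, {high:.4f}), "
              f"width {high - low:.4f}, contains true proportion: {covered}")
```

The critical value z grows with the confidence level while the standard error stays fixed, so the printed widths increase from the 90% interval to the 99% interval, which is the behavior question 7 asks you to explain.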


Paper#61562 | Written in 18-Jul-2015
