CASE STUDY 8: RESOLUTION
Hypothesis Testing (Ensaios de Hipóteses), Graça Trindade, ISCTE – IUL, 2012-2013

Given the importance of an event like Rock in Rio, the municipality commissioned a study from a research center, which collected a random sample of Lisbon residents. First, the impact of the event on the residents of the parish where it took place was evaluated.

A. To see whether the residents have become more receptive to holding Rock in Rio in Parque da Bela Vista, their degree of satisfaction with the 2004 and 2006 events was evaluated. To this end, two indices of the degree of satisfaction were constructed (measured on a scale from 1 - not at all satisfied to 10 - very satisfied), with the following results:

TABLE A: Paired Samples Statistics

Pair 1                                        N     Mean   Std. Deviation   Std. Error of Mean
Degree of Satisfaction Rock in Rio in 2004    207   4.11   2.681            0.186
Degree of Satisfaction Rock in Rio in 2006    207   8.37   2.034            0.141

TABLE B: Paired Samples Correlations

Pair 1                                                     N     Correlation   Sig.
Degree of Satisfaction Rock in Rio in 2004 and in 2006     207   -0.451        0.000

TABLE C: Paired Differences

Pair 1: Degree of Satisfaction Rock in Rio in 2004 - Degree of Satisfaction Rock in Rio in 2006

Mean     Std. Deviation   Std. Error Mean   t         df    Sig. (2-tailed)
-4.256   4.030            0.280             -15.195   206   0.000

a) Given the variables in the analysis and the assumptions underlying it, do you consider the statistical procedure appropriate? Justify.

The parametric t test for the mean difference of paired samples is appropriate, since the variables are quantitative and the degree of satisfaction of the same individuals is measured in two different years (2004 and 2006). The aim is to test whether the mean difference between the degrees of satisfaction with the event in 2004 and in 2006 is zero. This test has one condition and one assumption:

1. CONDITION - The original variables should be correlated;
2. ASSUMPTION - Normality of the new variable Difference.

b) What is the relevance of presenting the information contained in Table B? What may be concluded from the results presented in that table (α = 0.05)? Justify.

Table B allows us to check the condition for performing a test for paired samples. A decision can be taken from the test of the population correlation coefficient, in which the null hypothesis states that the coefficient is zero and the alternative that it is different from zero:

H0: ρ = 0
Ha: ρ ≠ 0

Decision: there is a moderate negative sample correlation between the variables (r = -0.451) and, with an associated probability of 0.000, practically zero (< 0.05), the hypothesis of no correlation in the population between the variables under analysis is rejected. We can proceed with this statistical procedure.

c) Based on these results, can we conclude that the inhabitants of that parish increased their level of satisfaction with Rock in Rio from 2004 to 2006 (α = 0.05)? Justify.

Before any decision, it is necessary to validate the normality assumption for the variable Difference. Since the sample size (n = 207) is greater than 30, the central limit theorem allows this assumption to be taken as satisfied: the sample mean of the variable Difference in the degree of satisfaction with Rock in Rio from 2004 to 2006 approximately follows a normal distribution.
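As a numerical cross-check, the entries of Tables A to C can be re-derived from one another. The sketch below is illustrative only: the function names are hypothetical (they are not part of the original resolution), and it uses only the rounded values printed in the tables, so small rounding differences from the SPSS output are to be expected.

```python
import math

# Values as printed in Tables A, B and C (rounded SPSS output).
N = 207
SD_2004, SD_2006 = 2.681, 2.034
R = -0.451
MEAN_DIFF = -4.256


def sd_of_difference(sd_x, sd_y, r):
    """Standard deviation of X - Y from the two SDs and their correlation."""
    return math.sqrt(sd_x**2 + sd_y**2 - 2 * r * sd_x * sd_y)


def paired_t(mean_diff, sd_diff, n):
    """t statistic for H0: mean difference = 0 (n - 1 degrees of freedom)."""
    return mean_diff / (sd_diff / math.sqrt(n))


def correlation_t(r, n):
    """t statistic for H0: rho = 0 (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r**2))


sd_diff = sd_of_difference(SD_2004, SD_2006, R)
print(round(sd_diff, 3))                           # SD of the paired differences
print(round(paired_t(MEAN_DIFF, sd_diff, N), 2))   # paired-samples t statistic
print(round(correlation_t(R, N), 2))               # t statistic behind the Sig. in Table B
```

With these inputs the standard deviation of the differences comes out at about 4.03 and the t statistic at about -15.19, in line with Table C. The negative correlation increases the variability of the differences, which is why their standard deviation exceeds that of either year.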
The hypotheses of the main test are the following:

H0: μD ≥ 0, that is, the mean difference between the degrees of satisfaction with Rock in Rio in 2004 and in 2006 is greater than or equal to zero
Ha: μD < 0, that is, the mean difference between the degrees of satisfaction with Rock in Rio is less than zero, which means that individuals increased their level of satisfaction from 2004 to 2006

DECISION: with a test value t = -15.195 (the sign of the test value is consistent with the alternative hypothesis) and an associated probability of 0.000/2 (half of the two-tailed significance, because the test is one-sided), practically zero, H0 is rejected. There is statistical evidence that, in the population, the mean difference between the degrees of satisfaction with Rock in Rio in 2004 and in 2006 is less than zero. This means that the degree of satisfaction of individuals with this event increased between 2004 and 2006.

d) What is the non-parametric alternative to the test performed? Formulate the statistical hypotheses for that test. What is the main difference from the test shown in Table C?

The non-parametric alternative to the parametric t test for equality of means of paired samples is the Wilcoxon test.

H0: the distribution of the degree of satisfaction with Rock in Rio in 2004 is at least equal to the distribution of the degree of satisfaction with the same event in 2006
Ha: the distribution of the degree of satisfaction with Rock in Rio in 2004 is lower than the distribution of the degree of satisfaction with the same event in 2006

The main difference is that, as in any non-parametric test, the variable Difference is treated not as a quantitative variable but as a qualitative ordinal one. The values of the variable are ranked, so the conclusions are stated in terms of the differences between the rankings of the values of the two variables instead of the mean of the differences between their values.
Also, in this non-parametric test there is no normality assumption to validate.

B. The aim is to analyze the relationship between the willingness of residents to participate in this event ("Number of days you think you are going to the Rock in Rio in 2008") and their perception of how easy it is to buy tickets (in terms of price) to watch the performances (measured on a scale from 1 - not accessible to 10 - very accessible). The following results were obtained:

TABLE D: Descriptives
Degree of perception of buying the tickets

                                                     95% Confidence Interval for Mean
           N     Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
None       34    2.62   2.374            0.407        1.79          3.45          1         10
1 day      105   2.92   1.455            0.102        2.63          3.20          1         8
2 days     51    4.75   0.440            0.062        4.62          4.87          4         5
3-5 days   33    9.09   1.128            0.196        8.69          9.49          6         10
Total      223   4.20   2.617            0.175        3.86          4.55          1         10

TABLE E: Test of Homogeneity of Variances

Levene Statistic   df1   df2   Sig.
13.387             3     219   0.000

TABLE F: ANOVA

                 Sum of Squares   df    Mean Square   F         Sig.
Between Groups   1063.248         3     354.416       169.963   0.000
Within Groups    456.672          219   2.085
Total            1519.919         222

TABLE G: Ranks
Degree of perception of buying the tickets

           N     Mean Rank
None       34    63.79
1 day      105   79.13
2 days     51    151.66
3-5 days   33    204.97
Total      223

TABLE H: Test Statistics
Degree of perception of buying the tickets

Chi-Square   df   Asymp. Sig.
137.590      3    0.000

TABLE I: Multiple Comparisons
Dependent variable: Degree of perception to buy the tickets in 2008

(I) Number of days to go   (J) Number of days to go   Mean Difference
to the Rock in Rio         to the Rock in Rio         (I-J)             Std. Error   Sig.
None                       1 day                      -0.297            0.431        0.901
                           2 days                     -2.127            0.412        0.000
                           3-5 days                   -6.473            0.452        0.000
1 day                      None                       0.297             0.431        0.901
                           2 days                     -1.831            0.155        0.000
                           3-5 days                   -6.177            0.242        0.000
2 days                     None                       2.127             0.412        0.000
                           1 day                      1.831             0.155        0.000
                           3-5 days                   -4.346            0.206        0.000
3-5 days                   None                       6.473             0.452        0.000
                           1 day                      6.177             0.242        0.000
                           2 days                     4.346             0.206        0.000

The mean difference is significant at the 0.05 level.
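Some of the printed values in Tables F and H can be re-derived from the other entries of the same tables. The following is a minimal sketch under stated assumptions: the function names are hypothetical, only the Python standard library is used, and the chi-square tail probability relies on a closed form that holds for 3 degrees of freedom only (the case in Table H).

```python
import math


def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df.

    Closed form valid for df = 3 only (an assumption of this sketch).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Sums of squares and degrees of freedom as printed in Table F.
ms_b, ms_w, f_value = anova_f(1063.248, 3, 456.672, 219)
print(round(ms_b, 3), round(ms_w, 3), round(f_value, 2))

# Chi-square statistic as printed in Table H.
p_kw = chi2_sf_df3(137.590)
print(p_kw < 0.05)
```

The mean squares are the sums of squares divided by their degrees of freedom, and F is their ratio, which reproduces the value of about 169.96 in Table F. The tail probability for the statistic in Table H is far below 0.05, which is why SPSS prints it as 0.000.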
a) Given the variables in the analysis, what is the proper procedure? Justify.

We are presented with two variables: an ordinal variable treated as nominal with more than two categories ("Number of days to go to the Rock in Rio") and a quantitative variable, "Degree of perception of buying the tickets", measured on a Likert scale (1 - not accessible to 10 - very accessible). The aim is to analyze the relationship between the willingness of residents to participate in this event and the perception of the ease of buying tickets (dependent variable). The appropriate procedure is the one-factor analysis of variance (One-Way ANOVA).

b) What are the assumptions of the test shown in Table F? What can be concluded about the verification of these assumptions (α = 0.05)? Justify.

Table F shows the One-Way ANOVA test, which should meet three assumptions:

1. The samples are independent.
2. Normality of the variable "Degree of perception to buy the tickets" in each of the categories of the independent factor "Number of days to go to the Rock in Rio": as the four samples are all larger than 30 (34, 105, 51 and 33), by the central limit theorem the normality assumption can be taken as approximately satisfied in all categories.
3. Homoscedasticity (equality of the population variances): the assumption of equal variances is tested through Levene's test (Table E):

H0: the variance of the variable "Degree of perception to buy the tickets" is the same for all categories of "Number of days to go to the Rock in Rio"
Ha: the variance of the variable "Degree of perception to buy the tickets" is different in at least one of the categories of "Number of days to go to the Rock in Rio"

Decision: with a Levene statistic of 13.387 and an associated significance of practically zero (< 0.05), H0 is rejected and it is considered that the variance of the variable "Degree of perception to buy the tickets" is different in at least one of the categories of "Number of days to go to the Rock in Rio".

c) What can be concluded about the relationship between the variables in the analysis? Formulate the hypotheses of the test and take the appropriate decision (α = 0.05).

Since the homoscedasticity assumption failed, the ANOVA in Table F should not be used. The appropriate test is the non-parametric Kruskal-Wallis test (Tables G and H).

H0: the mean ranking of the values of the variable "Degree of perception of buying the tickets" is the same for all categories of the variable "Number of days to go to the Rock in Rio"
Ha: the mean ranking of the values of the variable "Degree of perception to buy the tickets" is different in at least one of the categories of the variable "Number of days to go to the Rock in Rio"

Decision: as the significance associated with the test value (chi-square = 137.59) is practically zero, we reject H0 and conclude that the mean ranking of the values of the "Degree of perception of buying the tickets" is different in at least one of the categories of "Number of days to go to the Rock in Rio".

d) What is the underlying purpose of the statistical procedure in Table I? Say whether you think it fits the problem and draw the conclusions that seem most relevant to the question under analysis.
The procedure in Table I is the Dunnett-C test of multiple comparisons. This test is appropriate when the quantitative variable under testing has failed the assumption of equality of the population variances. Because H0 was rejected in the Kruskal-Wallis test, which means that there is a relationship between the variables, it is desirable to know which group or groups are responsible for this rejection.

It is concluded that those who plan to attend Rock in Rio for one day or not at all are significantly different from those who plan to go for two days or for 3-5 days, and that these last two groups are also significantly different from each other. Three groups can be identified:

1. Those who do not plan to go or plan to go for just one day report, on average, great difficulty in buying tickets due to the high price. Their means (2.62 and 2.92, respectively) are not significantly different from each other (Sig. = 0.901), despite their relatively high dispersion (standard deviations of 2.374 and 1.455, respectively) and the fact that the sample of those who plan to go for just one day (105) is about three times the size of the sample of those who do not plan to go (34).
2. Those who plan to go for two days are located around the middle of the scale (4.75), with a small dispersion; that is, they consider the prices to be moderately high.
3. Finally, those who plan to go for 3 to 5 days are, on average, near the top of the scale (9.09) and therefore consider the prices reasonable.

e) After the analysis in the preceding paragraphs, do you consider it appropriate to calculate a coefficient to measure the degree of association between the variables? Compute it and justify.

Since it was concluded that there is a relationship between the variables, meaning that the null hypothesis was rejected, it is necessary to determine the degree of association between the variables.
Therefore, two different coefficients can be proposed and calculated:

1. If we consider one variable (the dependent) as metric and the other (the independent) as nominal, the appropriate coefficient is Eta, which varies between 0 and 1 and follows from the ANOVA table (Table F):

η = √(SS Between Groups / SS Total) = √(1063.248 / 1519.919) = √0.6995 ≈ 0.836

Conclusion: the degree of association between the variables is very high (η ≈ 0.836).

2. If we consider both variables as ordinal, the appropriate coefficient is Spearman's coefficient, which is in line with the Kruskal-Wallis test.
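The Eta coefficient can be recomputed directly from the sums of squares printed in Table F. This is a minimal sketch, with a hypothetical function name, and it assumes the rounded values from the table. Spearman's coefficient is not computed here because it requires the individual observations, which the tables do not provide.

```python
import math


def eta(ss_between, ss_total):
    """Eta: square root of the share of total variation explained by the factor."""
    return math.sqrt(ss_between / ss_total)


# Between-groups and total sums of squares as printed in Table F.
eta_value = eta(1063.248, 1519.919)
print(round(eta_value, 3))        # Eta
print(round(eta_value ** 2, 3))   # Eta squared: proportion of variance explained
```

With the values in Table F this gives an Eta of roughly 0.836, so about 70% of the variation in the perceived ease of buying tickets is associated with the number of days residents plan to attend.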