Interpretation of the p value: a national survey study in academic psychologists from Spain
- Laura Badenes-Ribera 1
- Dolores Frías-Navarro 1
- Héctor Monterde-i-Bort 1
- Marcos Pascual-Soler 2
- 1 Universitat de València
- 2 ESIC Business & Marketing School (Valencia)
ISSN: 0214-9915
Year of publication: 2015
Volume: 27
Issue: 3
Pages: 290-295
Type: Article
Other publications in: Psicothema
Abstract
Background: Incorrect interpretations of p values affect professionals' decisions and jeopardize the quality of psychological interventions and the accumulation of valid scientific knowledge. This study analyzes both the misinterpretations and the correct interpretation of the p value among Psychology faculty at Spanish universities. Method: University professors were surveyed about their interpretations of p values. The sample comprised 418 Psychology professors from Spanish public universities. The mean number of years as a university professor was 14.16 (SD = 9.39). Results: Our findings suggest that many university professors do not know how to interpret p values correctly. The inverse probability fallacy poses the greatest comprehension problems. Methodology professors also make errors in interpreting the p value. Conclusions: These results highlight the importance of statistical re-education for professors.