Responsible Research Assessment I: Implementing DORA and CoARA for hiring and promotion in psychology

Authors

DOI:

https://doi.org/10.15626/MP.2024.4601

Keywords:

DORA, CoARA, research assessment, research quality, impact

Abstract

The use of journal impact factors and other metric indicators of research productivity, such as the h-index, has been heavily criticized for being invalid for the assessment of individual researchers and for fueling a detrimental "publish or perish" culture. Multiple initiatives call for developing alternatives to existing metrics that better reflect quality (instead of quantity) in research assessment. This report, written by a task force established by the German Psychological Society, proposes how responsible research assessment could be done in the field of psychology. We present four principles of responsible research assessment in hiring and promotion and suggest a two-phase assessment procedure that combines the objectivity and efficiency of indicators with a qualitative, discursive assessment of shortlisted candidates. The main aspects of our proposal are (a) to broaden the range of relevant research contributions to include published data sets and research software, along with research papers, and (b) to place greater emphasis on quality and methodological rigor in research evaluation.

References

Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. https://doi.org/10.1026/0033-3042/a000335

Abramo, G., & D’Angelo, C. A. (2014). How do you define and measure research productivity? Scientometrics, 101(2), 1129–1144. https://doi.org/10.1007/s11192-014-1269-8

Abramo, G., D’Angelo, C. A., & Costa, F. D. (2010). Testing the trade-off between productivity and quality in research activities. Journal of the American Society for Information Science and Technology, 61(1), 132–140. https://doi.org/10.1002/asi.21254

Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The Rules of the Game Called Psychological Science. Perspectives on Psychological Science, 7(6), 543–554. https://doi.org/10.1177/1745691612459060

Bol, T., de Vaan, M., & van de Rijt, A. (2018). The Matthew effect in science funding. Proceedings of the National Academy of Sciences, 115(19), 4887–4890. https://doi.org/10.1073/pnas.1719557115

Bornmann, L. (2012). The Hawthorne effect in journal peer review. Scientometrics, 91(3), 857–862. https://doi.org/10.1007/s11192-011-0547-y

Brembs, B., Button, K., & Munafò, M. (2013). Deep impact: Unintended consequences of journal rank. Frontiers in Human Neuroscience, 7. https://doi.org/10.3389/fnhum.2013.00291

Cicchetti, D. V. (1991). The reliability of peer review for manuscript and grant submissions: A cross-disciplinary investigation. Behavioral and Brain Sciences, 14(1), 119–135. https://doi.org/10.1017/S0140525X00065675

Copiello, S., & Bonifaci, P. (2018). A few remarks on ResearchGate score and academic reputation. Scientometrics, 114(1), 301–306. https://doi.org/10.1007/s11192-017-2582-9

CWTS. (2021). Halt the H-index. https://doi.org/10.5281/ZENODO.4635649

De Rassenfosse, G. (2013). Do firms face a trade-off between the quantity and the quality of their inventions? Research Policy, 42(5), 1072–1079. https://doi.org/10.1016/j.respol.2013.02.005

Dougherty, M. R., & Horne, Z. (2022). Citation counts and journal impact factors do not capture some indicators of research quality in the behavioural and brain sciences. Royal Society Open Science, 9(8), 220334. https://doi.org/10.1098/rsos.220334

Dougherty, M. R., Slevc, L. R., & Grand, J. A. (2019). Making Research Evaluation More Transparent: Aligning Research Philosophy, Institutional Values, and Reporting. Perspectives on Psychological Science, 14(3), 361–375. https://doi.org/10.1177/1745691618810693

Etzel, F. T., Seyffert-Müller, A., Schönbrodt, F. D., Kreuzer, L., Gärtner, A., Knischewski, P., & Leising, D. (2024). Inter-Rater Reliability in Assessing the Methodological Quality of Research Papers in Psychology. https://doi.org/10.31234/osf.io/4w7rb

European Commission, Directorate-General for Research and Innovation. (2022). Agreement on Reforming Research Assessment. https://eua.eu/downloads/news/2022_07_19_rra_agreement_final.pdf

Forthmann, B., Leveling, M., Dong, Y., & Dumas, D. (2020). Investigating the quantity–quality relationship in scientific creativity: An empirical examination of expected residual variance and the tilted funnel hypothesis. Scientometrics, 124(3), 2497–2518. https://doi.org/10.1007/s11192-020-03571-w

Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505. https://doi.org/10.1126/science.1255484

Garfield, E. (2006). The History and Meaning of the Journal Impact Factor. JAMA, 295(1), 90. https://doi.org/10.1001/jama.295.1.90

Gärtner, A., Leising, D., Freyer, N., Musfeld, P., Lange, J., & Schönbrodt, F. D. (2025). Responsible Research Assessment II: A specific proposal for hiring and promotion in psychology. Meta-Psychology, 9. https://doi.org/10.15626/MP.2024.4604

Gopalakrishna, G., Ter Riet, G., Vink, G., Stoop, I., Wicherts, J. M., & Bouter, L. M. (2022). Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands. PLOS ONE, 17(2), e0263023. https://doi.org/10.1371/journal.pone.0263023

Haslam, N., & Laham, S. M. (2010). Quality, quantity, and impact in academic publication. European Journal of Social Psychology, 40(2), 216–220. https://doi.org/10.1002/ejsp.727

Hooper, M. (2022). A Taxonomy of Research Quality (Version 1.7). figshare. https://doi.org/10.6084/M9.FIGSHARE.20113565.V2

Houtkoop, B. L., Chambers, C., Macleod, M., Bishop, D. V. M., Nichols, T. E., & Wagenmakers, E.-J. (2018). Data Sharing in Psychology: A Survey on Barriers and Preconditions. Advances in Methods and Practices in Psychological Science, 1(1), 70–85. https://doi.org/10.1177/2515245917751886

Hrynaszkiewicz, I., Novich, B., Harney, J., & Kiermer, V. (2024). A survey of how biology researchers assess credibility when serving on grant and hiring committees. https://doi.org/10.31222/osf.io/ht836

Kepes, S., Keener, S. K., McDaniel, M. A., & Hartman, N. S. (2022). Questionable research practices among researchers in the most research-productive management programs. Journal of Organizational Behavior, 43(7), 1190–1208. https://doi.org/10.1002/job.2623

Laurinavichyute, A., Yadav, H., & Vasishth, S. (2022). Share the code, not just the data: A case study of the reproducibility of articles published in the Journal of Memory and Language under the open data policy. Journal of Memory and Language, 125, 104332. https://doi.org/10.1016/j.jml.2022.104332

League of European Research Universities. (2022). A Pathway towards Multidimensional Academic Careers - A LERU Framework for the Assessment of Researchers. https://www.leru.org/publications/a-pathway-towards-multidimensional-academic-careers-a-leru-framework-for-the-assessment-of-researchers

Leising, D., Gärtner, A., & Schönbrodt, F. D. (2025). Responsible Research Assessment (Parts I and II): Responses to the Commentaries. Meta-Psychology, 9. https://doi.org/10.15626/MP.2024.4603

Leising, D., Thielmann, I., Glöckner, A., Gärtner, A., & Schönbrodt, F. (2022a). Ten steps toward a better personality science – how quality may be rewarded more in research evaluation. Personality Science, 3, e6029. https://doi.org/10.5964/ps.6029

Leising, D., Thielmann, I., Glöckner, A., Gärtner, A., & Schönbrodt, F. (2022b). Ten steps toward a better personality science – a rejoinder to the comments. Personality Science, 3, e7961. https://doi.org/10.5964/ps.7961

Luebber, F., Krach, S., Martinez Mateo, M., Paulus, F. M., Rademacher, L., Rahal, R.-M., & Specht, J. (2023). Rethink funding by putting the lottery first. Nature Human Behaviour, 7(7), 1031–1033. https://doi.org/10.1038/s41562-023-01649-y

McKiernan, E. C., Schimanski, L. A., Muñoz Nieves, C., Matthias, L., Niles, M. T., & Alperin, J. P. (2019). Use of the Journal Impact Factor in academic review, promotion, and tenure evaluations. eLife, 8, e47338. https://doi.org/10.7554/eLife.47338

Menke, J., Eckmann, P., Ozyurt, I. B., Roelandse, M., Anderson, N., Grethe, J., Gamst, A., & Bandrowski, A. (2022). Establishing Institutional Scores With the Rigor and Transparency Index: Large-scale Analysis of Scientific Reporting Quality. Journal of Medical Internet Research, 24(6), e37324. https://doi.org/10.2196/37324

Merton, R. K. (1968). The Matthew Effect in Science: The reward and communication systems of science are considered. Science, 159(3810), 56–63. https://doi.org/10.1126/science.159.3810.56

Michalska-Smith, M. J., & Allesina, S. (2017). And, not or: Quality, quantity in scientific publishing. PLOS ONE, 12(6), e0178074. https://doi.org/10.1371/journal.pone.0178074

Moher, D., Bouter, L., Kleinert, S., Glasziou, P., Sham, M. H., Barbour, V., Coriat, A.-M., Foeger, N., & Dirnagl, U. (2020). The Hong Kong Principles for assessing researchers: Fostering research integrity. PLOS Biology, 18(7), e3000737. https://doi.org/10.1371/journal.pbio.3000737

Moher, D., Naudet, F., Cristea, I. A., Miedema, F., Ioannidis, J. P. A., & Goodman, S. N. (2018). Assessing scientists for hiring, promotion, and tenure. PLOS Biology, 16(3), e2004089. https://doi.org/10.1371/journal.pbio.2004089

Moore, S., Neylon, C., Eve, M. P., O’Donnell, D. P., & Pattinson, D. (2017). “Excellence R Us”: University research and the fetishisation of excellence. Palgrave Communications, 3(1), 16105. https://doi.org/10.1057/palcomms.2016.105

Müller, R., & De Rijcke, S. (2017). Thinking with indicators. Exploring the epistemic impacts of academic performance indicators in the life sciences. Research Evaluation, 26(3), 157–168. https://doi.org/10.1093/reseval/rvx023

Niiniluoto, I. (2024). Scientific Progress. In E. N. Zalta & U. Nodelman (Eds.), The Stanford Encyclopedia of Philosophy (Spring 2024). Metaphysics Research Lab, Stanford University.

Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., Fidler, F., Hilgard, J., Kline Struhl, M., Nuijten, M. B., Rohrer, J. M., Romero, F., Scheel, A. M., Scherer, L. D., Schönbrodt, F. D., & Vazire, S. (2022). Replicability, Robustness, and Reproducibility in Psychological Science. Annual Review of Psychology, 73(1), 719–748. https://doi.org/10.1146/annurev-psych-020821-114157

Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26(5), 1596–1618. https://doi.org/10.3758/s13423-019-01645-2

Osterloh, M., & Frey, B. S. (2019). Dealing With Randomness. Management Revue, 30(4), 331–345. https://doi.org/10.5771/0935-9915-2019-4-331

Osterloh, M., & Frey, B. S. (2020). How to avoid borrowed plumes in academia. Research Policy, 49(1), 103831. https://doi.org/10.1016/j.respol.2019.103831

Paulus, F. M., Cruz, N., & Krach, S. (2018). The Impact Factor Fallacy. Frontiers in Psychology, 9, 1487. https://doi.org/10.3389/fpsyg.2018.01487

Ramani, R. S., Aguinis, H., & Coyle-Shapiro, J. A.-M. (2022). Defining, Measuring, and Rewarding Scholarly Impact: Mind the Level of Analysis. Academy of Management Learning & Education, 21(3), 470–486. https://doi.org/10.5465/amle.2021.0177

Schmitt. (2022). Open peer commentaries to Leising et al., Ten steps toward a better personality science: How quality may be rewarded more in research evaluation. Personality Science, 3, e9227. https://doi.org/10.5964/ps.9227

Serra-Garcia, M., & Gneezy, U. (2021). Nonreplicable publications are cited more than replicable ones. Science Advances, 7(21), eabd1705. https://doi.org/10.1126/sciadv.abd1705

Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. https://doi.org/10.1098/rsos.160384

Tiokhin, L., Yan, M., & Morgan, T. J. H. (2021). Competition for priority harms the reliability of science, but reforms can help. Nature Human Behaviour, 5(7), 857–867. https://doi.org/10.1038/s41562-020-01040-1

Uygun Tunc, D., & Pritchard, D. (2022). Collective epistemic vice in science: Lessons from the credibility crisis [Preprint]. http://philsci-archive.pitt.edu/21120/

Vazire, S., & Holcombe, A. O. (2022). Where Are the Self-Correcting Mechanisms in Science? Review of General Psychology, 26(2), 212–223. https://doi.org/10.1177/10892680211033912

Weissgerber, T., Riedel, N., Kilicoglu, H., Labbé, C., Eckmann, P., Ter Riet, G., Byrne, J., Cabanac, G., Capes-Davis, A., Favier, B., Saladi, S., Grabitz, P., Bannach-Brown, A., Schulz, R., McCann, S., Bernard, R., & Bandrowski, A. (2021). Automated screening of COVID-19 preprints: Can we help authors to improve transparency and reproducibility? Nature Medicine, 27(1), 6–7. https://doi.org/10.1038/s41591-020-01203-7

Wouters, P., Sugimoto, C. R., Larivière, V., McVeigh, M. E., Pulverer, B., De Rijcke, S., & Waltman, L. (2019). Rethinking impact factors: Better ways to judge a journal. Nature, 569(7758), 621–623. https://doi.org/10.1038/d41586-019-01643-3

Published

2025-12-30

Issue

Section

Commentaries