Responsible Research Assessment I: Implementing DORA and CoARA for hiring and promotion in psychology
DOI: https://doi.org/10.15626/MP.2024.4601
Keywords: DORA, CoARA, research assessment, research quality, impact
Abstract
The use of journal impact factors and other metric indicators of research productivity, such as the h-index, has been heavily criticized for being invalid for the assessment of individual researchers and for fueling a detrimental "publish or perish" culture. Multiple initiatives call for developing alternatives to existing metrics that better reflect quality (instead of quantity) in research assessment. This report, written by a task force established by the German Psychological Society, proposes how responsible research assessment could be done in the field of psychology. We present four principles of responsible research assessment in hiring and promotion and suggest a two-phase assessment procedure that combines the objectivity and efficiency of indicators with a qualitative, discursive assessment of shortlisted candidates. The main aspects of our proposal are (a) to broaden the range of relevant research contributions to include published data sets and research software, along with research papers, and (b) to place greater emphasis on quality and methodological rigor in research evaluation.
License
Copyright (c) 2025 Felix D. Schönbrodt, Anne Gärtner, Maximilian Frank, Mario Gollwitzer, Malika Ihle, Dorothee Mischkowski, Le Vy Phan, Manfred Schmitt, Anne M. Scheel, Anna-Lena Schubert, Ulf Steinberg, Daniel Leising

This work is licensed under a Creative Commons Attribution 4.0 International License.