Response to responsible research assessment I and II from the perspective of the DGPs working group on open science in clinical psychology

We comment on the papers by Schönbrodt et al. (2022) and Gärtner et al. (2022) on responsible research assessment from the perspective of clinical psychology and psychotherapy research. Schönbrodt et al. (2022) propose four principles to guide hiring and promotion in psychology: (1) In addition to publications in scientific journals, data sets and the development of research software should be considered. (2) Quantitative metrics can be useful, but they should be valid and applied responsibly. (3) Methodological rigor, research impact, and work quantity should be considered as three separate dimensions for evaluating research contributions. (4) The quality of work should be prioritized over the number of citations or the quantity of research output. From the perspective of clinical psychology, we endorse the initiative to update current practice by establishing a matrix for comprehensive, transparent and fair evaluation criteria. In the following, we will both comment on and complement these criteria from a clinical-psychological perspective.


Specific aspects from the perspective of clinical psychology
Sharing and re-using sensitive data
More than other fields of psychology, clinical psychological research is characterized by strict legal regulations and specific requirements due to the handling of highly sensitive (e.g., mental health) data. Data security and de-identification issues have not yet been adequately solved (Walsh et al., 2018), and researchers collecting sensitive data such as biomedical data (e.g., diagnosis, neuroimaging, genetic probes, geospatial activity), medical service utilization data, and raw data from session recordings or interviews (e.g., voice/video recordings) face intricate challenges when attempting to adhere to open data principles. Institutional review boards, and sometimes funding agencies, can also impede researchers' ambitions to publish patient data (see Paret et al., 2022). Given these challenges, restricted access to data according to the different access categories of the DGPs (see data sharing sheet) or proof that original study data have been reused by independent research teams should also be highly valued for applicants in appointment processes in clinical psychology as alternatives to making all data publicly available. However, with the current scoring method of Gärtner et al. (2022), these efforts to share sensitive, unpublished data would remain unrecognized. If data request forms are used to provide data to other researchers for specific purposes, the frequency of such secure data sharing could be stated in an application. For the researchers who requested the data, there is no loss in usability compared to open data, while data protection and use for specific purposes only are ensured. Given the difficulty and effort of obtaining substantial clinical samples, it is common practice to use the same datasets for different research questions as well as to reanalyze previously published datasets. Therefore, preregistration after data collection, but before conducting secondary analyses, should still be considered valid. Thus, we strongly endorse the proposal of Schönbrodt et al. (2022) to give applicants the opportunity to specify why some indicators may not fit their research rather than blindly applying algorithms.

Scope for decision-making of early-career researchers
For the proposed algorithms, we see a challenge in imposing requirements on early-career researchers (ECRs) that may be beyond their control. Often multiple stakeholders (e.g., universities, principal investigators, hospitals, self-help organizations, patient advocates) are involved in clinical studies and especially in complex clinical trials. Therefore, decisions about preregistration and the provision of open data and materials are typically made by principal investigators, a group that rarely includes ECRs. In addition, ECRs are bound by the publishing practices of their stakeholders, which raises the question of individual responsibility vs. changing institutional behavior. In particular, ECRs who started their careers in an environment reluctant to adopt open science practices may be at a disadvantage. This may severely limit the extent to which the implementation or omission of open science practices can be attributed to ECRs. Another pertinent issue is academic age, as researchers in clinical psychology often take on dual roles as clinicians and researchers and have to devote a substantial amount of time to clinical training and work. This may result in less research output in a given time. In terms of promoting diversity among applicants, in addition to the other aspects proposed by Gärtner et al. (2022), it may also be valuable to consider further factors for academic age, such as being a first-generation academic or an academic with a migration background, as these researchers might take longer to reach academic achievements due to less familiarity with the academic system or higher funding requirements.

Embracing the existing diversity and strengths of clinical research
In many respects, clinical research already has strong standards of good scientific practice. Study protocol papers and ethics applications that include a statistical analysis plan can provide a level of transparency equal to that of preregistrations and have been very common in clinical psychology for years. However, their quality and adherence to standards should also be reviewed. Valid versions should be acknowledged and incorporated into the scoring scheme (Gärtner et al., 2022) to allow for a plurality of suitable documents. The scoring scheme proposed by Gärtner et al. (2022) rewards original empirical research over secondary research, reviews and meta-analyses and focuses strongly on confirmatory research, as it extensively rewards registered reports, replications and formal modelling of theories. However, in clinical psychology, reviews and meta-analyses often take on the role of confirmatory research by integrating results from original studies as well as testing hypotheses. Omitting them in the rating introduces a significant bias in the evaluation of candidates that is not justified by the quality or impact of the science. We therefore suggest allowing all research items to be scored in this scheme, regardless of their design.

Beyond the principles by Schönbrodt et al. (2022): Changing the scientific culture
Overall, the proposed change in research assessments should be complemented by a more general, long-term change in the scientific culture: This includes a culture lived by example, characterized by permissive attitudes towards individual mistakes, and a different error culture, so that potential concerns about open science diminish and the process does not become superficial. Cultural change may also include devaluing author positions or funding as an indicator for hiring. There are alternative models such as alphabetical authorship and publishing as a group. Moreover, consideration of even rejected proposals when peer reviews attest high quality could be another approach to mitigate past hiring practices regarding funding. A complete omission of previous funding as an indicator, as suggested by Schönbrodt et al. (2022), would be another option.

A final thought on measuring methodological quality and rigor
As mentioned above, we fully agree with putting methodological quality and rigor at the forefront of the selection process. The criteria proposed by Gärtner et al. (2022) focus strongly on open science practices as a precondition for evaluating research quality via transparency. While transparency is of high value, determining methodological quality is not limited to open science. Applicants should have the opportunity to demonstrate directly to the committee further quality criteria of their research that are not covered by the criteria of Gärtner et al. (2022). Moreover, we are ambivalent about enforcing a clear threshold based on the proposed scoring method. While the need to shortlist candidates for further selection is obvious, enforcing a strict minimum threshold by simply "counting the boxes" might encourage applicants to surpass this threshold by optimizing their score. This could result in preregistrations without the will to actually adhere to them (Goldacre et al., 2019) or in "open washing" (e.g., formulating very imprecise hypotheses that fit all analyses; Song et al., 2022). Therefore, responsible research assessment may require a more in-depth assessment of the presented papers rather than a numerical count. To make this more feasible for selection committees, applicants could list the quality features of their papers, which could then be reviewed and scored by the committee. Another creative solution in the future could be that reviewers of a journal article rate certain dimensions of rigor by default on scales according to the extended scheme of Gärtner et al. (2022). These ratings should be made available to authors, who could then include them with their applications and (voluntarily) publish them with their publications.

Concluding remarks
In summary, we support and share the principles of responsible research assessment and endorse the initiative to develop fair, transparent and valid assessment criteria and to shift the focus towards quality and societal impact rather than quantity. Some of the points we raise are not uniquely beneficial to clinical psychology but might also reflect the reality and affordances in other areas of psychology. Based on these considerations, we provide some suggestions for applying the principles of Schönbrodt et al. (2022) and the scoring scheme of Gärtner et al. (2022) in clinical psychology. If there is interest, we would also be happy to collaborate on a scoring scheme specifically adapted for clinical psychology. In the end, we hope this process will result in better hiring procedures and a healthier scientific culture overall.