Antidotes to cynicism creep in academia

This is one of those blog posts that doesn’t read well if you stop halfway. First, I provide evidence that academia can look pretty broken: there is low-quality work everywhere you look, the peer-review system has long outlived its utility, and academic publishing is a dumpster fire. Add considerable work pressure, the publish-or-perish culture, and precarious employment situations, and things can look gloomy and disheartening. Second, and this is where the blog becomes a bit personal, I stress how important it is for me not to become a science cynic, because of my responsibility towards my mental health and work, my team, my colleagues, and my students. Third, I highlight antidotes to cynicism creep, and the many things that have greatly helped me stay motivated and positive.

(Update November 2024: I recently summarized this blog post in an invited 45-minute talk; you can find the video on YouTube)

1. So. Many. Problems.

My most central claim in the blog post is that there are so many reasons to be disheartened and disillusioned. This pertains to many areas of the system we work in, including published papers, peer-review, the publishing industry, severe issues with workload and job insecurity, and so on. I will talk about some of them, and later come back to discuss how I maintain motivation and energy, and why.

2. Problems with scientific papers

Without exaggeration, I believe that the majority of published works in my field (broadly defined as psychology) do not add value. Many papers draw conclusions that are not supported by evidence, which cascades through the literature, because these papers are cited for the conclusions, not the evidence. The majority of published works are not reproducible, in the sense that authors conduct science behind closed doors without sharing data or code. Many published works are not replicable, i.e., will not hold up to scrutiny over time. Theories are verbal and vague, which means they can never get properly rejected. Instead, as Paul Meehl famously wrote, they sort of just slowly fade away as people lose interest. Let me try to convince you that this is an entirely reasonable position, based on the evidence we have.

(1) There are scientific disciplines dedicated to interrogating the scientific literature for potential flaws. Resulting work includes flashy papers such as “Why Most Published Research Findings Are False”, but also nuanced work highlighting considerable problems in several areas, e.g. issues of replicability in psychology and cancer biology, large surveys of researchers admitting to engaging in questionable practices (including making up data), an ever-growing number of high-profile fraud cases in which the accused have now started suing the researchers who pointed out problems, and so on.

(2) Like many others, I have also formed my own impression by reading papers and engaging with the literature, and I believe that a substantial proportion draw invalid inferences. I have written many commentaries to point out the most egregious flaws, but there aren’t enough hours in the week to engage with and rebut even 10% of the bizarre claims in the literature.

(3) Each year, a dramatic number of papers, surpassing 10,000 in 2023 alone, are retracted. These papers are likely only the tip of the iceberg. In fact, “the number of articles produced by businesses that sell bogus work and authorships to scientists [..] is estimated to be in the hundreds of thousands”. This doesn’t speak to a system working as intended where we can rely on inferences we read.

(4) These problems are hard to demonstrate abstractly, so I will list a few concrete examples below. I have also written many dozens of blog posts on problematic papers that you can find on this website — for example, just in the last few months, on depression & temperature, the default mode network, and the disease factor. Below are four papers from our psychedelics overview paper where authors wrote things that were not correct based on their evidence:

“Abbar et al. found in a randomized controlled trial (RCT) comparing ketamine against placebo that there was no persistent benefit of ketamine over placebo at the exit timepoint of the trial in week 6, but concluded in the abstract that ‘ketamine [..] has persistent benefits for acute care in suicidal patients’. Ionescu et al. found in an open-label ketamine study that only 2 of 14 patients show sustained improvement at 3-month follow-up (which may well be due to the placebo effect or other factors), but the title of the paper reads ‘Rapid and Sustained Reductions in Current Suicidal Ideation’. Palhano-Fontes et al. concluded in their ayahuasca study (n = 14 treatment, n = 15 placebo) that ‘blindness was adequately preserved’, when all participants in the treatment group said they believed they had received ayahuasca, but less than half of participants in the placebo group said so. And Daws et al. compared two treatment arms, including one using psilocybin-assisted psychotherapy, against each other, concluding that one treatment outperformed the other despite the lack of a statistically significant interaction term between the treatments.”

2.1 Small detour: Mindfulness and brain morphology

Let me provide you with a specific example, and then I’ll zoom out and embed this in a broader context. A few days ago, I complained (apologies, it happens when I become cynical) that especially the neuroimaging literature in psychology is riddled with studies that are false positive findings and do not replicate subsequently, and posted about a recent paper that found no relation between mindfulness training and changes in brain morphology, contrasting prior work. I embedded this point in broader criticism of the literature at large¹, problems that lead to a massive amount of research waste (i.e. money that could be spent better).

As a response to my complaint, someone posted a recent meta-analysis that supposedly shows convincingly that there are relations between mindfulness training and brain morphology. Here is the relevant part of the paper’s abstract:

“[..] This study aimed to investigate the structural brain changes in mindfulness-based interventions through a meta-analysis. [..] 11 studies (n=581) assessing whole-brain voxel-based grey matter or cortical thickness changes after a mindfulness CT were included. Anatomical likelihood estimation was used to carry out voxel-based meta-analysis with leave-one-out sensitivity analysis and behavioural analysis as follow-ups. One significant cluster (p<0.001, Z=4.76, cluster size=632 mm3) emerged in the right insula and precentral gyrus region (MNI=48, 10, 4) for structural volume increases in intervention group compared to controls. Behavioural analysis revealed that the cluster was associated with mental processes of attention and somesthesis (pain). Mindfulness interventions have the ability to affect neural plasticity in areas associated with better pain modulation and increased sustained attention. This further cements the long-term benefits and neuropsychological basis of mindfulness-based interventions.”

This all looks fine, and only because I was supposed to grade assignments and was really looking for something else to do, I scrolled down a little and found that the authors had removed 4 of 15 studies, all of which contained null-findings, from their meta-analysis. That is, they removed 4 studies that did not find a relationship between brain morphology and mindfulness, and then concluded that the result “cements the [..] neuropsychological basis of mindfulness-based interventions”. This is not a valid conclusion.

That the authors excluded null-findings is unclear from the title or abstract. Digging a little deeper, the sample sizes of the removed studies were nearly twice as large as the sample sizes of the included studies. And of the 11 analyzed studies, only 2 provided any significant effects, with fewer than 70 combined participants — one of which is a study on Yoga with female participants at risk for developing Alzheimer’s disease. Even without the issue of dropping null-findings, this does not qualify as robust evidence using meta-analytic guidelines (1, 2).
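To make the logic of this detour concrete, here is a minimal sketch of why dropping null-findings matters. Important caveats: the effect sizes and standard errors below are entirely hypothetical (they are not taken from the meta-analysis or any of its studies), and the sketch uses a simple fixed-effect inverse-variance pooled estimate rather than the anatomical likelihood estimation used in the paper. It only illustrates the general mechanism: if larger null studies are excluded, the pooled estimate can look convincing when the full set of studies would not.

```python
import math

# HYPOTHETICAL (effect size d, standard error) pairs, invented for illustration.
# They do NOT come from the mindfulness meta-analysis discussed above.
significant = [(0.60, 0.25), (0.55, 0.28)]            # small studies, large effects
nonsignificant_included = [(0.10, 0.20), (0.05, 0.22), (0.12, 0.21)]
excluded_nulls = [(0.00, 0.12), (-0.02, 0.13), (0.03, 0.12), (0.01, 0.14)]  # larger studies


def pooled_estimate(studies):
    """Fixed-effect inverse-variance pooled effect and its standard error."""
    weights = [1.0 / se**2 for _, se in studies]
    total = sum(weights)
    effect = sum(w * d for w, (d, _) in zip(weights, studies)) / total
    return effect, math.sqrt(1.0 / total)


included = significant + nonsignificant_included
all_studies = included + excluded_nulls

est_included, se_included = pooled_estimate(included)
est_all, se_all = pooled_estimate(all_studies)

print(f"Without the null studies: d = {est_included:.2f} (SE {se_included:.2f})")
print(f"With all studies:         d = {est_all:.2f} (SE {se_all:.2f})")
```

With these made-up numbers, the pooled effect shrinks considerably once the excluded studies are put back in, because the larger null studies carry most of the weight.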

3. Problems with peer-review

As a result of the above, when someone sends me a paper in which a particular finding is reported, or when a colleague publishes a paper, I have little a priori confidence that what is communicated follows from the presented data. But wait, aren’t all scientific papers vetted through a system of peer-review? Yes, most journals have systems in place where reviewers critically evaluate papers. In practice, this system does not work very well, which often comes as a surprise to journalists or friends outside of academia I talk to. Here are some reasons why peer-review does not work well.

(1) Authors are often asked to recommend reviewers; you can immediately see how this can lead to problems. “You recommend me and I recommend you” is common, or just recommending academic friends who are not impartial. There are no structural mechanisms in place at scientific journals to interrogate or prevent these issues other than perhaps checking if researchers have published together.

(2) Peer-reviewers often disagree with each other when they rate the quality of a manuscript. When we work as editors and read several peer-reviews, we often face situations where one reviewer is very unhappy with a paper and recommends rejection, but the second reviewer is very enthusiastic. We see this as authors when we often (not always) get conflicting feedback. And we see this in scientific studies on the topic.

(3) There are no real scientific standards for who becomes a reviewer, and some journals like Frontiers use software that automatically selects and invites reviewers. I am regularly invited to review geological work on depression — “a landform sunken or depressed below the surrounding area” — because I work on major depressive disorder. PhD students I supervise are often invited in the second year of their PhD by an automated system. This isn’t to say they aren’t qualified, but I don’t think the public, when they see ‘peer-review’, would think PhD students vet papers.

(4) Most papers are reviewed by 2 or 3 reviewers. In most cases they are not paid, and there is little motivation to take the work seriously other than scientific integrity and the importance of service to the field. Given how stressed researchers can be, I have seen many, many, many low-quality reviews. This is supported by pretty devastating experimental work showing that reviewers usually miss fatal flaws when asked to review flawed papers (1, 2, 3).²

(5) There is no accountability for a bad review, because reviews are nearly never public. So even if you just wrote “fuck you” in a review and did nothing else (please don’t), the worst that would happen is that the editor or the journal no longer invites you to review. Journals stress that peer-reviews are confidential, which means that, following their own rules, they wouldn’t even be allowed to tell others (e.g. other journals) that you are a terrible reviewer.

In the previous section, I tried to explain why I don’t have a lot of confidence in the conclusions of any random paper before vetting it. Now you know why the fact that this paper was peer-reviewed does not in any fundamental sense increase my trust in the veracity of presented findings.

4. Problems with the publishing industry

But Eiko, these papers are published by journals like Science and Nature. Are those not cornerstones of truth in the world? Are they not beyond reproach? Well .. I believe that the scientific publishing industry is inherently tied to some of these problems. I’ve written about the industry in some detail previously, and recommend the blog to catch up if you don’t work in academia.

(1) In sum, most scientific publishers are for-profit companies. They sell scientific papers in the way Apple sells smartphones or computers, and they have no inherent interest in scientific integrity or cumulative knowledge building because these are not goods that inherently increase profits. I don’t even think this is morally bankrupt: it is just a business, and it is our fault that we have let a system happen in which scientific publishing is a business, rather than organized by states or governments or non-profits or universities, with the goal to make scientific findings available to everyone.

(2) In a nutshell, a researcher does research and submits a paper; other researchers serve as editors for a journal and help select appropriate papers for publication; other researchers peer-review the paper; and a journal eventually publishes the paper, for a profit. None of the researchers (authors, editors, reviewers) usually get compensated. Essentially, taxpayers pay researchers who then do work that publishers sell back to taxpayers for a ridiculously high profit margin. The procedure varies a bit, but in clinical psychology, psychiatry, and methodology, the above is standard practice and representative of publishing in these fields. How is this not disheartening?

(3) The publisher ‘MDPI’ owns many journals, and in recent years most of these journals, on average, published one special issue per week. Several journals even published more than 2 special issues per day (Wikipedia). There is no way that all of this is robust, thorough, carefully vetted science, especially when one considers the very short times from submission to publication. This may be one of the reasons why some MDPI journals were recently delisted from a list of legitimate journals, and why some researchers have long considered MDPI a predatory publisher. MDPI is not alone here: Frontiers journals and other publishers have also received serious criticism.

(4) I already mentioned that in 2023, over 10,000 papers were retracted. Related headlines such as ‘Scammers impersonate guest editors to get sham papers published’ don’t exactly inspire confidence either. Relatedly, universities often use Elsevier’s Scopus rankings to determine how well they did: Scopus ranks journals on how ‘good’ they are. In 2023, of the top 10 ranked journals in philosophy, 3 were fake journals with fake authors and fake addresses and institutions. The system obviously no longer works if such journals can easily make it into the top 10 in arguably the most well-established journal ranking.

5. Problems with work pressure and incentives

Let’s move to us: researchers and educators.

(1) I’ll start by showing you how bad things are, describing my own situation. It’s never nice to hear a privileged German white guy complaining, but the point I’m making is that even with my level of privilege, things were barely manageable. During my PhD in Germany, I was in a precarious employment situation, earning less than €1250 per month, from which I needed to pay for my healthcare. This is because we were employed as ‘freelancers’, a trick universities use to save on social welfare and other contributions (they also did not pay into my pension fund). Many of my friends and colleagues were employed and paid for half-time positions during their PhDs, but expected to work full time. I then had 2 temporary postdoc positions for 2 years each. And then I worked as an Assistant Professor, in a temporary position. Only recently, at the age of 39, did I obtain tenure, i.e., my very first permanent job. Without tenure, you can’t get a mortgage from a bank for purchasing property here, so you need to keep wasting more money renting, etc.

(2) This is related to situations surrounding “publish or perish”, long work hours, mental health problems and burnout, and a lack of sustainable and permanent jobs (1, 2, 3). A recent OECD study summarizes the situation:

“Academic careers have become increasingly precarious, endangering rights, subjecting workers to difficult working conditions and stress. [..] Most were on short-term contracts or did not have any employment relationship [..] The problem has long been severe and has gotten worse over time. [..] Many countries are experiencing the emergence of a dual labour market, with the coexistence of a shrinking protected research elite and a large precarious research class that now represents the majority in most academic systems.”

(3) I want to get back to scientific evidence and quality: these issues dramatically exacerbate the problems I discussed above surrounding the validity of findings. Especially when we are employed in precarious situations, there is little overlap between the goals and motivations of scientists, and the core goal of science itself. My colleague Anna van’t Veer created a really nice figure on this topic as part of our workshop on Responsible Scholarship, showing that pressures in the system (e.g. publish or perish, competitiveness, hectic research pace) lead to scientists doing work that benefits their careers at the moment (such as flashy publications), but does not contribute to a rigorous, robust pyramid of cumulative science (cf. “Slow Science”). Scientists are largely still evaluated based on traditional metrics such as the number of publications, journal impact factors, citation rates, and so on, and optimizing those has little to do with optimizing scientific quality, especially if you optimize them under duress (i.e. worried about being unemployed next year).

6. Antidotes to cynicism creep

Let me reiterate what I said before: when someone sends me a paper or a newspaper article about a paper, my view today is that the conclusions may or may not be valid. I don’t expect things to hold up just because this is a scientific paper (compared to a blog post), or because it is peer-reviewed (compared to a preprint), or because it is published in Nature or Science, or because it is published by a famous scientist. I think my view is reasonable and supported by evidence, at least in the fields I work in.

This view can lead to cynicism, which is what I want to avoid for myself — cynicism drains my motivation and energy. It’s not enjoyable to do this sort of work when you doubt the validity of much of the literature. Cynicism can also be contagious, potentially affecting junior colleagues and students. And then you’re disheartened, your team is discouraged, your students are bummed out … not a good place to be in, for you or anyone else. I believe my team does the best work when we’re both critical and motivated. Motivation can of course come from a sense of urgency, but it becomes dangerous (for me anyway) when it tips over into cynicism.

But this view that many scientific conclusions are invalid also implies a call to action, because the status quo threatens the idea of building a robust, cumulative pyramid of foundational blocks that stand the test of time.

And this is where the energy lies. The motivation and hope and amazing opportunities to make things better, together with smart and kind people all around us. There is a good chance we can make a dent in the problems I’ve summarized above, in my lifetime, and we have already seen a lot of improvements over the last years. We need to do so not only to prevent research waste, but also to prevent a further erosion of society’s trust in science, which can be deadly as we have seen during the COVID-19 pandemic in relation to vaccines.

Here’s how I get my energy. This is necessarily idiosyncratic, but I hope some of these will work for you, too.

6.1 Watch 👀

I listed so many problems above, but none of them have to be permanent, and there is progress for all of them.

(1) I’m privileged to get to see a lot of amazing people in action. People like Jessica Schleider, Jennifer Tackett, Don Robinaugh, Marc Molendijk, Praveetha Patalay, Laura Bringmann, Anna van’t Veer, Anna van Duijvenvoorde and so many others who not only do amazing work following modern open science principles: they are also fantastic peers, mentors, and teachers. I find it inspiring and motivating to watch them work. Generally, seeing the massive grassroots movements that have popped up in the last half-decade, such as open science communities and reproducibiliteas, gives me a lot of hope for the future. There are too many people and initiatives to list here, but recently, folks have started putting out bounties to find errors in their own work, offering either payments or donations to charity if mistakes are identified (1, 2, 3). And a few weeks ago, ERROR went online, “a bug bounty program to systematically detect and report errors in scientific publications, modelled after bug bounty programs in the technology industry” where investigators are paid for discovering errors.

(2) In terms of precarious contracts, there is progress in the Netherlands both on the national level, and the level of specific universities and faculties. For instance, the Faculty of Science at the Free University Amsterdam decided to discontinue the tenure track system (which is quite unique: other jobs don’t require you to perform very well for 5 years before you are perhaps offered a permanent job). They will replace the tenure track system with a career track policy that assumes a permanent contract after 18 months.

(3) The Netherlands is also among the leading countries for rewards and recognition initiatives, i.e., for initiatives that try to change what criteria we are evaluated on. Away from traditional metrics such as impact factors and fancy journals, to the question of how open and transparent our works are, how effectively we engage with the public, and how much help and collaborative opportunities we can provide to colleagues. If you are primarily teaching, you ought to be evaluated based on .. surprise .. TEACHING rather than on your publications. Utrecht University has taken bold steps, for instance, moving towards a much more collaborative and inclusive environment.

6.2 Do 🤝

(1) I get a lot of energy from activism to improve things in academia, for example, as co-founder of the Open Science Community Leiden, member of the Young Academy Leiden, and all activities that come with these groups and initiatives. Being in the privileged position to talk about the issues that plague researchers, research, and public trust is a huge responsibility, but can also be very motivating when folks in charge respond positively. I highly recommend activism. This is why I find it so utterly alienating when people on Twitter shit on the ‘open science movement’, 99% of which consists of super enthusiastic early career researchers who have learned their lessons from flawed literature, and try to make things better in their own work.

(2) More broadly, for me personally, the way forward is to incentivize, champion, and promote better and more robust scientific work. I find this motivating and encouraging, and an efficient antidote against cynicism creep. I find it intellectually rewarding because it is an effort that spans many areas including teaching science, doing science, and communicating science. And I find it socially rewarding because it is a teamwork effort embedded in a large group of (largely early career) scientists trying to improve our fields and build a more robust, cumulative science. In the best case, these efforts not only safeguard the quality of science and its application, but also enable trust, foster equal opportunities and outcomes, and prevent research waste.

(3) Teach! I’ve been teaching a research methods course to clinical master’s students in Leiden for a few years, and I love how quickly and clearly they understand problems in the literature, and what can be done to address these issues. I’ve been able to teach workshops for international audiences, e.g. on the importance of proper theory building and testing (rather than vague, narrative ideas that can’t ever be rejected) together with Don Robinaugh; on the importance of improving our measurement practices with Jessica Flake; and on network analysis and other statistical methods with a host of other amazing researchers. Students are skeptical, and they are ready to identify and tackle challenges.

(4) Practice and celebrate, for your own work and that of others, what Merton referred to as a crucial norm in science: organized scepticism, i.e., that “scientific claims should be exposed to critical scrutiny”. And while we do so, let’s try to make sure to call out problematic work, not people (at least for a while, unless we have called out work by the same people too many times). Tough on the issue, soft on the person. Science is a social enterprise as well, and it’s no fun to be criticized. Let’s practice together, and start teaching folks earlier on in their careers, that they are not their science, and that rejected hypotheses and theories say nothing about them as people. Let’s practice criticism, for instance, following Rapoport’s rules, and let’s practice being criticized. Quote:

You should attempt to re-express your target’s position so clearly, vividly, and fairly that your target says, “Thanks, I wish I’d thought of putting it that way.”
You should list any points of agreement (especially if they are not matters of general or widespread agreement).
You should mention anything you have learned from your target.
Only then are you permitted to say so much as a word of rebuttal or criticism.

(5) Write about the problems you see. When I joined Denny Borsboom’s lab in 2014 and learned about network models and estimated coefficients, one of my first thoughts was that current estimation routines don’t provide information on model stability, such as confidence regions. So I teamed up with Sacha Epskamp and we wrote a tutorial paper on the topic, urging researchers to estimate and communicate uncertainty about the estimated model results. The community embraced the paper and practice quickly, which was very rewarding to see. Similar things happened when I teamed up with Michiel van Elk to write about problems in psychedelic science, or Jessica Flake to write about problems in psychological measurement. This gives me energy because scientists and teachers and journalists and policy-makers are willing to engage with critical material: it just needs to be out there, in accessible ways.

(6) Publish reviews. If we really keep ye good olde peer-review system, the least we must do is put the reviews online. I would very much like to see the reviews for the mindfulness paper I listed above, what the handling editor wrote, and how the authors responded. Were these issues not caught? Did the reviewers catch them but the editor did not ask the authors to fix the problems? Perhaps most importantly, did the authors have good counter-arguments, making my criticism potentially invalid? Publishing reviews resolves these questions, and while it is rare, some publishers have implemented this practice for a long time (e.g., several BMC journals as early as 2001). Modern journals such as Meta Psychology publish all communication. How cool is that? (Sidenote: I sign my reviews, because I think I should be held accountable, but I also belong to one of the most privileged groups in academia and have a permanent job, so there is little objective danger here. But I would never ask early career folks or folks in precarious employment conditions to sign.)

(7) Publish code and data, whenever possible. Researchers should be asked to share their data and code (i.e., their exact statistical analyses, usually in the form of programming code). Publishing such information can be incredibly useful because it allows the scientific community at large to practice organized skepticism: mistakes happen, in every job, and we must make sure to catch them when they happen. I have shared code for all my papers since 2015 (because that is free, I don’t think there are ever reasons not to), and data when possible, and doing so enabled the scientific community to identify a mistake in one of my statistical analyses, resulting in a correction of my work.

Of course, some of the proposed solutions and initiatives above may not work out, and some may even put us a step back — I hear you. But there is actual progress, coming from motivated folks in different areas of academia, thinking about these issues and how to solve them in good faith. This gives me a lot of energy. It’s a bit like a first therapy session for PTSD, where one of the most important things to do is normalize: “You may think your symptoms are bizarre or abnormal, but actually, these are quite common symptoms many people experience, and they are quite normal given what you have been through!” I experience the current initiatives as very normalizing for me (it’s not you Eiko, it’s the system, stupid).

And everybody is different, and my answer to the question of what gives energy may not be correct for you. I still hope it may inspire.

Which makes me wonder: what inspires you?

7. Conclusion

There are plenty of reasons to be disillusioned about the system as a whole. But maybe that’s the first necessary step: dis-illusionment, seeing things for what they are. This provides energy and motivation for change. So many people have been dis-illusioned, and we stand on the shoulders of giants who have helped to dispel false beliefs, including that peer-review guarantees scientific quality, that journals with high impact factors publish higher quality work, that Nature charging €10,000 for open access fees is fair, or that precarious employment is normal and fine.

The last decade has shown that we can really make a dent in some of these issues, just looking at the reforms we have seen at the national level (e.g., the Dutch Taverne amendment allowing researchers to share their work in public), the international level (e.g., UNESCO’s recommendations on Open Science), the journal level (e.g., TOP guidelines and Registered Reports), and the level of funders (e.g., the Dutch funder NWO mandates much more transparency than was the case a decade ago). While we share values of open science, work is in progress figuring out how to best implement these values, and this is a genuinely difficult process that will take time. But let’s not mistake important debates about which practices best implement our values for a lack of progress generally.

¹ In case you’re interested in evidence for this claim, see e.g. here, here, and here for the depression literature.
² Adam Mastroianni wrote a great piece on problems of peer-review.