gatekeeping.science
Reviewer #3 reporting for duty. Ready to roll the dice of peer-review!

Gatekeeping Science

Researchers serve as the gatekeepers of science, which is on the one hand a position of great power, but on the other hand comes with a lot of effort, barely any reward and surprisingly little accountability. As not everyone takes this responsibility seriously, this has led to the rise of the infamous reviewer number three.

Problem 1: Adversarial Reviewers ("Peer-Review Bullies")

The reality of peer-review is that a single author of prior work can quite reliably kill a top-tier work across multiple submissions, especially if there are few suitable reviewers on the topic. This unjustified trust in a single person, on top of all the reviewing noise (see further down), makes high-quality research ("slow science") unsustainable and explains why many authors opt for quantity instead.

Some common causes for adversarial reviewing:
Methods:
What's done about it:
What could be done about it:

Problem 2: Evaluation noise

Sampling a handful of reviewers seems insufficient for a high-variance population such as the reviewer pool. Reviewers not only strongly disagree with each other's opinions, but also differ significantly along all relevant reviewer quality dimensions:

A lack of many of these qualities also means reviewers engage with a submission less deeply, and the whole reviewing process becomes about the reviewers themselves rather than the submission. Each reviewer brings a submission-independent base attitude, agenda, set of unsubstantiated opinions and submission-unrelated pet peeves. Thus, the "background noise" due to reviewer assignment can often drown out the "signal" of the submission's quality and merit.

One can think of each accepted paper as being one coin flip away from rejection, and for submissions that face adversarial reviewers the coin can become quite biased against them, even when they offer substantial scientific advancement. In the NeurIPS'21 experiment, around 10% of all submissions were assigned a second independent set of reviewers, and of the papers accepted by the original reviewers more than half would have been rejected by the second set. Caveat: This may have been reinforced by authors focusing more energy on the more positive set of reviewers in rebuttals, as they only needed to convince one set of reviewers.

So, what can we do to boost the signal?
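The coin-flip picture can be made concrete with a toy simulation. The sketch below is not a model of the NeurIPS'21 experiment: the Gaussian "true quality", the Gaussian reviewer noise, the acceptance rate and the function name `simulate_disagreement` are all assumptions chosen for illustration. It only shows how the share of accepted papers that a second, independent committee would reject depends on reviewer noise and on the number of reviewers per paper.

```python
import random
import statistics


def simulate_disagreement(n_papers=5000, n_reviewers=3, noise_sd=1.0,
                          accept_rate=0.25, seed=0):
    """Share of papers accepted by committee A that committee B would reject.

    Assumed toy model: every paper has a hidden quality drawn from N(0, 1),
    and each reviewer reports that quality plus independent N(0, noise_sd) noise.
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0.0, 1.0) for _ in range(n_papers)]

    def committee_scores():
        # Each paper's score is the mean of n_reviewers noisy readings.
        return [
            q + statistics.fmean(rng.gauss(0.0, noise_sd) for _ in range(n_reviewers))
            for q in quality
        ]

    def accepted(scores):
        # Accept the top accept_rate fraction of papers by committee score.
        k = int(accept_rate * n_papers)
        ranked = sorted(range(n_papers), key=scores.__getitem__, reverse=True)
        return set(ranked[:k])

    first = accepted(committee_scores())
    second = accepted(committee_scores())
    return len(first - second) / len(first)


if __name__ == "__main__":
    for reviewers in (3, 6, 12):
        share = simulate_disagreement(n_reviewers=reviewers, noise_sd=2.0)
        print(f"{reviewers} reviewers: {share:.0%} of accepted papers flip")
```

In this toy model, raising `noise_sd` pushes the flip rate up, while averaging over more reviewers pulls it down, which is the intuition behind asking for more (but cheaper) reviews.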

Proposal: Let authors help denoise the signal!

The basic idea is to have a pre-reviewing phase that requires less effort from reviewers and thereby allows asking for more pre-reviews. In this phase, reviewers familiarise themselves with the paper and only provide their understanding of what the authors tried to do in the work:

This makes the reviewer assignment less random, as it gives the submission authors useful information to select the most qualified reviewers for the submission, with only some faint speculation on reviewer sentiment (reviewers have not yet even been instructed to make up their minds!):

Then the 3 remaining reviewers finish up as usual:

In phase I, reviewers together create a heap of questions, which the authors answer in phase II (the questions do not inform author vetoes, as they are not linked to reviewer IDs) and which help the selected reviewers in phase III avoid severe misunderstandings. The complete workflow for 4 uncritical reviews is then:
[Figure: sketch of the complete workflow]
Some more details:
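The three phases can also be written down as a small data-flow sketch. This is only one possible reading of the proposal: the names `PreReview`, `phase_one`, `phase_two` and `phase_three` are hypothetical, and the example assumes 4 pre-reviews with one author veto so that 3 reviewers remain. The one property taken directly from the text is that questions are pooled without reviewer IDs, so they cannot inform the vetoes.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PreReview:
    reviewer_id: str
    summary: str                 # the reviewer's understanding of the work, no verdict
    questions: list = field(default_factory=list)


def phase_one(pre_reviews, seed=0):
    """Phase I: keep summaries attributable, pool questions anonymously."""
    summaries = {p.reviewer_id: p.summary for p in pre_reviews}
    # Deduplicate, then shuffle so the order does not hint at who asked what.
    heap = sorted({q for p in pre_reviews for q in p.questions})
    random.Random(seed).shuffle(heap)
    return summaries, heap


def phase_two(summaries, heap, vetoes, answers, keep=3):
    """Phase II: authors veto reviewers based on summaries and answer the heap."""
    remaining = [rid for rid in summaries if rid not in vetoes]
    if len(remaining) != keep:
        raise ValueError(f"expected {keep} remaining reviewers, got {len(remaining)}")
    missing = [q for q in heap if q not in answers]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")
    return remaining, {q: answers[q] for q in heap}


def phase_three(remaining, answered):
    """Phase III: each remaining reviewer gets the full Q&A before reviewing."""
    return {rid: answered for rid in remaining}


if __name__ == "__main__":
    pre = [
        PreReview("R1", "Proposes a faster index.", ["Is the bound tight?"]),
        PreReview("R2", "Seems to be a survey.", []),
        PreReview("R3", "New index with proofs.", ["Which baselines?"]),
        PreReview("R4", "Index plus experiments.", ["Is the bound tight?"]),
    ]
    summaries, heap = phase_one(pre)
    answers = {q: "(author answer)" for q in heap}
    remaining, answered = phase_two(summaries, heap, vetoes={"R2"}, answers=answers)
    print(phase_three(remaining, answered))
```

Note that `phase_two` receives the question heap as plain strings only, so a veto can be based on a reviewer's summary but never on the questions that reviewer asked.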

Social Aspect of Reviewing

Reviewing often happens in isolation rather than as a social experience and is becoming increasingly anonymous, which makes it less interesting, less accountable and harder to do, so that it feels more like a homework assignment than a scientific activity. Discussions often center around arriving at consensus rather than scientific facts, which means contradictory criticisms by the reviewers add up to a big negative instead of being exposed as erroneous or unsupported.

Transparency in Reviewing

An interesting idea to help authors and reviewers reflect upon the process is to open up the reviews to the public. This is, for instance, practiced at machine learning conferences such as ICLR and NeurIPS. Such an approach maintains the anonymity of reviewers for the most part and introduces a minimal amount of accountability into the process, as sloppy and presumptuous reviews may reflect poorly on a conference. While most computer science conferences are moving towards double-blind reviewing, it may also be useful to consider experimenting with non-anonymous reviewing. Ideally, multiple models could coexist such that authors and reviewers are given the choice. Everything being out in public could potentially hurt some relationships and put too much stress on reviewers, but anonymity also has well-known disadvantages and dissociates reviewers from the valuable work they do. Which approach works best depends a lot on the authors and reviewers involved. If everyone involved is motivated by a search for knowledge, then a system is barely needed. If instead everything is just about the egos and careers of reviewers and authors, even the best system can only do so much. If the problem is more of a cultural nature, it likely cannot be solved behind closed doors and would require shining a light on the problem. In conclusion: A look behind the curtain could be very helpful.

About this webpage

Rejections are often something researchers are ashamed of, and current systems offer no recourse when faced with adversarial reviewers who argue disingenuously. This makes "slow science" very risky for early-career researchers. Thus, this webpage intends to:

  1. Take a critical look at why peer-review is so unpredictable (or predictably bad)
  2. Constructively propose reforms that could make a real difference

Other related webpages


compsci.science Is it really computer "science" and how can we get there?

slow.science Should researchers focus on fewer papers with greater quality?

shekelyan.science Personal webpage.

Note: The views and opinions expressed on this site are those of the authors and do not necessarily reflect the official policy or position of their employers.