Reviewer #3 reporting for duty. Ready to roll the dice of peer-review! [about]
Researchers serve as the gatekeepers of science, which is on the one hand a position of great power, but on the other hand comes with a lot of effort, barely any reward, and surprisingly little accountability.
Since not everyone takes this responsibility seriously, this has led to the rise of the infamous reviewer number three.
Problem 1: Adversarial Reviewers ("Peer-Review Bullies")
The reality of peer-review is that a single author of prior work can quite reliably kill a top-tier work across multiple submissions, especially when there are few suitable reviewers for the topic. This unjustified trust in a single person, on top of all the reviewing noise (see further down), makes high-quality research ("slow science") unsustainable and explains why many authors opt for quantity instead.
Some common causes for adversarial reviewing:
Cover-Up: submission uncovers errors in the reviewer's work
Corruption: submission competes with reviewer's research
Dogmatism: submission does not adhere to reviewer's personal ideologies
Immaturity: submission is unfortunate victim of reviewer's bad mood
Some common tactics of adversarial reviewers:
Hunt: bid on the targeted paper and request it whenever possible
Appeal to authority: exploit status as author of prior/related work to make a lot of unsubstantiated claims that undermine the submission
Distort: dismiss novelties as incremental to their and others' prior work
Play dumb: ignore significant achievements of a submission (even when other reviewers clearly recognise them)
What's done about it:
Nothing: academia naively adopts an honor system!
What could be done about it:
Free COI: Allow authors to block the authors of one paper as reviewers (via conflict of interest) to circumvent gatekeeping by those authors (some journals incorporate something akin to this principle)
Transparency: Instruct reviewers to declare soft conflicts of interest, such as being a "competitor" or an author of prior work, given the biases involved (especially if the submission is critical of this prior work)
Appeal: Instruct reviewers to focus more on the readership of the venue and less on their own research, career and opinions.
Reform: push reviewers to at least try to get a submission's point (see proposal)
Problem 2: Evaluation noise
Sampling a handful of reviewers seems insufficient for a high-variance population such as the reviewer pool. Reviewers not only strongly disagree with each other's opinions, but also differ significantly along all relevant reviewer quality dimensions:
Dutiful: puts enough time in to provide an informed review
Open-minded: openness to how a paper can contribute
Thorough: systematically goes through all positives and negatives of a work
Competent: some understanding of the subject area, well-rounded fundamentals and ability to quickly grasp complicated concepts
Reasonable: not expecting every submission to have groundbreaking results and being able to identify and admit if unqualified to review a work
Ethical: same reviewing sentiment regardless if submission supports/criticises/supersedes or competes with reviewer's own research
A lack of many of these qualities also means reviewers engage with a submission less deeply, and the whole reviewing process becomes about the reviewers themselves rather than the submission. Each reviewer has a varying submission-independent base attitude, agenda, set of unsubstantiated opinions and submission-unrelated pet peeves.
Thus, the "background noise" due to reviewer assignment can often drown out the "signal" of the submission's quality and merit. One can think of each accepted paper as being one coin flip away from rejection.
In the NeurIPS'21 experiment, around 10% of all submissions were assigned a second, independent set of reviewers, and of the papers accepted by the original reviewers, more than half would have been rejected by the second set. Caveat: this effect may have been reinforced by authors focusing more of their rebuttal energy on the more positive set of reviewers, as they only needed to convince one set.
For submissions that face adversarial reviewers, the coin can become quite biased against them, even when they offer substantial scientific advancement. So, what can we do to boost the signal?
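The scale of this noise is easy to illustrate with a small Monte Carlo sketch. All numbers below (score scale, noise level, acceptance threshold) are illustrative assumptions, not the actual NeurIPS setup; the point is only that plausible per-reviewer noise already produces large panel-to-panel disagreement:

```python
import random

random.seed(0)

def panel_accepts(quality, n_reviewers=3, noise=1.5, threshold=6.0):
    """Accept iff the mean of noisy reviewer scores clears the threshold."""
    scores = [quality + random.gauss(0, noise) for _ in range(n_reviewers)]
    return sum(scores) / n_reviewers >= threshold

# Latent paper quality on an assumed 1-10 style scale.
papers = [random.gauss(5.0, 1.0) for _ in range(20000)]

accepted_once = [q for q in papers if panel_accepts(q)]
accepted_twice = [q for q in accepted_once if panel_accepts(q)]

disagreement = 1 - len(accepted_twice) / len(accepted_once)
print(f"overall accept rate: {len(accepted_once) / len(papers):.0%}")
print(f"accepted papers a second panel would reject: {disagreement:.0%}")
```

Even though both panels see the same underlying paper quality, a large fraction of "accepted" papers flips to rejection under an independent draw of reviewers, in the same spirit as the NeurIPS'21 finding.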
Proposal: Let authors help denoise the signal!
The basic idea is to have a pre-reviewing phase that requires less effort from reviewers and thereby makes it feasible to ask for more pre-reviews. In this phase, reviewers familiarise themselves with the paper and only provide their understanding of what the authors tried to do in the work:
I. Uncritical/positive reviews: reviewers first summarise the contributions intended by the authors (giving the benefit of the doubt in terms of execution).
This makes the reviewer assignment less random: it gives the submission authors useful information for selecting the most qualified reviewers, while revealing only faint hints of reviewer sentiment (the reviewers have not even been instructed to make up their minds yet!):
II. Author veto: the submission authors can veto a few reviewers based on their uncritical reviews such that 3 reviewers remain.
Then the 3 remaining reviewers finish up as usual:
III. Reviewing completion: the 3 author-selected reviewers then write an ordinary review that scrutinises the submission, identifies both strengths and weaknesses, and weighs them against each other
In phase I., the reviewers collectively create a pool of questions, which the authors answer in phase II. (the questions do not inform the author vetoes, as they are not linked to reviewer IDs) and which help the selected reviewers in phase III. avoid severe misunderstandings.
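The three phases can be sketched as a minimal, self-contained simulation. Everything here is illustrative: the pool size, the way a summary's quality is modelled as a noisy read of the reviewer's actual understanding, and the veto-until-three rule are assumptions used only to show the data flow of the proposal:

```python
import random
from dataclasses import dataclass

random.seed(1)

@dataclass
class Reviewer:
    name: str
    understanding: float  # how well this reviewer grasps the submission (0..1)

# Phase I: a larger pool of reviewers each writes an uncritical summary.
# The summary quality visible to the authors is a noisy reflection of the
# reviewer's true understanding of the work.
pool = [Reviewer(f"R{i}", random.random()) for i in range(6)]
summaries = {r.name: r.understanding + random.gauss(0, 0.1) for r in pool}

# Phase II: the authors veto the reviewers whose summaries reveal the
# weakest grasp of the submission, until 3 reviewers remain.
panel = sorted(pool, key=lambda r: summaries[r.name], reverse=True)[:3]

# Phase III: the remaining 3 reviewers write ordinary critical reviews.
print("final panel:", [r.name for r in panel])
```

The key property the sketch shows: vetoes operate only on the summaries (what reviewers understood), not on sentiment or on the anonymous question pool, so authors can filter for competence without cherry-picking positivity.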
With 4 uncritical reviews, the complete workflow then proceeds through the three phases described above.
Some more details:
Phase I. rejections:
If none of the reviewers in phase I. was able to write a positive review (even while giving the authors the benefit of the doubt wherever possible), this likely indicates that there is not much point in proceeding with the second phase.
Number of phase II. author vetoes: It is difficult to foretell how many uncritical reviews could realistically be collected per paper (at least 4 would already be a huge improvement), but
this number could also be adjusted to reward authors who themselves contribute many useful reviews (e.g., as evidenced by not being vetoed in phase II.).
Workflow support: Most reviewing systems support some kind of rebuttal, which would allow the proposal to be implemented. Asking reviewers to first focus purely on summarising the intended contributions may also help them write better reviews, as many reviewers tend to focus too much on the prior work they know and on comparisons establishing the "state of the art", and lose track of what the authors are actually trying to share.
Social Aspect of Reviewing
Reviewing often happens in isolation rather than as a social experience, and it is becoming increasingly anonymous, which makes it less interesting, less accountable and harder to do; it feels more like a homework assignment than a scientific activity.
Discussions often centre around arriving at consensus rather than at scientific facts, which means contradictory criticisms by the reviewers add up to a big negative instead of being exposed as erroneous/unsupported.
Transparency in Reviewing
An interesting idea to help authors and reviewers to reflect upon the process is to open up the reviews to the public. This is for instance practiced in machine learning conferences such as ICLR and NeurIPS.
Such an approach maintains the anonymity of reviewers for the most part and introduces a minimal amount of accountability into the process, as sloppy and presumptuous reviews may reflect poorly on a conference.
While most computer science conferences are moving towards double-blind reviewing,
it may also be useful to consider experimenting with non-anonymous reviewing. Ideally, multiple models could coexist such that
authors and reviewers are given the choice. Everything being out in the public could potentially hurt some relationships and put too much stress on reviewers,
but anonymity also has well-known disadvantages and disassociates reviewers from the valuable work they do. Which approach works best depends a lot on the authors and reviewers involved.
If everyone involved is motivated by a search for knowledge, then a system is barely needed. If instead everything is just about the egos and careers of reviewers and authors, even the best system can only do so much.
If the problem is more of a cultural nature, it likely cannot be solved behind closed doors and would require shining a light on the problem. In conclusion: a look behind the curtain could be very helpful.
About this webpage
Rejections are often something researchers are ashamed of, and current systems offer no recourse when faced with adversarial reviewers who argue disingenuously. This makes "slow science" very risky for early-career researchers.
Thus, this webpage intends to:
Take a critical look at why peer-review is so unpredictable (or predictably bad)
Constructively propose reforms that could make a real difference
Other related webpages
Is it really computer "science" and how can we get there? [link]
Should researchers focus on fewer papers with greater quality? [link]