Diehard Empiricist: 6 September 2011 – Peer review, and a whiff of grapeshot

Colleagues,

The only thing that differentiates scientists from the mass of “analysts” and other assorted wonks populating the landscape is the “science” part of our occupational classification. Science, as I have opined on more than one occasion, depends for its credibility entirely upon the proven efficacy of the scientific method in weeding out error. Science is self-correcting precisely because any argument may be challenged, and the validity of arguments is judged solely on the basis of which of two or more competing theses is better supported by observations. It is, in essence, a dialectic (in the Socratic rather than corrupt Marxist sense) consisting of thesis, antithesis and synthesis - of argument, counterargument, counter-counterargument, and so forth, until the hypothesis that best explains the data is identified. And that hypothesis survives in turn only so long as nothing better comes along. It’s an intellectual “survival of the fittest”, to borrow a well-worn phrase.

As might be expected of a former gunner, I tend to think of scientific arguments in terms of a classical, pre-GPS fire mission. A forward observer, using the best information available and processing it in accordance with his knowledge and expertise, orders an adjusting round. The first such round rarely strikes the target (artillery, after all, is an area weapon, with probable errors in bearing and range inherent in the system); and the observer correspondingly orders a correction, instructing the command post to adjust their aim-point by so many metres left or right and plus or minus of the observed fall of shot.

Figure 1 - Adjusting a Target

(Source: FM 6-30 Tactics, Techniques, and Procedures for Observed Fire, 16 July 1991)

Figure 2 - Bracketing a Target

(Source: FM 6-30 Tactics, Techniques, and Procedures for Observed Fire, 16 July 1991)

The next round hopefully will be closer to the target; and the process continues until the target has been hit (and, with a massed fire-for-effect, destroyed). It’s not a perfect analogy, of course; scientists refining a theory tend to attempt to get as close as possible to the truth with each successive adjustment, whereas in adjusting a target with artillery fire one deliberately overshoots in order to gain a better “bracket” for subsequent adjustments. But the principle is, I think, similar.
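For readers who prefer code to gunnery, the bracketing idea can be sketched in a few lines of Python. This is a deliberately simplified, one-dimensional illustration, not the procedure laid out in FM 6-30: the function name, the 400-metre opening correction and the 50-metre stopping rule are all assumptions chosen for the example, and the observer is assumed to see only whether each round falls short of or beyond the target.

```python
def bracket_target(target_m, first_aim_m, first_step_m=400.0, final_step_m=50.0):
    """Return the range (in metres) of each round in a simplified bracketing shoot.

    All parameter names and default values are illustrative assumptions.
    """
    aim = first_aim_m
    step = first_step_m
    impacts = [aim]          # the first adjusting round
    last_spotting = None
    bracketed = False

    while aim != target_m:
        # The observer only reports whether the round fell short or over.
        spotting = "short" if aim < target_m else "over"

        # A change from "short" to "over" (or vice versa) means the target
        # now lies between two observed rounds: the bracket is established.
        if last_spotting is not None and spotting != last_spotting:
            bracketed = True

        # Once bracketed, each new correction splits the bracket in half.
        if bracketed:
            step /= 2
        if step < final_step_m:
            break            # bracket is tight enough to fire for effect

        # Apply the correction toward the target: add if short, drop if over.
        aim += step if spotting == "short" else -step
        impacts.append(aim)
        last_spotting = spotting

    return impacts
```

With a target at 3,250 metres and a first round at 3,000 metres, `bracket_target(3250, 3000)` puts rounds at 3,000, 3,400, 3,200, 3,300 and 3,250 metres. The second round goes over on purpose; that deliberate overshoot is what establishes the bracket that every later correction halves.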

Naturally, when conducting a fire mission the gold standard is the (formerly comparatively rare) first-round hit. The closer you are to the intended target on the first adjusting round, the less time and ammunition it takes to bracket the target. We strive for something similar in the scientific field; the “first-round hit”, to us, is analogous to a paper that posits an hypothesis, contains and correctly interprets supporting evidence, and offers conclusions that are justifiable and, hopefully, accurate. In the hard sciences results must be replicable, but those of us who work in the field of social sciences accept that this level of scientific fidelity is almost by definition unobtainable. There is of course no guarantee that subsequent studies will not provide refinements of our understanding of the data (as, for example, Newton’s understanding of motion has subsequently been refined at small scales by quantum mechanics, and at large scales by relativity), but the goal of scientific inquiry should nonetheless always be to get a “first-round hit”.

In science, we have an advantage that forward observers have never enjoyed: a second-glance mechanism designed to improve the likelihood that our “first round” will be a “hit”. I refer, of course, to the peer review process. Peer review serves a fundamental function from the point of view of the writer; resorting to artillery terminology once again, it’s a “gross error check” on our plan for hitting the target. In gunnery training, each gun has an assigned safety officer who stands (or more often sits) behind the gun with a map, a compass, a set of graphical firing tables (slide rules) or a copy of the tabular firing tables (a book). His job is to listen to the fire orders received by the gun crew commander - especially bearing, charge and elevation - and, using his firing tables, compass and map, to determine not whether the shell will hit any particular target, but simply whether it is going to fall within a safe impact area. Since an impact area - in Canada, anyway - may be several kilometres in width and depth, the data being fed to the gun might put the round within the safe zone but nowhere near the target; but that’s not the safety officer’s concern. He merely ensures that the round is not going to land out of bounds. It’s still possible for the shell to be wildly off target - but it won’t be so far off target that the safety officer couldn’t let it go. Indeed, the gun crew commander is not permitted to fire until the safety officer has said “Let it go” (or to use the correct terminology, “Safe!”).

And what, you ask, happens if the round falls wildly off target? The fire mission is suspended while the data are checked to determine the source of the error (i.e., were the data wrong, or were they misapplied to the gun? Was it a charge error? etc.). No opprobrium attaches to the safety officer; so long as the shell didn’t fall outside of the impact area, he is considered to have done his job.

The primordial duty of the peer reviewer is analogous to that of an artillery safety officer: to determine whether a paper is methodologically sound before allowing it to proceed. The peer reviewer looks for gross errors, not minutiae; and apart from highlighting gross errors, it is not his responsibility to attempt to determine whether a paper is “right” or “wrong”, as the means of doing so under the scientific method is, once the paper has been published, to challenge its data, method and/or conclusions in open literature. Where there are differences of opinion on non-methodological issues, the reviewer’s opinion must cede to that of the writer, because it is the writer, not the reviewer, who bears ultimate responsibility for the content of the paper. The reviewer, in short, is not a referee or a gatekeeper; he is merely a “gross error check”. As with an artillery safety officer, the default response, absent a grotesquely crippling flaw that disqualifies the paper from constituting sound science, should be to “let it go”, and allow the author to bear - as he or she must - the responsibility for any errors the paper may contain.

When I say that reviewers should not waste their time on “minutiae”, I mean that a peer reviewer should not be required to serve as a copy-editor. Most authors I know appreciate it when reviewers identify obvious misspellings or typographical errors, or indicate a place where a word seems to be superfluous or missing; in the to-and-fro of editing, re-editing and re-re-editing, I know that I find it difficult to proof-read my own work (repeated re-readings cause me to see what I intended to write rather than what actually appears on the page). But it is not the task of the peer reviewer to correct spelling, much less offer stylistic or grammatical suggestions; presumably anyone drawing a salary as a writer should be sufficiently familiar with the grammatical conventions of the language in which they are writing that corrections should be unnecessary, while style, as a matter of individual taste, is largely subjective. Tinkering with language consumes enormous amounts of time - time that might be better spent on more fruitful pursuits.

Why am I on about peer review this week? Well, because of how corrupt it has become. I refer, of course, to the climate “science” community - with deliberate sneer quotes, as the scholarly practices of that community have become so shady as to leach the few remaining vestiges of credibility from its veins. Allow me to explain. About six weeks ago, R.W. Spencer and W.D. Braswell published a paper in Remote Sensing entitled “On the misdiagnosis of surface temperature feedbacks from variations in Earth’s radiant energy balance” (Remote Sensing, 2011, 3, 1603-1613). For once I won’t bore you with a summary of the paper; basically, it uses observed data from satellite measurements to challenge the “consensus” position on distinguishing between radiative forcing and radiative feedback. It’s pretty innocuous and uses very bland language - but if its arguments are sustained, then it substantially undermines the AGW thesis by demonstrating that all existing climate models drastically overestimate radiative forcing and, therefore, the likely temperature response to increases in atmospheric CO₂ concentration.

While the content of the paper is interesting, what is more interesting is what has happened since it was first published. First, the timeline from receipt of the paper to publication is worth noting:

· Received: 24 May 2011;

· In revised form: 13 July 2011;

· Accepted: 15 July 2011;

· Published: 25 July 2011 [http://www.mdpi.com/2072-4292/3/8/1603/].

In other words, the paper went from submission through peer review and response to acceptance and publication in the space of two months. That’s a pretty impressive turnaround time. Granted, the thing is only 11 pages long, but any reviewer worth his salt would have had to work his way through the data and the math in order to conduct an adequate “gross error check”. Three reviewers did so.

Following publication, though, things really got interesting. First, the usual pro-AGW websites (like Realclimate.org) went berserk, indulging in ad hominem attacks against Spencer and Braswell, and criticizing Remote Sensing for publishing the paper at all. This resulted last week in an unprecedented event: the journal’s editor-in-chief, Wolfgang Wagner, resigned his post, stating in an explanatory article that the paper should never have been published [http://www.mdpi.com/2072-4292/3/9/2002/pdf]. Wagner’s mea culpa, which is a masterpiece of self-righteous obfuscation, argues that the Spencer-Braswell paper should not have been published because it contained “methodological errors and false claims”, none of which apparently were picked up by the peer reviewers (who, unlike Wagner, were actually conversant with the field in which Spencer and Braswell are working). Wagner’s explanation of his resignation, however, does not actually cite any of these alleged “methodological errors and false claims”, nor does he cite any contrarian peer-reviewed publications that Spencer and Braswell did not already cite themselves, specifically for the purpose of challenging them.

Placing the blame on the managing editor, Wagner alleges that the editorial board “unintentionally selected three reviewers who probably share some climate sceptic notions of the authors”. This, apparently, is enough to disqualify them from reviewing scientific papers (although possessing pro-AGW “notions” apparently does not disqualify warmists from reviewing each other’s work; such “pal review”, as I have elsewhere observed, has regrettably become the hallmark of the pro-AGW community). This implies that while alarmists can be trusted to review each other’s work to produce unimpeachable science, “sceptics” are a herd of corrupt ne’er-do-wells. This is an amusing conceit from the point of view of numbers, too; if, as the warmists like to argue, those who challenge the “scientific consensus” are a tiny minority, then is it not statistically extremely unlikely that the three reviewers chosen at random for a given paper would ALL be sceptics?
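The arithmetic behind that question is easy to sketch. As a rough illustration only, assume that a fraction p of qualified reviewers are sceptics and that the three reviewers are drawn independently at random. Both the independence assumption and the value p = 0.05 are hypothetical, chosen purely for the example and not taken from any survey.

```latex
P(\text{all three reviewers are sceptics}) = p^{3} = (0.05)^{3} = 1.25 \times 10^{-4} = \frac{1}{8000}
```

On those assumed numbers, an all-sceptic panel would turn up by chance about once in every 8,000 papers. Reviewers are in practice picked for their expertise and not by lot, so the figure is indicative only.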

Such questions are of course rhetorical. The proper response to a scientific paper is not an editor’s resignation, but either a published comment pointing out substantive flaws, or another peer-reviewed scientific paper challenging the offending paper’s data, methodology or conclusions. Ad hominem attacks are not science. Snide comments, back-room wheeling and dealing, collusion between interested parties to keep papers out of the literature, and hand-wringing, content-free denunciations of legitimate work are not science. And sending a paper back to peer review over and over again until the editor’s preferred result is achieved is not science.

Wagner’s grandstanding resignation short-circuited the scientific method and sensationalized what should have been a dispassionate debate about the validity of observed data and calculations. His actions are especially egregious given the manner in which certain individuals in the climate “science” field have played fast and loose with methodology and the peer review process; as the Climategate emails demonstrate, it was Phil Jones of the University of East Anglia’s Climate Research Unit, one of the icons of the AGW pantheon, who, discussing papers submitted to the IPCC that challenged the “scientific consensus”, said:

“Kevin and I will keep them out somehow -- even if we have to redefine what the peer-review literature is!” (Note A)

And it was Jones and Michael Mann, Mr. Hockey Stick himself, who, when discussing how to deal with journals that publish sceptics’ papers, engaged in the following exchange:

“Perhaps we should encourage our colleagues in the climate research community to no longer submit to, or cite papers in, this journal,” Mann writes.

“I will be emailing the journal to tell them I’m having nothing more to do with it until they rid themselves of this troublesome editor,” Jones replies. (Note A)

Blacklisting scientific journals and referring to their editors in the same terms that Henry II used vis-à-vis Thomas Becket is about as far from legitimate science as you can get.

It goes further. Yesterday, a rebuttal to the Spencer-Braswell paper appeared, published in Geophysical Research Letters - a remarkable feat, given that the time from the first appearance of the Spencer-Braswell paper, through to submission, review and publication, was only six weeks (A.E. Dessler, “Cloud variations and the Earth’s energy budget”). That’s not a lot of time. I haven’t read Dessler’s paper yet, but the fact that it got written and pushed out so fast is remarkable - particularly when you consider how long some sceptics’ papers have mouldered in the peer review process. Science, for example, spent “months” sitting on Spencer’s critique of an earlier paper by Dessler, while it took Lindzen and Choi two years to get a paper examining climate sensitivity out due to resistance from hostile reviewers. The most egregious (recent) example was a paper by Ryan O’Donnell, et al., that challenged a paper on Antarctic warming by Eric Steig, et al.; the Journal of Climate actually selected Steig as a reviewer for the paper challenging his work. That’s not just unscientific, it’s a patent conflict of interest. How can it possibly be deemed legitimate to allow someone to review a paper that contradicts their own work? No credible scientific organization would countenance such a blatantly corrupt practice - and if they did, colleagues and clients alike would be justified in discounting anything that organization thereafter produced.

The peer review process can only serve its intended purpose if it is transparent, professionally applied, and timely. The process is depressingly easy to undermine, and may be fatally compromised by any of a number of methodological flaws. The above examples illustrate the three most dangerous potential pitfalls:

· Reviewer interest or engagement: it is unprofessional and unscientific for a paper to be reviewed by anyone with a personal or professional interest in its conclusions. We are all of us human. While presumably some folks are able to separate personal and professional sentiments sufficiently to give an honest reading to a paper that challenges (or for that matter, that supports) their own work, such treatment undermines peer review by allowing for the appearance of impropriety. Even the suggestion of a conflict of interest undermines the process. Anyone selected to review a paper, whether as a peer or a supervisor, has a professional obligation to disclose any potential conflict of interest before the fact, and an ethical responsibility to recuse himself from the review and publication processes for that paper.

· Reviewer incapacity: peer review only serves its intended purpose if the reviewer is conversant with the material discussed and the methodologies used in the paper under review. In highly specialized fields, this will not always be the case; thus, it is incumbent upon the reviewer to declare incapacity where appropriate, restricting his review to methodological considerations only; and it is incumbent upon the editor to ensure that at least ONE reviewer knows something about the subject of the paper under review. Simply put, the peer review process doesn’t work if the reviewer isn’t a “peer” in any meaningful sense.

· Timeliness: unduly delaying a paper defeats the purpose of the review and publication process, and doing so for any non-scientific reason is a blatant violation of scientific ethics. While it would normally be impossible to discern motivations behind inordinate delays in the review and publication process, the Climategate emails feature statements by certain individuals (e.g., Jones and Mann) of their intent to prevent the publication of papers challenging their work. Delaying papers that challenge a preferred position is grossly unethical; as scientists, we have an obligation to ensure that all viewpoints are published, both in order to reinforce the transparency upon which the scientific method depends, and to provide all comers the opportunity to challenge the data, methods and conclusions offered by our colleagues and ourselves. It is an especially pernicious violation of scientific ethics to deliberately delay publication of a paper in order to allow a contrarian paper or rebuttal to be published concurrently or shortly afterwards; and where the editorial authority is himself an interested author in the field, as was the case with Jones, any attempt to delay or deny publication is grossly unethical. Finally, as the above examples suggest, undue publication delays unfairly disadvantage authors by denying them the opportunity to weigh in on a developing topic of scientific debate. As the Lindzen-Choi example demonstrates, it is difficult to exercise any impact or influence through one’s arguments if one cannot get one’s work past self-appointed scientific Cerberuses.

Used properly, in a timely and transparent fashion, peer review is not merely helpful; it is the sine qua non of science. Like an artillery safety officer, it serves to keep us within the bounds of legitimacy. But peer review only serves its purpose under strictly defined circumstances, and only if the above criteria are observed. It is depressingly easy to undermine and circumvent, and we need to be aware of its weaknesses and limitations. As the above examples demonstrate, we need to understand that the process can be subverted to delay, obfuscate and undermine the method that is the heart of our work, and that, when all is said and done, is the only thing that separates those of us who aspire to the title of “scientist” from the vast, heaving heterogeneous masses of undifferentiated “analysts”.

Our methodology is all we have – and, as with “gross error checks” in the artillery, it is all that stands between the literal translation of the gunner motto, Ubique (“Everywhere”), and the translation preferred by the infantry: “All over the frigging place.”

Cheers,

//Don//

Notes:

A) http://www.washingtonpost.com/wp-dyn/content/article/2009/11/21/AR2009112102186.html

Saturday, August 4, 2012
