Saturday, April 26, 2014

Can 2130 physicists pounding on keyboards turn out Shakespeare plays?

The CMS Collaboration, of which I am a member, has submitted 335 papers to refereed journals since 2009, including 109 such papers in 2013. Each of these papers had about 2130 authors. That means that the author list alone runs 15 printed pages. In some cases, the author list takes up more space than the actual content of the paper!
One might wonder: How do 2130 people write a scientific paper for a journal? Through a confluence of circumstances, I’ve been directly involved in the preparation of several papers over the last few months, so I have been thinking a lot about how this gets done, and thought I might use this opportunity to shed some light on the publication process. What I will not discuss here is why a paper should have 2130 authors and not more (or fewer)—this is a very interesting topic, but for now we will work from the premise that there are 2130 authors who, by signing the paper, take scientific responsibility for the correctness of its contents. How can such a big group organize itself to submit a scientific paper at all, and how can it turn out 109 papers in a year?
Certainly, with this many authors and this many papers, some set of uniform procedures is needed, and some number of people must put in substantial effort to maintain and operate the procedures. Each collaboration does things a bit differently, but all have the same goal in mind: to submit papers that are first correct (in the scientific sense of “correct” as in “not wrong with a high level of confidence”), and that are also timely. Correct takes precedence over timely; it would be quite an embarrassment to produce a paper that was incorrect because the work was done quickly and not carefully. Fortunately, in my many years in particle physics, I can think of very few cases when a correction to a published paper had to be issued, and never have I seen a paper from an experiment I have worked on be retracted. This suggests that the publication procedures are indeed meeting their goals.
But even though being correct trumps everything, having an efficient publication process is still important. It would also be a shame to be scooped by a competitor on an interesting result because your paper was stuck inside your collaboration’s review process. So there is an important balance to be struck between being careful and being efficient.
One thing that would not be efficient would be for every one of the 2130 authors to scrutinize every publishable result in detail. If we were to try to do this, everyone would soon become consumed by reviewing data analyses, rather than working on the other necessary tasks of the experiment, from running the detector to processing the data to designing upgrades of the experiment. And it’s hard to imagine that, say, once 1000 people have examined a result carefully, another thousand would uncover a problem. That being said, everyone needs to understand that even if they decline to take part in the review of a particular paper, they are still responsible for it, in accordance with generally accepted guidelines for scientific authorship.
Instead, the review of each measurement or set of measurements destined for publication in a single paper is delegated by the collaboration to a smaller group of people. Different collaborations have different ways of forming these review committees: some create a new committee for a particular paper that dissolves when that paper is published, while others have standing panels that review multiple analyses within a certain topic area. These committees usually include several people with expertise in that particular area of particle physics or data analysis techniques, but also one or two who serve as interested outsiders who might look at the work in a different way and come up with new questions about it. The reviewers tend to be more senior physicists, but some collaborations have allowed graduate students to be reviewers too. (One good way to learn how to analyze data is to carefully study how other people are doing it!)
The scientists who are performing a particular measurement with the data are typically also responsible for producing a draft of the scientific paper that will be submitted to the journal. The review committee is then responsible for making sure that the paper accurately describes the work and will be understandable to physicists who are not experts on this particular topic. There can also be a fair amount of work at this stage to shape the message of the paper; measurements produce results in the form of numerical values of physical quantities, but scientific papers have to tell stories about the values and how they are measured, and expressing the meaning of a measurement in words can be a challenge.
Once the review committee members think that a paper is of sufficient quality to be submitted to a journal, it is circulated to the entire collaboration for comment. Many collaborations insert a “style review” step at this stage, in which a physicist who has a lot of experience in the matter checks that the paper conforms to the collaboration’s style guidelines. This ensures some level of uniformity in terminology across all of the collaboration’s papers, and it is also a good chance to check that the figures and tables are working as intended.
The circulation of a paper draft to the collaboration is a formal process that has potential scaling issues, given how many people might submit comments and suggestions. In relatively small collaborations such as those at the Tevatron (my Tevatron-era colleagues will find the use of the word “small” here ironic!), it was easy enough to take the comments by email, but the LHC collaborations have a more structured system for collecting and archiving comments. Collaborators are usually given about two weeks to read the draft paper and make comments. How many people send feedback can vary greatly with each paper; hotter topics might attract more attention. Some conscientious collaborators do in fact read every paper draft (as far as I can tell). To encourage participation, some collaborations make explicit requests to a randomly chosen set of institutes to scrutinize the paper, while some institutes have their own traditions of paper review. Comments on all aspects of the paper are typically welcome, from questions about the physics or the validity of the analysis techniques, to suggestions on the organization of the paper and descriptions of data analysis, to matters like the placement of commas.
In any case, given the number of people who read the paper, the length of the comments can often exceed the length of the paper itself. The scientists who wrote the paper draft then have to address all of the comments. Some comments lead to changes in the paper to explain things better, or to additional cross-checks of the analysis to address a point that was raised. Many textual suggestions are implemented, while others are turned down with an explanation of why they are unnecessary or would harm the paper. The analysis review committee then verifies that all significant comments have been properly considered, and checks that the resulting revised paper draft is in good shape for submission.
Different collaborations have different final steps before the paper is actually submitted to a journal. Some have certain leaders of the collaboration, such as the spokespersons and/or physics coordinators, read the draft and make a final set of recommendations that are to be implemented before submission. Others have “publication committees” that organize public final readings of a paper that can lead to changes. At this stage the authors of the original draft very much hope that things go smoothly and that paper submission will be imminent.
And this whole process comes before the scientific tradition of independent, blind peer review! Journals have their own procedures for appointing referees who read the paper and give the journal editors advice on whether a paper should be published, and what changes or checks they might require before recommending publication. The interaction with the journal and its referees can also take quite some time, but almost always it ends with a positive result. The paper has gone through so many levels of scrutiny already that the output is really a high-quality scientific product that describes reproducible results, and that will ultimately stand the test of time.
A paper that describes a measurement in particle physics is the last step of a long journey, from the conception of the experiment, through the design and subsequent construction of the apparatus, its operation over the course of years to collect the data sample, and the processing of the data, to the subsequent analysis that leads to numerical values of physical quantities and their associated uncertainties. The actual writing of the papers, together with the process of validating them and bringing 2130 physicists to agree that the paper has told the right story about the whole journey, is an important step in the creation of scientific knowledge.
