A tense exchange

Time for another lazy excerpt from private correspondence! This time we visit that most viscerally thrilling and scientifically crucial of subjects: what tense(s) to use in your scientific paper. Daring, I know! But surprisingly controversial, and I'm motivated to write it after reading and reviewing umpteen notes, drafts, and published papers in which the tenses seem (to me) perverse. In particular I think there's a need to write such a thing after being told by one physicist "I think there's a convention in science writing that we always use present tense". Piffle!

I think this sort of view, which is not uncommon, is an interesting phenomenon -- junior physicists read badly written documents, then after initial recoil they rationalise the bad style as "how it's done in science". Later they internalise the weirdness and come to view it, to some degree, as a badge of honour; and finally they become irrational defenders of the spontaneously arisen faith. The same sort of thing can happen for oddball journal style rules like using "Section" if it's the first word in a sentence but "Sect." otherwise -- such rules are different for each journal, of course, and if you ask me it's the journal staff's responsibility to enact their own crazy rules if they think that's a good way to justify their continued existence. Bah, humbug. Some people seem to consider these arbitrary rules -- many of which are hangovers from the utterly irrelevant game of printing and binding scientific articles and stacking them on the physical shelves of an actual library -- as important as the science being described. ATLAS has a wonderfully huge collection of style rules which I suspect do slightly improve the quality of our output, but in inverse proportion to their level of detail. Freshest in my mind are the many months of collaboration review of the new ATLAS underlying event (UE) measurement paper, where I was the main editor: the internal review process absolutely improved this paper, but I promise that it wasn't the endless text iterations that made the difference. Naturally I anticipate with bated breath the journal reviewers' expert insights on how many commas there should be, and where they should go!

I recall a course taught in Cambridge many moons ago on experimental methods. It was rather a mixed bag: some half-arsed stats here, something about op amp circuits (really!) there, and finally something about how to write a paper. After spouting the usual rigid stuff, the lecturer finally gave up. "Just read Jane Austen until you know how to write", he implored. I take issue with the Austen, but the sentiment seems right: find authors whose style you admire, be they scientific or secular, and get forging. I don't advise channelling Hunter S Thompson into your next HEP paper, but if I ever find such a paper I will certainly read it with more enthusiasm than the many stiff, colourless, unexceptional manuscripts that our collaboration issues. Such is the bleaching effect of large-scale text review by hundreds of borderline autistics desperate to pick a hole, any hole, in your document.

My own view of appropriate balance is that good scientific writing should be clear and unambiguous to any reasonable reader, but you shouldn't notice the writing at all unless a) it's unusually elegant and refreshing, or b) you snigger at the occasional little joke or flourish. In the end, your work does need to be read by humans, not robots. I'm rather pleased with the UE paper, but not as pleased as I was 18 months ago when it still had a bit of individual character. Such is life in a collaboration.

OK, back to the "always write in present tense" thing -- in principle at least, it is the subject of this post. This "rule" just leads to bizarreness; you may have got used to reading science papers that say "blah blah are measured" in the abstract, but it's not really a natural language usage. Your English teacher would tell you off, saying "no, they have been measured". Obviously the measurement is not taking place as the paper is being read. You might get away with "are presented", "are compared", "are used", etc. but the actual measurement, simulation, collection etc. is definitively in the past. One of the reasons for the verbal contortions is the strict application of "passive voice" and not being allowed (at least in ATLAS collaboration papers) to say "we", as in "we present a measurement of...". This would be far more natural, and the passive form strikes me as rather spooky, as if the work of ATLAS is being reported by a spectre who only observed but did not participate. Which, as PhD students will tell you when appropriately bribed, is an apt description of their professor's involvement...

This dance of past and present tense isn't as simple as choosing one or the other, as the above example demonstrates. Certain operations that actually involved running the beam, writing events to disk, performing numerical analyses on them, etc. are actions located in the past. But the presentation, in the form of the words and pictures of the paper, is current. The comparison of one plot to another is being made by the reader in the present. And if we did our job well, the scientific conclusions drawn from the made-in-the-past data plots are themselves timeless, everlasting, and deserving of the present tense. So the tense in a paper will, if appropriately applied, bounce back and forth a bit within the various sections depending on what you are describing. Which is hardly surprising; we do it all the time in natural language, but as I mentioned before, we have often internalised some odd rules which we would only ever consider using in scientific writing, and we would typically do well to unlearn them.

In our paper we used past tense for aspects of our analysis (or of previous work) which were done in the past relative to the reading of the document, and the present for the timeless things or those which refer to the way in which the reader is studying the document. But the summary section then requires a further nuance: up to that point I had used the past tense only for things which happened weeks or months ago as part of the analysis, but by the time the reader is at the conclusions, their personal studying of the presented plots (which were present tense as they were first seen and described) is in the past. The recent past, but the past nevertheless. And I think that small-scale reference time is the one that feels most natural in concluding remarks; as they read the summary they are not currently looking at the plots etc. but rather remembering them from the previous pages. I find it extremely perverse to read a conclusion section which starts with "An analysis is presented..." -- what, another one?! At that point you are clearly reviewing -- the analysis has been presented. For what it's worth, "has been" feels so much more right than "was", but I'm damned if I can tell you why.

Have I convinced you? Probably not -- that's the thing about good [bikeshedding](http://en.wikipedia.org/wiki/Parkinson's_law_of_triviality) opportunities, they are a gift that just keeps giving because everyone has something to offer. But my version is right, so there :-P

It just works... or does it? The dark side of Macs in HEP

If you attend a particle physics meeting these days (and most of us do, several times a day... this is not a good thing) it looks rather different to how it did 10+ years ago. Not that everyone paid attention then either, but the type of laptop being stared at instead of the speaker has shifted, from the olden-times array of various clunky black boxes to the situation now where 2/3 of the room seem to be wielding shiny silver MacBooks.

It seems like a no-brainer: Windows is pretty much 100% dysfunctional for computing-heavy science (unless you are either in a fully management role and never touch data, or for some reason love doing all your work through a virtual machine), but Linux is unfamiliar territory for most starting PhD students. Sure, it's a lot more user friendly than it used to be, with more helpful GUI ways to manage the system and the wifi even works out of the box most of the time. But Macs are perfect: beautifully designed, friendly, but with Unix underneath ... and they only cost an extra 50%! Ideal for HEP users who need Unix computing but want it to just work out of the box... and who doesn't? As the Apple advertising used to say, "It just works". But does it?

From my perspective as an author and contact point for an awful lot of particle physics software packages, I can put my hand on my heart and swear that about 90% of user problems are from people with Macs. And even if we adjust for the base rate effect that there are an awful lot of Macs out there, it's still the case that a majority of issues relate to the fact that it's a Mac that they are using. This trend toward everyone having a Mac naturally brings an assumption that, just like the mail client and web browser and office suite, HEP software should "just work" on these machines like it does on the officially supported Linux platforms. But that's often not true, for various reasons, and the demand that everything also work on a rather dysfunctional platform which is not part of the LHC's data processing plan puts rather a heavy load on us developers and maintainers. If you want to be able to run HEP code on your Mac, rather than just use it to log into Linux servers, then Macs are not the "it just works" path of least resistance that you might expect. I've called Macs "dysfunctional" there: no doubt legions of fans are already grinding their teeth and preparing to send me hate mail. Well, first off I'd like to come clean and admit that I used to be a Mac user... and that I liked it a lot. My MacBook was sleek, light, cool to the touch, bonged endearingly and played Epel when I first switched it on, the screen was vibrant and sharp, the desktop apps were all superbly integrated with the hardware: wonderful! And then I needed to do some work.

This was back in 2002 or so, and Macs didn't have multiple desktops: I had to install a hack. There was also no way to override some keybindings, e.g. turning my Caps-lock into a backspace: another hack. The terminal application was junk... a toy to tick the "we're Unix and good for power users" box, à la the Windows command prompt. And compilers... I could get an old GCC through Xcode, but no Fortran compiler: I installed Fink and had to work out how to pick up consistent build tools from /sw. And then the X-based apps from Fink didn't integrate nicely with the OS X display server, didn't know about the existence of that big "Apple" command key, expected a right-hand mouse button, and in short half the programs I needed to use just didn't behave in that beautifully integrated Mac way: I felt like a second-class citizen on my own machine. And since half these hacks weren't package-managed, I needed to do all that by hand, too. Much though I loved my sleek silver machine, after 6 months in my first postdoc job, where there was less time for this sort of frippery, I pulled the ripcord and bought a Lenovo. They have their own issues and my next jump (several Thinkpads later) may be to Dell, but on Linuxes like my current Xubuntu installation, a consistent and self-updating set of developer tools is at my fingertips via a quick sudo aptitude search|install command. By comparison, when it came to development tools and command-line operation, my Mac did not just work... in fact it was much harder.

Things have moved on quite a bit in Mac-land since then: there are multiple desktops built in, the terminal is better and apparently X-server integration now works, at least to some extent. But it still does not "just work" when it comes to developer tools, and accordingly everyone needs to patch their system if they are to get any Real Work done. In Xcode itself, Apple long ago stopped updating GCC, leaving it at version 4.2 while they migrated to clang, and hence forcing many HEP users to install their own copies via Fink, MacPorts, Homebrew, or manually. And if you needed a Fortran compiler, as you probably did, you definitely needed to get that from somewhere else. (I once had a wonderful exchange with a Mac user who said that LHAPDF was broken and wouldn't compile... ten emails later it turned out that they didn't have a Fortran compiler and couldn't read the "confusing" error message that essentially said "compiler not found". Ten mails after that, it turned out that they had found one online... and dragged it to their desktop as they would a Mac application. I'm not sure we ever managed to solve all their computing woes.)

This array of ways to install the extra tools that particle physicists need (with no support from Apple in doing so), combined with the broken and inconsistent state of Apple's own compiler suite, means that there are a huge number of HEP Macs out there which are essentially borked. They have multiple compilers, in multiple locations on their system, which generate incompatible binary code objects and libraries. They have set different environment variables to try and random-walk their way around the generally screwed-up states of their machines. And every one is different, and most of them (as far as I can tell) still expect that our code will "just work". It doesn't and it can't.

To be clear here: I have no issue with bug reports about genuine bugs in our software; those are always welcome even though we wish we hadn't messed up in the first place! And I get annoyed but can't blame our users for the fact that Apple uses a version of sed that behaves differently from the Linux one, that it uses DYLD_LIBRARY_PATH rather than LD_LIBRARY_PATH, and other such glorious inconsistencies. No, the big problem is that because of their shiny computers' shortcomings as development (or Real Work ;-) ) machines, people have applied hacks upon hacks, from all sorts of different sources, to try and make their computers functional again, without really knowing what's going on. Given this, I'm not sure why I feel like the bad guy when (after trying to help -- I really do) I also gently suggest that maybe it's their responsibility as computer users to make sure that their system is, y'know, functional.

Often I've had responses implying that I'm some sort of tech elitist for this -- and sure, I'm pretty good with computers and "the Unix way", I damn well should be after all these years -- and that this is unreasonable, that everything should just work, that it's their job to do science not computing, etc. ... but is it really? I certainly wasn't born knowing how to script in bash or run a compiler, I worked it out. I also didn't know how to install or administrate a Linux machine: I worked it out (at a time when that was a lot harder than it is now, both because distros were less user friendly and because Q&A resources like StackExchange didn't exist). So I do expect that others, especially particle physicists who are pretty much synonymous with "mad smarts" in the public perception and who need to use this stuff every day whether or not they love computers, can work it out too.

(Hint, if you dislike computers or programming, then experimental particle physics in particular is not going to be a good fit for you. And, worse, you're going to do many key things badly for a lack of interest in the hard-earned craft of doing them well. As both an inveterate cynic and someone who cares that the science of particle physics is done well, I find this disturbing. We undoubtedly already have a malign "standard path" treadmill which involves getting out of the dirty, hard, hands-on, technical business of making and doing cool things with data and code as soon as possible, and into the higher-valued ranks of the "coordinators" and "convenors" which for some reason are sought by departments appointing new tenured academic staff. Encouraging that trend is not going to end well when there is no-one left to actually do anything anymore... a situation which feels painfully close already!)

To their credit, despite software developers not being a big target market for Apple -- at least not compared to graphic designers and hipsters -- in their 10.9 Mavericks release they have finally sorted out their system compilers. Still no Fortran as far as I know, and there is unlikely to ever be one, but if you can live with just C and C++ then a suite of LLVM-based compatible compilers centred on clang(++) works perfectly. If you buy into that, get rid of all the gcc and g++ that you can get your hands on, and maybe export CC=clang; CXX=clang++ variables in your environment, then all should be well. Except one thing: demonstrating how much attention they pay to this area of their activity, Apple managed to issue a clang upgrade which failed to upgrade Python and Ruby at the same time. The significance of this is that Python in particular is an important interpreted language which can build and use compiled "extension modules" that interface to C++ libraries: we do exactly this for HEP projects like LHAPDF, Rivet, and YODA. And that in this clang(++) update, Apple removed support for several command line flags which were still built into how Python and Ruby call clang: cue several users with mysterious bug reports that looked like their systems were borked. Which is just business as usual, but this time it was Apple that did the borking. For the record, until Apple issue another update you should add -Qunused-arguments to your CFLAGS and CPPFLAGS (and maybe CXXFLAGS) variables. I don't know how long that update will take, but I suspect this is not a big blip on Apple's radar.
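For concreteness, here's roughly what that workaround looks like if you'd rather bake it into the top of a setup.py than remember to export things in every shell -- purely an illustrative sketch of the environment fiddling, not an official fix from any of our packages:

```python
import os
import sys

# Sketch of the workaround: make sure -Qunused-arguments is present in the
# flag variables that the build tools pick up from the environment, so that
# clang stops rejecting the stale flags Apple's own Python still passes to it.
if sys.platform == "darwin":
    for var in ("CFLAGS", "CPPFLAGS", "CXXFLAGS"):
        flags = os.environ.get(var, "")
        if "-Qunused-arguments" not in flags:
            os.environ[var] = (flags + " -Qunused-arguments").strip()
```

The shell equivalent is just export CFLAGS="$CFLAGS -Qunused-arguments" (and friends) before you build.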

Finally, as another indicator of just how much Apple aren't into supporting code development, the Rivet team noticed recently that OS X does not have an explicitly named python2 alias, as all other systems do these days. For about 5 years, having such an alias has been recommended as a major part of the strategy to allow staged upgrading from Python 2.x to 3.x. Linux installations like Arch have already moved their main python executable to Python 3, so it's important that we be able to ensure that our Python 2.x scripts get run with the correct interpreter until we can update our code. But Macs are once again the sticking point.
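The least-bad defence I know of -- and this is just a sketch of the obvious idea, not what Rivet's scripts actually do -- is to have the script check which interpreter it has landed in and bail out with a readable message, assuming it parses far enough under Python 3 to reach the check:

```python
import sys

# Bail out early with a clear message if we're not running under Python 2,
# rather than failing later with a baffling traceback.
if sys.version_info[0] != 2:
    sys.exit("This script needs a Python 2.x interpreter: "
             "try running it explicitly with 'python2' or 'python2.7'.")
```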

There is no way back from the Apple invasion, and I truly sympathise with the users who like their sleek and functional Mac machines and can't understand why they're getting these confusing error messages when they try to build standard, stable bits of HEP software. But they have wasted time for both users and package developers (and "user" in HEP really means "personal developer" -- everyone codes), as well as being part of a more subtle and more destructive message that everything should just work and if it doesn't then it's the job of one of the "techie guys" to sort it out for you. Pretty they may be, but the most generous I can be is to say that the jury is still out on whether they have brought a net benefit to the particle physics world.

On unfolding

A while ago I was included in a discussion between an ATLAS experimentalist who had been told that some "unfolding" was needed in their analysis, and a theorist who had previously been a strong advocate of avoiding dangerous "unfoldings" of data. So it seemed that there was a conflict between the experimentalist position on what constitutes good data processing and the view from the theory/MC generator community (or at least the portion of it who care about getting the details correct). In fact the real issue was just one of nomenclature: the u-word having been used to represent both a good and a bad thing. So here are my two cents on this issue, since they seemed to help in that case. First, what the experimentalist was referring to as "unfolding" was almost certainly the "ok" kind: unfolding to hadrons, photons and leptons with mean lifetimes of at least ctau0 = 10 mm.

This is a one-size-fits-all approach to what we regard as a "primary" particle in the experiments in that it gives a rough measure of the sort of distance that a particle of a particular species might travel, in the form of the distance that light travels in the mean decay time of that species. It's not a great measure: individual particles do not all decay with the mean proper lifetime of their species, and Lorentz time dilation / length contraction means that actually highly boosted particles with speeds close to c will fly much further than the light speed rule of thumb suggests (cf. cosmic ray muons). But it's a well-defined rule and the only semi-short-lived particles whose decays must accordingly be treated by a detector simulation are KS0 and Lambda. (In practice the exact lifetime cut number may differ depending on whether the factor of c is taken as a round $3 \times 10^8$ m/s or a more accurate number in configuring the generator... but as long as the same discrete particle species are classed as stable/unstable this detail doesn't matter.)
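To make the rule concrete, here's a toy illustration -- rounded PDG-ish lifetimes, not anything lifted from a real generator configuration -- of how the ctau0 >= 10 mm cut divides up a few familiar species:

```python
# Toy sketch of the ctau0 >= 10 mm "primary particle" rule, with a few
# illustrative mean proper decay lengths in mm (values rounded).
CTAU0_MM = {
    "pi+":    7805.0,    # charged pion: unstable, but way beyond the cut
    "K+":     3711.0,    # charged kaon: likewise treated as stable
    "KS0":    26.8,      # K-short: just above the cut, left for detector sim
    "Lambda": 78.9,      # Lambda baryon: ditto
    "pi0":    2.5e-5,    # neutral pion: decayed by the generator
    "B+":     0.49,      # B meson: decayed by the generator
}

def is_primary(species, cut_mm=10.0):
    """True if the species counts as 'stable', i.e. left undecayed by the generator."""
    return CTAU0_MM[species] >= cut_mm

for species in CTAU0_MM:
    print(species, "stable" if is_primary(species) else "decayed in generator")
```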

So this kind of unfolding is a good and necessary thing in that it is intended purely to remove residual detector and reconstruction algorithm effects which have not been included in calibration procedures. We do it by constructing a mapping from our simulations of reconstructed events back to the "stable truth" record of particles produced by an MC generator using this sort of mean lifetime cut, and end up in most cases being fairly insensitive to the details of whether or not the hadron production and decay modelling was super-accurate. And we absolutely have a duty to do this as experimentalists, because we are the only people who can be expected to understand our own detector details and to possess the detailed detector simulation needed to construct that mapping.

This kind of unfolding is distinct from extrapolations such as "correction" of measurements within the detector "acceptance", i.e. the angular coverage of active detector elements and their limited ability to detect particles with little kinetic energy, to include the particles that weren't seen. The spectra of very low-energy particles are actually very badly known, so if you correct your data to include some MC model's guess at what's going on, then you have degraded your data to some extent, and in really bad cases you have "measured" Pythia or Herwig. Another class of very bad unfolding is to "correct" data to parton level, i.e. attempt to remove the effects of hadronization, multiple interactions within the proton, and other things which we can never see. This sort of correction was popular in Run 1 of the Tevatron, and there is a reason that barely anyone uses that data now... it's corrupted to the point of uselessness by model-dependence. No-one with a current model, much more sophisticated than was available back in the early/mid '90s, wants to try and re-include the behaviour of some obsolete calculation to test their fancy new one, so you might wonder what the point was of measuring it at all. Well, I'm sure some tenured positions resulted from those papers, but scientifically it's of only passing interest: we need to defend against making our own data useless, and the best way to do that is to make minimal corrections... "say what you see" as Roy Walker used to drone on Catchphrase.

So dodgy acceptance extrapolations and unphysical parton-level corrections are definitely in that "bad" set of unfolding targets. Electroweak bosons (W and Z) are an awkward case, since lots of people like to "correct back" to those as well. We're having a lot of discussion in ATLAS about things like this and I'm glad to say that the trend is toward using "lepton dressing" definitions which do not require use of explicit W and Z particles from MC event records, and hence to publish measurements of e.g. "dilepton pT" and other distributions which are strongly correlated with the parton-level calculation but have the benefit of being forever well-defined based on what we could actually see. In fact this approach is what Rivet has done since the start and hence I am a) compelled to regard it as a good thing, and b) going to give myself some credit for pushing this approach over the years!
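For the avoidance of doubt about what "dressing" means in practice, here's a bare-bones sketch of the idea: add back the four-momenta of final-state photons within a small cone (commonly delta-R of about 0.1) around each bare lepton. The particle objects and attribute names here are invented for illustration, not Rivet's actual API:

```python
import math

def delta_r(a, b):
    """Eta-phi separation of two particle-like objects with .eta and .phi attributes."""
    deta = a.eta - b.eta
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return math.hypot(deta, dphi)

def dress_lepton(lepton, photons, cone=0.1):
    """Return the dressed four-momentum (E, px, py, pz): the bare lepton plus
    any photons found within delta-R < cone of it."""
    E, px, py, pz = lepton.E, lepton.px, lepton.py, lepton.pz
    for ph in photons:
        if delta_r(lepton, ph) < cone:
            E += ph.E
            px += ph.px
            py += ph.py
            pz += ph.pz
    return (E, px, py, pz)
```

The "dilepton pT" is then just the transverse part of the sum of two such dressed momenta -- defined entirely from observable particles, with no event-record W or Z in sight.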

It seems likely that ATLAS (and our competitors) will keep on doing extrapolations and Born-level Z comparisons in precision electroweak measurements for some time yet, but always as an interpretation step in addition to the "fiducial" measurement of particles we could resolve. This is fine, as long as we assess a reasonable extrapolation uncertainty, since comparison of total cross-sections for different processes (with different acceptances) is an interesting physics thing to do, and development of MC models isn't the sole purpose of the LHC. My opinion is that this "Born level" correction will also go away at some point, because the whole concept of a well-defined propagator momentum breaks down the moment that effects of EW loop corrections become substantial compared to experimental resolution, as they do for high-pT electroweak events that we will start to probe in earnest in LHC Run 2. And the relevant theory tools are improving to include these effects: at some point there will be no excuse to say that your PDF fitting or whatever is based on simulations where EW effects aren't included, because those codes will be obsolete. I heard one of the most prominent theorists working on MC simulations of such NLO EW corrections recently say in an ATLAS Standard Model group meeting "don't correct to Born to compare to our state-of-the-art predictions", which is rather a reversal of the usual situation where "corrections" to Born MC are made "because it's what the theorists want"!

There are also many cases where it just doesn't matter what Z definition you use, and there I think the argument depends on what you need to compare to: if you are looking at Z+6 jets observables then maybe there is a case for unfolding in a way which can be compared to fixed-order partonic calculations (although I think the question "why aren't you using Sherpa or MadGraph/aMC@NLO rather than MCFM?" is nowadays the appropriate response). But if you are looking at e.g. underlying event observables in leptonic Z events then publish the results using the dressed dilepton observable -- the sorts of models which are missing the detailed QED FSR that might make a difference in that case will certainly not have any modelling of the soft QCD crap that the observables are actually studying! (And in that case the dependence of those quantities on the detailed Z/dilepton kinematics is anyway extremely weak.) The Higgs analyses remain one of the few places where comparison (or reweighting) to analytic resummation calculations remains mainstream, and I hope in the next 5 years those techniques will be embedded into mainstream MC generators so we can do precision Higgs measurements as well as searches, with good control over modelling uncertainties... otherwise physically sound unfolding will not be possible. The key thing is that even if such corrections are applied for comparison to today's limited theory tools, what we actually measured should also always be preserved for posterity, with minimal corruption and model-dependence, so that our data remains valid forever and useful for as long as possible.

Now let's just address the fact that this isn't perfect. There is some model dependence in detector unfolding for sure, because we rely on models of both the fundamental proton interaction physics and the detector to build our unfolding system. This is why we use procedures like iterative Bayesian unfolding to try and reduce the prior dependence introduced by our simulation details, we use multiple input models, etc. So, provided no extrapolation is involved, the effects of model dependence on unfolding should be small. But if you extrapolate, there is no data component in Bayes' theorem and the result will be 100% model dependent. Unfolding, as with any data processing, always introduces some increased uncertainty, and this should be explicitly estimated by studying e.g. the number of iterations used, using several MC models, etc. ... and the uncertainty bands need to inflate accordingly. The argument is that it's nearly always worth this small increase in uncertainty to make the data better represent the pure physics of what is going on, rather than mix it up with a bunch of uninteresting (to theorists) stuff like non-linear response of the ATLAS calorimeter.
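Since the u-word can sound like black magic, here's a stripped-down sketch of the iterative Bayesian (D'Agostini-style) recipe mentioned above -- no uncertainties, no overflow handling, just the bare iteration, so you can see where the prior enters and why stopping after a few iterations acts as a regularisation:

```python
import numpy as np

def bayes_unfold(response, measured, prior, n_iter=4):
    """Toy iterative Bayesian unfolding.
    response[j, i] = P(reconstructed in bin j | true in bin i)
    measured[j]    = observed counts in reco bin j
    prior[i]       = starting guess for the truth spectrum (e.g. an MC model)
    Assumes no empty reco bins and no extrapolation outside the fiducial region."""
    truth = prior.astype(float).copy()
    eff = response.sum(axis=0)                   # efficiency per truth bin
    for _ in range(n_iter):
        folded = response @ truth                # reco spectrum expected from current truth
        # Bayes' theorem: P(true bin i | reco bin j), given the current truth estimate
        posterior = response * truth[np.newaxis, :] / folded[:, np.newaxis]
        truth = (posterior * measured[:, np.newaxis]).sum(axis=0) / eff
    return truth
```

The prior is where the MC model sneaks in: run only a few iterations and you lean heavily on it, run many and statistical fluctuations in the data get amplified instead -- which is exactly why varying the iteration count and the input model belongs in the uncertainty estimate.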

In short, we need to know what we're doing... but that's the fun bit. Unfolding is a good thing, when we do it in a controlled way and never correct back to seductive-looking event record contents like "the true Z" or "the true quark/gluon jet". When we do the latter, we unintentionally cripple our own data and implicitly waste a lot of public money. None of us want to do that, and the good news is that the LHC experiments are aware of the danger and are actively moving toward more robust truth definitions as we enter a new precision era where this stuff really matters.

Top mass measurements and MC definitions -- an inexpert precis

I was just recently notified that the world top mass combination uses "my" MCnet review paper on MC generators to justify stating that the definition of the top quark mass used in all (!) event generators is equivalent to the "pole" mass.

I've heard that statement very often, but not backed up by anything more concrete, so I was interested to read this section of the paper (Appendix C, starting on p184 of the PDF), which turns out to be rather good, interesting, and elegantly presented. Not to mention slightly embarrassing that I hadn't read it before, given that it has my name on the front! (In my defence, I did write some of this paper, just not that bit. I suspect most of the authors haven't read everything in it.)

Anyway, it definitely does not say that MC mass equals pole mass, so I thought it might be interesting to post my explanation of what it does say, at least as far as a dumb fence-sitting experimentalist/MC guy like myself can understand...

The argument is that the pole mass is defined in a scheme of asymptotically long-distance physics, i.e. an isolated particle with only self-interactions. However, that involves integrating terms with alpha_s at long distances, including the non-perturbative region where it diverges: this introduces the "renormalon ambiguity" of order Lambda_QCD in any pole-scheme quantity.

By comparison it can be shown that the renormalon ambiguity disappears (is cancelled by the same term in the inter-quark potential) in short-distance schemes like MSbar. This use of the inter-quark potential is why the statement is made that the ttbar cross-section is connected unambiguously to the MSbar mass. I think that once more exclusive properties of a single top are looked at, this cancellation is no longer guaranteed and hence a discussion about the meaning of the MC top mass is needed... but that is rather an extrapolation on my part.

The final step is to argue that due to generator treatments of shower cutoffs, the top width, etc. the "MC scheme" is somewhere in-between the short-distance MSbar-type schemes and the long-distance pole scheme. This is roughly quantified by using the equation given (with unknown, assumed order-unity coefficients) for perturbatively relating short-distance schemes to the pole one (i.e. adding contributions including the renormalon term), and placing the "MC scheme" at a scale of about 1 GeV where the shower cutoff lives (i.e. not seeing much of the renormalon region). This gives the estimate that the true pole mass is of order 1 GeV higher than the mass which would be obtained by template matching to MC generator parameters -- the latter being what has so far been done experimentally. The dm = 1 GeV figure is an order-of-magnitude estimate, and would depend in detail on the way that the generator behaves -- including to some extent the shower, hadronization and other soft physics.
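Schematically -- and this is my paraphrase of the appendix, with the coefficients deliberately left as unknown order-unity numbers, so don't lift it for anything quantitative -- the relation being invoked is something like $m_\mathrm{pole} \simeq m_\mathrm{MC}(R) + c_1\,\alpha_s(R)\,R + \mathcal{O}(\alpha_s^2)$, with the "MC scheme" sitting at a scale $R \sim 1~\mathrm{GeV}$ where the shower cutoff lives. Plugging in $\alpha_s(1~\mathrm{GeV}) \approx 0.5$ and a $c_1$ of order one is what gives the order-1-GeV ballpark for the offset.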

So any statement that this paper claims m_MC = m_pole is unjustified. The true claim is that the generator mass is neither a pole mass nor e.g. an MSbar mass, but something in-between whose exact relation to either is unknown. And as the first and last paragraphs say, "this application warrants a deeper investigation of precisely how the top quark mass is defined... Since the current experimental uncertainty is 1.1 GeV, clarifying this relation clearly demands more attention." Indeed.

Science TV is too nice

Well well well, another blog post, eh? So soon: it's only been... erm, two years. Oh. Well I never promised to be prolific. This one's come about because I grumbled briefly on Twitter about the nature of (British) pop-science TV and immediately hit the restrictions of that medium. Twitter is a wonderful way to share neat things that you find online, and to make pithy soundbites & jokes (and for describing what you're eating, the form of public transport that you happen to be on, listing film names with comic vegetable name substitutions...), but for exploring a non-trivial issue 140 characters is, to put it mildly, a limitation. I have difficulty fitting one of my normal sentences into 140 characters. So of course I came across as a whining idiot, prompting the reply "Yeah, we really should do more about how shit it all is. You don't see that tone ANYWHERE" from Dara O'Briain. Well, not remotely what I meant, but who can blame him? So here's an attempt at a more coherent and nuanced version that hopefully doesn't make me come across as an anti-science, axe-grinding git. But perhaps as a slightly grumpy science nerd, which is fair enough.

There's been a welcome rise in the amount and profile of science programming on TV in recent years, a good whack of which is due to the influence of Brian Cox's three "Wonders" series, the LHC start-up & Higgs boson excitement, ... and who knows, maybe it's also due to The Big Bang Theory. Discounting the mostly-gawp-fest wildlife docs, there's been Dara's own Science Club, Astronomy Live, Bang Goes The Theory, at least a couple of series fronted by my old supervision partner Helen Czerski... and the long-running Horizon, of course. And that's just the stuff that I've noticed.

So it's great to see more science on telly -- and particularly that it's now portrayed as fascinating and maybe a bit cool (a nerdy cool, but socially acceptable nonetheless) rather than tedious or eggheaded. Bravo. But I can't bear to watch half of it: the happy-clappy presenting style and typical lack of depth drive me bananas. Am I just stuck so far up my ivory tower that I won't be happy until mainstream science TV resembles the OU educational modules that used to screen on BBC2 at 2am? (I taped those while I was at school... but sure, there's a reason they were shown at the witching hour.) I hope I'm not that much of a prat, and since I get as wound up by biology programmes as by physics ones I'm at least not just moaning that TV doesn't know (and say) as much as I do on my specialist subject.

The root problem for me is that so much science TV is, bizarrely, unscientific. By which I mean that the defining feature of doing science is missing: where's the challenge, confrontation, and critical approach? The "hard" questions? Presumably it makes someone happy to assemble a bunch of young researchers and science journalists and fly them round the globe to ask questions to which they already know the answers, for the benefit of the cameras... which is demeaning but sort-of ok: that's how TV works. But having sent informed people you'd expect them to also call their subjects on some of the more bullshitty spoutings. Maybe this does take place, and gets left on the virtual cutting room floor, but the broadcast output tends to be "wow, this science/tech is really big/small/pretty, this is all totally awesome, thanks so much for having us, bye."

It probably is wonderful -- like I said, I've no interest in a "science is really shit" show, and there is plenty of cool stuff to go and film -- but nothing's 100% perfect, and scientists have agendas, too. Not usually evil ones, often not commercial ones, but agendas nonetheless. If you interviewed me, it'd be in my interest to make exciting noises about the LHC, particle physics, and particularly the bits of it that I'm interested in, not because I directly get lots of cash but because it helps to generate a buzz and maybe that will feed down the line to a boost in future funding, either for me specifically or my field in general. Plus the ego boost that someone chose to point a camera at me. So, intoxicated by heady ambitions, I might unwisely say something that I can't defend... and the interviewer should jump on that. They can do it nicely -- the Humphrys-Paxman approach is probably not Plan A -- but don't let me get away with it. Being able to do that succinctly and entertainingly -- and to choose the questions the audience will like rather than the nit-picky ones that science conferences are full of (no bad thing, but not media-friendly) -- is surely the defining characteristic of a good scientist-journalist.

A while back I saw a short film on LIGO (I think as part of Dara's Science Club) which effectively missed the enormous fish-in-a-barrel question: "you haven't seen anything in the last two years; why is this upgrade different?". Not that I don't like LIGO, but you've gotta ask. And just yesterday I switched off a Horizon special because the presenter was too busy cooing to call the interviewee on some suspiciously soundbitey statements about graphene -- the event that led to this very article. I found this particularly irritating: first they showed an arselickhan film on Andre Geim and the graphene discovery, then cut back to the studio where a photogenic member of Geim's own research team was there to enthuse (the great man having presumably been unavailable). Graphene is cool, so no complaint about that as a subject, but it's also dangerously full of superlatives... so when people start talking about supporting the weight of a cat on a 1 m^2 sheet of the stuff* then someone should probably point at the disconnect between the reality and the hype: "what you've shown so far is how to make microscopic flakes via Sellotape: what's this 1 m^2 sheet stuff? How do you think it can really be put into mass production: lots of Sellotape?! How long until those graphene-microelectronics applications?" But, hypnotised by a molecular model and some funky graphics, we moved on. Sigh, and a belch of irritation on Twitter...

I suspect this is what winds up people like Simon Jenkins who periodically pen articles railing against the special position that science holds -- in attempting to make science cool and accessible, the argument and dissent have been excluded. Capitalised "Science" can appear as an unassailable religion of sorts, and a faintly cold, inhuman one at that: no wonder some liberal columnists get the wrong idea. Rational dissent is exactly what science is about: it is explicitly not meant to be put on a pedestal and made intolerant of criticism. The philosophy that anyone with good enough logic or data can overturn the opinion of a grandee is rather special, and I think is a large part of why the process of science holds my interest all day, every day, and then late into the night. It's about coming up with ideas, testing them, building a case, and then going out and giving conflicting ideas the same critical grilling as their originators will be inflicting on yours. Kids and "civilians" watching these programmes may get the feeling that scientists are either renegade geniuses (the programmes' heroes) or smoothly revolving cogs in the glorious science machine. It's messier than that, and hence far more interesting and human. My feeling is that showing that process would probably make better TV, too.

In fact, I can prove it. Remember the Faster Than Light Neutrinos? For the record, pretty much everyone in particle physics, including the authors of the study, immediately said "it's some kind of measurement bug", and lo it came to pass. But in the meantime there was a great public display of how our scientific community approaches a very contentious result with open but sceptical minds. There were immediately umpteen questions: how was the timing done? what about the distance measurement? In the end it turned out to be a fibre-optic cable connection that was to blame, and relativity escaped unscathed. If you search online you might find a Horizon film on the topic, and the star piece for me was Jon Butterworth (a friend, but that's not why I mention it) in the UCL staff common room getting excited about what he regarded as a problematic aspect of how the neutrino pulse timing was measured. I think it's a great bit of footage because he starts scribbling pulse shapes on a paper napkin and getting excited. It's the sort of thing we do among ourselves every day over lunch/coffee/whatever: right there is all the excitement and enthusiasm you could ask for, along with the touch of conflict and scepticism that's central to how science is really done. And it was bloody good to watch, too. More of that please.


[*] One of those publicly accessible science numbers that has obviously been pre-prepared. A friend once produced a parody that compared the weight of ATLAS to the number of dogs on the Isle of Wight, which I think is the right response -- at least scientists should feel a bit hinky when one of these conveniently accessible figures pops up.

PS. For the record, Brian Cox gets a pass from me on much of this criticism for two reasons, despite perhaps being an obvious role model for relentless sci-enthusiasm: 1) when he takes a break from being a silhouette, he occasionally enters a live studio and denounces someone as "a twat" for saying something dumbly unscientific. Grumpiness and good TV? High five. 2) I've really enjoyed the last two Wonders programmes in particular: explaining entropy to the nation on a Sunday night is not a topic entered upon lightly, and while missing some obvious questions I thought that the physicist's take on biology that characterised Wonders of Life was novel. Gotta point out, though, that in Life #1 he tried to make a clever physics reference to energy as the time component of momentum and cocked it up: not that he doesn't know the right answer, of course. But then Neil Armstrong fluffed his big line, too, so I'll forgive you that, Brian.

PPS. Three books on discovery/invention immediately pop to mind in defence of my thesis that conflict maketh narrative without needing to get on a downer about science overall. First, The Soul of a New Machine, Tracy Kidder's Pulitzer-winning account of how a renegade team at Data General got a new minicomputer to market against all the odds. Particle physicists will find much to empathise with. Second is Nobel Dreams, whose accuracy is disputed, but it's essentially a page-turning real-life thriller about the big characters in particle physics and the race to observe the W and Z bosons. And last, just because I read it recently in the 10-minute bedtime gaps before my son went to sleep, is Farmer Buckley's Exploding Trousers -- a compendium of the weird tales behind unsung bits of science and invention. All thoroughly recommended, and none about straightforward tales of scientific plain sailing. I'm sure you can recommend plenty of others to me as well, which is the sort of one-liner that Twitter is good for.

Academic journals aren't helping us to do science anymore

Having recently concluded a long-running saga to get the first ATLAS underlying event study both through the experiment's internal review procedures and then into the perverse format demanded by the academic journal, Physical Review D, to which we submitted it, it seems an apt time to offer a few comments on the state of the sacred academic publishing and peer review process. The tradition -- because it is largely tradition these days -- of academic journals has come to seem more and more perverse with each passing year that I've been in academic research. It is certainly an odd business: scientists spend months or years doing research, which we eventually write up and (modulo reviews, iteration and approval from colleagues whose names will appear on the author list) send to a journal. This final step is considered to be somehow magical, both individually by other scientists in their treatment of publication lists when hiring staff, and collectively by research councils and other funding bodies when reviewing grants (again, the review panels consisting of scientists, although not necessarily from the same field). The emphasis on publication lists seems to indicate a certain blind faith and lack of imagination in groups of usually contrary people when it comes to such revered procedures as peer review, for these days there is often little value added to publications by the arcane procedures and hoop-jumping required to obtain the mysterious approval that comes of putting an oddly-formatted journal reference in your CV.

I should perhaps give a little background about how research dissemination works in particle physics, for the benefit of any readers who aren't familiar with this field. Outsiders to academia probably assume that all scholarly publishing (and, not disconnectedly, career progression) works the same way, but this is most certainly not true: arts and humanities often hire lecturers immediately out of their PhDs and view teaching as a burden to be loaded upon them for several years while they fit in research in their spare time. Conversely, science departments almost universally hire young people in purely research roles before, when they have proven their worth as researchers, ensuring that they never do any again by means of a tempting promotion to a role more similar to that of early stage humanities lecturers -- with extra admin thrown in for good measure. It's a funny system, for sure, but the message to be conveyed here is that at least between arts and sciences there is a world of difference in how careers evolve and hence in how working output is to be judged. So generalisations are difficult and unwise.

Even within the sciences, for those important early years of research-dominated work, the culture can vary enormously. My own direct experience is limited to (mostly experimental) particle physics, which lies at one extreme of the publishing spectrum with huge collaborations (of order several thousand members in the case of the large LHC experiments), democratically and alphabetically represented on the author list. This is clearly a far cry from e.g. small biological or medical collaborations where the order of appearance on the author list is a covert channel by which to convey the role played by that person -- lab grunt who did the work (ugh, how common), their supervisor, provider of funding, etc. Such differences, to my mind, must make it extraordinarily difficult for a mixed panel, as in the case of the cross-disciplinary Royal Society Fellowships, to judge the respectability of publication lists from a variety of publishing cultures. (But then I just failed to make their annual shortlist, so maybe I'm nursing a grudge! I would find it hard, for sure, to meaningfully review a biochemist.) It's not clear how well particle physicists do in this system -- on the one hand they are likely to have large publication lists including papers not just by themselves but by any group within the mega-collaboration; on the other, all their listed papers have author lists long enough that scientists unfamiliar with our publishing culture could reasonably judge them to all reflect little on the individual being considered. The sanest approach to the use of HEP publication lists in hiring is, to my mind, to largely ignore them: at present we have forced upon ourselves a collectively deceitful and unfortunate prisoner's dilemma where even the most honest and self-deprecating researcher with an interest in career progression has to list all the papers that can be tenuously linked to themselves, knowing that everyone else will be doing the same. In this system, people who spend their PhDs and early years working hard on non-running experiments suffer compared to those who do similar work on running ones, simply because the latter start their climb up the academic ladder with a vastly inflated list of superficially compelling research output.

And so to journals -- what do they do for us? In reality, particle physics researchers rarely use journals. The vast majority of real research exchange is via paper "preprints" manually added to the online arXiv system, which dumps a list of newly submitted papers into most of our email inboxes every morning. The "preprint" moniker is intended to imply that these are not "proper" papers, over which magic journal dust has been waved, and accordingly are probably not fully trustworthy. Bollocks! We read, submit and update arXiv PDFs exactly as fully-fledged papers, because we know that's how our science is actually being transmitted to others, as they transmit their findings to us. In the ten years since I began my PhD, I have not once visited a library to check out a dead trees copy of an academic paper in my field, which raises the question of why we pay so handsomely for our shelves of unthumbed manuscripts. When I do look up journal entries online it tends to be as part of one of my exercises in HEP archaeology, to obtain copies of papers written before the arXiv era. Even this is rarely necessary, thanks to some superbly obsessive historical scanning by the Japanese KEK lab's library. But despite the fact that few if any particle physicists ever actually use academic journals to read about the state of research -- remember that these institutions originated as a sort of collective excitation of letters exchanged between natural philosophers in the early days of scientific research -- entries in publication lists with those magic journal reference details carry disproportionate weight in the professional assessment of scientists. As with any unthinking scalar metric applied to judge progress, this leads to gaming the system: chasing a journal publication becomes more important than working with colleagues and driving forward the state of the art; scientists demand that conferences publish their non-peer-reviewed proceedings contributions in collaboration with a journal so that they look more like the mythical point-scoring type of publication in their CVs; and funding bodies accordingly get more formal in their classification of relevant publications -- it's a good old-fashioned arms race. The tail is most definitely wagging the dog.

Getting a paper into a journal is also a time-consuming and awkward process in practice. Take for example the underlying event paper which I mentioned at the start -- I was one of the editors of this paper within ATLAS, which itself required several months of iterating with the physics group and subgroup, then with an internal editorial board, then two stages of whole-collaboration presentation and review (in principle, 3000 people; in practice still probably 20 or 30 active commenters to respond to), and a final round of sign-off by the experiment management. By the end of this process we were both exhausted and pretty sure of the academic credibility of the result. I am proud that in its Standard Model publications ATLAS has not attempted to hastily put out much less useful papers, but stuck properly to the process of doing it well -- an approach which I believe comes from ATLAS' physics publications being driven by grassroots efforts and decisions rather than management diktat. It is unclear to me what an independent peer review could have done to improve this situation. However, getting the paper to that all-important peer reviewer was made painful by the requirements to reformat the paper in a less readable format per the journal's regulations; to work through the text changing the word "Figure" to "Fig." except when it's the first word in a sentence (what?!); to rename all our semantically-named plots as fig1a.eps, etc.; to mash the whole thing into a single TeX file, eliminating our carefully-tweaked BibTeX setup into the bargain; to mangle nicely optimised figures drawn with LaTeX commands into sub-optimal EPS files; and a plethora of equally irrelevant but very time-consuming tasks. And then make it compile on the PRD submission server's antiquated LaTeX installation. Submission to the arXiv is a bit of an art, but nothing like this... this waste of time and effort is what we scientists refer to as a "massive, pointless ball-ache."

But surely the important thing is not the little-used availability of a paper in dead-tree form, but the side-effect of the independent peer review afforded to it. So, what did our peer review reveal? They noted that what we did was quite hard, and suggested that we move a label slightly lower in one of our figures (which we'd had to mash for them). Gee, that was worth it.

So does experiment internal review, especially in analyses which originate in this ground-up way, realistically obviate the need for journals? Bearing that caveat strongly in mind, I am coming to the conclusion that the answer is yes... particularly if the arXiv and other self-publishing preprint systems are augmented by social media style comments, reviews and webs of reference from the "user community". But that caveat about grass-roots origins is there for a reason, again based on a recent experience which I think worth recounting: this is the cautionary tale of the ATLAS heavy ion jet quenching paper. (Note: as far as I'm aware there is no reason not to recount this story, and as David Mitchell pointed out in a recent episode of 10 O'Clock Live, the only sort of body that is "on message" all the time is a cult. But anyway, my apologies to any ATLASsians who consider this some sort of breach of trust or violation of our collective voice.)

At the end of November, a couple of weeks into the LHC lead ion run, ATLAS called several urgent whole-collaboration meetings to discuss an exciting new result: the observation of obviously asymmetric jet events in lead ion collisions, where the officially-unvoiced hypothesis is that one of the two jets is being "caught" in a quark-gluon plasma. It's a neat result, but -- and this is important -- a wholly expected one. The collaboration management worked itself into a frenzy over this, perhaps driven by the obvious disagreement of our data with simulations which, as several of us pointed out at the time, were never meant to contain this effect (and had other problems, too, which I won't go into here). Several days after the first meeting we were all pointed at a short preliminary note and told that there was a week's worth of consultation time for collaboration members to comment on it. Unlike our underlying event paper, which was forged in 6 months of internal and conference notes plus several extra months of review and editorial honing, this one had been cobbled together in a few days and was full of obvious holes -- this is not to criticise the authors: no-one can produce a really complete and polished academic paper on that timescale. So, accordingly, the online comments system filled up with 70+ substantial sets of comments and criticisms... and it would have been more had the deadline not been brought forward by 3 days and another super-urgent meeting called. In this meeting, on a Thursday just before Thanksgiving, we were told that CMS might try to cobble together a similar result and scoop us on Monday when the arXiv re-opened and therefore we should try to submit a paper to the journal by 8pm that night! And so it came to pass: ATLAS hurried out a paper with the vast majority of its own members' criticisms unaddressed, and locked down further comments. Those in the collaboration who hadn't attended this second meeting weren't even informed or consulted on this drastic change of schedule until the paper was already submitted. This is what happens when decisions to fast-track publication are taken by a small number of people, in a frenzy of paranoia. As far as I'm aware, CMS never published that so-threatening paper, there was virtually no media mention of the press releases, and the theory community's response was largely a shrug and a "yeah, so?".
Certainly not worth the subverting of our own standards -- which were applied pretty ruthlessly to the underlying event and other papers -- but as the collaboration management never replied to my email on the subject, I have no idea how on-message I am on that point.

[Update: since first writing this, I found that CMS did indeed publish a measurement of this effect... in fact, it was submitted a few days after I wrote this rant, a mere two months after ATLAS' paranoia drove us to rush out our imperfect paper. From a first look, they have done a more comprehensive job... after all, they had enough time!]

But back to the journals -- did magic peer review save us from ourselves? That would certainly be the conventional logic, and with so many critical (but pleasant and well-meaning) comments from inside the collaboration, surely an independent reviewer would require a lot of corrections? Nope, it was accepted as soon as the journal re-opened for business after the Thanksgiving holiday. With situations like this, and the similar pro forma acceptance of the pointless "Me first!" ALICE paper a week after first data taking in 2009, my faith in the safety net of peer review is not so much dented as written off and fit only for the scrapyard. Driving such flawed decisions is the sad fact that journals -- extraordinarily profitable businesses, due more to their traditional status and connection to funding and career progression than to any service they actually render -- will make publication decisions influenced as much by "market forces" like the potential for a high-profile, impact-factor-boosting publication (and screw the quality) as by boring old scientific merit. It's ironic that a system like peer review, motivated at least partially by the same scientific awareness of personal flaw-blindness and self-deception as motivates double-blinding of experiments, is happily subject to the distorting power of a market for the means of academic dissemination. I don't get asked to review very much, but in the cases where my review has been negative or suggested major changes, the journal response has not been good: I get the impression that for some journals it's all about throughput, and reviewers who get in the way of that by insisting on higher quality are not popular.

Journals are, as noted in a Deutsche Bank review, a unique business in which all the costly aspects of making the product are provided for free or for a nominal fee by outside agents. Scientists do the research, write the paper, review it internally, typeset it (certainly in the case of HEP, where most papers are written by at least one person with a lot of LaTeX experience, the typesetting is likely to be of publication standard at the time of submission), and peer review it. The journal only has to print it and stick it on its website (and perhaps reformat it into a less convenient form, as per some anachronism or other)... and then scientists have to pay -- a lot -- to read it. Except that we don't -- we read the same paper, but faster, earlier (by months), and for free online at the arXiv. The journal publication is completely secondary to the supposed role of disseminating scientific research: it is nowadays merely an anachronistic status symbol of dubious value.

Journals are an extraordinarily profitable business, since they can charge large fees in exchange for virtually no outlay. But this is apparently not enough: they have been [getting more expensive, at an average rate of 7.6% p.a. since 1986](http://www.arl.org/bm~doc/arlstat05.pdf), hence outstripping inflation for decades. The resulting trend that academic libraries spend an ever-increasing proportion of their budgets on journals -- 72% at the last count in 1998, and exceedingly likely to have increased since then -- is known as the ["serials crisis"](http://en.wikipedia.org/wiki/Serials_crisis). For years, academic libraries have been struggling to pay the thousands of pounds that journals charge for providing access to their publications, to the point where they have to be selective about which journals they can afford. The bulk of these costs are unjustifiable profiteering and completely disproportionate to the actual costs involved: not only is all the raw material and reviewing provided gratis, but with the dominant mode of consumption being PDFs on the Web the cost of distribution is also virtually zero. In a move purely to protect their business model, journals refuse to sell online-only access accounts, insisting instead that institutions buy bundles of journals they don't want (in unpopular dead tree form) in order to get online access to one which they do. Accordingly, and shamefully, the very institutions charged with disseminating scientific research are stifling it. This is a case of the tail not only wagging the dog, but strangling it.
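
For a sense of scale (my own back-of-envelope arithmetic, not a figure from the report): a steady 7.6% annual rise compounds to

$$1.076^{20} \approx 4.3,$$

i.e. journal prices more than quadrupling over two decades, far ahead of general price inflation over the same period.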

For some reason, despite in all objective senses holding all the cards, researchers have not responded with a boycott of journals. We should. The reason we don't is fear: research funding bodies are not the most agile of institutions, and their metrics are driven by the assumption that journal publications are a valid Gold Standard for measuring academic output. It is a true Prisoner's Dilemma: it is absolutely in our collective interest to boycott journals, but if we do not do so collectively and with a single voice, we lose individually.

This seems clear-cut to me, but perhaps that is because it gels with my prevailing political views. Privatising and outsourcing can be good ideas in the right places, if contracted with appropriate restrictions to avoid raw profiteering. Academic publication, like coherent public transport, seems in practice to be ill-suited to the market -- especially when taxpayers have already paid for the research itself. There are better ways, such as online collaborative tools, backed by webs of trust: we should be exploiting these -- and more actively than the slothlike Open Access Publication project has managed. Change in such established institutions is unlikely to come top-down from a masterplan between journals and funding bodies: it will come from a disruptive technology such as commenting or trackbacks on arXiv posts. And as the months and years pass, the journals' position will become ever more exposed. I'm looking forward to the future, but I wish it would hurry up and arrive.

Anyway, when hiring a new researcher, make sure to look well beyond the publication list: it may not mean that much, and the bits that are meaningful may not be obvious. Ask around: are they known, respected, influential? Have they got more ideas and potential to develop, or are they simply a worthy research drone who was in the right collaboration at the right time? And if you are a researcher: recognise that the crucial thing is getting your research out there and used, not that it's in a journal. We're in a field of endeavour that (should) reward independent thinking and true achievement over adherence to outdated social conventions... let's live up to that reputation.

It's a bug, not a feature, Rene

I've had my attention drawn to this reply to a ROOT bug report, which I think highlights a serious problem with how the ROOT project interacts with the LHC experiments. In short, someone contacted the ROOTtalk mailing list to inform them that ROOT's calculation of weighted means is incorrect if there are negative weights involved. There is a well-defined procedure for calculating weighted means, and in fact it's dead simple. There's no reason to not get it right. This is a bug report, about a significant numerical error in just about the simplest statistical quantity that anyone might want to calculate: any statistical analysis tool worth bothering with would provide a bug fix as fast as possible. So how did ROOT respond?
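
For reference, the well-defined procedure is nothing more than the weight-normalised sum: mean = sum(w_i * x_i) / sum(w_i). A minimal sketch -- my own illustration with made-up numbers, not the ROOT code -- looks like this:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Standard weighted mean: sum(w_i * x_i) / sum(w_i). Nothing about this
// breaks for negative weights, provided the weights don't sum to zero.
double weighted_mean(const std::vector<double>& x, const std::vector<double>& w) {
  double sumwx = 0.0, sumw = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sumwx += w[i] * x[i];
    sumw  += w[i];
  }
  return sumwx / sumw;  // caller's job to ensure sum(w) != 0
}

int main() {
  // Illustrative values only: a negative weight of the sort an NLO
  // event generator will happily hand you.
  const std::vector<double> x = {1.0, 2.0, 3.0};
  const std::vector<double> w = {0.5, -0.2, 1.0};
  std::cout << weighted_mean(x, w) << "\n";  // (0.5 - 0.4 + 3.0) / 1.3 ~= 2.38
  return 0;
}
```

Negative weights are a perfectly ordinary, everyday occurrence in HEP -- NLO event generators produce them all the time -- so this is not some obscure corner case.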

Well, the response from Rene Brun, the ROOT project leader, is that ROOT have "no plans to change this algorithm". This is absurd: the LHC experiments depend on ROOT to provide them with accurate statistical results. Errors in ROOT mean mistakes in the LHC physics programme: this is not a feature request that can be turned down because it doesn't fit with ROOT's plans, it's a bug which needs to be fixed. That ROOT's management appears to see no reason to get simple calculations correct, and that errors this simple still exist after nearly 20 years of development, is an indictment of ROOT's relationship with the LHC programme... and of how CERN and the experiments have dropped the ball in making ROOT a useful tool targeted at LHC physics. I've often heard comments that ROOT is all about empire-building and CERN politics rather than about good science -- I'd rather not believe that, but examples such as this make it hard to draw a more appealing conclusion. The nicest thing I can say is that Rene apparently hasn't understood that this is a bug. If I were feeling more conspiracy-minded I'd interpret it as saying that he doesn't care whether ROOT is right, as long as it's used. And it is certainly used -- ROOT has a virtual monopoly over LHC physics data analysis, without delivering the sort of quality that such a position demands. My criticism of ROOT is pretty well established, but in the expedient interest of getting useful things done rather than starting fights, I've steered clear of airing it in public for quite a while... oh well, it was nice while it lasted. ROOT is at its most fundamental a statistical analysis toolkit -- that's what 90% of its users do with it for 90% of their time. At least, that's how most users perceive what it's for: as far as I can tell, the true "design" purpose of ROOT is to become as monolithically bloated as possible with all the contents of the ALICE experiment software framework, plus any features that Rene happens to take an interest in.

Given its core usage, it's remarkable how bad ROOT is at statistical analysis and data plotting! You have to explicitly tell it to get the errors on weighted histograms correct, rather than it working out of the box. You have to do some work to make it understand that histograms can have bins of variable width. The quality of its plotting is exceptionally poor -- how embarrassing that in a field where publication is dominated by beautifully rendered LaTeX formats, ROOT complements that with exceptionally ugly plots, poor-man's math fonts (argh, that sqrt sign...), an inexplicable default grey background, and drop shadow boxes around titles. The list goes on: insane APIs and inheritance structures, unusable object ownership semantics, serial avoidance of compatibility with such standards as the STL (amazingly, you can't even pass an std::string when ROOT wants a character string), an ill-advised yet bizarrely undeprecated flaky "C++" interpreter...
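
To illustrate the first two of those gripes, here's roughly the boilerplate involved -- a minimal sketch using the standard TH1 calls, with names and numbers of my own invention:

```cpp
#include "TH1D.h"

void weighted_histo_sketch() {
  // Without this, bin errors on a weight-filled histogram are derived from
  // the raw entry counts rather than from sum(w^2), i.e. they're wrong.
  TH1::SetDefaultSumw2(true);   // or call h.Sumw2() on each histogram by hand

  TH1D h("h", "uniform bins", 50, 0.0, 100.0);
  h.Fill(12.3, 0.7);            // weighted fill

  // Variable-width bins mean spelling out the bin edges explicitly.
  const double edges[] = {0.0, 5.0, 10.0, 20.0, 50.0, 100.0};
  TH1D hvar("hvar", "variable-width bins", 5, edges);
  hvar.Fill(12.3, 0.7);
}
```

None of this is hard, but it is exactly the sort of thing that should be the correct behaviour out of the box rather than an opt-in.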

The focus of ROOT development is dominantly on little-used functionality -- or features which should be kept in the ALICE software framework rather than attached to a widely-used tool -- while the core statistics and plotting have never been improved from their original, flawed implementations: think of the time just spent making the "THtml" Doxygen clone, which could have been spent making the histogramming any good. Thousands of people use it every day, yet few attempt to do anything about its fundamental lack of suitability for our purposes: what a frustrating community we physicists are, sometimes. Sadly, ROOT has also played a role in making some neat things less useful: the MathCore and Reflex systems started off as standalone libraries, and for political reasons were assimilated... I and others would love to be able to use both libraries in other projects, but the monolithic nature of ROOT makes it far too big (and nasty) a dependency to add lightly.

A couple of years ago I met Rene and several other ROOT developers at a conference -- they're nice guys, but we disagree fundamentally on what is needed, and on the direction that they've taken. Several of them are extremely good technical developers, which makes it even more tragic that their efforts are being expended in such an often-counterproductive direction. I've heard that there have been career security issues associated with this: you can be a good scientist and do nice things, but unless you join ROOT that CERN staff position won't be forthcoming: nasty stuff if true. Aside from this, it is very odd how little control CERN or the LHC experiments have over ROOT. The LHC physics programme is far too important for all ROOT design and implementation decisions to be made just by the ROOT team: we are enormously dependent on ROOT to get LHC physics right, but features, design, and development emphasis seem to be left to Rene and his team -- and their interests do not seem to be aligned with those of the LHC experiments and physicists. For example, the CINT interpreter has had myriad bugs over the years and was shown several years ago to be an order of magnitude slower than the Python interface (let alone compiled C++)... yet CINT is still the standard, heavily pushed interface, even if you think that interactive C++ is a good idea (which I would contest to the ends of the earth). Heck, it's not even that suitable a compiled language for physics analysis purposes! Where's the feedback process? Where's the management?

I've recently seen that Rene has proposed some incorporation of ROOT into the Geant4 detector simulation package. G4 has its own set of problems, but I don't see how ROOTifying it would play any role whatsoever in improving detector simulation. It would, however, add some extra weight to the monolith and further cement ROOT's position as an entirely indispensable and fundamental tool for our subject -- by inflexibility rather than merit -- and I think CERN and the LHC experiments should be very sceptical of the motivations for this proposal: it has not come from the experimental or the G4 community, but from ROOT, who see another key project ripe for assimilation. If we go this way, we are likely to spend a lot of time and money making yet another HEP software monolith that's not really fit for purpose, but to which we have no alternative. Sometimes I wonder what good several thousand PhDs are when we make such simple errors with our core tools: it's a sad fact that many physicists are happier spending a week writing an awkward ROOT tool to solve a physics problem than devoting a few days to making a better tool, so that it'll only take one day to solve the problem in future. But unless that changes, or physicists realise how much better tools like ROOT could (and should) be, it looks like we're going to be stuck with software which can't correctly calculate a mean for the foreseeable future. How sad.

Back to life

I've never been the most active blogger, but 2 years between posts seems a little excessive, even to me! I wish I could explain the cornucopia of reasons for this, but I'll just point a couple of accusatory fingers at the extraordinary busyness of life (in the intervening time I moved to Edinburgh, the LHC restarted, and my working life has generally been batshit insane) and general frustration at my server setup. You may also be frustrated with the slowness of this server: turns out that running a mail server and a Rails application on a 256 MB virtual server puts me deep into the swap memory zone and site performance accordingly drops into the region marked "painful". I'd have loved to fix this 2 years ago, but a) I've written enough web apps in the past to not want to make another half-assed one, and b) did I mention work being batshit insane? Anyhoo, someone asked me at a recent meeting when I was going to start blogging again, and after the initial shock of discovering that I have some sort of niche audience I decided that I should start venting my spleen in more substantial ways than the 140 chars offered by Twitter.

A bit of recent server upgrading means that this site is not quite as slow as it has been -- I'm enormously impressed by the service from Slicehost, by the way -- and I'm going to try to replace Radiant with something a little more fun, speedy, and memory-efficient in this little pre-2011 gap. Suggestions of CMSes/blog engines with user comments, picture galleries, code highlighting, etc., and support for static content (such as this site's Cambridge night climbing content) would be much appreciated. Oh, and because I'm a picky person and have not found Ruby/Rails to be a terribly pleasant development experience, being based on Python would also be a big bonus!

A week in the land of contradictions

It's the end of another busy week. Work life has been busy to the point of insanity recently, burrowing its way into every available bit of spare time... if you consider every weekend since mid-October to be spare time rather than "essential time", that is. I'm actually inclined to the latter view: while I'm happy to declare that my work is captivating and inspiring, some downtime is definitely needed. In the last month I've been to two week-long conferences in Italy, snatched a week back in Durham and then spent the last week in Chicago. After one more week in Durham, in which to pay some attention to demonstrating and marking Frank's excellent new computational physics course, I'm off on my travels again, this time to CERN for a week. And then it's Christmas and skiing; January is looking a bit crazy, and I'm trying not to think about that. Fortunately Jo has been a star and given me some (unearned) slack, but this schedule isn't really fair on either of us: I think I'll be imposing a more restricted travel schedule in the New Year and hopefully sending some collaborators out to do the salesman thing instead ;-)

Anyway, reflections on the past week: I have to say, I've really enjoyed Fermilab. Most people seem to bitch about the "boring site", the strip malls of West Chicago, the weather and anything else that springs to mind, but I must be a bit funny in the head because I like it all. Okay, not the strip malls --- a bit of restriction on the suburban planning process would have been welcome --- but here's a list of what I've been up to, other than giving talks and coding up experimental analyses in Rivet:

* a night taxi ride from the airport to St Charles through the first drifting Chicago snows of the year;
* arriving at Wilson Hall, a wonderful piece of high-rise architecture which combines the brutal functionality of US Government and military sites with physics quirkiness. I was sorely tempted to step across the abyss on the 15th floor, but the consequences of a silly mistake made my stomach lurch in a most un-climbery way!
* that dry sub-zero weather... all cold, crisp and wintery. I'm a cold-weather person, but it's so much better without the British innovation of constant drizzle. It keeps everything nice and clean, too, when the blowing leaves don't have enough moisture to turn to mulch everywhere.
* this place gets tornados. Something about the proximity of underground storm shelters (some in the experiment control rooms!) makes me feel like just staying here is somehow a wee bit hardcore!
* staying in Fermilab Village: a former town that was abandoned and restored when the site was established. It's like a colourfully-painted summer camp, with a freaky Physics-Amish aesthetic. Cool.
* the FNAL wildlife: a herd of buffalo, hovering hawks and honking flocks of pre-migration geese;
* I hired a huge, heavy and completely rubbish cruiser bike which wouldn't go uphill without de-chaining (that's a problem on a single-speed bike with no chain tensioner and a big metal chain guard!) and cruised around the site and the area on it. I must ship my current bike out to CERN when I retire it...
* having to cycle several miles out of the way for lunch because some fool put a particle accelerator in the way, with lakes and a moat in the middle!
* the awesomeness of the Two Brothers Tap House, and particularly the Northwind Stout. Dangerous stuff. The friendliness of the staff is amazing... and genuine, too (or really well faked);
* biking back to the lab post-beer through driving snow, stopping to throw rocks on the frozen lakes ;)
* and tomorrow I "do Chicago". Looking forward to it.

Anyway, from the above it's probably clear that I'm a cold-weather romantic (I love Lund for similar reasons) and sure, the initial enchantment probably fades fast, but it's been a great week. It's also good to remind myself of all the good things about the US: aside from the strip malls, sprawling suburbs and the meathead portion of the population that drives the commercialism and tackiness of mass US culture, most Americans I meet are friendly, genuine and intelligent. And there is that underlying feel of a pioneering, self-sustaining community still "in the air"... I can see why Americans are so enthusiastic about the importance of community, although maybe that same tight-knit community ethos lends itself to the less-positive lack of interest in the outside world among the population at large. Maybe I'm (as usual) sampling from a special subset of the population, since I can't match the people I meet to what we in Europe see as quintessentially American. Or maybe it's just hard for us to see beyond the in-your-face nature of US sports and TV and appreciate the similarities rather than differences between the cultures; the presentation may be different, but the underlying psyche isn't so far off, especially between the kind of educated people that you're going to meet at a physics lab. And you can hardly blame individuals for only being able to deviate a certain distance from the social norms of the country they're "embedded" in; fill in the GR analogies for yourself if you're that way inclined. Or maybe the tide is turning --- America's so-called disaffected youth turned out in near record numbers a couple of weeks ago to vote a young(ish) black man with talk of change and tolerance into the highest office in the land: it's hard to find a downside to that. Okay, the fact that the ludicrous opposition campaign fielded by the GOP wasn't immediately laughed out of existence is still cause for concern... but let's enjoy the victories while we have the opportunity!

The one thing that I find offensive and shocking is the obvious social inequality: everyone cleaning the floors or serving (baaaad) food in the FNAL canteen is Mexican. Maybe I've just not lived in a big enough city in the UK to notice such blatant correlations between race and social status. But then I have a European unease about the fundamental ethos of US society as "every man for himself" capitalism. "Socialism" is about as far from a pejorative as it can get in my dictionary, but was recently used by Hank Paulson et al as if it were a synonym for collective agriculture, tractor production quotas, one-child laws and five-year plans. Is "spreading the wealth a little" really such a bad idea in a country which simultaneously manages to be the world's richest but to have virtually no healthcare or unemployment support for its poorest citizens?

It's a land of contradictions alright: the thing that's surprised me most this week has been how much the friendliness and community spirit of the people I've met and worked with has contradicted the less savoury aspects of US culture that receive most attention in British stereotyping of our Yankee friends. Maybe I wouldn't live here long-term, but there's plenty that's charming about the US... even in West Chicago!

Burying the hatchet

"Ah, the famous Andy Buckley. Or perhaps infamous, no?". When a dapper French gent at a statistics conference addresses you this way, I guess it's normal to feel a bit perturbed, particularly when you've devoted a serious chunk of time in the last few years to publicly demonizing their work. Really, I suppose it's maybe a minor miracle that Rene Brun and I haven't crossed paths before now --- although I guess this is largely to do with me being keen to avoid a fight. Rene is, as particle physicists will know, the author of many infrastructural and statistical packages in high energy physics. In particular, he's head honcho of the ROOT system, a piece of software which I've long considered (not in isolation, I should add) to be fundamentally misconcieved in myriad ways. Despite being naturally non-confrontational, I managed once to become a sort of focus for community displeasure with ROOT, and hence a persona non grata among the ROOTisti.

Fortunately, Rene isn't here to argue. He smiles and suggests a coffee or lunch with his lieutenants later. To my great surprise, this actually happens: I had half-assumed that he had just come to ogle the dissident weirdo. As a result of these lunchtime conversations, I've shifted my opinion on the ROOT guys slightly. Sure, I still think the system is wrong in many ways --- confused object ownership, piss-poor stats object design, too much of an emphasis on barely-designed quantity over honed quality, and a general kitchen-sink lack of focus --- but they're a technically competent bunch with the difficult job of responding to continual mad demands from experimental collaborations. Of course, I think that they should learn to say "no" a lot more often, and to pro-actively deprecate bad design ideas, but that's a difficult thing to do, especially when you have a lot of users. Sometimes I'm glad that Rivet isn't overwhelmed with users, as it makes it still possible to redesign until we really think the interface is right... which is not to say that more contributors wouldn't be nice!

On the whole, I'm glad to say that the development standards on ROOT have been tightened up over the last couple of years, and it's good to see a developing attitude of "let's use external packages, when they're available", even if that probably means "let's bundle external packages with our huge tarball". I would still dearly love to see CINT consigned to the dustbin of history (C++ as an interactive UI? What an, erm, great idea...), but at least there are plans to throw away the really crappy parsing and plug in something more robust. Even Rene admits that only "10% of the functions in 10% of the libraries" see regular use, although culling the code doesn't seem to have become goodthink in ROOT land just yet.

What is still needed, I think, is a move to treating physicists as contributors rather than users --- many are technically very good, many have complaints about how the system makes things difficult, and ultimately the point of ROOT is to provide statistical data analysis infrastructure for (primarily) LHC physics: feedback matters. This was particularly highlighted for me by Rene's repeated enthusiasm for a 3D OpenGL-powered data tree viewer... such eye-candy is fun to implement, but requests for better histogramming or 1D histogramming through slices of ND space were clearly not cool enough to excite Rene's attention. Surely more accountability and user input in the design process is needed? But perhaps I'm just sore about enduring all the FUD on the "roottalk" mailing list when I was dragged into the ROOT Flame War of 2005. Notably, the abuse and FUD were (mainly) perpetrated by crazy disciples of the One True TPath, rather than the developers, but the overall attitude was "how dare you express dissent" --- a much more closed-minded attitude than I've ever encountered on an open source project.

Anyway, if nothing else, Erice was a good place to bury the hatchet with the ROOT guys. And maybe they won't mind the occasional complaining so much, now that they know who's behind it ;)