It's a bug, not a feature, Rene

I've had my attention drawn to this reply to a ROOT bug report, which I think highlights a serious problem with how the ROOT project interacts with the LHC experiments. In short, someone contacted the ROOTtalk mailing list to inform them that ROOT's calculation of weighted means is incorrect if there are negative weights involved. There is a well-defined procedure for calculating weighted means, and in fact it's dead simple. There's no reason to not get it right. This is a bug report, about a significant numerical error in just about the simplest statistical quantity that anyone might want to calculate: any statistical analysis tool worth bothering with would provide a bug fix as fast as possible. So how did ROOT respond?
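For concreteness, here is a minimal sketch of the textbook weighted mean, sum(w*x) / sum(w), which stays well defined with negative weights as long as the weights don't sum to zero. The function name `weighted_mean` is made up for illustration; this is not ROOT's code, and it makes no claim about what ROOT's algorithm actually does.

```python
import math


def weighted_mean(values, weights):
    """Textbook weighted mean: sum(w * x) / sum(w).

    Hypothetical helper for illustration only. Negative weights are
    allowed; the only undefined case is a total weight of zero.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = math.fsum(weights)
    if total_weight == 0:
        raise ZeroDivisionError("sum of weights is zero; the mean is undefined")
    # No absolute values anywhere: each weight keeps its sign in both sums.
    weighted_sum = math.fsum(w * x for x, w in zip(values, weights))
    return weighted_sum / total_weight


# One negative weight: (1*1 - 1*2 + 2*3) / (1 - 1 + 2) = 5 / 2
print(weighted_mean([1.0, 2.0, 3.0], [1.0, -1.0, 2.0]))
```

That really is all there is to it, which is rather the point.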

Well, the response from Rene Brun, the ROOT project leader, is that ROOT have "no plans to change this algorithm". This is absurd: the LHC experiments depend on ROOT to provide them with accurate statistical results. Errors in ROOT mean mistakes in the LHC physics programme: this is not a feature request that can be turned down because it doesn't fit with ROOT's plans, it's a bug which needs to be fixed. That ROOT's management appears to see no reason to get simple calculations correct, and that errors this simple still exist after nearly 20 years of development, is an indictment of ROOT's relationship with the LHC programme... and of how CERN and the experiments have dropped the ball in making ROOT a useful tool targeted to LHC physics. I've often heard comments that ROOT is all about empire-building and CERN politics rather than about good science -- I'd rather not believe that, but examples such as this make it hard to draw a more appealing conclusion. The nicest thing I can say is that Rene apparently hasn't understood that this is a bug. If I were feeling more conspiracy-minded I'd interpret it as saying that he doesn't care if ROOT is right, as long as it's used. And it is certainly used -- ROOT has a virtual monopoly over LHC physics data analysis, without delivering the sort of quality that such a position demands.

My criticism of ROOT is pretty well-established, but in the expedient interest of getting useful things done rather than starting fights, I've steered clear of voicing it in public for quite a while... oh well, it was nice while it lasted. ROOT is at its most fundamental a statistical analysis toolkit -- that's what 90% of its users do with it for 90% of their time. At least, that's how most users perceive what it's for: as far as I can tell, the true "design" purpose of ROOT is to become as monolithically bloated as possible with all the contents of the ALICE experiment software framework, plus any features that Rene happens to take an interest in.

Given its core usage, it's remarkable how bad ROOT is at statistical analysis and data plotting! You have to explicitly tell it to get the errors on weighted histograms correct, rather than it working out of the box. You have to do some work to make it understand that histograms can have bins of variable width. The quality of its plotting is exceptionally poor -- how embarrassing that in a field where publication is dominated by beautifully rendered LaTeX formats, ROOT complements that with exceptionally ugly plots, poor-man's math fonts (argh, that sqrt sign...), an inexplicable default grey background, and drop shadow boxes around titles. The list goes on: insane APIs and inheritance structures, unusable object ownership semantics, serial avoidance of compatibility with such standards as the STL (amazingly, you can't even pass an std::string when ROOT wants a character string), an ill-advised yet bizarrely undeprecated flaky "C++" interpreter...
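To show how little is actually being asked for, here is a rough sketch of a histogram that tracks weighted errors and handles variable-width bins by default. It assumes the usual convention that a bin's uncertainty is the square root of the sum of squared weights. The class `WeightedHistogram` and its methods are hypothetical names for illustration, not any ROOT API.

```python
import bisect
import math


class WeightedHistogram:
    """Illustrative 1D histogram with arbitrary bin edges (not a ROOT class)."""

    def __init__(self, edges):
        self.edges = list(edges)        # ascending bin edges, any widths
        nbins = len(self.edges) - 1
        self.sumw = [0.0] * nbins       # sum of weights per bin
        self.sumw2 = [0.0] * nbins      # sum of squared weights per bin

    def fill(self, x, weight=1.0):
        # Bins are half-open [low, high); out-of-range values are dropped.
        if not (self.edges[0] <= x < self.edges[-1]):
            return
        i = bisect.bisect_right(self.edges, x) - 1
        self.sumw[i] += weight
        self.sumw2[i] += weight * weight

    def error(self, i):
        # Assumed convention: sqrt(sum of w^2), always tracked, no opt-in.
        return math.sqrt(self.sumw2[i])

    def density(self, i):
        # Dividing by the bin width makes unequal bins comparable.
        return self.sumw[i] / (self.edges[i + 1] - self.edges[i])


h = WeightedHistogram([0.0, 1.0, 3.0, 10.0])
h.fill(0.5, 2.0)
h.fill(2.0, 3.0)
h.fill(2.5, 4.0)
print(h.sumw[1], h.error(1), h.density(1))
```

Tracking the squared weights costs one extra array, which is why having to opt in to correct errors is so hard to justify.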

The focus of ROOT development is predominantly on little-used functionality -- or features which should be kept in the ALICE software framework rather than attached to a widely-used tool -- while the core statistics and plotting have never been improved from their original flawed implementation: think of the time just spent making the "THtml" Doxygen clone, which could have been spent making the histogramming any good. Thousands of people use ROOT every day, yet few attempt to do anything about its fundamental lack of suitability for our purposes: what a frustrating community we physicists are, sometimes. Sadly, ROOT has also played a role in making some neat things less useful: the MathCore and Reflex systems started off as standalone libraries, and for political reasons were assimilated... I and others would love to be able to use both libraries in other projects, but the monolithic nature of ROOT makes it far too big (and nasty) a dependency to add lightly.

A couple of years ago I met Rene and several other ROOT developers at a conference -- they're nice guys, but we disagree fundamentally on what is needed, and on the direction that they've taken. Several of them are extremely good technical developers, which makes it even more tragic that their efforts are being expended in such an often-counterproductive direction. I've heard that there have been career security issues associated with this: you can be a good scientist and do nice things, but unless you join ROOT that CERN staff position won't be forthcoming: nasty stuff if true. Aside from this, it is very odd how little control CERN or the LHC experiments have over ROOT. The LHC physics programme is far too important for all ROOT design and implementation decisions to be made just by the ROOT team: we are enormously dependent on ROOT to get LHC physics right, but features, design, and development emphasis seem to be left to Rene and his team -- and their interests do not seem to actually be aligned with those of LHC experiments and physicists. For example, the CINT interpreter has had myriad bugs over the years and was shown several years ago to be an order of magnitude slower than the Python interface (let alone compiled C++)... yet CINT is still the standard and heavily pushed interface. And that's even if you think that interactive C++ is a good idea, which I would contest to the ends of the earth. Heck, C++ is not even that suitable a compiled language for physics analysis purposes! Where's the feedback process? Where's the management?

I've recently seen that Rene has proposed some incorporation of ROOT into the Geant4 detector simulation package. G4 has its own set of problems, but I don't see how ROOTifying it would play any role whatsoever in improving detector simulation. It would, however, add some extra weight to the monolith and further cement ROOT's position as an entirely indispensable and fundamental tool for our subject -- by inflexibility rather than merit -- and I think CERN and the LHC experiments should be very sceptical of the motivations for this proposal: it has not come from the experimental or the G4 community, but from ROOT, who see another key project ripe for assimilation. If we go this way, we are likely to spend a lot of time and money making yet another HEP software monolith that's not really fit for purpose, but to which we have no alternative. Sometimes I wonder what good several thousand PhDs are when we make such simple errors with our core tools: it's a sad fact that many physicists are happier spending a week writing an awkward ROOT tool to solve a physics problem than devoting a few days to making a better tool, so that it'll only take one day to solve the problem in future. But unless that changes, or physicists realise how much better tools like ROOT could (and should) be, it looks like we're going to be stuck with software which can't correctly calculate a mean for the foreseeable future. How sad.

