Wagging the dog in academia

It's been a couple of months since I posted here, partly because of holiday, partly work, and partly because what spare time I've had has been spent voraciously following the economic and political conversations that the surprisingly interesting Labour leadership election campaign has raised.

I suspect I'll spew my thoughts on the latter topic, and the general state we're in, at a not-much-later date. But today an easier task: a few thoughts on a residential "creativity" course that I did over the last couple of days.

All in all it wasn't bad. My thought on this sort of thing is that although many things that the University promote to us sound like the worst kind of management guff, there is often a kernel of useful content. Getting it might require a trade-off, i.e. sitting through 8 hours of infuriating nonsense to get the benefit of 5 mins of mild insight, but taking the occasional risk is, I think, preferable to decrying well-meaning "personal development" forever, and maybe missing out on something useful. As it happens, I had attended an "Effective Communication" course just a month ago, which was well worth the morning that I spent on it. Sometimes I even surprise my cynical self.

A two-day residential course is a lot more than a morning, of course, but it was an easy call to make when I realised that this course would count as a more enjoyable version of the mandatory "Entrepreneurship" course that I need to attend. The first day had its good moments, but suffered -- or rather I suffered -- from endless hours of repeating the truism that creativity is in practice always about inventive, revealing combinations of existing things, rather than somehow popping 100% new ideas, based on not an iota of pre-existing knowledge, out of the vacuum. It's a reasonable point, and helpful for some, but is like an unconstructive mathematical proof: sure, that's true, but it has nothing to say about either which combinations are interesting or how to bias ourselves toward finding them. Day 2 was a bit better, with a nice exercise on a "toy model" topic... apparently we need to pay the training consultant more to get the "proper" course that works on real situations. Oh well. There's more to come, and I don't regret the two days of attempted self-improvement.

One of the things that struck me, among all the "unlearning" and challenging of preconceptions, was that every time my colleagues (in the broad sense -- I was the only physicist in the room) were given the opportunity to ask a free-form question, it was about grant applications, reviews, publications, and all the other paraphernalia of academic life. This isn't really surprising on the face of it, but the form of the questions caught me by surprise: there was rarely a sense of perspective, or awareness that there is value in many of the things that we do regardless of whether they lead to grants or publications. It was implicit that the only things that matter are those superficial aspects of academia -- the cargo cult stuff -- that performance reviews and promotion metrics focus on. There was even a repeated question along the lines of "Shouldn't I just leave creativity for later, since I think grant reviewers want to see safe proposals at this stage?". Very sad.

The question I ended up asking is "Why am I not equally myopic?" or more self-critically, "What do they know that I don't?". One part of the answer really is personal. My career hasn't had a very standard trajectory for a HEP experimentalist -- post-PhD I took 4 years "out" of experiment to work among theorists, in an environment where I was essentially my own boss for 90% of the time, and had great support from the likes of Jon Butterworth. This gave me a lot of freedom to establish a value system that was all about the science quality rather than the cargo-cult trappings of academia -- which of course I am framing to you as being the One True Way. And my natural inclination is anyway that rocking boats are more interesting than the plain sailing type. But while there are certainly plenty of experimental HEP'ers whose primary focus is strategic moves for rapid career progression -- conventionally followed by endless moaning about the unfairness of all the teaching and admin that they couldn't wait to inflict upon themselves -- I think on the whole our peculiar type of science protects us from the worst effects of modern academia's performance metrics on at least two fronts.

First up is the meaninglessness of citation counts in experimental HEP. In this strange world where you need only qualify as an ATLAS author once, and never again be asked to do any service (or indeed any) work to justify the constant flow of papers with your name on them, publication measures like raw citation counts, paper counts, or h-indices mean virtually nothing. Their strongest correlation is with longevity in the field, and in particular longevity on major running experiments. While this gives some advantage to those who spent the 2000s on running Tevatron experiments rather than in-development LHC ones, I think the main effect is a sort of scroll-blindness: everyone's numbers are so large and so similar that there is no power of differentiation. And when it comes to exercises like the REF, pretty much every UK HEP group points at the same major papers and has a half-decent case for doing so. Having filled our CVs up to the brim with collective publications, there is actually remarkable freedom on the resulting Fermi surface for us to focus on what we find interesting, without needing to daily obsess about The Paper. I'm also thoroughly looking forward to the demise of that outdated, cargo-cultish mode of academic communication -- and wealth transfer to Elsevier -- but that's for another day.

The second point in our favour is STFC's group consolidated grants, which are necessary for functional operation of very large projects over decades, as opposed to the 1 or 2 year peripatetic funding that is the norm elsewhere. Again there is individual freedom to be found in collectivism (christ, this is going to start reading like Maoist propaganda any minute now) -- most of us need not be overly concerned with grant chasing, particularly as there really aren't that many of them to chase. One poor biochemist I talked to said he'd put in 25 grant applications in the last 2 years -- I'm pretty sure there aren't even close to that many funding calls in total in UK/EU particle physics over that timescale. I can imagine that if your life becomes that dominated by application writing, then just like the people in the Bill Hicks skit you start to forget that it's just a ride. You forget that the reason you do this is not really the funding, or the promotion, or any of that crap, but the satisfaction of a job well done and of increasing our collective knowledge and wisdom.

Particle physics isn't a panacea, of course -- there is still deep unfairness over the number of excellent postdocs that we train, overwork, and then fail to provide permanent places for. And our huge collaborations have brought new modes of careerist gaming, and perverse incentives to do bad or at least substandard science. Grants are still chased, albeit with more emphasis on personal fellowships than project funding; and to my colleagues I'm sure I sound like a broken record when criticising their daily attendance of interminable ATLAS videoconference meetings* -- the motivation for which is something like "Jesus is coming; look busy" in the belief that being sub-co-coordinator of the Paper Clips Working Group is going to have some positive career impact. But despite all that, I think we've been strangely blessed by the administrative implications of our supersized science: a sort of academic asymptotic freedom. Long may it last.

[*] I'll maybe also moan about the appalling quality of ATLAS meetings at a later date. I'm just going to say here and now that I stopped attending them about 2 years ago, unless I specifically have a horse in that race. I've yet to notice any adverse impact, and I have a lot more headspace for physics thinking. Try it.

Migrating from Radiant CMS to... *anything* else

I recently mentioned the painful transition of this website from the Ruby/Rails Radiant content management server to... well, anything that would actually work. Given its popularity, I have to assume that Ruby and Rails can be made to work well -- or that 1000s of development teams are herd-following idiots, but that can't be true, right? -- but my experience was a nightmare.

Mysterious Rakefiles, UI-disaster server commands, awful integration with system packages, god-awful outdated Radiant documentation, and changes with every release. In the end, an update of the base Ubuntu OS completely broke Radiant. I tried using Ruby Gems in all the ways I could find, and updated every package to the latest that Radiant thought it wanted but couldn't get it to run again. I tried making a new Radiant site and migrating the database via the advertised commands: it crashed. And in the end it seemed that Radiant's own declaration of package dependencies was inconsistent. This was just the final straw after several years of expecting a Rails epiphany, and dreading every time that I'd have to restart the server and somehow get the creaking mess up and running again.

Well, enough was enough. I've now moved to using the Nikola static site generator instead and couldn't be happier: it's got a great command-line UI, it's totally clear what's going on, I can hack and extend it if I want to, and my data is forever in a human-readable, editable (even when offline!) format.

Radiant's page data is categorically not available in a human-readable format, so a significant part of the effort to get this site back to life was the need to write a script to access its article database, and dump out the pages in a form I could use. Fortunately the db is just an sqlite single-file database, and the table structure was pretty simple, so the dump script was easy. Here it is for posterity:

radiant2txt

#! /usr/bin/env python

"Convert a RadiantCMS SQLite3 db file into separate page and header text files"

import optparse, os
op = optparse.OptionParser()
op.add_option("-o", "--out", dest="OUTDIR", default="out")
opts, args = op.parse_args()

import sqlite3
conn = sqlite3.connect(args[0])
conn.row_factory = sqlite3.Row
c = conn.cursor()

import unicodedata
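def norm(s):
    # Strip accents and return text (not bytes) so it can be concatenated and
    # written under Python 3; empty/NULL database fields become ""
    return unicodedata.normalize("NFD", s or u"").encode("ascii", "ignore").decode("ascii")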

import datetime
def date(s):
    return datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S").date().isoformat() if s else ""

import textwrap, re
class DocWrapper(textwrap.TextWrapper):
    """Wrap text in a document, processing each paragraph individually"""

    def __init__(self):
        self.tw = textwrap.TextWrapper(width=120, break_long_words=False)

    def wrap(self, text):
        """Override textwrap.TextWrapper to process 'text' properly when
        multiple paragraphs present"""
        para_edge = re.compile(r"(\n\s*\n)", re.MULTILINE)
        paragraphs = para_edge.split(text)
        wrapped_lines = []
        for para in paragraphs:
            if para.isspace():
                wrapped_lines.append('')
            else:
                wrapped_lines.extend(self.tw.wrap(para))
        return wrapped_lines

dw = DocWrapper()

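# Make sure the output directory exists before writing any pages into it
if not os.path.isdir(opts.OUTDIR):
    os.makedirs(opts.OUTDIR)

for page in conn.execute("SELECT * FROM pages"):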
    pagename = page["slug"] if page["slug"] != "/" else "index"
    outfile = os.path.join(opts.OUTDIR, "%s.md" % pagename)
    with open(outfile, "w") as f:
        f.write("<!-- \n")
        f.write(".. title: " + norm(page["title"]) + "\n")
        f.write(".. slug: " + pagename + "\n")
        if page["published_at"]:
            f.write(".. date: " + page["published_at"] + "\n")
        else:
            f.write(".. date: 2008-06-01 12:00:00\n")
        f.write(".. type: text\n")
        f.write(".. category: blog\n")
        f.write("-->")
        f.write("\n\n")
        for part in conn.execute("SELECT * FROM page_parts WHERE page_id = ? ORDER BY page_parts.name", (page["id"],)):
            text = dw.fill(norm(part["content"]))
            if text:
                f.write(text + "\n")

To get a bunch of pages out in the format I wanted (my site was using Markdown syntax, so the script writes out to a bunch of .md files), I ran this like:

./radiant2txt myradiantsite/db/radiant_live.sqlite.db -o out-nikola

A bit of manual hacking followed, but 95% of the job was done by the script above. Use if you like, but don't ask me for support; if you need something a bit different, hack it!
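If your Radiant database isn't laid out quite like mine, the first bit of hacking is probably to check which tables and columns it actually contains. Here's a minimal sketch of that, assuming nothing beyond Python's standard sqlite3 module; the `describe_tables` helper is a name I've made up for illustration and isn't part of radiant2txt:

```python
import sqlite3

def describe_tables(conn):
    """Return {table name: [column names]} for every table in an SQLite db."""
    tables = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    for (name,) in rows.fetchall():
        # Quote the table name so unusual names don't break the PRAGMA
        quoted = '"' + name.replace('"', '""') + '"'
        # PRAGMA table_info gives one row per column; index 1 is the column name
        cols = conn.execute("PRAGMA table_info(%s)" % quoted).fetchall()
        tables[name] = [col[1] for col in cols]
    return tables
```

Pass it a connection made with `sqlite3.connect(...)` on your own db file and print the result; if the `pages` and `page_parts` tables or the columns used above are missing, that tells you which bits of the script need changing.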

MP letter re. EDM 49 on Royal/commercial FoI

Well, I'm blogging again, and it seems to me that if I'm going to write a letter to my MP on a national issue, then I may as well wear my heart on my sleeve and make it an open letter. So here's the latest --- in fact the first I've written for a while, due to the replacement of my long-standing traditionalist/institutionalist MP with a hopefully more sympathetic model:

Attn: Owen Thompson MP Midlothian

Wednesday 10 June 2015

Dear Owen Thompson,

I'm writing to ask you to sign Parliamentary Early Day Motion 49, "Freedom of Information Legislation, publicly funded bodies and the royal family."

This EDM calls for two important things: 1) that commercial confidence not be a justification for secrecy on public sector contracts (after all, we all are the paying clients), and 2) that the Royal Family not be given special exemption from the freedom of information rules that govern all other publicly funded bodies (again, we are all paying for them and deserve to know what we get for our money).

The first point is, I hope, self-evident. On the second, I think it is worth noting that the recently published Prince Charles correspondence with ministers has shown how the heir to the throne, regardless of whether you agree with his comments, has abused his position of conventional neutrality on numerous political issues. He pressed ministers to favour his own interests and organisations, and the evidence is that most felt compelled to respond more substantially than they would to an "ordinary" citizen.

Extraordinarily, David Cameron claimed that there was an "important principle about the ability of senior members of the royal family to express their views to government confidentially" -- it's somehow democratically important that unelected aristocrats have special access to legislators despite that being constitutionally taboo?! And rather than respond constructively to the exposure, there is clearly a determination from the Conservative Government simply to hide the abuse from public view. This must be opposed, and indeed the existing exemption of the Royals from FoI requests (in response to the moves to publish Charles' letters) should be repealed. Please sign the EDM that calls for this.

Yours sincerely,

Dr Andy Buckley

And that's that.

Pygmentizing code for LaTeX

A couple of years ago, I realised that actually quite a few people were using my PySLHA library and plotter, and that I should write it up for them to cite, that being the tail-wags-dog way that the academic world rolls. So I knocked something together.

While writing this, using a LaTeX class file of my own devising, I decided I wanted to render my Python code examples better than the venerable listings package can do. And I found minted, a clever LaTeX package which automatically runs Pygments via the LaTeX shell escape mechanism. Problem is, the arXiv doesn't allow -shell-escape running of LaTeX; I had to beg a favour to get my original version of the paper uploaded.

Now I'm coming up to a major new release of PySLHA, it seems worth updating that arXiv note, and maybe even trying to get it "properly" published for the usual ineffable reasons. And another minted special request isn't going to wash. But I still like its output. So I just figured out what it was doing, and fiddled together a teeny bash script that provides the same code snippets statically. I don't think this exists elsewhere, but it's not worth a proper code release, so here's the whole thing in case someone finds it useful:

pygtex

#! /usr/bin/env bash

## Write a .sty file defining the commands used in each Verbatim code block (bit hacky)
echo "" | pygmentize -l python -f latex -P full=True | head -n -10 | grep -E -v "documentclass|inputenc" > pygtex.sty

## Make a Verbatim code block for each input code file, transforming foo.ext to foo-ext.tex
for inname in "$@"; do
    ## Quote the names so that input files with spaces in them survive
    outname=$(echo "$inname" | sed -e 's/[\ \.]/-/g').tex
    pygmentize -f latex -P verboptions='frame=leftline,framesep=1.5ex,framerule=0.8pt,fontsize=\smaller' "$inname" > "$outname"
done

I called it pygtex; you can call it whatever you like. It can be called like pygtex *snippet.py (if you've made code snippets with that name pattern) and will write out a pygtex.sty file, and a .tex file for each snippet. Then include them in your doc like this:

\usepackage{pygtex}
...
\input{foo-snippet-py}
...
\input{bar-snippet-py}
...
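The only slightly non-obvious bit is how a snippet's filename maps to the `\input` argument. As a rough illustration, here's what the script's `sed` rule does, written in Python and assuming filenames containing nothing more exotic than spaces and dots (`tex_name` is a made-up name for this sketch, not part of pygtex):

```python
import re

def tex_name(inname):
    """Mimic the script's naming rule: spaces and dots become dashes, then add .tex."""
    return re.sub(r"[ .]", "-", inname) + ".tex"

print(tex_name("foo-snippet.py"))  # prints foo-snippet-py.tex
```

LaTeX adds the `.tex` extension itself, which is why the `\input` lines above leave it off.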

Enjoy.

38 Degrees and neonics

I've been very disappointed to see 38 Degrees, a people-power campaigning organisation whose petitions I've often signed, going down the data-blind anti-corporate route that blights the likes of Avaaz. Straying from their typical social justice agendas, 38 Degrees have decided to direct their ire at the government for considering a repeal of the EU-wide ban on neonicotinoid pesticides that's been in effect in Europe for the last year.

Read more…

Ahoy

Sorry for the 6 months that this site has been offline -- in particular for anyone who's been trying to read the night-climbers transcriptions. My RadiantCMS server refused to restart after an Ubuntu server upgrade, and because Radiant and Rails are a steaming pile of crap when it comes to package management and code quality, a good 10 or so hours of configuration fighting failed to resuscitate it. At which point I lost the will to live.

Read more…

A tense exchange

Time for another lazy excerpt from private correspondence! This time we visit that most viscerally thrilling and scientifically crucial of subjects: what tense(s) to use in your scientific paper. Daring, I know! But surprisingly controversial, and I'm motivated to write it after reading and reviewing umpteen notes, drafts, and published papers in which the tenses seem (to me) perverse. In particular I think there's a need to write such a thing after being told by one physicist "I think there's a convention in science writing that we always use present tense". Piffle!

Read more…

It just works... or does it? The dark side of Macs in HEP

If you attend a particle physics meeting these days (and most of us do, several times a day... this is not a good thing) it looks rather different to how it did 10+ years ago. Not that everyone paid attention then, but the type of laptop everyone's focusing on rather than the speaker has shifted, from the olden times array of various clunky black boxes to the situation now where 2/3 of the room seem to be wielding shiny silver Macbooks.

It seems like a no-brainer: Windows is pretty much 100% dysfunctional for computing-heavy science (unless you are either in a fully management role and never touch data, or for some reason love doing all your work through a virtual machine), but Linux is unfamiliar territory for most starting PhD students. Sure, it's a lot more user friendly than it used to be, with more helpful GUI ways to manage the system and the wifi even works out of the box most of the time. But Macs are perfect: beautifully designed, friendly, but with Unix underneath ... and they only cost an extra 50%! Ideal for HEP users who need Unix computing but want it to just work out of the box... and who doesn't? As the Apple advertising used to say "It just works". But does it?

Read more…

On unfolding

A while ago I was included in a discussion between an ATLAS experimentalist who had been told that some "unfolding" was needed in their analysis, and a theorist who had previously been a strong advocate of avoiding dangerous "unfoldings" of data. So it seemed that there was a conflict between the experimentalist position on what counts as good data processing and the view from the theory/MC generator community (or at least the portion of it who care about getting the details correct). In fact the real issue was just one of nomenclature: the u-word having been used to represent both a good and a bad thing. So here are my two cents on this issue, since they seemed to help in that case. First, what the experimentalist was referring to as "unfolding" was almost certainly the "ok" kind: unfolding to hadrons, photons and leptons with lifetimes of at least ctau0 = 10 mm.

Read more…

Top mass measurements and MC definitions -- an inexpert precis

I was just recently notified that the world top mass combination uses "my" MCnet review paper on MC generators to justify stating that the definition of the top quark mass used in all (!) event generators is equivalent to the "pole" mass.

I've heard that statement very often, but not backed up by anything more concrete, so I was interested to read this section of the paper (Appendix C, starting on p184 of the PDF), which turns out to be rather good, interesting, and elegantly presented. Not to mention slightly embarrassing that I hadn't read it before, given that it has my name on the front! (In my defence, I did write some of this paper, just not that bit. I suspect most of the authors haven't read everything in it.)

Anyway, it definitely does not say that MC mass equals pole mass, so I thought it might be interesting to post my explanation of what it does say, at least as far as a dumb fence-sitting experimentalist/MC guy like myself can understand...

Read more…