Python indentation considered boneheaded

Andy Buckley

I've been using Python for maybe 4 or 5 years now. On the whole, the experience has been very positive: big pluses include the excellent (although rather stylistically disjoint) standard library; built-in collection types and list comprehensions; the experience, at least, of finding that duck typing actually "sort of works"; and the clean syntax. However, the "elegant" indentation-based scoping for which it's so famous is, all told, a very bad idea, regardless of what die-hard Pythonistas may tell you.

Let's start with a consideration of syntax, the most visceral and immediate feature of a language: you know right from the start whether or not you like the feel of it. User interfaces appear secretly everywhere, from the symbols chosen to represent particular quantities in algebra to the syntactic sugar aspects of a programming language. Some experienced coders, especially those who know several languages well, may dismiss syntax-worrying as pointless and superficial, but I'm not so sure: a good syntax not only makes it fast to code up common tasks, but also emphasises the structure of an algorithm at a glance and is based on a few consistently applied core concepts. This is a lot more than lily gilding. Strictly, I can do anything in any Turing complete language, so the whole point of a good language is that it makes code for its target tasks readable, elegant and extensible: syntax plays a major role here. Python does well from this point of view: you can see right away that this is a language which doesn't render your own code unreadable when you go for a whole week without reading it. Contrast Lisp, Perl or PHP: all serious languages suffering from serious syntactic defects (okay, so the most serious is the least heavily used... but that's got a lot to do with it having the worst syntax).

The one feature that everyone notices about Python is the indentation thing: scoping is denoted by indentation rather than braces or other explicit constructs. It's a feature that has dissuaded many potential users, who just think it's a bit too weird, despite the reassurances of the official tutorial. Well, I bit the bullet a while back and bought into the indentation thing for a few years. It was okay... actually, it was a non-issue: the indentation scoping seemed to work, provided your editor gave you a bit of help. However, recent experiences have convinced me that my gut reaction was right and that some sort of explicit block closure is required.

Here's my conclusion:

* invisible markup is an accident waiting to happen;
* scope structure should be unambiguous.

The first point is obvious in retrospect: one of the longest-running and most pointless debates in programming is the "spaces vs. tabs" indentation war. It shouldn't matter, yet everyone has an opinion on it, and despises the alternative. Personally, I'm a spaces-only kind of guy, and yes, I'm slightly militant about it as all good religious fanatics should be. The point is not really that one or the other is right, but that providing the opportunity to get confused between two kinds of invisible and barely distinguishable tokens is going to cause trouble some day. In most languages, this is just a matter of aesthetics, and there are code formatters that will happily turn someone else's convention into yours and back if you're really that bothered.

Amazingly, Python does nothing in its syntax definition to sidestep the tabs vs. spaces war. It explicitly says that you can have both, mix them however you like, even change your definition of how much indentation each scope region needs, but make sure you follow the ensuing mess of rules. It's very cunning, and like most cunning things in computing, we'd be much better off without it. If a language designer can say things like "here is an example of a correctly (though confusingly) indented piece of Python code", there should be alarm bells going off in their head. I'm told that this also makes the definition of Python's EBNF grammar pretty grim, and I can believe that (though personally I can't even find reference to tabs, newlines and spaces in the official grammar document). If more than one person edits the same Python code, each with different editor settings for tab/space indentation, a mix of spaces and tabs is pretty much guaranteed. In any other language, this doesn't affect the behaviour of the code: in Python the invisible markup can completely change the logic. Oops.
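To make the invisible-markup problem concrete, here is a minimal sketch of a check for mixed indentation. The helper name `indentation_kinds` and the sample `snippet` are my own inventions for illustration, not anything from the standard library. One caveat: the rules described above are those of Python 2, where a tab advances to the next multiple of eight columns. Python 3 is stricter and refuses to compile source whose meaning depends on the tab width, which is what the second half of the sketch shows.

```python
def indentation_kinds(source):
    """Report which indentation styles appear in the source text.

    Hypothetical helper: returns a set drawn from
    {'spaces', 'tabs', 'mixed'}, where 'mixed' means a single line
    whose leading whitespace contains both characters.
    """
    kinds = set()
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # blank lines carry no indentation information
        leading = line[: len(line) - len(stripped)]
        if not leading:
            continue  # top-level line, nothing to classify
        if "\t" in leading and " " in leading:
            kinds.add("mixed")
        elif "\t" in leading:
            kinds.add("tabs")
        else:
            kinds.add("spaces")
    return kinds


# Two lines that look identical in an editor with 8-column tabs:
# the first is indented with eight spaces, the second with one tab.
snippet = "if True:\n        x = 1\n\ty = 2\n"

print(sorted(indentation_kinds(snippet)))  # ['spaces', 'tabs']

try:
    compile(snippet, "<demo>", "exec")
except TabError as exc:
    # Python 3 rejects indentation that is only consistent for one tab width.
    print("rejected:", exc)
```

A check like this only tells you that the file is a mess; it cannot tell you which block each line was meant to belong to.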

Second, since your blocks don't explicitly end, it's generally impossible to apply automatic reformatting of Python source to fix indentation screw-ups.

As a demonstration, how would you correct the indentation here (assume I'm just using spaces, so this is a relatively simple problem --- I'll improve this later!):

    def myfunction(foo, bar):
        foo.boing()
       for i in bar.fizzle(foo):
          baz = i**2
         foo.wibble(baz)
        return foo, baz

If you imagine being a simple state machine, walking through this code, when you get to the foo.wibble(baz) line, have you or haven't you dropped out of the scope of the for loop? The difference is obviously significant, but you can't tell what was intended. Now, this is a pretty piss-poor contrived example, but I experienced this sort of thing for real recently and it was entirely because of working collaboratively via a version control system with someone whose editor liked to use tabs and 3-space indents together --- what the code would actually do was anyone's guess. In such a situation, it doesn't matter what Python's cunning rules say --- the correct answer could only be derived by reconsidering the semantics of the function and "doing what makes it give the right answer". Essentially, you have to recode the function, or your whole application, line by line, just by adjusting the indentation.
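To show that the two readings really do differ, here is a small runnable sketch with both candidate "fixes" side by side. The `Foo` and `Bar` classes are hypothetical stand-ins I've made up so the example can run; only the method names come from the snippet above.

```python
class Foo:
    """Hypothetical stand-in: records every value passed to wibble()."""

    def __init__(self):
        self.wibbled = []

    def boing(self):
        pass

    def wibble(self, value):
        self.wibbled.append(value)


class Bar:
    """Hypothetical stand-in: fizzle() just hands back a few integers."""

    def fizzle(self, foo):
        return [1, 2, 3]


def inside_loop(foo, bar):
    # Reading 1: wibble() belongs to the for loop.
    foo.boing()
    for i in bar.fizzle(foo):
        baz = i**2
        foo.wibble(baz)
    return foo, baz


def after_loop(foo, bar):
    # Reading 2: wibble() runs once, after the loop has finished.
    foo.boing()
    for i in bar.fizzle(foo):
        baz = i**2
    foo.wibble(baz)
    return foo, baz


foo_a, _ = inside_loop(Foo(), Bar())
foo_b, _ = after_loop(Foo(), Bar())
print(foo_a.wibbled)  # [1, 4, 9]
print(foo_b.wibbled)  # [9]
```

Both versions are perfectly valid Python, and nothing in the damaged original says which one the author meant.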

This is pretty idiotic, so why is it such a stubbornly established language feature? And it is stubborn:

    andy@parity:~$ python
    Python 2.5.1 (r251:54863, May  2 2007, 16:56:35)
    [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from __future__ import braces
      File "<stdin>", line 1
    SyntaxError: not a chance
    >>>

Hmm. What's so wrong with encouraging good indentation, but actually delimiting block scopes explicitly? Maybe just that it's Python's "thing" --- but the cleanness of the rest of the syntax is Python's "thing" for me: the indentation is Python's "boneheaded, annoying thing". Consider this:

    def myfunction(foo):
        for i in range(10):
            foo.process(i)
        endfor
        return foo.result()
    enddef

Is that really so bad, Pythonistas? Really, this makes me want to find something else. Unfortunately Ruby, which has lots of lovely features, does explicitly close blocks and is very like Python in lots of ways, seems to belong to the Perl school of @cryptic :${modifie#rs}. Sigh.
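For what it's worth, explicit closers are exactly what would make automatic reformatting possible again. Below is a rough sketch of a re-indenter for the imaginary endfor/enddef dialect shown above. Everything in it is an assumption of mine: the `reindent` function, the `CLOSERS` set, and the simplification that any line ending in a colon opens a block (so it ignores else/elif, comments, multi-line expressions and so on). It throws away whatever whitespace it is given, rebuilds the nesting purely from the openers and closers, and drops the closers so that the result is ordinary Python.

```python
# Hypothetical block closers for the imaginary dialect sketched above.
CLOSERS = {"enddef", "endfor", "endwhile", "endif", "endclass"}


def reindent(source, width=4):
    """Rebuild indentation from explicit block closers.

    Simplifying assumptions: a line ending in ':' opens a block, and a
    line consisting solely of a closer keyword ends the innermost block.
    Existing leading whitespace is ignored completely.
    """
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines to keep the sketch short
        if line in CLOSERS:
            depth -= 1  # leave the innermost block; the closer is dropped
            continue
        out.append(" " * (width * depth) + line)
        if line.endswith(":"):
            depth += 1  # the following lines belong to a new block
    return "\n".join(out)


# The earlier example, with its indentation thoroughly scrambled.
messy = """
def myfunction(foo):
          for i in range(10):
   foo.process(i)
      endfor
 return foo.result()
   enddef
"""

print(reindent(messy))
```

However badly the whitespace is mangled, the closers pin down the structure, so the output is the same every time. No such tool can exist for plain Python, because the whitespace is the only record of the structure.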
