Ugly Python • 12 Oct 2010
Last night, while perusing Esoteric Programming Languages, I came across an implementation of 'dropsort' (a lossy sort method) written in Python. When I looked at the source code, I found it ugly and non-idiomatic.
I'll discuss the issues I have with it below.
Multiple; statements; on one line;
I saw a lot of lines like this:
lena = len(amy); lenb = len(ben)
Multiple statements per line are generally discouraged in Python. I think it would be better to write:
lena = len(amy) lenb = len(ben)
You could write this when you're really feeling adventurous and on a trip to the nearest star:
lena, lenb = map(len, [amy, ben]) # thanks to ssbr
There's also code like this:
self.comparisons = 0; self.copies = 0; self.merges = 0
When initializing multiple variables to 0, I find it reasonable to put them all on one line. But in such a situation the following construct is seen more often:
self.comparisons = self.copies = self.merges = 0
(Too (many parentheses)) (no (this (is (not that language))))
The author probably comes from a Java/C/C++ background because he uses parenteses in if statements where they are not needed:
if(len(lst)<2): return lst
Note that there is also no space after the if statement and around the operator (PEP8). This would be the idomatic version:
if len(lst) < 2: return lst
The same style ailment can be seen in his while statements:
while(a < lena and b < lenb): ...
The underscore variable
A variable named '_' is bound to the string "bookkeeping lines" and peppered throughout the program as a device to shift instrumentation code to the right margin, out of the algorithmic code, in lines like:
_ ;stats.copies += 1
which will be interpreted as:
"bookkeeping lines" ;stats.copies += 1
'_' is a valid variable name in Python and has no special meaning. But it has an idiomatic use: it serves as a junk placeholder in unpacking.
Suppose you have a tuple like confection = ('crunchy', 'chocolate', 'frog') and you want to unpack that to three variables, and you want to signal the reader that you're only interested in the taste and animal, you could write:
_, taste, animal = confection
Similarily, if you're only interested in the taste, you could write:
_, taste, _ = confection
The misuse of the '_' variable in the original source has to be discouraged, and replaced with normal Python statements like stats.merges += 1 or stats.merge().
Funny variable names
Variable names should be meaningful, to enhance understanding of the code. In the merge() function some very strange naming is used:
def merge(amy, ben, stats): Carrington = ; a = 0; b = 0 #init lena = len(amy); lenb = len(ben)
I think it's OK to call the two source lists a and b (instead of amy and ben), but 'Carrington' as the result list is just confusing. Why not call it 'merged' or 'result'?
And oh, class names are usually capitalized, and classes should be new-style nowadays, and inherit from object. The old code is:
class state: ...
Modern code would be:
class State(object): ...
Some cleaning up
The code given by the author does not include tests or some example main code to run. I added that in this version.