michielovertoom.com

Ugly Python • 12 Oct 2010

dilapidated flat Last night, while perusing Esoteric Programming Languages, I came across an implementation of 'dropsort' (a lossy sort method) written in Python. When I looked at the source code, I found it ugly and non-idiomatic.

I'll discuss the issues I have with it below.


Multiple; statements; on one line;

I saw a lot of lines like this:

lena = len(amy); lenb = len(ben)

Multiple statements per line are generally discouraged in Python. I think it would be better to write:

lena = len(amy)
lenb = len(ben)

You could write this when you're really feeling adventurous and on a trip to the nearest star:

lena, lenb = map(len, [amy, ben]) # thanks to ssbr

There's also code like this:

self.comparisons = 0; self.copies = 0; self.merges = 0

When initializing multiple variables to 0, I find it reasonable to put them all on one line. But in such a situation the following construct is seen more often:

self.comparisons = self.copies = self.merges = 0

(Too (many parentheses)) (no (this (is (not that language))))

The author probably comes from a Java/C/C++ background because he uses parenteses in if statements where they are not needed:

if(len(lst)<2): return lst

Note that there is also no space after the if statement and around the operator (PEP8). This would be the idomatic version:

if len(lst) < 2:
    return lst

The same style ailment can be seen in his while statements:

while(a < lena and b < lenb):
    ...

The underscore variable

A variable named '_' is bound to the string "bookkeeping lines" and peppered throughout the program as a device to shift instrumentation code to the right margin, out of the algorithmic code, in lines like:

    _        ;stats.copies += 1

which will be interpreted as:

    "bookkeeping lines"        ;stats.copies += 1

'_' is a valid variable name in Python and has no special meaning. But it has an idiomatic use: it serves as a junk placeholder in unpacking.

Suppose you have a tuple like confection = ('crunchy', 'chocolate', 'frog') and you want to unpack that to three variables, and you want to signal the reader that you're only interested in the taste and animal, you could write:

_, taste, animal = confection

Similarily, if you're only interested in the taste, you could write:

_, taste, _ = confection

The misuse of the '_' variable in the original source has to be discouraged, and replaced with normal Python statements like stats.merges += 1 or stats.merge().

Funny variable names

Variable names should be meaningful, to enhance understanding of the code. In the merge() function some very strange naming is used:

def merge(amy, ben, stats):
    Carrington = []; a = 0;   b = 0    #init
    lena = len(amy);   lenb = len(ben)

I think it's OK to call the two source lists a and b (instead of amy and ben), but 'Carrington' as the result list is just confusing. Why not call it 'merged' or 'result'?

And oh, class names are usually capitalized, and classes should be new-style nowadays, and inherit from object. The old code is:

class state:
    ...

Modern code would be:

class State(object):
    ...

Some cleaning up

The code given by the author does not include tests or some example main code to run. I added that in this version.

This corrected version has been cleaned up, and validates in the PEP8 Python style guide checker.

Comments

peter • 19 Dec 2010

Where do I find the original source which you are critisizing?

J-K • 1 Apr 2011

Contrast that with http://blog.gordaen.com/2007/12/19/ugly-python-code/

Jeremy • 31 Jul 2015

I'm the original author, just found this post. Good to know my early python forays haven't gone unnoticed! Original code: http://ibiwan.com/programming/allsorts/in_python/dropsort.py

Leave a comment

name (required)



content last edited on May 31, 2011, 12:22 - rendered in 2.32 msec