Skip to content

Python Dos and Don'ts

Hannes Schmidt edited this page Oct 24, 2018 · 15 revisions

Don't piggy-back resource deallocation

For things that require closing/shutdown/cleanup, slavishly follow this schema:

thing = Thing()
try:
    thing.doIt()
finally:
    thing.shutdown()

If you have two things, nest this structure:

thing1 = Thing1()
try:
    thing2 = Thing2()
    try:
        thing1.doIt()
        thing2.doIt()
    finally:
        thing2.shutdown()
finally:
    thing1.shutdown()

Do not piggy-back, the following is broken:

thing1 = Thing1()
thing2 = Thing2()
try:
    thing1.doIt()
    thing2.doIt()
finally:
    thing2.shutdown()
    thing1.shutdown()

If thing2 fails (raises an exception) to initialize or shutdown, thing1 isn't shutdown. That may be what you want but if it is, you need to document the intent as this is an uncommon scenario.

If the nesting annoys you (as it should), make Thing1 and Thing2 context managers and use them like this:

with Thing1() as thing1:
    with Thing2() as thing2:
        thing1.doIt()
        thing2.doIt()

Don't use format() or % to format log messages

The logger's debug(), info(), warn(), etc. methods are all variadic, and the extra arguments will be interpolated into the log message. IOW,

logger.info('Hello, %s' % world)

is equivalent to

logger.info('Hello, %s', world)

just less efficient if the log level set above info. String interpolation isn't exactly cheap so its desirable to only do it if the log message is actually delivered.

Don't document the obvious

Document the non-obvious. Don't document the how, do document the why. Recently came across this:

# toil imports
from toil.job import Job

# toil scripts imports
from toil_scripts.batch_alignment.bwa_alignment import docker_call

Don't use ASCII art for decorative purposes

#############################
# Look at me, look at me!!! #
#############################

When I say that writing good software is an art, I don't mean that.

Don't waste pixels, …

… use single-quoted string literals by default. Use double-quoted string literals if the string value contains single quotes.

Avoid backslash for line continuations

Python can infer a line continuation if it occurs within a balanced construct like parentheses, braces and brackets. I sometimes even introduce an otherwise redundant pair of parens if that lets me avoid backslashes. So don't

log( "Bla, bla, bla %s %s %s", foo, \
     bar, \
     ding )

but do

log( "Bla, bla, bla %s %s %s", foo,
     bar,
     ding )

And don't

from germany import bmw, \
    mercedes, \
    audi

but do

from germany import (bmw,
    mercedes, 
    audi)

When iterating a dictionary d, …

… don't

for k in d:
    stuff(k, d[k])

but do

for k,v in k.iteritems():
   stuff(k,v)

I think it is faster.

Don't call threading.Queue.task_done()

… unless you intend to call join() on that queue. Note that threading.Queue.join() is different from threading.Thread.join().

Don't use threading.Event to signal boolean conditions between threads

Use threading.Event only if you need to efficiently block on the event to occur. Otherwise use a simple bool variable. The GIL takes care of the rest.

Minimize the body of try:, maximize the specificity of except:

Don't

try:
   d2 = d1[k1]
   x = d2[k2]
except:
   log.warning('…')

but do

try:
   d2 = d1[k1]
except KeyError:
   log.warning('…')
else:
   x = d2[k2]

Here the invariant is that if the first lookup succeeds so must the second. The first version does not enforce that invariant. The second version does. The first version would swallow a violation of the invariant because the body incorporates both look-ups. It also swallows other potential issues because the except clause is too broad.