Writing Clean Code

Writing clean, readable code is something we should all strive for. If you’re just learning to program, you may not be paying attention to how clean your code is, but after a while when you start working with other people with a shared code base you may begin to recognize the importance of writing clean code. Code is read a lot more than it is written and sloppy code just depresses and frustrates people. Being sloppy up front when you are under pressure actually slows you down in the long term. You will end up creating more bugs which leads to more maintenance.

In this section, I’ll give brief pointers on how to make your code cleaner and easier to read. For more in-depth discussions on writing clean code, I recommend the excellent book, Clean Code: A Handbook of Agile Software Craftsmanship, by Robert C Martin, and the Pluralsight course, Clean Code: Writing Code for Humans by Cory House.

The DRY Principle

DRY stands for Don’t Repeat Yourself. You should state a piece of logic once and only once. If you find yourself copying and pasting chunks of code multiple times, that should be a signal that you are repeating yourself. Repeating code just leads to more code to maintain and debug. For example, the following code has repeated code:

sphere = create_poly_sphere(name='left_eye')
assign_shader(sphere, 'blinn1')
parent_constrain(head, sphere)

sphere = create_poly_sphere(name='right_eye')
assign_shader(sphere, 'blinn2')
parent_constrain(head, sphere)

This code should be written like the following:

def create_eye(name, shader):
    sphere = create_poly_sphere(name=name)
    assign_shader(sphere, shader)
    parent_constrain(head, sphere)

create_eye('left_eye', 'blinn1')
create_eye('right_eye', 'blinn2')

The second code example is easier to maintain. If an update needs to be implemented, we only have to update code in one place rather than multiple places.

Use Clean Names

The names of your classes, variables, and functions contribute greatly to how readable and clean your code is. Take this code snippet from an actual VFX studio pipeline tool:

curr= os.environ.get('CURRENT_CONTEXT')
if curr:
    cl= curr.split('/')
self.__curr= [None] * 6
self.setType( cl[0] )
self.setSeq( cl[1] )
if len( cl ) > 3:
    self.setSubseq( cl[2] )
    self.setShot( '/'.join( cl[2:] ) )
else:
    self.setShot( cl[-1] )

if wa: self.__wa= wa
else: self.__wa= os.environ.get('CURRENT_WORKAREA')

Can you tell what this code is doing? If you’re familiar with writing pipeline environment tools, you might recognize what it is trying to do. What is self.__curr? Why is it a list of 6 values? What do the elements of cl repreprent? Why does the length of cl being greater than 3 differentiate one block of the if statement from the other?

A cleaner implementation would look something like this:

context_type, sequence, subsequence, shot = self.get_current_context()
self.set_type(context_type)
self.set_sequence(sequence)
if subsequence:
    self.set_subsequence(subsequence)
if shot:
    self.set_shot(shot)
self.__work_area = work_area if work_area else self.get_current_work_area()

The above code cleans up all the list indices and string manipulations to make the code easier to read and understand. The individual list elements have been assigned meaningful names. Also the environment variable accesses have been extracted away into new methods. This makes the code easier to maintain. What happens if we want to change the name of the environment variables or maybe we want to read the values from a configuration file on disk? Extracting those values to functions would let us update the code in one place rather than multiple direct accesses to the environment variable.

Naming Classes

Class names should be a noun because they represent objects. The name should be as specific as possible. If the name cannot be specific, it may be a sign that the class needs to be split into smaller classes. Classes should have a single responsibility.

Bad class names include:

  • ShapeIE
  • Utility
  • Common
  • MyFunctions
  • DansUtils
  • ShapeClass

Good class names include:

  • ShapeExporter
  • RigPublisher
  • Project
  • User

Naming Methods

Method names should be verbs because they perform actions. There should be no need to read the contents of a method if the name accurately describes what the method does. If the function is doing one thing (as it should) it should be easy to name. If not, split the code into smaller functions. Sometimes explaining the code out loud and help you name the function. If you say “And”, “If”, or “Or” it is a warning sign that the method is doing too much.

Bad method names include:

  • proc_new
  • pending
  • process1
  • process2

Good method names include:

  • create_process
  • is_pending
  • send_notification
  • import_mesh
  • calculate_rivet_matrix

Methods should only perform the actions described by the name. Any other actions are called side effects and they can confuse people using your code. For example, a method called validate_form should not save the form. A method called publish_model should not smooth the normals. A method called prune_weights should not remove unused influences.

Avoid Abbreviations

Abbreviated text may be easier to type, but code is read more than it is written. When people talk about code or read it silently, it is harder to say the abbreviations. There are also no standards when referring to abbreviations.

Bad names:

  • sjData
  • jid
  • sjid
  • nm
  • sjState

Good names:

  • subjob_data
  • job_id
  • subjob_id
  • name
  • subjob_state

Naming Booleans

Boolean values should be able to fit in an actual sentence of saying something is True or False.

Bad boolean names:

  • open
  • status
  • login
Good Boolean names:
  • is_open
  • logged_in
  • is_valid
  • enabled
  • done

Symmetrical Names

When names have a corresponding opposite, be consistent and always use the same opposite.

Bad naming

  • on/disabled
  • quick/slow
  • lock/open
  • low/max

Good naming

  • on/off
  • fast/slow
  • lock/unlock
  • min/max

Working with Booleans

When comparing Booleans, compare them implicitly:

# Don't do this
if (is_valid == True):
    # do something

# Instead do this
if is_valid:
    # do something

When assigning booleans, assign them implicitly:

# Don't do this
if len(items) == 0:
    remove_entry = True
else:
    remove_entry = False

# Instead do this
remove_entry = len(items) == 0

Avoid using booleans that represent negative values. This leads to double negatives, which end up confusing people:

# Don't do this
if not not_valid:
    pass

# Instead do this
if valid:
    pass

Use Ternaries

Ternaries are ways of assigning a value to a variable depending on if some condition is True or False. For example:

# Don't do this
if height > height_threshold:
    category = 'giant'
else:
    category = 'hobbit'

# Instead do this
category = 'giant' if height > height_threshold else 'hobbit'

Don’t Use String as Types

You may have encountered code similar to the following:

if component_type == 'arm':
    # do something
elif component_type == 'leg':
    # do something else

This is considered bad form for various reasons. If we end up wanting to change the value of one of these types, we have to change it in all the places that it is used. It can also lead to typos and inconsistencies. A better approach would be:

class ComponentType(object):
    arm = 'arm'
    leg = 'leg'

if component_type == ComponentType.arm:
    # do something
elif component_type == ComponentType.leg:
    # do something else

The new code provides one place to change and update values (the DRY principle). It also provides auto-completion support and is more searchable if you are using an IDE like PyCharm or Eclipse.

Don’t Use Magic Numbers

Magic numbers are numeric values that seemed to have been pulled out of nowhere. For example, the following code was pulled from an actual VFX pipeline tool:

if run_mode < 3:
    run_mode = 5
elif run_mode == 3:
    run_mode = 4

What do these numbers mean? You would need to search all over code that could span multiple files to figure out what these numbers represent. A better approach would be:

class JobStatus(object):
    waiting = 1
    starting = 2
    running = 3
    aborting = 4
    done = 5

    def __init__(self, value=JobStatus.waiting):
        self.status = value

    def not_yet_running(self):
        return self.status < JobStatus.running

    def abort(self):
        if self.not_yet_running():
            self.status = JobStatus.done
        elif self.status == JobStatus.running:
            self.status = JobStatus.aborting

# job_status is the new run_mode
job_status.abort()

Encapsulate Complex Conditionals

Sometimes you may have conditionals with many comparisons chained together. At some point, this is going to get hard to read. For example:

# Instead of this
if (obj.component.partial_path.startswith('model') and
    namespace == ‘GEOM’ and
    has_rigging_publish(obj.child) and
    edits_path):

# Encapsulate the complex conditional in a function or variable
def is_model_only_publish(obj):
    return (obj.component.partial_path.startswith('model') and
            namespace == 'GEOM' and
            has_rigging_publish(obj.child) and
            edits_path)

if is_model_only_publish(obj):

Writing Clean Functions

Functions should be created in order to help convey intent. They should do one thing and one thing only as this aids the reader, promotes reuse, eases testing, and avoids side effects. Strive for a function to only have 0-3 parameters with a max of 7-9 parameters. Anything longer makes it harder for readers to keep track of all the parameters while running through the code in their head. Functions should be relatively short, maybe no more than 100 or so lines. If a function is longer, it may be time to refactor (update) the code into smaller functions.

Extracting a Method

If you find your code 3 or 4 indentation levels deep, it may be time to extract some of that code into a separate function. For example:

# Instead of this
if something:
    if something_else:
        while some_condition:
            # do something complicated

# Do this instead
if something:
    if something_else:
        do_complicated_things()

def do_complicated_things():
    while some_condition:
        # do something complicated

Return Early

People can usually only keep track of a handful of trains of thought at a time. Therefore, we should try to organize our code in as many discrete independent chunks as possible. For example:

# Instead of this
def validate_mesh(mesh):
    result = False
    if has_uniform_scale(mesh):
        if has_soft_normal(mesh):
            if name_is_alphanumeric(mesh):
                result = name_is_unique(mesh)
    return result

# Do this
def validate_mesh(mesh):
    if not has_uniform_scale(mesh):
        return False
    if not has_normal(mesh):
        return False
    if not name_is_alphanumeric(mesh):
        return False
    return name_is_unique(mesh)

This is not a strict rule. Like everything listed so far, use it when it enhances readability.

Signs Your Function is Too Long

Functions should hardly ever be over 100 lines. Longer functions are harder to test, debug, and maintain since users need to keep track if updates at the start of the function affect areas and the end of the function. Here are some simple rules to determine if a function is too long:

  • You separate sections of code in a function with whitespace and/or comments
  • Scrolling is required to view all the code.
  • The function is hard to name.
  • There are conditionals several levels deep.
  • There are more than 7 parameters or variables in scope at a time.

Writing Clean Classes

Classes are like headings in a book, there should be multiple layers of abstraction going from high level ideas to more detailed lower level ideas:

  • Chapter
    • Heading 1
      • Paragraph 1
      • Paragraph 2
    • Heading 2
      • Paragraph 1
  • Module
    • Class 1
      • Method 1
      • Method 2
    • Class 2
      • Method 1

High Cohesion

Cohesion is the fact of forming a united whole. When a class is said to have high cohesion, all of its functionality is closely related. We should strive to create classes with high cohesion. High cohesion not only enhances readability; it also increases the likelihood of reusing the class. Signs that a class does not have high cohesion are:

  • The class has methods that don't interact with the rest of the class.
  • The class has fields only used by one method.
  • The class changes often.

For example:

# Low cohesion class
class Vehicle(object):

    def edit_options():
        pass

    def update_pricing():
        pass

    def schedule_maintenance():
        pass

    def send_maintenance_reminder():
        pass

    def select_financing():
        pass

    def calculate_monthly_payment():
        pass

The Vehicle class contains many unrelated methods. This makes it harder to use and maintain because parts of unrelated code are intertwined together. A better approach would be to split this class up into smaller classes:

# High cohesion classes
class Vehicle(object):
    def __init__(self)

    def edit_options():
        pass

    def update_pricing():
        pass

class VehicleMaintainer(object):
    def schedule_maintenance():
        pass

    def send_maintenance_reminder():
        pass

class VehicleFinancer(object)
    def select_financing():
        pass

    def calculate_monthly_payment():
        pass

Method Proximity

Code should read top to bottom and related methods should be kept together:

def add_take():
    if not validate_take():  # First method referenced should be directly below
        raise ValueError('Take is not valid')
    save_take()  # Second method referenced should be below first

def validate_take():
    return take.endswith(‘.mov’)

def save_take():
    # save in database

Collapsed code should read like an outline. Strive for multiple layers of abstraction:

  • Class
    • Method 1
      • Method 1a
        • Method 1ai
        • Method 1aii
      • Method 1b
      • Method 1c
    • Method 2
      • Method 2a

Comments

Comments should only be used to explain ideas and assumptions not already apparent by reading the code.

Redundant Comments

The comments in this code do not add anything the user could not have figured out by reading the code.

# Clear the node combo box then add items
self.node_combobox.clear()
if nodes:
    # Sort the nodes
    nodes.sort()
    # Check to see if there is a shape controller associated with the node
    self.find_shape_controllers(nodes)

    # Now add the list of nodes to the combo box
    self.node_combobox.addItems(nodes)

    # If a node is specified set the combo box
    if node:
        # Find the node's index
        index = self.node_combobox.findText(
            node,
            QtCore.Qt.MatchExactly | QtCore.Qt.MatchCaseSensitive)
        self.node_combobox.setCurrentIndex(index)

Divider Comments

If you see divider comments, it's a sign you need to extract the code into its own function:

# Now create the new group object and insert it into the table
# ----------------------------------------------------------------------------------
# Create the group object
group = slidergroup.SliderGroup(name)
self._slider_groups[name] = group
# Tell the group what its start row is
group.setRow(row)        
# Apply color
if color:
    group.setColor(color)        

# Generate sliders from the attributes attached to the group
# ----------------------------------------------------------------------------------
row_index = row + 1
sliders_to_add = []
for attr in attributes:
    if cmds.objExists(attr):
        slider = self.add_slider(attr, rowIndex, group)
        row_index += 1

Zombie Comments

Zombie comments are large sections of commented out code. People often do this because they think they might need the code in the future. This is unnecessary because version control systems like git, svn, and perforce perform this exact functionality. People looking at code with large commented out portions will be confused. Why is the code commented out? Is it important?

Clean Code Conclusion

In this section, I covered a quick overview on writing clean code. There are many other great resources that go in a lot more detail of writing clean code including explanations why it is considered clean code. As I mentioned at the start of the section, for more in-depth discussions on writing clean code, refer to Clean Code: A Handbook of Agile Software Craftsmanship, by Robert C Martin, and the Pluralsight course, Clean Code: Writing Code for Humans by Cory House.

3 thoughts on “Writing Clean Code”

K March 6, 2016 at 2:06 am

Thanks for the interesting article.
Inspired by this I wrote a list of to-dos for a readable code.
http://koichitamura.blogspot.ca/2016/03/readable-code.html
I hope more and more people pay attention to code sanity.

Chad March 6, 2016 at 10:43 am

Cool! Since the target audience for this page is those newer to programming, you might want to give some examples with your points or else people not might know what you are talking about.

Leave A Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

<code> For inline code.
[sourcecode] For multiline code.