Doctests, Huh?

By Brian Sutherland

Article copied from: http://www.pyzine.com/Issue008/Section_Articles/article_Doctests.html

For good code to be useful, it requires two additional things: good documentation and good unit testing. Doctests allow the programmer to do both at the same time with the additional benefits that the documentation is tested and the prose between the tests documents the tests themselves. There are few things worse than undocumented unit tests which over time grow contorted by special cases until they are as understandable as black magic. Forming a link between code, documentation and testing is the greatest advantage of doctests.

Normally in a project there are three things you want to test using doctests:

  • Examples in the docstrings of functions, classes and modules showing their use and testing major functionality.
  • Documentation to be sure code examples are correct.
  • Regression testing to make sure past bugs don’t re-appear (Regression tests can become very obscure, so normally these should be separated from the rest to avoid over-complicating the documentation).

Each of these three has a different purpose, and in a well-structured project they should be kept separate so that each can serve its purpose well. For example, including regression tests in the docstrings of a module will, over time, over-complicate the module’s docstrings. This interferes with the primary use of module docstrings, i.e. understanding what the module does.

There are three APIs for writing doctests: a simple API, a unittest API and an advanced API. This article focuses on the unittest API, as it is simple enough to learn very quickly, can be combined with more traditional Python unit tests and can deal with almost every testing problem. In short, a very good compromise.

First we need to set up a project tree with some packages and modules, so just create a directory structure like this:

src/fruit/__init__.py
src/fruit/freshfruit.py
src/fruit/freshfruit-tutorial.txt   -> documentation
src/fruit/tests/__init__.py
src/fruit/tests/test_freshfruit.py  -> regression testing

Next we need some way to run the tests we are going to write. The best is probably to have a test runner script in the top-level directory. You can roll your own, but I’m going to grab one from the Zope project. Just drop test.py into the top level of the project, make sure you have Python 2.4 and the Python profiler (Debian-specific) installed, and we can begin.
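If you would rather roll your own, a runner can be very small. The sketch below is not the Zope test.py. It is a hypothetical helper, run_test_modules, that assumes each test module exposes a test_suite() function as the ones in this article do, and it relies on importlib, which only exists in Python versions much newer than 2.4.

```python
import importlib
import sys
import unittest


def run_test_modules(module_names, source_dir=None):
    """Collect test_suite() from each named module and run everything.

    `module_names` are dotted module names; `source_dir` is an optional
    directory to put on sys.path first (both names are assumptions made
    for this sketch, not part of any existing runner).
    """
    if source_dir is not None and source_dir not in sys.path:
        sys.path.insert(0, source_dir)
    suite = unittest.TestSuite()
    for name in module_names:
        module = importlib.import_module(name)
        # Each test module is expected to define test_suite().
        suite.addTest(module.test_suite())
    return unittest.TextTestRunner(verbosity=1).run(suite)
```

With the directory layout above, the assumed usage would be run_test_modules(['fruit.tests.test_freshfruit'], source_dir='src') from the top-level directory.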

Testing Docstrings in Modules

So what is the first thing to do with fresh fruit? Make a salad! So we write some classes and their tests in freshfruit.py, then fill in the code. Notice how easy it is to understand the code from the test examples. freshfruit.py:

"""
This module makes salads, it implements a FoodProcessor to process
ingredients which are then added to the SaladBowl container.
"""

class FoodProcessor:
    """This class contains various food processing functions."""

    def diceCarrot(self, ingredient):
        """This function dices carrots.

            >>> processor = FoodProcessor()
            >>> processor.diceCarrot('carrot')
            'diced carrot'

        and raises a ValueError if the input is not a carrot:

            >>> processor.diceCarrot('A diced carrot')
            Traceback (most recent call last):
                ...
            ValueError: 'A diced carrot' is not a carrot
        """
        # The parameter is not called `str`, which would shadow the builtin.
        if ingredient == 'carrot':
            return 'diced carrot'
        else:
            raise ValueError("%s is not a carrot" % repr(ingredient))

    def peelBanana(self, ingredient):
        """This function peels bananas.

            >>> processor = FoodProcessor()
            >>> processor.peelBanana('banana')
            'peeled banana'

        and raises a ValueError if the input is not a banana:

            >>> processor.peelBanana('A peeled banana')
            Traceback (most recent call last):
                ...
            ValueError: 'A peeled banana' is not a banana
        """
        # The parameter is not called `str`, which would shadow the builtin.
        if ingredient == 'banana':
            return 'peeled banana'
        else:
            raise ValueError("%s is not a banana" % repr(ingredient))

class SaladBowl:
    """This is a container for processed ingredients.

    You can make a salad bowl and add the prepared ingredients:

        >>> bowl = SaladBowl()
        >>> bowl.addIngredient('peeled banana')
        >>> bowl.addIngredient('diced carrot')
        >>> bowl.addIngredient('peeled banana')

    Finally, most people would want to eat it:

        >>> bowl.eat()
        That peeled banana was tasty!
        That diced carrot was tasty!
        That peeled banana was tasty!
    """

    def __init__(self):
        self.contents = []

    def addIngredient(self, addition):
        """Adds ingredients to the salad bowl.

        Make a SaladBowl

            >>> bowl = SaladBowl()

        Add each ingredient as a string:

            >>> bowl.addIngredient('peeled banana')
            >>> bowl.addIngredient('diced carrot')
            >>> bowl.contents
            ['peeled banana', 'diced carrot']

        """
        self.contents.append(addition)

    def eat(self):
        """Eat the contents of the SaladBowl.

        First set up a salad bowl:

            >>> bowl = SaladBowl()
            >>> bowl.contents = ['diced carrot', 'peeled banana']

        eat() prints an eating message for each item in the salad bowl:

            >>> bowl.eat()
            That diced carrot was tasty!
            That peeled banana was tasty!

        and removes them from the bowl:

            >>> bowl.contents
            []
        """
        while self.contents:
            print("That %s was tasty!" % self.contents.pop(0))

We also need to set up the test runner so that it knows which module to test. This code in test_freshfruit.py should do the trick:

import unittest
import doctest

def test_suite():
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocTestSuite('fruit.freshfruit'))
    return suite

Running the test.py script that we dropped into the top level directory with

> python2.4 test.py

should show you that 5 unit tests were run successfully. Try changing the code or the tests to find out what happens when things fail.

As you may have noticed, each docstring is run in its own namespace, so it is necessary to define class instances for every test (i.e. bowl and processor). To make life easier, you can define them in extra parameters to the DocTestSuite call. Explaining exactly how to do this is too much for this article, but the doctest documentation is excellent.
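As a rough illustration of those extra parameters, the sketch below passes a setUp function to DocTestSuite so that every docstring receives a fresh bowl without creating one itself. It is written for a current Python 3; the module name freshfruit_demo, the cut-down SaladBowl, its isEmpty method and the set_up helper are all stand-ins invented so the example runs on its own, not part of the fruit package.

```python
import doctest
import types
import unittest

# Stand-in for fruit/freshfruit.py so that this sketch is self-contained.
SOURCE = '''
class SaladBowl:
    """A cut-down container, for illustration only."""

    def __init__(self):
        self.contents = []

    def addIngredient(self, addition):
        """Add an ingredient; note that `bowl` is never created here.

            >>> bowl.addIngredient('diced carrot')
            >>> bowl.contents
            ['diced carrot']
        """
        self.contents.append(addition)

    def isEmpty(self):
        """Report whether the bowl is empty.

            >>> bowl.isEmpty()
            True
        """
        return not self.contents
'''

demo = types.ModuleType('freshfruit_demo')
exec(SOURCE, demo.__dict__)


def set_up(test):
    # Called before each docstring runs. test.globs is that docstring's
    # private namespace, so every test starts with its own empty bowl.
    test.globs['bowl'] = demo.SaladBowl()


def test_suite():
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocTestSuite(demo, setUp=set_up))
    return suite


result = unittest.TextTestRunner(verbosity=0).run(test_suite())
```

DocTestSuite also accepts extraglobs, a dictionary merged into every namespace. Objects passed that way are shared between docstrings, which is why a mutable bowl is created in setUp here instead.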

Also, you don’t have to write out the entire traceback of an exception: doctest ignores all indented text between the first and last lines. The ... used in the examples is just a convention, but it makes for more readable documentation than a copy of the traceback itself.
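To see that the stack really is ignored, the following sketch runs one example whose traceback body is arbitrary text. The function dice_carrot and the test name traceback_demo are hypothetical, and the example is driven through DocTestParser and DocTestRunner from the advanced API only to keep it self-contained.

```python
import doctest


def dice_carrot(ingredient):
    # Hypothetical stand-in for FoodProcessor.diceCarrot.
    if ingredient != 'carrot':
        raise ValueError('%r is not a carrot' % (ingredient,))
    return 'diced carrot'


TEXT = '''
>>> dice_carrot('banana')
Traceback (most recent call last):
    this stack is never compared, so anything can go here
ValueError: 'banana' is not a carrot
'''

test = doctest.DocTestParser().get_doctest(
    TEXT, {'dice_carrot': dice_carrot}, 'traceback_demo', None, 0)
runner = doctest.DocTestRunner(verbose=False)
# Discard the textual report; only the counts are of interest here.
results = runner.run(test, out=lambda message: None)
```

Only the exception type and message on the last line are compared, so results.failed stays at zero.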

Of the many doctest options there are two which are incredibly useful. The first, doctest.NORMALIZE_WHITESPACE causes the doctest module not to worry whether the whitespace in your test example is exact (Having failing tests because of trailing whitespace can be incredibly irritating). The second, doctest.ELLIPSIS makes ... match any substring in the test output, much like .* in regular expressions. Of course these should be used with care as they can cause failing tests to appear to work!
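The effect of the two flags can be checked with a small experiment. This is a sketch for a current Python 3, with a hypothetical helper count_failures that runs the same two examples with and without the flags.

```python
import doctest

EXAMPLE = '''
>>> print('That diced carrot   was tasty!')
That diced carrot was tasty!
>>> print(['peeled banana', 'diced carrot', 'peeled banana'])
['peeled banana', ...]
'''


def count_failures(text, optionflags=0):
    """Run the examples in `text` and return how many of them fail."""
    test = doctest.DocTestParser().get_doctest(text, {}, 'example', None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Swallow the failure report; only the count matters here.
    return runner.run(test, out=lambda message: None).failed


# Without flags, the extra spaces and the literal "..." both cause failures.
strict = count_failures(EXAMPLE)
# With both flags set, the same examples pass.
relaxed = count_failures(
    EXAMPLE, doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE)
```

Note that ['peeled banana', ...] would equally accept a bowl holding the wrong remaining ingredients, which is exactly the kind of false pass to watch out for.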

You can add these options to the testing framework by modifying test_freshfruit.py as follows:

import unittest
import doctest

def test_suite():
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocTestSuite('fruit.freshfruit',
            optionflags=doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE))
    return suite

Testing Tutorial Documentation

If you want to write tutorial documentation, doctests make it possible to add interactive code examples to plain text files. You can try this by adding

Making a salad
--------------

To make a salad, you first have to get a food processor:

>>> from fruit.freshfruit import FoodProcessor
>>> processor = FoodProcessor()

Then to process the food, you can dice some carrots:

>>> diced_carrot = processor.diceCarrot('carrot')
>>> diced_carrot
'diced carrot'

and peel some bananas:

>>> peeled_banana1 = processor.peelBanana('banana')
>>> peeled_banana2 = processor.peelBanana('banana')
>>> peeled_banana2
'peeled banana'

You can also get a salad bowl and put the ingredients in:

>>> from fruit.freshfruit import SaladBowl
>>> bowl = SaladBowl()
>>> bowl.addIngredient(diced_carrot)
>>> bowl.addIngredient(peeled_banana1)
>>> bowl.addIngredient(peeled_banana2)

Finally, you can eat and enjoy:

>>> bowl.eat()
That diced carrot was tasty!
That peeled banana was tasty!
That peeled banana was tasty!

to freshfruit-tutorial.txt and slightly modifying test_freshfruit.py to:

import os
import unittest
import doctest

def test_suite():
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocFileSuite(
        os.path.join('..', 'freshfruit-tutorial.txt')))
    suite.addTest(doctest.DocTestSuite('fruit.freshfruit',
            optionflags=doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE))
    return suite

As you can see, the text file is parsed exactly like one large docstring. Simple!
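The same behaviour can be tried without the fruit package. The sketch below, for a current Python 3, writes a throwaway tutorial.txt (a made-up file, not the one above) to a temporary directory and runs it through DocFileSuite. It passes module_relative=False because the path is an ordinary filesystem path rather than one relative to the test module.

```python
import doctest
import os
import tempfile
import unittest

TUTORIAL = """\
Counting ingredients
--------------------

Names bound in one example stay available in the next:

>>> ingredients = ['diced carrot', 'peeled banana']
>>> len(ingredients)
2

Prose between the examples is skipped by doctest.

>>> ingredients.append('peeled banana')
>>> ingredients
['diced carrot', 'peeled banana', 'peeled banana']
"""

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, 'tutorial.txt')
    with open(path, 'w') as handle:
        handle.write(TUTORIAL)
    # The whole file becomes a single test case sharing one namespace.
    suite = doctest.DocFileSuite(path, module_relative=False)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the file is one test, a failure in an early example can cascade into later ones that depend on the same names.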

Regression Testing

Finally, fruit is released and the bug reports start to roll in. Some people want to dice Unicode carrots and others BANANAS. In fact, there are so many different types of carrots and bananas that testing every special case will make the in-module and tutorial documentation impossible to read.

This is what the test_freshfruit.py file is for. Simply adding doctest.DocTestSuite() to the suite, with no module argument, will test all of the docstrings in test_freshfruit.py itself. Regression tests can then be added as functions containing only docstrings, without fear that they will complicate the documentation. As an example, this test_freshfruit.py includes the tests for Unicode and capitalisation:

import os
import unittest
import doctest

def doctest_Unicode():
    """Test to make sure that freshfruit.py deals with Unicode fruit.

    SetUp:

        >>> from fruit.freshfruit import FoodProcessor
        >>> processor = FoodProcessor()

    Unicode bananas and carrots must return Unicode:

        >>> processor.peelBanana(u'banana')
        u'peeled banana'
        >>> processor.diceCarrot(u'carrot')
        u'diced carrot'
    """

def doctest_Capitalize():
    """Test to make sure that freshfruit.py deals with CAPITALIZATION.

    SetUp:

        >>> from fruit.freshfruit import FoodProcessor
        >>> processor = FoodProcessor()

    BANANAs and CARROTs are also fruit:

        >>> processor.peelBanana('BANANA')
        'peeled banana'
        >>> processor.diceCarrot('CARROT')
        'diced carrot'
    """

def test_suite():
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocFileSuite(
        os.path.join('..', 'freshfruit-tutorial.txt')))
    suite.addTest(doctest.DocTestSuite('fruit.freshfruit',
            optionflags=doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE))
    suite.addTest(doctest.DocTestSuite())
    return suite

As you can see, it is clear exactly what each regression test is testing. It is left as an exercise for the reader to code the solution.

Some Doctest Pitfalls

There are, of course, some caveats. Specifically, the parsed output:

  • must be the same every time.
  • can be hard to fit into screen width.

Even with these, doctests are still capable of being useful in almost every testing situation. For the rest, there is the Python unittest framework.
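The first caveat is the one that bites most often: object addresses and set ordering change between runs. The sketch below, for a current Python 3, contrasts a fragile example with a stable rewrite. The cut-down SaladBowl, the test name pitfall_demo and the helper count_failures are hypothetical names used only for this illustration.

```python
import doctest


class SaladBowl:
    """Hypothetical cut-down bowl, for illustration only."""


# The address was copied from one particular session, so this example
# cannot be expected to pass again.
FRAGILE = '''
>>> SaladBowl()
<__main__.SaladBowl object at 0x7f3a5c0b1d90>
'''

# ELLIPSIS skips the unstable parts, and sorted() pins down set ordering.
STABLE = '''
>>> SaladBowl()          # doctest: +ELLIPSIS
<...SaladBowl object at 0x...>
>>> sorted(set(['carrot', 'banana', 'carrot']))
['banana', 'carrot']
'''


def count_failures(text):
    """Run the examples in `text` and return how many of them fail."""
    test = doctest.DocTestParser().get_doctest(
        text, {'SaladBowl': SaladBowl}, 'pitfall_demo', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda message: None).failed


fragile_failures = count_failures(FRAGILE)
stable_failures = count_failures(STABLE)
```

The second caveat has no such trick: long output lines usually have to be wrapped by hand and compared with NORMALIZE_WHITESPACE.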

Finally

This short overview should be enough for a programmer to become productive with doctests immediately. Anyone who knows the Python command line already has all the skills needed.

But perhaps the most important aspect of doctests is forcing a good style of documentation that is tested and correct. Every programmer knows the frustration of trying to understand complex code without the aid of examples or comments. Think about them when you write your tests.

Further Reading and Thanks

For more information you should definitely have a look at the doctest documentation for implementation details. Jim Fulton’s PyCon 2004 presentation and Phillip J. Eby’s essay also go a long way towards describing the philosophy behind doctests.

With thanks to Marius Gedminas for introducing me to doctests and helping out with this article. Errors are, of course, mine.


Brian Sutherland
