How to get visual performance profiles from plone doctests ?

I am developping a couple of Plone 3.x products. They have some tests, including a huge functional doctest which takes a long time to run (about a couple of hours !) but covers some of my most interesting use cases. I wanted to use these tests in order to get some insights about possible performance bottlenecks and other optimization hot points in my code. The result of my effort was a very nice visual chart showing these bottlenecks and hotpoints.

[update: added another visualization package, see at the end of the post]

Here is how I had to proceed (note that I am more of a foolish and coward hacker than an expert and I decline any responsibility on the consequences of following my howto !) :

1. Give your python a suitable profiler

Plone 3.x requires zope 2.10 which in turn requires python 2.4. More recent versions are not supported AFAICS. Problem: python2.4 does not have a reliable performance profiling module. Its « hotshot » module is both slow (when loading statistics) and badly bugged : it crashes when you have it load some of the profiles it can generate. You have to add a better profiler to your python environment, namely cProfile (which is shipped with python 2.5).

I am a terrible sysadmin and I don’t really understand (and care about) how python manages its pathes and accesses its libraries. So I did this :

  1. download and unzip the source tarball of python 2.5 so that you get cProfile source code
  2. locate relevant files referring to lsprof (the old name of cProfile), using a grep -R lsprof * on the source directory
  3. I personnally located the following files (I leave cProfile test files apart) : Lib/cProfile.py Modules/_lsprof.c and Modules/rotatingtree.* (.c and .h)
  4. download and unzip the source tarball of python 2.4
  5. copy the located cProfile files from their python 2.5 location to the proper dirs into the source code of your fresh python 2.4
  6. update python 2.4 ‘s setup.py file so that the line below is added just after the hoshot one : exts.append( Extension(‘_lsprof’, [‘_lsprof.c’, ‘rotatingtree.c’]) )
  7. did I mention I am so bad at hacking things that I don’t even provide a patch for the operations above ?
  8. compile python 2.4 using a ./configure then make

At this point, you must have an executable python interpreter version 2.4 which includes cProfile. You can check by launching this python and trying a import cProfile which should not fail.

I replaced my system python2.4 by then doing a sudo make altinstall but I also had to manually tweak my system files so that this new python2.4 gets properly called (I am using ubuntu 8.10 intrepid, BTW) :

cd /usr/bin

sudo mv ./python2.4 ./python2.4.5

sudo ln -s /usr/local/bin/python2.4

Now, a plain command line call to python2.4 should give you an interpreter prompt which lets you import cProfile if you dare. I suffered some colateral damage here : the python prompt lost its ability to have previous lines copied at the prompt by pressing the Up/Down arrows. And I had to re-install reportlab from the source (some of my products depend on pisa which depends on reportlab). Anyone knows how to restore this Up/Down arrow capability ?

2. Recreate your buildout using this new python version

So that zope gets recompiled using your new python version :

rm -Rf parts bin develop-eggs

python2.4 bootstrap.py

bin/buildout

3. Patch zope testrunner so that it supports cProfile instead of only supporting hotshot

I got a bit confused because my buildout contains 2 zope testrunners. It took me some time to figure out which was which : the one which is used by the zope instance your buildout creates is the one which is shipped with zope 2.10 and is located at parts/zope2/lib/python/zope/testing/. The other one I have is in the zope.testing egg. I don’t know how and why I got such an egg. Anyway, this egg supports both hotshot and cProfile whereas zope 2.10 testrunner doesn’t. So I hacked the weaker/older zope 2.10 testrunner with some inspiration from zope.testing so that cProfile can be used when running tests. Here is the diff you can use for enhancing  parts/zope2/lib/python/zope/testing/testrunner.py. Oops, left version is the modified one, right version is the original one.

38,69d37
< before_tests_hooks = []
< after_tests_hooks = []
< available_profilers = {}
<
< try:
<     import cProfile
<     import pstats
< except ImportError:
<     pass
< else:
<     class CProfiler(object):
<         «  » »cProfiler » » »
<         def __init__(self, filepath):
<             self.filepath = filepath
<             self.profiler = cProfile.Profile()
<             self.enable = self.profiler.enable
<             self.disable = self.profiler.disable
<
<         def finish(self):
<             self.profiler.dump_stats(self.filepath)
<
<         def loadStats(self, prof_glob):
<             stats = None
<             for file_name in glob.glob(prof_glob):
<                 if stats is None:
<                     stats = pstats.Stats(file_name)
<                 else:
<                     stats.add(file_name)
<             return stats
<
<     available_profilers[‘cProfile’] = CProfiler
<
75,98c43
<     pass
< else:
<     class HotshotProfiler(object):
<         «  » »hotshot interface » » »
<
<         def __init__(self, filepath):
<             self.profiler = hotshot.Profile(filepath)
<             self.enable = self.profiler.start
<             self.disable = self.profiler.stop
<
<         def finish(self):
<             self.profiler.finish()
<
<         def loadStats(self, prof_glob):
<             stats = None
<             for file_name in glob.glob(prof_glob):
<                 loaded = hotshot.stats.load(file_name)
<                 if stats is None:
<                     stats = loaded
<                 else:
<                     stats.add(loaded)
<             return stats
<
<     available_profilers[‘hotshot’] = HotshotProfiler

>     hotshot = None
288c233
<     if len(available_profilers) == 0 and options.profile:

>     if hotshot is None and options.profile:
320,324c265,266
<         if available_profilers.has_key(‘cProfile’): prof = available_profilers[‘cProfile’](file_path)
<         else: prof = available_profilers[‘hotshot’](file_path)
<         before_tests_hooks.append(prof.enable)
<         after_tests_hooks.append(prof.disable)
<

>         prof = hotshot.Profile(file_path)
>         prof.start()
335c277,278
<             prof.finish()

>             prof.stop()
>             prof.close()
342c285,292
<         stats=prof.loadStats(prof_glob)

>         stats = None
>         for file_name in glob.glob(prof_glob):
>             loaded = hotshot.stats.load(file_name)
>             if stats is None:
>                 stats = loaded
>             else:
>                 stats.add(loaded)
>
459d408
<                 [hook() for hook in before_tests_hooks]
461d409
<                 [hook() for hook in after_tests_hooks]
656,659c604
<     [hook() for hook in before_tests_hooks]
<     results = run_tests(options, tests, layer_name, failures, errors)
<     [hook() for hook in after_tests_hooks]
<     return results

>     return run_tests(options, tests, layer_name, failures, errors)

Oh, BTW, this diff also lets you filter out the profiling of the setup and teardown steps of your tests which are of poor value compared to actual tests. Thanks to Daniel Nouri for this.

At this point, you should have given your zope instance the capability of profiling tests using cProfile. You can check it by asking for a debug prompt from zope : bin/instance debug The prompt you get should allow you to safely import cProfile

4. Profile your test

Say you have a Products called Products.DearProduct with some tests. Profile them :

bin/instance test -s Products.DearProduct –profile

At this point, you should get a tests_profile.*.prof file saved in the current dir. It contains the performance profile cProfile generated, using the pstats format. You can manually load and analyze this data. Or have a limited GUI show you what it’s like. Or you can go for the nicer, more insightful version which follows.

5. Visualize and analyze the performance profile you generated

Thanks to Ingeniweb folks, I heard of gprof2dot and xdot. Download them (the scripts, not the folks). Use them to generate and display a very nice graph :

chmod 744 gprof2dot.py

chmod 744 xdot.py

./gprof2dot.py -f pstats -o profile.dot tests_profile.*.prof

./xdot.py profile.dot

Note the * you may replace with the ID of the profile generated above. Or you can use the fancy but dangerous one-liner below which runs the tests,  generates the profile, generates the corresponding graph, displays the results of tests and displays the graph for analysis :

rm -f tests_profile.*.prof && rm -f profile.pstats && rm -f profile.dot && bin/single-instance test -s Products.MyDearProduct –profile > /tmp/test.txt ; ./gprof2dot.py -f pstats -o profile.dot tests_profile.*.prof && less /tmp/test.txt ; ./xdot.py profile.dot

At this point, you should be starring at nice colored graph which represent the flow of your tests and the method which may be performance bottlenecks. And you should be hoping that it was worth the effort.

[Here starts the update]

After some contemplation moment, I tried to analyze the graph of my tests and did not feel extremely happy with this graph visualization. It indeed shows me that the slowlyness of functional doctest is mostly due to the testing framework (zope.testbrowser, etc.). This slowlyness « hides » the optimization opportunities of my code. And I don’t know how to exclude some products from the being profiled or from being present in the profile stats (I would have liked to filter out zope.testbrowser and other Plone-specific things). But, all hope is not lost, here comes kcachegrind:

sudo apt-get install kcachegrind

sudo easy-install pyprof2calltree

pyprof2calltree -o output.calltree.stats -i tests_profile.*.prof -k

Using kcachegrind with the help of pyprof2calltree, I was able to focus on my product methods and identify those methods which deserve some caching. Added some @memoize decorators and reran the profiled tests so that I could enjoy the performance improvement… Happy I am, happy thou shalt be.

What do you think ?