Tuesday, April 6, 2010

Tracking computational experiments with Sumatra

“I thought I used the same parameters but I’m getting different results”

“I can’t remember which version of the code I used to generate figure 6”

“The new student wants to reuse that model I published three years ago but he can’t reproduce the figures”

“It worked yesterday”

“Why did I do that?”


We would like to announce the release of version 0.1 of Sumatra, a tool for tracking computational experiments and analyses so as to be able to easily replicate them at a later date.

Replication of computational experiments or analyses ought to be easy, given that computers don't suffer from the problems of inter-subject and trial-to-trial variability that make reproduction of biological experiments so challenging. In general, however, it is not easy, perhaps due to the complexity of our code and our computing environments, and the difficulty of capturing every essential piece of information needed to reproduce a computational experiment using existing tools such as spreadsheets, version control systems and paper notebooks.

The aim of Sumatra is to record as much as possible of the experimental context (software versions, parameters, dependencies, platform information, what files were produced, etc.) automatically, and make it easy to annotate the record with information that cannot be obtained automatically (why the simulation or analysis was performed, tags for later searching, etc.).
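
To make the split between automatically captured and manually supplied information concrete, here is a minimal sketch in plain Python. It does not use Sumatra's API: the function names `capture_context` and `parameters_digest` and the record layout are hypothetical, chosen only for illustration, and the sketch leaves out things a real tool would also need, such as the version-control revision, dependencies and output files.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def capture_context(parameters, reason="", tags=()):
    """Build a plain-dict record of one run (hypothetical format, not Sumatra's)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Information that can be gathered automatically.
        "platform": {
            "system": platform.system(),
            "machine": platform.machine(),
            "python_version": platform.python_version(),
        },
        "executable": sys.executable,
        "parameters": dict(parameters),
        # Information only the researcher can supply.
        "reason": reason,
        "tags": sorted(tags),
    }


def parameters_digest(parameters):
    """Hash the parameters so that two runs can be compared quickly."""
    canonical = json.dumps(parameters, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    record = capture_context(
        {"rate": 10.0, "seed": 42},
        reason="check sensitivity to input rate",
        tags=["figure6", "pilot"],
    )
    print(json.dumps(record, indent=2))
```

Storing such a record next to each set of output files, together with the digest of the parameters, would make it possible to answer "did I really use the same parameters?" by comparing two short strings.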

Researchers' workflows differ widely: command line, GUI, batch jobs (e.g. in supercomputer environments), or any combination of these for different components (simulation, analysis, graphing, etc.) and phases of a project. This makes it difficult to provide a one-tool-fits-all solution, so Sumatra provides its core functionality as a Python package, on top of which different interfaces can be built.

Sumatra currently provides a command-line interface and a rudimentary web interface; we hope that people will also be interested in incorporating Sumatra's functionality within their own tools.

Sumatra 0.1 may be downloaded from the INCF Software Center or from PyPI.

For more information and documentation, check out https://neuralensemble.org/trac/sumatra/.

6 comments:

mattions said...

And now also with Git support.
Yes :)

Andrew Davison said...

Note that Git support is in the repository, but will not be officially released until version 0.2 (coming soon).

HC said...

Mercurial is written in Python, so I am wondering if there is any interest in Mercurial integration? We are also using Mercurial to track source code changes for Genesis development. Maybe we can have a look at it at the CodeJam session?

Andrew Davison said...

Mercurial is already supported in 0.1, as is Subversion. It might be interesting to integrate Sumatra in the Genesis GUI - perhaps we could discuss that at the CJ.

Alexandra said...

Interested to know if you have had any feedback as to what different interfaces have been built on top of the core Python package?

Sarah said...

“The new student wants to reuse that model I published three years ago but he can’t reproduce the figures”

"We would like to announce the release of version 0.1 of Sumatra, a tool for tracking computational experiments and analyses so as to be able to easily replicate them at a later date."

This is music to my ears - it is so frustrating when you create something and then, because of slight changes in technology or software, end up wasting your time trying to figure out how to share the original data properly. Great project, and hopefully you will let us know about updates soon.

Sarah
