Thursday, October 18, 2012

Sumatra 0.4 released

We would like to announce the release of version 0.4.0 of Sumatra, a tool for automated tracking of simulations and computational analyses so as to be able to easily replicate them at a later date.


The biggest change in Sumatra 0.4 is the redesign of the browser-based interface, launched with smtweb. Thanks to the Google Summer of Code program, Dmitry Samarkanov was able to spend his summer working on improving Sumatra, with the results being a much improved web interface, better support for running Sumatra on Windows, and better support for running Matlab scripts with Sumatra. Many thanks to Google and to the INCF as mentoring organisation. In addition to Dmitry's improvements, handling of input and output data files is much improved, and Sumatra now captures and stores standard output (stdout) and standard error (stderr) streams. More details on all of these, plus a bunch of minor improvements and bug fixes, is given below. Finally, Sumatra no longer supports Python 2.5 - the minimum requirement is Python 2.6.

Web interface

The Sumatra browser-based interface runs a local webserver on your computer, and allows you to browse the information that Sumatra captures about your analyses, simulations or other computations, including code versions, input and output data files, parameter/configuration files, the operating system and processor architecture.

The interface has been completely redesigned for Sumatra 0.4, and includes dozens of large and small improvements, including:

  • a more modern, attractive design
  • the ability to select which columns to display in the record list view
  • the ability to search all of your records based on date, tags or full-text
  • side-by-side comparison of records
  • sorting of records based on any column
  • selection of multiple records by clicking or dragging for deletion, comparison and tagging

Furthermore, it is now possible to launch computations from the browser interface.

For more information, see Using the web interface.

Data file handling

In earlier versions of Sumatra, the filename (or rather, the file path relative to a user-defined root) was used as the identifier for input and output data files. The problem with this, of course, is that it is possible to overwrite a given file with new data. For this reason, Sumatra 0.4 now calculates and stores the SHA1 hash of the file contents. If the file contents change, the hash will also change, so that Sumatra can alert you if a file is accidentally overwritten, for example.

Sumatra 0.4 also adds a new data store which automatically archives a copy of the output data from your computations in a user-selected location. This data store is accessible through the API as the ArchivingFileSystemDataStore class, or through the smt command-line interface with the "archive" option to the "init" and "configure" commands.

Finally, Sumatra now allows the user the choice of whether to use an absolute or relative path for the data store root directory. Using a relative path makes projects easier to move and easier to access from other locations (e.g. with symbolic links or NFS).

Matlab support

Sumatra can capture certain information for any command-line tool: input and output data, version of the main codebase, operating system and processor architecture, etc. For dependency information, however (i.e. which libraries, modules or packages are imported/included by your main script), a separate plugin is required for each language. Sumatra already has a dependency tracking plugin for Python and for two computational neuroscience simulation environments, NEURON and GENESIS. Sumatra 0.4 adds dependency tracking for Matlab scripts.

Recording of stdout and stderr

Sumatra 0.4 now supports recording and storage of the standard output and standard error streams from your scripts.

Other new features

  • added support for JSON-format parameter files;
  • added smt export command, which allows the contents of a Sumatra record store to be exported in JSON format;
  • more information is now printed by smt list --long;
  • the Python dependency finder now supports scripts run with Python 3 (although Sumatra itself still needs Python 2);
  • can now specify HttpRecordStore username and password as part of the URL passed to smt init;
  • added support for markup using reStructuredText in the project description
  • it is no longer required to have a script file, which makes it possible to use Sumatra with your own compiled executables. Further support for compiled languages is planned for the next release.

Download, support and documentation

The easiest way to get the latest version of Sumatra is

  $ pip install sumatra

Alternatively, Sumatra 0.4.0 may be downloaded from PyPI or from the INCF Software Center. Support is available from the sumatra-users Google Group. Full documentation is available on PyPI.