We would like to announce the release of version 0.6.0 of Sumatra, a tool for automated tracking of simulations and computational analyses so as to be able to easily replicate them at a later date.
This version of Sumatra sees many new features, improvements to existing ones, and bug fixes:
New export formats: LaTeX and bash script
You can now export your project history as a LaTeX document, allowing you to easily generate a PDF listing details of all the simulations or data analyses you've run.
$ smt list -l -f latex > myproject.tex
Using tags, you can also restrict the output to only a subset of the history.
You can also generate a shell script, which can be executed to repeat the sequence of computations captured by Sumatra, or saved as an executable record of your work. The shell script includes all version control commands needed to ensure the correct version of the code is used at each step.
$ smt list -l -f shell > myproject.sh
Working with multiple VCS branches
Previous versions of Sumatra would always update to the latest version of the code in your version control repository before running the computation.
Now, "smt run" will never change the working copy, which makes it much easier to work with multiple version control branches and to go back to running earlier versions of your code.
Programs that read from stdin or write to stdout
Programs that read from standard input or write to standard output can now be run with "smt run". For example, if the program is normally run using:
$ myprog < input > output
You can run it with Sumatra using:
$ smt run -e myprog -i input -o output
Improvements to the Web browser interface
There have been a number of small improvements to the browser interface:
- a slightly more compact and easier-to-read (table-like) layout for parameters;
- the working directory is now displayed in the record detail view;
- the interface now works if record label contain spaces or forward slashes.
The minimal Django version needed is now 1.4.
Support for PostgreSQL
The default record store for Sumatra is based on the Django ORM, using SQLite as the backend. It is now possible to use PostgreSQL instead of SQLite, which gives better performance and allows the Sumatra record store to be placed on a separate server.
To set up a new Sumatra project using PostgreSQL, you will first have to create a database using the PostgreSQL tools ("psql", etc.). You then configure Sumatra as follows:
$ smt init --store=postgres://username:password@hostname/databasename MyProject
Note that the database tables will not be created until after the first "smt run".
Integration with the SLURM resource manager
Preliminary support for launching MPI computations via SLURM has been added.
$ smt configure --launch_mode=slurm-mpi $ smt run -n 256 input_file1 input_file2
This will launch 256 tasks using "salloc" and "mpiexec".
Command-line options for SLURM can also be set using "smt configure"
$ smt configure --launch_mode_options=" --tasks-per-node=1"
If you are using "mpiexec" on its own, without a resource manager, you can set MPI command-line options in the same way.
Improvements to the @capture decorator
The "@capture" decorator, which makes it easy to add Sumatra support to Python scripts (as an alternative to using the "smt" command), now captures stdout and stderr.
Improvements to parameter file handling
Where a parameter file has a standard mime type (like json, yaml), Sumatra uses the appropriate extension if rewriting the file (e.g. to add parameters specified on the command line), rather than the generic ".param".
Migrating data files
If you move your input and/or output data files, either within the filesystem on your current computer or to a new computer, you need to tell Sumatra about it so that it can still find your files. For this there is a new command, "smt migrate". This command also handles changes to the location of the data archive, if you are using one, and changes to the base URL of any mirrored data stores. For usage information, run:
$ smt help migrate
Miscellaneous improvements
- "smt run" now passes unknown keyword args on to the user program. There is also a new option '--plain' which prevents arguments of the form "x=y" being interpreted by Sumatra; instead they are passed straight through to the program command line.
- When repeating a computation, the label of the original is now stored in the 'repeats' attribute of 'Record', rather than appending "_repeat" to the original label. The new record will get a new unique label, or a label specified by the user. This means a record can be repeated more than once, and is a more reliable method of indicating a repeat.
- A Sumatra project now knows the version of Sumatra with which it was created.
- "smt list" now has an option '-r'/'--reverse' which lists records oldest to newest.
Bug fixes
A fair number of bugs have been fixed.
Download, support and documentation
The easiest way to get the latest version of Sumatra is
$ pip install sumatra
Alternatively, Sumatra 0.6.0 may be downloaded from PyPI or from the INCF Software Center. Support is available from the sumatra-users Google Group. Full documentation is available on pythonhosted.org.
No comments:
Post a Comment