diff --git a/docs/_running_intro.rst b/docs/_running_intro.rst index a868ee22e..e4427dd08 100644 --- a/docs/_running_intro.rst +++ b/docs/_running_intro.rst @@ -28,6 +28,9 @@ the ``Number of lines`` parameter) are randomly selected. If you want to view it in the Galaxy interface, you can do so with the command ``planemo workflow_edit tutorial.ga``. +Running a workflow +-------------------------------- + The simplest way to run a workflow with planemo is on a locally hosted Galaxy instance, just like executing a tool test with ``planemo test``. This can be achieved with the command @@ -71,7 +74,14 @@ of the user's choice. The full list of engines provided by Galaxy is: ``galaxy`` (the default, used in the first example above), ``docker_galaxy``, ``cwltool``, ``toil`` and ``external_galaxy``. -As a final example to demonstrate workflow testing, try: +Testing a workflow +-------------------------------- + +Testing a workflow can be thought of as an extension of running a workflow where, +after the run finishes, planemo asserts specified expectations about defined outputs. +Workflow tests, like tool tests, are performed with ``planemo test``. + +As an example, try: :: @@ -100,11 +110,13 @@ If you inspect its contents: path: "data/output.txt" -you see that the job parameters are defined identically to the ``tutorial-job.yml`` -file, with the addition of an output. For the test to pass, the output file -produced by the workflow must be identical to that stored in ``data/output.txt``. +you see that the ``job`` parameters, used to run the workflow, are defined identically to the +``tutorial-job.yml`` file, but that the test definition has an additional ``outputs`` section. +For the test to pass, the output file produced by the workflow must be identical to that stored in ``data/output.txt``. + +More details about workflow testing can be found in the dedicated `Test Format `__ chapter. 
-The three commands above demonstrate the basics of workflow execution with +The examples above demonstrate the basics of workflow execution with Planemo. For large scale workflow execution, however, it's likely that you would prefer to use the more extensive resources provided by a public Galaxy server, rather than running on a local instance. The tutorial therefore now turns to the diff --git a/docs/running.rst b/docs/running.rst index e130d373a..4f94e3414 100644 --- a/docs/running.rst +++ b/docs/running.rst @@ -1,23 +1,54 @@ ==================================== -Running Galaxy workflows +Interacting with Galaxy workflows ==================================== -Planemo offers a number of convenient commands for working with Galaxy -workflows. Workflows are made up of a number of individual tools, which are +Galaxy workflows are made up of a number of individual tools, which are executed in sequence, automatically. They allow Galaxy users to perform complex analyses made up of multiple simple steps. Workflows can be easily created, edited and run using the Galaxy user interface (i.e. in the web-browser), as is described in the `workflow tutorial `__ -provided by the Galaxy Training Network. However, in some circumstances, -executing workflows may be awkward via the graphical interface. For example, -you might want to run workflows a very large number of times, or you might -want to automatically trigger workflow execution as a particular time as new -data becomes available. For these applications, being able to execute workflows -via the command line is very useful. This tutorial provides an introduction to -the ``planemo run`` command, which allows Galaxy tools and workflows to be -executed simply via the command line. +provided by the Galaxy Training Network. 
+ +Planemo commands for interacting with workflows +=============================================== + +Planemo offers a number of convenient commands for interacting with Galaxy +workflows from the command line. +Here, you will use the following ones to interact with a small example workflow: + +- ``planemo run``, which can be used to execute a Galaxy workflow with input datasets / parameters defined in a so-called *job file* in YAML format. + + If you are looking for a way to run workflows a very large number of times, + or to automatically trigger workflow execution at particular times or as new + data becomes available, this command is a great starting point! + +- ``planemo test``, which can be used not only to `test Galaxy tools `__, but also to test workflows. + + Similar to ``planemo run``, this command can be used to execute a Galaxy workflow, + but it will also evaluate the success of the workflow execution by comparing workflow output datasets to expected results. + + This command enables test-driven development of Galaxy workflows. + It can also form the basis of automated monitoring systems that, for example, + check for compatibility between workflow versions and Galaxy server versions and instances. + + Input datasets / parameters and output expectations are passed to this command in a *test file* in YAML format, + which extends the job file format used with ``planemo run``. + +- ``planemo workflow_job_init`` and ``planemo workflow_test_init`` + + These are useful helper commands that generate templates of the *job file* + expected by ``planemo run`` and of the *test file* expected by ``planemo test``, + respectively, from a workflow definition file. + +- ``planemo workflow_lint``, which lets you check a workflow for syntax errors and violations of workflow `best practices `__. + +- ``planemo list_invocations`` and ``planemo rerun``, which are great companions of ``planemo run``. 
+ + ``planemo list_invocations`` provides information about the status of previous runs of a given workflow, + while ``planemo rerun``, through its ``--invocation`` option, lets you rerun failed jobs + that resulted from any particular previous run of your workflow. .. include:: _running_intro.rst -.. include:: _running_external.rst \ No newline at end of file +.. include:: _running_external.rst diff --git a/docs/test_format.rst b/docs/test_format.rst index 3c8b7c469..f5eda28d1 100644 --- a/docs/test_format.rst +++ b/docs/test_format.rst @@ -13,14 +13,16 @@ test results (pass or fail for each test) in the console and creates an HTML rep directory. Additional bells and whistles include the ability to generate XUnit reports, publish test results and get embedded Markdown to link to them for PRs, and test remote artifacts in Git repositories. +For more information about testing Galaxy tools using embedded tool XML tests, see the tutorial-style chapter +`Test-Driven Development `__ +of Galaxy tools. + Much of this same functionality is now also available for Galaxy_ Workflows as well as `Common Workflow Language`_ -(CWL) tools and workflows. The rest of this page describes this testing format and testing options for these -artifacts - for information about testing Galaxy tools specifically using the embedded tool XML tests see -`Test-Driven Development `__ -of Galaxy tools tutorial. +(CWL) tools and workflows. The rest of this page describes the test format and testing options for these +artifacts. Unlike the traditional Galaxy tool approach, these newer types of artifacts should define tests in files -located next artifact. For instance, if ``planemo test`` is called on a Galaxy workflow called ``ref-rnaseq.ga`` +located next to the artifact. For instance, if ``planemo test`` is called on a Galaxy workflow called ``ref-rnaseq.ga``, tests should be defined in ``ref-rnaseq-tests.yml`` or ``ref-rnaseq-tests.yaml``. 
If instead it is called on a CWL_ tool called ``seqtk_seq.cwl``, tests can be defined in ``seqtk_seq_tests.yml`` for instance. @@ -103,7 +105,7 @@ runnable artifact outside the context of testing with ``planemo run``. $ planemo run --engine= [ENGINE_OPTIONS] [ARTIFACT_PATH] [JOB_PATH] - This should be familar to CWL developers - and indeed if ``--engine=cwltool`` this works as a formal CWL + This should be familiar to CWL developers - and indeed with ``--engine=cwltool`` this works as a formal CWL runner. Planemo provides a uniform interface to Galaxy for Galaxy workflows and tools though using the same CLI invocation if ``--engine=galaxy`` (for a Planemo managed Galaxy instance), ``--engine=docker_galaxy`` (for a Docker instance of Galaxy launched by Planemo), or ``--engine=external_galaxy`` (for a running @@ -168,7 +170,7 @@ your workflows should be labeled anyway to work with Galaxy subworkflows and mor If an output is known, fixed, and small it makes a lot of sense to just include a copy of the output next to your test and set ``file: relative/path/to/output`` in your output definition block as show in the first -example above. For completely reproducible processes this is a great guarentee that results are fixed over +example above. For completely reproducible processes this is a great guarantee that results are fixed over time, across CWL_ engines and engine versions. If the results are fixed but large - it may make sense to just describe the outputs by a SHA1_ checksum_. @@ -180,35 +182,29 @@ describe the outputs by a SHA1_ checksum_. wf_output_1: checksum: "sha1$a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b" -One advantage of included an exact file instead of a checksum is that Planemo can produce very nice line +One advantage of including an exact file instead of a checksum is that Planemo can produce very nice line by line diffs for incorrect test results by comparing an expected output to an actual output. 
-There are reasons one may not be able to write such exact test assertions about outputs however, perhaps -date or time information is incorporated into the result, unseeded random numbers are used, small numeric -differences occur across runtimes of interest, etc.. For these cases, a variety of other assertions can -be executed against the execution results to verify outputs. The types and implementation of these test -assertions match those available to Galaxy_ tool outputs in XML but have equivalent YAML formulations that -should be used in test descriptions. - -Even if one can write exact tests, a really useful technique is to write sanity checks on outputs as one -builds up workflows that may be changing rapidly and developing complex tools or worklflows via a +There are reasons one may not be able to write exact test assertions about outputs, however. +Perhaps date or time information is incorporated into a result, unseeded random numbers are used, small numeric +differences occur across runtimes of interest, etc. +Even if one can write exact tests, a really useful technique is to write more liberal sanity checks on outputs as one +builds up workflows that may be changing rapidly and develops complex tools or workflows via a `Test-Driven Development cycle `__ using Planemo. *Tests shouldn't just be an extra step you have to do after development is done, they should guide development as well.* -The workflow example all the way above demonstrates some assertions one can make about the contents of -files. The full list of assertions available is only documented for the Galaxy XML format but it is -straightforward to adapt to the YAML format above - check out the -`Galaxy XSD `__ -for more information. - -Some examples of inexact file comparisons derived from an artificial test case in the Planemo test suite is shown below, -these are more options available for checking outputs that may change in small ways over time. 
+In all of these cases, a variety of other assertions can be run against the execution results to verify outputs. +The "Microbial variant calling workflow" example at the beginning of this chapter demonstrates some assertions one can make about the contents of result files. +Some additional examples of inexact file comparisons taken from an artificial test case in the Planemo test suite are shown below. .. literalinclude:: example_assertions.yml :language: yaml +Currently, the full list of available assertions is only documented as part of the `Galaxy Tool XML format `__ definition in the section on `asserting the contents of Galaxy tool outputs `__, but it should be fairly easy to translate this XML syntax into the YAML format above. + + Engines for Testing --------------------- @@ -333,7 +329,7 @@ doesn't need to exist, but it is used to find ``wf11-remote.gxwf-test.yml``. Galaxy Testing Template ------------------------- -The following a script that can be used with `continuous integration`_ (CI) services such +The following is a script that can be used with `continuous integration`_ (CI) services such Travis_ to test Galaxy workflows in a Github repository. This shell script can be configured via various environment variables and shows off some of the modalities Planemo ``test`` should work in (there may be bugs but we are trying to stablize this functionality).