Stage modularity, round 2 #418

At the moment, pipelines are split into stages, but the handling of intermediate files is messy and pipelines are hard to modify. It would be better if:

- stages take generic parameters for input/output files, not predetermined file paths
- file paths are defined by the Pipeline object, or whatever the stages are being used by
- no wildcards - apart from date-stamped files in BCBio, we should know what every file will be called
  - this might make output_files.yaml redundant
- the check for whether a stage should run uses the presence of a reporting app stage as well as the presence of files, not instead of it
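
A rough sketch of how the points above could fit together. Everything here is hypothetical: the `Stage` dataclass, `build_stages`, the `completed_stages` set (standing in for whatever the reporting app has recorded) and the file names are illustrations, not existing code. It also assumes that "as well as" means a stage is skipped only when it is both recorded and its outputs are present.

```python
import os
from dataclasses import dataclass


@dataclass
class Stage:
    """Hypothetical lightweight stage: it knows its inputs and outputs, not the Dataset."""
    name: str
    input_files: tuple
    output_files: tuple

    def should_run(self, completed_stages):
        # Skip only if the stage has been recorded (e.g. by the reporting app)
        # AND every expected output file exists; otherwise run it.
        recorded = self.name in completed_stages
        outputs_present = all(os.path.isfile(f) for f in self.output_files)
        return not (recorded and outputs_present)


def build_stages(job_dir, sample_id):
    """Pipeline-level code: the only place where file paths are decided (no wildcards)."""
    fastq = os.path.join(job_dir, sample_id + '.fastq.gz')
    bam = os.path.join(job_dir, sample_id + '.bam')
    vcf = os.path.join(job_dir, sample_id + '.vcf.gz')
    return [
        Stage('align', (fastq,), (bam,)),
        Stage('call_variants', (bam,), (vcf,)),
    ]
```

Because each stage only receives paths, swapping or reordering stages would mean editing `build_stages` rather than the stages themselves.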

We should also be able to mock a dataset, patch `executor`, run a pipeline and assert what all the bash commands were.

We should also make Stage objects as lightweight as possible, ideally removing their access to Dataset - see #395 for reasons why.

This might be a good opportunity to look at [sciluigi](https://github.com/pharmbio/sciluigi)
