- branch does NOT work
…d main nextflow files to integrate in the pipeline
… as in modules main.nf
…ig , corrected process
Tested that the branch works in this Seqera run.
Note that, for the groups to be plotted, they should be specified in Seqera like this:
When groups are set up in any other format, only the first group is used for comparisons (in this case "BLADDER_LOCATION"):
Checked in this run.
m-huertasp left a comment
Hi!! I think this is quite a nice job! As we discussed, there are a few things that might need to be changed. I have added some comments so that those points are easier to check.
Good job!
Co-authored-by: Marta Huertas <97596516+m-huertasp@users.noreply.github.com>
…s parameter, and the output of a pdf with all plots per group. Also added a condition so this module can only run when user defines a groups list
bin/depth_group_comparison.py
@click.command()
@click.option('--table-filename', required=True, type=click.Path(exists=True), help='Input features table file')
@click.option('--depth-table', required=True, type=click.Path(exists=True), help='Input depth table file')
@FerriolCalvet I couldn't find an output file from consensus exons depth that contains the "ALL_GENES" value per sample, which I would need to plot the all-genes value per sample. These are the other output files; I used average_depth_gene_sample:
average_depth_sample = DEPTHSSUMMARY.out.average_per_sample
average_depth_gene = DEPTHSSUMMARY.out.average_per_gene
average_depth_gene_sample = DEPTHSSUMMARY.out.average_per_gene_sample
So the script uses average_depth_gene_sample to plot the 'all genes' value per group of samples, which is why the 'all genes' plots per group show more points than the per-gene plots per group. This approach could be error-prone, so I would try to fix it with an external file.
There was a problem hiding this comment.
There is no such ALL_GENES scenario; the information in average_depth_sample is what you would expect to have for ALL_GENES.
Maybe you could:
- also take this file
- define a GENE column with the value ALL_GENES
- then concatenate this table to the table that you already have
This way you would have a correct ALL_GENES value that you could plot with the rest of the plots.
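The three steps suggested above can be sketched roughly as follows. This is only an illustration: it uses plain lists of dictionaries instead of the data frames the script works with, and the column names `SAMPLE_ID` and `MEAN_DEPTH` are assumptions, not the actual headers of the DEPTHSSUMMARY outputs.

```python
# Sketch only: SAMPLE_ID and MEAN_DEPTH are hypothetical column names.
def add_all_genes_rows(per_gene_sample, per_sample):
    """Return the per-gene rows plus one ALL_GENES row per sample."""
    # Tag every per-sample row with a GENE column set to ALL_GENES.
    all_genes_rows = [{**row, "GENE": "ALL_GENES"} for row in per_sample]
    # Concatenate with the existing per-gene, per-sample table.
    return per_gene_sample + all_genes_rows


# Toy stand-in for average_depth_gene_sample.
per_gene_sample = [
    {"SAMPLE_ID": "S1", "GENE": "TP53", "MEAN_DEPTH": 120.0},
    {"SAMPLE_ID": "S1", "GENE": "KRAS", "MEAN_DEPTH": 80.0},
    {"SAMPLE_ID": "S2", "GENE": "TP53", "MEAN_DEPTH": 90.0},
]
# Toy stand-in for average_depth_sample.
per_sample = [
    {"SAMPLE_ID": "S1", "MEAN_DEPTH": 100.0},
    {"SAMPLE_ID": "S2", "MEAN_DEPTH": 85.0},
]

combined = add_all_genes_rows(per_gene_sample, per_sample)
```

With this layout each sample contributes exactly one ALL_GENES row, so the 'all genes' plots would show one point per sample rather than one per gene and sample.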
…ALL_GENES values per group. Added new input in config for it. Also fixed bug in processing groups and added in modules.config a subdirectory for output
bin/depth_group_comparison.py
# Process groups
groups_of_interest_init = ast.literal_eval(groups) if groups else []
groups_of_interest = []
groups_of_interest = list(dict.fromkeys(item.strip() for sublist in groups_of_interest_init for item in sublist if item != ''))
There was a bug in this code, which I had to replace so that it handles different types of groups:
'[ ["BLADDER_LOCATION"], ["BLADDER_LOCATION", "SEX"], ["BLADDER_LOCATION", "SEX", "SMOKING_STATUS"] ]'
and
"[ ["Sample_Group"], ["cancer"], ["Age_onset"], ["Cancer_age_group"] , ["Bacterial_Signatures_identified"]]"
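For reference, a minimal sketch of one way to normalise both formats into a list of unique column names. It assumes that flattening to unique names in first-seen order is the intended behaviour; `parse_groups` is a hypothetical helper name, not a function from the script, and the handling of bare strings is an added assumption.

```python
import ast


def parse_groups(groups):
    """Flatten a groups string into unique column names, in first-seen order."""
    if not groups:
        return []
    parsed = ast.literal_eval(groups)
    names = []
    for entry in parsed:
        # Assumption: accept a bare string as well as a list of strings,
        # so a flat list is not iterated character by character.
        items = [entry] if isinstance(entry, str) else entry
        for item in items:
            item = item.strip()
            if item:
                names.append(item)
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(names))


nested = '[ ["BLADDER_LOCATION"], ["BLADDER_LOCATION", "SEX"], ["BLADDER_LOCATION", "SEX", "SMOKING_STATUS"] ]'
single = '[ ["Sample_Group"], ["cancer"], ["Age_onset"], ["Cancer_age_group"] , ["Bacterial_Signatures_identified"]]'

nested_groups = parse_groups(nested)
single_groups = parse_groups(single)
```

Checking the element type before iterating is the main design choice here: iterating directly over a string entry would yield single characters instead of a group name.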
# Process depth per sample to add the 'ALL_GENES' depth value per sample
print('Processing per sample depth table to add the ALL_GENES depth column: ')
depth_per_sample['GENE'] = 'ALL_GENES'
@FerriolCalvet If the user does not define the genes in the panel as custom genes, then with the way we now calculate the ALL_GENES depth, it will display the depths of both on-target and off-target genes in the panel, right? I mean that the default would not be the panel genes only, but would also include off-targets...
// you will have to add an extra parameters to the pipeline (nextflow.config) and then handle it here
publishDir = [
path: { "${params.outdir}/plots/depths_summary" },
path: { "${params.outdir}/plots/depths_summary/plots_per_group" },
Now added as a subdirectory specific to groups.
This branch adds a new module that analyses consensus exons depth for a defined group of samples and for a subset of genes (by default, the genes in the panel). It plots all these combinations in an output PDF that is stored in the same output directory as the depth plots (/depths/summary).