plot depths per group #424

Open
efigb wants to merge 13 commits into dev from 145-enhance-the-analysis-of-depths
@efigb
Collaborator

@efigb efigb commented Feb 25, 2026

This branch adds a new module that analyzes consensus exon depth for a defined group of samples and for a subset of genes (by default, the genes in the panel). It plots all of these combinations in an output PDF, which is stored in the same output directory as the depth plots (/depths/summary).

@efigb
Collaborator Author

efigb commented Feb 25, 2026

Tested that the branch works in this Seqera run.

@efigb efigb added the new-feature New functionality being added. label Feb 25, 2026
@efigb
Collaborator Author

efigb commented Feb 25, 2026

Note that, to plot the groups, they should be specified in Seqera like this:

features_groups_list: "[ ["BLADDER_LOCATION"], ["SEX"], ["SMOKING_STATUS"] ]"

When the groups are set up in another format, only the first group is used for comparisons (in this case "BLADDER_LOCATION"):

features_groups_list: "[ ["BLADDER_LOCATION"], ["BLADDER_LOCATION", "SEX"], ["BLADDER_LOCATION", "SEX", "SMOKING_STATUS"] ]"

Checked in this run

@m-huertasp m-huertasp self-requested a review February 25, 2026 14:31
Collaborator

@m-huertasp m-huertasp left a comment

Hi!! I think this is quite a nice job! As we discussed, there are a few things that might need to be changed. I have added some comments so that it is easier to check the things we talked about.

Good job!

efigb and others added 2 commits February 25, 2026 16:41
Co-authored-by: Marta Huertas <97596516+m-huertasp@users.noreply.github.com>
…s parameter, and the output of a pdf with all plots per group. Also added a condition so this module can only run when user defines a groups list

@click.command()
@click.option('--table-filename', required=True, type=click.Path(exists=True), help='Input features table file')
@click.option('--depth-table', required=True, type=click.Path(exists=True), help='Input depth table file')
Collaborator Author

@FerriolCalvet I couldn't find an output file from the consensus exons depth step that contains an "ALL_GENES" value per sample that I could use to plot the all-genes value per sample. These are the other output files; I used average_depth_gene_sample:

    average_depth_sample        = DEPTHSSUMMARY.out.average_per_sample
    average_depth_gene          = DEPTHSSUMMARY.out.average_per_gene
    average_depth_gene_sample   = DEPTHSSUMMARY.out.average_per_gene_sample

So the script uses average_depth_gene_sample to plot the 'all genes' value per group of samples, which is why the 'all genes' plots per group show more points than the per-gene plots per group. The way it handles this could be error-prone, so I would try to fix it with an external file.

Collaborator

There is no such ALL_GENES scenario; the information in average_depth_sample is what you would expect to have for ALL_GENES.
Maybe you could:

  1. also take this file
  2. define a GENE column with the value ALL_GENES
  3. then concatenate this table to the table that you already have

This way you would have a correct ALL_GENES value that you could plot along with the rest of the plots.
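
The three steps suggested here could look roughly like the following pandas sketch. This is only an illustration under assumptions: the column names SAMPLE_ID and MEAN_DEPTH, the gene names, the depth values, and the helper add_all_genes_rows are all made up for the example and are not taken from the pipeline's actual tables.

```python
import pandas as pd

# Hypothetical stand-in for average_depth_gene_sample (one row per sample and gene).
# Column names and values are invented for illustration only.
depth_gene_sample = pd.DataFrame({
    "SAMPLE_ID": ["S1", "S1", "S2", "S2"],
    "GENE": ["TP53", "KMT2D", "TP53", "KMT2D"],
    "MEAN_DEPTH": [1200.0, 950.0, 1100.0, 870.0],
})

# Hypothetical stand-in for average_depth_sample (one row per sample).
depth_per_sample = pd.DataFrame({
    "SAMPLE_ID": ["S1", "S2"],
    "MEAN_DEPTH": [1010.0, 940.0],
})


def add_all_genes_rows(per_gene_sample, per_sample):
    """Label the per-sample rows as ALL_GENES and append them to the per-gene table."""
    # assign() returns a copy, so the per-sample input table is not modified.
    labelled = per_sample.assign(GENE="ALL_GENES")
    # Columns are aligned by name, so their order in the two tables may differ.
    return pd.concat([per_gene_sample, labelled], ignore_index=True)


combined = add_all_genes_rows(depth_gene_sample, depth_per_sample)
print(combined)
```

With this layout, ALL_GENES behaves like any other value in the GENE column, so the same per-group plotting code can handle it, and each sample contributes exactly one ALL_GENES point.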

@efigb efigb self-assigned this Feb 25, 2026
…ALL_GENES values per group. Added new input in config for it. Also fixed bug in processing groups and added in modules.config a subdirectory for output
# Process groups
groups_of_interest_init = ast.literal_eval(groups) if groups else []
groups_of_interest = []
groups_of_interest = list(dict.fromkeys(item.strip() for sublist in groups_of_interest_init for item in sublist if item != ''))
Collaborator Author

There was a bug in this code, which I had to replace in order to handle different types of groups:

'[ ["BLADDER_LOCATION"], ["BLADDER_LOCATION", "SEX"], ["BLADDER_LOCATION", "SEX", "SMOKING_STATUS"] ]'

and

"[ ["Sample_Group"], ["cancer"], ["Age_onset"], ["Cancer_age_group"] , ["Bacterial_Signatures_identified"]]"
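
As a rough, standalone sketch of how both formats can be reduced to one ordered list of unique group names, assuming the value reaches the script as the inner list-of-lists string: the function name flatten_groups is hypothetical, and unlike the line in the diff it drops empty names after stripping whitespace rather than before.

```python
import ast


def flatten_groups(groups):
    """Return the unique group names from a list-of-lists string, in first-seen order."""
    # literal_eval only parses Python literals, so it is safer than eval for user input.
    nested = ast.literal_eval(groups) if groups else []
    names = (item.strip() for sublist in nested for item in sublist)
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(name for name in names if name))


nested_format = (
    '[ ["BLADDER_LOCATION"], ["BLADDER_LOCATION", "SEX"], '
    '["BLADDER_LOCATION", "SEX", "SMOKING_STATUS"] ]'
)
flat_format = '[ ["BLADDER_LOCATION"], ["SEX"], ["SMOKING_STATUS"] ]'

print(flatten_groups(nested_format))
print(flatten_groups(flat_format))
```

Both example strings yield the same three group names, so the plotting step does not depend on which format was used.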


# Process depth per sample to add the 'ALL_GENES' depth value per sample
print('Processing per sample depth table to add the ALL_GENES depth column: ')
depth_per_sample['GENE'] = 'ALL_GENES'
Collaborator Author

@FerriolCalvet if the user does not define the genes in the panel as custom genes, then with the way we now calculate the ALL_GENES depth, it will display the depths from both on-target and off-target genes in the panel, right? I mean that the default would not be just the panel genes, but would also include off-targets...

// you will have to add an extra parameters to the pipeline (nextflow.config) and then handle it here
publishDir = [
path: { "${params.outdir}/plots/depths_summary" },
path: { "${params.outdir}/plots/depths_summary/plots_per_group" },
Collaborator Author

Now added as a subdirectory specific to groups.

3 participants