Massive chi-squared values that increase with iterations? #148

@nigiord

Hi everyone,

We’re currently testing COGAPS on scRNA-seq data from tumor samples with heterogeneous subclones. Our goal is to determine whether COGAPS can help identify patterns that could distinguish between these subclones.

To test this, we used a dataset of 1200 cells (randomly selected from a 6-subclone sample, with 200 cells per subclone) and ran COGAPS with varying parameters:

Input: 1200 x 22,927 (cells x genes)
nPatterns range: [2, 11]
nIterations range: [200, 150000]

library(CoGAPS)

# n_iterations and n_patterns are set per run from the ranges listed above
cogaps_params <- CogapsParams(
    nIterations = n_iterations,
    seed = 42,
    nPatterns = n_patterns,
    sparseOptimization = TRUE,
    distributed = "genome-wide"
)

However, we’ve encountered an unexpected issue: the mean $\chi^2$ values are extremely high (>130 million) and do not decrease steadily with more iterations; they can even increase, contrary to what we expected from the publication and the package documentation.

For example:

  • mean $\chi^2$ = 132,013,356 for nIterations = 500; nPatterns = 9
  • mean $\chi^2$ = 130,660,318 for nIterations = 50,000; nPatterns = 9
  • mean $\chi^2$ = 130,740,648 for nIterations = 150,000; nPatterns = 9

This raises concerns about the relevance of the detected patterns. Is this normal for COGAPS, or could there be an issue with our setup or parameters? We observe the same behavior on another dataset (5,239 cells x 16,611 genes).
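
To put the numbers above in perspective, here is a rough sketch in base R that rescales the reported values by the size of the input matrix and computes the relative change between runs. It assumes the reported mean $\chi^2$ is a sum over all matrix entries rather than a per-entry average; that is an assumption about how the statistic is defined, not something we have confirmed. The variable names are only for illustration.

```r
# Reported mean chi-squared values for nPatterns = 9, keyed by nIterations.
chi_sq <- c("500" = 132013356, "50000" = 130660318, "150000" = 130740648)

# Input dimensions from this post: 1200 cells x 22,927 genes.
n_entries <- 1200 * 22927

# Assumption: the statistic is summed over all entries of the data matrix,
# so dividing by the number of entries gives an average per-entry value.
chi_sq_per_entry <- chi_sq / n_entries
print(round(chi_sq_per_entry, 2))

# Relative change between consecutive runs, in percent
# (negative means the value went down with more iterations).
rel_change <- diff(chi_sq) / head(chi_sq, -1) * 100
print(round(rel_change, 3))
```

Under that assumption, the per-entry value stays a little under 5 in all three runs, and the changes between runs are on the order of 1% or less, so the question is as much about the absolute level as about the trend.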

Thanks for your help!

Cheers,
−Nils
