Running copyKAT on large datasets #138

@ManarRashad

Dear all,
I am trying to integrate scRNA-seq datasets (2 million cells). For the copy number variation step I am using copyKAT, which, as you know, can take quite a long time to run, especially when it is run per dataset.
I started running it per sample, as recommended in the tutorial. I was also wondering whether it would be acceptable to subsample cells after clustering and then use transfer learning for label transfer to speed up the process.
That is, the cells are first clustered, a representative subset of cells from each cluster is selected for copyKAT analysis, and the resulting copy number labels are then transferred from the annotated cells to the remaining cells.
Alternatively, would it be better to apply Metacell-2 to aggregate cells, reducing noise and making millions of data points manageable, before running copyKAT?

Which of these approaches do you think would speed up the process? If you have any other suggestions, please let me know.
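
To make the two ideas concrete, here is a rough sketch of what I have in mind, written in plain Python only to show the logic. It is not copyKAT or Metacell-2 code: the function names, the `unassigned` fallback label, and the toy data are all hypothetical, the copyKAT run itself is replaced by a placeholder, and the aggregation step is a simple per-group sum standing in for a real metacell method.

```python
import random
from collections import Counter, defaultdict


def stratified_subsample(clusters, per_cluster, seed=0):
    """Pick up to `per_cluster` cell IDs from every cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for cell, cluster in clusters.items():
        by_cluster[cluster].append(cell)
    chosen = []
    for cluster in sorted(by_cluster):
        cells = sorted(by_cluster[cluster])
        chosen.extend(rng.sample(cells, min(per_cluster, len(cells))))
    return chosen


def transfer_by_cluster_vote(clusters, known_labels):
    """Give every cell the majority label of the annotated cells in its cluster."""
    votes = defaultdict(Counter)
    for cell, label in known_labels.items():
        votes[clusters[cell]][label] += 1
    majority = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Clusters with no annotated cell get a hypothetical fallback label.
    return {cell: majority.get(cluster, "unassigned")
            for cell, cluster in clusters.items()}


def aggregate_counts(counts, groups):
    """Sum per-gene counts over the cells of each group (crude pseudobulk)."""
    summed = {}
    for cell, group in groups.items():
        total = summed.setdefault(group, [0] * len(counts[cell]))
        for i, value in enumerate(counts[cell]):
            total[i] += value
    return summed


# Toy data: 30 cells spread evenly over 3 clusters.
clusters = {f"cell{i}": f"c{i % 3}" for i in range(30)}
subset = stratified_subsample(clusters, per_cluster=4)

# Placeholder for the real copyKAT run on `subset` (labels are made up here).
subset_labels = {cell: "aneuploid" if clusters[cell] == "c0" else "diploid"
                 for cell in subset}
all_labels = transfer_by_cluster_vote(clusters, subset_labels)

# Toy aggregation: cells "a" and "b" form one group, "c" another.
pseudobulk = aggregate_counts({"a": [1, 2], "b": [3, 4], "c": [5, 6]},
                              {"a": "g1", "b": "g1", "c": "g2"})
```

One concern with the first approach is that a cluster-level vote assumes each expression cluster is homogeneous in copy number, so subclones that share a cluster would be merged into a single label.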
