[error]: sequence chr22 is not present #75

@tamuanand

Hi @readmanchiu @cmdcolin

Thanks for providing this great tool.

I was experimenting with the tool after a conda install of version 1.5.5.

When I run straglr.py --version, I get these warnings:

/usr/local/lib/python3.12/site-packages/src/ins.py:364: SyntaxWarning: invalid escape sequence '\d'
  clipped_start_regex = re.compile('^(\d+)[S|H]')
/usr/local/lib/python3.12/site-packages/src/ins.py:365: SyntaxWarning: invalid escape sequence '\d'
  clipped_end_regex = re.compile('(\d+)[S|H]$')
/usr/local/lib/python3.12/site-packages/src/tre.py:79: SyntaxWarning: invalid escape sequence '\d'
  m = re.search('(\d[\d\s]*\d)', self.trf_args)
1.5.5
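
For context, Python 3.12 reports an unrecognized escape such as '\d' inside a normal string literal as a SyntaxWarning (earlier versions used a DeprecationWarning), and the version number is still printed, so these warnings look cosmetic. A minimal sketch of how they could be silenced, assuming the only change needed is to write the patterns as raw strings; this is not a patch from the maintainers, and the CIGAR string is made up for illustration:

```python
import re

# Same patterns as in the warning output, written as raw strings (r'...')
# so Python does not try to interpret "\d" as a string escape sequence.
clipped_start_regex = re.compile(r'^(\d+)[S|H]')
clipped_end_regex = re.compile(r'(\d+)[S|H]$')

# Hypothetical CIGAR string: 12 soft-clipped bases at the start,
# 5 hard-clipped bases at the end.
cigar = '12S88M5H'

start = clipped_start_regex.search(cigar)
end = clipped_end_regex.search(cigar)
print(start.group(1), end.group(1))
```

The compiled patterns are identical to the originals; only the literal syntax changes, so matching behaviour should be unaffected.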

More importantly, I downloaded the test data files from https://github.com/bcgsc/straglr/tree/master/test (test.bam, test.bam.bai, test.fa.gz, test.fa.gz.fai, test.bed), and when I call straglr.py it keeps failing with the error message below. (Note: chr22 is present in both the FASTA and .fai files.)

Please let me know how I can go about debugging this.

multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pathos/helpers/mp_helper.py", line 15, in <lambda>
    func = lambda args: f(*args)
                        ^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/src/tre.py", line 1210, in get_alleles
    self.update_refs(variants, genome_fasta)
  File "/usr/local/lib/python3.12/site-packages/src/tre.py", line 1236, in update_refs
    ref_seq = genome_fasta.fetch(variant[0], variant[1], variant[2])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pysam/libcfaidx.pyx", line 301, in pysam.libcfaidx.FastaFile.fetch
KeyError: "sequence 'chr22' not present"
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/straglr.py", line 120, in <module>
    main()
  File "/usr/local/bin/straglr.py", line 112, in main
    variants = tre_finder.genotype(args.loci)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/src/tre.py", line 1424, in genotype
    return self.collect_alleles(loci)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/src/tre.py", line 1396, in collect_alleles
    batched_results = parallel_process(self.get_alleles, batches, self.nprocs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/src/utils.py", line 21, in parallel_process
    results = p.map(func, args)
              ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/pathos/multiprocessing.py", line 154, in map
    return _pool.map(star(f), zip(*args), **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/multiprocess/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value

This is the command I used to test:

straglr.py \
    test.bam \
    test.fa \
    "test_straglr" \
    --loci test.bed \
    --min_str_len 2 \
    --max_str_len 6 \
    --min_support 2 \
    --genotype_in_size 
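
One way to narrow down the KeyError is to compare the sequence names in the FASTA index with the chromosome names in the loci file. The sketch below uses only the standard library and assumes the usual tab-separated .fai layout (sequence name in the first column) and a BED file with the chromosome in the first column. The helper names and the example data are hypothetical, not part of straglr.

```python
def fai_names(fai_text):
    """Return the sequence names from the first column of a .fai index."""
    return {line.split('\t')[0] for line in fai_text.splitlines() if line.strip()}


def bed_chroms(bed_text):
    """Return the chromosome names from the first column of a BED file."""
    names = set()
    for line in bed_text.splitlines():
        # Skip blank lines and the optional BED header/comment lines.
        if not line.strip() or line.startswith(('#', 'track', 'browser')):
            continue
        names.add(line.split()[0])
    return names


def missing_from_index(fai_text, bed_text):
    """List BED chromosomes that have no matching entry in the .fai index."""
    return sorted(bed_chroms(bed_text) - fai_names(fai_text))


# Made-up data: the index names the sequence "22", the BED file says "chr22".
fai_example = "22\t50818468\t4\t60\t61\n"
bed_example = "chr22\t100\t200\tCAG\n"
print(missing_from_index(fai_example, bed_example))
```

To run this on the real files, read the loci file and the index with open(...).read() and pass both strings to missing_from_index. The command above passes test.fa while the downloaded index is test.fa.gz.fai, so it may be worth confirming which index file is actually being read for the FASTA given on the command line.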

Thanks in advance.
