perf: O(N) disk I/O performance bottleneck in DICOM header extraction #28

**Description**
There is a significant performance bottleneck when processing large medical datasets. The `get_dicom_header()` function in `pyaslreport/main.py` currently attempts to parse every file in the target directory to build a list of valid DICOMs, and only then returns the header of the first one. For clinical datasets containing thousands of DICOM slices, disk I/O and memory use grow linearly with the number of files, even though only a single header is needed.

**Steps to Reproduce**
1. Trigger a report generation using a directory containing a large number of DICOM files (e.g., 3,000+ slices).
2. Monitor the execution time and memory usage.
3. Observe the long delay caused by the loop calling `pydicom.dcmread` on every file in the directory.
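For step 2, a small helper along these lines could capture wall-clock time and peak Python-level memory for a single call. This is only a sketch: the `measure` name is hypothetical and not part of pyaslreport, and `tracemalloc` only tracks allocations made through Python's allocator, so memory held by native code or the OS page cache is not counted.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float, int]:
    """Call func and return (result, elapsed_seconds, peak_traced_bytes)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - started
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak
```

It would be used as `header, seconds, peak_bytes = measure(get_dicom_header, ...)`, passing whatever arguments the function takes in your checkout.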

**Expected Behavior**
The function should return the `dcm_header` as soon as the first valid DICOM file parses successfully. When the first file is valid (the best case), this reduces the work from O(N) file reads to O(1); the cost stays O(N) only if no valid file is found.
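A minimal sketch of the early-return behaviour described here, not the actual patch. The function name `first_dicom_header`, the injectable `reader` parameter, the sorted file order, and returning `None` when nothing parses are all assumptions made for illustration; the default reader uses `pydicom.dcmread` with `stop_before_pixels=True` so pixel data is never loaded.

```python
import os
from typing import Any, Callable, Optional


def first_dicom_header(
    directory: str,
    reader: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Return the header of the first file in `directory` that parses, else None."""
    if reader is None:
        # Imported lazily so the sketch can be exercised with a stand-in reader.
        import pydicom

        def _read_header_only(path: str) -> Any:
            # stop_before_pixels skips pixel data, which a header lookup does not need.
            return pydicom.dcmread(path, stop_before_pixels=True)

        reader = _read_header_only

    # Sorting the names costs one directory listing but no file parsing.
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            # Return immediately: no later file is opened or parsed.
            return reader(path)
        except Exception:
            # Assumption: any parse failure means "not a usable DICOM file".
            # Real code would likely narrow this to pydicom.errors.InvalidDicomError.
            continue
    return None
```

Injecting the reader keeps the loop testable without real DICOM data; whether the real function should raise instead of returning `None` depends on how its callers handle an empty directory.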

**Actual Behavior**
The loop iterates through and parses every single file in the directory before returning.
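The pattern described here can be reconstructed roughly as follows. This is an illustration based only on the description in this issue, not a copy of `get_dicom_header()`; the name `eager_first_header` and the injected `reader` callable (standing in for `pydicom.dcmread`) are hypothetical.

```python
import os
from typing import Any, Callable, List


def eager_first_header(directory: str, reader: Callable[[str], Any]) -> Any:
    """Parse every file, collect the valid ones, then return only the first."""
    parsed: List[Any] = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            # Every file is read, so the cost grows with the directory size.
            parsed.append(reader(path))
        except Exception:
            # Assumption: files that fail to parse are silently skipped.
            continue
    # All but the first parsed result are discarded here.
    return parsed[0] if parsed else None
```

With N files in the directory, `reader` is called N times and up to N parsed results are held in memory at once, even though only `parsed[0]` is ever used.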

**Environment**
- OS: Ubuntu 24.04.4 LTS
- Python Version: 3.11
- Package: pyaslreport

**Additional Context**
I have already written an optimized fix for this that returns the header immediately and stops the loop. I will link a PR shortly. #29 
