When testing the flux model on multiple CPU architectures and collecting traces with DFTRACER, approximately 80% of the time needed to get the next sample with ParquetReader is spent in deserialization rather than in I/O.
For UNET3D with the NPZ reader, this share was no higher than 20-30%.
The CPU architecture also seems to have a considerable impact on the deserialization time.
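To reproduce this kind of split outside of DFTRACER, the fetch and the decode can be timed separately for a single sample. The following is a minimal sketch, not how ParquetReader is implemented: `time_read_and_decode`, `read_bytes`, and `decode` are hypothetical names, and the payload is an in-memory compressed JSON blob used only as a stand-in for a Parquet file.

```python
import io
import json
import time
import zlib


def time_read_and_decode(read_bytes, decode):
    """Return (io_seconds, decode_seconds, decode_share) for one sample."""
    t0 = time.perf_counter()
    raw = read_bytes()   # storage-bound step: fetch the raw bytes
    t1 = time.perf_counter()
    decode(raw)          # CPU-bound step: turn the bytes into a sample
    t2 = time.perf_counter()
    io_s, decode_s = t1 - t0, t2 - t1
    total = io_s + decode_s
    share = decode_s / total if total > 0 else 0.0
    return io_s, decode_s, share


# Stand-in payload (assumption): compressed JSON in memory, NOT a real Parquet file.
payload = zlib.compress(json.dumps(list(range(100_000))).encode())
buffer = io.BytesIO(payload)


def read_bytes():
    buffer.seek(0)
    return buffer.read()


def decode(raw):
    return json.loads(zlib.decompress(raw))


io_s, decode_s, share = time_read_and_decode(read_bytes, decode)
print(f"I/O: {io_s:.6f}s, decode: {decode_s:.6f}s, decode share: {share:.0%}")
```

For an actual Parquet file, `read_bytes` could read the file from storage and `decode` could call `pyarrow.parquet.read_table(pyarrow.BufferReader(raw))`. Running the same script on each CPU architecture would then show how much of the difference comes from the decode step alone.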
This may skew the calculated Accelerator Utilization %, which would then be mainly influenced by factors not strictly related to storage performance.
This could potentially affect other models that use ParquetReader.
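The effect on Accelerator Utilization can be illustrated with a small calculation. This is a rough sketch under two assumptions that are not taken from the traces: AU is the ratio of compute time to total step time, and sample loading does not overlap with compute (real runs prefetch, so the actual effect would be smaller). All timings are made-up values, and `accelerator_utilization` is a hypothetical helper.

```python
def accelerator_utilization(compute_s, io_s, decode_share):
    """AU % for one step, assuming loading is NOT overlapped with compute.

    decode_share is the fraction of sample-loading time spent in
    deserialization, so loading time = io_s / (1 - decode_share).
    """
    decode_s = io_s * decode_share / (1.0 - decode_share)
    total_s = compute_s + io_s + decode_s
    return 100.0 * compute_s / total_s


# Hypothetical per-step timings in seconds; storage I/O is identical in both cases.
compute_s = 1.0
io_s = 0.05

au_npz_like = accelerator_utilization(compute_s, io_s, decode_share=0.25)
au_parquet_like = accelerator_utilization(compute_s, io_s, decode_share=0.80)

print(f"decode share 25%: AU = {au_npz_like:.2f}%")
print(f"decode share 80%: AU = {au_parquet_like:.2f}%")
```

With these numbers the AU drops from about 94% to about 80% even though the storage I/O time is unchanged, which is the concern raised above: the difference comes entirely from CPU-side deserialization.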