I've added fixes for tests enabled with training. Although the tests are for training, a lot of the fixes are actually in common code.
c7d2c7f to 3aa7f73
@tianleiwu @amarin16 @baijumeswani, could you please take a look?
Pull request overview
This PR addresses big-endian (s390x) correctness issues in ORT by standardizing raw tensor data handling (writing/reading in little-endian form where required) and adjusting affected tests/build wiring so the test suite passes without training enabled.
Changes:
- Replace direct `TensorProto::set_raw_data(...)` usage with `onnxruntime::utils::SetRawDataInTensorProto(...)` across multiple components/tests to centralize endianness handling.
- Improve test robustness on big-endian by unpacking tensor proto data via ORT utilities (e.g., `UnpackTensor`, `ConvertRawDataInTensorProto`) instead of `memcpy`/reinterpretation.
- Update ORT-format/flatbuffer initializer handling and unit-test build configuration to support big-endian scenarios and non-training builds.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| winml/adapter/winml_adapter_model.cpp | Use ORT helper to set TensorProto raw data with endian-awareness. |
| orttraining/orttraining/training_api/checkpoint.cc | Remove little-endian-only guard for training checkpoints. |
| orttraining/orttraining/test/graph/optimizer_graph_builder_test.cc | Read raw_data via UnpackTensor to be endian-correct. |
| orttraining/orttraining/core/optimizer/shape_optimizer.cc | Use endian-aware raw data setter for constant initializers. |
| orttraining/orttraining/core/optimizer/megatron_transformer.cc | Use endian-aware raw data setter for partitioned initializers. |
| orttraining/orttraining/core/optimizer/conv1d_replacement.cc | Use endian-aware raw data setter for initializer creation. |
| orttraining/orttraining/core/framework/checkpointing.cc | Remove big-endian “not implemented” restriction in checkpoint saving path. |
| onnxruntime/test/providers/nv_tensorrt_rtx/test_nv_trt_rtx_ep_util.cc | Use endian-aware raw data setter in test model building utilities. |
| onnxruntime/test/framework/sparse_kernels_test.cc | Convert/check raw_data in a big-endian-safe way (copy-by-value + conversion). |
| onnxruntime/test/framework/int2_test.cc | Use endian-aware raw data setter in Int2 round-trip test. |
| onnxruntime/test/framework/endian_test.cc | Use endian-aware raw data setter and add stronger assertions about conversion effects. |
| onnxruntime/test/flatbuffers/flatbuffer_utils_test.cc | Remove manual conversion now handled elsewhere. |
| onnxruntime/core/optimizer/qdq_transformer/where_dummy_dq.cc | Use endian-aware raw data setter for dummy initializer scalars. |
| onnxruntime/core/graph/graph_flatbuffers_utils.cc | Add big-endian handling for saving/loading ORT-format initializers and tensor dims. |
| onnxruntime/core/graph/graph.cc | Remove prior sparse-constant endian workaround; use endian-aware raw data setter for editor API. |
| onnxruntime/core/framework/tensorprotoutils.cc | Enhance raw-data setter and conversion logic; ensure sparse raw bytes are little-endian. |
| onnxruntime/core/framework/data_transfer_utils.h | Add byte-swapping to CopyTensorDataToByteSpan on big-endian. |
| cmake/onnxruntime_unittests.cmake | Ensure endian_utils is linked into tests when training is disabled. |
```cpp
if constexpr (endian::native != endian::little) {
  gsl::span<std::byte> bytes_span(reinterpret_cast<std::byte*>(dst_span.data()), dst_span.size_bytes());
  onnxruntime::utils::SwapByteOrderInplace(src_tensor.DataType()->Size(), bytes_span);
}
```
CopyTensorDataToByteSpan now unconditionally byte-swaps the destination buffer on big-endian hosts. This helper is used outside of checkpoint/serialization codepaths (e.g., copying device tensors into a NumPy array in the Python bindings), where callers expect a pure byte-for-byte copy in native format. Please avoid changing CopyTensorDataToByteSpan semantics globally; instead move the endian conversion to the specific serialization/checkpointing call sites that require little-endian on-disk representation, or add an explicit opt-in parameter/helper for "copy as little-endian".
Looks like a correct catch, I'll rework this part.
```cpp
auto be_copy{initializer};
onnxruntime::utils::ConvertRawDataInTensorProto(be_copy);
// If data is external, then call to ConvertRawDataInTensorProto will not properly byteswap it.
// A conversion using TensorProtoWithExternalDataToTensorProto does it.
// Do it after call to ConvertRawDataInTensorProto to ensure all data is byteswapped.
if (onnxruntime::utils::HasExternalData(be_copy)) {
  auto be_copy_external_data{be_copy};
  ORT_RETURN_IF_ERROR(onnxruntime::utils::TensorProtoWithExternalDataToTensorProto(be_copy_external_data, {}, be_copy));
}
ORT_RETURN_IF_ERROR(onnxruntime::utils::UnpackInitializerData(be_copy, model_path, unpacked_tensor));
```
This external-data handling in `SaveInitializerOrtFormat` has two issues on big-endian hosts:
- It passes `{}` into `TensorProtoWithExternalDataToTensorProto`, but that value is forwarded as the `tensor_proto_dir` for `ReadExternalDataForTensor`, so external data may be resolved relative to the current working directory instead of the model directory.
- After loading external bytes into `be_copy.raw_data`, the later `UnpackInitializerData(be_copy, ...)` call will invoke `ReadLittleEndian` and convert the bytes to native endianness before writing them into the flatbuffer, but ORT format `raw_data` is expected to be little-endian (`LoadOrtTensorOrtFormat` unpacks it with `ReadLittleEndian`).

Consider reading external bytes directly into `unpacked_tensor` (no `UnpackTensor` conversion) and using the correct directory derived from `model_path` when resolving the external data.
Suggested change:

```diff
- auto be_copy{initializer};
- onnxruntime::utils::ConvertRawDataInTensorProto(be_copy);
- // If data is external, then call to ConvertRawDataInTensorProto will not properly byteswap it.
- // A conversion using TensorProtoWithExternalDataToTensorProto does it.
- // Do it after call to ConvertRawDataInTensorProto to ensure all data is byteswapped.
- if (onnxruntime::utils::HasExternalData(be_copy)) {
-   auto be_copy_external_data{be_copy};
-   ORT_RETURN_IF_ERROR(onnxruntime::utils::TensorProtoWithExternalDataToTensorProto(be_copy_external_data, {}, be_copy));
- }
- ORT_RETURN_IF_ERROR(onnxruntime::utils::UnpackInitializerData(be_copy, model_path, unpacked_tensor));
+ // On big-endian hosts we must be careful to preserve the little-endian layout
+ // required by ORT format raw_data.
+ if (onnxruntime::utils::HasExternalData(initializer)) {
+   // Resolve external data relative to the model directory and read the bytes
+   // directly into unpacked_tensor without any endian conversion.
+   auto tensor_with_external{initializer};
+   auto tensor_with_data{initializer};
+   const auto model_dir = model_path.parent_path().string();
+   ORT_RETURN_IF_ERROR(
+       onnxruntime::utils::TensorProtoWithExternalDataToTensorProto(tensor_with_external, model_dir, tensor_with_data));
+   unpacked_tensor.assign(tensor_with_data.raw_data().begin(), tensor_with_data.raw_data().end());
+ } else {
+   auto be_copy{initializer};
+   onnxruntime::utils::ConvertRawDataInTensorProto(be_copy);
+   ORT_RETURN_IF_ERROR(onnxruntime::utils::UnpackInitializerData(be_copy, model_path, unpacked_tensor));
+ }
```
This part is reworked now
Thanks for the review, I'll rework that change. It is likely that byteswapping is excessive in that place but missing in some other place.
3aa7f73 to 067dd42
Build command: `./build.sh --config Debug --parallel 0 --enable_pybind --build_wheel --allow_running_as_root`
Later this data is narrowed: `*p_data++ = static_cast<T>(*data_iter);`

If, for example, BE `int32_t` data is 4 bytes `0x00 0x00 0x00 0x01`, then after byteswapping it becomes `0x01 0x00 0x00 0x00`, and after narrowing to `int16_t` the two rightmost bytes are used on big endian, so the result is `0x00 0x00`. If instead we byteswap it as two shorts, the byteswapping result is `0x00 0x00 0x01 0x00` and the narrowing result is `0x01 0x00`, which is the correct LE representation of that number.

This change fixes the following test on s390x: FlatbufferUtilsTest.ExternalWriteReadWithLoadInitializers
Raw data is expected to be in LE. This change fixes tests:
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices64
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices32
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices16
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices8
SparseTensorConversionTests.SparseTensorProtoToDense_Rank2Indices_COO
This change fixes tests:
OrtModelOnlyTests.SparseInitializerHandling
SparseTensorConversionTests.TestConstantNodeConversion

This change fixes the following tests on s390x:
ExecutionFrameTestInit.SparseInitializerAsOutput
CApiTest.SparseOutputModel
This change will allow assessing and fixing big-endian-specific issues in training-related code.
This change fixes approximately 40 tests.
This change fixes test CheckpointingTest.SaveAndLoad on s390x.
…ronTransformer class

This change fixes the following tests on s390x:
GraphTransformationTests.MegatronMLPPartitionRank0
GraphTransformationTests.MegatronMLPPartitionRank1
GraphTransformationTests.MegatronSelfAttentionPartitionRank0
GraphTransformationTests.MegatronSelfAttentionPartitionRank1
…roto

This should fix a lot of potential endianness issues on s390x.
This change fixes the following tests on s390x:
OptimizerGraphBuilderTest.LoadOptimState_FullPrecision_Adam
OptimizerGraphBuilderTest.LoadOptimState_FullPrecision_Lamb
Memory data is in native endian format, while on-disk data should already be in little-endian format. Move part of the ConvertRawDataInTensorProto function out into a separate one for convenience. This change fixes the test OrtModelOnlyTests.ValidateOrtFormatModelDoesNotRunOptimizersInFullBuild on s390x.
This change fixes the following tests on s390x:
SaveWithExternalInitializers.Mnist
SaveWithExternalInitializers.ModelWithOriginalExternalData
SaveWithExternalInitializers.ModelWithOriginalExternalDataAlignOffset
…unction

ReadExternalDataForTensor already returns data in native endian format. Also remove the ConvertEndianessForVector function: it does a const_cast and unexpectedly modifies the original data. When needed, use WriteLittleEndian instead. These changes fix the following tests on s390x:
TensorProtoUtilsTest.UnpackTensorWithExternalData
TensorProtoUtilsTest.ConstantTensorProtoWithExternalData
…onstantNodeConversion test
067dd42 to b8042be
Your finding is entirely correct if external data is in a file. However, in-memory external data is actually in native endian format, i.e. big endian on big-endian systems (see onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc, lines 1803 to 1815 at 65fb61b). It seems like a bad idea to byteswap this memory data in advance, so I've added byteswapping of data read from file after reading it. It also revealed a couple of additional byteswapping issues, which I also investigated and fixed.
tianleiwu left a comment
GetElementSizeOfTensor Extraction (tensorprotoutils.cc)

⚠️ Missing STRING type returns 0, propagated to ReadLittleEndian without guard: `GetElementSizeOfTensor` returns 0 for unknown/unsupported types (e.g., STRING). Most callers guard with `if (element_size > 1)`, but in `ReadExternalDataForTensor`, on the `kTensorProtoLittleEndianMemoryAddressTag` path, `ReadLittleEndian(element_size, src, dst)` is called unconditionally. If `element_size` is 0, `CopyLittleEndian` receives 0 for `element_size_in_bytes`, which violates its documented precondition ("element_size_in_bytes should be greater than zero"). In practice this path is unlikely for STRING tensors (strings don't use raw external data), but a defensive check would be prudent:

```cpp
size_t element_size = GetElementSizeOfTensor(...);
ORT_RETURN_IF(element_size == 0, "Unsupported data type for endian conversion");
```
GetExtDataFromTensorProto — WriteLittleEndian Misuse (tensorprotoutils.cc)

⚠️ `WriteLittleEndian` used where `ReadLittleEndian` is semantically correct: In the `kTensorProtoLittleEndianMemoryAddressTag` branch of `GetExtDataFromTensorProto`, the source buffer (`ext_data_buf`) is in little-endian format and the destination (`native_data`) should be in native endian format. This is a "read from LE" operation, so `ReadLittleEndian` should be used. The code instead calls `WriteLittleEndian(element_size, src_span, dst_span)`, which is documented as "writes to a little-endian destination." Both functions are functionally identical (byte swap is self-inverse), so there is no runtime bug, but the naming is misleading and inconsistent with the DML provider code, which correctly uses `ReadLittleEndian` in the same scenario.

```cpp
// Current (misleading):
ORT_RETURN_IF_ERROR(onnxruntime::utils::WriteLittleEndian(element_size, src_span, dst_span));
// Should be:
ORT_RETURN_IF_ERROR(onnxruntime::utils::ReadLittleEndian(element_size, src_span, dst_span));
```
Save Path Byte-Swapping (graph.cc, graph_flatbuffers_utils.cc)

⚠️ Duplicated byte-swap boilerplate in `SaveOrtTensorOrtFormat`: The inline-data and external-data-writer paths in `SaveOrtTensorOrtFormat` both contain nearly identical byte-swap logic (allocate vector, compute element_size, WriteLittleEndian). Consider extracting it into a local lambda or helper to reduce duplication. This is a readability concern, not a correctness issue.
Summary of Concerns
| # | Severity | Component | Issue |
|---|---|---|---|
| 1 | Suggestion | `GetElementSizeOfTensor` | Returns 0 for unknown types; no guard before `ReadLittleEndian` in the `kTensorProtoLittleEndianMemoryAddressTag` path of `ReadExternalDataForTensor`. |
| 2 | Suggestion | `GetExtDataFromTensorProto` | `WriteLittleEndian` used where `ReadLittleEndian` is semantically correct (LE source → native dest). Functionally identical but misleading. |
| 3 | Nitpick | `SaveOrtTensorOrtFormat` | Duplicated byte-swap boilerplate between inline and external data writer paths could be extracted. |
Verdict
APPROVE — The PR correctly addresses a systemic class of big-endian bugs across the entire data serialization stack. The core architectural decisions (tag split, centralized SetRawDataInTensorProto conversion, ReadExternalDataForTensor returning native endian) are sound. The ConvertRawDataInTensorProto fix for sub-word types stored in wider protobuf fields is critical and correct. The two suggestion-level concerns (missing guard for element_size=0 and WriteLittleEndian/ReadLittleEndian naming mismatch) are worth fixing but are not blockers — the former is unreachable in practice and the latter is a semantic-only issue with no runtime impact.
e2becb5 to 7e1162b
7e1162b to 38983f1
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Head branch was pushed to by a user without write access
I've added a change to include endian headers in a couple of files, due to a failing pipeline indicating this issue in one of the build configurations.
Edit: failing pipeline: https://github.com/microsoft/onnxruntime/actions/runs/23539560419
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
@AlekseiNikiforovIBM, there are still build errors in DirectML builds (please also check the other failed builds):
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Head branch was pushed to by a user without write access
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Description
This PR contains fixes to various big endian support issues in onnxruntime, both in libraries and tests.
Motivation and Context
Currently, some tests from the onnxruntime test suite fail on big-endian systems.
This change fixes all tests from the onnxruntime test suite when it is built without training support.
It also includes a fix for a linking issue.
The following tests are fixed on s390x:
OrtModelOnlyTests.ValidateOrtFormatModelDoesNotRunOptimizersInFullBuild
FlatbufferUtilsTest.ExternalWriteReadWithLoadInitializers
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices64
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices32
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices16
SparseTensorConversionTests.SparseTensorProtoToDense_Rank1Indices8
SparseTensorConversionTests.SparseTensorProtoToDense_Rank2Indices_COO
SparseTensorConversionTests.TestConstantNodeConversion
OrtModelOnlyTests.SparseInitializerHandling
SparseTensorConversionTests.TestDenseToSparseConversion
ExecutionFrameTestInit.SparseInitializerAsOutput
CApiTest.SparseOutputModel