DataFrame Merge method throws exception when one of the DataFrames is empty. #7572

@sevenzees

System Information (please complete the following information):

  • OS & Version: All
  • ML.NET Version: 5.0.0
  • .NET Version: 8

Describe the bug
Calling the DataFrame Merge method when one of the DataFrames is empty (has no rows) throws a System.ArgumentOutOfRangeException.

To Reproduce
This code will reproduce the behavior:
using Microsoft.Data.Analysis;

// Left frame: four columns, zero rows.
DataFrame left = new DataFrame(
    new Int32DataFrameColumn("Index"),
    new Int32DataFrameColumn("L1"),
    new Int32DataFrameColumn("L2"),
    new StringDataFrameColumn("L3")
);
// Right frame: four columns, three rows.
DataFrame right = new DataFrame(
    new Int32DataFrameColumn("Index", new[] { 0, 1, 2 }),
    new Int32DataFrameColumn("R1", new[] { 0, 1, 1 }),
    new Int32DataFrameColumn("R2", new[] { 1, 1, 2 }),
    new StringDataFrameColumn("R3", new[] { "Z", "Y", "B" })
);
// Left join on L1 = R1; this call throws ArgumentOutOfRangeException.
DataFrame merged = left.Merge(right, ["L1"], ["R1"], joinAlgorithm: JoinAlgorithm.Left);

Expected behavior
The merge should return this empty DataFrame instead of throwing an ArgumentOutOfRangeException:
DataFrame expectedResult = new DataFrame(
    new Int32DataFrameColumn("Index_left"),
    new Int32DataFrameColumn("L1"),
    new Int32DataFrameColumn("L2"),
    new StringDataFrameColumn("L3"),
    new Int32DataFrameColumn("Index_right"),
    new Int32DataFrameColumn("R1"),
    new Int32DataFrameColumn("R2"),
    new StringDataFrameColumn("R3")
);
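
For anyone who wants to check a given build against the expected result, here is a minimal sketch of a console program that runs the same merge and prints the shape of the output. It is not part of the planned fix; the try/catch wrapper and the printed messages are only for illustration, and it assumes the Microsoft.Data.Analysis package is referenced.

```csharp
using System;
using Microsoft.Data.Analysis;

// Same inputs as the repro: an empty left frame and a three-row right frame.
DataFrame left = new DataFrame(
    new Int32DataFrameColumn("Index"),
    new Int32DataFrameColumn("L1"),
    new Int32DataFrameColumn("L2"),
    new StringDataFrameColumn("L3"));
DataFrame right = new DataFrame(
    new Int32DataFrameColumn("Index", new[] { 0, 1, 2 }),
    new Int32DataFrameColumn("R1", new[] { 0, 1, 1 }),
    new Int32DataFrameColumn("R2", new[] { 1, 1, 2 }),
    new StringDataFrameColumn("R3", new[] { "Z", "Y", "B" }));

try
{
    DataFrame merged = left.Merge(
        right, new[] { "L1" }, new[] { "R1" }, joinAlgorithm: JoinAlgorithm.Left);

    // Per the expected behavior above, this should report zero rows and
    // the eight columns Index_left, L1, L2, L3, Index_right, R1, R2, R3.
    Console.WriteLine($"rows={merged.Rows.Count}, columns={merged.Columns.Count}");
    foreach (DataFrameColumn column in merged.Columns)
    {
        Console.WriteLine(column.Name);
    }
}
catch (ArgumentOutOfRangeException ex)
{
    // The failure described in this issue.
    Console.WriteLine($"Merge threw {ex.GetType().Name}: {ex.Message}");
}
```

On an affected version the catch branch should run; once the merge handles empty inputs, the first branch should list the suffixed column names with no rows.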

Screenshots, Code, Sample Projects
N/A

Additional context
I have code ready to fix this and will create a PR.
