add UUID as new entry value in the add_entries processor #6653

Merged
dlvenable merged 3 commits into opensearch-project:main from Zhangxunmt:feature/add_uniqueId
Mar 23, 2026

@Zhangxunmt
Contributor

@Zhangxunmt Zhangxunmt commented Mar 18, 2026

Description

Added a new entry in the add_entries processor that generates a unique ID for each record. This is needed for cases where the original source data does not contain a unique identifier. The unique ID is essential for running asynchronous batch inference jobs, as it is used to match and merge the inference results back with the source data.

UUID.randomUUID() generates a version 4 (random) UUID using a cryptographically strong pseudo-random number generator (Java's SecureRandom). With 122 random bits per ID, the probability of a collision is negligible at realistic record volumes, which is why random UUIDs are widely used as identifiers in distributed systems.
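
To make the collision argument a bit more concrete, here is a minimal, standalone sketch (not part of this PR). The class and method names are made up for illustration, and the probability uses the usual birthday-bound approximation, which only holds while the result is small.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class UuidV4Demo {
    // A version 4 UUID has 128 bits, of which 6 are fixed version/variant bits.
    static final int RANDOM_BITS = 122;

    // Birthday-bound approximation: p ~ n^2 / 2^(bits + 1) for n generated IDs.
    static double approxCollisionProbability(final double n) {
        return (n * n) / Math.pow(2.0, RANDOM_BITS + 1);
    }

    // Generates `count` random UUIDs and returns how many distinct values were seen.
    static int countDistinct(final int count) {
        final Set<String> seen = new HashSet<>();
        for (int i = 0; i < count; i++) {
            seen.add(UUID.randomUUID().toString());
        }
        return seen.size();
    }

    public static void main(final String[] args) {
        final UUID id = UUID.randomUUID();
        // randomUUID() yields version 4 with the IETF variant (reported as 2).
        System.out.println(id + " version=" + id.version() + " variant=" + id.variant());
        System.out.println("distinct values out of 100000: " + countDistinct(100_000));
        System.out.println("approx. collision probability for 1e9 IDs: "
                + approxCollisionProbability(1e9));
    }
}
```

The empirical count is only a sanity check; the approximation is what supports the claim for large volumes.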

Usage

  processor:
    - add_entries:
        entries:
          - key: recordId
            value_expression: 'generateUuid()'
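
Once each record carries a recordId as configured above, matching batch inference output back to its source could look roughly like the following. This is an illustrative sketch only: the class name, the map-based record shape, and the "text" and "prediction" fields are assumptions, not something defined by this PR or by Data Prepper.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeByRecordId {
    // Joins inference results onto source records that share the same "recordId".
    // Results without a matching source record are ignored in this sketch.
    static Map<String, Map<String, Object>> merge(
            final List<Map<String, Object>> sourceRecords,
            final List<Map<String, Object>> inferenceResults) {
        final Map<String, Map<String, Object>> merged = new LinkedHashMap<>();
        for (final Map<String, Object> source : sourceRecords) {
            // Copy each source record so the inputs are not mutated.
            merged.put((String) source.get("recordId"), new HashMap<>(source));
        }
        for (final Map<String, Object> result : inferenceResults) {
            final Map<String, Object> target = merged.get((String) result.get("recordId"));
            if (target != null) {
                target.putAll(result);
            }
        }
        return merged;
    }

    public static void main(final String[] args) {
        // Hypothetical sample data; "a" and "b" stand in for generated UUIDs.
        final List<Map<String, Object>> sources = List.of(
                Map.of("recordId", "a", "text", "first document"),
                Map.of("recordId", "b", "text", "second document"));
        final List<Map<String, Object>> results = List.of(
                Map.of("recordId", "b", "prediction", "positive"));
        System.out.println(merge(sources, results));
    }
}
```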

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@graytaylor0
Member

This may be better implemented as an expression function rather than a new parameter

  processor:
    - add_entries:
        entries:
          - key: recordId
            value_expression: 'generate_uuid()'

@Zhangxunmt
Contributor Author

Thanks for the comment, @graytaylor0! It makes sense to use a function. By the way, I think the Data Prepper documentation needs an update to list the commonly available functions. I can probably do that in a separate PR.

@Zhangxunmt Zhangxunmt force-pushed the feature/add_uniqueId branch 3 times, most recently from 5e79f51 to d64d8f4 Compare March 20, 2026 22:16
@graytaylor0
Member

Looks good! Please follow up with a documentation change (https://docs.opensearch.org/latest/data-prepper/pipelines/functions/). Thanks!

@Zhangxunmt
Contributor Author

Docs update PR -> opensearch-project/documentation-website#12130.

Member

@dlvenable dlvenable left a comment

Thanks @Zhangxunmt for this contribution! I just have a couple comments.

@Named
public class GenerateUuidExpressionFunction implements ExpressionFunction {

static final String FUNCTION_NAME = "generate_uuid";

Member

Data Prepper functions use lowerCamelCase. Please update to generateUuid.

Suggested change
- static final String FUNCTION_NAME = "generate_uuid";
+ static final String FUNCTION_NAME = "generateUuid";

Contributor Author

Good catch!
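
For readers following along, a function of this shape could look roughly like the sketch below. It is not the code from this PR: SimpleExpressionFunction is a hypothetical, simplified stand-in for Data Prepper's ExpressionFunction so the example compiles on its own (the real evaluate call also receives the event and a literal-type converter, as the test snippet in this review shows), and rejecting arguments is an assumption about the desired behavior.

```java
import java.util.Collections;
import java.util.List;
import java.util.UUID;

public class GenerateUuidSketch {
    // Hypothetical, simplified stand-in for Data Prepper's ExpressionFunction interface.
    interface SimpleExpressionFunction {
        String getFunctionName();

        Object evaluate(List<Object> args);
    }

    static final class GenerateUuidFunction implements SimpleExpressionFunction {
        // lowerCamelCase, matching the naming convention requested in the review.
        static final String FUNCTION_NAME = "generateUuid";

        @Override
        public String getFunctionName() {
            return FUNCTION_NAME;
        }

        @Override
        public Object evaluate(final List<Object> args) {
            // Assumption for this sketch: the function takes no arguments.
            if (!args.isEmpty()) {
                throw new IllegalArgumentException(FUNCTION_NAME + "() does not take any arguments");
            }
            return UUID.randomUUID().toString();
        }
    }

    public static void main(final String[] args) {
        final SimpleExpressionFunction function = new GenerateUuidFunction();
        System.out.println(function.getFunctionName() + "() -> "
                + function.evaluate(Collections.emptyList()));
    }
}
```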

final Object result = function.evaluate(Collections.emptyList(), event, convertLiteralType);
assertThat(result, instanceOf(String.class));
final String uuidStr = (String) result;
// UUID.fromString throws if invalid

Member

This would be clearer if the check were expressed in code rather than described in a comment.

UUID regeneratedUuid =
  assertDoesNotThrow(() -> UUID.fromString(uuidStr));
assertThat(regeneratedUuid.toString(), equalTo(uuidStr));
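
The round-trip comparison matters because parsing alone may not be a strict format check: on common JDKs, UUID.fromString accepts some abbreviated forms and normalizes them. A small standalone sketch of the same idea without a test framework follows; the class and method names are hypothetical.

```java
import java.util.UUID;

public class UuidRoundTrip {
    // Returns true only if the string parses as a UUID and re-serializes to the same text.
    static boolean isCanonicalUuid(final String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            final UUID parsed = UUID.fromString(candidate);
            // toString() always emits the canonical lowercase 8-4-4-4-12 form.
            return parsed.toString().equals(candidate);
        } catch (final IllegalArgumentException e) {
            // Thrown when the input cannot be parsed as a UUID at all.
            return false;
        }
    }

    public static void main(final String[] args) {
        System.out.println(isCanonicalUuid(UUID.randomUUID().toString()));
        System.out.println(isCanonicalUuid("hello"));
        System.out.println(isCanonicalUuid("1-1-1-1-1"));
    }
}
```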


@Test
void test_generate_uuid_expression_adds_uuid_string_to_event() {
final String uuidExpr = "generate_uuid()";

Member

Be sure to update these as well.

Signed-off-by: Xun Zhang <xunzh@amazon.com>
Signed-off-by: Xun Zhang <xunzh@amazon.com>
@Zhangxunmt Zhangxunmt force-pushed the feature/add_uniqueId branch from d64d8f4 to 15bf2bf Compare March 23, 2026 20:29
Signed-off-by: Xun Zhang <xunzh@amazon.com>
@Zhangxunmt Zhangxunmt force-pushed the feature/add_uniqueId branch from 15bf2bf to 069be4a Compare March 23, 2026 20:33
Member

@dlvenable dlvenable left a comment

Thanks @Zhangxunmt !

@dlvenable dlvenable merged commit 3a931a5 into opensearch-project:main Mar 23, 2026
70 of 72 checks passed
@dlvenable dlvenable added this to the v2.15 milestone Mar 23, 2026
3 participants