add UUID as new entry value in the add_entries processor#6653
dlvenable merged 3 commits into opensearch-project:main
This may be better implemented as an expression function rather than a new parameter.
Thanks for the comment, @graytaylor0! It makes sense to use a function. By the way, I think the Data Prepper documentation needs an update to list the commonly available functions. I can probably do that in a separate PR.
5e79f51 to d64d8f4
Looks good! Please follow up with a documentation change (https://docs.opensearch.org/latest/data-prepper/pipelines/functions/). Thanks!
Docs update PR: opensearch-project/documentation-website#12130.
dlvenable left a comment
Thanks @Zhangxunmt for this contribution! I just have a couple comments.
@Named
public class GenerateUuidExpressionFunction implements ExpressionFunction {

    static final String FUNCTION_NAME = "generate_uuid";
Data Prepper functions use lowerCamelCase. Please update to generateUuid.
- static final String FUNCTION_NAME = "generate_uuid";
+ static final String FUNCTION_NAME = "generateUuid";
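For illustration, here is a minimal sketch of what a no-argument function with the requested lowerCamelCase name could look like. `SimpleExpressionFunction` is a hypothetical stand-in, not Data Prepper's actual `ExpressionFunction` interface (the test snippet in this PR shows the real `evaluate` also receives the event and a literal converter), and the argument check is an assumption about the intended behavior.

```java
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for Data Prepper's ExpressionFunction interface;
// the real interface also receives the event and a literal converter.
interface SimpleExpressionFunction {
    String getFunctionName();

    Object evaluate(List<Object> args);
}

class GenerateUuidSketch implements SimpleExpressionFunction {
    // lowerCamelCase, matching the naming convention requested in review
    static final String FUNCTION_NAME = "generateUuid";

    @Override
    public String getFunctionName() {
        return FUNCTION_NAME;
    }

    @Override
    public Object evaluate(final List<Object> args) {
        // Assumption: the function takes no arguments and rejects any it is given
        if (args != null && !args.isEmpty()) {
            throw new IllegalArgumentException(FUNCTION_NAME + "() does not take any arguments");
        }
        // Random (version 4) UUID rendered in its canonical string form
        return UUID.randomUUID().toString();
    }
}
```

Keeping the name in a single constant means the rename only has to happen in one place in the function itself; tests that spell the expression out as a string literal still need updating separately.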
final Object result = function.evaluate(Collections.emptyList(), event, convertLiteralType);
assertThat(result, instanceOf(String.class));
final String uuidStr = (String) result;
// UUID.fromString throws if invalid
This would be clearer and be retained in code rather than as a comment.
final UUID regeneratedUuid =
    assertDoesNotThrow(() -> UUID.fromString(uuidStr));
assertThat(regeneratedUuid.toString(), equalTo(uuidStr));
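The same round-trip idea can be sketched without JUnit or Hamcrest, which may help when trying it outside the test suite. The class and method names below are hypothetical; only `UUID.fromString`, `UUID.randomUUID`, and `toString` come from the JDK.

```java
import java.util.UUID;

class UuidRoundTripCheck {
    // Returns true when the input parses as a UUID and re-serializes to the same text.
    static boolean isCanonicalUuid(final String candidate) {
        try {
            return UUID.fromString(candidate).toString().equals(candidate);
        } catch (IllegalArgumentException e) {
            // UUID.fromString rejects strings that are not in UUID format
            return false;
        }
    }

    public static void main(String[] args) {
        final String uuidStr = UUID.randomUUID().toString();
        System.out.println(isCanonicalUuid(uuidStr));
        System.out.println(isCanonicalUuid("hello"));
    }
}
```

Comparing the re-serialized value with the original string checks more than "does not throw": it also confirms the function returned the canonical lowercase form.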
@Test
void test_generate_uuid_expression_adds_uuid_string_to_event() {
    final String uuidExpr = "generate_uuid()";
Be sure to update these as well.
Signed-off-by: Xun Zhang <xunzh@amazon.com>
Signed-off-by: Xun Zhang <xunzh@amazon.com>
d64d8f4 to 15bf2bf
Signed-off-by: Xun Zhang <xunzh@amazon.com>
15bf2bf to 069be4a
Description
Adds a new entry value to the add_entries processor that generates a unique ID for each record. This is needed when the original source data does not contain a unique identifier. The unique ID is essential for running asynchronous batch inference jobs, because it is used to match and merge the inference results back with the source data.
UUID.randomUUID() generates a version 4 UUID using Java's SecureRandom. With 122 random bits per ID, the probability of a collision is negligible in practice, which is why random UUIDs are widely used in distributed systems for this kind of problem.
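As a rough sketch of the matching step described above, the following tags each record with a generated ID and later joins results back by that ID. The class name, method names, and the use of plain strings for records and results are all hypothetical simplifications; this is not how Data Prepper or a batch inference job represents its data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

class UuidMergeSketch {
    // Tags each source record with a freshly generated version 4 UUID.
    static Map<String, String> tagRecords(final Iterable<String> records) {
        final Map<String, String> byId = new LinkedHashMap<>();
        for (final String record : records) {
            byId.put(UUID.randomUUID().toString(), record);
        }
        return byId;
    }

    // Joins results back to their source records using the shared ID.
    static Map<String, String> merge(final Map<String, String> sourceById,
                                     final Map<String, String> resultById) {
        final Map<String, String> merged = new LinkedHashMap<>();
        for (final Map.Entry<String, String> entry : resultById.entrySet()) {
            final String source = sourceById.get(entry.getKey());
            if (source != null) {
                // Results whose ID matches no source record are skipped
                merged.put(source, entry.getValue());
            }
        }
        return merged;
    }
}
```

Because the ID travels with the record, the join does not depend on the order in which asynchronous results come back.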
Usage
Issues Resolved
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.