diff --git a/docs/user/ppl/functions/aggregations.md b/docs/user/ppl/functions/aggregations.md index c4a84e31149..ce55b6a19d2 100644 --- a/docs/user/ppl/functions/aggregations.md +++ b/docs/user/ppl/functions/aggregations.md @@ -1,37 +1,46 @@ -# Aggregation Functions +# Aggregation functions -## Description +Aggregation functions perform calculations across multiple rows to return a single result value. These functions are used with the `stats`, `eventstats`, and `streamstats` commands to analyze and summarize data. -Aggregation functions perform calculations across multiple rows to return a single result value. These functions are used with `stats`, `eventstats` and `streamstats` commands to analyze and summarize data. -The following table shows how NULL/MISSING values are handled by aggregation functions: +The following table shows how `NULL` and missing values are handled by aggregation functions. -| Function | NULL | MISSING | +| Function | null | Missing | | --- | --- | --- | -| COUNT | Not counted | Not counted | -| SUM | Ignore | Ignore | -| AVG | Ignore | Ignore | -| MAX | Ignore | Ignore | -| MIN | Ignore | Ignore | -| FIRST | Ignore | Ignore | -| LAST | Ignore | Ignore | -| LIST | Ignore | Ignore | -| VALUES | Ignore | Ignore | - -## Functions +| `COUNT` | Not counted | Not counted | +| `SUM` | Ignored | Ignored | +| `AVG` | Ignored | Ignored | +| `MAX` | Ignored | Ignored | +| `MIN` | Ignored | Ignored | +| `FIRST` | Ignored | Ignored | +| `LAST` | Ignored | Ignored | +| `LIST` | Ignored | Ignored | +| `VALUES` | Ignored | Ignored | + +## Functions + +The following aggregation functions are available in PPL for data analysis and summarization. ### COUNT -#### Description +**Usage**: `COUNT(expr)`, `C(expr)`, `c(expr)`, `count(expr)` + +Counts the number of `expr` values in the retrieved rows. `C()`, `c()`, and `count()` are available as abbreviations for `COUNT()`. For filtered counting, use an `eval` expression to specify the filtering condition. 
+ +**Parameters**: + +- `expr` (Optional): The expression whose values are to be counted. -Usage: Returns a count of the number of expr in the rows retrieved. The `C()` function, `c`, and `count` can be used as abbreviations for `COUNT()`. To perform a filtered counting, wrap the condition to satisfy in an `eval` expression. -### Example +**Return type**: `LONG` + +#### Example ```ppl source=accounts | stats count(), c(), count, c ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -42,14 +51,15 @@ fetched rows / total rows = 1/1 +---------+-----+-------+---+ ``` -Example of filtered counting - +The following example counts only records that match a specific condition: + ```ppl source=accounts | stats count(eval(age > 30)) as mature_users ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -62,17 +72,25 @@ fetched rows / total rows = 1/1 ### SUM -#### Description +**Usage**: `SUM(expr)` + +Returns the sum of `expr` values. + +**Parameters**: + +- `expr` (Required): The expression whose values are to be summed. -Usage: `SUM(expr)`. Returns the sum of expr. -### Example +**Return type**: Same as input type (`INTEGER`, `LONG`, `FLOAT`, or `DOUBLE`) + +#### Example ```ppl source=accounts | stats sum(age) by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -86,17 +104,25 @@ fetched rows / total rows = 2/2 ### AVG -#### Description +**Usage**: `AVG(expr)` + +Returns the average value of `expr`. + +**Parameters**: -Usage: `AVG(expr)`. Returns the average value of expr. -### Example +- `expr` (Required): The expression whose values are to be averaged. 
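The null-handling rules in the table at the top of this page can be illustrated outside PPL. The following Python sketch is only a model of the documented behavior, not the engine's implementation: the helper names `ppl_count`, `ppl_sum`, and `ppl_avg` and the sample `ages` list are hypothetical, and `None` stands in for both `NULL` and missing values.

```python
from typing import Optional, Sequence

# Hypothetical helpers that model the documented null handling.
# None stands in for both NULL and missing values.


def ppl_count(values: Sequence[Optional[float]]) -> int:
    """Count only the values that are present (NULL is not counted)."""
    return sum(1 for v in values if v is not None)


def ppl_sum(values: Sequence[Optional[float]]) -> float:
    """Sum the present values; NULL entries are ignored."""
    return sum(v for v in values if v is not None)


def ppl_avg(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the present values only.

    Returning None for an all-NULL input is an assumption of this
    sketch, not documented behavior.
    """
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


ages = [32, None, 28, 36, None]  # made-up sample data
print(ppl_count(ages))  # 3
print(ppl_sum(ages))    # 96
print(ppl_avg(ages))    # 32.0
```

Note that the average is taken over the three present values, not over all five rows, which is what "Ignored" means in the table.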
+ +**Return type**: `DOUBLE` for numeric inputs; same as input type for `DATE`, `TIME`, or `TIMESTAMP` inputs + +#### Example ```ppl source=accounts | stats avg(age) by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -110,18 +136,25 @@ fetched rows / total rows = 2/2 ### MAX -#### Description +**Usage**: `MAX(expr)` + +Returns the maximum value of `expr`. For non-numeric fields, this function returns the value that comes last in alphabetical order. + +**Parameters**: -Usage: `MAX(expr)`. Returns the maximum value of expr. -For non-numeric fields, values are sorted lexicographically. -### Example +- `expr` (Required): The expression for which to find the maximum value. + +**Return type**: Same as input type + +#### Example ```ppl source=accounts | stats max(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -132,14 +165,15 @@ fetched rows / total rows = 1/1 +----------+ ``` -Example with text field - +The following example returns the value from the `firstname` text field that comes last in alphabetical order: + ```ppl source=accounts | stats max(firstname) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -152,18 +186,25 @@ fetched rows / total rows = 1/1 ### MIN -#### Description +**Usage**: `MIN(expr)` + +Returns the minimum value of `expr`. For non-numeric fields, this function returns the value that comes first in alphabetical order. + +**Parameters**: -Usage: `MIN(expr)`. Returns the minimum value of expr. -For non-numeric fields, values are sorted lexicographically. -### Example +- `expr` (Required): The expression for which to find the minimum value. 
+ +**Return type**: Same as input type + +#### Example ```ppl source=accounts | stats min(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -174,14 +215,15 @@ fetched rows / total rows = 1/1 +----------+ ``` -Example with text field - +The following example returns the value from the `firstname` text field that comes first in alphabetical order: + ```ppl source=accounts | stats min(firstname) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -194,17 +236,25 @@ fetched rows / total rows = 1/1 ### VAR_SAMP -#### Description +**Usage**: `VAR_SAMP(expr)` + +Returns the sample variance of `expr`. + +**Parameters**: + +- `expr` (Required): The expression for which to calculate the sample variance. -Usage: `VAR_SAMP(expr)`. Returns the sample variance of expr. -### Example +**Return type**: `DOUBLE` + +#### Example ```ppl source=accounts | stats var_samp(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -217,17 +267,25 @@ fetched rows / total rows = 1/1 ### VAR_POP -#### Description +**Usage**: `VAR_POP(expr)` + +Returns the population variance of `expr`. + +**Parameters**: -Usage: `VAR_POP(expr)`. Returns the population standard variance of expr. -### Example +- `expr` (Required): The expression for which to calculate the population variance. + +**Return type**: `DOUBLE` + +#### Example ```ppl source=accounts | stats var_pop(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -240,17 +298,25 @@ fetched rows / total rows = 1/1 ### STDDEV_SAMP -#### Description +**Usage**: `STDDEV_SAMP(expr)` + +Returns the sample standard deviation of `expr`. -Usage: `STDDEV_SAMP(expr)`. Return the sample standard deviation of expr. 
-### Example +**Parameters**: + +- `expr` (Required): The expression for which to calculate the sample standard deviation. + +**Return type**: `DOUBLE` + +#### Example ```ppl source=accounts | stats stddev_samp(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -263,17 +329,25 @@ fetched rows / total rows = 1/1 ### STDDEV_POP -#### Description +**Usage**: `STDDEV_POP(expr)` -Usage: `STDDEV_POP(expr)`. Return the population standard deviation of expr. -### Example +Returns the population standard deviation of `expr`. + +**Parameters**: + +- `expr` (Required): The expression for which to calculate the population standard deviation. + +**Return type**: `DOUBLE` + +#### Example ```ppl source=accounts | stats stddev_pop(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -284,20 +358,27 @@ fetched rows / total rows = 1/1 +--------------------+ ``` -### DISTINCT_COUNT, DC +### DISTINCT_COUNT, DC + +**Usage**: `DISTINCT_COUNT(expr)`, `DC(expr)` + +Returns the approximate number of distinct values using the `HyperLogLog++` algorithm. Both functions are equivalent. For more information about algorithm accuracy and precision control, see [Controlling precision](https://docs.opensearch.org/latest/aggregations/metric/cardinality/#controlling-precision). + +**Parameters**: -#### Description +- `expr` (Required): The expression for which to count distinct values. -Usage: `DISTINCT_COUNT(expr)`, `DC(expr)`. Returns the approximate number of distinct values using the HyperLogLog++ algorithm. Both functions are equivalent. -For details on algorithm accuracy and precision control, see the [OpenSearch Cardinality Aggregation documentation](https://docs.opensearch.org/latest/aggregations/metric/cardinality/#controlling-precision). 
-### Example +**Return type**: `LONG` + +#### Example ```ppl source=accounts | stats dc(state) as distinct_states, distinct_count(state) as dc_states_alt by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -311,17 +392,25 @@ fetched rows / total rows = 2/2 ### DISTINCT_COUNT_APPROX -#### Description +**Usage**: `DISTINCT_COUNT_APPROX(expr)` + +Returns the approximate count of distinct values in `expr` using the `HyperLogLog++` algorithm. + +**Parameters**: -Usage: `DISTINCT_COUNT_APPROX(expr)`. Return the approximate distinct count value of the expr, using the hyperloglog++ algorithm. -### Example +- `expr` (Required): The expression for which to count approximate distinct values. + +**Return type**: `LONG` + +#### Example ```ppl source=accounts | stats distinct_count_approx(gender) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -334,21 +423,27 @@ fetched rows / total rows = 1/1 ### EARLIEST -#### Description +**Usage**: `EARLIEST(field [, time_field])` -Usage: `EARLIEST(field [, time_field])`. Return the earliest value of a field based on timestamp ordering. -* `field`: mandatory. The field to return the earliest value for. -* `time_field`: optional. The field to use for time-based ordering. Defaults to @timestamp if not specified. - -### Example +Returns the earliest value of a `field` based on timestamp ordering. + +**Parameters**: + +- `field` (Required): The field for which to return the earliest value. +- `time_field` (Optional): The field to use for time-based ordering. Defaults to `@timestamp` if not specified. 
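As a rough mental model of the time-based ordering described above, the Python sketch below picks the value of `field` from the row with the smallest (or, for the counterpart `LATEST` function, largest) time value. It is an illustration under stated assumptions, not the PPL implementation: the `earliest` and `latest` helpers, the `Row` alias, and the sample `events` rows are all hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List

Row = Dict[str, Any]  # hypothetical row representation


def earliest(rows: List[Row], field: str, time_field: str = "@timestamp") -> Any:
    """Return `field` from the row with the smallest `time_field` value."""
    return min(rows, key=lambda r: r[time_field])[field]


def latest(rows: List[Row], field: str, time_field: str = "@timestamp") -> Any:
    """Return `field` from the row with the largest `time_field` value."""
    return max(rows, key=lambda r: r[time_field])[field]


# Made-up sample rows: the two time fields deliberately disagree on ordering.
events = [
    {"@timestamp": datetime(2024, 1, 1, 9, 0), "event_time": datetime(2024, 1, 1, 9, 30),
     "message": "started", "status": "pending"},
    {"@timestamp": datetime(2024, 1, 1, 9, 5), "event_time": datetime(2024, 1, 1, 9, 10),
     "message": "running", "status": "active"},
    {"@timestamp": datetime(2024, 1, 1, 9, 9), "event_time": datetime(2024, 1, 1, 9, 20),
     "message": "stopped", "status": "done"},
]

print(earliest(events, "message"))              # started
print(earliest(events, "status", "event_time"))  # active
print(latest(events, "message"))                # stopped
print(latest(events, "status", "event_time"))    # pending
```

The second and fourth calls show why the optional `time_field` matters: ordering by `event_time` selects different rows than ordering by the default `@timestamp`.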
+ +**Return type**: Same as input field type + +#### Example ```ppl source=events | stats earliest(message) by host | sort host ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -360,15 +455,16 @@ fetched rows / total rows = 2/2 +-------------------+---------+ ``` -Example with custom time field - +The following example uses a custom time field instead of the default `@timestamp` field for ordering: + ```ppl source=events | stats earliest(status, event_time) by category | sort category ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -382,21 +478,27 @@ fetched rows / total rows = 2/2 ### LATEST -#### Description +**Usage**: `LATEST(field [, time_field])` -Usage: `LATEST(field [, time_field])`. Return the latest value of a field based on timestamp ordering. -* `field`: mandatory. The field to return the latest value for. -* `time_field`: optional. The field to use for time-based ordering. Defaults to @timestamp if not specified. - -### Example +Returns the latest value of a `field` based on timestamp ordering. + +**Parameters**: + +- `field` (Required): The field for which to return the latest value. +- `time_field` (Optional): The field to use for time-based ordering. Defaults to `@timestamp` if not specified. 
+ +**Return type**: Same as input field type + +#### Example ```ppl source=events | stats latest(message) by host | sort host ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -408,15 +510,16 @@ fetched rows / total rows = 2/2 +------------------+---------+ ``` -Example with custom time field - +The following example uses a custom time field instead of the default `@timestamp` field for ordering: + ```ppl source=events | stats latest(status, event_time) by category | sort category ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -430,20 +533,26 @@ fetched rows / total rows = 2/2 ### TAKE -#### Description +**Usage**: `TAKE(field [, size])` -Usage: `TAKE(field [, size])`. Return original values of a field. It does not guarantee on the order of values. -* `field`: mandatory. The field must be a text field. -* `size`: optional integer. The number of values should be returned. Default is 10. - -### Example +Returns the original values from a field. This function does not guarantee the order of the returned values. + +**Parameters**: + +- `field` (Required): A text field from which to extract values. +- `size` (Optional): The number of values to return. Defaults to `10`. + +**Return type**: `ARRAY` + +#### Example ```ppl source=accounts | stats take(firstname) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -454,22 +563,31 @@ fetched rows / total rows = 1/1 +-----------------------------+ ``` -### PERCENTILE or PERCENTILE_APPROX +### PERCENTILE, PERCENTILE_APPROX -#### Description +**Usage**: `PERCENTILE(expr, percent)`, `PERCENTILE_APPROX(expr, percent)` -Usage: `PERCENTILE(expr, percent)` or `PERCENTILE_APPROX(expr, percent)`. Return the approximate percentile value of expr at the specified percentage. -* `percent`: The number must be a constant between 0 and 100. 
- -Note: From 3.1.0, the percentile implementation is switched to MergingDigest from AVLTreeDigest. Ref [issue link](https://github.com/opensearch-project/OpenSearch/issues/18122). -### Example +Returns the approximate percentile value of `expr` at the specified percentage. + +**Parameters**: + +- `expr` (Required): The expression for which to calculate the percentile. +- `percent` (Required): A constant number between `0` and `100`. + +**Return type**: Same as input type + +Starting in version 3.1.0, the percentile implementation switched from `AVLTreeDigest` to `MergingDigest`. For more information, see the [corresponding issue](https://github.com/opensearch-project/OpenSearch/issues/18122). +{: .note} + +#### Example ```ppl source=accounts | stats percentile(age, 90) by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -481,20 +599,21 @@ fetched rows / total rows = 2/2 +---------------------+--------+ ``` -#### Percentile Shortcut Functions +#### Percentile shortcut functions For convenience, OpenSearch PPL provides shortcut functions for common percentiles: -- `PERC<percent>(expr)` - Equivalent to `PERCENTILE(expr, <percent>)` -- `P<percent>(expr)` - Equivalent to `PERCENTILE(expr, <percent>)` - -Both integer and decimal percentiles from 0 to 100 are supported (e.g., `PERC95`, `P99.5`). +- `PERC<percent>(expr)` - Equivalent to `PERCENTILE(expr, <percent>)`. +- `P<percent>(expr)` - Equivalent to `PERCENTILE(expr, <percent>)`. 
+ +Both integer and decimal percentiles from `0` to `100` are supported (for example, `PERC95`, `P99.5`): ```ppl source=accounts | stats perc99.5(age); ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -509,8 +628,9 @@ fetched rows / total rows = 1/1 source=accounts | stats p50(age); ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -523,17 +643,25 @@ fetched rows / total rows = 1/1 ### MEDIAN -#### Description +**Usage**: `MEDIAN(expr)` -Usage: `MEDIAN(expr)`. Returns the median (50th percentile) value of `expr`. This is equivalent to `PERCENTILE(expr, 50)`. -### Example +Returns the median (50th percentile) value of `expr`. This is equivalent to `PERCENTILE(expr, 50)`. + +**Parameters**: + +- `expr` (Required): The expression for which to calculate the median. + +**Return type**: Same as input type + +#### Example ```ppl source=accounts | stats median(age) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -546,19 +674,25 @@ fetched rows / total rows = 1/1 ### FIRST -#### Description +**Usage**: `FIRST(field)` -Usage: `FIRST(field)`. Return the first non-null value of a field based on natural document order. Returns NULL if no records exist, or if all records have NULL values for the field. -* `field`: mandatory. The field to return the first value for. - -### Example +Returns the first non-null value of a `field` based on natural document order. Returns `NULL` if no records exist or if all records have `NULL` values for the `field`. + +**Parameters**: + +- `field` (Required): The field for which to return the first value. 
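The "first non-null value in natural document order" rule can be modeled in a few lines. The Python sketch below is an illustration only, not the engine's code: the `first` and `last` helper names and the sample lists are hypothetical, list order stands in for document order, and `None` stands in for `NULL`. The `last` helper mirrors the `LAST` function documented in the next section.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def first(values: Iterable[Optional[T]]) -> Optional[T]:
    """Return the first non-None value, or None if there is none."""
    return next((v for v in values if v is not None), None)


def last(values: Iterable[Optional[T]]) -> Optional[T]:
    """Return the last non-None value, or None if there is none."""
    result = None
    for v in values:
        if v is not None:
            result = v  # keep overwriting until the input is exhausted
    return result


names = [None, "Amber", "Hattie", None]  # made-up sample data
print(first(names))         # Amber
print(last(names))          # Hattie
print(first([]))            # None (no records)
print(first([None, None]))  # None (all values are NULL)
```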
+ +**Return type**: Same as input field type + +#### Example ```ppl source=accounts | stats first(firstname) by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -572,19 +706,25 @@ fetched rows / total rows = 2/2 ### LAST -#### Description +**Usage**: `LAST(field)` -Usage: `LAST(field)`. Return the last non-null value of a field based on natural document order. Returns NULL if no records exist, or if all records have NULL values for the field. -* `field`: mandatory. The field to return the last value for. - -### Example +Returns the last non-null value of a `field` based on natural document order. Returns `NULL` if no records exist or if all records have `NULL` values for the `field`. + +**Parameters**: + +- `field` (Required): The field for which to return the last value. + +**Return type**: Same as input field type + +#### Example ```ppl source=accounts | stats last(firstname) by gender ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -598,21 +738,30 @@ fetched rows / total rows = 2/2 ### LIST -#### Description +**Usage**: `LIST(expr)` -Usage: `LIST(expr)`. Collects all values from the specified expression into an array. Values are converted to strings, nulls are filtered, and duplicates are preserved. -The function returns up to 100 values with no guaranteed ordering. -* `expr`: The field expression to collect values from. -* This aggregation function doesn't support Array, Struct, Object field types. - -Example with string fields +Collects all values from the specified expression into an array. Values are converted to strings, `NULL` values are filtered out, and duplicates are preserved. This function returns up to `100` values without a guaranteed order. + +**Parameters**: + +- `expr` (Required): The field expression from which to collect values. 
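To contrast `LIST` with the `VALUES` function documented later on this page, the following Python sketch models the stated rules: string conversion, `NULL` filtering, and duplicate handling. It is a hedged illustration rather than the actual implementation. The helper names `ppl_list` and `ppl_values` are hypothetical, the sketch keeps input order for `LIST` even though PPL guarantees no order, and it does not model the `plugins.ppl.values.max.limit` setting.

```python
from typing import Any, Iterable, List


def ppl_list(values: Iterable[Any], limit: int = 100) -> List[str]:
    """Model of LIST: drop NULLs, stringify, keep duplicates, cap at `limit`.

    Input order is kept here only for readability; PPL does not
    guarantee any ordering.
    """
    collected = [str(v) for v in values if v is not None]
    return collected[:limit]


def ppl_values(values: Iterable[Any]) -> List[str]:
    """Model of VALUES: drop NULLs, stringify, deduplicate, sort."""
    return sorted({str(v) for v in values if v is not None})


names = ["Hattie", None, "Amber", "Hattie", "Dale"]  # made-up sample data
print(ppl_list(names))    # ['Hattie', 'Amber', 'Hattie', 'Dale']
print(ppl_values(names))  # ['Amber', 'Dale', 'Hattie']

numbers = [3, 1, None, 3]
print(ppl_list(numbers))    # ['3', '1', '3']
print(ppl_values(numbers))  # ['1', '3']
```

Because values are converted to strings before sorting, the sort in this sketch is a string sort, so `"10"` would sort before `"9"`.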
+ +**Return type**: `ARRAY` + +This aggregation function does not support array, struct, or object field types. +{: .note} + +#### Example + +The following example collects all values from a string field into an array: ```ppl source=accounts | stats list(firstname) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -625,22 +774,32 @@ fetched rows / total rows = 1/1 ### VALUES -#### Description +**Usage**: `VALUES(expr)` -Usage: `VALUES(expr)`. Collects all unique values from the specified expression into a sorted array. Values are converted to strings, nulls are filtered, and duplicates are removed. -The maximum number of unique values returned is controlled by the `plugins.ppl.values.max.limit` setting: -* Default value is 0, which means unlimited values are returned -* Can be configured to any positive integer to limit the number of unique values -* See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details - -Example with string fields +Collects all unique values from the specified expression into a sorted array. Values are converted to strings, `NULL` values are filtered out, and duplicates are removed. + +**Parameters**: + +- `expr` (Required): The expression from which to collect unique values. + +**Return type**: `ARRAY` + +> The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: +> - The default value is 0, which returns an unlimited number of values. +> - Setting this to any positive integer limits the number of unique values. 
+> - See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details + +#### Example + +The following example collects unique values from a string field into a sorted array: ```ppl source=accounts | stats values(firstname) ``` + -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git a/docs/user/ppl/functions/collection.md b/docs/user/ppl/functions/collection.md index ca9f7015c1a..006a49ef6d5 100644 --- a/docs/user/ppl/functions/collection.md +++ b/docs/user/ppl/functions/collection.md @@ -1,13 +1,25 @@ -# PPL Collection Functions +# Collection functions -## ARRAY +Collection functions create, manipulate, and analyze arrays and multivalue fields in data. These functions are essential for working with complex data structures and performing operations such as filtering, transforming, and analyzing array elements. -### Description +The following collection functions are supported in PPL. -Usage: `array(value1, value2, value3...)` create an array with input values. Currently we don't allow mixture types. We will infer a least restricted type, for example `array(1, "demo")` -> ["1", "demo"] -**Argument type:** `value1: ANY, value2: ANY, ...` -**Return type:** `ARRAY` -### Example +## ARRAY + +**Usage**: `array(value1, value2, value3...)` + +Creates an array containing the input values. Mixed types are automatically converted to the least restrictive type. For example, `array(1, "demo")` returns `["1", "demo"]` where the integer is converted to a string. + +**Parameters**: + +- `value1` (Required): A value of any type to include in the array. +- `value2`, `value3` (Optional): Additional values of any type to include in the array. 
+ +**Return type**: `ARRAY` + +#### Example + +The following example creates an array with numeric values: ```ppl source=people @@ -16,7 +28,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -26,6 +38,8 @@ fetched rows / total rows = 1/1 | [1,2,3] | +---------+ ``` + +The following example demonstrates mixed-type conversion: ```ppl source=people @@ -34,7 +48,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -45,15 +59,20 @@ fetched rows / total rows = 1/1 +----------+ ``` -## ARRAY_LENGTH +## ARRAY_LENGTH -### Description +**Usage**: `array_length(array)` + +Returns the length of the input `array`. + +**Parameters**: + +- `array` (Required): The array for which to return the length. + +**Return type**: `INTEGER` + +#### Example -Usage: `array_length(array)` returns the length of input array. -**Argument type:** `array:ARRAY` -**Return type:** `INTEGER` -### Example - ```ppl source=people | eval array = array(1, 2, 3) @@ -62,7 +81,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -73,15 +92,21 @@ fetched rows / total rows = 1/1 +--------+ ``` -## FORALL +## FORALL -### Description +**Usage**: `forall(array, function)` + +Checks whether all elements in the array satisfy the lambda function condition. The lambda function must accept a single input parameter and return a Boolean value. + +**Parameters**: + +- `array` (Required): The array to check. +- `function` (Required): A lambda function that returns a Boolean value and accepts a single input parameter. + +**Return type**: `BOOLEAN` + +#### Example -Usage: `forall(array, function)` check whether all element inside array can meet the lambda function. The function should also return boolean. The lambda function accepts one single input. 
-**Argument type:** `array:ARRAY, function:LAMBDA` -**Return type:** `BOOLEAN` -### Example - ```ppl source=people | eval array = array(1, 2, 3), result = forall(array, x -> x > 0) @@ -89,7 +114,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -100,15 +125,21 @@ fetched rows / total rows = 1/1 +--------+ ``` -## EXISTS +## EXISTS -### Description +**Usage**: `exists(array, function)` + +Checks whether at least one element in the array satisfies the lambda function condition. The lambda function must accept a single input parameter and return a Boolean value. + +**Parameters**: + +- `array` (Required): The array to check. +- `function` (Required): A lambda function that returns a Boolean value and accepts a single input parameter. + +**Return type**: `BOOLEAN` + +#### Example -Usage: `exists(array, function)` check whether existing one of element inside array can meet the lambda function. The function should also return boolean. The lambda function accepts one single input. -**Argument type:** `array:ARRAY, function:LAMBDA` -**Return type:** `BOOLEAN` -### Example - ```ppl source=people | eval array = array(-1, -2, 3), result = exists(array, x -> x > 0) @@ -116,7 +147,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -127,15 +158,21 @@ fetched rows / total rows = 1/1 +--------+ ``` -## FILTER +## FILTER -### Description +**Usage**: `filter(array, function)` + +Filters the elements in the array using a lambda function. The lambda function must accept a single input parameter and return a Boolean value. + +**Parameters**: + +- `array` (Required): The array to filter. +- `function` (Required): A lambda function that returns a Boolean value and accepts a single input parameter. 
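The three predicate-based functions `forall`, `exists`, and `filter` differ only in what they do with the Boolean results of the lambda. A minimal Python sketch of those semantics follows, reusing the arrays from the examples on this page. The helper names are hypothetical, and the behavior on an empty array (Python's `all` returns `True`, `any` returns `False`) is a property of this sketch, not something the PPL documentation states.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def forall(array: List[T], fn: Callable[[T], bool]) -> bool:
    """True when every element satisfies the predicate."""
    return all(fn(x) for x in array)


def exists(array: List[T], fn: Callable[[T], bool]) -> bool:
    """True when at least one element satisfies the predicate."""
    return any(fn(x) for x in array)


def filter_array(array: List[T], fn: Callable[[T], bool]) -> List[T]:
    """Keep only the elements that satisfy the predicate."""
    return [x for x in array if fn(x)]


def is_positive(x: int) -> bool:
    """Python counterpart of the PPL lambda `x -> x > 0`."""
    return x > 0


print(forall([1, 2, 3], is_positive))         # True
print(exists([-1, -2, 3], is_positive))       # True
print(filter_array([1, -2, 3], is_positive))  # [1, 3]
```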
+ +**Return type**: `ARRAY` + +#### Example -Usage: `filter(array, function)` filter the element in the array by the lambda function. The function should return boolean. The lambda function accepts one single input. -**Argument type:** `array:ARRAY, function:LAMBDA` -**Return type:** `ARRAY` -### Example - ```ppl source=people | eval array = array(1, -2, 3), result = filter(array, x -> x > 0) @@ -143,7 +180,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -154,15 +191,23 @@ fetched rows / total rows = 1/1 +--------+ ``` -## TRANSFORM +## TRANSFORM -### Description +**Usage**: `transform(array, function)` + +Transforms the elements of the `array` one by one using a lambda function. The lambda function can accept one or two inputs. If the lambda function accepts two parameters, the second parameter is the index of the element in the `array`. + +**Parameters**: + +- `array` (Required): The array to transform. +- `function` (Required): A lambda function that accepts one or two input parameters and returns a transformed value. + +**Return type**: `ARRAY` + +#### Example + +The following example transforms each element by adding 2: -Usage: `transform(array, function)` transform the element of array one by one using lambda. The lambda function can accept one single input or two input. If the lambda accepts two argument, the second one is the index of element in array. 
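The one-parameter and two-parameter lambda forms of `transform` can be sketched in Python as follows. This is an illustration under assumptions, not the PPL implementation: the `transform` helper is hypothetical, it inspects the lambda's arity to decide whether to pass the index, and it assumes the index is zero-based, which the text above does not state.

```python
import inspect
from typing import Any, Callable, List


def transform(array: List[Any], fn: Callable[..., Any]) -> List[Any]:
    """Apply `fn` to each element; pass the index too if `fn` takes two parameters.

    Assumption of this sketch: the index starts at 0.
    """
    arity = len(inspect.signature(fn).parameters)
    if arity == 2:
        return [fn(x, i) for i, x in enumerate(array)]
    return [fn(x) for x in array]


# One-parameter form, mirroring `transform(array, x -> x + 2)`.
print(transform([1, -2, 3], lambda x: x + 2))     # [3, 0, 5]

# Two-parameter form: the second parameter receives the element index.
print(transform([1, -2, 3], lambda x, i: x + i))  # [1, -1, 5]
```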
-**Argument type:** `array:ARRAY, function:LAMBDA` -**Return type:** `ARRAY` -### Example - ```ppl source=people | eval array = array(1, -2, 3), result = transform(array, x -> x + 2) @@ -170,7 +215,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -180,6 +225,8 @@ fetched rows / total rows = 1/1 | [3,0,5] | +---------+ ``` + +The following example uses both element value and index in the transformation: ```ppl source=people @@ -188,7 +235,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -199,15 +246,25 @@ fetched rows / total rows = 1/1 +----------+ ``` -## REDUCE +## REDUCE -### Description +**Usage**: `reduce(array, acc_base, function, <reduce_function>)` + +Uses a lambda function to iterate through all elements and interact with the accumulator base value. The lambda function accepts two parameters: the accumulator and the array element. When an optional `reduce_function` is provided, it is applied to the final accumulator value. The reduce function accepts the accumulator as a single parameter. + +**Parameters**: + +- `array` (Required): The array to reduce. +- `acc_base` (Required): The initial accumulator value. +- `function` (Required): A lambda function that accepts accumulator and array element as parameters. +- `reduce_function` (Optional): A lambda function to apply to the final accumulator value. + +**Return type**: Same as accumulator type (determined by `acc_base` and `reduce_function`) + +#### Example + +The following example reduces an array by summing all elements with an initial value: -Usage: `reduce(array, acc_base, function, <reduce_function>)` use lambda function to go through all element and interact with acc_base. The lambda function accept two argument accumulator and array element. If add one more reduce_function, will apply reduce_function to accumulator finally.
The reduce function accept accumulator as the one argument. -**Argument type:** `array:ARRAY, acc_base:ANY, function:LAMBDA, reduce_function:LAMBDA` -**Return type:** `ANY` -### Example - ```ppl source=people | eval array = array(1, -2, 3), result = reduce(array, 10, (acc, x) -> acc + x) @@ -215,7 +272,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -225,6 +282,8 @@ fetched rows / total rows = 1/1 | 12 | +--------+ ``` + +The following example uses an additional reduce function to transform the final result: ```ppl source=people @@ -233,7 +292,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -244,15 +303,23 @@ fetched rows / total rows = 1/1 +--------+ ``` -## MVJOIN +## MVJOIN -### Description +**Usage**: `mvjoin(array, delimiter)` + +Joins string array elements into a single string, separated by the specified delimiter. `NULL` elements are excluded from the output. Only string arrays are supported. + +**Parameters**: + +- `array` (Required): An array of strings to join. +- `delimiter` (Required): The string to use as a separator between array elements. + +**Return type**: `STRING` + +#### Example + +The following example joins an array of strings with a comma delimiter: -Usage: `mvjoin(array, delimiter)` joins string array elements into a single string, separated by the specified delimiter. NULL elements are excluded from the output. Only string arrays are supported. 
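The `mvjoin` behavior described above (join with a delimiter, skip `NULL` elements) maps closely onto Python's `str.join`. The sketch below is a hedged model, not the function's actual implementation; the `mvjoin` helper and the second sample array are hypothetical, and `None` stands in for `NULL`.

```python
from typing import List, Optional


def mvjoin(array: List[Optional[str]], delimiter: str) -> str:
    """Join the non-None strings in `array`, separated by `delimiter`."""
    return delimiter.join(s for s in array if s is not None)


print(mvjoin(["a", "b", "c"], ","))   # a,b,c
print(mvjoin(["a", None, "c"], "-"))  # a-c
```

The second call shows that a skipped `NULL` element leaves no empty slot behind: the result is `a-c`, not `a--c`.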
-**Argument type:** `array: ARRAY of STRING, delimiter: STRING` -**Return type:** `STRING` -### Example - ```ppl source=people | eval result = mvjoin(array('a', 'b', 'c'), ',') @@ -260,7 +327,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -270,6 +337,8 @@ fetched rows / total rows = 1/1 | a,b,c | +--------+ ``` + +The following example joins field values into a single string: ```ppl source=accounts @@ -279,7 +348,7 @@ source=accounts | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -290,15 +359,24 @@ fetched rows / total rows = 1/1 +-------------+ ``` -## MVAPPEND +## MVAPPEND -### Description +**Usage**: `mvappend(value1, value2, value3...)` + +Appends all elements from parameters to create an array. Flattens array parameters and collects all individual elements. Always returns an array or `NULL` for consistent type behavior. + +**Parameters**: + +- `value1` (Required): A value of any type to append to the array. +- `value2` (Optional): Additional values of any type to append to the array. +- `...` (Optional): Any number of additional values. + +**Return type**: `ARRAY` + +#### Example + +The following example appends multiple values to create an array: -Usage: `mvappend(value1, value2, value3...)` appends all elements from arguments to create an array. Flattens array arguments and collects all individual elements. Always returns an array or null for consistent type behavior. 
-**Argument type:** `value1: ANY, value2: ANY, ...` -**Return type:** `ARRAY` -### Example - ```ppl source=people | eval result = mvappend(1, 1, 3) @@ -306,7 +384,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -316,6 +394,8 @@ fetched rows / total rows = 1/1 | [1,1,3] | +---------+ ``` + +The following example demonstrates array flattening: ```ppl source=people @@ -324,7 +404,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -334,6 +414,8 @@ fetched rows / total rows = 1/1 | [1,2,3] | +---------+ ``` + +The following example shows nested `mvappend` calls: ```ppl source=people @@ -342,7 +424,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -352,6 +434,8 @@ fetched rows / total rows = 1/1 | [1,2,3] | +---------+ ``` + +The following example creates an array from a single value: ```ppl source=people @@ -360,7 +444,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -370,6 +454,8 @@ fetched rows / total rows = 1/1 | [42] | +--------+ ``` + +The following example demonstrates `NULL` value filtering: ```ppl source=people @@ -378,7 +464,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -388,6 +474,8 @@ fetched rows / total rows = 1/1 | [2] | +--------+ ``` + +The following example shows behavior with only `NULL` values: ```ppl source=people @@ -396,7 +484,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -406,6 +494,8 @@ fetched rows / total rows = 1/1 | null | +--------+ ``` + +The following example concatenates multiple arrays: ```ppl source=people @@ -414,7 +504,7 @@ 
source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -424,6 +514,8 @@ fetched rows / total rows = 1/1 | [1,2,3,4] | +-----------+ ``` + +The following example appends field values: ```ppl source=accounts @@ -432,7 +524,7 @@ source=accounts | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -442,6 +534,8 @@ fetched rows / total rows = 1/1 | [Amber,Duke] | +--------------+ ``` + +The following example demonstrates mixed data types: ```ppl source=people @@ -450,7 +544,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -461,17 +555,22 @@ fetched rows / total rows = 1/1 +--------------+ ``` -## SPLIT +## SPLIT + +**Usage**: `split(str, delimiter)` -### Description +Splits the string values on the delimiter and returns the string values as a multivalue field (array). Use an empty string (`""`) to split the original string into one value per character. If the delimiter is not found, the function returns an array containing the original string. If the input string is empty, the function returns an empty array. -Usage: `split(str, delimiter)` splits the string values on the delimiter and returns the string values as a multivalue field (array). Use an empty string ("") to split the original string into one value per character. If the delimiter is not found, returns an array containing the original string. If the input string is empty, returns an empty array. +**Parameters**: -**Argument type:** `str: STRING, delimiter: STRING` +- `str` (Required): The string to split. +- `delimiter` (Required): The string to use as a delimiter for splitting. 
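
The edge cases listed in the `split` description (empty delimiter, delimiter not found, empty input) can be sketched in Python. The function name `split_sketch` is hypothetical, and the sketch only mirrors the documented rules rather than the plugin's actual code.

```python
from typing import List


def split_sketch(text: str, delimiter: str) -> List[str]:
    """Hypothetical model of the documented split rules."""
    if text == "":
        return []            # empty input string yields an empty array
    if delimiter == "":
        return list(text)    # empty delimiter yields one value per character
    # If the delimiter is absent, str.split returns [text] unchanged.
    return text.split(delimiter)


print(split_sketch("1a2b3c4def567890", "def"))  # ['1a2b3c4', '567890']
print(split_sketch("abcd", ""))                 # ['a', 'b', 'c', 'd']
print(split_sketch("name::value", "::"))        # ['name', 'value']
print(split_sketch("hello", ","))               # ['hello']
```
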
-**Return type:** `ARRAY of STRING` +**Return type**: `ARRAY` -### Example +#### Example + +The following example splits a string using a semicolon delimiter: ```ppl source=people @@ -480,7 +579,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -491,6 +590,8 @@ fetched rows / total rows = 1/1 +------------------------------------+ ``` +The following example uses a multi-character delimiter: + ```ppl source=people | eval test = '1a2b3c4def567890', result = split(test, 'def') @@ -498,7 +599,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -509,6 +610,8 @@ fetched rows / total rows = 1/1 +------------------+ ``` +The following example splits a string into individual characters using an empty delimiter: + ```ppl source=people | eval test = 'abcd', result = split(test, '') @@ -516,7 +619,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -527,6 +630,8 @@ fetched rows / total rows = 1/1 +-----------+ ``` +The following example splits using a double-colon delimiter: + ```ppl source=people | eval test = 'name::value', result = split(test, '::') @@ -534,7 +639,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -545,6 +650,8 @@ fetched rows / total rows = 1/1 +--------------+ ``` +The following example shows behavior when the delimiter is not found: + ```ppl source=people | eval test = 'hello', result = split(test, ',') @@ -552,7 +659,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -563,15 +670,22 @@ fetched rows / total rows = 1/1 +---------+ ``` -## MVDEDUP +## MVDEDUP -### Description +**Usage**: `mvdedup(array)` + +Removes duplicate values from a 
multivalue array while preserving the order of the first occurrence. `NULL` elements are filtered out. Returns a deduplicated array, or `NULL` if the input is `NULL`. + +**Parameters**: + +- `array` (Required): The array from which to remove duplicates. + +**Return type**: `ARRAY` + +#### Example + +The following example removes duplicate numbers while preserving order: -Usage: `mvdedup(array)` removes duplicate values from a multivalue array while preserving the order of first occurrence. NULL elements are filtered out. Returns an array with duplicates removed, or null if the input is null. -**Argument type:** `array: ARRAY` -**Return type:** `ARRAY` -### Example - ```ppl source=people | eval array = array(1, 2, 2, 3, 1, 4), result = mvdedup(array) @@ -579,7 +693,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -589,6 +703,8 @@ fetched rows / total rows = 1/1 | [1,2,3,4] | +-----------+ ``` + +The following example deduplicates string values: ```ppl source=people @@ -597,7 +713,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -607,6 +723,8 @@ fetched rows / total rows = 1/1 | [z,a,b,c] | +-----------+ ``` + +The following example shows behavior with an empty array: ```ppl source=people @@ -615,7 +733,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -628,12 +746,20 @@ fetched rows / total rows = 1/1 ## MVFIND -### Description +**Usage**: `mvfind(array, regex)` + +Searches a multivalue array and returns the `0`-based index of the first element that matches the regular expression. Returns `NULL` if no match is found. + +**Parameters**: + +- `array` (Required): The array to search. +- `regex` (Required): The regular expression pattern to match against array elements. 
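
A small Python sketch may help make the `mvfind` contract concrete. It assumes that the pattern may match anywhere in an element (search semantics) and uses Python's `re` module, whose syntax can differ from the regular expression engine the plugin uses. The name `mvfind_sketch` is hypothetical.

```python
import re
from typing import List, Optional


def mvfind_sketch(array: List[Optional[str]], pattern: str) -> Optional[int]:
    """Hypothetical model of mvfind: 0-based index of the first match."""
    compiled = re.compile(pattern)
    for index, element in enumerate(array):
        # Assumption: a match anywhere in the element counts.
        if element is not None and compiled.search(element):
            return index
    return None  # no element matched


print(mvfind_sketch(["cat", "dog", "bird"], "fish"))                 # None
print(mvfind_sketch(["error123", "info", "error456"], "error[0-9]+"))  # 0
print(mvfind_sketch(["Apple", "Banana", "Cherry"], "(?i)banana"))    # 1
```
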
+ +**Return type**: `INTEGER` (or `NULL` if no match found) -Usage: mvfind(array, regex) searches a multivalue array and returns the 0-based index of the first element that matches the regular expression. Returns NULL if no match is found. -Argument type: array: ARRAY, regex: STRING -Return type: INTEGER (nullable) -Example +#### Example + +The following example searches for the first element that matches a regular expression: ```ppl source=people @@ -642,7 +768,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -653,6 +779,8 @@ fetched rows / total rows = 1/1 +--------+ ``` +The following example shows behavior when no match is found: + ```ppl source=people | eval array = array('cat', 'dog', 'bird'), result = mvfind(array, 'fish') @@ -660,7 +788,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -671,6 +799,8 @@ fetched rows / total rows = 1/1 +--------+ ``` +The following example uses a regex pattern with character classes: + ```ppl source=people | eval array = array('error123', 'info', 'error456'), result = mvfind(array, 'error[0-9]+') @@ -678,7 +808,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -689,6 +819,8 @@ fetched rows / total rows = 1/1 +--------+ ``` +The following example demonstrates case-insensitive matching: + ```ppl source=people | eval array = array('Apple', 'Banana', 'Cherry'), result = mvfind(array, '(?i)banana') @@ -696,7 +828,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -707,15 +839,24 @@ fetched rows / total rows = 1/1 +--------+ ``` -## MVINDEX +## MVINDEX -### Description +**Usage**: `mvindex(array, start, [end])` + +Returns a subset of the multivalue array using the start and optional end index 
values. Indexes are `0`-based (the first element is at index `0`). Supports negative indexing where `-1` refers to the last element. When only start is provided, the function returns a single element. When both start and end are provided, the function returns an array of elements from start to end (inclusive). + +**Parameters**: + +- `array` (Required): The array from which to extract elements. +- `start` (Required): The starting index (`0`-based). +- `end` (Optional): The ending index (`0`-based, inclusive). + +**Return type**: Single element type when only `start` is provided; `ARRAY` when both `start` and `end` are provided + +#### Example + +The following example gets a single element at index 1: -Usage: `mvindex(array, start, [end])` returns a subset of the multivalue array using the start and optional end index values. Indexes are 0-based (first element is at index 0). Supports negative indexing where -1 refers to the last element. When only start is provided, returns a single element. When both start and end are provided, returns an array of elements from start to end (inclusive). 
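
The inclusive end index and negative indexing of `mvindex` differ from Python slicing, so a short sketch may clarify the intended behavior. The name `mvindex_sketch` and the sample inputs are illustrative, and the handling of out-of-range indexes (returning `None` or clamping) is an assumption that the description does not cover.

```python
from typing import Any, List, Optional


def mvindex_sketch(array: List[Any], start: int, end: Optional[int] = None) -> Any:
    """Hypothetical model of mvindex with 0-based, inclusive indexes."""
    n = len(array)
    # Negative indexes count from the end: -1 is the last element.
    s = start + n if start < 0 else start
    if end is None:
        # Only start given: return a single element.
        return array[s] if 0 <= s < n else None  # assumption for out-of-range
    e = end + n if end < 0 else end
    # Both given: return the range from start to end, inclusive.
    return array[max(s, 0):e + 1]


letters = ["a", "b", "c", "d", "e"]
print(mvindex_sketch(letters, 1))                   # b
print(mvindex_sketch(letters, -1))                  # e
print(mvindex_sketch([1, 2, 3, 4, 5], 1, 3))        # [2, 3, 4]
print(mvindex_sketch([1, 2, 3, 4, 5], -3, -1))      # [3, 4, 5]
```
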
-**Argument type:** `array: ARRAY, start: INTEGER, end: INTEGER (optional)` -**Return type:** `ANY (single element) or ARRAY (range)` -### Example - ```ppl source=people | eval array = array('a', 'b', 'c', 'd', 'e'), result = mvindex(array, 1) @@ -723,7 +864,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -733,6 +874,8 @@ fetched rows / total rows = 1/1 | b | +--------+ ``` + +The following example uses negative indexing to get the last element: ```ppl source=people @@ -741,7 +884,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -751,6 +894,8 @@ fetched rows / total rows = 1/1 | e | +--------+ ``` + +The following example extracts a range of elements: ```ppl source=people @@ -759,7 +904,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -769,6 +914,8 @@ fetched rows / total rows = 1/1 | [2,3,4] | +---------+ ``` + +The following example uses negative indexing for a range: ```ppl source=people @@ -777,7 +924,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -787,6 +934,8 @@ fetched rows / total rows = 1/1 | [3,4,5] | +---------+ ``` + +The following example extracts elements from the beginning of an array: ```ppl source=people @@ -795,7 +944,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -808,12 +957,20 @@ fetched rows / total rows = 1/1 ## MVMAP -### Description +**Usage**: `mvmap(array, expression)` -Usage: mvmap(array, expression) iterates over each element of a multivalue array, applies the expression to each element, and returns a multivalue array with the transformed results. 
The field name in the expression is implicitly bound to each element value. -Argument type: array: ARRAY, expression: EXPRESSION -Return type: ARRAY -Example +Iterates over each element of a multivalue array, applies the expression to each element, and returns a multivalue array containing the transformed results. The field name in the expression is implicitly bound to each element value. + +**Parameters**: + +- `array` (Required): The array to map over. +- `expression` (Required): The expression to apply to each element. + +**Return type**: `ARRAY` + +#### Example + +The following example applies a mathematical operation to each element of an array: ```ppl source=people @@ -822,7 +979,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -833,6 +990,8 @@ fetched rows / total rows = 1/1 +------------+ ``` +The following example applies a different mathematical operation: + ```ppl source=people | eval array = array(1, 2, 3), result = mvmap(array, array + 5) @@ -840,7 +999,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -851,9 +1010,9 @@ fetched rows / total rows = 1/1 +---------+ ``` -Note: For nested expressions like ``mvmap(mvindex(arr, 1, 3), arr * 2)``, the field name (``arr``) is extracted from the first argument and must match the field referenced in the expression. +> **Note**: For nested expressions such as `mvmap(mvindex(arr, 1, 3), arr * 2)`, the field name (`arr`) is extracted from the first argument and must match the field referenced in the expression. 
-The expression can also reference other single-value fields: +The following example shows how the expression can reference other single-value fields: ```ppl source=people @@ -862,7 +1021,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -876,19 +1035,27 @@ fetched rows / total rows = 1/1 ## MVZIP -### Description +**Usage**: `mvzip(mv_left, mv_right, [delim])` + +Combines the values in two multivalue arrays by pairing corresponding elements and joining them into strings. The delimiter specifies the character or string used to join the two values. This is similar to the Python zip command. -Usage: `mvzip(mv_left, mv_right, [delim])` combines the values in two multivalue arrays by pairing corresponding elements and joining them into strings. The delimiter is used to specify a delimiting character to join the two values. This is similar to the Python zip command. +The values are combined by pairing the first value of `mv_left` with the first value of `mv_right`, then the second with the second, and so on. Each pair is concatenated into a string using the delimiter. The function stops at the length of the shorter array. -The values are stitched together combining the first value of mv_left with the first value of mv_right, then the second with the second, and so on. Each pair is concatenated into a string using the delimiter. The function stops at the length of the shorter array. +The delimiter is optional. When specified, it must be enclosed in quotation marks. The default delimiter is a comma. -The delimiter is optional. When specified, it must be enclosed in quotation marks. The default delimiter is a comma ( , ). +Returns `NULL` if either input is `NULL`. Returns an empty array if either input array is empty. -Returns null if either input is null. Returns an empty array if either input array is empty. 
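
The pairing rules for `mvzip` map closely onto Python's built-in `zip` function, which also stops at the shorter input. The following sketch uses the hypothetical name `mvzip_sketch` and models `NULL` as `None`; it illustrates the documented rules and is not the plugin's implementation.

```python
from typing import List, Optional


def mvzip_sketch(
    mv_left: Optional[List[str]],
    mv_right: Optional[List[str]],
    delim: str = ",",
) -> Optional[List[str]]:
    """Hypothetical model of mvzip: pair elements and join each pair."""
    if mv_left is None or mv_right is None:
        return None  # null if either input is null
    # zip stops at the shorter array; an empty input yields an empty array.
    return [f"{left}{delim}{right}" for left, right in zip(mv_left, mv_right)]


print(mvzip_sketch(["a", "b", "c"], ["x", "y", "z"], "|"))  # ['a|x', 'b|y', 'c|z']
print(mvzip_sketch(["1", "2", "3"], ["a", "b"], "-"))       # ['1-a', '2-b']
print(mvzip_sketch(["a", "b"], []))                         # []
```

Because the result is itself an array of strings, calls can be nested, as in `mvzip_sketch(mvzip_sketch(arr1, arr2, "-"), arr3, ":")`.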
+**Parameters**: -**Argument type:** `mv_left: ARRAY, mv_right: ARRAY, delim: STRING (optional)` -**Return type:** `ARRAY of STRING` -### Example +- `mv_left` (Required): The first array to combine. +- `mv_right` (Required): The second array to combine. +- `delim` (Optional): The delimiter to use for joining pairs. Defaults to comma. + +**Return type**: `ARRAY` + +#### Example + +The following example combines host and port arrays with a colon delimiter: ```ppl source=people @@ -897,7 +1064,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -908,6 +1075,8 @@ fetched rows / total rows = 1/1 +----------------------+ ``` +The following example uses a pipe delimiter with equal-length arrays: + ```ppl source=people | eval arr1 = array('a', 'b', 'c'), arr2 = array('x', 'y', 'z'), result = mvzip(arr1, arr2, '|') @@ -915,7 +1084,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -926,6 +1095,8 @@ fetched rows / total rows = 1/1 +---------------+ ``` +The following example demonstrates behavior with arrays of different lengths: + ```ppl source=people | eval arr1 = array('1', '2', '3'), arr2 = array('a', 'b'), result = mvzip(arr1, arr2, '-') @@ -933,7 +1104,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -944,6 +1115,8 @@ fetched rows / total rows = 1/1 +-----------+ ``` +The following example shows nested mvzip calls: + ```ppl source=people | eval arr1 = array('a', 'b', 'c'), arr2 = array('x', 'y', 'z'), arr3 = array('1', '2', '3'), result = mvzip(mvzip(arr1, arr2, '-'), arr3, ':') @@ -951,7 +1124,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -962,6 +1135,8 @@ fetched rows / total rows = 1/1 +---------------------+ ``` +The following 
example shows behavior with an empty array: + ```ppl source=people | eval arr1 = array('a', 'b'), arr2 = array(), result = mvzip(arr1, arr2) @@ -969,7 +1144,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git a/docs/user/ppl/functions/condition.md b/docs/user/ppl/functions/condition.md index cb5dff9107e..759fa09a1e5 100644 --- a/docs/user/ppl/functions/condition.md +++ b/docs/user/ppl/functions/condition.md @@ -1,22 +1,24 @@ -# Condition Functions -PPL functions use the search capabilities of the OpenSearch engine. However, these functions don't execute directly within the OpenSearch plugin's memory. Instead, they facilitate the global filtering of query results based on specific conditions, such as a `WHERE` or `HAVING` clause. +# Conditional functions -The following sections describe the condition PPL functions. -## ISNULL +PPL conditional functions enable global filtering of query results based on specific conditions, such as `WHERE` or `HAVING` clauses. These functions use the search capabilities of the OpenSearch engine but don't execute directly within the OpenSearch plugin's memory. +## ISNULL -### Description +**Usage**: `isnull(field)` -Usage: `isnull(field)` returns TRUE if field is NULL, FALSE otherwise. +Returns `TRUE` if the field is `NULL`, `FALSE` otherwise. The `isnull()` function is commonly used: -- In `eval` expressions to create conditional fields -- With the `if()` function to provide default values -- In `where` clauses to filter null records - -**Argument type:** All supported data types. -**Return type:** `BOOLEAN` +- In `eval` expressions to create conditional fields. +- With the `if()` function to provide default values. +- In `where` clauses to filter null records. -### Example +**Parameters**: + +- `field` (Required): The field to check for null values. 
+ +**Return type**: `BOOLEAN` + +#### Example ```ppl source=accounts @@ -24,7 +26,7 @@ source=accounts | fields result, employer, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -38,7 +40,7 @@ fetched rows / total rows = 4/4 +--------+----------+-----------+ ``` -Using with if() to label records +The following example demonstrates using `isnull` with the `if` function to create conditional labels: ```ppl source=accounts @@ -46,7 +48,7 @@ source=accounts | fields firstname, employer, status ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -60,15 +62,15 @@ fetched rows / total rows = 4/4 +-----------+----------+------------+ ``` -Filtering with where clause - +The following example filters records using `isnull` in a `where` clause: + ```ppl source=accounts | where isnull(employer) | fields account_number, firstname, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -79,23 +81,27 @@ fetched rows / total rows = 1/1 +----------------+-----------+----------+ ``` -## ISNOTNULL +## ISNOTNULL -### Description +**Usage**: `isnotnull(field)` -Usage: `isnotnull(field)` returns TRUE if field is NOT NULL, FALSE otherwise. The `isnotnull(field)` function is the opposite of `isnull(field)`. Instead of checking for null values, it checks a specific field and returns `true` if the field contains data, that is, it is not null. +Returns `TRUE` if the field is NOT `NULL`, `FALSE` otherwise. The `isnotnull()` function is commonly used: -- In `eval` expressions to create boolean flags -- In `where` clauses to filter out null values -- With the `if()` function for conditional logic -- To validate data presence - -**Argument type:** All supported data types. -**Return type:** `BOOLEAN` -**Synonyms:** [ISPRESENT](#ispresent) +- In `eval` expressions to create Boolean flags. 
+- In `where` clauses to filter out null values. +- With the `if()` function for conditional logic. +- To validate data presence. -### Example +**Synonyms**: [ISPRESENT](#ispresent) + +**Parameters**: + +- `field` (Required): The field to check for non-null values. + +**Return type**: `BOOLEAN` + +#### Example ```ppl source=accounts @@ -103,7 +109,7 @@ source=accounts | fields firstname, employer, has_employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -117,15 +123,15 @@ fetched rows / total rows = 4/4 +-----------+----------+--------------+ ``` -Filtering with where clause - +The following example shows how to filter records using `isnotnull` in a `where` clause: + ```ppl source=accounts | where not isnotnull(employer) | fields account_number, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -136,15 +142,15 @@ fetched rows / total rows = 1/1 +----------------+----------+ ``` -Using with if() for validation messages - +The following example demonstrates using `isnotnull` with the `if` function to create validation messages: + ```ppl source=accounts | eval validation = if(isnotnull(employer), 'valid', 'missing employer') | fields firstname, employer, validation ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -158,10 +164,15 @@ fetched rows / total rows = 4/4 +-----------+----------+------------------+ ``` -## EXISTS +## EXISTS + +**Usage**: Use `isnull(field)` or `isnotnull(field)` to test field existence + +Since OpenSearch doesn't differentiate between null and missing values, functions like `ismissing`/`isnotmissing` are not available. Use `isnull`/`isnotnull` to test field existence instead. 
+ +#### Example -[Since OpenSearch doesn't differentiate null and missing](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html), we can't provide functions like ismissing/isnotmissing to test if a field exists or not. But you can still use isnull/isnotnull for such purpose. -Example, the account 13 doesn't have email field +The following example shows account 13, which doesn't contain an `email` field: ```ppl source=accounts @@ -169,7 +180,7 @@ source=accounts | fields account_number, email ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -180,16 +191,20 @@ fetched rows / total rows = 1/1 +----------------+-------+ ``` -## IFNULL +## IFNULL -### Description +**Usage**: `ifnull(field1, field2)` -Usage: `ifnull(field1, field2)` returns field2 if field1 is null. +Returns `field2` if `field1` is `NULL`. -**Argument type:** All supported data types (NOTE: if two parameters have different types, you will fail semantic check). -**Return type:** `any` +**Parameters**: -### Example +- `field1` (Required): The field to check for `NULL` values. +- `field2` (Required): The value to return if `field1` is `NULL`. + +**Return type**: Any (matches input types) + +#### Example ```ppl source=accounts @@ -197,7 +212,7 @@ source=accounts | fields result, employer, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -211,11 +226,12 @@ fetched rows / total rows = 4/4 +---------+----------+-----------+ ``` -### Nested IFNULL Pattern +#### Nested ifnull pattern -For OpenSearch versions prior to 3.1, COALESCE-like functionality can be achieved using nested IFNULL statements. This pattern is particularly useful in observability use cases where field names may vary across different data sources. +For OpenSearch versions prior to 3.1, `coalesce`-like functionality can be achieved using nested `ifnull` statements. 
This pattern is particularly useful in observability use cases where field names may vary across different data sources. Usage: `ifnull(field1, ifnull(field2, ifnull(field3, default_value)))` -### Example + +#### Example ```ppl source=accounts @@ -223,7 +239,7 @@ source=accounts | fields result, employer, firstname, lastname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -237,16 +253,20 @@ fetched rows / total rows = 4/4 +---------+----------+-----------+----------+ ``` -## NULLIF +## NULLIF -### Description +**Usage**: `nullif(field1, field2)` -Usage: `nullif(field1, field2)` returns null if two parameters are same, otherwise returns field1. +Returns `NULL` if the two parameters are the same, otherwise returns `field1`. -**Argument type:** All supported data types (NOTE: if two parameters have different types, you will fail semantic check). -**Return type:** `any` +**Parameters**: -### Example +- `field1` (Required): The field to return if different from `field2`. +- `field2` (Required): The value to compare against `field1`. + +**Return type**: Any (matches `field1` type) + +#### Example ```ppl source=accounts @@ -254,7 +274,7 @@ source=accounts | fields result, employer, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -268,16 +288,23 @@ fetched rows / total rows = 4/4 +---------+----------+-----------+ ``` -## IF +## IF -### Description +**Usage**: `if(condition, expr1, expr2)` -Usage: `if(condition, expr1, expr2)` returns expr1 if condition is true, otherwise returns expr2. +Returns `expr1` if the condition is `true`, otherwise returns `expr2`. -**Argument type:** All supported data types (NOTE: if expr1 and expr2 are different types, you will fail semantic check). -**Return type:** `any` +**Parameters**: -### Example +- `condition` (Required): The Boolean expression to evaluate. 
+- `expr1` (Required): The value to return if the condition is `true`. +- `expr2` (Required): The value to return if the condition is `false`. + +**Return type**: Least restrictive common type of `expr1` and `expr2` + +#### Example + +The following example returns the first name when the condition is `true`: ```ppl source=accounts @@ -285,7 +312,7 @@ source=accounts | fields result, firstname, lastname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -298,14 +325,16 @@ fetched rows / total rows = 4/4 | Dale | Dale | Adams | +---------+-----------+----------+ ``` - + +The following example returns the last name when the condition is `false`: + ```ppl source=accounts | eval result = if(false, firstname, lastname) | fields result, firstname, lastname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -318,14 +347,17 @@ fetched rows / total rows = 4/4 | Adams | Dale | Adams | +--------+-----------+----------+ ``` - + +The following example uses a complex condition to determine VIP status: + ```ppl source=accounts | eval is_vip = if(age > 30 AND isnotnull(employer), true, false) | fields is_vip, firstname, lastname ``` - -Expected output: + + +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -339,30 +371,37 @@ fetched rows / total rows = 4/4 +--------+-----------+----------+ ``` -## CASE +## CASE -### Description +**Usage**: `case(condition1, expr1, condition2, expr2, ... conditionN, exprN else default)` -Usage: `case(condition1, expr1, condition2, expr2, ... conditionN, exprN else default)` returns expr1 if condition1 is true, or returns expr2 if condition2 is true, ... if no condition is true, then returns the value of ELSE clause. If the ELSE clause is not defined, returns NULL. +Returns `expr1` if `condition1` is `true`, `expr2` if `condition2` is `true`, and so on. 
If no condition is `true`, returns the value of the `else` clause. If the `else` clause is not defined, returns `NULL`. -**Argument type:** All supported data types (NOTE: there is no comma before "else"). -**Return type:** `any` +**Parameters**: -### Limitations +- `condition1, condition2, ..., conditionN` (Required): Boolean expressions to evaluate in sequence. +- `expr1, expr2, ..., exprN` (Required): Values to return when the corresponding condition is `true`. +- `default` (Optional): The value to return when no condition is `true`. If not specified, returns `NULL`. -When each condition is a field comparison with a numeric literal and each result expression is a string literal, the query will be optimized as [range aggregations](https://docs.opensearch.org/latest/aggregations/bucket/range) if pushdown optimization is enabled. However, this optimization has the following limitations: -- Null values will not be grouped into any bucket of a range aggregation and will be ignored -- The default ELSE clause will use the string literal `"null"` instead of actual NULL values - -### Example +**Return type**: Least restrictive common type of all result expressions + +#### Limitations + +When each condition is a field comparison against a numeric literal and each result expression is a string literal, the query is optimized as [range aggregations](https://docs.opensearch.org/latest/aggregations/bucket/range/) if pushdown optimization is enabled. However, this optimization has the following limitations: +- `NULL` values are not grouped into any bucket of a range aggregation and are ignored. +- The default `else` clauses use the string literal `"null"` instead of actual NULL values. 
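
The evaluation order of `case` (first true condition wins, then the `else` value, then `NULL`) can be sketched in Python. The names `case_sketch` and `label` are hypothetical, `NULL` is modeled as `None`, and the second record is invented for illustration. The sketch does not model the range aggregation pushdown described in the limitations.

```python
from typing import Any, Dict, List, Tuple


def case_sketch(branches: List[Tuple[Any, Any]], default: Any = None) -> Any:
    """Hypothetical model of case: return the result of the first true condition."""
    for condition, result in branches:
        if condition is True:  # a NULL (None) condition is not true
            return result
    return default  # the else value, or None when no else clause is given


def label(record: Dict[str, Any]) -> Any:
    # Mirrors: case(age > 35, firstname, age < 30, lastname else employer)
    age = record["age"]
    return case_sketch(
        [(age > 35, record["firstname"]), (age < 30, record["lastname"])],
        default=record["employer"],
    )


dale = {"firstname": "Dale", "lastname": "Adams", "age": 33, "employer": None}
older = {"firstname": "Sam", "lastname": "Lee", "age": 40, "employer": "Acme"}  # invented record
print(label(dale))   # None
print(label(older))  # Sam
```
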
+#### Example + +The following example demonstrates a case statement with an else clause: + ```ppl source=accounts | eval result = case(age > 35, firstname, age < 30, lastname else employer) | fields result, firstname, lastname, age, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -375,14 +414,16 @@ fetched rows / total rows = 4/4 | null | Dale | Adams | 33 | null | +--------+-----------+----------+-----+----------+ ``` - + +The following example demonstrates a case statement without an else clause: + ```ppl source=accounts | eval result = case(age > 35, firstname, age < 30, lastname) | fields result, firstname, lastname, age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -395,14 +436,16 @@ fetched rows / total rows = 4/4 | null | Dale | Adams | 33 | +--------+-----------+----------+-----+ ``` - + +The following example uses case in a where clause to filter records: + ```ppl source=accounts | where true = case(age > 35, false, age < 30, false else true) | fields firstname, lastname, age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -414,32 +457,36 @@ fetched rows / total rows = 2/2 +-----------+----------+-----+ ``` -## COALESCE +## COALESCE -### Description +**Usage**: `coalesce(field1, field2, ...)` -Usage: `coalesce(field1, field2, ...)` returns the first non-null, non-missing value in the argument list. +Returns the first non-null, non-missing value in the parameter list. -**Argument type:** All supported data types. Supports mixed data types with automatic type coercion. -**Return type:** Determined by the least restrictive common type among all arguments, with fallback to string if no common type can be determined. 
-Behavior: -- Returns the first value that is not null and not missing (missing includes non-existent fields) -- Empty strings ("") and whitespace strings (" ") are considered valid values -- If all arguments are null or missing, returns null -- Automatic type coercion is applied to match the determined return type -- If type conversion fails, the value is converted to string representation -- For best results, use arguments of the same data type to avoid unexpected type conversions - -Performance Considerations: -- Optimized for multiple field evaluation, more efficient than nested IFNULL patterns -- Evaluates arguments sequentially, stopping at the first non-null value -- Consider field order based on likelihood of containing values to minimize evaluation overhead - -Limitations: -- Type coercion may result in unexpected string conversions for incompatible types -- Performance may degrade with very large numbers of arguments - -### Example +**Parameters**: + +- `field1, field2, ...` (Required): Fields or expressions to evaluate for non-null values. + +**Return type**: Least restrictive common type of all input parameters + +**Behavior**: +- Returns the first value that is not `NULL` and not missing (missing includes non-existent fields). +- Empty strings (`""`) and whitespace strings (`" "`) are considered valid values. +- If all parameters are `NULL` or missing, returns `NULL`. +- Automatic type coercion is applied to match the determined return type. +- If type conversion fails, the value is converted to string representation. +- For best results, use parameters of the same data type to avoid unexpected type conversions. + +**Performance considerations**: +- Optimized for multiple field evaluation, more efficient than nested `ifnull` patterns. +- Evaluates parameters sequentially, stopping at the first non-null value. +- Consider field order based on likelihood of containing values to minimize evaluation overhead. 
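The selection rule in the behavior list can be sketched in Python as follows. This is a minimal model under stated assumptions: `None` stands in for `NULL`, the `MISSING` sentinel is a hypothetical marker for a non-existent field, and the type coercion step is deliberately left out.

```python
MISSING = object()  # hypothetical marker for a field that does not exist


def coalesce(*values):
    """Return the first value that is neither NULL (None) nor missing."""
    for value in values:
        if value is not None and value is not MISSING:
            return value  # stops at the first usable value
    return None  # every argument was NULL or missing


print(repr(coalesce(None, "", "fallback")))   # ''
print(repr(coalesce(MISSING, " ", "x")))      # ' '
print(coalesce(None, MISSING))                # None
```

The first two calls show that empty and whitespace-only strings count as valid values, which is why they are returned ahead of the later arguments.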
+ +**Limitations**: +- Type coercion may result in unexpected string conversions for incompatible types. +- Performance may degrade when using large numbers of arguments. + +#### Example ```ppl source=accounts @@ -447,7 +494,7 @@ source=accounts | fields result, firstname, lastname, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -461,7 +508,7 @@ fetched rows / total rows = 4/4 +---------+-----------+----------+----------+ ``` -Empty String Handling Examples +#### Empty String Handling Examples ```ppl source=accounts @@ -470,7 +517,7 @@ source=accounts | fields result, empty_field, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -490,7 +537,7 @@ source=accounts | fields result, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -504,7 +551,7 @@ fetched rows / total rows = 4/4 +--------+-----------+ ``` -Mixed Data Types with Auto Coercion +#### Mixed Data Types with Auto Coercion ```ppl source=accounts @@ -512,7 +559,7 @@ source=accounts | fields result, employer, balance ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -526,7 +573,7 @@ fetched rows / total rows = 4/4 +---------+----------+---------+ ``` -Non-existent Field Handling +#### Non-existent Field Handling ```ppl source=accounts @@ -534,7 +581,7 @@ source=accounts | fields result, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -548,17 +595,21 @@ fetched rows / total rows = 4/4 +---------+-----------+ ``` -## ISPRESENT +## ISPRESENT -### Description +**Usage**: `ispresent(field)` -Usage: `ispresent(field)` returns true if the field exists. +Returns `TRUE` if the field exists, `FALSE` otherwise. -**Argument type:** All supported data types. 
-**Return type:** `BOOLEAN` -**Synonyms:** [ISNOTNULL](#isnotnull) +**Parameters**: -### Example +- `field` (Required): The field to check for existence. + +**Return type**: `BOOLEAN` + +**Synonyms**: [ISNOTNULL](#isnotnull) + +#### Example ```ppl source=accounts @@ -566,7 +617,7 @@ source=accounts | fields employer, firstname ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 3/3 @@ -579,16 +630,19 @@ fetched rows / total rows = 3/3 +----------+-----------+ ``` -## ISBLANK +## ISBLANK -### Description +**Usage**: `isblank(field)` -Usage: `isblank(field)` returns true if the field is null, an empty string, or contains only white space. +Returns `TRUE` if the field is `NULL`, an empty string, or contains only white space. -**Argument type:** All supported data types. -**Return type:** `BOOLEAN` +**Parameters**: -### Example +- `field` (Required): The field to check for blank values. + +**Return type**: `BOOLEAN` + +#### Example ```ppl source=accounts @@ -597,7 +651,7 @@ source=accounts | fields `isblank(temp)`, temp, `isblank(employer)`, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -611,16 +665,19 @@ fetched rows / total rows = 4/4 +---------------+---------+-------------------+----------+ ``` -## ISEMPTY +## ISEMPTY -### Description +**Usage**: `isempty(field)` -Usage: `isempty(field)` returns true if the field is null or is an empty string. +Returns `TRUE` if the field is `NULL` or is an empty string. -**Argument type:** All supported data types. -**Return type:** `BOOLEAN` +**Parameters**: -### Example +- `field` (Required): The field to check for empty values. 
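The difference between `isblank` and `isempty` is easiest to see side by side. The Python sketch below models only the documented rules for string input, with `None` standing in for `NULL`; it assumes Python's notion of white space, which may not match the engine's exactly.

```python
def isempty(value):
    """True when the value is NULL (None) or an empty string."""
    return value is None or value == ""


def isblank(value):
    """True when the value is NULL, empty, or contains only white space."""
    return value is None or (isinstance(value, str) and value.strip() == "")


for sample in (None, "", "   ", "Pyrami"):
    print(repr(sample), isempty(sample), isblank(sample))
```

The two functions disagree only on whitespace-only strings, which are blank but not empty.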
+ +**Return type**: `BOOLEAN` + +#### Example ```ppl source=accounts @@ -629,7 +686,7 @@ source=accounts | fields `isempty(temp)`, temp, `isempty(employer)`, employer ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -643,37 +700,37 @@ fetched rows / total rows = 4/4 +---------------+---------+-------------------+----------+ ``` -## EARLIEST +## EARLIEST -### Description +**Usage**: `earliest(relative_string, field)` -Usage: `earliest(relative_string, field)` returns true if the value of field is after the timestamp derived from relative_string relative to the current time. Otherwise, returns false. -relative_string: -The relative string can be one of the following formats: -1. `"now"` or `"now()"`: - - Uses the current system time. -2. Absolute format (`MM/dd/yyyy:HH:mm:ss` or `yyyy-MM-dd HH:mm:ss`): - - Converts the string to a timestamp and compares it with the data. -3. Relative format: `(+|-)[+<...>]@` - - Steps to specify a relative time: - - **a. Time offset:** Indicate the offset from the current time using `+` or `-`. - - **b. Time amount:** Provide a numeric value followed by a time unit (`s`, `m`, `h`, `d`, `w`, `M`, `y`). - - **c. Snap to unit:** Optionally specify a snap unit with `@` to round the result down to the nearest unit (e.g., hour, day, month). - - **Examples** (assuming current time is `2025-05-28 14:28:34`): - - `-3d+2y` → `2027-05-25 14:28:34` - - `+1d@m` → `2025-05-29 14:28:00` - - `-3M+1y@M` → `2026-02-01 00:00:00` - -Read more details [here](https://github.com/opensearch-project/opensearch-spark/blob/main/docs/ppl-lang/functions/ppl-datetime.md#relative_timestamp) +Returns `TRUE` if the field value is after the timestamp derived from `relative_string` relative to the current time, `FALSE` otherwise. 
-**Argument type:** `relative_string`: `STRING`, `field`: `TIMESTAMP` -**Return type:** `BOOLEAN` +**Parameters**: -### Example +- `relative_string` (Required): The reference time specification in one of the supported formats. +- `field` (Required): The timestamp field to compare against the reference time. + +**Return type**: `BOOLEAN` + +**Relative string formats**: +1. `"now"` or `"now()"`: Uses the current system time. +2. Absolute format (`MM/dd/yyyy:HH:mm:ss` or `yyyy-MM-dd HH:mm:ss`): Converts the string to a timestamp and compares it against the field value. +3. Relative format: `(+|-)[+<...>]@` + +**Steps to specify a relative time**: +- **Time offset**: Indicate the offset from the current time using `+` or `-`. +- **Time amount**: Provide a numeric value followed by a time unit (`s`, `m`, `h`, `d`, `w`, `M`, `y`). +- **Snap to unit**: Optionally, specify a snap unit using `@` to round the result down to the nearest unit (for example, hour, day, month). + +**Examples** (assuming current time is `2025-05-28 14:28:34`): +- `-3d+2y` → `2027-05-25 14:28:34`. +- `+1d@m` → `2025-05-29 14:28:00`. +- `-3M+1y@M` → `2026-02-01 00:00:00`. 
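The three relative-time examples can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the engine's parser: the `resolve_relative` name is hypothetical, only the single-letter units listed above are handled, month and year offsets clamp the day of month when needed, and snapping to a week is omitted because the week-start convention is not stated here.

```python
import calendar
import re
from datetime import datetime, timedelta

# Fixed-length units map directly onto timedelta keyword arguments.
_FIXED = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

# Fields to reset when rounding down to each snap unit (week is not modeled).
_SNAP_RESET = {
    "s": {},
    "m": {"second": 0},
    "h": {"minute": 0, "second": 0},
    "d": {"hour": 0, "minute": 0, "second": 0},
    "M": {"day": 1, "hour": 0, "minute": 0, "second": 0},
    "y": {"month": 1, "day": 1, "hour": 0, "minute": 0, "second": 0},
}


def _add_months(ts, months):
    """Shift ts by calendar months, clamping the day to the month length."""
    year, month_index = divmod(ts.year * 12 + (ts.month - 1) + months, 12)
    month = month_index + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def resolve_relative(expr, now):
    """Apply each signed offset in order, then snap down if '@unit' is given."""
    offsets, _, snap_unit = expr.partition("@")
    ts = now
    for sign, amount, unit in re.findall(r"([+-])(\d+)([smhdwMy])", offsets):
        n = int(amount) if sign == "+" else -int(amount)
        if unit == "M":
            ts = _add_months(ts, n)
        elif unit == "y":
            ts = _add_months(ts, 12 * n)
        else:
            ts += timedelta(**{_FIXED[unit]: n})
    if snap_unit:
        ts = ts.replace(microsecond=0, **_SNAP_RESET[snap_unit])
    return ts


now = datetime(2025, 5, 28, 14, 28, 34)
print(resolve_relative("-3d+2y", now))    # 2027-05-25 14:28:34
print(resolve_relative("+1d@m", now))     # 2025-05-29 14:28:00
print(resolve_relative("-3M+1y@M", now))  # 2026-02-01 00:00:00
```

Offsets are applied left to right before snapping, which is why `-3M+1y@M` lands on the first day of February 2026.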
+ +#### Example + +The following example compares timestamps against current time and relative time: ```ppl source=accounts @@ -683,7 +740,7 @@ source=accounts | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -693,14 +750,16 @@ fetched rows / total rows = 1/1 | False | True | +-------+------+ ``` - + +The following example filters records using an absolute time format: + ```ppl source=nyc_taxi | where earliest('07/01/2014:00:30:00', timestamp) | stats COUNT() as cnt ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -711,16 +770,22 @@ fetched rows / total rows = 1/1 +-----+ ``` -## LATEST +## LATEST -### Description +**Usage**: `latest(relative_string, field)` -Usage: `latest(relative_string, field)` returns true if the value of field is before the timestamp derived from relative_string relative to the current time. Otherwise, returns false. +Returns `TRUE` if the field value is before the timestamp derived from `relative_string` relative to the current time, `FALSE` otherwise. -**Argument type:** `relative_string`: `STRING`, `field`: `TIMESTAMP` -**Return type:** `BOOLEAN` +**Parameters**: -### Example +- `relative_string` (Required): The reference time specification in one of the supported formats. +- `field` (Required): The timestamp field to compare against the reference time. 
+ +**Return type**: `BOOLEAN` + +#### Example + +The following example compares timestamps using the latest function: ```ppl source=accounts @@ -730,7 +795,7 @@ source=accounts | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -740,14 +805,16 @@ fetched rows / total rows = 1/1 | True | True | +------+------+ ``` - + +The following example filters records using latest with an absolute time format: + ```ppl source=nyc_taxi | where latest('07/21/2014:04:00:00', timestamp) | stats COUNT() as cnt ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -777,7 +844,7 @@ Syntax: ` contains ''` ### Example -Basic substring filter: +The following example filters accounts using a substring match to find names containing 'mbe': ```ppl source=accounts @@ -785,7 +852,7 @@ source=accounts | fields firstname, age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -796,7 +863,7 @@ fetched rows / total rows = 1/1 +-----------+-----+ ``` -Case-insensitive matching (all of the following are equivalent): +The following queries are all equivalent due to case-insensitive matching: ```ppl ignore source=accounts | where firstname contains 'mbe' @@ -804,7 +871,7 @@ source=accounts | where firstname CONTAINS 'MBE' source=accounts | where firstname Contains 'Mbe' ``` -Combining with other conditions: +The following example combines substring filtering with other conditions: ```ppl source=accounts @@ -812,7 +879,7 @@ source=accounts | fields firstname, employer, age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -823,73 +890,75 @@ fetched rows / total rows = 1/1 +-----------+----------+-----+ ``` -## REGEXP_MATCH +## REGEXP_MATCH -### Description +**Usage**: `regexp_match(string, pattern)` -Usage: `regexp_match(string, pattern)` returns true if the regular 
expression pattern finds a match against any substring of the string value, otherwise returns false. -The function uses Java regular expression syntax for the pattern. +Returns `TRUE` if the regular expression pattern finds a match against any substring of the string value, otherwise returns `FALSE`. The function uses Java regular expression syntax for the pattern. -**Argument type:** `STRING`, `STRING` -**Return type:** `BOOLEAN` +**Parameters**: -### Example +- `string` (Required): The string to search within. +- `pattern` (Required): The regular expression pattern to match against. + +**Return type**: `BOOLEAN` + +#### Example + +The following example filters log messages using a regex pattern: -``` ppl ignore -source=logs | where regexp_match(message, 'ERROR|WARN|FATAL') | fields timestamp, message +```ppl +source=logs +| where regexp_match(message, 'ERROR|WARN|FATAL') +| fields timestamp, message ``` + -```text -fetched rows / total rows = 3/100 -+---------------------+------------------------------------------+ -| timestamp | message | -|---------------------+------------------------------------------| -| 2024-01-15 10:23:45 | ERROR: Connection timeout to database | -| 2024-01-15 10:24:12 | WARN: High memory usage detected | -| 2024-01-15 10:25:33 | FATAL: System crashed unexpectedly | -+---------------------+------------------------------------------+ -``` +| timestamp | message | +| --- | --- | +| 2024-01-15 10:23:45 | ERROR: Connection timeout to database | +| 2024-01-15 10:24:12 | WARN: High memory usage detected | +| 2024-01-15 10:25:33 | FATAL: System crashed unexpectedly | + +The following example uses regex to validate email addresses: -``` ppl ignore -source=users | where regexp_match(email, '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}') | fields name, email +```ppl +source=users +| where regexp_match(email, '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}') +| fields name, email ``` + -```text -fetched rows / total rows = 2/3 
-+-------+----------------------+ -| name | email | -|-------+----------------------| -| John | john@example.com | -| Alice | alice@company.org | -+-------+----------------------+ -``` +| name | email | +| --- | --- | +| John | john@example.com | +| Alice | alice@company.org | + +The following example filters for valid public IP addresses using regex: -```ppl ignore -source=network | where regexp_match(ip_address, '^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$') AND NOT regexp_match(ip_address, '^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)') | fields ip_address, status +```ppl +source=network +| where regexp_match(ip_address, '^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$') AND NOT regexp_match(ip_address, '^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)') +| fields ip_address, status ``` + -```text -fetched rows / total rows = 2/10 -+---------------+--------+ -| ip_address | status | -|---------------+--------| -| 8.8.8.8 | active | -| 1.1.1.1 | active | -+---------------+--------+ -``` +| ip_address | status | +| --- | --- | +| 8.8.8.8 | active | +| 1.1.1.1 | active | + +The following example uses regex for product categorization with case-insensitive matching: -```ppl ignore -source=products | eval category = if(regexp_match(name, '(?i)(laptop|computer|desktop)'), 'Computing', if(regexp_match(name, '(?i)(phone|tablet|mobile)'), 'Mobile', 'Other')) | fields name, category +```ppl +source=products +| eval category = if(regexp_match(name, '(?i)(laptop|computer|desktop)'), 'Computing', if(regexp_match(name, '(?i)(phone|tablet|mobile)'), 'Mobile', 'Other')) +| fields name, category ``` + -```text -fetched rows / total rows = 4/4 -+------------------------+----------+ -| name | category | -|------------------------+----------| -| Dell Laptop XPS | Computing| -| iPhone 15 Pro | Mobile | -| Wireless Mouse | Other | -| Desktop Computer Tower | Computing| -+------------------------+----------+ -``` \ No newline at end of file +| name | category | +| --- | --- | +| Dell Laptop XPS | 
Computing | +| iPhone 15 Pro | Mobile | +| Wireless Mouse | Other | \ No newline at end of file diff --git a/docs/user/ppl/functions/conversion.md b/docs/user/ppl/functions/conversion.md index 99efe161033..7654932129c 100644 --- a/docs/user/ppl/functions/conversion.md +++ b/docs/user/ppl/functions/conversion.md @@ -1,10 +1,21 @@ -# Type Conversion Functions +# Type conversion functions -## CAST +The following type conversion functions are supported in PPL. -### Description +## CAST -Usage: `cast(expr as dateType)` cast the expr to dataType. return the value of dataType. The following conversion rules are used: +**Usage**: `cast(expr as dataType)` + +Casts the expression to the specified data type and returns the converted value. + +**Parameters**: + +- `expr` (Required): The expression to cast to a different data type. +- `dataType` (Required): The target data type for the cast operation. + +**Return type**: Specified by data type + +The following table shows the conversion rules used for casting between data types: | Src/Target | STRING | NUMBER | BOOLEAN | TIMESTAMP | DATE | TIME | IP | | --- | --- | --- | --- | --- | --- | --- | --- | @@ -16,11 +27,12 @@ Usage: `cast(expr as dateType)` cast the expr to dataType. return the value of d | TIME | Note1 | N/A | N/A | N/A | N/A | | N/A | | IP | Note2 | N/A | N/A | N/A | N/A | N/A | | -Note1: the conversion follow the JDK specification. -Note2: IP will be converted to its canonical representation. Canonical representation -for IPv6 is described in [RFC 5952](https://datatracker.ietf.org/doc/html/rfc5952). +Note1: The conversion follows the JDK specification. +Note2: IP addresses are converted to their canonical representation. The canonical representation for IPv6 is described in [RFC 5952](https://datatracker.ietf.org/doc/html/rfc5952). 
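To illustrate what a canonical IP representation looks like, the following Python sketch uses the standard `ipaddress` module, whose string form for IPv6 is lowercase with the longest run of zero groups compressed. It is only an analogy for Note2; the `canonical_ip` helper is a hypothetical name, and the engine's output is assumed, not confirmed, to match Python's.

```python
import ipaddress


def canonical_ip(text):
    """Parse an IPv4 or IPv6 literal and return its compressed string form."""
    return str(ipaddress.ip_address(text))


print(canonical_ip("2001:0DB8:0000:0000:0000:0000:0000:0001"))  # 2001:db8::1
print(canonical_ip("192.168.0.1"))                              # 192.168.0.1
```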
+ +#### Example -### Example: Cast to string +The following example casts different data types to string: ```ppl source=people @@ -28,7 +40,7 @@ source=people | fields `cbool`, `cint`, `cdate` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -39,7 +51,7 @@ fetched rows / total rows = 1/1 +-------+------+------------+ ``` -### Example: Cast to number +The following example casts values to integer type: ```ppl source=people @@ -47,7 +59,7 @@ source=people | fields `cbool`, `cstring` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -58,7 +70,7 @@ fetched rows / total rows = 1/1 +-------+---------+ ``` -### Example: Cast to date +The following example casts strings to date, time, and timestamp types: ```ppl source=people @@ -66,7 +78,7 @@ source=people | fields `cdate`, `ctime`, `ctimestamp` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -77,7 +89,7 @@ fetched rows / total rows = 1/1 +------------+----------+---------------------+ ``` -### Example: Cast function can be chained +The following example demonstrates chaining cast functions: ```ppl source=people @@ -85,7 +97,7 @@ source=people | fields `cbool` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -96,20 +108,19 @@ fetched rows / total rows = 1/1 +-------+ ``` -## IMPLICIT (AUTO) TYPE CONVERSION +## Implicit type conversion + +Implicit conversion is automatic casting. When a function does not have an exact match for the input types, the engine looks for another signature that can safely handle the values. It selects the option that requires the least conversion of the original types, so you can mix literals and fields without adding explicit `cast` functions. -Implicit conversion is automatic casting. 
When a function does not have an exact match for the -input types, the engine looks for another signature that can safely work with the values. It picks -the option that requires the least stretching of the original types, so you can mix literals and -fields without adding `CAST` everywhere. +### String to numeric type conversion -### String to numeric +When a string is used where a numeric value is expected, the engine attempts to parse the string as a number: +- The string must represent a valid numeric value, such as `"3.14"` or `"42"`. Any other value causes the query to fail. +- If a string is used alongside numeric arguments, the engine treats it as a `DOUBLE` so that the numeric overload of the function can be applied. -When a string stands in for a number we simply parse the text: -- The value must be something like `"3.14"` or `"42"`. Anything else causes the query to fail. -- If a string appears next to numeric arguments, it is treated as a `DOUBLE` so the numeric overload of the function can run. 
+#### Example -### Example: Use string in arithmetic operator +The following example demonstrates using strings in arithmetic operations: ```ppl source=people @@ -117,7 +128,7 @@ source=people | fields divide, multiply, add, minus, concat ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -128,7 +139,7 @@ fetched rows / total rows = 1/1 +--------+----------+------+-------+--------+ ``` -### Example: Use string in comparison operator +The following example demonstrates using strings in comparison operations: ```ppl source=people @@ -136,7 +147,7 @@ source=people | fields e, en, ed, edn, l, ld, i ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -147,34 +158,30 @@ fetched rows / total rows = 1/1 +------+-------+------+-------+------+------+------+ ``` -## TOSTRING +## TOSTRING -### Description +**Usage**: `tostring(value[, format])` -The following usage options are available, depending on the parameter types and the number of parameters. +Converts the value to a string representation. If a format is provided, converts numbers to the specified format type. For Boolean values, converts to `TRUE` or `FALSE`. -Usage with format type: `tostring(ANY, [format])`: Converts the value in first argument to provided format type string in second argument. If second argument is not provided, then it converts to default string representation. +**Parameters**: -**Return type:** `STRING` +- `value` (Required): The value to convert to string (any data type). +- `format` (Optional): The format type for number conversion. This parameter is only used when `value` is a number. If `value` is a Boolean, this parameter is ignored. -Usage for boolean parameter without format type `tostring(boolean)`: Converts the string to 'TRUE' or 'FALSE'. - -**Return type:** `STRING` +Format types: -You can use this function with the eval commands and as part of eval expressions. 
If first argument can be any valid type, second argument is optional and if provided, it needs to be format name to convert to where first argument contains only numbers. If first argument is boolean, then second argument is not used even if its provided. +- `binary`: Converts a number to a binary value. +- `hex`: Converts the number to a hexadecimal value. +- `commas`: Formats the number using commas. If the number includes a decimal, the function rounds the number to the nearest two decimal places. +- `duration`: Converts the value in seconds to the readable time format `HH:MM:SS`. +- `duration_millis`: Converts the value in milliseconds to the readable time format `HH:MM:SS`. -Format types: -1. "binary" Converts a number to a binary value. -2. "hex" Converts the number to a hexadecimal value. -3. "commas" Formats the number with commas. If the number includes a decimal, the function rounds the number to nearest two decimal places. -4. "duration" Converts the value in seconds to the readable time format HH:MM:SS. -5. "duration_millis" Converts the value in milliseconds to the readable time format HH:MM:SS. - -The format argument is optional and is only used when the value argument is a number. The tostring function supports the following formats. +**Return type**: `STRING` -### Example: Convert number to binary string +#### Example -You can use this function to convert a number to a string of its binary representation. +The following example converts a number to its binary string representation: ```ppl source=accounts @@ -183,7 +190,7 @@ source=accounts | fields firstname, balance_binary, balance ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -194,9 +201,7 @@ fetched rows / total rows = 1/1 +-----------+------------------+---------+ ``` -### Example: Convert number to hex string - -You can use this function to convert a number to a string of its hex representation. 
+The following example converts a number to its hexadecimal string representation: ```ppl source=accounts @@ -205,7 +210,7 @@ source=accounts | fields firstname, balance_hex, balance ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -216,9 +221,7 @@ fetched rows / total rows = 1/1 +-----------+-------------+---------+ ``` -### Example: Format number with commas - -The following example formats the column totalSales to display values with commas. +The following example formats numbers with comma separators: ```ppl source=accounts @@ -227,7 +230,7 @@ source=accounts | fields firstname, balance_commas, balance ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -237,10 +240,10 @@ fetched rows / total rows = 1/1 | Amber | 39,225 | 39225 | +-----------+----------------+---------+ ``` - -### Example: Convert seconds to duration format -The following example converts number of seconds to HH:MM:SS format representing hours, minutes and seconds. +### Example: Convert seconds to duration format + +The following example converts the number of seconds to the `HH:MM:SS` format representing hours, minutes, and seconds: ```ppl source=accounts @@ -249,7 +252,7 @@ source=accounts | fields firstname, duration ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -260,9 +263,7 @@ fetched rows / total rows = 1/1 +-----------+----------+ ``` -### Example: Convert boolean to string - -The following example converts boolean parameter to string. 
+The following example converts a Boolean value to string: ```ppl source=accounts @@ -271,7 +272,7 @@ source=accounts | fields `boolean_str` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -284,22 +285,35 @@ fetched rows / total rows = 1/1 ## TONUMBER -### Description +**Usage**: `tonumber(string[, base])` + +Converts the string value to a number. The optional `base` parameter specifies the base of the input string. If not provided, the function assumes base `10`. + +**Parameters**: -Usage: `tonumber(string, [base])` converts the value in first argument. -The second argument describes the base of first argument. If second argument is not provided, then it converts to base 10 number representation. +- `string` (Required): The string representation of the number to convert. +- `base` (Optional): The base of the input string (between `2` and `36`). Defaults to `10`. -**Return type:** `NUMBER` +**Return type**: `NUMBER` -You can use this function with the eval commands and as part of eval expressions. Base values can be between 2 and 36. The maximum value supported for base 10 is +(2-2^-52)·2^1023 and minimum is -(2-2^-52)·2^1023. The maximum for other supported bases is 2^63-1 (or 7FFFFFFFFFFFFFFF) and minimum is -2^63 (or -7FFFFFFFFFFFFFFF). If the tonumber function cannot parse a field value to a number, the function returns NULL. You can use this function to convert a string representation of a binary number to return the corresponding number in base 10. +You can use this function with `eval` commands and as part of `eval` expressions. Base values can be between `2` and `36`. -### Example: Convert binary string to number +**Value limits**: +- Base 10: Maximum is +(2-2^-52)·2^1023 and minimum is -(2-2^-52)·2^1023. +- Other bases: Maximum is 2^63-1 (or 7FFFFFFFFFFFFFFF) and minimum is -2^63 (or -7FFFFFFFFFFFFFFF). 
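The base handling and limits described for `tonumber` can be modeled with a short Python sketch. This is an approximation under stated assumptions, not the engine's code: base 10 input is parsed as a double, other bases as integers, `None` stands in for `NULL`, and Python's parsers accept a few forms (such as surrounding white space) that the engine may reject.

```python
import sys


def tonumber(text, base=10):
    """Convert a string in the given base to a number, or None if unparseable."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    try:
        # Assumption: base 10 yields a double, every other base an integer.
        return float(text) if base == 10 else int(text, base)
    except ValueError:
        return None  # models the documented NULL result


print(tonumber("010101", 2))   # 21
print(tonumber("FA34", 16))    # 64052
print(tonumber("4598.678"))    # 4598.678
print(tonumber("xyz", 2))      # None

# The base 10 limit quoted above is the largest finite double.
print((2 - 2**-52) * 2.0**1023 == sys.float_info.max)  # True
# The upper limit for other bases is the largest signed 64-bit integer.
print(int("7FFFFFFFFFFFFFFF", 16) == 2**63 - 1)        # True
```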
+ +If the `tonumber` function cannot parse a field value to a number, the function returns `NULL`. You can use this function to convert string representations of numbers in various bases to their corresponding base 10 values. + +#### Example: Convert a binary string to a number ```ppl -source=people | eval int_value = tonumber('010101',2) | fields int_value | head 1 +source=people +| eval int_value = tonumber('010101',2) +| fields int_value +| head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -310,13 +324,16 @@ fetched rows / total rows = 1/1 +-----------+ ``` -### Example: Convert hex string to number +#### Example: Convert a hexadecimal string to a number ```ppl -source=people | eval int_value = tonumber('FA34',16) | fields int_value | head 1 +source=people +| eval int_value = tonumber('FA34',16) +| fields int_value +| head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -327,13 +344,16 @@ fetched rows / total rows = 1/1 +-----------+ ``` -### Example: Convert decimal string to number +#### Example: Convert a decimal string without a decimal part to a number ```ppl -source=people | eval int_value = tonumber('4598') | fields int_value | head 1 +source=people +| eval int_value = tonumber('4598') +| fields int_value +| head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -344,13 +364,16 @@ fetched rows / total rows = 1/1 +-----------+ ``` -### Example: Convert decimal string with fraction to number +#### Example: Convert a decimal string with a decimal part to a number ```ppl -source=people | eval double_value = tonumber('4598.678') | fields double_value | head 1 +source=people +| eval double_value = tonumber('4598.678') +| fields double_value +| head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git 
a/docs/user/ppl/functions/cryptographic.md b/docs/user/ppl/functions/cryptographic.md index 1ea1ca50f5c..25c38df2d33 100644 --- a/docs/user/ppl/functions/cryptographic.md +++ b/docs/user/ppl/functions/cryptographic.md @@ -1,16 +1,20 @@ -# PPL Cryptographic Functions +# Cryptographic functions -## MD5 +The following cryptographic functions are supported in PPL. -### Description +## MD5 -Version: 3.1.0 -Usage: `md5(str)` calculates the MD5 digest and returns the value as a 32-character hex string. +**Usage**: `MD5(str)` -**Argument type:** `STRING` -**Return type:** `STRING` +Calculates the MD5 digest and returns the value as a 32-character hex string. -### Example +**Parameters**: + +- `str` (Required): The string for which to calculate the MD5 digest. + +**Return type**: `STRING` + +#### Example ```ppl source=people @@ -18,7 +22,7 @@ source=people | fields `MD5('hello')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -29,17 +33,19 @@ fetched rows / total rows = 1/1 +----------------------------------+ ``` -## SHA1 +## SHA1 + +**Usage**: `SHA1(str)` -### Description +Returns the SHA-1 hash as a hex string. -Version: 3.1.0 -Usage: `sha1(str)` returns the hex string result of SHA-1. +**Parameters**: -**Argument type:** `STRING` -**Return type:** `STRING` +- `str` (Required): The string for which to calculate the SHA-1 hash. -### Example +**Return type**: `STRING` + +#### Example ```ppl source=people @@ -47,7 +53,7 @@ source=people | fields `SHA1('hello')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -58,18 +64,20 @@ fetched rows / total rows = 1/1 +------------------------------------------+ ``` -## SHA2 +## SHA2 + +**Usage**: `SHA2(str, numBits)` -### Description +Returns the result of SHA-2 family hash functions (SHA-224, SHA-256, SHA-384, and SHA-512) as a hex string. 
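For readers who want to cross-check digests outside of PPL, the three functions can be approximated with Python's standard `hashlib` module. This is a sketch under one labeled assumption, that the input string is hashed as UTF-8 bytes; the lowercase function names are simply local helpers.

```python
import hashlib

# Supported SHA-2 bit lengths mapped to their hashlib constructors.
_SHA2 = {224: hashlib.sha224, 256: hashlib.sha256,
         384: hashlib.sha384, 512: hashlib.sha512}


def md5(text):
    """MD5 digest as a 32-character hex string (UTF-8 input assumed)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sha1(text):
    """SHA-1 digest as a 40-character hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def sha2(text, num_bits):
    """SHA-2 digest for a bit length of 224, 256, 384, or 512."""
    if num_bits not in _SHA2:
        raise ValueError("numBits must be 224, 256, 384, or 512")
    return _SHA2[num_bits](text.encode("utf-8")).hexdigest()


print(len(md5("hello")))     # 32
print(len(sha1("hello")))    # 40
print(sha2("hello", 256))
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

The SHA-256 value matches the output shown for `SHA2('hello',256)` in the example for this function.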
-Version: 3.1.0 -Usage: `sha2(str, numBits)` returns the hex string result of SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). -The numBits indicates the desired bit length of the result, which must have a value of 224, 256, 384, or 512. +**Parameters**: -**Argument type:** `STRING`, `INTEGER` -**Return type:** `STRING` +- `str` (Required): The string for which to calculate the SHA-2 hash. +- `numBits` (Required): The desired bit length of the result, which must be `224`, `256`, `384`, or `512`. -### Example +**Return type**: `STRING` + +#### Example: SHA-256 hash ```ppl source=people @@ -77,7 +85,7 @@ source=people | fields `SHA2('hello',256)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -87,6 +95,8 @@ fetched rows / total rows = 1/1 | 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 | +------------------------------------------------------------------+ ``` + +#### Example: SHA-512 hash ```ppl source=people @@ -94,7 +104,7 @@ source=people | fields `SHA2('hello',512)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git a/docs/user/ppl/functions/datetime.md b/docs/user/ppl/functions/datetime.md index 9ed105ea91a..3f53679b39c 100644 --- a/docs/user/ppl/functions/datetime.md +++ b/docs/user/ppl/functions/datetime.md @@ -1,22 +1,25 @@ -# Date and Time Functions +# Date and time functions - All PPL date and time functions use the UTC time zone. Both input and output values are interpreted as UTC. - For instance, an input timestamp literal like '2020-08-26 01:01:01' is assumed to be in UTC, and the now() - function also returns the current date and time in UTC. +All PPL date and time functions use the UTC time zone. Both input and output values are interpreted as UTC. 
For example, an input timestamp literal such as `'2020-08-26 01:01:01'` is assumed to be in UTC, and the `now()` function also returns the current date and time in UTC. -## ADDDATE +The following date and time functions are supported in PPL. -### Description +## ADDDATE + +**Usage**: `ADDDATE(date, INTERVAL expr unit)`, `ADDDATE(date, days)` + +Adds the interval or number of days to the date. The first form adds an interval to the date; the second form adds the specified integer number of days to the date. If the first argument is `TIME`, today's date is used. If the first argument is `DATE`, the time at midnight is used. + +**Parameters**: + +- `date` (Required): The date, timestamp, or time value to modify. +- `INTERVAL expr unit` (Required in first form): The interval to add to the date. +- `days` (Required in second form): The number of days to add as an integer. + +**Return type**: `TIMESTAMP` for the interval form. For the integer days form, `DATE` when the input is `DATE` and `TIMESTAMP` when the input is `TIMESTAMP` or `TIME`. + +Synonyms: [`DATE_ADD`](#date_add) (when used in interval form) -Usage: `adddate(date, INTERVAL expr unit)` / adddate(date, days) adds the interval of second argument to date; adddate(date, days) adds the second argument as integer number of days to date. -If first argument is TIME, today's date is used; if first argument is DATE, time at midnight is used. -**Argument type:** `DATE/TIMESTAMP/TIME, INTERVAL/LONG` -Return type map: -(DATE/TIMESTAMP/TIME, INTERVAL) -> TIMESTAMP -(DATE, LONG) -> DATE -(TIMESTAMP/TIME, LONG) -> TIMESTAMP -Synonyms: [DATE_ADD](#date_add) when invoked with the INTERVAL form of the second argument.
-Antonyms: [SUBDATE](#subdate) ### Example ```ppl @@ -25,7 +28,7 @@ source=people | fields `'2020-08-26' + 1h`, `'2020-08-26' + 1`, `ts '2020-08-26 01:01:01' + 1` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -36,25 +39,30 @@ fetched rows / total rows = 1/1 +---------------------+------------------+------------------------------+ ``` -## ADDTIME +## ADDTIME -### Description +**Usage**: `ADDTIME(expr1, expr2)` + +Adds the second expression to the first expression and returns the result. If an argument is `TIME`, today's date is used. If an argument is `DATE`, the time at midnight is used. + +**Parameters**: + +- `expr1` (Required): The base date, timestamp, or time value. +- `expr2` (Required): The date, timestamp, or time value to add to the first expression. + +**Return type**: `TIMESTAMP` when the first argument is `DATE` or `TIMESTAMP`, `TIME` when the first argument is `TIME`. + +#### Examples + +The following example shows adding two DATE values: -Usage: `addtime(expr1, expr2)` adds expr2 to expr1 and returns the result. If argument is TIME, today's date is used; if argument is DATE, time at midnight is used. 
-**Argument type:** `DATE/TIMESTAMP/TIME, DATE/TIMESTAMP/TIME` -Return type map: -(DATE/TIMESTAMP, DATE/TIMESTAMP/TIME) -> TIMESTAMP -(TIME, DATE/TIMESTAMP/TIME) -> TIME -Antonyms: [SUBTIME](#subtime) -### Example - ```ppl source=people | eval `'2008-12-12' + 0` = ADDTIME(DATE('2008-12-12'), DATE('2008-11-15')) | fields `'2008-12-12' + 0` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -64,6 +72,8 @@ fetched rows / total rows = 1/1 | 2008-12-12 00:00:00 | +---------------------+ ``` + +The following example shows adding TIME and DATE values: ```ppl source=people @@ -71,7 +81,7 @@ source=people | fields `'23:59:59' + 0` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -81,6 +91,8 @@ fetched rows / total rows = 1/1 | 23:59:59 | +----------------+ ``` + +The following example shows combining DATE and TIME into a timestamp: ```ppl source=people @@ -88,7 +100,7 @@ source=people | fields `'2004-01-01' + '23:59:59'` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -98,6 +110,8 @@ fetched rows / total rows = 1/1 | 2004-01-01 23:59:59 | +---------------------------+ ``` + +The following example shows adding two TIME values: ```ppl source=people @@ -105,7 +119,7 @@ source=people | fields `'10:20:30' + '00:05:42'` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -115,6 +129,8 @@ fetched rows / total rows = 1/1 | 10:26:12 | +-------------------------+ ``` + +The following example shows adding two TIMESTAMP values: ```ppl source=people @@ -122,7 +138,7 @@ source=people | fields `'2007-02-28 10:20:30' + '20:40:50'` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -133,23 +149,29 @@ fetched rows / total rows = 1/1 +------------------------------------+ ``` -## CONVERT_TZ +## CONVERT_TZ -### 
Description +**Usage**: `CONVERT_TZ(timestamp, from_timezone, to_timezone)` + +Constructs a local timestamp converted from the source time zone to the target time zone. Returns `NULL` when any of the three function arguments is invalid: the timestamp is not in the format `yyyy-MM-dd HH:mm:ss`, a time zone is not in `(+/-)HH:mm` format, dates are invalid (such as February 30th), or time zones are outside the valid range of -13:59 to +14:00. + +**Parameters**: + +- `timestamp` (Required): The timestamp or string to convert in `yyyy-MM-dd HH:mm:ss` format. +- `from_timezone` (Required): The source time zone in `(+/-)HH:mm` format. +- `to_timezone` (Required): The target time zone in `(+/-)HH:mm` format. + +**Return type**: `TIMESTAMP` + +#### Examples -Usage: `convert_tz(timestamp, from_timezone, to_timezone)` constructs a local timestamp converted from the from_timezone to the to_timezone. CONVERT_TZ returns null when any of the three function arguments are invalid, i.e. timestamp is not in the format yyyy-MM-dd HH:mm:ss or the timezone is not in (+/-)HH:mm. It also is invalid for invalid dates, such as February 30th and invalid timezones, which are ones outside of -13:59 and +14:00. -**Argument type:** `TIMESTAMP/STRING, STRING, STRING` -**Return type:** `TIMESTAMP` -Conversion from +00:00 timezone to +10:00 timezone. Returns the timestamp argument converted from +00:00 to +10:00 -### Example - ```ppl source=people | eval `convert_tz('2008-05-15 12:00:00','+00:00','+10:00')` = convert_tz('2008-05-15 12:00:00','+00:00','+10:00') | fields `convert_tz('2008-05-15 12:00:00','+00:00','+10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -160,8 +182,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. Timezones outside of the range, such as +15:00 in this example will return null. 
-### Example +The valid time zone range for `convert_tz` is [-13:59, +14:00] inclusive. Time zones outside of the range, such as +15:00 in this example, return `NULL`: ```ppl source=people @@ -169,7 +190,7 @@ source=people | fields `convert_tz('2008-05-15 12:00:00','+00:00','+15:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -180,8 +201,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -Conversion from a positive timezone to a negative timezone that goes over date line. -### Example +The following example shows conversion from a positive time zone to a negative time zone that crosses the date line: ```ppl source=people @@ -189,7 +209,7 @@ source=people | fields `convert_tz('2008-05-15 12:00:00','+03:30','-10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -200,8 +220,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -Valid dates are required in convert_tz, invalid dates such as April 31st (not a date in the Gregorian calendar) will result in null. -### Example +Valid dates are required in `convert_tz`. For invalid dates such as April 31st (not a date in the Gregorian calendar), `NULL` is returned: ```ppl source=people @@ -209,7 +228,7 @@ source=people | fields `convert_tz('2008-04-31 12:00:00','+03:30','-10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -220,8 +239,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -Valid dates are required in convert_tz, invalid dates such as February 30th (not a date in the Gregorian calendar) will result in null.
-### Example +The following example shows that February 30th also returns `NULL`: ```ppl source=people @@ -229,7 +247,7 @@ source=people | fields `convert_tz('2008-02-30 12:00:00','+03:30','-10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -240,8 +258,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -February 29th 2008 is a valid date because it is a leap year. -### Example +February 29th, 2008, is a valid date because 2008 is a leap year: ```ppl source=people @@ -249,7 +266,7 @@ source=people | fields `convert_tz('2008-02-29 12:00:00','+03:30','-10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -260,8 +277,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -Valid dates are required in convert_tz, invalid dates such as February 29th 2007 (2007 is not a leap year) will result in null. -### Example +The following example shows that February 29th, 2007, returns `NULL` because 2007 is not a leap year: ```ppl source=people @@ -269,7 +285,7 @@ source=people | fields `convert_tz('2007-02-29 12:00:00','+03:30','-10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -280,8 +296,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. Timezones outside of the range, such as +14:01 in this example will return null. -### Example +The valid time zone range for `convert_tz` is [-13:59, +14:00] inclusive.
Time zones outside of the range, such as +14:01 in this example, return `NULL`: ```ppl source=people @@ -289,7 +304,7 @@ source=people | fields `convert_tz('2008-02-01 12:00:00','+14:01','+00:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -300,8 +315,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. Timezones outside of the range, such as +14:00 in this example will return a correctly converted date time object. -### Example +The valid time zone range for `convert_tz` is [-13:59, +14:00] inclusive. Time zones at the upper boundary of the range, such as +14:00 in this example, are valid and return a correctly converted timestamp: ```ppl source=people @@ -309,7 +323,7 @@ source=people | fields `convert_tz('2008-02-01 12:00:00','+14:00','+00:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -320,8 +334,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. Timezones outside of the range, such as -14:00 will result in null -### Example +The following example shows that -14:00 (outside the valid range) returns `NULL`: ```ppl source=people @@ -329,7 +342,7 @@ source=people | fields `convert_tz('2008-02-01 12:00:00','-14:00','+00:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -340,8 +353,7 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. This timezone is within range so it is valid and will convert the time. -### Example +The valid time zone range for `convert_tz` is [-13:59, +14:00] inclusive.
Time zones at the lower boundary of the range, such as -13:59, are valid and return converted results: ```ppl source=people @@ -349,7 +361,7 @@ source=people | fields `convert_tz('2008-02-01 12:00:00','-13:59','+00:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -360,14 +372,16 @@ fetched rows / total rows = 1/1 +-----------------------------------------------------+ ``` -## CURDATE +## CURDATE + +**Usage**: `CURDATE()` -### Description +Returns the current date as a value in `YYYY-MM-DD` format. The function returns the current date in UTC at the time when the statement is executed. + +**Parameters**: None + +**Return type**: `DATE` -Returns the current date as a value in 'YYYY-MM-DD' format. -CURDATE() returns the current date in UTC at the time the statement is executed. -**Return type:** `DATE` -Specification: CURDATE() -> DATE ### Example ```ppl ignore @@ -376,7 +390,7 @@ source=people | fields `CURDATE()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -387,11 +401,16 @@ fetched rows / total rows = 1/1 +------------+ ``` -## CURRENT_DATE +## CURRENT_DATE + +**Usage**: `CURRENT_DATE()` -### Description +A synonym for `CURDATE()`. + +**Parameters**: None + +**Return type**: `DATE` -`CURRENT_DATE()` is a synonym for [CURDATE()](#curdate). ### Example ```ppl ignore @@ -400,7 +419,7 @@ source=people | fields `CURRENT_DATE()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -411,11 +430,16 @@ fetched rows / total rows = 1/1 +------------------+ ``` -## CURRENT_TIME +## CURRENT_TIME + +**Usage**: `CURRENT_TIME()` + +A synonym for `CURTIME()`. + +**Parameters**: None -### Description +**Return type**: `TIME` -`CURRENT_TIME()` is a synonym for [CURTIME()](#curtime). 
### Example ```ppl ignore @@ -424,7 +448,7 @@ source=people | fields `CURRENT_TIME()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -435,11 +459,16 @@ fetched rows / total rows = 1/1 +------------------+ ``` -## CURRENT_TIMESTAMP +## CURRENT_TIMESTAMP -### Description +**Usage**: `CURRENT_TIMESTAMP()` + +A synonym for `NOW()`. + +**Parameters**: None + +**Return type**: `TIMESTAMP` -`CURRENT_TIMESTAMP()` is a synonym for [NOW()](#now). ### Example ```ppl ignore @@ -448,7 +477,7 @@ source=people | fields `CURRENT_TIMESTAMP()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -459,23 +488,25 @@ fetched rows / total rows = 1/1 +-----------------------+ ``` -## CURTIME +## CURTIME -### Description +**Usage**: `CURTIME()` + +Returns the current time as a value in the `hh:mm:ss` format in the UTC time zone. `CURTIME()` returns the time at which the statement began to execute, as [`NOW()`](#now) does. + +**Parameters**: None + +**Return type**: `TIME` + +#### Example -Returns the current time as a value in 'hh:mm:ss' format in the UTC time zone. -CURTIME() returns the time at which the statement began to execute as [NOW()](#now) does. -**Return type:** `TIME` -Specification: CURTIME() -> TIME -### Example - ```ppl ignore source=people | eval `value_1` = CURTIME(), `value_2` = CURTIME() | fields `value_1`, `value_2` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -486,22 +517,28 @@ fetched rows / total rows = 1/1 +----------+----------+ ``` -## DATE +## DATE -### Description +**Usage**: `DATE(expr)` + +Constructs a date type from the input string `expr`. If the argument is a date or timestamp, it extracts the date value part from the expression. + +**Parameters**: +- `expr` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `DATE` + +#### Examples + +The following example extracts a date from a string: -Usage: `date(expr)` constructs a date type with the input string expr as a date. If the argument is of date/timestamp, it extracts the date value part from the expression. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `DATE` -### Example - ```ppl source=people | eval `DATE('2020-08-26')` = DATE('2020-08-26') | fields `DATE('2020-08-26')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -511,6 +548,8 @@ fetched rows / total rows = 1/1 | 2020-08-26 | +--------------------+ ``` + +The following example extracts the date from a timestamp: ```ppl source=people @@ -518,7 +557,7 @@ source=people | fields `DATE(TIMESTAMP('2020-08-26 13:49:00'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -528,6 +567,8 @@ fetched rows / total rows = 1/1 | 2020-08-26 | +----------------------------------------+ ``` + +The following example extracts the date from a string containing both date and time: ```ppl source=people @@ -535,24 +576,7 @@ source=people | fields `DATE('2020-08-26 13:49')` ``` -Expected output: - -```text -fetched rows / total rows = 1/1 -+--------------------------+ -| DATE('2020-08-26 13:49') | -|--------------------------| -| 2020-08-26 | -+--------------------------+ -``` - -```ppl -source=people -| eval `DATE('2020-08-26 13:49')` = DATE('2020-08-26 13:49') -| fields `DATE('2020-08-26 13:49')` -``` - -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -563,16 +587,22 @@ fetched rows / total rows = 1/1 +--------------------------+ ``` -## DATE_ADD +## DATE_ADD -### Description +**Usage**: `DATE_ADD(date, INTERVAL expr unit)` -Usage: `date_add(date, INTERVAL expr unit)` adds the interval expr to date. 
If first argument is TIME, today's date is used; if first argument is DATE, time at midnight is used. -**Argument type:** `DATE/TIMESTAMP/TIME, INTERVAL` -**Return type:** `TIMESTAMP` -Synonyms: [ADDDATE](#adddate) -Antonyms: [DATE_SUB](#date_sub) -### Example +Adds the interval `expr` to `date`. If the first argument is `TIME`, today's date is used. If the first argument is `DATE`, the time at midnight is used. + +**Parameters**: +- `date` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. +- `INTERVAL expr unit` (Required): An `INTERVAL` expression. + +**Return type**: `TIMESTAMP` + +Synonyms: [`ADDDATE`](#adddate) +Antonyms: [`DATE_SUB`](#date_sub) + +#### Example ```ppl source=people @@ -580,7 +610,7 @@ source=people | fields `'2020-08-26' + 1h`, `ts '2020-08-26 01:01:01' + 1d` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -591,56 +621,57 @@ fetched rows / total rows = 1/1 +---------------------+-------------------------------+ ``` -## DATE_FORMAT +## DATE_FORMAT -### Description +**Usage**: `DATE_FORMAT(date, format)` -Usage: `date_format(date, format)` formats the date argument using the specifiers in the format argument. -If an argument of type TIME is provided, the local date is used. -The following table describes the available specifier arguments. +Formats the `date` argument using the specifiers in the `format` argument. If an argument of type `TIME` is provided, the local date is used. +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. +- `format` (Required): A `STRING` containing format specifiers. + +**Return type**: `STRING` + +The following table describes the available format specifiers. | Specifier | Description | | --- | --- | -| %a | Abbreviated weekday name (Sun..Sat) | -| %b | Abbreviated month name (Jan..Dec) | -| %c | Month, numeric (0..12) | -| %D | Day of the month with English suffix (0th, 1st, 2nd, 3rd, ...) 
| -| %d | Day of the month, numeric (00..31) | -| %e | Day of the month, numeric (0..31) | -| %f | Microseconds (000000..999999) | -| %H | Hour (00..23) | -| %h | Hour (01..12) | -| %I | Hour (01..12) | -| %i | Minutes, numeric (00..59) | -| %j | Day of year (001..366) | -| %k | Hour (0..23) | -| %l | Hour (1..12) | -| %M | Month name (January..December) | -| %m | Month, numeric (00..12) | -| %p | AM or PM | -| %r | Time, 12-hour (hh:mm:ss followed by AM or PM) | -| %S | Seconds (00..59) | -| %s | Seconds (00..59) | -| %T | Time, 24-hour (hh:mm:ss) | -| %U | Week (00..53), where Sunday is the first day of the week; WEEK() mode 0 | -| %u | Week (00..53), where Monday is the first day of the week; WEEK() mode 1 | -| %V | Week (01..53), where Sunday is the first day of the week; WEEK() mode 2; used with %X | -| %v | Week (01..53), where Monday is the first day of the week; WEEK() mode 3; used with %x | -| %W | Weekday name (Sunday..Saturday) | -| %w | Day of the week (0=Sunday..6=Saturday) | -| %X | Year for the week where Sunday is the first day of the week, numeric, four digits; used with %V | -| %x | Year for the week, where Monday is the first day of the week, numeric, four digits; used with %v | -| %Y | Year, numeric, four digits | -| %y | Year, numeric (two digits) | -| %% | A literal % character | -| %x | x, for any “x” not listed above | -| x | x, for any smallcase/uppercase alphabet except [aydmshiHIMYDSEL] | - - -**Argument type:** `STRING/DATE/TIME/TIMESTAMP, STRING` -**Return type:** `STRING` -### Example +| `%a` | Abbreviated weekday name (Sun..Sat) | +| `%b` | Abbreviated month name (Jan..Dec) | +| `%c` | Month, numeric (0..12) | +| `%D` | Day of the month with English suffix (0th, 1st, 2nd, 3rd, ...) 
| +| `%d` | Day of the month, numeric (00..31) | +| `%e` | Day of the month, numeric (0..31) | +| `%f` | Microseconds (000000..999999) | +| `%H` | Hour (00..23) | +| `%h` | Hour (01..12) | +| `%I` | Hour (01..12) | +| `%i` | Minutes, numeric (00..59) | +| `%j` | Day of year (001..366) | +| `%k` | Hour (0..23) | +| `%l` | Hour (1..12) | +| `%M` | Month name (January..December) | +| `%m` | Month, numeric (00..12) | +| `%p` | AM or PM | +| `%r` | Time, 12-hour (hh:mm:ss followed by AM or PM) | +| `%S` | Seconds (00..59) | +| `%s` | Seconds (00..59) | +| `%T` | Time, 24-hour (hh:mm:ss) | +| `%U` | Week (00..53), where Sunday is the first day of the week; WEEK() mode 0 | +| `%u` | Week (00..53), where Monday is the first day of the week; WEEK() mode 1 | +| `%V` | Week (01..53), where Sunday is the first day of the week; WEEK() mode 2; used with `%X` | +| `%v` | Week (01..53), where Monday is the first day of the week; WEEK() mode 3; used with `%x` | +| `%W` | Weekday name (Sunday..Saturday) | +| `%w` | Day of the week (0=Sunday..6=Saturday) | +| `%X` | Year for the week where Sunday is the first day of the week, numeric, four digits; used with `%V` | +| `%x` | Year for the week, where Monday is the first day of the week, numeric, four digits; used with `%v` | +| `%Y` | Year, numeric, four digits | +| `%y` | Year, numeric (two digits) | +| `%%` | A literal % character | +| `x` | x, for any lowercase/uppercase alphabet except [aydmshiHIMYDSEL] | + +#### Example ```ppl source=people @@ -648,7 +679,7 @@ source=people | fields `DATE_FORMAT('1998-01-31 13:14:15.012345', '%T.%f')`, `DATE_FORMAT(TIMESTAMP('1998-01-31 13:14:15.012345'), '%Y-%b-%D %r')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -659,25 +690,29 @@ fetched rows / total rows = 1/1 +----------------------------------------------------+---------------------------------------------------------------------+ ``` -## DATETIME +## DATETIME -### Description 
+**Usage**: `DATETIME(timestamp)` or `DATETIME(timestamp, to_timezone)` + +Converts the datetime to a new time zone. + +**Parameters**: +- `timestamp` (Required): A `TIMESTAMP` or `STRING` value. +- `to_timezone` (Optional): A `STRING` time zone value. + +**Return type**: `TIMESTAMP` + +#### Examples + +The following example converts a datetime to a different time zone: -Usage: `DATETIME(timestamp)`/ DATETIME(date, to_timezone) Converts the datetime to a new timezone -**Argument type:** `timestamp/STRING` -Return type map: -(TIMESTAMP, STRING) -> TIMESTAMP -(TIMESTAMP) -> TIMESTAMP -Converting timestamp with timezone to the second argument timezone. -### Example - ```ppl source=people | eval `DATETIME('2004-02-28 23:00:00-10:00', '+10:00')` = DATETIME('2004-02-28 23:00:00-10:00', '+10:00') | fields `DATETIME('2004-02-28 23:00:00-10:00', '+10:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -688,8 +723,7 @@ fetched rows / total rows = 1/1 +-------------------------------------------------+ ``` -The valid timezone range for convert_tz is (-13:59, +14:00) inclusive. Timezones outside of the range will result in null. -### Example +The valid time zone range is [-13:59, +14:00] inclusive. The following example shows that time zones outside of this range return `NULL`: ```ppl source=people @@ -697,7 +731,7 @@ source=people | fields `DATETIME('2008-01-01 02:00:00', '-14:00')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -708,16 +742,22 @@ fetched rows / total rows = 1/1 +-------------------------------------------+ ``` -## DATE_SUB +## DATE_SUB -### Description +**Usage**: `DATE_SUB(date, INTERVAL expr unit)` -Usage: `date_sub(date, INTERVAL expr unit)` subtracts the interval expr from date. If first argument is TIME, today's date is used; if first argument is DATE, time at midnight is used.
-**Argument type:** `DATE/TIMESTAMP/TIME, INTERVAL` -**Return type:** `TIMESTAMP` -Synonyms: [SUBDATE](#subdate) -Antonyms: [DATE_ADD](#date_add) -### Example +Subtracts the interval `expr` from `date`. If the first argument is `TIME`, today's date is used. If the first argument is `DATE`, the time at midnight is used. + +**Parameters**: +- `date` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. +- `INTERVAL expr unit` (Required): An `INTERVAL` expression. + +**Return type**: `TIMESTAMP` + +Synonyms: [`SUBDATE`](#subdate) +Antonyms: [`DATE_ADD`](#date_add) + +#### Example ```ppl source=people @@ -725,7 +765,7 @@ source=people | fields `'2008-01-02' - 31d`, `ts '2020-08-26 01:01:01' + 1h` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -736,12 +776,18 @@ fetched rows / total rows = 1/1 +---------------------+-------------------------------+ ``` -## DATEDIFF +## DATEDIFF -Usage: Calculates the difference of date parts of given values. If the first argument is time, today's date is used. -**Argument type:** `DATE/TIMESTAMP/TIME, DATE/TIMESTAMP/TIME` -**Return type:** `LONG` -### Example +**Usage**: `DATEDIFF(date1, date2)` +Calculates the difference of the date parts of given values. If the first argument is `TIME`, today's date is used. + +**Parameters**: +- `date1` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. +- `date2` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. 
+ +**Return type**: `LONG` + +#### Example ```ppl source=people @@ -749,7 +795,7 @@ source=people | fields `'2000-01-02' - '2000-01-01'`, `'2001-02-01' - '2004-01-01'`, `today - today` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -760,15 +806,20 @@ fetched rows / total rows = 1/1 +-----------------------------+-----------------------------+---------------+ ``` -## DAY +## DAY -### Description +**Usage**: `DAY(date)` -Usage: `day(date)` extracts the day of the month for date, in the range 1 to 31. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAYOFMONTH](#dayofmonth), [DAY_OF_MONTH](#day_of_month) -### Example +Extracts the day of the month for `date`, in the range 1 to 31. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`DAYOFMONTH`](#dayofmonth), [`DAY_OF_MONTH`](#day_of_month) + +#### Example ```ppl source=people @@ -776,7 +827,7 @@ source=people | fields `DAY(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -787,14 +838,18 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## DAYNAME +## DAYNAME -### Description +**Usage**: `DAYNAME(date)` -Usage: `dayname(date)` returns the name of the weekday for date, including Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `STRING` -### Example +Returns the name of the weekday for `date`. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `STRING` + +#### Example ```ppl source=people @@ -802,7 +857,7 @@ source=people | fields `DAYNAME(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -813,15 +868,20 @@ fetched rows / total rows = 1/1 +-----------------------------+ ``` -## DAYOFMONTH +## DAYOFMONTH -### Description +**Usage**: `DAYOFMONTH(date)` -Usage: `dayofmonth(date)` extracts the day of the month for date, in the range 1 to 31. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAY](#day), [DAY_OF_MONTH](#day_of_month) -### Example +Extracts the day of the month for `date`, in the range 1 to 31. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`DAY`](#day), [`DAY_OF_MONTH`](#day_of_month) + +#### Example ```ppl source=people @@ -829,7 +889,7 @@ source=people | fields `DAYOFMONTH(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -840,15 +900,20 @@ fetched rows / total rows = 1/1 +--------------------------------+ ``` -## DAY_OF_MONTH +## DAY_OF_MONTH -### Description +**Usage**: `DAY_OF_MONTH(date)` -Usage: `day_of_month(date)` extracts the day of the month for date, in the range 1 to 31. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAY](#day), [DAYOFMONTH](#dayofmonth) -### Example +Extracts the day of the month for `date`, in the range 1 to 31. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`DAY`](#day), [`DAYOFMONTH`](#dayofmonth) + +#### Example ```ppl source=people @@ -856,7 +921,7 @@ source=people | fields `DAY_OF_MONTH(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -867,15 +932,20 @@ fetched rows / total rows = 1/1 +----------------------------------+ ``` -## DAYOFWEEK +## DAYOFWEEK -### Description +**Usage**: `DAYOFWEEK(date)` -Usage: `dayofweek(date)` returns the weekday index for date (1 = Sunday, 2 = Monday, ..., 7 = Saturday). -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAY_OF_WEEK](#day_of_week) -### Example +Returns the weekday index for `date` (1 = Sunday, 2 = Monday, ..., 7 = Saturday). + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`DAY_OF_WEEK`](#day_of_week) + +#### Example ```ppl source=people @@ -883,7 +953,7 @@ source=people | fields `DAYOFWEEK(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -894,15 +964,20 @@ fetched rows / total rows = 1/1 +-------------------------------+ ``` -## DAY_OF_WEEK +## DAY_OF_WEEK -### Description +**Usage**: `DAY_OF_WEEK(date)` -Usage: `day_of_week(date)` returns the weekday index for date (1 = Sunday, 2 = Monday, ..., 7 = Saturday). -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAYOFWEEK](#dayofweek) -### Example +Returns the weekday index for `date` (1 = Sunday, 2 = Monday, ..., 7 = Saturday). + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`DAYOFWEEK`](#dayofweek) + +#### Example ```ppl source=people @@ -910,7 +985,7 @@ source=people | fields `DAY_OF_WEEK(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -921,15 +996,20 @@ fetched rows / total rows = 1/1 +---------------------------------+ ``` -## DAYOFYEAR +## DAYOFYEAR -### Description +**Usage**: `DAYOFYEAR(date)` -Usage: dayofyear(date) returns the day of the year for date, in the range 1 to 366. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAY_OF_YEAR](#day_of_year) -### Example +Returns the day of the year for `date`, in the range 1 to 366. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`DAY_OF_YEAR`](#day_of_year) + +#### Example ```ppl source=people @@ -937,7 +1017,7 @@ source=people | fields `DAYOFYEAR(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -948,15 +1028,20 @@ fetched rows / total rows = 1/1 +-------------------------------+ ``` -## DAY_OF_YEAR +## DAY_OF_YEAR -### Description +**Usage**: `DAY_OF_YEAR(date)` -Usage: day_of_year(date) returns the day of the year for date, in the range 1 to 366. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [DAYOFYEAR](#dayofyear) -### Example +Returns the day of the year for `date`, in the range 1 to 366. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`DAYOFYEAR`](#dayofyear) + +#### Example ```ppl source=people @@ -964,7 +1049,7 @@ source=people | fields `DAY_OF_YEAR(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -975,42 +1060,44 @@ fetched rows / total rows = 1/1 +---------------------------------+ ``` -## EXTRACT +## EXTRACT + +**Usage**: `EXTRACT(part FROM date)` -### Description +Returns a `LONG` containing digits in order according to the given `part` argument. The specific format of the returned `LONG` is determined by the following table. -Usage: `extract(part FROM date)` returns a LONG with digits in order according to the given 'part' arguments. -The specific format of the returned long is determined by the table below. -**Argument type:** `PART, where PART is one of the following tokens in the table below.` -The format specifiers found in this table are the same as those found in the [DATE_FORMAT](#date_format) function. -The following table describes the mapping of a 'part' to a particular format. +**Parameters**: +- `part` (Required): A part token (see following table). +- `date` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. +**Return type**: `LONG` -| Part | Format | +The format specifiers found in this table are the same as those found in the [`DATE_FORMAT`](#date_format) function. The following table describes the mapping of a `part` to a particular format. 
+ + +| `part` | Format | | --- | --- | -| MICROSECOND | %f | -| SECOND | %s | -| MINUTE | %i | -| HOUR | %H | -| DAY | %d | -| WEEK | %X | -| MONTH | %m | -| YEAR | %V | -| SECOND_MICROSECOND | %s%f | -| MINUTE_MICROSECOND | %i%s%f | -| MINUTE_SECOND | %i%s | -| HOUR_MICROSECOND | %H%i%s%f | -| HOUR_SECOND | %H%i%s | -| HOUR_MINUTE | %H%i | -| DAY_MICROSECOND | %d%H%i%s%f | -| DAY_SECOND | %d%H%i%s | -| DAY_MINUTE | %d%H%i | -| DAY_HOUR | %d%H% | -| YEAR_MONTH | %V%m | - - -**Return type:** `LONG` -### Example +| `MICROSECOND` | `%f` | +| `SECOND` | `%s` | +| `MINUTE` | `%i` | +| `HOUR` | `%H` | +| `DAY` | `%d` | +| `WEEK` | `%X` | +| `MONTH` | `%m` | +| `YEAR` | `%V` | +| `SECOND_MICROSECOND` | `%s%f` | +| `MINUTE_MICROSECOND` | `%i%s%f` | +| `MINUTE_SECOND` | `%i%s` | +| `HOUR_MICROSECOND` | `%H%i%s%f` | +| `HOUR_SECOND` | `%H%i%s` | +| `HOUR_MINUTE` | `%H%i` | +| `DAY_MICROSECOND` | `%d%H%i%s%f` | +| `DAY_SECOND` | `%d%H%i%s` | +| `DAY_MINUTE` | `%d%H%i` | +| `DAY_HOUR` | `%d%H%` | +| `YEAR_MONTH` | `%V%m` | + +#### Example ```ppl source=people @@ -1018,7 +1105,7 @@ source=people | fields `extract(YEAR_MONTH FROM "2023-02-07 10:11:12")` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1029,14 +1116,18 @@ fetched rows / total rows = 1/1 +------------------------------------------------+ ``` -## FROM_DAYS +## FROM_DAYS -### Description +**Usage**: `FROM_DAYS(N)` -Usage: `from_days(N)` returns the date value given the day number N. -**Argument type:** `INTEGER/LONG` -**Return type:** `DATE` -### Example +Returns the date value given the day number `N`. + +**Parameters**: +- `N` (Required): An `INTEGER` or `LONG` value. 
+ +**Return type**: `DATE` + +#### Example ```ppl source=people @@ -1044,7 +1135,7 @@ source=people | fields `FROM_DAYS(733687)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1055,18 +1146,19 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## FROM_UNIXTIME +## FROM_UNIXTIME + +**Usage**: `FROM_UNIXTIME(timestamp)` or `FROM_UNIXTIME(timestamp, format)` + +Returns a representation of the argument as a timestamp or character string value. Performs the reverse conversion for the [`UNIX_TIMESTAMP`](#unix_timestamp) function. If the second argument is provided, it is used to format the result in the same way as the format string used for the [`DATE_FORMAT`](#date_format) function. If the timestamp is outside the range 1970-01-01 00:00:00 - 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), the function returns `NULL`. -### Description +**Parameters**: +- `timestamp` (Required): A `DOUBLE` value representing a Unix timestamp. +- `format` (Optional): A `STRING` format specifier. -Usage: Returns a representation of the argument given as a timestamp or character string value. Perform reverse conversion for [UNIX_TIMESTAMP](#unix_timestamp) function. -If second argument is provided, it is used to format the result in the same way as the format string used for the [DATE_FORMAT](#date_format) function. -If timestamp is outside of range 1970-01-01 00:00:00 - 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), function returns NULL. 
-**Argument type:** `DOUBLE, STRING` -Return type map: -DOUBLE -> TIMESTAMP -DOUBLE, STRING -> STRING -Examples +**Return type**: `TIMESTAMP` (without format), `STRING` (with format) + +**Examples** ```ppl source=people @@ -1074,7 +1166,7 @@ source=people | fields `FROM_UNIXTIME(1220249547)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1091,7 +1183,7 @@ source=people | fields `FROM_UNIXTIME(1220249547, '%T')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1102,14 +1194,19 @@ fetched rows / total rows = 1/1 +---------------------------------+ ``` -## GET_FORMAT +## GET_FORMAT + +**Usage**: `GET_FORMAT(type, format)` -### Description +Returns a string value containing string format specifiers based on the input arguments. -Usage: Returns a string value containing string format specifiers based on the input arguments. -**Argument type:** `TYPE, STRING, where TYPE must be one of the following tokens: [DATE, TIME, TIMESTAMP], and` -STRING must be one of the following tokens: ["USA", "JIS", "ISO", "EUR", "INTERNAL"] (" can be replaced by '). -Examples +**Parameters**: +- `type` (Required): One of the following tokens: `DATE`, `TIME`, `TIMESTAMP`. +- `format` (Required): A `STRING` that must be one of: `USA`, `JIS`, `ISO`, `EUR`, `INTERNAL`. + +**Return type**: `STRING` + +**Examples** ```ppl source=people @@ -1117,7 +1214,7 @@ source=people | fields `GET_FORMAT(DATE, 'USA')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1128,15 +1225,20 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## HOUR +## HOUR -### Description +**Usage**: `HOUR(time)` -Usage: `hour(time)` extracts the hour value for time. Different from the time of day value, the time value has a large range and can be greater than 23, so the return value of hour(time) can be also greater than 23. 
-**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [HOUR_OF_DAY](#hour_of_day) -### Example +Extracts the hour value for `time`. Unlike a time-of-day value, a time value can span a larger range, so the return value of `HOUR(time)` can be greater than 23. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`HOUR_OF_DAY`](#hour_of_day) + +#### Example ```ppl source=people @@ -1144,7 +1246,7 @@ source=people | fields `HOUR(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1155,15 +1257,20 @@ fetched rows / total rows = 1/1 +------------------------+ ``` -## HOUR_OF_DAY +## HOUR_OF_DAY -### Description +**Usage**: `HOUR_OF_DAY(time)` -Usage: `hour_of_day(time)` extracts the hour value for time. Different from the time of day value, the time value has a large range and can be greater than 23, so the return value of hour_of_day(time) can be also greater than 23. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [HOUR](#hour) -### Example +Extracts the hour value for `time`. Unlike a time-of-day value, a time value can span a larger range, so the return value of `HOUR_OF_DAY(time)` can be greater than 23. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`HOUR`](#hour) + +#### Example ```ppl source=people @@ -1171,7 +1278,7 @@ source=people | fields `HOUR_OF_DAY(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1184,10 +1291,16 @@ fetched rows / total rows = 1/1 ## LAST_DAY -Usage: Returns the last day of the month as a DATE for a valid argument. 
-**Argument type:** `DATE/STRING/TIMESTAMP/TIME` -**Return type:** `DATE` -### Example +**Usage**: `LAST_DAY(date)` + +Returns the last day of the month as a `DATE` for a valid argument. + +**Parameters**: +- `date` (Required): A `DATE`, `STRING`, `TIMESTAMP`, or `TIME` value. + +**Return type**: `DATE` + +#### Example ```ppl source=people @@ -1195,7 +1308,7 @@ source=people | fields `last_day('2023-02-06')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1206,12 +1319,17 @@ fetched rows / total rows = 1/1 +------------------------+ ``` -## LOCALTIMESTAMP +## LOCALTIMESTAMP -### Description +**Usage**: `LOCALTIMESTAMP()` -`LOCALTIMESTAMP()` are synonyms for [NOW()](#now). -### Example +`LOCALTIMESTAMP()` is a synonym for [`NOW()`](#now). + +**Parameters**: None + +**Return type**: `TIMESTAMP` + +#### Example ```ppl ignore source=people @@ -1219,7 +1337,7 @@ source=people | fields `LOCALTIMESTAMP()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1230,12 +1348,17 @@ fetched rows / total rows = 1/1 +---------------------+ ``` -## LOCALTIME +## LOCALTIME -### Description +**Usage**: `LOCALTIME()` -`LOCALTIME()` are synonyms for [NOW()](#now). -### Example +`LOCALTIME()` is a synonym for [`NOW()`](#now). + +**Parameters**: None + +**Return type**: `TIMESTAMP` + +#### Example ```ppl ignore source=people @@ -1243,7 +1366,7 @@ source=people | fields `LOCALTIME()` ``` -Expected output: +The query returns the following results: ```text ignore fetched rows / total rows = 1/1 @@ -1254,24 +1377,25 @@ fetched rows / total rows = 1/1 +---------------------+ ``` -## MAKEDATE +## MAKEDATE + +**Usage**: `MAKEDATE(year, dayofyear)` + +Returns a date, given `year` and `day-of-year` values. `dayofyear` must be greater than 0, otherwise the result is `NULL`. The result is also `NULL` if either argument is `NULL`. Arguments are rounded to an integer. 
-### Description +**Parameters**: +- `year` (Required): A `DOUBLE` value for the year. +- `dayofyear` (Required): A `DOUBLE` value for the day of year. + +**Return type**: `DATE` -Returns a date, given `year` and `day-of-year` values. `dayofyear` must be greater than 0 or the result is `NULL`. The result is also `NULL` if either argument is `NULL`. -Arguments are rounded to an integer. Limitations: -- Zero `year` interpreted as 2000; -- Negative `year` is not accepted; -- `day-of-year` should be greater than zero; -- `day-of-year` could be greater than 365/366, calculation switches to the next year(s) (see example). - -Specifications: -1. MAKEDATE(DOUBLE, DOUBLE) -> DATE - -**Argument type:** `DOUBLE` -**Return type:** `DATE` -### Example +- A zero `year` is interpreted as 2000 +- A negative `year` is not accepted +- `day-of-year` should be greater than zero +- `day-of-year` can be greater than 365/366, and the calculation switches to the next year(s) (see example) + +#### Example ```ppl source=people @@ -1279,7 +1403,7 @@ source=people | fields `MAKEDATE(1945, 5.9)`, `MAKEDATE(1984, 1984)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1290,22 +1414,24 @@ fetched rows / total rows = 1/1 +---------------------+----------------------+ ``` -## MAKETIME +## MAKETIME -### Description +**Usage**: `MAKETIME(hour, minute, second)` + +Returns a time value calculated from the hour, minute, and second arguments. Returns `NULL` if any of its arguments are `NULL`. The second argument can have a fractional part, and the rest of the arguments are rounded to an integer. + +**Parameters**: +- `hour` (Required): A `DOUBLE` value for the hour. +- `minute` (Required): A `DOUBLE` value for the minute. +- `second` (Required): A `DOUBLE` value for the second. + +**Return type**: `TIME` -Returns a time value calculated from the hour, minute, and second arguments. Returns `NULL` if any of its arguments are `NULL`. 
-The second argument can have a fractional part, rest arguments are rounded to an integer. Limitations: -- 24-hour clock is used, available time range is [00:00:00.0 - 23:59:59.(9)]; -- Up to 9 digits of second fraction part is taken (nanosecond precision). - -Specifications: -1. MAKETIME(DOUBLE, DOUBLE, DOUBLE) -> TIME - -**Argument type:** `DOUBLE` -**Return type:** `TIME` -### Example +- A 24-hour clock is used, and the available time range is [00:00:00.0 - 23:59:59.(9)] +- Up to 9 digits of the second fraction part are taken (nanosecond precision) + +#### Example ```ppl source=people @@ -1313,7 +1439,7 @@ source=people | fields `MAKETIME(20, 30, 40)`, `MAKETIME(20.2, 49.5, 42.100502)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1324,14 +1450,18 @@ fetched rows / total rows = 1/1 +----------------------+---------------------------------+ ``` -## MICROSECOND +## MICROSECOND -### Description +**Usage**: `MICROSECOND(expr)` -Usage: `microsecond(expr)` returns the microseconds from the time or timestamp expression expr as a number in the range from 0 to 999999. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the microseconds from the time or timestamp expression `expr` as a number in the range from 0 to 999999. + +**Parameters**: +- `expr` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +#### Example ```ppl source=people @@ -1339,7 +1469,7 @@ source=people | fields `MICROSECOND(TIME('01:02:03.123456'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1350,15 +1480,20 @@ fetched rows / total rows = 1/1 +--------------------------------------+ ``` -## MINUTE +## MINUTE -### Description +**Usage**: `MINUTE(time)` -Usage: `minute(time)` returns the minute for time, in the range 0 to 59. 
-**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [MINUTE_OF_HOUR](#minute_of_hour) -### Example +Returns the minute for `time`, in the range 0 to 59. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`MINUTE_OF_HOUR`](#minute_of_hour) + +#### Example ```ppl source=people @@ -1366,7 +1501,7 @@ source=people | fields `MINUTE(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1377,14 +1512,18 @@ fetched rows / total rows = 1/1 +--------------------------+ ``` -## MINUTE_OF_DAY +## MINUTE_OF_DAY -### Description +**Usage**: `MINUTE_OF_DAY(time)` -Usage: `minute(time)` returns the amount of minutes in the day, in the range of 0 to 1439. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the minute of the day for `time`, in the range 0 to 1439. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +#### Example ```ppl source=people @@ -1392,7 +1531,7 @@ source=people | fields `MINUTE_OF_DAY(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1403,15 +1542,20 @@ fetched rows / total rows = 1/1 +---------------------------------+ ``` -## MINUTE_OF_HOUR +## MINUTE_OF_HOUR -### Description +**Usage**: `MINUTE_OF_HOUR(time)` -Usage: `minute(time)` returns the minute for time, in the range 0 to 59. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [MINUTE](#minute) -### Example +Returns the minute for `time`, in the range 0 to 59. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`MINUTE`](#minute) + +#### Example ```ppl source=people @@ -1419,7 +1563,7 @@ source=people | fields `MINUTE_OF_HOUR(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1430,15 +1574,20 @@ fetched rows / total rows = 1/1 +----------------------------------+ ``` -## MONTH +## MONTH -### Description +**Usage**: `MONTH(date)` -Usage: `month(date)` returns the month for date, in the range 1 to 12 for January to December. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [MONTH_OF_YEAR](#month_of_year) -### Example +Returns the month for `date`, in the range 1 to 12 for January to December. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`MONTH_OF_YEAR`](#month_of_year) + +#### Example ```ppl source=people @@ -1446,7 +1595,7 @@ source=people | fields `MONTH(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1457,15 +1606,20 @@ fetched rows / total rows = 1/1 +---------------------------+ ``` -## MONTH_OF_YEAR +## MONTH_OF_YEAR -### Description +**Usage**: `MONTH_OF_YEAR(date)` -Usage: `month_of_year(date)` returns the month for date, in the range 1 to 12 for January to December. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [MONTH](#month) -### Example +Returns the month for `date`, in the range 1 to 12 for January to December. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`MONTH`](#month) + +#### Example ```ppl source=people @@ -1473,7 +1627,7 @@ source=people | fields `MONTH_OF_YEAR(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1484,14 +1638,18 @@ fetched rows / total rows = 1/1 +-----------------------------------+ ``` -## MONTHNAME +## MONTHNAME -### Description +**Usage**: `MONTHNAME(date)` -Usage: `monthname(date)` returns the full name of the month for date. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `STRING` -### Example +Returns the full name of the month for `date`. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `STRING` + +#### Example ```ppl source=people @@ -1499,7 +1657,7 @@ source=people | fields `MONTHNAME(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1510,15 +1668,17 @@ fetched rows / total rows = 1/1 +-------------------------------+ ``` -## NOW +## NOW -### Description +**Usage**: `NOW()` -Returns the current date and time as a value in 'YYYY-MM-DD hh:mm:ss' format. The value is expressed in the UTC time zone. -`NOW()` returns a constant time that indicates the time at which the statement began to execute. This differs from the behavior for [SYSDATE()](#sysdate), which returns the exact time at which it executes. -**Return type:** `TIMESTAMP` -Specification: NOW() -> TIMESTAMP -### Example +Returns the current date and time as a value in 'YYYY-MM-DD hh:mm:ss' format. The value is expressed in the UTC time zone. `NOW()` returns a constant time that indicates the time at which the statement began to execute. This differs from the behavior for [`SYSDATE()`](#sysdate), which returns the exact time at which it executes. 
+ +**Parameters**: None + +**Return type**: `TIMESTAMP` + +#### Example ```ppl ignore source=people @@ -1526,7 +1686,7 @@ source=people | fields `value_1`, `value_2` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1537,14 +1697,19 @@ fetched rows / total rows = 1/1 +---------------------+---------------------+ ``` -## PERIOD_ADD +## PERIOD_ADD -### Description +**Usage**: `PERIOD_ADD(P, N)` -Usage: `period_add(P, N)` add N months to period P (in the format YYMM or YYYYMM). Returns a value in the format YYYYMM. -**Argument type:** `INTEGER, INTEGER` -**Return type:** `INTEGER` -### Example +Adds `N` months to period `P` (in the format YYMM or YYYYMM). Returns a value in the format YYYYMM. + +**Parameters**: +- `P` (Required): An `INTEGER` value representing a period in YYMM or YYYYMM format. +- `N` (Required): An `INTEGER` number of months to add. + +**Return type**: `INTEGER` + +#### Example ```ppl source=people @@ -1552,7 +1717,7 @@ source=people | fields `PERIOD_ADD(200801, 2)`, `PERIOD_ADD(200801, -12)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1563,14 +1728,19 @@ fetched rows / total rows = 1/1 +-----------------------+-------------------------+ ``` -## PERIOD_DIFF +## PERIOD_DIFF -### Description +**Usage**: `PERIOD_DIFF(P1, P2)` -Usage: `period_diff(P1, P2)` returns the number of months between periods P1 and P2 given in the format YYMM or YYYYMM. -**Argument type:** `INTEGER, INTEGER` -**Return type:** `INTEGER` -### Example +Returns the number of months between periods `P1` and `P2` given in the format YYMM or YYYYMM. + +**Parameters**: +- `P1` (Required): An `INTEGER` value representing a period in YYMM or YYYYMM format. +- `P2` (Required): An `INTEGER` value representing a period in YYMM or YYYYMM format. 
+ +**Return type**: `INTEGER` + +#### Example ```ppl source=people @@ -1578,7 +1748,7 @@ source=people | fields `PERIOD_DIFF(200802, 200703)`, `PERIOD_DIFF(200802, 201003)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1589,14 +1759,18 @@ fetched rows / total rows = 1/1 +-----------------------------+-----------------------------+ ``` -## QUARTER +## QUARTER -### Description +**Usage**: `QUARTER(date)` -Usage: `quarter(date)` returns the quarter of the year for date, in the range 1 to 4. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the quarter of the year for `date`, in the range 1 to 4. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +#### Example ```ppl source=people @@ -1604,7 +1778,7 @@ source=people | fields `QUARTER(DATE('2020-08-26'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1615,17 +1789,18 @@ fetched rows / total rows = 1/1 +-----------------------------+ ``` -## SEC_TO_TIME +## SEC_TO_TIME -### Description +**Usage**: `SEC_TO_TIME(number)` -Usage: `sec_to_time(number)` returns the time in HH:mm:ssss[.nnnnnn] format. -Note that the function returns a time between 00:00:00 and 23:59:59. -If an input value is too large (greater than 86399), the function will wrap around and begin returning outputs starting from 00:00:00. -If an input value is too small (less than 0), the function will wrap around and begin returning outputs counting down from 23:59:59. -**Argument type:** `INTEGER, LONG, DOUBLE, FLOAT` -**Return type:** `TIME` -### Example +Returns the time in HH:mm:ss[.nnnnnn] format. Note that the function returns a time between 00:00:00 and 23:59:59. If the input value is too large (greater than 86399), the function will wrap around and begin returning outputs starting from 00:00:00. 
If the input value is too small (less than 0), the function will wrap around and begin returning outputs counting down from 23:59:59. + +**Parameters**: +- `number` (Required): An `INTEGER`, `LONG`, `DOUBLE`, or `FLOAT` value. + +**Return type**: `TIME` + +#### Example ```ppl source=people @@ -1634,7 +1809,7 @@ source=people | fields `SEC_TO_TIME(3601)`, `SEC_TO_TIME(1234.123)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1645,15 +1820,20 @@ fetched rows / total rows = 1/1 +-------------------+-----------------------+ ``` -## SECOND +## SECOND -### Description +**Usage**: `SECOND(time)` -Usage: `second(time)` returns the second for time, in the range 0 to 59. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [SECOND_OF_MINUTE](#second_of_minute) -### Example +Returns the second for `time`, in the range 0 to 59. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +Synonyms: [`SECOND_OF_MINUTE`](#second_of_minute) + +#### Example ```ppl source=people @@ -1661,7 +1841,7 @@ source=people | fields `SECOND(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1672,15 +1852,20 @@ fetched rows / total rows = 1/1 +--------------------------+ ``` -## SECOND_OF_MINUTE +## SECOND_OF_MINUTE -### Description +**Usage**: `SECOND_OF_MINUTE(time)` -Usage: `second_of_minute(time)` returns the second for time, in the range 0 to 59. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `INTEGER` -Synonyms: [SECOND](#second) -### Example +Returns the second for `time`, in the range 0 to 59. + +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. 
+ +**Return type**: `INTEGER` + +Synonyms: [`SECOND`](#second) + +#### Example ```ppl source=people @@ -1688,7 +1873,7 @@ source=people | fields `SECOND_OF_MINUTE(TIME('01:02:03'))` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1699,68 +1884,71 @@ fetched rows / total rows = 1/1 +------------------------------------+ ``` -## STRFTIME +## STRFTIME + +**Usage**: `STRFTIME(time, format)` + +Takes a UNIX timestamp (in seconds) and renders it as a string using the format specified. For numeric inputs, the UNIX time must be in seconds. Values greater than 100000000000 are automatically treated as milliseconds and converted to seconds. You can use time format variables with the strftime function. This function performs the reverse operation of [`UNIX_TIMESTAMP`](#unix_timestamp) and is similar to [`FROM_UNIXTIME`](#from_unixtime) but with POSIX-style format specifiers. -**Version: 3.3.0** -### Description +**Parameters**: +- `time` (Required): An `INTEGER`, `LONG`, `DOUBLE`, or `TIMESTAMP` value. +- `format` (Required): A `STRING` format specifier. -Usage: `strftime(time, format)` takes a UNIX timestamp (in seconds) and renders it as a string using the format specified. For numeric inputs, the UNIX time must be in seconds. Values greater than 100000000000 are automatically treated as milliseconds and converted to seconds. -You can use time format variables with the strftime function. This function performs the reverse operation of [UNIX_TIMESTAMP](#unix_timestamp) and is similar to [FROM_UNIXTIME](#from_unixtime) but with POSIX-style format specifiers. 
- - **Available only when Calcite engine is enabled** +**Return type**: `STRING` + +**Notes**: +- Available only when Calcite engine is enabled - All timestamps are interpreted as UTC timezone - Text formatting uses language-neutral Locale.ROOT (weekday and month names appear in abbreviated form) - String inputs are NOT supported - use `unix_timestamp()` to convert strings first - Functions that return date/time values (like `date()`, `now()`, `timestamp()`) are supported -**Argument type:** `INTEGER/LONG/DOUBLE/TIMESTAMP, STRING` -**Return type:** `STRING` -Format specifiers: -The following table describes the available specifier arguments. +The following table describes the available specifier arguments: | Specifier | Description | | --- | --- | -| %a | Abbreviated weekday name (Mon..Sun) | -| %A | Weekday name (Mon..Sun) - Note: Locale.ROOT uses abbreviated form | -| %b | Abbreviated month name (Jan..Dec) | -| %B | Month name (Jan..Dec) - Note: Locale.ROOT uses abbreviated form | -| %c | Date and time (e.g., Mon Jul 18 09:30:00 2019) | -| %C | Century as 2-digit decimal number | -| %d | Day of the month, zero-padded (01..31) | -| %e | Day of the month, space-padded ( 1..31) | -| %Ez | Timezone offset in minutes from UTC (e.g., +0 for UTC, +330 for IST, -300 for EST) | -| %f | Microseconds as decimal number (000000..999999) | -| %F | ISO 8601 date format (%Y-%m-%d) | -| %g | ISO 8601 year without century (00..99) | -| %G | ISO 8601 year with century | -| %H | Hour (24-hour clock) (00..23) | -| %I | Hour (12-hour clock) (01..12) | -| %j | Day of year (001..366) | -| %k | Hour (24-hour clock), space-padded ( 0..23) | -| %m | Month as decimal number (01..12) | -| %M | Minute (00..59) | -| %N | Subsecond digits (default %9N = nanoseconds). Accepts any precision value from 1-9 (e.g., %3N = 3 digits, %5N = 5 digits, %9N = 9 digits). The precision directly controls the number of digits displayed | -| %p | AM or PM | -| %Q | Subsecond component (default milliseconds). 
Can specify precision: %3Q = milliseconds, %6Q = microseconds, %9Q = nanoseconds. Other precision values (e.g., %5Q) default to %3Q | -| %s | UNIX Epoch timestamp in seconds | -| %S | Second (00..59) | -| %T | Time in 24-hour notation (%H:%M:%S) | -| %U | Week of year starting from 0 (00..53) | -| %V | ISO week number (01..53) | -| %w | Weekday as decimal (0=Sunday..6=Saturday) | -| %x | Date in MM/dd/yyyy format (e.g., 07/13/2019) | -| %X | Time in HH:mm:ss format (e.g., 09:30:00) | -| %y | Year without century (00..99) | -| %Y | Year with century | -| %z | Timezone offset (+hhmm or -hhmm) | -| %:z | Timezone offset with colon (+hh:mm or -hh:mm) | -| %::z | Timezone offset with colons (+hh:mm:ss) | -| %:::z | Timezone offset hour only (+hh or -hh) | -| %Z | Timezone abbreviation (e.g., EST, PDT) | -| %% | Literal % character | - - -Examples +| `%a` | Abbreviated weekday name (Mon..Sun) | +| `%A` | Weekday name (Mon..Sun) - Note: Locale.ROOT uses abbreviated form | +| `%b` | Abbreviated month name (Jan..Dec) | +| `%B` | Month name (Jan..Dec) - Note: Locale.ROOT uses abbreviated form | +| `%c` | Date and time (e.g., Mon Jul 18 09:30:00 2019) | +| `%C` | Century as 2-digit decimal number | +| `%d` | Day of the month, zero-padded (01..31) | +| `%e` | Day of the month, space-padded ( 1..31) | +| `%Ez` | Timezone offset in minutes from UTC (e.g., +0 for UTC, +330 for IST, -300 for EST) | +| `%f` | Microseconds as decimal number (000000..999999) | +| `%F` | ISO 8601 date format (`%Y-%m-%d`) | +| `%g` | ISO 8601 year without century (00..99) | +| `%G` | ISO 8601 year with century | +| `%H` | Hour (24-hour clock) (00..23) | +| `%I` | Hour (12-hour clock) (01..12) | +| `%j` | Day of year (001..366) | +| `%k` | Hour (24-hour clock), space-padded ( 0..23) | +| `%m` | Month as decimal number (01..12) | +| `%M` | Minute (00..59) | +| `%N` | Subsecond digits (default `%9N` = nanoseconds). 
Accepts any precision value from 1-9 (e.g., `%3N` = 3 digits, `%5N` = 5 digits, `%9N` = 9 digits). The precision directly controls the number of digits displayed | +| `%p` | AM or PM | +| `%Q` | Subsecond component (default milliseconds). Can specify precision: `%3Q` = milliseconds, `%6Q` = microseconds, `%9Q` = nanoseconds. Other precision values (e.g., `%5Q`) default to `%3Q` | +| `%s` | UNIX Epoch timestamp in seconds | +| `%S` | Second (00..59) | +| `%T` | Time in 24-hour notation (`%H:%M:%S`) | +| `%U` | Week of year starting from 0 (00..53) | +| `%V` | ISO week number (01..53) | +| `%w` | Weekday as decimal (0=Sunday..6=Saturday) | +| `%x` | Date in MM/dd/yyyy format (e.g., 07/13/2019) | +| `%X` | Time in HH:mm:ss format (e.g., 09:30:00) | +| `%y` | Year without century (00..99) | +| `%Y` | Year with century | +| `%z` | Timezone offset (+hhmm or -hhmm) | +| `%:z` | Timezone offset with colon (+hh:mm or -hh:mm) | +| `%::z` | Timezone offset with colons (+hh:mm:ss) | +| `%:::z` | Timezone offset hour only (+hh or -hh) | +| `%Z` | Timezone abbreviation (e.g., EST, PDT) | +| `%%` | Literal % character | + + +**Examples** ```ppl ignore source=people | eval `strftime(1521467703, "%Y-%m-%dT%H:%M:%S")` = strftime(1521467703, "%Y-%m-%dT%H:%M:%S") | fields `strftime(1521467703, "%Y-%m-%dT%H:%M:%S")` @@ -1861,15 +2049,17 @@ fetched rows / total rows = 1/1 ``` ## STR_TO_DATE -### Description +**Usage**: `STR_TO_DATE(string, format)` -Usage: `str_to_date(string, string)` is used to extract a TIMESTAMP from the first argument string using the formats specified in the second argument string. -The input argument must have enough information to be parsed as a DATE, TIMESTAMP, or TIME. -Acceptable string format specifiers are the same as those used in the [DATE_FORMAT](#date_format) function. -It returns NULL when a statement cannot be parsed due to an invalid pair of arguments, and when 0 is provided for any DATE field. 
Otherwise, it will return a TIMESTAMP with the parsed values (as well as default values for any field that was not parsed). -**Argument type:** `STRING, STRING` -**Return type:** `TIMESTAMP` -### Example +Extracts a `TIMESTAMP` from the first argument string using the formats specified in the second argument string. The input argument must have enough information to be parsed as a `DATE`, `TIMESTAMP`, or `TIME`. Acceptable string format specifiers are the same as those used in the [`DATE_FORMAT`](#date_format) function. Returns `NULL` when the statement cannot be parsed due to an invalid pair of arguments, and when 0 is provided for any `DATE` field. Otherwise, returns a `TIMESTAMP` with the parsed values (as well as default values for any field that was not parsed). + +**Parameters**: +- `string` (Required): A `STRING` value to parse. +- `format` (Required): A `STRING` format specifier. + +**Return type**: `TIMESTAMP` + +#### Example ```ppl @@ -1879,7 +2069,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -1895,18 +2085,20 @@ fetched rows / total rows = 1/1 ## SUBDATE -### Description - -Usage: `subdate(date, INTERVAL expr unit)` / subdate(date, days) subtracts the interval expr from date; subdate(date, days) subtracts the second argument as integer number of days from date. -If first argument is TIME, today's date is used; if first argument is DATE, time at midnight is used. -**Argument type:** `DATE/TIMESTAMP/TIME, INTERVAL/LONG` -Return type map: -(DATE/TIMESTAMP/TIME, INTERVAL) -> TIMESTAMP -(DATE, LONG) -> DATE -(TIMESTAMP/TIME, LONG) -> TIMESTAMP -Synonyms: [DATE_SUB](#date_sub) when invoked with the INTERVAL form of the second argument. -Antonyms: [ADDDATE](#adddate) -### Example +**Usage**: `SUBDATE(date, INTERVAL expr unit)` or `SUBDATE(date, days)` + +Subtracts the interval `expr` from `date`, or subtracts the second argument as an integer number of days from `date`. 
If the first argument is `TIME`, today's date is used. If the first argument is `DATE`, the time at midnight is used. + +**Parameters**: +- `date` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. +- `expr` (Required): Either an `INTERVAL` expression or a `LONG` number of days. + +**Return type**: `TIMESTAMP` (with INTERVAL), `DATE` (DATE with LONG), `TIMESTAMP` (TIMESTAMP/TIME with LONG) + +Synonyms: [`DATE_SUB`](#date_sub) when invoked with the INTERVAL form of the second argument +Antonyms: [`ADDDATE`](#adddate) + +#### Example ```ppl @@ -1916,7 +2108,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -1932,15 +2124,19 @@ fetched rows / total rows = 1/1 ## SUBTIME -### Description +**Usage**: `SUBTIME(expr1, expr2)` -Usage: `subtime(expr1, expr2)` subtracts expr2 from expr1 and returns the result. If argument is TIME, today's date is used; if argument is DATE, time at midnight is used. -**Argument type:** `DATE/TIMESTAMP/TIME, DATE/TIMESTAMP/TIME` -Return type map: -(DATE/TIMESTAMP, DATE/TIMESTAMP/TIME) -> TIMESTAMP -(TIME, DATE/TIMESTAMP/TIME) -> TIME -Antonyms: [ADDTIME](#addtime) -### Example +Subtracts `expr2` from `expr1` and returns the result. If an argument is `TIME`, today's date is used. If an argument is `DATE`, the time at midnight is used. + +**Parameters**: +- `expr1` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. +- `expr2` (Required): A `DATE`, `TIMESTAMP`, or `TIME` value. 
+ +**Return type**: `TIMESTAMP` (DATE/TIMESTAMP with DATE/TIMESTAMP/TIME), `TIME` (TIME with DATE/TIMESTAMP/TIME) + +Antonyms: [`ADDTIME`](#addtime) + +#### Example ```ppl @@ -1950,7 +2146,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -1972,7 +2168,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -1994,7 +2190,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2016,7 +2212,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2038,7 +2234,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2054,15 +2250,16 @@ fetched rows / total rows = 1/1 ## SYSDATE -### Description +**Usage**: `SYSDATE()` or `SYSDATE(precision)` -Returns the current date and time as a value in 'YYYY-MM-DD hh:mm:ss[.nnnnnn]'. -SYSDATE() returns the date and time at which it executes in UTC. This differs from the behavior for [NOW()](#now), which returns a constant time that indicates the time at which the statement began to execute. -If an argument is given, it specifies a fractional seconds precision from 0 to 6, the return value includes a fractional seconds part of that many digits. -Optional argument type: INTEGER -**Return type:** `TIMESTAMP` -Specification: SYSDATE([INTEGER]) -> TIMESTAMP -### Example +Returns the current date and time as a value in 'YYYY-MM-DD hh:mm:ss[.nnnnnn]'. `SYSDATE()` returns the date and time at which it executes in UTC. This differs from the behavior for [`NOW()`](#now), which returns a constant time that indicates the time at which the statement began to execute. If an argument is given, it specifies a fractional seconds precision from 0 to 6, and the return value includes a fractional seconds part of that many digits. + +**Parameters**: +- `precision` (Optional): An `INTEGER` value from 0 to 6 for fractional seconds precision. 
+ +**Return type**: `TIMESTAMP` + +#### Example ```ppl ignore @@ -2072,7 +2269,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2088,12 +2285,16 @@ fetched rows / total rows = 1/1 ## TIME -### Description +**Usage**: `TIME(expr)` -Usage: `time(expr)` constructs a time type with the input string expr as a time. If the argument is of date/time/timestamp, it extracts the time value part from the expression. -**Argument type:** `STRING/DATE/TIME/TIMESTAMP` -**Return type:** `TIME` -### Example +Constructs a time type with the input string `expr` as a time. If the argument is of date/time/timestamp, it extracts the time value part from the expression. + +**Parameters**: +- `expr` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `TIME` + +#### Example ```ppl @@ -2103,7 +2304,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2125,7 +2326,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2147,7 +2348,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2169,7 +2370,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2185,33 +2386,32 @@ fetched rows / total rows = 1/1 ## TIME_FORMAT -### Description +**Usage**: `TIME_FORMAT(time, format)` + +Formats the `time` argument using the specifiers in the `format` argument. This supports a subset of the time format specifiers available for the [`DATE_FORMAT`](#date_format) function. Using date format specifiers supported by [`DATE_FORMAT`](#date_format) will return 0 or `NULL`. Acceptable format specifiers are listed in the following table. If an argument of type `DATE` is passed in, it is treated as a `TIMESTAMP` at midnight (i.e., 00:00:00). -Usage: `time_format(time, format)` formats the time argument using the specifiers in the format argument. 
-This supports a subset of the time format specifiers available for the [date_format](#date_format) function. -Using date format specifiers supported by [date_format](#date_format) will return 0 or null. -Acceptable format specifiers are listed in the table below. -If an argument of type DATE is passed in, it is treated as a TIMESTAMP at midnight (i.e., 00:00:00). -The following table describes the available specifier arguments. +**Parameters**: +- `time` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. +- `format` (Required): A `STRING` format specifier. +**Return type**: `STRING` + +The following table describes the available specifier arguments: | Specifier | Description | | --- | --- | -| %f | Microseconds (000000..999999) | -| %H | Hour (00..23) | -| %h | Hour (01..12) | -| %I | Hour (01..12) | -| %i | Minutes, numeric (00..59) | -| %p | AM or PM | -| %r | Time, 12-hour (hh:mm:ss followed by AM or PM) | -| %S | Seconds (00..59) | -| %s | Seconds (00..59) | -| %T | Time, 24-hour (hh:mm:ss) | - - -**Argument type:** `STRING/DATE/TIME/TIMESTAMP, STRING` -**Return type:** `STRING` -### Example +| `%f` | Microseconds (000000..999999) | +| `%H` | Hour (00..23) | +| `%h` | Hour (01..12) | +| `%I` | Hour (01..12) | +| `%i` | Minutes, numeric (00..59) | +| `%p` | `AM` or `PM` | +| `%r` | Time, 12-hour (hh:mm:ss followed by `AM` or `PM`) | +| `%S` | Seconds (00..59) | +| `%s` | Seconds (00..59) | +| `%T` | Time, 24-hour (hh:mm:ss) | + +#### Example ```ppl @@ -2221,7 +2421,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2237,12 +2437,16 @@ fetched rows / total rows = 1/1 ## TIME_TO_SEC -### Description +**Usage**: `TIME_TO_SEC(time)` -Usage: `time_to_sec(time)` returns the time argument, converted to seconds. -**Argument type:** `STRING/TIME/TIMESTAMP` -**Return type:** `LONG` -### Example +Returns the `time` argument, converted to seconds. 
+ +**Parameters**: +- `time` (Required): A `STRING`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `LONG` + +#### Example ```ppl @@ -2252,7 +2456,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2268,12 +2472,17 @@ fetched rows / total rows = 1/1 ## TIMEDIFF -### Description +**Usage**: `TIMEDIFF(time1, time2)` -Usage: returns the difference between two time expressions as a time. -**Argument type:** `TIME, TIME` -**Return type:** `TIME` -### Example +Returns the difference between two time expressions as a time. + +**Parameters**: +- `time1` (Required): A `TIME` value. +- `time2` (Required): A `TIME` value. + +**Return type**: `TIME` + +#### Example ```ppl @@ -2283,7 +2492,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2299,15 +2508,17 @@ fetched rows / total rows = 1/1 ## TIMESTAMP -### Description +**Usage**: `TIMESTAMP(expr)` or `TIMESTAMP(expr1, expr2)` -Usage: `timestamp(expr)` constructs a timestamp type with the input string `expr` as an timestamp. If the argument is not a string, it casts `expr` to timestamp type with default timezone UTC. If argument is a time, it applies today's date before cast. -With two arguments `timestamp(expr1, expr2)` adds the time expression `expr2` to the date or timestamp expression `expr1` and returns the result as a timestamp value. -**Argument type:** `STRING/DATE/TIME/TIMESTAMP` -Return type map: -(STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP -(STRING/DATE/TIME/TIMESTAMP, STRING/DATE/TIME/TIMESTAMP) -> TIMESTAMP -### Example +Constructs a timestamp type with the input string `expr` as a timestamp. If the argument is not a string, it casts `expr` to a timestamp type with the default time zone UTC. If the argument is a time, it applies today's date before the cast. With two arguments, adds the time expression `expr2` to the date or timestamp expression `expr1` and returns the result as a timestamp value. 
+ +**Parameters**: +- `expr` / `expr1` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. +- `expr2` (Optional): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` time expression to add to `expr1`. + +**Return type**: `TIMESTAMP` + +#### Example ```ppl @@ -2317,7 +2528,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2333,14 +2544,18 @@ fetched rows / total rows = 1/1 ## TIMESTAMPADD -### Description +**Usage**: `TIMESTAMPADD(interval, count, datetime)` + +Returns a `TIMESTAMP` value based on a passed-in `DATE`/`TIME`/`TIMESTAMP`/`STRING` argument and the `INTERVAL` and `INTEGER` arguments that determine the amount of time to be added. If the third argument is a `STRING`, it must be formatted as a valid `TIMESTAMP`. If only a `TIME` is provided, a `TIMESTAMP` is still returned with the `DATE` portion filled in using the current date. If the third argument is a `DATE`, it will be automatically converted to a `TIMESTAMP`. -Usage: Returns a TIMESTAMP value based on a passed in DATE/TIME/TIMESTAMP/STRING argument and an INTERVAL and INTEGER argument which determine the amount of time to be added. -If the third argument is a STRING, it must be formatted as a valid TIMESTAMP. If only a TIME is provided, a TIMESTAMP is still returned with the DATE portion filled in using the current date. -If the third argument is a DATE, it will be automatically converted to a TIMESTAMP. -**Argument type:** `INTERVAL, INTEGER, DATE/TIME/TIMESTAMP/STRING` -INTERVAL must be one of the following tokens: [MICROSECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR] -Examples +**Parameters**: +- `interval` (Required): One of: `MICROSECOND`, `SECOND`, `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`. +- `count` (Required): An `INTEGER` number of intervals to add. +- `datetime` (Required): A `DATE`, `TIME`, `TIMESTAMP`, or `STRING` value. 
+ +**Return type**: `TIMESTAMP` + +**Examples** ```ppl @@ -2351,7 +2566,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2367,15 +2582,18 @@ fetched rows / total rows = 1/1 ## TIMESTAMPDIFF -### Description +**Usage**: `TIMESTAMPDIFF(interval, start, end)` + +Returns the difference between the start and end date/times in interval units. If a `TIME` is provided as an argument, it will be converted to a `TIMESTAMP` with the `DATE` portion filled in using the current date. Arguments will be automatically converted to a `TIME`/`TIMESTAMP` when appropriate. Any argument that is a `STRING` must be formatted as a valid `TIMESTAMP`. -Usage: `TIMESTAMPDIFF(interval, start, end)` returns the difference between the start and end date/times in interval units. -If a TIME is provided as an argument, it will be converted to a TIMESTAMP with the DATE portion filled in using the current date. -Arguments will be automatically converted to a TIME/TIMESTAMP when appropriate. -Any argument that is a STRING must be formatted as a valid TIMESTAMP. -**Argument type:** `INTERVAL, DATE/TIME/TIMESTAMP/STRING, DATE/TIME/TIMESTAMP/STRING` -INTERVAL must be one of the following tokens: [MICROSECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR] -Examples +**Parameters**: +- `interval` (Required): One of: `MICROSECOND`, `SECOND`, `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`. +- `start` (Required): A `DATE`, `TIME`, `TIMESTAMP`, or `STRING` value. +- `end` (Required): A `DATE`, `TIME`, `TIMESTAMP`, or `STRING` value. + +**Return type**: `LONG` + +**Examples** ```ppl @@ -2386,7 +2604,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2402,12 +2620,16 @@ fetched rows / total rows = 1/1 ## TO_DAYS -### Description +**Usage**: `TO_DAYS(date)` -Usage: `to_days(date)` returns the day number (the number of days since year 0) of the given date. Returns NULL if date is invalid. 
-**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `LONG` -### Example +Returns the day number (the number of days since year 0) of the given date. Returns `NULL` if date is invalid. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `LONG` + +#### Example ```ppl @@ -2417,7 +2639,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2433,13 +2655,16 @@ fetched rows / total rows = 1/1 ## TO_SECONDS -### Description +**Usage**: `TO_SECONDS(date)` -Usage: `to_seconds(date)` returns the number of seconds since the year 0 of the given value. Returns NULL if value is invalid. -An argument of a LONG type can be used. It must be formatted as YMMDD, YYMMDD, YYYMMDD or YYYYMMDD. Note that a LONG type argument cannot have leading 0s as it will be parsed using an octal numbering system. -**Argument type:** `STRING/LONG/DATE/TIME/TIMESTAMP` -**Return type:** `LONG` -### Example +Returns the number of seconds since the year 0 of the given value. Returns `NULL` if value is invalid. An argument of a `LONG` type can be used. It must be formatted as YMMDD, YYMMDD, YYYMMDD, or YYYYMMDD. Note that a `LONG` type argument cannot have leading 0s as it will be parsed using an octal numbering system. + +**Parameters**: +- `date` (Required): A `STRING`, `LONG`, `DATE`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `LONG` + +#### Example ```ppl @@ -2450,7 +2675,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2466,15 +2691,16 @@ fetched rows / total rows = 1/1 ## UNIX_TIMESTAMP -### Description +**Usage**: `UNIX_TIMESTAMP()` or `UNIX_TIMESTAMP(date)` -Usage: Converts given argument to Unix time (seconds since Epoch - very beginning of year 1970). If no argument given, it returns the current Unix time. -The date argument may be a DATE, or TIMESTAMP string, or a number in YYMMDD, YYMMDDhhmmss, YYYYMMDD, or YYYYMMDDhhmmss format. 
If the argument includes a time part, it may optionally include a fractional seconds part. -If argument is in invalid format or outside of range 1970-01-01 00:00:00 - 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), function returns NULL. -You can use [FROM_UNIXTIME](#from_unixtime) to do reverse conversion. -**Argument type:** `\/DOUBLE/DATE/TIMESTAMP` -**Return type:** `DOUBLE` -### Example +Converts the given argument to Unix time (seconds since the epoch, that is, the very beginning of the year 1970). If no argument is given, it returns the current Unix time. The date argument may be a `DATE` or `TIMESTAMP` string, or a number in YYMMDD, YYMMDDhhmmss, YYYYMMDD, or YYYYMMDDhhmmss format. If the argument includes a time part, it may optionally include a fractional seconds part. If the argument is in an invalid format or outside the range 1970-01-01 00:00:00 to 3001-01-18 23:59:59.999999 (0 to 32536771199.999999 epoch time), the function returns `NULL`. You can use [`FROM_UNIXTIME`](#from_unixtime) to perform the reverse conversion. + +**Parameters**: +- `date` (Optional): A `DOUBLE`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `DOUBLE` + +#### Example ```ppl @@ -2484,7 +2710,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2500,12 +2726,15 @@ fetched rows / total rows = 1/1 ## UTC_DATE -### Description +**Usage**: `UTC_DATE()` -Returns the current UTC date as a value in 'YYYY-MM-DD'. -**Return type:** `DATE` -Specification: UTC_DATE() -> DATE -### Example +Returns the current UTC date as a value in `YYYY-MM-DD` format. + +**Parameters**: None + +**Return type**: `DATE` + +#### Example ```ppl ignore @@ -2515,7 +2744,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2531,12 +2760,15 @@ fetched rows / total rows = 1/1 ## UTC_TIME -### Description +**Usage**: `UTC_TIME()` Returns the current UTC time as a value in 'hh:mm:ss'. 
-**Return type:** `TIME` -Specification: UTC_TIME() -> TIME -### Example + +**Parameters**: None + +**Return type**: `TIME` + +#### Example ```ppl ignore @@ -2546,7 +2778,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2562,12 +2794,15 @@ fetched rows / total rows = 1/1 ## UTC_TIMESTAMP -### Description +**Usage**: `UTC_TIMESTAMP()` Returns the current UTC timestamp as a value in 'YYYY-MM-DD hh:mm:ss'. -**Return type:** `TIMESTAMP` -Specification: UTC_TIMESTAMP() -> TIMESTAMP -### Example + +**Parameters**: None + +**Return type**: `TIMESTAMP` + +#### Example ```ppl ignore @@ -2577,7 +2812,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2593,11 +2828,19 @@ fetched rows / total rows = 1/1 ## WEEK -### Description +**Usage**: `WEEK(date)` or `WEEK(date, mode)` + +Returns the week number for `date`. If the mode argument is omitted, the default mode 0 is used. -Usage: `week(date[, mode])` returns the week number for date. If the mode argument is omitted, the default mode 0 is used. -The following table describes how the mode argument works. +**Parameters**: +- `date` (Required): A `DATE`, `TIMESTAMP`, or `STRING` value. +- `mode` (Optional): An `INTEGER` mode value (0-7). +**Return type**: `INTEGER` + +Synonyms: [`WEEK_OF_YEAR`](#week_of_year) + +The following table describes how the `mode` parameter works. | Mode | First day of week | Range | Week 1 is the first week ... | | --- | --- | --- | --- | @@ -2609,12 +2852,8 @@ The following table describes how the mode argument works. 
| 5 | Monday | 0-53 | with a Monday in this year | | 6 | Sunday | 1-53 | with 4 or more days this year | | 7 | Monday | 1-53 | with a Monday in this year | - -**Argument type:** `DATE/TIMESTAMP/STRING` -**Return type:** `INTEGER` -Synonyms: [WEEK_OF_YEAR](#week_of_year) -### Example +#### Example ```ppl @@ -2624,7 +2863,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2640,13 +2879,16 @@ fetched rows / total rows = 1/1 ## WEEKDAY -### Description +**Usage**: `WEEKDAY(date)` -Usage: `weekday(date)` returns the weekday index for date (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). -It is similar to the [dayofweek](#dayofweek) function, but returns different indexes for each day. -**Argument type:** `STRING/DATE/TIME/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the weekday index for `date` (0 = Monday, 1 = Tuesday, ..., 6 = Sunday). It is similar to the [`DAYOFWEEK`](#dayofweek) function, but returns different indexes for each day. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +#### Example ```ppl @@ -2657,7 +2899,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2673,11 +2915,19 @@ fetched rows / total rows = 1/1 ## WEEK_OF_YEAR -### Description +**Usage**: `WEEK_OF_YEAR(date)` or `WEEK_OF_YEAR(date, mode)` + +Returns the week number for `date`. If the mode argument is omitted, the default mode 0 is used. + +**Parameters**: +- `date` (Required): A `DATE`, `TIMESTAMP`, or `STRING` value. +- `mode` (Optional): An `INTEGER` mode value (0-7). + +**Return type**: `INTEGER` -Usage: `week_of_year(date[, mode])` returns the week number for date. If the mode argument is omitted, the default mode 0 is used. -The following table describes how the mode argument works. 
+Synonyms: [`WEEK`](#week) +The following table describes how the mode argument works: | Mode | First day of week | Range | Week 1 is the first week ... | | --- | --- | --- | --- | @@ -2689,12 +2939,8 @@ The following table describes how the mode argument works. | 5 | Monday | 0-53 | with a Monday in this year | | 6 | Sunday | 1-53 | with 4 or more days this year | | 7 | Monday | 1-53 | with a Monday in this year | - -**Argument type:** `DATE/TIMESTAMP/STRING` -**Return type:** `INTEGER` -Synonyms: [WEEK](#week) -### Example +#### Example ```ppl @@ -2704,7 +2950,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2720,12 +2966,16 @@ fetched rows / total rows = 1/1 ## YEAR -### Description +**Usage**: `YEAR(date)` -Usage: `year(date)` returns the year for date, in the range 1000 to 9999, or 0 for the “zero” date. -**Argument type:** `STRING/DATE/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the year for `date`, in the range 1000 to 9999, or 0 for the "zero" date. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, or `TIMESTAMP` value. + +**Return type**: `INTEGER` + +#### Example ```ppl @@ -2735,7 +2985,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text @@ -2751,12 +3001,17 @@ fetched rows / total rows = 1/1 ## YEARWEEK -### Description +**Usage**: `YEARWEEK(date)` or `YEARWEEK(date, mode)` -Usage: `yearweek(date[, mode])` returns the year and week for date as an integer. It accepts and optional mode arguments aligned with those available for the [WEEK](#week) function. -**Argument type:** `STRING/DATE/TIME/TIMESTAMP` -**Return type:** `INTEGER` -### Example +Returns the year and week for `date` as an integer. It accepts an optional mode argument aligned with those available for the [`WEEK`](#week) function. + +**Parameters**: +- `date` (Required): A `STRING`, `DATE`, `TIME`, or `TIMESTAMP` value. +- `mode` (Optional): An `INTEGER` mode value (0-7). 
+ +**Return type**: `INTEGER` + +#### Example ```ppl @@ -2767,7 +3022,7 @@ source=people ``` -Expected output: +The query returns the following results: ```text diff --git a/docs/user/ppl/functions/expressions.md b/docs/user/ppl/functions/expressions.md index 999531cabbe..e42d867705c 100644 --- a/docs/user/ppl/functions/expressions.md +++ b/docs/user/ppl/functions/expressions.md @@ -1,34 +1,27 @@ -# Expressions +# Expressions -## Introduction +Expressions, particularly value expressions, return a scalar value. Expressions have different types and forms. For example, there are literal values as atomic expressions, as well as arithmetic, predicate, and function expressions built on top of them. You can use expressions in different clauses, such as arithmetic expressions in the `Filter` or `Stats` commands. -Expressions, particularly value expressions, are those which return a scalar value. Expressions have different types and forms. For example, there are literal values as atom expression and arithmetic, predicate and function expression built on top of them. And also expressions can be used in different clauses, such as using arithmetic expression in `Filter`, `Stats` command. -## Arithmetic Operators +## Arithmetic operators -### Description +Arithmetic expressions are formed by combining numeric literals and binary arithmetic operators. The following operators are available: +1. `+`: Addition +2. `-`: Subtraction +3. `*`: Multiplication +4. `/`: Division. When [`plugins.ppl.syntax.legacy.preferred`](../admin/settings.md) is `true` (default), integer operands follow the legacy truncating result. When the setting is `false`, the operands are promoted to floating-point, preserving the fractional part. Division by zero returns `NULL`. +5. `%`: Modulo. This operator can only be used with integers and returns the remainder of the division. 
-#### Operators +### Precedence -Arithmetic expression is an expression formed by numeric literals and binary arithmetic operators as follows: -1. `+`: Add. -2. `-`: Subtract. -3. `*`: Multiply. -4. `/`: Divide. Integer operands follow the legacy truncating result when - - [plugins.ppl.syntax.legacy.preferred](../admin/settings.md) is `true` (default). When the - setting is `false` the operands are promoted to floating point, preserving - the fractional part. Division by zero still returns `NULL`. -5. `%`: Modulo. This can be used with integers only with remainder of the division as result. - -#### Precedence +You can use parentheses to control the precedence of arithmetic operators. Otherwise, operators with higher precedence are performed first. + +### Type conversion -Parentheses can be used to control the precedence of arithmetic operators. Otherwise, operators of higher precedence is performed first. -#### Type Conversion +The system performs implicit type conversion when determining which operator to use. For example, adding an integer to a real number matches the signature `+(double,double)`, which results in a real number. The same type conversion rules apply to function calls. -Implicit type conversion is performed when looking up operator signature. For example, an integer `+` a real number matches signature `+(double,double)` which results in a real number. This rule also applies to function call discussed below. -### Examples +### Examples -Here is an example for different type of arithmetic expressions +The following are examples of different types of arithmetic expressions: ```ppl source=accounts @@ -36,7 +29,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 3/3 @@ -49,36 +42,46 @@ fetched rows / total rows = 3/3 +-----+ ``` -## Predicate Operators +## Predicate operators + +Predicate operators are expressions that evaluate to `true` or `false`. 
+ +Comparisons for `MISSING` and `NULL` values follow these rules: -### Description +- `MISSING` values only equal other `MISSING` values and are less than all other values. +- `NULL` values equal other `NULL` values, are greater than `MISSING` values, but less than all other values. -Predicate operator is an expression that evaluated to be ture. The MISSING and NULL value comparison has following the rule. MISSING value only equal to MISSING value and less than all the other values. NULL value equals to NULL value, large than MISSING value, but less than all the other values. -#### Operators +### Operators -| name | description | +| Name | Description | | --- | --- | -| > | Greater than operator | -| >= | Greater than or equal operator | -| < | Less than operator | -| != | Not equal operator | -| <= | Less than or equal operator | -| = | Equal operator | -| == | Equal operator (alternative syntax) | -| LIKE | Simple Pattern matching | -| IN | NULL value test | -| AND | AND operator | -| OR | OR operator | -| XOR | XOR operator | -| NOT | NOT NULL value test | - -It is possible to compare datetimes. When comparing different datetime types, for example `DATE` and `TIME`, both converted to `DATETIME`. -The following rule is applied on coversion: a `TIME` applied to today's date; `DATE` is interpreted at midnight. -### Examples - -#### Basic Predicate Operator - -Here is an example for comparison operators +| `>` | Greater than | +| `>=` | Greater than or equal to | +| `<` | Less than | +| `!=` | Not equal to | +| `<=` | Less than or equal to | +| `=` | Equal to | +| `==` | Equal to (alternative syntax) | +| `LIKE` | Simple pattern matching | +| `IN` | Value list membership test | +| `AND` | Logical AND | +| `OR` | Logical OR | +| `XOR` | Logical XOR | +| `NOT` | Logical NOT | + +You can compare date and time values. When comparing different date and time types (for example, `DATE` and `TIME`), both values are converted to `DATETIME`. 
+ +The following conversion rules are applied: +- A `TIME` value is combined with today's date. +- A `DATE` value is interpreted as midnight on that date. + +### Examples + +The following examples demonstrate how to use predicate operators in PPL queries. + +#### Basic predicate operators + +The following is an example of comparison operators: ```ppl source=accounts @@ -86,7 +89,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -97,7 +100,7 @@ fetched rows / total rows = 1/1 +-----+ ``` -The `==` operator can be used as an alternative to `=` for equality comparisons +The `==` operator can be used as an alternative to `=` for equality comparisons. ```ppl source=accounts @@ -105,7 +108,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -116,10 +119,10 @@ fetched rows / total rows = 1/1 +-----+ ``` -Note: Both `=` and `==` perform the same equality comparison. You can use either based on your preference. +> **Note**: Both `=` and `==` perform the same equality comparison. You can use either based on your preference. #### IN -IN operator test field in value lists +The `IN` operator tests whether a field value is in the specified list of values. ```ppl source=accounts @@ -127,7 +130,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -141,7 +144,7 @@ fetched rows / total rows = 2/2 #### OR -OR operator +The `OR` operator performs a logical OR operation between two Boolean expressions. 
```ppl source=accounts @@ -149,7 +152,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -163,7 +166,7 @@ fetched rows / total rows = 2/2 #### NOT -NOT operator +The `NOT` operator performs a logical NOT operation, negating a Boolean expression. ```ppl source=accounts @@ -171,7 +174,7 @@ source=accounts | fields age ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 diff --git a/docs/user/ppl/functions/index.md b/docs/user/ppl/functions/index.md index cdfbbd201ce..2ade4a09b98 100644 --- a/docs/user/ppl/functions/index.md +++ b/docs/user/ppl/functions/index.md @@ -3,237 +3,238 @@ PPL supports a wide range of built-in functions for data processing and analysis. These functions are organized into categories based on their functionality and can be used within PPL queries to manipulate and transform data. -- [Aggregation Functions](aggregations.md) - - [COUNT](aggregations.md/#count) - - [SUM](aggregations.md/#sum) - - [AVG](aggregations.md/#avg) - - [MAX](aggregations.md/#max) - - [MIN](aggregations.md/#min) - - [VAR_SAMP](aggregations.md/#var_samp) - - [VAR_POP](aggregations.md/#var_pop) - - [STDDEV_SAMP](aggregations.md/#stddev_samp) - - [STDDEV_POP](aggregations.md/#stddev_pop) - - [DISTINCT_COUNT, DC](aggregations.md/#distinct_count-dc) - - [DISTINCT_COUNT_APPROX](aggregations.md/#distinct_count_approx) - - [EARLIEST](aggregations.md/#earliest) - - [LATEST](aggregations.md/#latest) - - [TAKE](aggregations.md/#take) - - [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-or-percentile_approx) - - [MEDIAN](aggregations.md/#median) - - [FIRST](aggregations.md/#first) - - [LAST](aggregations.md/#last) - - [LIST](aggregations.md/#list) - - [VALUES](aggregations.md/#values) - -- [Collection Functions](collection.md) - - [ARRAY](collection.md/#array) - - [ARRAY_LENGTH](collection.md/#array_length) - - 
[FORALL](collection.md/#forall) - - [EXISTS](collection.md/#exists) - - [FILTER](collection.md/#filter) - - [TRANSFORM](collection.md/#transform) - - [REDUCE](collection.md/#reduce) - - [MVJOIN](collection.md/#mvjoin) - - [MVAPPEND](collection.md/#mvappend) - - [SPLIT](collection.md/#split) - - [MVDEDUP](collection.md/#mvdedup) - - [MVFIND](collection.md/#mvfind) - - [MVINDEX](collection.md/#mvindex) - - [MVMAP](collection.md/#mvmap) - - [MVZIP](collection.md/#mvzip) - -- [Condition Functions](condition.md) - - [ISNULL](condition.md/#isnull) - - [ISNOTNULL](condition.md/#isnotnull) - - [EXISTS](condition.md/#exists) - - [IFNULL](condition.md/#ifnull) - - [NULLIF](condition.md/#nullif) - - [IF](condition.md/#if) - - [CASE](condition.md/#case) - - [COALESCE](condition.md/#coalesce) - - [ISPRESENT](condition.md/#ispresent) - - [ISBLANK](condition.md/#isblank) - - [ISEMPTY](condition.md/#isempty) - - [EARLIEST](condition.md/#earliest) - - [LATEST](condition.md/#latest) - - [REGEXP_MATCH](condition.md/#regexp_match) - - [CONTAINS](condition.md/#contains) - -- [Type Conversion Functions](conversion.md) - - [CAST](conversion.md/#cast) - - [TOSTRING](conversion.md/#tostring) - - [TONUMBER](conversion.md/#tonumber) - -- [Cryptographic Functions](cryptographic.md) - - [SHA1](cryptographic.md/#sha1) - - [SHA2](cryptographic.md/#sha2) - -- [Date and Time Functions](datetime.md) - - [ADDDATE](datetime.md/#adddate) - - [ADDTIME](datetime.md/#addtime) - - [CONVERT_TZ](datetime.md/#convert_tz) - - [CURDATE](datetime.md/#curdate) - - [CURRENT_DATE](datetime.md/#current_date) - - [CURRENT_TIME](datetime.md/#current_time) - - [CURRENT_TIMESTAMP](datetime.md/#current_timestamp) - - [CURTIME](datetime.md/#curtime) - - [DATE](datetime.md/#date) - - [DATE_ADD](datetime.md/#date_add) - - [DATE_FORMAT](datetime.md/#date_format) - - [DATETIME](datetime.md/#datetime) - - [DATE_SUB](datetime.md/#date_sub) - - [DATEDIFF](datetime.md/#datediff) - - [DAY](datetime.md/#day) - - 
[DAYNAME](datetime.md/#dayname) - - [DAYOFMONTH](datetime.md/#dayofmonth) - - [DAY_OF_MONTH](datetime.md/#day_of_month) - - [DAYOFWEEK](datetime.md/#dayofweek) - - [DAY_OF_WEEK](datetime.md/#day_of_week) - - [DAYOFYEAR](datetime.md/#dayofyear) - - [DAY_OF_YEAR](datetime.md/#day_of_year) - - [EXTRACT](datetime.md/#extract) - - [FROM_DAYS](datetime.md/#from_days) - - [FROM_UNIXTIME](datetime.md/#from_unixtime) - - [GET_FORMAT](datetime.md/#get_format) - - [HOUR](datetime.md/#hour) - - [HOUR_OF_DAY](datetime.md/#hour_of_day) - - [LAST_DAY](datetime.md/#last_day) - - [LOCALTIMESTAMP](datetime.md/#localtimestamp) - - [LOCALTIME](datetime.md/#localtime) - - [MAKEDATE](datetime.md/#makedate) - - [MAKETIME](datetime.md/#maketime) - - [MICROSECOND](datetime.md/#microsecond) - - [MINUTE](datetime.md/#minute) - - [MINUTE_OF_HOUR](datetime.md/#minute_of_hour) - - [MONTH](datetime.md/#month) - - [MONTH_OF_YEAR](datetime.md/#month_of_year) - - [MONTHNAME](datetime.md/#monthname) - - [NOW](datetime.md/#now) - - [PERIOD_ADD](datetime.md/#period_add) - - [PERIOD_DIFF](datetime.md/#period_diff) - - [QUARTER](datetime.md/#quarter) - - [SEC_TO_TIME](datetime.md/#sec_to_time) - - [SECOND](datetime.md/#second) - - [SECOND_OF_MINUTE](datetime.md/#second_of_minute) - - [STRFTIME](datetime.md/#strftime) - - [STR_TO_DATE](datetime.md/#str_to_date) - - [SUBDATE](datetime.md/#subdate) - - [SUBTIME](datetime.md/#subtime) - - [SYSDATE](datetime.md/#sysdate) - - [TIME](datetime.md/#time) - - [TIME_FORMAT](datetime.md/#time_format) - - [TIME_TO_SEC](datetime.md/#time_to_sec) - - [TIMEDIFF](datetime.md/#timediff) - - [TIMESTAMP](datetime.md/#timestamp) - - [TIMESTAMPADD](datetime.md/#timestampadd) - - [TIMESTAMPDIFF](datetime.md/#timestampdiff) - - [TO_DAYS](datetime.md/#to_days) - - [TO_SECONDS](datetime.md/#to_seconds) - - [UNIX_TIMESTAMP](datetime.md/#unix_timestamp) - - [UTC_DATE](datetime.md/#utc_date) - - [UTC_TIME](datetime.md/#utc_time) - - [UTC_TIMESTAMP](datetime.md/#utc_timestamp) - - 
[WEEK](datetime.md/#week) - - [WEEKDAY](datetime.md/#weekday) - - [WEEK_OF_YEAR](datetime.md/#week_of_year) - - [YEAR](datetime.md/#year) - - [YEARWEEK](datetime.md/#yearweek) - -- [Expressions](expressions.md) - - [Arithmetic Operators](expressions.md#arithmetic-operators) - - [Predicate Operators](expressions.md/#predicate-operators) - -- [IP Address Functions](ip.md) - - [CIDRMATCH](ip.md/#cidrmatch) - - [GEOIP](ip.md/#geoip) - -- [JSON Functions](json.md) - - [JSON](json.md/#json) - - [JSON_VALID](json.md/#json_valid) - - [JSON_OBJECT](json.md/#json_object) - - [JSON_ARRAY](json.md/#json_array) - - [JSON_ARRAY_LENGTH](json.md/#json_array_length) - - [JSON_EXTRACT](json.md/#json_extract) - - [JSON_DELETE](json.md/#json_delete) - - [JSON_SET](json.md/#json_set) - - [JSON_APPEND](json.md/#json_append) - - [JSON_EXTEND](json.md/#json_extend) - - [JSON_KEYS](json.md/#json_keys) - -- [Mathematical Functions](math.md) - - [ADD](math.md/#add) - - [SUBTRACT](math.md/#subtract) - - [MULTIPLY](math.md/#multiply) - - [DIVIDE](math.md/#divide) - - [SUM](math.md/#sum) - - [AVG](math.md/#avg) - - [ACOS](math.md/#acos) - - [ASIN](math.md/#asin) - - [ATAN](math.md/#atan) - - [ATAN2](math.md/#atan2) - - [CEIL](math.md/#ceil) - - [CEILING](math.md/#ceiling) - - [CONV](math.md/#conv) - - [COS](math.md/#cos) - - [COSH](math.md/#cosh) - - [COT](math.md/#cot) - - [CRC32](math.md/#crc32) - - [DEGREES](math.md/#degrees) - - [E](math.md/#e) - - [EXP](math.md/#exp) - - [EXPM1](math.md/#expm1) - - [FLOOR](math.md/#floor) - - [LN](math.md/#ln) - - [LOG](math.md/#log) - - [LOG2](math.md/#log2) - - [LOG10](math.md/#log10) - - [MOD](math.md/#mod) - - [MODULUS](math.md/#modulus) - - [PI](math.md/#pi) - - [POW](math.md/#pow) - - [POWER](math.md/#power) - - [RADIANS](math.md/#radians) - - [RAND](math.md/#rand) - - [ROUND](math.md/#round) - - [SIGN](math.md/#sign) - - [SIGNUM](math.md/#signum) - - [SIN](math.md/#sin) - - [SINH](math.md/#sinh) - - [SQRT](math.md/#sqrt) - - [CBRT](math.md/#cbrt) - 
- [RINT](math.md/#rint) - -- [Relevance Functions](relevance.md) - - [MATCH](relevance.md/#match) - - [MATCH_PHRASE](relevance.md/#match_phrase) - - [MATCH_PHRASE_PREFIX](relevance.md/#match_phrase_prefix) - - [MULTI_MATCH](relevance.md/#multi_match) - - [SIMPLE_QUERY_STRING](relevance.md/#simple_query_string) - - [MATCH_BOOL_PREFIX](relevance.md/#match_bool_prefix) - - [QUERY_STRING](relevance.md/#query_string) - -- [Statistical Functions](statistical.md) - - [MAX](statistical.md/#max) - - [MIN](statistical.md/#min) - -- [String Functions](string.md) - - [CONCAT](string.md/#concat) - - [CONCAT_WS](string.md/#concat_ws) - - [LENGTH](string.md/#length) - - [LIKE](string.md/#like) - - [ILIKE](string.md/#ilike) - - [LOCATE](string.md/#locate) - - [LOWER](string.md/#lower) - - [LTRIM](string.md/#ltrim) - - [POSITION](string.md/#position) - - [REPLACE](string.md/#replace) - - [REVERSE](string.md/#reverse) - - [RIGHT](string.md/#right) - - [RTRIM](string.md/#rtrim) - - [SUBSTRING](string.md/#substring) - - [TRIM](string.md/#trim) - - [UPPER](string.md/#upper) - - [REGEXP_REPLACE](string.md/#regexp_replace) - -- [System Functions](system.md) - - [TYPEOF](system.md/#typeof) \ No newline at end of file +- [Aggregation functions](aggregations.md): + - [COUNT](aggregations.md/#count). + - [SUM](aggregations.md/#sum). + - [AVG](aggregations.md/#avg). + - [MAX](aggregations.md/#max). + - [MIN](aggregations.md/#min). + - [VAR_SAMP](aggregations.md/#var_samp). + - [VAR_POP](aggregations.md/#var_pop). + - [STDDEV_SAMP](aggregations.md/#stddev_samp). + - [STDDEV_POP](aggregations.md/#stddev_pop). + - [DISTINCT_COUNT, DC](aggregations.md/#distinct_count-dc). + - [DISTINCT_COUNT_APPROX](aggregations.md/#distinct_count_approx). + - [EARLIEST](aggregations.md/#earliest). + - [LATEST](aggregations.md/#latest). + - [TAKE](aggregations.md/#take). + - [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-percentile_approx). + - [MEDIAN](aggregations.md/#median). 
+ - [FIRST](aggregations.md/#first). + - [LAST](aggregations.md/#last). + - [LIST](aggregations.md/#list). + - [VALUES](aggregations.md/#values). + +- [Collection functions](collection.md): + - [ARRAY](collection.md/#array). + - [ARRAY_LENGTH](collection.md/#array_length). + - [FORALL](collection.md/#forall). + - [EXISTS](collection.md/#exists). + - [FILTER](collection.md/#filter). + - [TRANSFORM](collection.md/#transform). + - [REDUCE](collection.md/#reduce). + - [MVJOIN](collection.md/#mvjoin). + - [MVAPPEND](collection.md/#mvappend). + - [SPLIT](collection.md/#split). + - [MVDEDUP](collection.md/#mvdedup). + - [MVFIND](collection.md/#mvfind). + - [MVINDEX](collection.md/#mvindex). + - [MVMAP](collection.md/#mvmap). + - [MVZIP](collection.md/#mvzip). + +- [Conditional functions](condition.md): + - [ISNULL](condition.md/#isnull). + - [ISNOTNULL](condition.md/#isnotnull). + - [EXISTS](condition.md/#exists). + - [IFNULL](condition.md/#ifnull). + - [NULLIF](condition.md/#nullif). + - [IF](condition.md/#if). + - [CASE](condition.md/#case). + - [COALESCE](condition.md/#coalesce). + - [ISPRESENT](condition.md/#ispresent). + - [ISBLANK](condition.md/#isblank). + - [ISEMPTY](condition.md/#isempty). + - [EARLIEST](condition.md/#earliest). + - [LATEST](condition.md/#latest). + - [REGEXP_MATCH](condition.md/#regexp_match). + - [CONTAINS](condition.md/#contains). + +- [Type conversion functions](conversion.md): + - [CAST](conversion.md/#cast). + - [TOSTRING](conversion.md/#tostring). + - [TONUMBER](conversion.md/#tonumber). + +- [Cryptographic functions](cryptographic.md): + - [MD5](cryptographic.md/#md5). + - [SHA1](cryptographic.md/#sha1). + - [SHA2](cryptographic.md/#sha2). + +- [Date and time functions](datetime.md): + - [ADDDATE](datetime.md/#adddate). + - [ADDTIME](datetime.md/#addtime). + - [CONVERT_TZ](datetime.md/#convert_tz). + - [CURDATE](datetime.md/#curdate). + - [CURRENT_DATE](datetime.md/#current_date). + - [CURRENT_TIME](datetime.md/#current_time). 
+ - [CURRENT_TIMESTAMP](datetime.md/#current_timestamp). + - [CURTIME](datetime.md/#curtime). + - [DATE](datetime.md/#date). + - [DATE_ADD](datetime.md/#date_add). + - [DATE_FORMAT](datetime.md/#date_format). + - [DATETIME](datetime.md/#datetime). + - [DATE_SUB](datetime.md/#date_sub). + - [DATEDIFF](datetime.md/#datediff). + - [DAY](datetime.md/#day). + - [DAYNAME](datetime.md/#dayname). + - [DAYOFMONTH](datetime.md/#dayofmonth). + - [DAY_OF_MONTH](datetime.md/#day_of_month). + - [DAYOFWEEK](datetime.md/#dayofweek). + - [DAY_OF_WEEK](datetime.md/#day_of_week). + - [DAYOFYEAR](datetime.md/#dayofyear). + - [DAY_OF_YEAR](datetime.md/#day_of_year). + - [EXTRACT](datetime.md/#extract). + - [FROM_DAYS](datetime.md/#from_days). + - [FROM_UNIXTIME](datetime.md/#from_unixtime). + - [GET_FORMAT](datetime.md/#get_format). + - [HOUR](datetime.md/#hour). + - [HOUR_OF_DAY](datetime.md/#hour_of_day). + - [LAST_DAY](datetime.md/#last_day). + - [LOCALTIMESTAMP](datetime.md/#localtimestamp). + - [LOCALTIME](datetime.md/#localtime). + - [MAKEDATE](datetime.md/#makedate). + - [MAKETIME](datetime.md/#maketime). + - [MICROSECOND](datetime.md/#microsecond). + - [MINUTE](datetime.md/#minute). + - [MINUTE_OF_HOUR](datetime.md/#minute_of_hour). + - [MONTH](datetime.md/#month). + - [MONTH_OF_YEAR](datetime.md/#month_of_year). + - [MONTHNAME](datetime.md/#monthname). + - [NOW](datetime.md/#now). + - [PERIOD_ADD](datetime.md/#period_add). + - [PERIOD_DIFF](datetime.md/#period_diff). + - [QUARTER](datetime.md/#quarter). + - [SEC_TO_TIME](datetime.md/#sec_to_time). + - [SECOND](datetime.md/#second). + - [SECOND_OF_MINUTE](datetime.md/#second_of_minute). + - [STRFTIME](datetime.md/#strftime). + - [STR_TO_DATE](datetime.md/#str_to_date). + - [SUBDATE](datetime.md/#subdate). + - [SUBTIME](datetime.md/#subtime). + - [SYSDATE](datetime.md/#sysdate). + - [TIME](datetime.md/#time). + - [TIME_FORMAT](datetime.md/#time_format). + - [TIME_TO_SEC](datetime.md/#time_to_sec). 
+ - [TIMEDIFF](datetime.md/#timediff). + - [TIMESTAMP](datetime.md/#timestamp). + - [TIMESTAMPADD](datetime.md/#timestampadd). + - [TIMESTAMPDIFF](datetime.md/#timestampdiff). + - [TO_DAYS](datetime.md/#to_days). + - [TO_SECONDS](datetime.md/#to_seconds). + - [UNIX_TIMESTAMP](datetime.md/#unix_timestamp). + - [UTC_DATE](datetime.md/#utc_date). + - [UTC_TIME](datetime.md/#utc_time). + - [UTC_TIMESTAMP](datetime.md/#utc_timestamp). + - [WEEK](datetime.md/#week). + - [WEEKDAY](datetime.md/#weekday). + - [WEEK_OF_YEAR](datetime.md/#week_of_year). + - [YEAR](datetime.md/#year). + - [YEARWEEK](datetime.md/#yearweek). + +- [Expressions](expressions.md): + - [Arithmetic operators](expressions.md#arithmetic-operators). + - [Predicate operators](expressions.md/#predicate-operators). + +- [IP address functions](ip.md): + - [CIDRMATCH](ip.md/#cidrmatch). + - [GEOIP](ip.md/#geoip). + +- [JSON functions](json.md): + - [JSON](json.md/#json). + - [JSON_VALID](json.md/#json_valid). + - [JSON_OBJECT](json.md/#json_object). + - [JSON_ARRAY](json.md/#json_array). + - [JSON_ARRAY_LENGTH](json.md/#json_array_length). + - [JSON_EXTRACT](json.md/#json_extract). + - [JSON_DELETE](json.md/#json_delete). + - [JSON_SET](json.md/#json_set). + - [JSON_APPEND](json.md/#json_append). + - [JSON_EXTEND](json.md/#json_extend). + - [JSON_KEYS](json.md/#json_keys). + +- [Mathematical functions](math.md): + - [ADD](math.md/#add). + - [SUBTRACT](math.md/#subtract). + - [MULTIPLY](math.md/#multiply). + - [DIVIDE](math.md/#divide). + - [SUM](math.md/#sum). + - [AVG](math.md/#avg). + - [ACOS](math.md/#acos). + - [ASIN](math.md/#asin). + - [ATAN](math.md/#atan). + - [ATAN2](math.md/#atan2). + - [CEIL](math.md/#ceil). + - [CEILING](math.md/#ceiling). + - [CONV](math.md/#conv). + - [COS](math.md/#cos). + - [COSH](math.md/#cosh). + - [COT](math.md/#cot). + - [CRC32](math.md/#crc32). + - [DEGREES](math.md/#degrees). + - [E](math.md/#e). + - [EXP](math.md/#exp). + - [EXPM1](math.md/#expm1). 
+ - [FLOOR](math.md/#floor). + - [LN](math.md/#ln). + - [LOG](math.md/#log). + - [LOG2](math.md/#log2). + - [LOG10](math.md/#log10). + - [MOD](math.md/#mod). + - [MODULUS](math.md/#modulus). + - [PI](math.md/#pi). + - [POW](math.md/#pow). + - [POWER](math.md/#power). + - [RADIANS](math.md/#radians). + - [RAND](math.md/#rand). + - [ROUND](math.md/#round). + - [SIGN](math.md/#sign). + - [SIGNUM](math.md/#signum). + - [SIN](math.md/#sin). + - [SINH](math.md/#sinh). + - [SQRT](math.md/#sqrt). + - [CBRT](math.md/#cbrt). + - [RINT](math.md/#rint). + +- [Relevance functions](relevance.md): + - [MATCH](relevance.md/#match). + - [MATCH_PHRASE](relevance.md/#match_phrase). + - [MATCH_PHRASE_PREFIX](relevance.md/#match_phrase_prefix). + - [MULTI_MATCH](relevance.md/#multi_match). + - [SIMPLE_QUERY_STRING](relevance.md/#simple_query_string). + - [MATCH_BOOL_PREFIX](relevance.md/#match_bool_prefix). + - [QUERY_STRING](relevance.md/#query_string). + +- [Statistical functions](statistical.md): + - [MAX](statistical.md/#max). + - [MIN](statistical.md/#min). + +- [String functions](string.md): + - [CONCAT](string.md/#concat). + - [CONCAT_WS](string.md/#concat_ws). + - [LENGTH](string.md/#length). + - [LIKE](string.md/#like). + - [ILIKE](string.md/#ilike). + - [LOCATE](string.md/#locate). + - [LOWER](string.md/#lower). + - [LTRIM](string.md/#ltrim). + - [POSITION](string.md/#position). + - [REPLACE](string.md/#replace). + - [REVERSE](string.md/#reverse). + - [RIGHT](string.md/#right). + - [RTRIM](string.md/#rtrim). + - [SUBSTRING](string.md/#substring). + - [TRIM](string.md/#trim). + - [UPPER](string.md/#upper). + - [REGEXP_REPLACE](string.md/#regexp_replace). + +- [System functions](system.md): + - [TYPEOF](system.md/#typeof). 
diff --git a/docs/user/ppl/functions/ip.md b/docs/user/ppl/functions/ip.md index c21816baea9..d59ccecc8c8 100644 --- a/docs/user/ppl/functions/ip.md +++ b/docs/user/ppl/functions/ip.md @@ -1,13 +1,19 @@ -# IP Address Functions +# IP address functions -## CIDRMATCH +The following IP address functions are supported in PPL. -### Description +## CIDRMATCH -Usage: `cidrmatch(ip, cidr)` checks if `ip` is within the specified `cidr` range. +**Usage**: `CIDRMATCH(ip, cidr)` -**Argument type:** `STRING`/`IP`, `STRING` -**Return type:** `BOOLEAN` +Checks whether an IP address is within the specified CIDR range. + +**Parameters**: + +- `ip` (Required): The IP address to check, as a string or IP value. Supports both IPv4 and IPv6. +- `cidr` (Required): The CIDR range to check against, as a string. Supports both IPv4 and IPv6 blocks. + +**Return type**: `BOOLEAN` ### Example @@ -17,7 +23,7 @@ source=weblogs | fields host, url ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -29,26 +35,29 @@ fetched rows / total rows = 2/2 +---------+--------------------+ ``` -Note: - - `ip` can be an IPv4 or IPv6 address - - `cidr` can be an IPv4 or IPv6 block - - `ip` and `cidr` must both be valid and non-missing/non-null - -## GEOIP +## GEOIP + +**Usage**: `GEOIP(dataSourceName, ipAddress[, options])` -### Description +Retrieves location information for IP addresses using the OpenSearch Geospatial plugin API. -Usage: `geoip(dataSourceName, ipAddress[, options])` to lookup location information from given IP addresses via OpenSearch GeoSpatial plugin API. +**Parameters**: -**Argument type:** `STRING`, `STRING`/`IP`, `STRING` -**Return type:** `OBJECT` +- `dataSourceName` (Required): The name of an established data source on the OpenSearch Geospatial plugin. For configuration details, see the [IP2Geo processor documentation](https://docs.opensearch.org/latest/ingest-pipelines/processors/ip2geo/). 
+- `ipAddress` (Required): The IP address to look up, as a string or IP value. Supports both IPv4 and IPv6. +- `options` (Optional): A comma-separated string of fields to output. The available fields depend on the data source provider's schema. For example, the `geolite2-city` dataset includes fields like `country_iso_code`, `country_name`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `time_zone`, and `location`. -### Example: + +**Return type**: `OBJECT` + +### Example ```ppl ignore source=weblogs | eval LookupResult = geoip("dataSourceName", "50.68.18.229", "country_iso_code,city_name") ``` + +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -58,8 +67,3 @@ fetched rows / total rows = 1/1 | {'city_name': 'Vancouver', 'country_iso_code': 'CA'} | +-------------------------------------------------------------+ ``` - -Note: - - `dataSourceName` must be an established dataSource on OpenSearch GeoSpatial plugin, detail of configuration can be found: https://opensearch.org/docs/latest/ingest-pipelines/processors/ip2geo/ - - `ip` can be an IPv4 or an IPv6 address - - `options` is an optional String of comma separated fields to output: the selection of fields is subject to dataSourceProvider's schema. For example, the list of fields in the provided `geolite2-city` dataset includes: "country_iso_code", "country_name", "continent_name", "region_iso_code", "region_name", "city_name", "time_zone", "location" diff --git a/docs/user/ppl/functions/json.md b/docs/user/ppl/functions/json.md index e9bd8cf8ac6..d459ed5f981 100644 --- a/docs/user/ppl/functions/json.md +++ b/docs/user/ppl/functions/json.md @@ -1,31 +1,36 @@ -# JSON Functions +# JSON functions -## JSON Path +PPL supports the following JSON functions for creating, parsing, and manipulating JSON data. -### Description +## JSON path All JSON paths used in JSON functions follow the format `<key1>{<index1>}.<key2>{<index2>}...`. -Each `<key>` represents a field name.
The `{<index>}` part is optional and is only applicable when the corresponding key refers to an array. -For example +Each `<key>` represents a field name. The `{<index>}` part is optional and is used only when the corresponding key refers to an array. +For example: ```bash a{2}.b{0} - ``` -This refers to the element at index 0 of the `b` array, which is nested inside the element at index 2 of the `a` array. -Notes: -1. The `{}` notation applies **only when** the associated key points to an array. -2. `{}` (without a specific index) is interpreted as a **wildcard**, equivalent to `{*}`, meaning "all elements" in the array at that level. +This path accesses the element at index `0` in the `b` array, which is located within the element at index `2` of the `a` array. + +**Notes**: +1. The `{}` notation applies only when the associated key points to an array. +2. `{}` (without a specific index) is interpreted as a wildcard, equivalent to `{*}`, meaning `all elements` in the array at that level. ## JSON -### Description +**Usage**: `JSON(value)` + +Validates and parses a JSON string. Returns the parsed JSON value if the string is valid JSON, or `NULL` if invalid. + +**Parameters**: + +- `value` (Required): The string to validate and parse as JSON. + +**Return type**: `STRING` -Usage: `json(value)` Evaluates whether a string can be parsed as a json-encoded string. Returns the value if valid, null otherwise. -**Argument type:** `STRING` -**Return type:** `STRING` -### Example +#### Example ```ppl source=json_test @@ -34,7 +39,7 @@ source=json_test | fields test_name, json_string, json ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -48,17 +53,23 @@ fetched rows / total rows = 4/4 +--------------------+---------------------------------+---------------------------------+ ``` -## JSON_VALID +## JSON_VALID -### Description +**Usage**: `JSON_VALID(value)` + +Evaluates whether a string uses valid JSON syntax.
Returns `TRUE` if valid, `FALSE` if invalid. `NULL` input returns `NULL`. + +**Version**: 3.1.0 +**Limitation**: Only works when `plugins.calcite.enabled=true` + +**Parameters**: + +- `value` (Required): The string to validate as JSON. + +**Return type**: `BOOLEAN` + +#### Example -Version: 3.1.0 -Limitation: Only works when `plugins.calcite.enabled=true` -Usage: `json_valid(value)` Evaluates whether a string uses valid JSON syntax. Returns TRUE if valid, FALSE if invalid. NULL input returns NULL. -**Argument type:** `STRING ` -**Return type:** `BOOLEAN ` -Example - ```ppl source=people | eval is_valid_json = json_valid('[1,2,3,4]'), is_invalid_json = json_valid('{invalid}') @@ -66,7 +77,7 @@ source=people | head 1 ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -77,15 +88,21 @@ fetched rows / total rows = 1/1 +---------------+-----------------+ ``` -## JSON_OBJECT +## JSON_OBJECT -### Description +**Usage**: `JSON_OBJECT(key1, value1, key2, value2, ...)` + +Creates a JSON object string from the specified key-value pairs. All keys must be strings. + +**Parameters**: + +- `key1`, `value1` (Required): The first key-value pair. The key must be a string. +- `key2`, `value2`, `...` (Optional): Additional key-value pairs. + +**Return type**: `STRING` + +#### Example -Usage: `json_object(key1, value1, key2, value2...)` create a json object string with key value pairs. The key must be string. 
-**Argument type:** `key1: STRING, value1: ANY, key2: STRING, value2: ANY ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval test_json = json_object('key', 123.45) @@ -93,7 +110,7 @@ source=json_test | fields test_json ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -104,15 +121,20 @@ fetched rows / total rows = 1/1 +----------------+ ``` -## JSON_ARRAY +## JSON_ARRAY -### Description +**Usage**: `JSON_ARRAY(element1, element2, ...)` + +Creates a JSON array string from the specified elements. + +**Parameters**: + +- `element1`, `element2`, `...` (Optional): The elements to include in the array. Can be any data type. + +**Return type**: `STRING` + +#### Example -Usage: `json_array(element1, element2, ...)` create a json array string with elements. -**Argument type:** `element1: ANY, element2: ANY ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval test_json_array = json_array('key', 123.45) @@ -120,7 +142,7 @@ source=json_test | fields test_json_array ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -131,15 +153,22 @@ fetched rows / total rows = 1/1 +-----------------+ ``` -## JSON_ARRAY_LENGTH +## JSON_ARRAY_LENGTH -### Description +**Usage**: `JSON_ARRAY_LENGTH(value)` + +Returns the number of elements in a JSON array. Returns `NULL` if the input is not a valid JSON array, is `NULL`, or contains invalid JSON. + +**Parameters**: + +- `value` (Required): A string containing a JSON array. + +**Return type**: `INTEGER` + +#### Examples + +The following example returns the length of a valid JSON array: -Usage: `json_array_length(value)` parse the string to json array and return size,, null is returned in case of any other valid JSON string, null or an invalid JSON. 
-**Argument type:** `value: A JSON STRING` -**Return type:** `INTEGER` -### Example - ```ppl source=json_test | eval array_length = json_array_length("[1,2,3]") @@ -147,7 +176,7 @@ source=json_test | fields array_length ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -157,6 +186,8 @@ fetched rows / total rows = 1/1 | 3 | +--------------+ ``` + +The following example returns `NULL` for non-array JSON values: ```ppl source=json_test @@ -165,7 +196,7 @@ source=json_test | fields array_length ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -176,15 +207,30 @@ fetched rows / total rows = 1/1 +--------------+ ``` -## JSON_EXTRACT +## JSON_EXTRACT -### Description +**Usage**: `JSON_EXTRACT(json_string, path1, path2, ...)` + +Extracts values from a JSON string using the specified JSON paths. + +**Behavior**: +- **Single path**: Returns the extracted value directly. +- **Multiple paths**: Returns a JSON array containing the extracted values in path order. +- **Invalid path**: Returns `NULL` for that path in the result. + +For path syntax details, see the [JSON path](#json-path) section. + +**Parameters**: + +- `json_string` (Required): The JSON string to extract values from. +- `path1`, `path2`, `...` (Required): One or more JSON paths specifying which values to extract. + +**Return type**: `STRING` + +#### Examples + +The following example extracts values using a single JSON path: -Usage: `json_extract(json_string, path1, path2, ...)` Extracts values using the specified JSON paths. If only one path is provided, it returns a single value. If multiple paths are provided, it returns a JSON Array in the order of the paths. If one path cannot find value, return null as the result for this path. The path use "{}" to represent index for array, "{}" means "{*}". 
-**Argument type:** `json_string: STRING, path1: STRING, path2: STRING ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval extract = json_extract('{"a": [{"b": 1}, {"b": 2}]}', 'a{}.b') @@ -192,7 +238,7 @@ source=json_test | fields extract ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -202,6 +248,8 @@ fetched rows / total rows = 1/1 | [1,2] | +---------+ ``` + +The following example extracts values using multiple JSON paths: ```ppl source=json_test @@ -210,7 +258,7 @@ source=json_test | fields extract ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -221,15 +269,23 @@ fetched rows / total rows = 1/1 +---------------------------+ ``` -## JSON_DELETE +## JSON_DELETE -### Description +**Usage**: `JSON_DELETE(json_string, path1, path2, ...)` + +Deletes values from a JSON string at the specified JSON paths. Returns the modified JSON string. If a path cannot find a value, no changes are made for that path. + +**Parameters**: + +- `json_string` (Required): The JSON string to delete values from. +- `path1`, `path2`, `...` (Required): One or more JSON paths specifying which values to delete. + +**Return type**: `STRING` + +#### Examples + +The following example deletes a value using a single JSON path: -Usage: `json_delete(json_string, path1, path2, ...)` Delete values using the specified JSON paths. Return the json string after deleting. If one path cannot find value, do nothing. 
-**Argument type:** `json_string: STRING, path1: STRING, path2: STRING ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval delete = json_delete('{"a": [{"b": 1}, {"b": 2}]}', 'a{0}.b') @@ -237,7 +293,7 @@ source=json_test | fields delete ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -247,6 +303,8 @@ fetched rows / total rows = 1/1 | {"a":[{},{"b":2}]} | +--------------------+ ``` + +The following example deletes values using multiple JSON paths: ```ppl source=json_test @@ -255,7 +313,7 @@ source=json_test | fields delete ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -265,6 +323,8 @@ fetched rows / total rows = 1/1 | {"a":[{},{}]} | +---------------+ ``` + +The following example shows no changes occur when trying to delete a non-existent path: ```ppl source=json_test @@ -273,7 +333,7 @@ source=json_test | fields delete ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -284,15 +344,24 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## JSON_SET +## JSON_SET -### Description +**Usage**: `JSON_SET(json_string, path1, value1, path2, value2, ...)` + +Sets values in a JSON string at the specified JSON paths. Returns the modified JSON string. If a path's parent node is not a JSON object, that path is skipped. + +**Parameters**: + +- `json_string` (Required): The JSON string to modify. +- `path1`, `value1` (Required): The first path-value pair to set. +- `path2`, `value2`, `...` (Optional): Additional path-value pairs. + +**Return type**: `STRING` + +#### Examples + +The following example sets a single value at a JSON path: -Usage: `json_set(json_string, path1, value1, path2, value2...)` Set values to corresponding paths using the specified JSON paths. If one path's parent node is not a json object, skip the path. Return the json string after setting. 
-**Argument type:** `json_string: STRING, path1: STRING, value1: ANY, path2: STRING, value2: ANY ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval jsonSet = json_set('{"a": [{"b": 1}]}', 'a{0}.b', 3) @@ -300,7 +369,7 @@ source=json_test | fields jsonSet ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -310,6 +379,8 @@ fetched rows / total rows = 1/1 | {"a":[{"b":3}]} | +-----------------+ ``` + + ```ppl source=json_test @@ -318,7 +389,7 @@ source=json_test | fields jsonSet ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -329,32 +400,43 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## JSON_APPEND +## JSON_APPEND -### Description +**Usage**: `JSON_APPEND(json_string, path1, value1, path2, value2, ...)` + +Appends values to arrays in a JSON string at the specified JSON paths. Returns the modified JSON string. If a path's target node is not an array, that path is skipped. + +**Parameters**: + +- `json_string` (Required): The JSON string to modify. +- `path1`, `value1` (Required): The first path-value pair to append. +- `path2`, `value2`, `...` (Optional): Additional path-value pairs. + +**Return type**: `STRING` + +#### Examples + +The following example appends a value to an array: -Usage: `json_append(json_string, path1, value1, path2, value2...)` Append values to corresponding paths using the specified JSON paths. If one path's target node is not an array, skip the path. Return the json string after setting. 
-**Argument type:** `json_string: STRING, path1: STRING, value1: ANY, path2: STRING, value2: ANY ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test -| eval jsonAppend = json_set('{"a": [{"b": 1}]}', 'a', 3) +| eval jsonAppend = json_append('{"a": [{"b": 1}]}', 'a', 3) | head 1 | fields jsonAppend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 -+------------+ -| jsonAppend | -|------------| -| {"a":3} | -+------------+ ++-------------------+ +| jsonAppend | +|-------------------| +| {"a":[{"b":1},3]} | ++-------------------+ ``` + +The following example shows paths to non-array targets are skipped: ```ppl source=json_test @@ -363,7 +445,7 @@ source=json_test | fields jsonAppend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -373,6 +455,8 @@ fetched rows / total rows = 1/1 | {"a":[{"b":1},{"b":2}]} | +-------------------------+ ``` + +The following example appends values using mixed path types: ```ppl source=json_test @@ -381,7 +465,7 @@ source=json_test | fields jsonAppend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -392,15 +476,28 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## JSON_EXTEND +## JSON_EXTEND -### Description +**Usage**: `JSON_EXTEND(json_string, path1, value1, path2, value2, ...)` + +Extends arrays in a JSON string at the specified JSON paths with new values. Returns the modified JSON string. If a path's target node is not an array, that path is skipped. + +The function attempts to parse each value as an array: +- If parsing succeeds: The parsed array elements are added to the target array. +- If parsing fails: The value is treated as a single element and added to the target array. + +**Parameters**: + +- `json_string` (Required): The JSON string to modify. +- `path1`, `value1` (Required): The first path-value pair to extend. 
+- `path2`, `value2`, `...` (Optional): Additional path-value pairs. + +**Return type**: `STRING` + +#### Examples + +The following example extends an array with a single value: -Usage: `json_extend(json_string, path1, value1, path2, value2...)` Extend values to corresponding paths using the specified JSON paths. If one path's target node is not an array, skip the path. The function will try to parse the value as an array. If it can be parsed, extend it to the target array. Otherwise, regard the value a single one. Return the json string after setting. -**Argument type:** `json_string: STRING, path1: STRING, value1: ANY, path2: STRING, value2: ANY ...` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval jsonExtend = json_extend('{"a": [{"b": 1}]}', 'a', 3) @@ -408,7 +505,7 @@ source=json_test | fields jsonExtend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -418,6 +515,8 @@ fetched rows / total rows = 1/1 | {"a":[{"b":1},3]} | +-------------------+ ``` + +The following example shows paths to non-array targets are skipped: ```ppl source=json_test @@ -426,7 +525,7 @@ source=json_test | fields jsonExtend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -436,6 +535,8 @@ fetched rows / total rows = 1/1 | {"a":[{"b":1},{"b":2}]} | +-------------------------+ ``` + +The following example extends an array by parsing the value as an array: ```ppl source=json_test @@ -444,7 +545,7 @@ source=json_test | fields jsonExtend ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -455,15 +556,22 @@ fetched rows / total rows = 1/1 +-------------------------+ ``` -## JSON_KEYS +## JSON_KEYS -### Description +**Usage**: `JSON_KEYS(json_string)` + +Returns the keys of a JSON object as a JSON array. Returns `NULL` if the input is not a valid JSON object. 
+ +**Parameters**: + +- `json_string` (Required): A string containing a JSON object. + +**Return type**: `STRING` + +#### Examples + +The following example gets keys from a simple JSON object: -Usage: `json_keys(json_string)` Return the key list of the Json object as a Json array. Otherwise, return null. -**Argument type:** `json_string: A JSON STRING` -**Return type:** `STRING` -### Example - ```ppl source=json_test | eval jsonKeys = json_keys('{"a": 1, "b": 2}') @@ -471,7 +579,7 @@ source=json_test | fields jsonKeys ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -481,6 +589,8 @@ fetched rows / total rows = 1/1 | ["a","b"] | +-----------+ ``` + +The following example gets keys from a nested JSON object: ```ppl source=json_test @@ -489,7 +599,7 @@ source=json_test | fields jsonKeys ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git a/docs/user/ppl/functions/math.md b/docs/user/ppl/functions/math.md index 834e3523fdf..19bf60889f7 100644 --- a/docs/user/ppl/functions/math.md +++ b/docs/user/ppl/functions/math.md @@ -1,12 +1,19 @@ -# Mathematical Functions +# Mathematical functions -## ABS +The following mathematical functions are supported in PPL. -### Description +## ABS + +**Usage**: `ABS(x)` + +Calculates the absolute value of `x`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` (same type as input) -Usage: `abs(x)` calculates the abs x. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `INTEGER/LONG/FLOAT/DOUBLE` ### Example ```ppl @@ -15,7 +22,7 @@ source=people | fields `ABS(-1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -26,14 +33,21 @@ fetched rows / total rows = 1/1 +---------+ ``` -## ADD +## ADD + +**Usage**: `ADD(x, y)` -### Description +Calculates the sum of `x` and `y`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider numeric type between `x` and `y` + +**Synonyms**: Addition Symbol (`+`) -Usage: `add(x, y)` calculates x plus y. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider number between x and y` -Synonyms: Addition Symbol (+) ### Example ```ppl @@ -42,7 +56,7 @@ source=people | fields `ADD(2, 1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -53,14 +67,21 @@ fetched rows / total rows = 1/1 +-----------+ ``` -## SUBTRACT +## SUBTRACT + +**Usage**: `SUBTRACT(x, y)` + +Calculates `x` minus `y`. -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider numeric type between `x` and `y` + +**Synonyms**: Subtraction Symbol (`-`) -Usage: `subtract(x, y)` calculates x minus y. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider number between x and y` -Synonyms: Subtraction Symbol (-) ### Example ```ppl @@ -69,7 +90,7 @@ source=people | fields `SUBTRACT(2, 1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -80,14 +101,21 @@ fetched rows / total rows = 1/1 +----------------+ ``` -## MULTIPLY +## MULTIPLY + +**Usage**: `MULTIPLY(x, y)` + +Calculates the product of `x` and `y`. + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider numeric type between `x` and `y` + +**Synonyms**: Multiplication Symbol (`*`) -Usage: `multiply(x, y)` calculates the multiplication of x and y. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider number between x and y. If y equals to 0, then returns NULL.` -Synonyms: Multiplication Symbol (\*) ### Example ```ppl @@ -96,7 +124,7 @@ source=people | fields `MULTIPLY(2, 1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -107,14 +135,21 @@ fetched rows / total rows = 1/1 +----------------+ ``` -## DIVIDE +## DIVIDE + +**Usage**: `DIVIDE(x, y)` + +Calculates `x` divided by `y`. + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider numeric type between `x` and `y` + +**Synonyms**: Division Symbol (`/`) -Usage: `divide(x, y)` calculates x divided by y. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider number between x and y` -Synonyms: Division Symbol (/) ### Example ```ppl @@ -123,7 +158,7 @@ source=people | fields `DIVIDE(2, 1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -134,14 +169,21 @@ fetched rows / total rows = 1/1 +--------------+ ``` -## SUM +## SUM + +**Usage**: `SUM(x, y, ...)` + +Calculates the sum of all provided arguments. This function accepts a variable number of arguments. + +This function is only available in the `eval` command context and is rewritten to arithmetic addition during query parsing. +{: .note} + +**Parameters**: -### Description +- `x, y, ...` (Required): Variable number of `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` arguments. + +**Return type**: The widest numeric type among all arguments -Usage: `sum(x, y, ...)` calculates the sum of all provided arguments. This function accepts a variable number of arguments. -Note: This function is only available in the eval command context and is rewritten to arithmetic addition while query parsing. -**Argument type:** `Variable number of INTEGER/LONG/FLOAT/DOUBLE arguments` -**Return type:** `Wider number type among all arguments` ### Example ```ppl @@ -150,7 +192,7 @@ source=accounts | fields `SUM(1, 2, 3)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -170,7 +212,7 @@ source=accounts | fields age, total ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -184,14 +226,21 @@ fetched rows / total rows = 4/4 +-----+-------+ ``` -## AVG +## AVG + +**Usage**: `AVG(x, y, ...)` + +Calculates the average (arithmetic mean) of all provided arguments. This function accepts a variable number of arguments. 
+ +This function is only available in the `eval` command context and is rewritten to an arithmetic expression (sum / count) during query parsing. +{: .note} + +**Parameters**: + +- `x, y, ...` (Required): Variable number of `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` arguments. -### Description +**Return type**: `DOUBLE` -Usage: `avg(x, y, ...)` calculates the average (arithmetic mean) of all provided arguments. This function accepts a variable number of arguments. -Note: This function is only available in the eval command context and is rewritten to arithmetic expression (sum / count) at query parsing time. -**Argument type:** `Variable number of INTEGER/LONG/FLOAT/DOUBLE arguments` -**Return type:** `DOUBLE` ### Example ```ppl @@ -200,7 +249,7 @@ source=accounts | fields `AVG(1, 2, 3)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -220,7 +269,7 @@ source=accounts | fields age, average ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 4/4 @@ -234,13 +283,18 @@ fetched rows / total rows = 4/4 +-----+---------+ ``` -## ACOS +## ACOS -### Description +**Usage**: `ACOS(x)` + +Calculates the arccosine of `x`. Returns `NULL` if `x` is not in the `[-1, 1]` range. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `acos(x)` calculates the arc cosine of x. Returns NULL if x is not in the range -1 to 1. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -249,7 +303,7 @@ source=people | fields `ACOS(0)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -260,13 +314,18 @@ fetched rows / total rows = 1/1 +--------------------+ ``` -## ASIN +## ASIN + +**Usage**: `ASIN(x)` + +Calculates the arcsine of `x`. Returns `NULL` if `x` is not in the `[-1, 1]` range. 
-### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `asin(x)` calculate the arc sine of x. Returns NULL if x is not in the range -1 to 1. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -275,7 +334,7 @@ source=people | fields `ASIN(0)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -286,13 +345,19 @@ fetched rows / total rows = 1/1 +---------+ ``` -## ATAN +## ATAN + +**Usage**: `ATAN(x)`, `ATAN(y, x)` + +Calculates the arctangent of `x`. `ATAN(y, x)` calculates the arctangent of the quotient y / x, using the signs of both arguments to determine the quadrant of the result. + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Optional): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value (when using two-argument form). + +**Return type**: `DOUBLE` -Usage: `atan(x)` calculates the arc tangent of x. atan(y, x) calculates the arc tangent of y / x, except that the signs of both arguments are used to determine the quadrant of the result. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -301,7 +366,7 @@ source=people | fields `ATAN(2)`, `ATAN(2, 3)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -312,13 +377,19 @@ fetched rows / total rows = 1/1 +--------------------+--------------------+ ``` -## ATAN2 +## ATAN2 + +**Usage**: `ATAN2(y, x)` + +Calculates the arctangent of the quotient y / x, using the signs of both arguments to determine the quadrant of the result. -### Description +**Parameters**: + +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. 
+ +**Return type**: `DOUBLE` -Usage: atan2(y, x) calculates the arc tangent of y / x, except that the signs of both arguments are used to determine the quadrant of the result. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -327,7 +398,7 @@ source=people | fields `ATAN2(2, 3)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -340,16 +411,35 @@ fetched rows / total rows = 1/1 ## CEIL +**Usage**: `CEIL(x)` + +Returns the ceiling of the value `x`. + An alias for [CEILING](#ceiling) function. -## CEILING -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: Same type as input + +## CEILING + +**Usage**: `CEILING(x)` + +Returns the ceiling of the value `x`. + +The [`CEIL`](#ceil) and `CEILING` functions have the same implementation and functionality. +{: .note} + +Limitation: `CEILING` only works as expected when the IEEE 754 double type displays a decimal when stored. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: Same type as input -Usage: `CEILING(T)` takes the ceiling of value T. -Note: [CEIL](#ceil) and CEILING functions have the same implementation & functionality -Limitation: CEILING only works as expected when IEEE 754 double type displays decimal when stored. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `same type with input` ### Example ```ppl @@ -358,7 +448,7 @@ source=people | fields `CEILING(0)`, `CEILING(50.00005)`, `CEILING(-50.00005)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -375,7 +465,7 @@ source=people | fields `CEILING(3147483647.12345)`, `CEILING(113147483647.12345)`, `CEILING(3147483647.00001)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -386,13 +476,20 @@ fetched rows / total rows = 1/1 +---------------------------+-----------------------------+---------------------------+ ``` -## CONV +## CONV + +**Usage**: `CONV(x, a, b)` -### Description +Converts the number `x` from base `a` to base `b`. + +**Parameters**: + +- `x` (Required): A `STRING` value. +- `a` (Required): An `INTEGER` value. +- `b` (Required): An `INTEGER` value. + +**Return type**: `STRING` -Usage: `CONV(x, a, b)` converts the number x from a base to b base. -**Argument type:** `x: STRING, a: INTEGER, b: INTEGER` -**Return type:** `STRING` ### Example ```ppl @@ -401,7 +498,7 @@ source=people | fields `CONV('12', 10, 16)`, `CONV('2C', 16, 10)`, `CONV(12, 10, 2)`, `CONV(1111, 2, 10)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -412,13 +509,18 @@ fetched rows / total rows = 1/1 +--------------------+--------------------+-----------------+-------------------+ ``` -## COS +## COS + +**Usage**: `COS(x)` + +Calculates the cosine of `x`, where `x` is given in radians. -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `cos(x)` calculates the cosine of x, where x is given in radians. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -427,7 +529,7 @@ source=people | fields `COS(0)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -438,13 +540,18 @@ fetched rows / total rows = 1/1 +--------+ ``` -## COSH +## COSH + +**Usage**: `COSH(x)` + +Calculates the hyperbolic cosine of `x`, defined as (((e^x) + (e^(-x))) / 2). + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. -### Description +**Return type**: `DOUBLE` -Usage: `cosh(x)` calculates the hyperbolic cosine of x, defined as (((e^x) + (e^(-x))) / 2). -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -453,7 +560,7 @@ source=people | fields `COSH(2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -464,13 +571,18 @@ fetched rows / total rows = 1/1 +--------------------+ ``` -## COT +## COT -### Description +**Usage**: `COT(x)` + +Calculates the cotangent of `x`. Returns an error if `x` equals 0. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `cot(x)` calculates the cotangent of x. Returns out-of-range error if x equals to 0. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -479,7 +591,7 @@ source=people | fields `COT(1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -490,13 +602,18 @@ fetched rows / total rows = 1/1 +--------------------+ ``` -## CRC32 +## CRC32 + +**Usage**: `CRC32(expr)` -### Description +Calculates a cyclic redundancy check value and returns a 32-bit unsigned value. + +**Parameters**: + +- `expr` (Required): A `STRING` value. + +**Return type**: `LONG` -Usage: Calculates a cyclic redundancy check value and returns a 32-bit unsigned value. 
-**Argument type:** `STRING` -**Return type:** `LONG` ### Example ```ppl @@ -505,7 +622,7 @@ source=people | fields `CRC32('MySQL')` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -516,13 +633,18 @@ fetched rows / total rows = 1/1 +----------------+ ``` -## DEGREES +## DEGREES + +**Usage**: `DEGREES(x)` + +Converts `x` from radians to degrees. + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `degrees(x)` converts x from radians to degrees. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -531,7 +653,7 @@ source=people | fields `DEGREES(1.57)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -542,12 +664,16 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## E +## E + +**Usage**: `E()` + +Returns Euler's number (e ≈ 2.718281828459045). -### Description +**Parameters**: None + +**Return type**: `DOUBLE` -Usage: `E()` returns the Euler's number -**Return type:** `DOUBLE` ### Example ```ppl @@ -556,7 +682,7 @@ source=people | fields `E()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -567,13 +693,18 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## EXP +## EXP + +**Usage**: `EXP(x)` -### Description +Returns e raised to the power of `x`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `exp(x)` return e raised to the power of x. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -582,7 +713,7 @@ source=people | fields `EXP(2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -593,13 +724,18 @@ fetched rows / total rows = 1/1 +------------------+ ``` -## EXPM1 +## EXPM1 + +**Usage**: `EXPM1(x)` + +Returns e^x - 1 (exponential of `x` minus 1). + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: expm1(NUMBER T) returns the exponential of T, minus 1. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -608,7 +744,7 @@ source=people | fields `EXPM1(1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -619,14 +755,20 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## FLOOR +## FLOOR + +**Usage**: `FLOOR(x)` + +Returns the floor of the value `x`. + +Limitation: `FLOOR` only works as expected when the IEEE 754 double type displays a decimal when stored. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. -### Description +**Return type**: Same type as input -Usage: `FLOOR(T)` takes the floor of value T. -Limitation: FLOOR only works as expected when IEEE 754 double type displays decimal when stored. 
-**Argument type:** `a: INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `same type with input` ### Example ```ppl @@ -635,7 +777,7 @@ source=people | fields `FLOOR(0)`, `FLOOR(50.00005)`, `FLOOR(-50.00005)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -652,7 +794,7 @@ source=people | fields `FLOOR(3147483647.12345)`, `FLOOR(113147483647.12345)`, `FLOOR(3147483647.00001)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -669,7 +811,7 @@ source=people | fields `FLOOR(282474973688888.022)`, `FLOOR(9223372036854775807.022)`, `FLOOR(9223372036854775807.0000001)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -680,13 +822,18 @@ fetched rows / total rows = 1/1 +----------------------------+--------------------------------+------------------------------------+ ``` -## LN +## LN -### Description +**Usage**: `LN(x)` + +Returns the natural logarithm of `x`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `ln(x)` return the the natural logarithm of x. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -695,7 +842,7 @@ source=people | fields `LN(2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -706,14 +853,19 @@ fetched rows / total rows = 1/1 +--------------------+ ``` -## LOG +## LOG + +**Usage**: `LOG(x)`, `LOG(B, x)` + +Returns the natural logarithm of `x` (base e logarithm). `LOG(B, x)` is equivalent to log(x)/log(B). -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `B` (Optional): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value (when using two-argument form). 
+ +**Return type**: `DOUBLE` -Specifications: -Usage: `log(x)` returns the natural logarithm of x that is the base e logarithm of the x. log(B, x) is equivalent to log(x)/log(B). -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -722,7 +874,7 @@ source=people | fields `LOG(2)`, `LOG(2, 8)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -733,14 +885,18 @@ fetched rows / total rows = 1/1 +--------------------+-----------+ ``` -## LOG2 +## LOG2 + +**Usage**: `LOG2(x)` + +Returns the base-2 logarithm of `x`. Equivalent to log(x)/log(2). + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Specifications: -Usage: log2(x) is equivalent to log(x)/log(2). -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -749,7 +905,7 @@ source=people | fields `LOG2(8)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -760,14 +916,18 @@ fetched rows / total rows = 1/1 +---------+ ``` -## LOG10 +## LOG10 + +**Usage**: `LOG10(x)` + +Returns the base-10 logarithm of `x`. Equivalent to log(x)/log(10). -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Specifications: -Usage: log10(x) is equivalent to log(x)/log(10). -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -776,7 +936,7 @@ source=people | fields `LOG10(100)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -787,13 +947,19 @@ fetched rows / total rows = 1/1 +------------+ ``` -## MOD +## MOD + +**Usage**: `MOD(n, m)` -### Description +Calculates the remainder of the number `n` divided by `m`. 
+ +**Parameters**: + +- `n` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `m` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider type between `n` and `m` if `m` is nonzero. If `m` equals `0`, returns `NULL`. -Usage: `MOD(n, m)` calculates the remainder of the number n divided by m. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider type between types of n and m if m is nonzero value. If m equals to 0, then returns NULL.` ### Example ```ppl @@ -802,7 +968,7 @@ source=people | fields `MOD(3, 2)`, `MOD(3.1, 2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -813,13 +979,19 @@ fetched rows / total rows = 1/1 +-----------+-------------+ ``` -## MODULUS +## MODULUS + +**Usage**: `MODULUS(n, m)` + +Calculates the remainder of the number `n` divided by `m`. -### Description +**Parameters**: + +- `n` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `m` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: The wider type between `n` and `m` if `m` is nonzero. If `m` equals `0`, returns `NULL`. -Usage: `MODULUS(n, m)` calculates the remainder of the number n divided by m. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `Wider type between types of n and m if m is nonzero value. If m equals to 0, then returns NULL.` ### Example ```ppl @@ -828,7 +1000,7 @@ source=people | fields `MODULUS(3, 2)`, `MODULUS(3.1, 2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -839,12 +1011,16 @@ fetched rows / total rows = 1/1 +---------------+-----------------+ ``` -## PI +## PI + +**Usage**: `PI()` -### Description +Returns the mathematical constant π (pi ≈ 3.141592653589793). 
+ +**Parameters**: None + +**Return type**: `DOUBLE` -Usage: `PI()` returns the constant pi -**Return type:** `DOUBLE` ### Example ```ppl @@ -853,7 +1029,7 @@ source=people | fields `PI()` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -864,14 +1040,21 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## POW +## POW + +**Usage**: `POW(x, y)` -### Description +Calculates the value of `x` raised to the power of `y`. Invalid inputs return `NULL`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` + +**Synonyms**: [POWER](#power) -Usage: `POW(x, y)` calculates the value of x raised to the power of y. Bad inputs return NULL result. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` -Synonyms: [POWER](#power) ### Example ```ppl @@ -880,7 +1063,7 @@ source=people | fields `POW(3, 2)`, `POW(-3, 2)`, `POW(3, -2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -891,14 +1074,21 @@ fetched rows / total rows = 1/1 +-----------+------------+--------------------+ ``` -## POWER +## POWER + +**Usage**: `POWER(x, y)` -### Description +Calculates the value of `x` raised to the power of `y`. Invalid inputs return `NULL`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` + +**Synonyms**: [POW](#pow) -Usage: `POWER(x, y)` calculates the value of x raised to the power of y. Bad inputs return NULL result. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE, INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` -Synonyms: [POW](#pow) ### Example ```ppl @@ -907,7 +1097,7 @@ source=people | fields `POWER(3, 2)`, `POWER(-3, 2)`, `POWER(3, -2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -918,13 +1108,18 @@ fetched rows / total rows = 1/1 +-------------+--------------+--------------------+ ``` -## RADIANS +## RADIANS + +**Usage**: `RADIANS(x)` + +Converts `x` from degrees to radians.
-**Argument type:** `INTEGER` -**Return type:** `FLOAT` ### Example ```ppl @@ -959,7 +1159,7 @@ source=people | fields `RAND(3)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -970,15 +1170,21 @@ fetched rows / total rows = 1/1 +---------------------+ ``` -## ROUND +## ROUND -### Description +**Usage**: `ROUND(x, d)` + +Rounds the argument `x` to `d` decimal places. `d` defaults to `0`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `d` (Optional): An `INTEGER` value. + +**Return type**: +- `(INTEGER/LONG [,INTEGER])` -> `LONG`. +- `(FLOAT/DOUBLE [,INTEGER])` -> `LONG`. -Usage: `ROUND(x, d)` rounds the argument x to d decimal places, d defaults to 0 if not specified -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -Return type map: -(INTEGER/LONG [,INTEGER]) -> LONG -(FLOAT/DOUBLE [,INTEGER]) -> LONG ### Example ```ppl @@ -987,7 +1193,7 @@ source=people | fields `ROUND(12.34)`, `ROUND(12.34, 1)`, `ROUND(12.34, -1)`, `ROUND(12, 1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -998,13 +1204,18 @@ fetched rows / total rows = 1/1 +--------------+-----------------+------------------+--------------+ ``` -## SIGN +## SIGN + +**Usage**: `SIGN(x)` + +Returns the sign of the argument as `-1`, `0`, or `1`, depending on whether the number is negative, zero, or positive. -### Description +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. 
+ +**Return type**: Same type as input -Usage: Returns the sign of the argument as -1, 0, or 1, depending on whether the number is negative, zero, or positive -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `same type with input` ### Example ```ppl @@ -1013,7 +1224,7 @@ source=people | fields `SIGN(1)`, `SIGN(0)`, `SIGN(-1.1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1024,14 +1235,20 @@ fetched rows / total rows = 1/1 +---------+---------+------------+ ``` -## SIGNUM +## SIGNUM + +**Usage**: `SIGNUM(x)` + +Returns the sign of the argument as `-1`, `0`, or `1`, depending on whether the number is negative, zero, or positive. + +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `INTEGER` + +**Synonyms**: `SIGN` -Usage: Returns the sign of the argument as -1, 0, or 1, depending on whether the number is negative, zero, or positive -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `INTEGER` -Synonyms: `SIGN` ### Example ```ppl @@ -1040,7 +1257,7 @@ source=people | fields `SIGNUM(1)`, `SIGNUM(0)`, `SIGNUM(-1.1)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1051,13 +1268,18 @@ fetched rows / total rows = 1/1 +-----------+-----------+--------------+ ``` -## SIN +## SIN + +**Usage**: `SIN(x)` + +Calculates the sine of `x`, where `x` is given in radians. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. -### Description +**Return type**: `DOUBLE` -Usage: `sin(x)` calculates the sine of x, where x is given in radians. 
-**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -1066,7 +1288,7 @@ source=people | fields `SIN(0)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1077,13 +1299,18 @@ fetched rows / total rows = 1/1 +--------+ ``` -## SINH +## SINH -### Description +**Usage**: `SINH(x)` + +Calculates the hyperbolic sine of `x`, defined as (((e^x) - (e^(-x))) / 2). + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: `sinh(x)` calculates the hyperbolic sine of x, defined as (((e^x) - (e^(-x))) / 2). -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -1092,7 +1319,7 @@ source=people | fields `SINH(2)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1103,15 +1330,20 @@ fetched rows / total rows = 1/1 +-------------------+ ``` -## SQRT +## SQRT + +**Usage**: `SQRT(x)` -### Description +Calculates the square root of a non-negative number `x`. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: +- `(Non-negative) INTEGER/LONG/FLOAT/DOUBLE` -> `DOUBLE`. +- `(Negative) INTEGER/LONG/FLOAT/DOUBLE` -> `NULL`. -Usage: Calculates the square root of a non-negative number -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -Return type map: -(Non-negative) INTEGER/LONG/FLOAT/DOUBLE -> DOUBLE -(Negative) INTEGER/LONG/FLOAT/DOUBLE -> NULL ### Example ```ppl @@ -1120,7 +1352,7 @@ source=people | fields `SQRT(4)`, `SQRT(4.41)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 @@ -1131,14 +1363,18 @@ fetched rows / total rows = 1/1 +---------+------------+ ``` -## CBRT +## CBRT + +**Usage**: `CBRT(x)` + +Calculates the cube root of a number `x`. 
+ +**Parameters**: -### Description +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +**Return type**: `DOUBLE` -Usage: Calculates the cube root of a number -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -Return type DOUBLE: -INTEGER/LONG/FLOAT/DOUBLE -> DOUBLE ### Example ```ppl ignore @@ -1147,7 +1383,7 @@ source=location | fields `CBRT(8)`, `CBRT(9.261)`, `CBRT(-27)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 2/2 @@ -1159,13 +1395,18 @@ fetched rows / total rows = 2/2 +---------+-------------+-----------+ ``` -## RINT +## RINT + +**Usage**: `RINT(x)` + +Returns `x` rounded to the nearest integer. + +**Parameters**: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. -### Description +**Return type**: `DOUBLE` -Usage: `rint(NUMBER T)` returns T rounded to the closest whole integer number. -**Argument type:** `INTEGER/LONG/FLOAT/DOUBLE` -**Return type:** `DOUBLE` ### Example ```ppl @@ -1174,7 +1415,7 @@ source=people | fields `RINT(1.7)` ``` -Expected output: +The query returns the following results: ```text fetched rows / total rows = 1/1 diff --git a/docs/user/ppl/functions/relevance.md b/docs/user/ppl/functions/relevance.md index a40a3cd7644..a0bcfe59cd2 100644 --- a/docs/user/ppl/functions/relevance.md +++ b/docs/user/ppl/functions/relevance.md @@ -1,26 +1,41 @@ -# Relevance Functions - -The relevance based functions enable users to search the index for documents by the relevance of the input query. The functions are built on the top of the search queries of the OpenSearch engine, but in memory execution within the plugin is not supported. These functions are able to perform the global filter of a query, for example the condition expression in a `WHERE` clause or in a `HAVING` clause. 
For more details of the relevance based search, check out the design here: [Relevance Based Search With SQL/PPL Query Engine](https://github.com/opensearch-project/sql/issues/182) -## MATCH - -### Description - -`match(field_expression, query_expression[, option=]*)` -The match function maps to the match query used in search engine, to return the documents that match a provided text, number, date or boolean value with a given field. Available parameters include: -- analyzer -- auto_generate_synonyms_phrase -- fuzziness -- max_expansions -- prefix_length -- fuzzy_transpositions -- fuzzy_rewrite -- lenient -- operator -- minimum_should_match -- zero_terms_query -- boost - -Example with only `field` and `query` expressions, and all other parameters are set default values +# Relevance functions + +Relevance-based functions enable users to search an index for documents based on query relevance. These functions are built on top of OpenSearch engine search queries, but in-memory execution within the plugin is not supported. + +You can use these functions for global query filtering, such as in condition expressions within `WHERE` or `HAVING` clauses. For more details about relevance-based search, see [Relevance Based Search With SQL/PPL Query Engine](https://github.com/opensearch-project/sql/issues/182). + +## MATCH + +**Usage**: `MATCH(, [,