docs: add PPL language reference with data-grounded examples by anirudha · Pull Request #143 · opensearch-project/observability-stack

anirudha · 2026-03-28T04:24:33Z

Summary

Add a comprehensive PPL (Piped Processing Language) documentation section to the Observability Stack docs, aimed at Splunk SREs who are evaluating PPL as a query language for OpenSearch observability.

27 detailed per-command reference pages with consistent structure
All examples use real OTel data from logs-otel-v1* and otel-v1-apm-span-* indices - no fabricated data
Every example verified against local OpenSearch PPL API and includes a playground link
PPL overview page positioning it as the native query language (with KQL/EQL comparison)
Function reference covering 200+ built-in functions across 13 categories
Masterclass pipeline examples showcasing PPL's full power for SRE workflows

New pages

| Page | Description |
| --- | --- |
| `ppl/index.md` | PPL overview - why PPL, comparison table, getting started |
| `ppl/commands.md` | Command reference summary (50+ commands) |
| `ppl/commands/*.md` | 27 individual command pages with full detail |
| `ppl/functions.md` | Function reference (aggregation, string, datetime, math, etc.) |
| `ppl/examples.md` | Real-world OTel queries with playground links |

Per-command pages

Search & Filter: search, where
Fields & Transformation: fields, eval, rename, fillnull, expand, flatten
Aggregation & Statistics: stats, eventstats, streamstats, timechart, trendline
Sorting & Limiting: sort, head, dedup, top, rare
Text Extraction: parse, grok, rex, patterns, spath
Data Combination: join, lookup
Machine Learning: ml
Metadata: describe

Each command page follows a consistent structure:

Description - what it does, when to use it
Syntax - full syntax block
Arguments - required/optional table with defaults
Usage notes - behavioral notes, gotchas, performance tips
Basic examples (3-5) - with playground links
Extended examples (1-2) - OTel observability scenarios
See also - cross-references to related commands

Examples page highlights

SRE incident response: error rate over time, first error per service, P95 latency timeseries
Trace analysis: slowest traces, error spans, latency percentiles, trace fan-out
AI agent observability: token usage, cost analysis, tool execution, agent invocation latency
Advanced analytics: eventstats outlier detection, streamstats rolling windows, trendline smoothing
Masterclass pipelines: service health scorecard, GenAI cost/perf analysis, Envoy access log parsing, ML-based error pattern discovery, cross-signal log-trace correlation

Data grounding

All text extraction examples (grok, rex, parse, spath) were tested against actual log bodies in the cluster:

Envoy access logs from frontend-proxy: [timestamp] "METHOD /path HTTP/1.1" status ...
Kafka broker logs: [ComponentName id=N] message ...
Load generator logs: User action product: ID

Key PPL behavioral findings documented:

parse requires a full-string match (the pattern is implicitly anchored), whereas rex matches any substring
Java regex named capture groups cannot contain underscores; names must be alphanumeric, so use camelCase
Grok patterns with multiple unnamed %{DATA} cause "Duplicate key" errors
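
The first two findings can be illustrated outside PPL. The sketch below uses Python's `re` module purely as a stand-in for the Java regex engine: `re.fullmatch` approximates the implicitly anchored behavior described for parse, and `re.search` approximates the partial matching described for rex. The sample log line, the pattern, and the helper names are hypothetical and only mirror the Envoy shape listed above; they are not taken from the cluster data or the PPL docs.

```python
import re

# Hypothetical log line shaped like the Envoy format listed above;
# the timestamp, path, and trailing fields are invented for illustration.
LINE = '[2026-03-28T04:00:00.000Z] "GET /api/cart HTTP/1.1" 200 - 0 42'

# camelCase group names only, so the same names would be legal in Java.
PATTERN = r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/1\.1" (?P<statusCode>\d{3})'


def full_match(pattern, text):
    """Anchored match over the whole string (stand-in for parse)."""
    m = re.fullmatch(pattern, text)
    return m.groupdict() if m else None


def partial_match(pattern, text):
    """Match anywhere in the string (stand-in for rex)."""
    m = re.search(pattern, text)
    return m.groupdict() if m else None


# Java group names: a letter followed by letters or digits only.
JAVA_GROUP_NAME = re.compile(r'[A-Za-z][A-Za-z0-9]*')
# Finds (?<name> and (?P<name>, skipping lookbehinds such as (?<= and (?<!
NAMED_GROUP = re.compile(r'\(\?P?<([^>=!]+)>')


def invalid_java_group_names(pattern):
    """Return named groups that the Java regex engine would reject."""
    return [name for name in NAMED_GROUP.findall(pattern)
            if not JAVA_GROUP_NAME.fullmatch(name)]


if __name__ == "__main__":
    print(full_match(PATTERN, LINE))      # pattern does not cover the whole line
    print(partial_match(PATTERN, LINE))   # substring match succeeds
    print(invalid_java_group_names(r'(?<status_code>\d+)'))
```

With an anchored command, the usual workaround is to widen the pattern so it spans the whole field, for example by wrapping it in `.*?` and `.*`.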

Other changes

Sidebar reordered: Overview → Get Started → Send Data → PPL → Discover → ...
Updated main docs index and investigate page with PPL links
README updated with PPL section

Test plan

npm run build passes with all internal links validated (starlight-links-validator)
All playground URLs use correct RISON encoding (!%27 for single quotes)
grok/rex/parse/spath patterns verified against real OTel data via local PPL API
No fabricated data (my-index, accounts, Apache CLF) remains in any example
All See Also links point to correct specific command pages
Visual review of each page in browser
Verify playground links open correctly with pre-filled queries
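
The RISON check above can be reproduced in a few lines. In Rison, strings are delimited by single quotes and `!` is the escape character, so a literal `'` inside a query becomes `!'`, which percent-encodes to `!%27`. This is a minimal sketch: the function name and the sample query are hypothetical, and the playground URL layout is not shown in this PR, so only the encoding of the query string is covered.

```python
from urllib.parse import quote


def rison_quote(value: str) -> str:
    """Encode a string as a quoted Rison string, then percent-encode it."""
    # Rison escaping: double any literal '!' first, then escape "'" as "!'".
    escaped = value.replace("!", "!!").replace("'", "!'")
    rison = "'" + escaped + "'"
    # Keep '!' literal in the URL; quote() turns every "'" into %27.
    return quote(rison, safe="!")


if __name__ == "__main__":
    # Hypothetical PPL fragment with a single-quoted string literal.
    print(rison_quote("where severityText = 'ERROR'"))
```

A link checker could run each example query through a helper like this and compare the result with the query parameter embedded in the playground URL.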

🤖 Generated with Claude Code

Add comprehensive PPL (Piped Processing Language) documentation section targeting Splunk SREs evaluating PPL for OpenSearch observability.

New pages:
- PPL overview with comparison to KQL and EQL
- Command reference summary (50+ commands)
- 27 individual command pages with Description, Syntax, Arguments, Usage notes, Basic/Extended examples, and See also
- Function reference (200+ functions across 13 categories)
- Observability examples with live playground links for OTel data
- Masterclass pipelines (service health scorecard, GenAI cost analysis, Envoy log parsing, error pattern discovery, cross-signal correlation)

All examples use real OTel data from logs-otel-v1* and otel-v1-apm-span-* indices. Text extraction patterns (grok, rex, parse, spath) verified against actual Envoy access logs and Kafka broker logs.

Updated sidebar, main page, investigate page, and README to highlight PPL.

Signed-off-by: Anirudha Jadhav <anirudha@nyu.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codecov · 2026-03-28T04:26:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.51%. Comparing base (5d6beb0) to head (be1b227).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #143   +/-   ##
=======================================
  Coverage   18.51%   18.51%           
=======================================
  Files           3        3           
  Lines          54       54           
  Branches       18       18           
=======================================
  Hits           10       10           
  Misses         44       44


vamsimanohar · 2026-03-28T04:44:11Z

[Not related to this PR] These PPL md files can be reused by the PPL skill for progressive loading, instead of the single big skill file it uses in its current state.

docs/starlight-docs/src/content/docs/agent-health/configuration/index.md

vamsimanohar

LGTM

anirudha requested review from goyamegh, kylehounslow, ps48 and vamsimanohar as code owners March 28, 2026 04:24

vamsimanohar reviewed Mar 28, 2026


vamsimanohar approved these changes Mar 28, 2026


anirudha merged commit 00ac85f into opensearch-project:main Mar 28, 2026
8 checks passed

anirudha merged 1 commit into opensearch-project:main from anirudha:ppl-docs-dco
