docs: add PPL language reference with data-grounded examples #143

Merged
anirudha merged 1 commit into opensearch-project:main from anirudha:ppl-docs-dco
Mar 28, 2026

@anirudha
Collaborator

Summary

Add a comprehensive PPL (Piped Processing Language) documentation section to the Observability Stack docs, aimed at Splunk SREs evaluating PPL as a query language for OpenSearch observability.

  • 27 detailed per-command reference pages with consistent structure
  • All examples use real OTel data from logs-otel-v1* and otel-v1-apm-span-* indices - no fabricated data
  • Every example verified against local OpenSearch PPL API and includes a playground link
  • PPL overview page positioning it as the native query language (with KQL/EQL comparison)
  • Function reference covering 200+ built-in functions across 13 categories
  • Masterclass pipeline examples showcasing PPL's full power for SRE workflows

New pages

| Page | Description |
| --- | --- |
| ppl/index.md | PPL overview - why PPL, comparison table, getting started |
| ppl/commands.md | Command reference summary (50+ commands) |
| ppl/commands/*.md | 27 individual command pages with full detail |
| ppl/functions.md | Function reference (aggregation, string, datetime, math, etc.) |
| ppl/examples.md | Real-world OTel queries with playground links |

Per-command pages

Search & Filter: search, where
Fields & Transformation: fields, eval, rename, fillnull, expand, flatten
Aggregation & Statistics: stats, eventstats, streamstats, timechart, trendline
Sorting & Limiting: sort, head, dedup, top, rare
Text Extraction: parse, grok, rex, patterns, spath
Data Combination: join, lookup
Machine Learning: ml
Metadata: describe

Each command page follows a consistent structure:

  1. Description - what it does, when to use it
  2. Syntax - full syntax block
  3. Arguments - required/optional table with defaults
  4. Usage notes - behavioral notes, gotchas, performance tips
  5. Basic examples (3-5) - with playground links
  6. Extended examples (1-2) - OTel observability scenarios
  7. See also - cross-references to related commands

Examples page highlights

  • SRE incident response: error rate over time, first error per service, P95 latency timeseries
  • Trace analysis: slowest traces, error spans, latency percentiles, trace fan-out
  • AI agent observability: token usage, cost analysis, tool execution, agent invocation latency
  • Advanced analytics: eventstats outlier detection, streamstats rolling windows, trendline smoothing
  • Masterclass pipelines: service health scorecard, GenAI cost/perf analysis, Envoy access log parsing, ML-based error pattern discovery, cross-signal log-trace correlation

Data grounding

All text extraction examples (grok, rex, parse, spath) were tested against actual log bodies in the cluster:

  • Envoy access logs from frontend-proxy: [timestamp] "METHOD /path HTTP/1.1" status ...
  • Kafka broker logs: [ComponentName id=N] message ...
  • Load generator logs: User action product: ID
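For reviewers who want to repeat this kind of check, here is a minimal sketch of how a query could be posted to the PPL endpoint (`POST /_plugins/_ppl`). The host, port, lack of authentication, and the sample query are assumptions for illustration; only the index pattern comes from this PR. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

# Assumption: a local, unsecured OpenSearch node. Adjust host, port, and auth as needed.
OPENSEARCH_URL = "http://localhost:9200"


def build_ppl_request(query: str, base_url: str = OPENSEARCH_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a PPL query as JSON."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_plugins/_ppl",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Index pattern taken from this PR; the query itself is only an illustration.
request = build_ppl_request("source=logs-otel-v1* | head 1")

# To run it against a live cluster:
#     with urllib.request.urlopen(request) as response:
#         print(json.load(response))
```

Keeping request construction separate from sending makes it easy to loop over every documented query and compare responses in one place.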

Key PPL behavioral findings documented:

  • parse requires full-string match (implicitly anchored); rex does partial matching
  • Java regex named capture groups cannot contain underscores (camelCase only)
  • Grok patterns with multiple unnamed %{DATA} cause "Duplicate key" errors
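The first two findings can be illustrated outside PPL. The sketch below uses Python's `re` module as a stand-in for the Java regex engine, so it is an analogy rather than PPL's actual implementation: `fullmatch` mimics the implicitly anchored behaviour described for parse, and `search` mimics the partial matching described for rex. The log line is a hypothetical string shaped like the Envoy format above, not real cluster data, and `is_valid_java_group_name` is an illustrative helper encoding Java's rule that a group name starts with a letter and continues with letters or digits.

```python
import re

# Hypothetical line shaped like the Envoy access logs described above; not real cluster data.
line = '[2026-03-28T12:00:00.000Z] "GET /api/cart HTTP/1.1" 200 - 0 42'

# Python writes named groups as (?P<name>...); Java writes (?<name>...).
pattern = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

# parse-like behaviour: the pattern must cover the whole string, so this is None.
anchored = pattern.fullmatch(line)

# rex-like behaviour: the pattern may match anywhere inside the string.
partial = pattern.search(line)

# An anchored match succeeds only once both ends are padded explicitly.
padded = re.compile(r".*?" + pattern.pattern + r".*").fullmatch(line)

# Java group names: a letter followed by letters or digits, so no underscores.
JAVA_GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_java_group_name(name: str) -> bool:
    """Return True if the name would be accepted as a Java named capture group."""
    return JAVA_GROUP_NAME.fullmatch(name) is not None


print(anchored is None, partial.group("status"), padded.group("method"))
print(is_valid_java_group_name("statusCode"), is_valid_java_group_name("status_code"))
```

A check like `is_valid_java_group_name` could run over documented patterns before they are sent to the cluster, catching snake_case group names early.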

Other changes

  • Sidebar reordered: Overview → Get Started → Send Data → PPL → Discover → ...
  • Updated main docs index and investigate page with PPL links
  • README updated with PPL section

Test plan

  • npm run build passes with all internal links validated (starlight-links-validator)
  • All playground URLs use correct RISON encoding (!%27 for single quotes)
  • grok/rex/parse/spath patterns verified against real OTel data via local PPL API
  • No fabricated data (my-index, accounts, Apache CLF) remains in any example
  • All See Also links point to correct specific command pages
  • Visual review of each page in browser
  • Verify playground links open correctly with pre-filled queries
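The `!%27` in the RISON item follows from two steps: RISON escapes a single quote inside a string as `!'`, and percent-encoding then turns `'` into `%27`. A minimal sketch of that encoding is below. The helper names are hypothetical, the `severityText` field in the sample query is an assumption, and whether the playground also percent-encodes the outer quotes is not confirmed by this PR. The full playground URL layout is deliberately left out.

```python
from urllib.parse import quote


def rison_quote(text: str) -> str:
    """Wrap text as a RISON string: '!' becomes '!!' and a single quote becomes "!'"."""
    # Escape '!' first so the '!' added for quotes is not doubled afterwards.
    escaped = text.replace("!", "!!").replace("'", "!'")
    return f"'{escaped}'"


def encode_for_url(text: str) -> str:
    """Percent-encode a RISON string while keeping '!' literal."""
    return quote(rison_quote(text), safe="!")


# Illustrative query only; the field name is an assumption.
query = "source=logs-otel-v1* | where severityText = 'ERROR'"
encoded = encode_for_url(query)
print(encoded)
```

Passing `safe="!"` matters here: without it, `urllib.parse.quote` would encode the escape character as `%21` and the `!%27` sequence would not appear.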

🤖 Generated with Claude Code

Add comprehensive PPL (Piped Processing Language) documentation section
targeting Splunk SREs evaluating PPL for OpenSearch observability.

New pages:
- PPL overview with comparison to KQL and EQL
- Command reference summary (50+ commands)
- 27 individual command pages with Description, Syntax, Arguments,
  Usage notes, Basic/Extended examples, and See also
- Function reference (200+ functions across 13 categories)
- Observability examples with live playground links for OTel data
- Masterclass pipelines (service health scorecard, GenAI cost analysis,
  Envoy log parsing, error pattern discovery, cross-signal correlation)

All examples use real OTel data from logs-otel-v1* and
otel-v1-apm-span-* indices. Text extraction patterns (grok, rex, parse,
spath) verified against actual Envoy access logs and Kafka broker logs.

Updated sidebar, main page, investigate page, and README to highlight PPL.

Signed-off-by: Anirudha Jadhav <anirudha@nyu.edu>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov

codecov bot commented Mar 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.51%. Comparing base (5d6beb0) to head (be1b227).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #143   +/-   ##
=======================================
  Coverage   18.51%   18.51%           
=======================================
  Files           3        3           
  Lines          54       54           
  Branches       18       18           
=======================================
  Hits           10       10           
  Misses         44       44           


@vamsimanohar
Member

[Not related to this PR.] These PPL md files could be reused by the PPL skill for progressive loading, instead of the single large skill file it uses in its current state.

@vamsimanohar vamsimanohar left a comment

LGTM

@anirudha anirudha merged commit 00ac85f into opensearch-project:main Mar 28, 2026
8 checks passed