Add AI bot classification for event enrichment #57
Open
jaredmixpanel wants to merge 4 commits into master
Part of AI bot classification feature for Java SDK.
Pull request overview
Adds optional AI-bot user-agent classification and event enrichment to the Mixpanel Java SDK via a BotClassifyingMessageBuilder decorator, plus unit tests for classification and enrichment behavior.
Changes:
- Introduces an AI bot "database" (`AiBotEntry`) and classifier (`AiBotClassifier`) that returns an immutable result (`AiBotClassification`).
- Adds a `BotClassifyingMessageBuilder` wrapper to enrich `event` and `importEvent` properties with `$is_ai_bot` and related `$ai_bot_*` fields based on `$user_agent`.
- Adds focused JUnit tests for classifier behavior and message enrichment/passthrough behavior.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/main/java/com/mixpanel/mixpanelapi/AiBotClassification.java | Immutable classification result model returned by the classifier. |
| src/main/java/com/mixpanel/mixpanelapi/AiBotClassifier.java | Default bot database + classification logic + builder for custom bot patterns. |
| src/main/java/com/mixpanel/mixpanelapi/AiBotEntry.java | Immutable DB entry mapping regex patterns to bot metadata. |
| src/main/java/com/mixpanel/mixpanelapi/BotClassifyingMessageBuilder.java | Decorator around MessageBuilder that enriches event/import properties based on $user_agent. |
| src/test/java/com/mixpanel/mixpanelapi/AiBotClassifierTest.java | Tests for default classification, negative cases, case-insensitivity, and custom pattern priority. |
| src/test/java/com/mixpanel/mixpanelapi/BotClassifyingMessageBuilderTest.java | Tests for enrichment, passthrough behavior, preservation, and end-to-end delivery serialization. |
- Add null guard in AiBotEntry.matches()
- Validate null elements in Builder.addBots()
- Remove unused ArrayList import
- Fix getBotDatabase() Javadoc
Summary
Adds AI bot classification with a BotClassifyingMessageBuilder decorator that automatically detects AI crawler requests and enriches tracked events with classification properties.
What it does
- Adds `$is_ai_bot`, `$ai_bot_name`, `$ai_bot_provider`, and `$ai_bot_category` properties
AI Bots Detected
GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-User, Google-Extended, PerplexityBot, Bytespider, CCBot, Applebot-Extended, Meta-ExternalAgent, cohere-ai
Implementation Details
Architecture
- `BotClassifyingMessageBuilder` is a drop-in replacement for `MessageBuilder`, with zero modifications to existing SDK files
- Enriches only `event()` and `importEvent()`; all people/group methods delegate unchanged to the wrapped `MessageBuilder`
- Copies `JSONObject` properties via `new JSONObject(properties.toString())` serialization to prevent mutation of the caller's object
- Custom bot patterns are supported through a builder (`AiBotClassifier.Builder`)
- Uses a `NOT_A_BOT` singleton (`AiBotClassification.noMatch()`) for non-bot classification results, which avoids allocations on the hot path
Public API
| Class | Method | Description |
|---|---|---|
| `AiBotClassifier` | `static AiBotClassification classify(String userAgent)` | |
| `AiBotClassifier` | `AiBotClassification classifyUserAgent(String userAgent)` | For classifiers created via `Builder` |
| `AiBotClassifier` | `static List<AiBotEntry> getBotDatabase()` | |
| `AiBotClassifier.Builder` | `Builder addBot(AiBotEntry entry)` | |
| `AiBotClassifier.Builder` | `Builder addBots(List<AiBotEntry> entries)` | |
| `AiBotClassifier.Builder` | `AiBotClassifier build()` | |
| `AiBotClassification` | `boolean isAiBot()` | |
| `AiBotClassification` | `String getBotName()` | Bot name (e.g. `"GPTBot"`), or `null` if not a bot |
| `AiBotClassification` | `String getProvider()` | Provider (e.g. `"OpenAI"`), or `null` if not a bot |
| `AiBotClassification` | `String getCategory()` | Category (`"indexing"`, `"retrieval"`, or `"agent"`), or `null` if not a bot |
| `AiBotEntry` | `AiBotEntry(Pattern pattern, String name, String provider, String category, String description)` | |
| `AiBotEntry` | `boolean matches(String userAgent)` | Uses `Matcher.find()` |
| `BotClassifyingMessageBuilder` | `BotClassifyingMessageBuilder(MessageBuilder delegate)` | |
| `BotClassifyingMessageBuilder` | `BotClassifyingMessageBuilder(MessageBuilder delegate, AiBotClassifier classifier)` | |
| `BotClassifyingMessageBuilder` | `JSONObject event(String distinctId, String eventName, JSONObject properties)` | |
| `BotClassifyingMessageBuilder` | `JSONObject importEvent(String distinctId, String eventName, JSONObject properties)` | |
Notable Design Decisions
- `BotClassifyingMessageBuilder` wraps `MessageBuilder` via composition rather than extending it, keeping the change fully additive with zero edits to existing SDK source files.
- `AiBotClassifier.classify(ua)` provides a zero-setup static path for the common case; `AiBotClassifier.Builder` + `classifyUserAgent(ua)` supports custom bots when needed. Custom entries are prepended so they take priority over built-in patterns.
- `enrichProperties()` serializes the input `JSONObject` to a string and re-parses it (`new JSONObject(properties.toString())`) before injecting properties, ensuring the caller's original object is never mutated. On `JSONException`, the original properties are returned unchanged.
Usage Examples
Drop-in MessageBuilder Replacement
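The wrapping approach described under Notable Design Decisions can be sketched without the SDK on the classpath. The following is a minimal, dependency-free illustration and not the PR's code: `PlainBuilder` and `BotClassifyingBuilder` are hypothetical stand-ins for `MessageBuilder` and `BotClassifyingMessageBuilder`, a `Map` replaces `JSONObject`, the bot check is a single hard-coded substring test, and passing properties through untouched when `$user_agent` is absent is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class DropInDecoratorSketch {

    /** Hypothetical stand-in for the SDK's MessageBuilder. */
    static class PlainBuilder {
        Map<String, Object> event(String distinctId, String eventName, Map<String, Object> properties) {
            Map<String, Object> message = new HashMap<>();
            message.put("distinct_id", distinctId);
            message.put("event", eventName);
            message.put("properties", properties);
            return message;
        }
    }

    /** Wraps a PlainBuilder (composition) and enriches a copy of the properties before delegating. */
    static final class BotClassifyingBuilder {
        private final PlainBuilder delegate;

        BotClassifyingBuilder(PlainBuilder delegate) {
            this.delegate = delegate;
        }

        Map<String, Object> event(String distinctId, String eventName, Map<String, Object> properties) {
            return delegate.event(distinctId, eventName, enrich(properties));
        }

        private static Map<String, Object> enrich(Map<String, Object> properties) {
            // Assumption: without a string $user_agent there is nothing to classify, so pass through.
            if (properties == null || !(properties.get("$user_agent") instanceof String)) {
                return properties;
            }
            // Shallow copy so the caller's map is never mutated
            // (the PR describes a JSON round-trip copy instead).
            Map<String, Object> copy = new HashMap<>(properties);
            String userAgent = ((String) properties.get("$user_agent")).toLowerCase(Locale.ROOT);
            boolean isBot = userAgent.contains("gptbot"); // single illustrative pattern
            copy.put("$is_ai_bot", isBot);
            if (isBot) {
                copy.put("$ai_bot_name", "GPTBot");
                copy.put("$ai_bot_provider", "OpenAI");
            }
            return copy;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("$user_agent", "Mozilla/5.0 (compatible; GPTBot/1.2)");

        BotClassifyingBuilder builder = new BotClassifyingBuilder(new PlainBuilder());
        Map<String, Object> message = builder.event("user-1", "page_view", properties);

        System.out.println(message.get("properties"));
        System.out.println(properties.containsKey("$is_ai_bot")); // false: caller's map is untouched
    }
}
```

Because the wrapper exposes the same `event(...)` signature as the object it wraps, call sites in this sketch only change where the builder is constructed.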
Standalone Classification
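As a rough illustration of the zero-setup static path and the shared `NOT_A_BOT` result, here is a self-contained sketch. It is not the PR's implementation: `Classification`, `Entry`, and `DATABASE` are hypothetical names, the database holds only two of the listed bots, and the category values assigned to them are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class StandaloneClassificationSketch {

    /** Immutable result; NOT_A_BOT is shared so non-bot lookups allocate nothing. */
    static final class Classification {
        static final Classification NOT_A_BOT = new Classification(false, null, null, null);

        final boolean aiBot;
        final String botName;
        final String provider;
        final String category;

        Classification(boolean aiBot, String botName, String provider, String category) {
            this.aiBot = aiBot;
            this.botName = botName;
            this.provider = provider;
            this.category = category;
        }
    }

    /** Pairs a case-insensitive pattern with the result returned when it matches. */
    static final class Entry {
        final Pattern pattern;
        final Classification result;

        Entry(String regex, String name, String provider, String category) {
            this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            this.result = new Classification(true, name, provider, category);
        }

        boolean matches(String userAgent) {
            // find() matches anywhere in the string; the null guard keeps classify(null) safe.
            return userAgent != null && pattern.matcher(userAgent).find();
        }
    }

    // Illustrative subset; the category values here are assumptions.
    static final List<Entry> DATABASE = Arrays.asList(
            new Entry("GPTBot", "GPTBot", "OpenAI", "indexing"),
            new Entry("ClaudeBot", "ClaudeBot", "Anthropic", "indexing"));

    /** Returns the first matching entry's result, or the shared NOT_A_BOT instance. */
    static Classification classify(String userAgent) {
        for (Entry entry : DATABASE) {
            if (entry.matches(userAgent)) {
                return entry.result;
            }
        }
        return Classification.NOT_A_BOT;
    }

    public static void main(String[] args) {
        Classification bot = classify("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)");
        System.out.println(bot.aiBot + " " + bot.botName + " " + bot.provider);

        Classification browser = classify("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0");
        System.out.println(browser.aiBot + " " + browser.botName);
    }
}
```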
Custom Bot Patterns
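The priority rule for custom entries (prepended ahead of built-in patterns, first match wins) could look roughly like the sketch below. All names are hypothetical stand-ins rather than the PR's classes, the built-in list contains a single entry, and `classifyUserAgent` returns the matching entry (or `null`) instead of a full classification object to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class CustomBotPatternsSketch {

    static final class Entry {
        final Pattern pattern;
        final String name;
        final String provider;

        Entry(Pattern pattern, String name, String provider) {
            this.pattern = pattern;
            this.name = name;
            this.provider = provider;
        }

        boolean matches(String userAgent) {
            return userAgent != null && pattern.matcher(userAgent).find();
        }
    }

    // Single illustrative built-in entry.
    static final List<Entry> BUILT_IN = Arrays.asList(
            new Entry(Pattern.compile("GPTBot", Pattern.CASE_INSENSITIVE), "GPTBot", "OpenAI"));

    private final List<Entry> entries;

    private CustomBotPatternsSketch(List<Entry> entries) {
        this.entries = entries;
    }

    /** Returns the first matching entry in priority order, or null if nothing matches. */
    Entry classifyUserAgent(String userAgent) {
        for (Entry entry : entries) {
            if (entry.matches(userAgent)) {
                return entry;
            }
        }
        return null;
    }

    static final class Builder {
        private final List<Entry> custom = new ArrayList<>();

        Builder addBot(Entry entry) {
            if (entry == null) {
                throw new IllegalArgumentException("entry must not be null");
            }
            custom.add(entry);
            return this;
        }

        CustomBotPatternsSketch build() {
            // Custom entries go first, so they win over built-in patterns.
            List<Entry> all = new ArrayList<>(custom);
            all.addAll(BUILT_IN);
            return new CustomBotPatternsSketch(all);
        }
    }

    public static void main(String[] args) {
        CustomBotPatternsSketch classifier = new Builder()
                .addBot(new Entry(Pattern.compile("GPTBot", Pattern.CASE_INSENSITIVE), "OverrideBot", "ExampleCorp"))
                .addBot(new Entry(Pattern.compile("ExampleCrawler", Pattern.CASE_INSENSITIVE), "ExampleCrawler", "ExampleCorp"))
                .build();

        System.out.println(classifier.classifyUserAgent("GPTBot/1.0").name);
        System.out.println(classifier.classifyUserAgent("ExampleCrawler/2.0").name);
        System.out.println(classifier.classifyUserAgent("curl/8.4.0"));
    }
}
```

Copying the custom list in `build()` means later `addBot` calls on the same builder do not alter a classifier that was already built.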
Full Tracking Flow
Files Added
- src/main/java/com/mixpanel/mixpanelapi/AiBotClassification.java
- src/main/java/com/mixpanel/mixpanelapi/AiBotClassifier.java
- src/main/java/com/mixpanel/mixpanelapi/AiBotEntry.java
- src/main/java/com/mixpanel/mixpanelapi/BotClassifyingMessageBuilder.java
- src/test/java/com/mixpanel/mixpanelapi/AiBotClassifierTest.java
- src/test/java/com/mixpanel/mixpanelapi/BotClassifyingMessageBuilderTest.java
Files Modified
Test Plan
`$is_ai_bot: false` (Chrome, Googlebot, curl, etc.)