Crocking

AI authorship detector for git repositories. Analyzes commit history for statistical signatures of undisclosed AI-generated code.

AI-generated code is entering codebases at an accelerating rate. Much of it arrives with proper attribution. Some doesn't. Crocking examines git history for the signals that AI authorship leaves behind — tool markers, commit message patterns, timing anomalies, diff structure, and code style uniformity — so teams can make informed decisions about compliance, licensing, and trust.

Why This Exists

The Claude Code source leak revealed undercover.ts — a mode that strips AI attribution from commits when contributing to open-source repositories. That's one tool at one company. The broader pattern is everywhere: AI coding assistants generating commits that look human-authored.

This matters for licensing compliance (some licenses restrict AI-generated content), intellectual property audits, open-source contribution policies, and supply chain trust. Not to block AI use — to detect when it's undisclosed.

No standalone tool existed to scan git history for AI authorship signals. Crocking fills that gap.

Quick Start

pip install crocking
crocking scan /path/to/repo
crocking scan . --format json
crocking check abc123def
crocking scan . --max-commits 500

What It Detects

Known AI Markers (AUTH-001)

Pattern	Confidence
Co-authored-by trailers (Copilot, Claude, Cursor, GPT, Codex)	DEFINITIVE
Generated-by trailers referencing AI tools	DEFINITIVE
Explicit AI-generation attribution markers	DEFINITIVE
GitHub Copilot suggestion commits	DEFINITIVE

Commit Message Patterns (AUTH-002)

Pattern	Confidence
Suspiciously uniform conventional commit formatting	MEDIUM
Repetitive commit message prefixes/templates	MEDIUM
Precisely-scoped conventional commits (characteristic length)	LOW
Structured bullet-point commit bodies	MEDIUM

Timing Signals (AUTH-003)

Pattern	Confidence
Burst commits (3+ in under 5 minutes)	MEDIUM
Uniform commit intervals (low coefficient of variation)	LOW

Diff Structure (AUTH-004)

Pattern	Confidence
Bulk file creation (5+ new files, 1000+ lines)	HIGH
Bulk file creation (3+ new files, 500+ lines)	MEDIUM
Pure addition commits (high insertion/deletion ratio)	MEDIUM

Code Style (AUTH-005)

Pattern	Confidence
AI generation comments in code	HIGH
Unusually high comment density (>35% of added lines)	LOW

File/Directory Markers

.claude/, CLAUDE.md, .cursor/, .codex/, AGENTS.md, .github/copilot/, .aider/, .codeium/, .continue/, .v0/, .bolt/, .lovable/

Per-Author Scoring

Score	Label	Meaning
70-100	LIKELY AI	Strong signals of AI-assisted authorship
40-69	POSSIBLE AI	Multiple suggestive patterns
15-39	SOME SIGNALS	Weak indicators, could be human
0-14	LOW	No significant AI signals

Weights: DEFINITIVE (1.0), HIGH (0.7), MEDIUM (0.4), LOW (0.15). The tool surfaces signals. Humans make the judgment call.

Zero Dependencies

crocking uses only Python standard library modules plus the git CLI. No ML models, no API calls, no telemetry.

Also by goweft

When	Tool	What
Before publish	tenter	Pre-publish artifact integrity scanner
After fork	unshear	AI agent fork divergence detector
At runtime	heddle	Policy-and-trust layer for MCP tool servers
Across sessions	ratine	Agent memory poisoning detector
In git history	crocking	AI authorship detector (this tool)

License

MIT - see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
src/crocking		src/crocking
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
banner.svg		banner.svg
demo.svg		demo.svg
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Crocking

Why This Exists

Quick Start

What It Detects

Known AI Markers (AUTH-001)

Commit Message Patterns (AUTH-002)

Timing Signals (AUTH-003)

Diff Structure (AUTH-004)

Code Style (AUTH-005)

File/Directory Markers

Per-Author Scoring

Zero Dependencies

Also by goweft

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Crocking

Why This Exists

Quick Start

What It Detects

Known AI Markers (AUTH-001)

Commit Message Patterns (AUTH-002)

Timing Signals (AUTH-003)

Diff Structure (AUTH-004)

Code Style (AUTH-005)

File/Directory Markers

Per-Author Scoring

Zero Dependencies

Also by goweft

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages