Introducing AgentFerrum - Browser automation for AI agents #563
Florian95 started this conversation in Show and tell
Hey all,
Sharing AgentFerrum, a gem I built on top of Ferrum for browser automation in an AI agent context.
The problem
I'm building AI agents that browse the web, and dumping raw HTML into the LLM context is wasteful. A typical web page is tens of thousands of tokens of noise (scripts, styles, hidden elements, data-* attributes). The agent needs to understand the page and interact with it, not parse a DOM.
What AgentFerrum does
The gem produces a hybrid snapshot: an accessibility tree of interactive elements (with clickable refs) + a markdown rendering of the visible content. In practice that's a 50-80% token reduction compared to raw HTML.
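To make the accessibility-tree half of the snapshot concrete, here is a minimal sketch in plain Ruby. It is not AgentFerrum's actual code. It assumes input shaped like the `nodes` array that the CDP method `Accessibility.getFullAXTree` returns (with Ferrum, presumably obtained through `page.command`), and the sample nodes, the list of interactive roles, the helper name `build_snapshot`, and the output line format are all assumptions made for illustration.

```ruby
# Sample data shaped like CDP AXNode entries (values are made up).
ax_nodes = [
  { "nodeId" => "1", "role" => { "value" => "RootWebArea" }, "name" => { "value" => "Cart" }, "backendDOMNodeId" => 10 },
  { "nodeId" => "2", "role" => { "value" => "heading" }, "name" => { "value" => "Your cart" }, "backendDOMNodeId" => 11 },
  { "nodeId" => "3", "role" => { "value" => "textbox" }, "name" => { "value" => "Promo code" }, "backendDOMNodeId" => 12 },
  { "nodeId" => "4", "ignored" => true, "role" => { "value" => "button" }, "name" => { "value" => "Hidden" }, "backendDOMNodeId" => 13 },
  { "nodeId" => "5", "role" => { "value" => "button" }, "name" => { "value" => "Checkout" }, "backendDOMNodeId" => 14 }
]

# Hypothetical helper: keeps non-ignored nodes with an interactive role and
# assigns each one a short ref. Returns [snapshot_text, refs], where refs maps
# "@eN" to the node's backend DOM node id. The default role list is an assumption.
def build_snapshot(nodes, interactive_roles: %w[button link textbox checkbox radio combobox])
  refs = {}
  lines = []
  nodes.each do |node|
    next if node["ignored"]

    role = node.dig("role", "value")
    next unless interactive_roles.include?(role)

    ref = "@e#{refs.size + 1}"
    refs[ref] = node["backendDOMNodeId"]
    lines << format('%s %s "%s"', ref, role, node.dig("name", "value"))
  end
  [lines.join("\n"), refs]
end

snapshot, refs = build_snapshot(ax_nodes)
puts snapshot
# A ref picked by the agent maps back to a backend node id for the click.
puts refs.fetch("@e2")
```

The agent only ever sees the short refs, while the driver keeps the ref-to-node map on its side, so no selector has to round-trip through the LLM.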
The agent reads the snapshot, decides to check out, and clicks @e4. No CSS selectors needed.
Benchmark on real sites
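A reduction figure like the 50-80% quoted above could be measured along these lines. This is a minimal sketch, not the benchmark AgentFerrum uses: the helper names are hypothetical, the sample strings are made up, and the token count is a rough estimate of about 4 characters per token rather than a real tokenizer.

```ruby
# Rough token estimate: about 4 characters per token (a heuristic only).
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Percentage reduction of the snapshot relative to the raw HTML.
def token_reduction(raw_html, snapshot)
  raw = estimate_tokens(raw_html)
  return 0.0 if raw.zero?

  ((1 - estimate_tokens(snapshot).to_f / raw) * 100).round(1)
end

# Made-up sample page fragment and the snapshot line that would replace it.
raw_html = '<div class="cart" data-id="42"><script>track()</script>' \
           '<button class="btn btn-primary" data-action="checkout">Checkout</button></div>'
snapshot = '@e1 button "Checkout"'

puts token_reduction(raw_html, snapshot)
```

For real measurements, the estimate would be swapped for the tokenizer of the target model, since character-based heuristics drift on markup-heavy input.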
What's inside
Hybrid snapshot: accessibility tree (via Accessibility.getFullAXTree) + markdown (via ReverseMarkdown)
Refs @e1, @e2... resolved through backendNodeId
Stealth levels (:minimal, :moderate, :maximum) ported from puppeteer-extra-plugin-stealth
Install
Links
Inspirations
agent-browser (Vercel) for the accessibility tree + refs concept, Crucible for the stealth/downloads patterns, and FerrumMCP for Ruby/Ferrum patterns.
About the name
I named it AgentFerrum as a nod to this project since it's built entirely on top of it. If the maintainers would prefer I don't use "Ferrum" in the name, just let me know and I'll happily rename it.
Happy to hear any feedback. Ferrum is really nice to work with — the direct CDP access is exactly what's needed for accessibility tree extraction. Thanks to the Ferrum team for the solid foundation!