feat: Add SearchableToolset for dynamic tool discovery from large tool catalogs#10426
…alogs
Implements ToolSearchToolset, a Toolset subclass that enables dynamic tool discovery from large catalogs. Tools are discovered via the special BM25-based `search_tools` tool and then become available to the LLM.
Key features:
- Single discovery mode: "bm25" ("embedding" is postponed for the future)
- Passthrough mode for small catalogs (< search_threshold)
- Self-contained BM25L search engine implementation
- Full serialization support (to_dict/from_dict)
- Auto warm-up when iterating to ensure bootstrap tool availability
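The passthrough rule in the feature list can be sketched roughly as follows. This is an illustration only, not the PR's code: `visible_tools` is a hypothetical helper, tools are represented by plain name strings, and the default of 8 is taken from the PR description.

```python
def visible_tools(catalog, discovered, search_threshold=8):
    """Hypothetical helper: names of the tools exposed to the LLM."""
    if len(catalog) < search_threshold:
        # Passthrough mode: the catalog is small, so expose everything directly.
        return list(catalog)
    # Search mode: only the bootstrap tool plus whatever was discovered so far.
    return ["search_tools"] + [name for name in catalog if name in discovered]
```

With an 8-tool catalog and the default threshold, nothing is passed through: the LLM sees only `search_tools` until a search adds more tools.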
The description claimed to return "a JSON array of tool definitions" but actually returns a plain text confirmation message with tool names.
Tools discovered via search_tools were added to _discovered_tools without calling warm_up(), causing tools that require initialization (connections, model loading) to fail when invoked.
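A minimal sketch of the failure mode and the fix described in this commit, under stated assumptions: `LazyTool` and `discover` are hypothetical stand-ins, not the PR's classes. The point is only that a tool is warmed up before it is registered as discovered.

```python
class LazyTool:
    """Hypothetical tool that needs initialization before it can be invoked."""

    def __init__(self, name):
        self.name = name
        self.ready = False

    def warm_up(self):
        # Stands in for opening a connection or loading a model.
        self.ready = True

    def invoke(self):
        if not self.ready:
            raise RuntimeError(f"{self.name} was not warmed up")
        return f"{self.name} ok"


def discover(discovered, tool):
    """Warm up a tool first, then register it as discovered."""
    tool.warm_up()
    discovered[tool.name] = tool
    return tool
```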
The inherited __getitem__ accessed self.tools which is always empty in ToolSearchToolset. This caused IndexError for valid indexes even when tools were available through __iter__.
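The mismatch between `__getitem__` and `__iter__` can be reproduced with a small sketch. `BaseToolset` and `DiscoveringToolset` are hypothetical classes written for illustration, with tools represented as strings; the actual fix in the PR may differ.

```python
class BaseToolset:
    """Hypothetical base class that indexes into self.tools."""

    def __init__(self, tools=None):
        self.tools = list(tools or [])

    def __iter__(self):
        return iter(self.tools)

    def __getitem__(self, index):
        return self.tools[index]


class DiscoveringToolset(BaseToolset):
    """Hypothetical subclass whose tools live outside self.tools."""

    def __init__(self, bootstrap_tool):
        super().__init__()  # self.tools stays empty
        self._bootstrap_tool = bootstrap_tool
        self._discovered_tools = {}

    def __iter__(self):
        yield self._bootstrap_tool
        yield from self._discovered_tools.values()

    def __getitem__(self, index):
        # Index the same sequence that __iter__ exposes.
        return list(self)[index]
```

Without the override, indexing falls back to the empty `self.tools` list and raises `IndexError` even though iteration yields tools.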
@sjrl @julian-risch - give me 1-2 days to test it thoroughly with large MCP toolsets, and if all is good I'll open this PR. This is the general direction @mpangrazzi and I talked about. LMK if you agree.
anakin87
left a comment
Cool implementation!
I left some comments and there's still some work to be done, but it seems a good direction.
|
Looking very cool! One additional question I had is that this wouldn't work with |
@sjrl @anakin87 I thought it would. It warms up tools, so perhaps it should work. I still haven't gotten around to testing it, but I will today or tomorrow. Will report back!
Renamed the `query` parameter to `tool_keywords` and refined the description to guide LLMs toward providing vocabulary from tool names/descriptions rather than echoing user requests.
Before: LLMs often passed user intent like "south of france highlights"
After: LLMs provide tool vocabulary like "route weather search"
This improves BM25 matching since it relies on lexical overlap with indexed tool names and descriptions.
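To illustrate why tool vocabulary matches better than user intent, here is a toy sketch that counts shared tokens. It is deliberately not BM25 (no term weighting or length normalization), and the three tools and their descriptions are made up for the example.

```python
def overlap_score(keywords, text):
    """Count distinct keyword tokens that also appear in the text (not BM25)."""
    query_tokens = set(keywords.lower().split())
    text_tokens = set(text.lower().replace("_", " ").split())
    return len(query_tokens & text_tokens)


# Hypothetical indexed text: tool name followed by its description.
tools = {
    "plan_route": "plan_route Plan a driving route between two cities",
    "get_weather": "get_weather Get the weather forecast for a city",
    "web_search": "web_search Search the web for pages",
}


def rank(keywords):
    """Return the tools that share at least one token with the keywords."""
    scores = {name: overlap_score(keywords, text) for name, text in tools.items()}
    return {name: score for name, score in scores.items() if score > 0}
```

Here `rank("south of france highlights")` matches nothing, while `rank("route weather search")` matches all three tools, which mirrors the before/after behaviour described in the commit message.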
Rename class and module for clarity:
- tool_search_toolset.py -> searchable_toolset.py
- ToolSearchToolset -> SearchableToolset
- Update all imports, tests, and release notes
Remove the hand-rolled BM25L engine (~107 lines) and delegate to Haystack's built-in InMemoryDocumentStore.bm25_retrieval(), which uses the same algorithm and tokenization. Update tests accordingly.
Replace manual Tool construction with create_tool_from_function and Annotated type hints in _create_search_tool. Reduce test duplication by using create_tool_from_function for fixtures and large_catalog. Consolidate 4 integration tests into 1 deterministic math test and remove the redundant integration_catalog fixture.
These were good improvements. One last thing I want to do is make sure that SearchableToolset is robust and battle-tested, i.e. that it works even if tools are loaded lazily, e.g. in MCPToolset and potentially others. It currently works great in the RL demo with the itinerary agent when MCPToolset is eager, but not when it is lazy. A few more days and I'll open this PR @sjrl @anakin87
This method allows resetting the toolset's discovered tools between agent runs
when the same toolset instance is reused. This can be useful for long-running
We say this in the docstrings which sounds nice, but I don't think our Agent is set up to call this method. If we think it should do that perhaps we should open up a follow-up issue to update Agent to utilize this method? (Not entirely sure what that would look like tbh).
…tack into tool_search_tool
Key features include:
configurable search threshold for automatic passthrough mode, top-k result limiting,
and a ``clear()`` method to reset discovered tools between agent runs.
Still feel a little odd highlighting this since I don't know how I'd use clear in practice, nor is it highlighted in the code example
I can remove the mention from the release note.
Just to share an example
from haystack.tools import create_tool_from_function, SearchableToolset
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage


def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 22°C, sunny"


def add_numbers(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b


def multiply_numbers(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


def get_stock_price(symbol: str) -> str:
    """Get stock price by ticker symbol."""
    return f"{symbol}: $150.00"


def search_database(query: str) -> str:
    """Search the database for records."""
    return f"Found 5 records matching '{query}'"


def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a recipient."""
    return f"Email sent to {to}"


def calculate_tax(amount: float, rate: float) -> float:
    """Calculate tax on an amount."""
    return amount * rate


def convert_currency(amount: float, from_currency: str, to_currency: str) -> float:
    """Convert currency from one to another."""
    return amount * 1.1  # Simplified conversion


# Test fixtures
weather_tool = create_tool_from_function(get_weather)
add_tool = create_tool_from_function(add_numbers)
multiply_tool = create_tool_from_function(multiply_numbers)
stock_tool = create_tool_from_function(get_stock_price)
search_tool = create_tool_from_function(search_database)
send_email_tool = create_tool_from_function(send_email)
calculate_tax_tool = create_tool_from_function(calculate_tax)
convert_currency_tool = create_tool_from_function(convert_currency)

large_catalog = [
    weather_tool,
    add_tool,
    multiply_tool,
    stock_tool,
    search_tool,
    send_email_tool,
    calculate_tax_tool,
    convert_currency_tool,
]

searchable_toolset = SearchableToolset(catalog=large_catalog)
agent = Agent(tools=searchable_toolset, chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"))

result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])
print(result["messages"])
# ...
print(len(searchable_toolset))
# 2
print(searchable_toolset._discovered_tools)
# {'get_weather': Tool(name='get_weather', description='Get current weather for a city.', parameters={'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object'}, function=<function get_weather at 0x1017a6700>, outputs_to_string=None, inputs_from_state=None, outputs_to_state=None)}

# Reset the discovered tools before the next run
searchable_toolset.clear()
print(len(searchable_toolset))
# 1
print(searchable_toolset._discovered_tools)
# {}

result = agent.run(messages=[ChatMessage.from_user("How many records in the database for query: 'apple'. Use the appropriate tool to search the database.")])
print(result["messages"])
# ...
print(len(searchable_toolset))
# 2
print(searchable_toolset._discovered_tools)
# {'search_database': Tool(name='search_database', description='Search the database for records.', parameters={'properties': {'query': {'type': 'string'}}, 'required': ['query'], 'type': 'object'}, function=<function search_database at 0x1189276a0>, outputs_to_string=None, inputs_from_state=None, outputs_to_state=None)}
Thanks for the example and opening the issue! The issue should help us figure out how this can be used in a production app rather than just in a script.
sjrl
left a comment
Looks good! Just two minor comments about how a user is practically meant to use the clear method. But probably something we can do in a follow up PR
Why
Large tool catalogs overwhelm LLM context windows. Agents need a way to discover tools on-demand rather than receiving all tool definitions upfront.
What
- SearchableToolset: Toolset subclass with BM25-based tool discovery
- Uses InMemoryDocumentStore.bm25_retrieval() internally for BM25L search: no custom search engine, no extra dependencies
- search_tools(query, k) bootstrap tool for LLM-driven discovery
- Passthrough mode for catalogs smaller than search_threshold (default: 8)
- clear() method for resetting discovered tools between agent runs
- Full serialization support (to_dict/from_dict)
How can it be used
How did you test it
Notes for the reviewer
Search delegates to InMemoryDocumentStore.bm25_retrieval(), reusing well-tested Haystack infrastructure instead of a hand-rolled engine