This directory contains examples for the `PlannerExecutorAgent`, a two-tier agent
architecture with separate Planner (7B+) and Executor (3B-7B) models.

> **See also**: [Full User Manual](../../docs/PLANNER_EXECUTOR_AGENT.md) for comprehensive documentation.

## Examples

| File | Description |
|------|-------------|
| `minimal_example.py` | Basic usage with OpenAI models |
| `stepwise_example.py` | Stepwise (ReAct-style) planning for unfamiliar sites |
| `automation_task_example.py` | Using AutomationTask for flexible task definition |
| `captcha_example.py` | CAPTCHA handling with different solvers |
| `local_models_example.py` | Using local HuggingFace/MLX models |
| `custom_config_example.py` | Custom configuration (escalation, retry, vision) |
| `tracing_example.py` | Full tracing integration for Predicate Studio |
@@ -23,6 +28,7 @@ architecture with separate Planner (7B+) and Executor (3B-7B) models.
│  • Generates JSON plan      │  • Executes each step         │
│  • Includes predicates      │  • Snapshot-first approach    │
│  • Handles replanning       │  • Vision fallback            │
│  • Stepwise (ReAct) mode    │                               │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
@@ -34,6 +40,50 @@ architecture with separate Planner (7B+) and Executor (3B-7B) models.
└─────────────────────────────────────────────────────────────┘
```

## Planning Modes

### Upfront Planning (Default)

The planner generates a complete multi-step plan before execution. Use this mode for well-known sites.

```python
result = await agent.run(runtime, task)
```

### Stepwise Planning (ReAct-style)

The planner decides one action at a time based on the current page state. **Recommended for unfamiliar sites.**

```python
from predicate.agents import PlannerExecutorConfig, StepwisePlanningConfig

config = PlannerExecutorConfig(
    stepwise=StepwisePlanningConfig(
        max_steps=30,
        action_history_limit=5,
    ),
)

agent = PlannerExecutorAgent(planner=planner, executor=executor, config=config)
result = await agent.run_stepwise(runtime, task)
```
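
To give a feel for what a setting like `action_history_limit` bounds, the sketch below keeps only the most recent actions in a fixed-size buffer. It assumes the limit caps how many past actions are carried forward to the planner, which the README does not spell out. `build_history` and the action strings are hypothetical names, and this is standard-library Python rather than the library's own code.

```python
from collections import deque

# Hypothetical illustration of a bounded action history, similar in
# spirit to action_history_limit=5. This is not the library's code.
ACTION_HISTORY_LIMIT = 5


def build_history(actions, limit=ACTION_HISTORY_LIMIT):
    """Return the last `limit` actions, oldest first."""
    history = deque(maxlen=limit)  # older entries drop off automatically
    for action in actions:
        history.append(action)
    return list(history)


recent = build_history([f"step-{i}" for i in range(1, 8)])
print(recent)
```

With seven recorded actions and a limit of five, only the last five remain, so the context handed to the planner stays small as a run gets longer.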

### Auto-Fallback (Default Behavior)

By default, `agent.run()` automatically falls back to stepwise planning when upfront planning fails:

```python
# Default: auto_fallback_to_stepwise=True
result = await agent.run(runtime, task)

# Check if fallback was used
if result.fallback_used:
    print("Automatically switched to stepwise planning")

# Disable auto-fallback
config = PlannerExecutorConfig(auto_fallback_to_stepwise=False)
```
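
The control flow behind this behavior can be pictured roughly as follows. This is a self-contained sketch, not the agent's implementation: `PlanningError`, `run_with_fallback`, and the two stub coroutines are hypothetical names, and the real agent may detect planning failure differently.

```python
import asyncio


class PlanningError(Exception):
    """Hypothetical error raised when upfront planning fails."""


async def run_with_fallback(run_upfront, run_stepwise, auto_fallback=True):
    """Try upfront planning first; optionally fall back to stepwise.

    Returns a (result, fallback_used) tuple.
    """
    try:
        return await run_upfront(), False
    except PlanningError:
        if not auto_fallback:
            raise  # mirrors auto_fallback_to_stepwise=False
        return await run_stepwise(), True


async def failing_upfront():
    # Stub standing in for an upfront plan that could not be built.
    raise PlanningError("could not build a complete plan")


async def stepwise():
    # Stub standing in for a successful stepwise run.
    return "done via stepwise"


result, fallback_used = asyncio.run(run_with_fallback(failing_upfront, stepwise))
print(result, fallback_used)
```

With `auto_fallback=False` the same call re-raises the planning error instead of switching modes, which is the behavior to choose when a failed upfront plan should surface to the caller.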

## Quick Start

```python
@@ -139,3 +189,117 @@ agent = PlannerExecutorAgent(

tracer.close()  # Upload trace to Studio
```

## AutomationTask

Use `AutomationTask` for flexible task definition with built-in recovery:

```python
from predicate.agents import AutomationTask, TaskCategory

# Basic task
task = AutomationTask(
    task_id="search-products",
    starting_url="https://amazon.com",
    task="Search for laptops and add the first result to cart",
    category=TaskCategory.TRANSACTION,
    enable_recovery=True,
)

# Add success criteria
task = task.with_success_criteria(
    {"predicate": "url_contains", "args": ["/cart"]},
    {"predicate": "exists", "args": [".cart-item"]},
)

result = await agent.run(runtime, task)
```
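
To make the shape of the success criteria concrete, here is a rough sketch of how dictionaries like these could be evaluated. It is an illustration only: the `page` dictionary, the two predicate functions, and `all_criteria_met` are hypothetical, and it assumes every criterion must pass, which the library may or may not require.

```python
# Hypothetical evaluator for illustration only; the real predicate
# semantics are defined by the library, not by this sketch.


def url_contains(page, fragment):
    """True if the current URL contains the given fragment."""
    return fragment in page["url"]


def exists(page, selector):
    """True if the selector is among those present on the page."""
    return selector in page["selectors"]


PREDICATES = {"url_contains": url_contains, "exists": exists}


def all_criteria_met(page, criteria):
    """Assumes every criterion must pass for the task to succeed."""
    return all(
        PREDICATES[c["predicate"]](page, *c["args"]) for c in criteria
    )


criteria = [
    {"predicate": "url_contains", "args": ["/cart"]},
    {"predicate": "exists", "args": [".cart-item"]},
]
# Stand-in for page state: a URL plus the set of matching selectors.
page = {"url": "https://amazon.com/cart", "selectors": {".cart-item"}}
print(all_criteria_met(page, criteria))
```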

## Permissions

Grant browser permissions to prevent permission dialogs from interrupting automation:

```python
from predicate import AsyncPredicateBrowser

# Grant permissions to avoid "Allow this site to access your location?" dialogs
permission_policy = {
    "auto_grant": [
        "geolocation",      # Store locators, local inventory
        "notifications",    # Push notification prompts
        "clipboard-read",   # Paste coupon codes
        "clipboard-write",  # Copy product info
    ],
    "geolocation": {"latitude": 47.6762, "longitude": -122.2057},  # Mock location
}

async with AsyncPredicateBrowser(
    permission_policy=permission_policy,
) as browser:
    # Run automation without permission dialogs
    ...
```

## CAPTCHA Handling

Configure CAPTCHA solving with different strategies:

```python
from predicate.agents import PlannerExecutorConfig
from predicate.agents.browser_agent import CaptchaConfig
from predicate.captcha_strategies import HumanHandoffSolver, ExternalSolver

# Human handoff: wait for manual solve
config = PlannerExecutorConfig(
    captcha=CaptchaConfig(
        policy="callback",
        handler=HumanHandoffSolver(timeout_ms=120_000),
    ),
)

# External solver: integrate with 2Captcha, CapSolver, etc.
def solve_captcha(ctx):
    # Call your CAPTCHA solving service
    pass

config = PlannerExecutorConfig(
    captcha=CaptchaConfig(
        policy="callback",
        handler=ExternalSolver(resolver=solve_captcha),
    ),
)
```

## Modal/Drawer Dismissal

Automatic modal and drawer dismissal is enabled by default in both upfront and stepwise planning modes.

After successful CLICK actions, the agent automatically detects and dismisses blocking overlays:

```python
from predicate.agents import PlannerExecutorConfig, ModalDismissalConfig

# Default: enabled with common patterns (works in both modes)
config = PlannerExecutorConfig()

# Custom patterns for non-English sites
config = PlannerExecutorConfig(
    modal=ModalDismissalConfig(
        dismiss_patterns=(
            "no thanks", "not now", "close", "skip",  # English
            "nein danke", "schließen",  # German
            "no gracias", "cerrar",  # Spanish
        ),
    ),
)

# Disable modal dismissal
config = PlannerExecutorConfig(
    modal=ModalDismissalConfig(enabled=False),
)
```

This handles common e-commerce scenarios such as:
- Amazon's "Add Protection Plan" drawer after Add to Cart
- Cookie consent banners
- Newsletter signup popups
- Promotional overlays
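
As a rough picture of how dismiss patterns might be applied, the sketch below does a case-insensitive substring match of patterns against visible button labels. The matching rule is an assumption, since the README does not describe how `dismiss_patterns` are compared, and `find_dismiss_label` is a hypothetical helper rather than part of the library.

```python
# Hypothetical matcher: the README does not say how dismiss_patterns
# are compared, so case-insensitive substring matching is an assumption.


def find_dismiss_label(labels, dismiss_patterns):
    """Return the first label containing any dismiss pattern, else None."""
    patterns = [p.casefold() for p in dismiss_patterns]
    for label in labels:
        normalized = label.strip().casefold()  # casefold handles "ß" vs "ss"
        if any(pattern in normalized for pattern in patterns):
            return label
    return None


patterns = ("no thanks", "not now", "close", "schließen")
print(find_dismiss_label(["Add Protection", "No Thanks"], patterns))
print(find_dismiss_label(["Weiter", "Schließen"], patterns))
print(find_dismiss_label(["Add to Cart"], patterns))
```

Substring matching is forgiving but can over-match (for example, "close" also matches "Closeout deals"), so site-specific patterns are worth keeping as precise as possible.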