Current state
Pilo plans once at task start via planTask (webAgent.ts:1377-1477). The planner LLM returns:
{
  successCriteria: string;
  plan: string; // Markdown step-by-step plan
  url?: string;
  actionItems?: string[];
}
These are stored on the WebAgent instance:
private plan: string;
private successCriteria: string;
private actionItems?: string[];
private url: string;
The plan is embedded permanently into messages[1] (the first user message after the system prompt) via buildTaskAndPlanPrompt. The agent reads this once, then proceeds through iterations. There is no mechanism to revise the plan mid-task.
The gap
Common patterns this breaks:
- The plan turns out to be wrong: the planner assumed the user can be reached at example.com but actually the site has moved. The original plan keeps showing up in messages[1] forever even though it's misleading.
- A new constraint emerges mid-task: the agent discovers a CAPTCHA, a paywall, or a region-block. The plan didn't account for this. The agent reactively works around it without updating the canonical plan it's working from.
- Long tasks accumulate hidden context: by iteration 20, the model has discovered many facts about the page/task that aren't reflected in the stale plan. The plan in messages[1] is the original plan; everything since is implicit in the conversation history.
Compounding this: the actionItems array (3-6 word UI labels for plan steps) is set once and never updated. UI consumers showing progress see the original plan stage labels even when the agent has substantially deviated.
Proposed scope
A. Add a revise_plan tool (gated)
revise_plan: tool({
  description:
    "Update your task plan when your understanding of the task has materially changed " +
    "(e.g., a constraint emerged, the original approach won't work, or you discovered " +
    "a better path). Provide the revised plan as Markdown. Use sparingly: only when " +
    "the original plan is misleading or incomplete.",
  inputSchema: z.object({
    revisedPlan: z.string().describe("The updated plan as Markdown"),
    reason: z.string().describe("Brief explanation of why the plan needed revision"),
    revisedActionItems: z
      .array(z.string())
      .optional()
      .describe("Updated 3-6-word action item labels"),
  }),
  execute: async ({ revisedPlan, reason, revisedActionItems }) => {
    // Update agent-instance state
    // Emit PLAN_REVISED event
    return {
      success: true,
      action: "revise_plan",
      revisedPlan,
      reason,
      revisedActionItems,
    };
  },
}),
Gated on a config flag: WebAgentOptions.enableReplanning?: boolean (default false). It is off by default because it adds complexity and may not be worth it for all tasks.
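A minimal sketch of the gating, assuming tools are selected by name when a run starts. The WebAgentOptions shape is reduced to the one flag, and the tool names other than done and revise_plan, along with selectToolNames itself, are hypothetical and not taken from the codebase.

```typescript
// Reduced, hypothetical shape: the real WebAgentOptions has more fields.
interface WebAgentOptions {
  enableReplanning?: boolean;
}

// "click" and "fill" are placeholder tool names for illustration only.
type ToolName = "click" | "fill" | "done" | "revise_plan";

// Returns the tool names exposed to the model for a run. revise_plan is
// included only when the flag is explicitly true, so default behavior
// (flag omitted or false) is unchanged.
function selectToolNames(options: WebAgentOptions = {}): ToolName[] {
  const base: ToolName[] = ["click", "fill", "done"];
  return options.enableReplanning === true ? [...base, "revise_plan"] : base;
}
```

Checking for an explicit true keeps an omitted flag equivalent to false, which matches the "default false" behavior described above.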
B. Plan-update propagation
When revise_plan is called, update the agent's instance state (this.plan, this.actionItems) and append a system-message-style note to messages:
[Plan revised at iteration N]
Reason: {reason}
Updated plan:
{revisedPlan}
This makes the revised plan visible to subsequent turns. Do not modify messages[1] directly — leave the original plan as the historical anchor so the conversation history stays coherent.
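The propagation step could look roughly like the sketch below. PlanState, PlanRevision, formatRevisionNote, and applyPlanRevision are hypothetical names, and using the "user" role for the appended note is an assumption: the proposal only says "system-message-style".

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical stand-in for the relevant WebAgent instance fields.
interface PlanState {
  plan: string;
  actionItems?: string[];
  messages: Message[];
}

interface PlanRevision {
  revisedPlan: string;
  reason: string;
  revisedActionItems?: string[];
}

// Renders the note using the template from this section.
function formatRevisionNote(iteration: number, r: PlanRevision): string {
  return [
    `[Plan revised at iteration ${iteration}]`,
    `Reason: ${r.reason}`,
    "Updated plan:",
    r.revisedPlan,
  ].join("\n");
}

// Updates instance state and appends the note. messages[1] is never
// touched, so the original plan stays in place as the historical anchor.
function applyPlanRevision(state: PlanState, iteration: number, r: PlanRevision): void {
  state.plan = r.revisedPlan;
  if (r.revisedActionItems !== undefined) {
    state.actionItems = r.revisedActionItems;
  }
  // Assumption: the note is appended with the "user" role.
  state.messages.push({ role: "user", content: formatRevisionNote(iteration, r) });
}
```

In this sketch, omitting revisedActionItems leaves the existing labels in place rather than clearing them; whether that is the desired behavior is a design decision the proposal leaves open.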
C. Emit PLAN_REVISED event
PLAN_REVISED: {
  iterationId: string;
  iteration: number;
  reason: string;
  newPlan: string;
  newActionItems?: string[];
}
UI consumers (CLI progress display, extension popup) can re-render the action items list.
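As a rough illustration of the consumer side, the sketch below uses a tiny hand-rolled listener registry rather than the project's actual event system, whose API is not shown in this issue. PlanEvents, onPlanRevised, emitPlanRevised, and the sample labels are all hypothetical.

```typescript
// Payload shape as proposed in this section.
interface PlanRevisedEvent {
  iterationId: string;
  iteration: number;
  reason: string;
  newPlan: string;
  newActionItems?: string[];
}

type PlanRevisedListener = (event: PlanRevisedEvent) => void;

// Hypothetical stand-in for the real event emitter.
class PlanEvents {
  private listeners: PlanRevisedListener[] = [];

  onPlanRevised(listener: PlanRevisedListener): void {
    this.listeners.push(listener);
  }

  emitPlanRevised(event: PlanRevisedEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// A UI consumer keeps its progress labels in sync. When a revision omits
// newActionItems, the previously displayed labels are kept.
let labels: string[] = ["Open site", "Find pricing page"];
const events = new PlanEvents();
events.onPlanRevised((event) => {
  labels = event.newActionItems ?? labels;
});
```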
D. Surface in validator
If revise_plan was called, the validator should see both the original task and the revised plan. Update buildTaskValidationPrompt to include revisedPlan if it differs from the original.
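One way this could look, as a sketch only: the real buildTaskValidationPrompt signature is not shown in this issue, so the function below uses a different, hypothetical name and input shape, and the wording of the appended note is an assumption.

```typescript
// Hypothetical input shape for illustration.
interface ValidationInput {
  task: string;
  originalPlan: string;
  currentPlan: string;
  revisionReasons: string[];
}

// Builds the validator prompt body. The revised plan, the reasons, and a
// short note are included only when the current plan differs from the
// original, so the prompt is unchanged for runs without any revision.
function buildValidationPromptSketch(input: ValidationInput): string {
  const sections: string[] = [
    `Task: ${input.task}`,
    `Original plan:\n${input.originalPlan}`,
  ];
  if (input.currentPlan !== input.originalPlan) {
    sections.push(`Revised plan:\n${input.currentPlan}`);
    if (input.revisionReasons.length > 0) {
      const reasons = input.revisionReasons.map((r) => `- ${r}`).join("\n");
      sections.push(`Revision reasons:\n${reasons}`);
    }
    sections.push(
      "Note: the plan was revised mid-task. Check that the result still satisfies the original task."
    );
  }
  return sections.join("\n\n");
}
```

Keeping the original plan in the prompt alongside the revised one gives the validator a basis for noticing when a revision has quietly narrowed the task.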
E. System prompt update
If enableReplanning is true, append a best-practices bullet:
- If you discover the original plan won't work or needs significant adjustment (a
constraint emerged, a key assumption was wrong), call revise_plan() with an
updated plan and a brief reason. Do not call revise_plan() for minor tactical
changes — only when the original plan is materially misleading.
Implementation notes
- Replanning is a power tool that's easy to misuse. Without prompt guardrails, models may call revise_plan every few iterations as a form of nervous restructuring. Add to the prompt: "Use sparingly. Tactical changes don't need a revised plan."
- The validator's success criteria are derived from the planner output. If revise_plan updates them, validator behavior could change mid-task. Decide whether revise_plan can update successCriteria (probably yes, but it's a sharper edge) or only plan and actionItems.
- Plan revision interacts with the validation force-accept path. If the agent revises the plan to be much simpler and then claims done() on the simpler plan, the validator might rubber-stamp it. The validator prompt should be aware that a plan revision happened.
- Test scenarios:
  - Agent revises plan once mid-task, completes successfully.
  - Agent abuses revise_plan (calls it 5 times in 10 iterations): does the over-use warning kick in?
  - Validator with revised plan correctly assesses against the revised plan (and the revised success criteria, if revise_plan is allowed to update them).
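The over-use scenario implies some runtime check beyond the prompt wording. A minimal sketch of one possible guard follows; RevisionTracker and its thresholds (more than 2 revisions within the last 10 iterations) are assumptions for illustration, not values from this proposal.

```typescript
// Tracks the iterations at which revise_plan was called and returns a
// warning string when calls cluster too tightly.
class RevisionTracker {
  private readonly maxRevisions: number;
  private readonly windowSize: number;
  private revisionIterations: number[] = [];

  // Defaults are placeholder thresholds, to be tuned during prompt iteration.
  constructor(maxRevisions: number = 2, windowSize: number = 10) {
    this.maxRevisions = maxRevisions;
    this.windowSize = windowSize;
  }

  // Records a revision at `iteration`. Returns a warning when the number of
  // revisions within the last `windowSize` iterations exceeds `maxRevisions`,
  // otherwise undefined.
  record(iteration: number): string | undefined {
    this.revisionIterations.push(iteration);
    const recent = this.revisionIterations.filter(
      (i) => i > iteration - this.windowSize
    );
    if (recent.length > this.maxRevisions) {
      return (
        `revise_plan was called ${recent.length} times in the last ` +
        `${this.windowSize} iterations. Use sparingly. ` +
        "Tactical changes don't need a revised plan."
      );
    }
    return undefined;
  }
}
```

The returned warning could be included in the tool result so the model sees it on the next turn; whether to warn, reject the call, or only log is left open here.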
Acceptance criteria
- revise_plan tool exists, gated by enableReplanning config.
- Calling it updates instance state and appends an annotated message to the conversation.
- PLAN_REVISED event fires with the right payload.
- Validator sees the revised plan when applicable.
- System prompt includes guidance when feature is enabled.
- Tests cover: single revision, multiple revisions, validator behavior after revision, gated-off behavior.
Effort estimate
2-3 days including tests and prompt iteration. The hard part is preventing over-use, not implementing the basic mechanism.
Related issues
Pairs with the validator-fix issue (validator with conversation context naturally absorbs plan revisions). Distinct from the multi-action-per-turn issue.
Files likely affected
- packages/core/src/tools/planningTools.ts (new revise_plan tool)
- packages/core/src/webAgent.ts (plumbing through generateAndProcessAction, instance state)
- packages/core/src/types/ (WebAgentOptions, event types)
- packages/core/src/events.ts (PLAN_REVISED)
- packages/core/src/prompts.ts (system prompt update, validator prompt)
- packages/core/test/webAgent.test.ts