LLM agents with unrestricted tool access are effectively RCE vulnerabilities. System prompts saying 'don't run dangerous code' are trivially bypassed.
Defense in depth: (1) Sandboxed container (gVisor, Firecracker) with no network, read-only filesystem, resource limits. (2) Static analysis: AST-parse code, block dangerous imports (os, subprocess, socket). (3) Capability-based permissions: agent declares needed resources, sandbox grants only those. (4) Audit logging of all executed code.
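Layer (2) can be sketched with the stdlib `ast` module. The denylist below and the choice of blocked builtins are illustrative assumptions, not a complete policy -- static analysis alone is bypassable and must sit behind the sandbox layer:

```python
import ast

# Hypothetical denylist -- a real deployment combines this with OS-level isolation.
BLOCKED_MODULES = {"os", "subprocess", "socket", "ctypes", "shutil"}

def check_code(source: str) -> list[str]:
    """Return a list of policy violations found by AST inspection (empty = passed)."""
    violations = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in BLOCKED_MODULES:
                    violations.append(f"import of blocked module: {root}")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root in BLOCKED_MODULES:
                violations.append(f"import from blocked module: {root}")
        elif isinstance(node, ast.Call):
            # Dynamic-execution builtins are escape hatches around the import check.
            if isinstance(node.func, ast.Name) and node.func.id in {"eval", "exec", "__import__"}:
                violations.append(f"call to blocked builtin: {node.func.id}")
    return violations

print(check_code("import subprocess; subprocess.run(['ls'])"))
# ['import of blocked module: subprocess']
print(check_code("x = sum(range(10))"))
# []
```

Note this only catches what parses: obfuscated access (e.g. via `getattr` chains) slips through, which is why the sandbox, not the analyzer, is the actual security boundary.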
Direct inter-agent communication with unstructured text loses information. The writer can't distinguish 'critical fact' from 'background context' when everything arrives as a wall of text.
Use structured handoff protocols: researcher outputs a typed schema with priority-ranked facts, source citations, and relevance scores. The writer receives top-K facts sorted by relevance + full list as appendix. Consider a 'critic' agent that reviews writer output against researcher findings and requests revisions if coverage is low.
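A minimal sketch of the handoff schema, assuming dataclasses for the typed payload; the field names (`text`, `source`, `relevance`) and the 0-1 relevance scale are illustrative choices, not a standard:

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    source: str       # citation the writer can pass through
    relevance: float  # 0.0-1.0, assigned by the researcher agent

@dataclass
class Handoff:
    facts: list[Fact]

    def for_writer(self, k: int = 3) -> dict:
        """Top-K facts the writer must cover, plus the full ranked list as appendix."""
        ranked = sorted(self.facts, key=lambda f: f.relevance, reverse=True)
        return {
            "top_facts": [f.text for f in ranked[:k]],
            "appendix": [(f.text, f.source) for f in ranked],
        }

handoff = Handoff([
    Fact("Revenue grew 12% YoY", "10-K filing", 0.95),
    Fact("Company founded in 1998", "About page", 0.30),
    Fact("New CEO appointed in March", "Press release", 0.80),
])
print(handoff.for_writer(k=2)["top_facts"])
# ['Revenue grew 12% YoY', 'New CEO appointed in March']
```

The critic agent would then diff writer output against `top_facts` and request a revision whenever coverage falls below a threshold.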
Token counters and API call limits are too coarse -- they kill the agent mid-task. The agent needs cost awareness built into its planning, not just hard cutoffs.
Cost-aware planning loop: (1) Before each tool call, estimate cost and check remaining budget. (2) Cheaper tools first (cache lookup before API, small model before large). (3) Per-step budgets with warnings. (4) Budget checkpoint where agent justifies remaining spend. (5) Log cost per tool call for observability.
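Steps (1), (2), and (5) can be sketched as a small budget guard the agent consults before every tool call. The dollar amounts, warn ratio, and tool names are made-up placeholders:

```python
class BudgetGuard:
    def __init__(self, budget_usd: float, warn_ratio: float = 0.8):
        self.budget = budget_usd
        self.spent = 0.0
        self.warn_ratio = warn_ratio
        self.log = []  # (tool, cost) pairs, for observability

    def try_spend(self, tool: str, est_cost: float) -> bool:
        """Check estimated cost against remaining budget before making the call."""
        if self.spent + est_cost > self.budget:
            return False  # caller should fall back to a cheaper tool or stop
        self.spent += est_cost
        self.log.append((tool, est_cost))
        if self.spent >= self.warn_ratio * self.budget:
            print(f"warning: {self.spent:.2f} of {self.budget:.2f} USD used")
        return True

guard = BudgetGuard(budget_usd=1.00)
# Cheapest tool first; the expensive fallback only runs if the budget allows it.
for tool, cost in [("cache_lookup", 0.00), ("small_model", 0.05), ("large_model", 1.50)]:
    allowed = guard.try_spend(tool, cost)
    print(tool, "->", "ok" if allowed else "denied: over budget")
```

A denied call is a planning signal, not a crash: the agent sees the refusal inside the loop and can re-plan, which is the difference from a hard API-level cutoff.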
Naive approaches either lose context (stateless) or explode context length (pass everything). The agent needs structured memory management, not raw conversation history.
Implement tiered memory: (1) Working memory -- current turn + extracted key facts (name, order_id, issue). (2) Short-term memory -- compressed summary of last N turns. (3) Long-term memory -- persistent store (Redis/DB) keyed by user_id with structured facts. Extract entities each turn, update structured state, reconstruct context from state + last 2 raw turns.
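The three tiers can be sketched as follows. The regex "extractor" stands in for an LLM extraction step, the plain dict stands in for the Redis/DB store keyed by user_id, and the string-truncation "summary" stands in for real compression -- all are assumptions for illustration:

```python
import re
from collections import deque

class TieredMemory:
    def __init__(self, user_id: str, store: dict):
        self.store = store.setdefault(user_id, {})  # long-term: structured facts
        self.recent = deque(maxlen=2)               # working: last 2 raw turns
        self.summary = ""                           # short-term: compressed history

    def add_turn(self, text: str):
        # Entity extraction stub (hypothetical pattern: "order #12345").
        m = re.search(r"order #(\d+)", text)
        if m:
            self.store["order_id"] = m.group(1)
        self.recent.append(text)
        self.summary = (self.summary + " | " + text)[-200:]  # crude compression

    def context(self) -> str:
        """Reconstruct the prompt context from structured state + last 2 raw turns."""
        facts = "; ".join(f"{k}={v}" for k, v in self.store.items())
        return f"facts: {facts}\nrecent: {list(self.recent)}"

db = {}  # stands in for Redis/DB
mem = TieredMemory("user42", db)
mem.add_turn("Hi, my order #98765 hasn't arrived")
mem.add_turn("It was placed last Tuesday")
mem.add_turn("Can you check the status?")
print(mem.context())
```

Note the order_id survives even after the raw turn that mentioned it has rotated out of working memory -- that is the whole point of extracting entities into structured state.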
Agents that chain tools without understanding data dependencies create race conditions and infinite loops. The LLM doesn't inherently understand distributed system consistency guarantees.
Add tool dependency metadata: declare which tools read/write which data. Implement a tool execution planner that respects data flow ordering. Add circuit breakers: max 3 retries per tool, exponential backoff. Detect loops by tracking (tool_name, input_hash) pairs -- if seen twice, break and summarize.
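The (tool_name, input_hash) loop check can be sketched like this; the tool names in the usage lines are invented, and retries/backoff would wrap around this guard separately:

```python
import hashlib
import json
from collections import Counter

class LoopBreaker:
    def __init__(self):
        self.seen = Counter()

    def allow(self, tool_name: str, tool_input: dict) -> bool:
        """Permit a call unless this exact (tool, input) pair has run before."""
        digest = hashlib.sha256(
            json.dumps(tool_input, sort_keys=True).encode()).hexdigest()
        key = (tool_name, digest)
        self.seen[key] += 1
        if self.seen[key] >= 2:  # same tool + same input seen twice -> likely loop
            return False
        return True

breaker = LoopBreaker()
print(breaker.allow("search", {"q": "python"}))  # True: first time
print(breaker.allow("search", {"q": "python"}))  # False: identical call repeated
print(breaker.allow("search", {"q": "rust"}))    # True: different input, not a loop
```

Hashing a canonical JSON dump (`sort_keys=True`) makes the check insensitive to key ordering; on a `False` the agent should stop and summarize progress rather than retry blindly.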
AI Agents — 5 questions