Mastering Claude Code Sub‑Agents: Best Practices for Faster, Low‑Token AI Coding


YouTube video ID: LCYBVpSB0Wo

Source: YouTube video by AI Jason


Introduction

Claude Code recently added a sub‑agent feature. The idea was exciting, but many users ran into slow performance, high token consumption, and disappointing results. After learning how to use sub‑agents properly, the author achieved consistent, high‑quality code generation.

Why Sub‑Agents Were Introduced

  • Claude Code’s main agent has built‑in tools (read file, list files, edit files, etc.).
  • Some tools, especially read‑file, inject the entire file content into the conversation, quickly eating up the token window.
  • When the main agent alone handles a large codebase, it can consume up to 80 % of the context before any implementation starts, forcing a compact‑conversation step that loses important history and degrades performance.
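
The arithmetic behind that 80 % figure can be sketched roughly as follows. The four‑characters‑per‑token ratio is a common heuristic rather than a real tokenizer, and the 200,000‑token window and the file sizes are assumptions chosen for illustration.

```python
# Rough heuristic, not an exact tokenizer: about 4 characters per token.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a piece of text."""
    return len(text) // CHARS_PER_TOKEN


def context_used(file_contents: list[str], window_tokens: int) -> float:
    """Fraction of the context window consumed if every file is read in full."""
    used = sum(estimate_tokens(content) for content in file_contents)
    return used / window_tokens


# Sixteen hypothetical source files of 40,000 characters each.
files = ["x" * 40_000] * 16
fraction = context_used(files, window_tokens=200_000)  # assumed window size
print(f"{fraction:.0%} of the window used before any implementation")
```

Under these assumptions, reading sixteen mid‑sized files in full already fills most of the window, which is the situation sub‑agents are meant to avoid.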

How Sub‑Agents Reduce Token Usage

  1. Task Delegation – The parent agent assigns a specific research task to a sub‑agent.
  2. Isolated Execution – The sub‑agent runs its own tools; its intermediate steps never appear in the parent’s conversation history.
  3. Summarized Return – After completing the research, the sub‑agent sends back a concise markdown summary (a few hundred tokens) that the parent can use to decide the next action.
  4. Context Engineering – By turning heavy‑token operations into tiny summaries, the overall token budget stays low while preserving essential information.
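
A minimal sketch of the four steps above, assuming a toy `Agent` class that only tracks message history. The class, the function names, and the summary format are all hypothetical; no real model or tool is called.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    history: list[str] = field(default_factory=list)

    def tokens(self) -> int:
        # Rough heuristic: about 4 characters per token.
        return sum(len(message) for message in self.history) // 4


def run_subagent(task: str, files: dict[str, str]) -> str:
    """Do the heavy reading in a separate history and return only a summary."""
    sub = Agent("researcher")
    sub.history.append(task)
    for path, content in files.items():
        # Full file contents land in the sub-agent's history only.
        sub.history.append(f"read {path}:\n{content}")
    lines = [f"- {path}: {len(content)} chars reviewed" for path, content in files.items()]
    return f"## Findings for: {task}\n" + "\n".join(lines)


def delegate(parent: Agent, task: str, files: dict[str, str]) -> str:
    """The parent records the delegation and the summary, nothing else."""
    parent.history.append(f"delegate: {task}")
    summary = run_subagent(task, files)
    parent.history.append(summary)
    return summary


parent = Agent("parent")
files = {"app.py": "x" * 20_000, "ui.tsx": "y" * 30_000}
delegate(parent, "map the auth flow", files)
print(parent.tokens())  # small, even though 50,000 characters were read
```

The parent's history grows by two short messages per delegation, regardless of how much the sub‑agent had to read.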

Common Pitfalls

  • Using Sub‑Agents for Full Implementation – When a sub‑agent writes code directly, any mistake forces the parent to re‑orchestrate a new conversation, but the parent lacks detailed context about what the sub‑agent did.
  • Limited Cross‑Agent Memory – Each sub‑agent only knows its own session; it cannot see previous work of other agents, leading to duplicated effort or missing dependencies.
  • Bug Fix Loops – If a front‑end sub‑agent produces buggy UI, the parent cannot efficiently guide a fix because it never saw the actual file changes.

Designing Effective Research‑Only Sub‑Agents

  • Treat every sub‑agent as a specialist researcher (e.g., Front‑End UI expert, Stripe integration expert, Vercel AI SDK expert).
  • Provide each sub‑agent with:
     • Up‑to‑date documentation in its system prompt.
     • Access to custom MCP tools that fetch relevant components or code snippets.
     • Clear goals: “Produce a design/implementation plan and store it in a markdown file; do not write code directly.”
  • After the sub‑agent finishes, the parent agent reads the markdown plan and executes the actual implementation, keeping full context.
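
One way to assemble such a research‑only definition is sketched below. It assumes the Markdown‑with‑YAML‑frontmatter layout that Claude Code uses for sub‑agent files; the agent name, tool list, documentation text, and plan path are illustrative placeholders, not values from the video.

```python
def render_subagent(name: str, description: str, tools: list[str],
                    docs: str, plan_path: str) -> str:
    """Build a research-only sub-agent definition as Markdown text."""
    rules = [
        "Research and plan only; do not write implementation code.",
        f"Save the plan to {plan_path}.",
        "Reply with a short summary and the path of the plan file.",
    ]
    frontmatter = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",
        "---",
    ])
    body = "\n\n".join([
        "## Goal\nProduce a design and implementation plan.",
        "## Documentation\n" + docs,
        "## Rules\n" + "\n".join(f"- {rule}" for rule in rules),
    ])
    return frontmatter + "\n\n" + body + "\n"


agent_md = render_subagent(
    name="frontend-researcher",                      # hypothetical name
    description="Plans UI work and never edits code",
    tools=["Read", "Grep", "Glob", "Write"],         # illustrative tool list
    docs="(paste current component documentation here)",
    plan_path="plans/ui.md",                         # hypothetical path
)
print(agent_md)
```

Keeping the "no implementation" rule and the output location inside the system prompt means every delegation gets them without the parent having to repeat them.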

Context Sharing via Markdown Files

Inspired by Manus’s context‑engineering blog, the workflow stores large tool outputs in local .md files instead of the conversation history:

  1. The sub‑agent writes a research report and implementation plan to .claude/task/<feature>.md.
  2. The parent agent reads this file to gain the necessary context.
  3. Before starting, every sub‑agent reads the shared context file to understand the current project state.
  4. After completing its work, the sub‑agent updates the same file with a summary of actions taken.

According to the author, this file‑based approach noticeably improves success rates and keeps token usage low.
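
The read‑before, update‑after routine can be sketched with two small helpers. The function names, file name, and section layout are assumptions made for illustration; the example writes to a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path


def read_context(path: Path) -> str:
    """Return the shared context, or an empty string if none exists yet."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_summary(path: Path, agent: str, summary: str) -> None:
    """Append one agent's summary as its own section of the context file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"\n## {agent}\n{summary.strip()}\n")


with tempfile.TemporaryDirectory() as tmp:
    context_file = Path(tmp) / "task" / "chat-ui.md"   # hypothetical file name
    # Each agent reads the current state first, then records what it did.
    before = read_context(context_file)
    append_summary(context_file, "ui-researcher", "Plan written for the chat layout.")
    append_summary(context_file, "parent", "Implemented the layout from the plan.")
    print(read_context(context_file))
```

Appending rather than overwriting keeps earlier agents' notes available to later ones, which addresses the cross‑agent memory gap.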

Step‑by‑Step Setup Example

  1. Create a personal agent in Claude Code and add a new sub‑agent (e.g., shadcn Front‑End Expert).
  2. Populate the system prompt with:
     • Goal description.
     • Relevant documentation excerpts (e.g., shadcn component guide, Vercel AI SDK v5 docs).
     • Rules that forbid direct code generation.
     • Output format that points to a markdown file.
  3. Configure MCP tools in the global settings (MCP server) so the sub‑agent can retrieve components, example code, and design references.
  4. Run a project prompt such as “Build a replica of ChatGPT using shadcn for UI and Vercel AI SDK for backend.”
  5. The parent agent creates a project‑wide context file, delegates UI design to the shadcn expert, receives a detailed UI plan, then implements the UI itself.
  6. The same pattern repeats for backend integration with the Vercel AI SDK expert.
  7. Throughout, the parent agent monitors a background session, updates the context file, and can instantly answer user queries because it retains the full execution history.
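
The plan‑then‑implement loop in the later steps can be outlined as below. The specialists are stubbed as plain functions, and every name here is hypothetical; in a real session the research step would be a sub‑agent call and the implementation step would be the parent editing files.

```python
from typing import Callable

# A researcher takes (task, shared_context) and returns a markdown plan.
Researcher = Callable[[str, str], str]


def run_project(prompt: str, specialists: dict[str, Researcher]) -> tuple[list[str], str]:
    """Delegate research per area, then let the parent implement each plan."""
    context = f"# Project\n{prompt}\n"
    log: list[str] = []
    for area, research in specialists.items():
        plan = research(area, context)   # sub-agent: research only
        log.append(f"plan:{area}")
        # The parent implements the plan itself, so the edits stay in its history.
        log.append(f"implement:{area}")
        context += f"\n## {area}\nPlan received ({len(plan)} chars); implemented by parent.\n"
    return log, context


def stub_researcher(task: str, context: str) -> str:
    return f"Plan for {task}"


log, context = run_project(
    "Build a chat app",
    {"ui": stub_researcher, "backend": stub_researcher},
)
print(log)
```

Each specialist sees the context as updated by the areas that ran before it, so the backend researcher in this sketch knows the UI has already been implemented.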

Results and Takeaways

  • The author generated a high‑fidelity UI and fully functional backend in a single run.
  • Token consumption stayed low because heavy file reads were hidden inside sub‑agents.
  • The parent agent retained complete context, enabling quick bug fixes and interactive demos.
  • The workflow scales: add more specialist sub‑agents (Stripe, Supabase, Tailwind, etc.) and reuse the same context‑file pattern.

Bonus Resource

The video also promotes a free guide “Money‑Making AI Agents” by Dimmitri Shapier, covering product launch, pricing, and sales scripts for AI‑powered services.

Treat sub‑agents as focused researchers that return concise markdown plans, store heavy data in external files, and let the parent agent handle all implementation. This design preserves context, cuts token usage, and yields faster, more reliable Claude Code projects.
