Agent vs Canvas vs Flow
Same models, three levels of control. Real scenarios mapped to each mode, and a checklist for picking fast.
PonPon runs the same library of models (the video generators, the image models, the upscalers) through three very different control surfaces. The agent drives everything for you. Canvas lets you compare outputs by hand. Flow lets you wire a repeatable pipeline. New users often pick one and never try the others, which means they end up fighting the wrong tool for much of their work. This guide matches the mode to the job so you stop doing that, with concrete scenarios and a quick decision checklist.
The one-line version: use the agent when you want a finished video from a brief, use Canvas when you want to compare and art-direct, and use Flow when you want to run the same process again and again. The rest of this guide explains where each of those lines actually falls.
Same models, three levels of control
The important thing to understand is that these are not three products of differing quality. They share the same underlying models. What differs is how much of the work you hand over. Think of it as a control dial: the agent sits at one end (hands off), Flow is a structured middle (you design the process, it runs), and Canvas sits at the other end (you do everything yourself). Picking well is mostly about being honest about how much control this particular job actually needs.
The agent: hands-off, brief to edit
When you let the agent run, you give it a brief and it handles planning, model selection, generation, and assembly. You trade fine control for speed and a finished sequence. This is the right mode for the bulk of real work: drafts, ads, social clips, anything where you want a usable edit fast and can refine afterward. If briefing rather than prompting is new, the walkthrough on getting the most out of the agent covers how to run it well.
Strengths: fastest path from idea to finished cut; built-in shot planning; consistency across shots. Trade-off: it makes the production decisions, so when you want a specific frame done a specific way, you have outgrown it for that shot.
Canvas: compare and art-direct
Canvas is the manual workspace. You write a prompt, run it across several models at once, and pick the winner — then iterate on that winner shot by shot. This is where you go for hero shots and for the moments where your eye, not an automated plan, should make the call. The cost is time: running prompts side by side and judging each output is slower than handing over a brief, which is exactly why it is wasteful for routine drafts and valuable for the shots that carry a project.
Canvas is also where you learn what the models can do. Comparing outputs teaches you which model suits which look, knowledge that then makes your agent briefs sharper.
Strengths: total creative control; model comparison; best output per shot. Trade-off: slow for volume; you do the sequencing yourself.
Flow: build a repeatable pipeline
Flow is for when the process matters more than any single video. You wire models together as nodes (generate, then upscale, then remove background, in a fixed order) and run that pipeline on input after input. If you produce the same kind of asset repeatedly, a node-based pipeline turns a manual checklist into one reusable graph. The investment is the setup; the payoff is consistency at volume. For a deeper look at the two manual modes, our Canvas and Flow head-to-head comparison goes further than there is room for here.
Strengths: repeatability; consistency at scale; no manual steps once built. Trade-off: setup cost; overkill for one-offs.
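To make the node idea concrete, here is a minimal Python sketch of a fixed-order pipeline run over several inputs. It is not PonPon's API: `generate`, `upscale`, `remove_background`, and `build_pipeline` are hypothetical stand-ins that only tag a string, so you can see the order in which the nodes run.

```python
from typing import Callable, List

# A "node" here is any function that takes an asset and returns a new one.
# All names below are hypothetical placeholders, not PonPon functions.
Node = Callable[[str], str]


def generate(asset: str) -> str:
    return f"generated({asset})"


def upscale(asset: str) -> str:
    return f"upscaled({asset})"


def remove_background(asset: str) -> str:
    return f"cutout({asset})"


def build_pipeline(nodes: List[Node]) -> Node:
    """Chain nodes so each one receives the previous node's output."""
    def run(asset: str) -> str:
        for node in nodes:
            asset = node(asset)
        return asset
    return run


# Build the graph once, then feed it input after input.
pipeline = build_pipeline([generate, upscale, remove_background])
results = [pipeline(item) for item in ["shoe", "mug"]]
print(results)
```

The setup cost is the `build_pipeline` call; after that, every new input is a single call that goes through the same steps in the same order, which is where the consistency comes from.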
Which mode fits which job
| You want to... | Use | Why |
|---|---|---|
| Get a finished video from a brief | Agent | It plans, generates, and assembles for you |
| Compare models on one prompt | Canvas | Side-by-side output, you pick the winner |
| Art-direct a hero shot | Canvas | Frame-level manual control |
| Run the same process repeatedly | Flow | A reusable node pipeline |
| Make a multi-shot narrative fast | Agent | Built-in shot planning |
| Produce assets at volume, consistently | Flow | One pipeline, many inputs |
Real scenarios, mapped
The table is abstract; here is how it plays out.
- "I need a product ad by end of day." Agent. Brief it, refine the cut, ship. Manual modes would cost you the deadline.
- "I am locking the visual look for a campaign." Canvas. Run the key prompt across models, compare, and pick the look you will standardize on.
- "I cut out and upscale 50 product shots a week." Flow. Build the pipeline once; feed it the week's inputs.
- "I want a cinematic multi-shot scene." Agent for the draft; for long-form sequenced work, the multi-shot cinema mode extends the same idea to full scenes.
- "This one hero shot has to be perfect." Canvas. Art-direct it by hand, then drop it back into the larger edit.
Notice the pattern: deadline and volume push you toward automation (agent, Flow); a single high-stakes shot pulls you toward manual control (Canvas).
How the three work together
The modes are not rivals; they are a progression. A common workflow is to draft with the agent, lift the one shot that must be perfect into Canvas to art-direct it, and — if the project becomes recurring — codify the whole thing as a Flow pipeline so the next one is push-button. None of this requires committing to a single tool up front; you move a project between modes as its needs change, and because they share models, nothing is lost in the move.
A quick decision checklist
- Do you have a deadline and want a finished video? → Agent.
- Are you choosing between models or perfecting one shot? → Canvas.
- Will you run this exact process many times? → Flow.
- Is it a one-off, illustrative, single clip? → Agent, or a single generation.
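If it helps to see the checklist as a procedure, the sketch below encodes it as a small Python function. The function name, the parameter names, and the choice to test the questions in the order they are listed are all assumptions made for illustration; the guide itself does not define a priority when more than one answer is yes.

```python
def pick_mode(needs_finished_video: bool = False,
              comparing_or_perfecting: bool = False,
              repeats_process: bool = False) -> str:
    """Return a suggested mode, checking the questions in checklist order."""
    if needs_finished_video:
        return "Agent"
    if comparing_or_perfecting:
        return "Canvas"
    if repeats_process:
        return "Flow"
    # None of the above: treat it as a one-off clip.
    return "Agent, or a single generation"


print(pick_mode(repeats_process=True))
```

A project can answer differently at different stages, so call it per job rather than once per project.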
When none of these is the answer
Honesty matters here. If your video's value is a real human being (a founder to camera, a genuine testimonial), no control mode changes the fact that audiences respond to authenticity, and a real person may serve you better. And if you only ever need one clip, one time, you do not need Flow's pipeline or Canvas's comparison. Pick the lightest tool that does the job. The point of three modes is not to use all three; it is to never be stuck in the wrong one. If the idea of an agent that takes a brief is still new, our explainer on the agent concept covers it before you decide how hands-on to be.