Current interfaces conflate “what” with “how”. Users must navigate implementation logic to achieve goals.
Want to book travel? Search flights, compare prices, cross-check policies, calculate costs, coordinate timing, manage multiple systems. The outcome is simple: “get to Berlin on these dates”. The execution is complex: death by a thousand clicks.
The intention-first paradigm:
- User specifies high-level intention
- System determines optimal execution
- AI infers goals, evaluates options, executes workflows, handles ambiguity, learns from outcomes
The technical foundation exists. What’s missing is trust—and the design frameworks that enable trust without visibility.
The proper division of labor:
- Humans: desire, judgment, meaning-making
- Machines: execution, coordination, optimization
We should each do what we’re best at.
Think of Tony Stark working with JARVIS in Iron Man. Stark doesn’t operate the suit by pulling levers or pressing buttons. He states intentions: “divert power to repulsors", “run diagnostics on arc reactor”, and JARVIS executes. The technical complexity is delegated. Stark focuses on strategy, creativity, and judgment.
That’s not passivity. That’s augmentation.
When I say “intention over instruction", I mean freeing humans to focus on what we’re uniquely good at: vision, creativity, judgment. While AI handles what it’s uniquely good at: execution, coordination, optimization.
The goal isn’t to make us lazy. It’s to make us powerful.
The death of the "Mode": Today, we are forced to choose a "mode" before we act. We decide to open a chat app (to type), or press a microphone button (to speak), or pick up a mouse (to click). We have to pre-configure our input method.
Post-Interface Design abolishes the "mode".
In a truly intelligent system, input is fluid.
- I might point at a lamp (gesture).
- And say, "Make it warmer" (voice).
- While looking at my book (gaze context).
The system doesn't need three separate apps for this. It fuses the signals. It understands that "It" refers to the lamp I'm pointing at, and "Warmer" refers to the light temperature, not the heat, because it knows I am reading.
We don't stop typing. We don't stop talking. We just stop switching. We use whatever signal is most natural in the millisecond, and the machine catches it all.