Aanubhuti.design

✦ Microsoft 365 · 2024 to 2025 · Generative AI

From a thought
to an artifact.

Role
Lead Designer · Create on Mobile
Surface
Copilot mobile · Chat · Create
Year
2024 to 2025
North-star
Make creation a weekly habit
US Patents
2 filed · Voice · Mobile-first edit

Designing the creation layer for Microsoft 365 Copilot on mobile, turning a tab people rarely visited into an agent that meets them in conversation, with a framework and composer pattern now travelling across the agent ecosystem.

The lowest-engagement surface in Copilot, rebuilt as the agent that now anchors Microsoft's mobile creation strategy, with two US patents filed and a framework other agent teams build on.

✦ How I led

  • Co-authored both US patents with eng partners
  • Drove framework review across Designer · Researcher · Analyst directors
  • Ran a 2-week design sprint with 18 research participants
  • Pushed for the model migration that unlocked the SAT lift
  • Wrote the spec that became the cross-agent composer standard
  • Made the contrarian Pill call against 3 teams pushing Drawer

✦ The moment, in motion

A thought, typed in chat, and an agent that runs the rest.

The shape of the bet, in 40 seconds. Inline gathering, long-running render, focus scroll, results, all without leaving the conversation. Everything that follows in this case study derives from this single moment.

In-chat creation agent · mobile capture

✦ Why this mattered

The lowest-engagement surface in Copilot, on the device most users carry.

Microsoft was betting the company on Copilot. Mobile was the surface most users actually carried, and Create was its weakest funnel. Without a credible answer there, Copilot's enterprise narrative would lose ground to consumer-grade alternatives users were already turning to (Canva, Adobe Express, ChatGPT image).

The bet I framed wasn't to fix the Create tab. It was to dissolve it, push creation into the surfaces users were already in, write a stage vocabulary the org didn't yet have, and ship a composer pattern other agent teams could build on. That meant breaking the established model, betting on a stronger image generator, and aligning four product orgs around a single way of saying 'create.'

✦ Outcomes at a glance

+15%

discoverability lift on the new composer

more users attempted artifact creation

41%

of triers returned to create again

+12pt

SAT on hard cases post-model migration

✦ Beyond the metrics

  • 02 · US patents filed · Voice creation · Mobile-first text editing
  • 03 · Agent teams on the framework · Designer · Researcher · Analyst
  • 01 · Cross-agent composer · Same Pill, three agents, one mental model

Outcomes are described directionally for portfolio purposes; the underlying telemetry, OKRs and quality-lift figures are available on request in an appropriate context.

01, The friction

A small moment, told too often.

A user opens Copilot on their phone. They want a quick infographic for a team review. They tap, type, switch the dropdown if they can find it, wait. The render is long-running, and the moment they swap apps to check Slack, it dies. The result, when it survives, is riddled with spelling errors. There's nowhere to find it again.

The person who lived through that moment didn't open Create again the next week. Multiplied across the funnel, that moment is the OKR, directly and mechanically.

02, Shape of the problem

Three failures, one funnel.

01

The tab people never visited

Mobile Copilot users overwhelmingly stayed in chat. Fewer than 8% ever reached the Create tab in a given week, and of those, fewer than 3% triggered a generation. Creation was a destination people had to find, not a moment they fell into.

<8% reached Create · <3% generated

02

The intents nobody discovered

Even inside Create, almost no one switched the intent dropdown: under 4% of sessions ever changed it from the default. Posters, infographics, drafts and stories were technically available, functionally invisible. The system had a vocabulary; the user couldn't see it.

<4% of sessions switched intent

03

The output that didn't earn trust

Pre-4o images were 'very, very AI': typo-ridden, cartoonish, unusable in a professional context. Long-running artifacts died when the app backgrounded; abandoned tasks accounted for ~30% of session loss. With no history, users feared losing what they did make.

~30% of sessions lost to backgrounded renders

Top issues identified across telemetry and user research, mapped to next steps.
The diagnosis · top issues mapped to design responses

03, The bet

Three prongs, one design intent.

Not a roadmap. A way of thinking about creation as something that happens where the user already is, across more of what they actually need, with the fidelity to be reused. I framed the three prongs to leadership as Reach · Coverage · Quality, and held the line on all three even as scope pressure tried to collapse them into one.

Reach

Meet creation where work already happens

Stop treating Create as a destination. Push the same creation intelligence into chat as a first-class agent and into Bizchat as an inline composer, so users never have to leave the conversation to make something.

Inline · agentic · ambient

Bizchat · inline composer in conversation

Coverage

Span the artifacts a working day actually needs

Beyond images: posters, drafts, infographics, stories, podcasts, video, charts. Each intent gets enterprise grounding (brand, templates, internal context) so the output starts where work already is, not from zero.

Image · Poster · Draft · Story · Infographic · Podcast · Video · Chart

The intent map · existing artifacts and the new ones that complete the span

Quality

Earn the next iteration

Move to a stronger image model. Render in the background like the apps users compare us to. Keep a library so nothing is lost. Add fine-grained editing on canvas. Quality isn't a single ship; it's the loop that keeps people coming back.

Better model · background render · history · editable canvas

04, The framework

An agent is a journey, not a button.

Long-running creation needs a shape. We named four stages and a vocabulary of building blocks (contextual ZQ, capable selection, cards for long-running tasks, robust internal and external notifications) so any agent (Designer, Researcher, Analyst) could speak the same way.

Stage 01

Gathering

The agent acknowledges the ask, surfaces what it's pulling in, and lets the user shape the inputs (style, size, brand, references) before any pixels move.

Stage 02

Drafting

A long-running task that doesn't trap the screen. Cards minimize and expand. The user can scroll, switch apps, come back; work continues.

Stage 03

Reviewing

A focused state where the agent narrates what it checked and the user can intervene before commit. Errors and edge cases get a fallback path, not a dead end.

Stage 04

Results & Sources

The artifact lands with provenance: what was used, where it came from, and a one-tap path to iterate. The session becomes a memory, not a moment.
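For a sense of how the four stages might travel between teams as an actual contract, here is a minimal TypeScript sketch. Every name in it (Stage, StageState, CreationAgent) is hypothetical and illustrative of the framework's vocabulary, not the shipped spec.

```typescript
// A minimal sketch, with hypothetical names, of the four-stage
// vocabulary as a shared agent contract. Illustrative only.

type Stage = "gathering" | "drafting" | "reviewing" | "results";

interface StageState {
  stage: Stage;
  cardExpanded: boolean; // cards minimize while long-running work continues
  sources: string[];     // provenance travels with the artifact into Results
}

interface CreationAgent {
  id: "designer" | "researcher" | "analyst";
  // Every agent narrates the same four stages, so any surface
  // (chat, Bizchat, embedded) can render them identically.
  onStageChange(next: StageState): void;
  // Backgrounding must not kill the task: work continues, and a
  // notification fires when results are ready.
  notify(kind: "in-app" | "push", message: string): void;
}
```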

Designer Agent · the four stages, end-to-end
Designer Agent on mobile, the four-stage journey from gathering to results.
Annotated keyframes · gathering, drafting, reviewing, results
Gathering and selection, cards minimize and expand; multi-input selection.
Gathering · cards minimize, expand, accept multi-input selection
Outcome, single and multi-artifact creation, active card scrolls to focus, push and in-app notifications.
Outcome · interaction patterns, focus management, notifications

05, The composer crux

The keystroke that has to do everything.

On mobile, the composer is the product. It has to invite a prompt, expose intent, accept multi-modal input, switch between agents, and stay coherent across desktop, mobile and embedded surfaces. Three principles came first.

🪜

Anchor in chat, extend to creation

Leverage what people already know: typing into a chat box and getting a response. Then layer scaffolds (chips, pills, prompts) on top, because creation expects structured outputs that pure conversation can't carry alone.

🔍

Make intent discoverable

Auto-detection isn't reliable yet, and the dropdown switcher had near-total drop-off. The composer has to expose its capability before the first character is typed.

🧬

Build for one system, many agents

Designer, Researcher, Analyst, each with its own intents and dimensions. The composer pattern has to scale across agents, surfaces (desktop, mobile, embedded) and entry points without splintering.
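One way to read "one system, many agents" is as a single composer contract instantiated per agent. The sketch below is a hypothetical TypeScript rendering of that idea; the type names, intent IDs and refiner strings are illustrative assumptions, not the shipped composer spec.

```typescript
// A hedged sketch of one composer contract serving many agents.
// All names here are invented for illustration.

interface Intent {
  id: string;    // e.g. "poster", "infographic"
  label: string; // shown in the Pill row before the first keystroke
}

interface ComposerConfig {
  agent: "designer" | "researcher" | "analyst";
  intents: Intent[];  // exposed up front, not behind a dropdown
  refiners: string[]; // dimensions: aspect ratio, style, brand
  surfaces: Array<"desktop" | "mobile" | "embedded">;
}

// The same Pill pattern, instantiated for one agent:
const designer: ComposerConfig = {
  agent: "designer",
  intents: [
    { id: "image", label: "Image" },
    { id: "poster", label: "Poster" },
  ],
  refiners: ["aspect-ratio", "style", "brand"],
  surfaces: ["desktop", "mobile", "embedded"],
};
```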

✦ Three composer explorations

Each one trades something different.

Exploration 01

Drawer

What it gets right

Best continuity with the existing desktop composer; structurally simplest to ship.

Where it strains

Intent discovery is broken: the switcher had documented drop-off rates inside the Create module. Bottom-bar functions lose coherence.

Drawer · refiners and dimensions tucked into a slide-up sheet.

Exploration 02

Pill

What it gets right

Intents are visible before the first keystroke. Refiners stay in a single inline button. The pattern scales across other agents (Researcher, Analyst).

Where it strains

Breaks the established composer model, but in a way that pays the discoverability tax up front instead of every session.

Pill · intents exposed as a top row above the composer.

Exploration 03

Combined add

What it gets right

Aligns with the prevailing chat-app idiom; cleanest hierarchy; best raw fit for a small mobile screen.

Where it strains

Requires a structural change at the Bizchat layer; intent discoverability still routes through one tap of indirection.

Combined add · agent change and tools collapsed into a single + button.

✦ The call I made

Pill won, against three teams pushing Drawer.

Drawer was ~6 weeks faster to ship and structurally simpler. Three teams pushed for it. I argued the opposite, because the discoverability data made the trade obvious: Drawer paid the intent-discovery tax every session; Pill paid it once, upfront, and scaled across agents. Across Microsoft's mobile user base, that math wasn't close.

The cost was real: it broke the established composer model, slipped the schedule, and required eng investment at the Bizchat layer. The return was structural: intents visible at first glance, refiners tucked inline, and the same pattern living unchanged on Researcher and Analyst. The composer became a system, not a one-off.

+15% discoverability lift in creation attempts · 3 agents on the same composer
The Pill composer pattern scaled across other agents, Researcher and Analyst.
Scaling · the same Pill pattern on Researcher and Analyst

✦ A note on dimensions

Refiners that don't crowd the keystroke.

Pill solves intent discovery; dimensions solve precision. Aspect ratio, style, references and brand sit one tap away, surfaced as a drawer the moment the user reaches for them, hidden the moment they don't. Refinement belongs to the second thought, not the first.

Dimensions · refiners drawer, summoned on demand

06, Two inventions, one thesis

✦ US Patents · two filed

Editing text on mobile, without the keyboard.

Mobile keyboards are where creation flow goes to die. They eat half the screen, force two-thumb posture, and turn every micro-edit into a context switch out of the artifact. The second US patent attacks that directly: a mobile-first interaction model for editing text in a generated artifact without ever raising the keyboard.

Tap a phrase to select it. Speak the change, or pick from AI-suggested rewrites surfaced inline. Tone, length, tense, formality, adjusted by gesture, not by typing. The artifact stays in view; the keyboard stays down.

US Patent 01 · Voice-led creation

Speak the artifact into being.

A method for multi-modal mobile creation where voice is the primary input and the screen is augmentation, letting users go from thought to artifact without surrendering both hands to the keyboard.

US Patent 02 · Mobile-first text editing

Edit text without lifting a keyboard.

A mobile-first interaction model for editing text in generated artifacts without keyboard input: tap to select, speak or pick AI-suggested rewrites, adjust tone and length by gesture. The keyboard stays down; the artifact stays in view.

Mobile text editing · tap to select, speak or pick a rewrite, keyboard never opens

Built for the thumb, not the cursor

Selection by tap, modification by voice or one-tap suggestion. The whole interaction is reachable from a single hand, the way mobile is actually held.

AI does the typing

Inline rewrite suggestions (shorter · friendlier · more formal · fix this) cover the long tail of edits that would otherwise force a keyboard. The user picks; the model writes.

The keyboard stays down

Editing happens on top of the artifact, not under a half-screen of QWERTY. Visual context never breaks, and the loop from 'change this' to 'changed' collapses to a tap.
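As a thought experiment, the keyboard-less loop described above can be reduced to a small edit pipeline: a tap defines a selection, a spoken change or a one-tap rewrite hint defines the edit, and the model produces the replacement. The TypeScript below is a hypothetical sketch of that loop; none of these names come from the patents themselves.

```typescript
// A hedged sketch of the keyboard-less edit loop: tap to select,
// then apply a voice transcript or a one-tap rewrite hint.

type RewriteHint = "shorter" | "friendlier" | "more-formal" | "fix-this";

interface Selection { start: number; end: number; }

interface EditRequest {
  selection: Selection;
  voiceTranscript?: string; // spoken change, e.g. "make this past tense"
  pickedHint?: RewriteHint; // or a one-tap inline suggestion
}

// The keyboard never opens: the model writes the replacement text,
// and the artifact stays in view while the edit lands in place.
async function applyEdit(
  artifactText: string,
  edit: EditRequest,
  rewrite: (selected: string, edit: EditRequest) => Promise<string>,
): Promise<string> {
  const { start, end } = edit.selection;
  const replacement = await rewrite(artifactText.slice(start, end), edit);
  return artifactText.slice(0, start) + replacement + artifactText.slice(end);
}
```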

07, What changed

A framework, a composer, and a new way of saying 'create'.

Quality, finally trusted

Moving image generation to the new model lifted SAT by ~12 points on the harder cases (multi-turn, image references, aspect ratios). Users started saying things like 'massive improvement' and 'first try'. I drove that migration internally; quality was the unblock.

A vocabulary the org could use

Gathering · Drafting · Reviewing · Results became shorthand inside Microsoft for how an agent should narrate long-running work, picked up across web and mobile, cited in cross-team specs as the default vocabulary for agent journeys.

A composer pattern that travels

Pill is shipping as the cross-agent standard, same affordances on Designer, Researcher and Analyst, in chat and in Bizchat. One spec, three agents, one mental model for users to learn.

✦ The principles, distilled

Three things mobile Create has to be.

🎙️

Enterprise-grounded multimodal input

Voice, image, file, text, and the system stitches them to people, meetings, docs. Inputs adapt to how mobile users already work, not the other way around.

🗣️

Voice-led interaction

Speak to create. The screen becomes augmentation, not the centre of gravity. Frees the hands, frees the screen, makes mobile creation feel native.

🧠

From thought to artifact, with memory

Begin with a thought. End with something send-ready. And the work-in-progress remembers itself, so the loop closes and the user comes back next week.

08, What I learned

The composer is the operating system.

Discoverability unblocks the OKR

Before any single artifact was worth optimising, the funnel was leaking at the entry. The single biggest design lever wasn't a better composer; it was making capability visible at all.

An agent is a journey, not a feature

Naming the four stages (Gathering, Drafting, Reviewing, Results) gave every other team a shared vocabulary. The framework travels, even when the surface changes.

The composer is the operating system

On mobile, the composer carries the whole product. Every gram of weight on it has to earn its keep. Pill won not because it was prettier, but because it surfaced what the user couldn't otherwise see.

✦ What this work proves

A thesis for AI on
the device users carry.

Mobile-first AI creation is a different design problem than desktop. It demands a composer that makes capability visible before the keystroke, an agent that owns its journey end-to-end, and an editing layer that turns generation into negotiation.

These three patterns now shape how Microsoft ships AI on mobile, and they generalize. Any agent-driven creation product that lives on a phone will eventually converge on something close to them. What I'm proudest of isn't any single screen; it's that the vocabulary I wrote got picked up.

The next leap is keyboard-less, voice + touch as the primary mode of mobile creation, and editing text without ever raising the keyboard. Both US patents stake out that ground; the Concepts work below shows where it goes hands-on.