Transforming the Anti-Craft Machine - Part 2
Launch the Pilot
You’ve run the diagnostic. You’ve cleaned house on dead features. Leadership is at least willing to try something different. Now comes the hard part: actually launching a pilot team that can prove craft works in your org.
This is where most transformations fail. Not because the idea is wrong, but because the pilot team gets crushed by organizational pressure within a few weeks. Here’s how to keep them alive long enough to show results.
The Minimum Viable Change
You cannot change everything at once. You’ll lose. Pick one team. The smallest unit that can own a complete customer problem. Three to eight people. Product manager, engineers, designer.
Give them one problem to solve. Not a feature to build. A problem. “Customers churn within 90 days because they can’t figure out how to integrate our product into their workflow” is a problem. “Build an onboarding wizard” is a feature.
Change their metrics.
All of them. The team is collectively accountable for:
Adoption rate of whatever they ship (percentage of target customers actively using it)
Time to value (how long until customers get meaningful results)
Single Ease Question (SEQ) scores (ask “How easy was it to complete this task?” on a 7-point scale immediately after they use the feature)
Retention improvement for customers who use it versus those who don’t
Remove their individual metrics. No story points. No velocity. Just shared accountability for customer outcomes.
Give them authority to make tradeoffs. They can simplify the solution to increase adoption. They can say no to requirements that won’t drive outcomes. They can build less but better. They can kill features from the plan that dilute the core value. They decide what will actually get customers to adopt and get value, not what checks every stakeholder’s box.
Protect them.
When sales wants a custom feature for a deal, the answer is no unless it serves the problem this team owns. When an executive has an idea, it goes in the backlog like everything else. When someone complains they’re moving too slow, you point to the outcomes.
This team becomes your proof of concept. If they can’t show better outcomes within two quarters, craft won’t work in your org. If they do, you have evidence to expand.
How to Actually Align Incentives
You can’t just “add” customer outcome metrics to existing metrics. That’s how you get 15 OKRs and nobody knows what actually matters.
You have to replace metrics. Not supplement. Replace.
For product managers
Old metrics: Features launched, deals closed, revenue influenced
New metrics: Adoption rate of shipped features, time to value for customers, SEQ scores, retention improvement tied to product quality
How to calculate each new metric
Adoption rate: Users who completed the feature’s core action ≥2x in 30 days ÷ Total users with access. Example: 450 users integrated ÷ 1,200 users with access = 37.5% adoption.
Time to value: Median days from feature availability to first meaningful action. Track in product analytics. Example: Median 11 days from invitation sent to first successful integration.
SEQ scores: Average response to “How easy was it to complete this task?” (1–7 scale) triggered immediately after core action. Example: 1,240 responses, average 5.9 out of 7.
Retention lift: 90-day retention rate for users who adopted feature minus baseline retention. Example: Feature adopters: 78% retained. Baseline: 65% retained. Lift: +13 points.
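If you track these in a spreadsheet or product analytics tool, the arithmetic is simple enough to sanity-check in a few lines. A minimal Python sketch using the example numbers above; the function names and inputs are illustrative, not a prescribed implementation:

```python
from statistics import median

def adoption_rate(adopters: int, users_with_access: int) -> float:
    """Users who completed the core action >=2x in 30 days, over users with access."""
    return adopters / users_with_access

def time_to_value(days_to_first_action: list[int]) -> float:
    """Median days from feature availability to first meaningful action."""
    return median(days_to_first_action)

def seq_score(responses: list[int]) -> float:
    """Average answer to the Single Ease Question on a 1-7 scale."""
    return sum(responses) / len(responses)

def retention_lift(adopter_retention: float, baseline_retention: float) -> float:
    """90-day retention for adopters minus baseline, in percentage points."""
    return round((adopter_retention - baseline_retention) * 100, 1)

# Worked against the example numbers above:
print(adoption_rate(450, 1_200))   # 0.375 -> 37.5% adoption
print(retention_lift(0.78, 0.65))  # 13.0  -> +13 points
```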
Engineers and designers get measured on:
Same metrics. Shared accountability for adoption, time to value, SEQ scores, retention.
No code coverage targets. No lines of code. No “designs approved” or “design system contributions.” The only question: did customers adopt what we built and did it improve their outcomes?
How comp structure changes
Bonuses
70% based on team outcomes (adoption, time to value, SEQ, retention), 30% based on company outcomes (revenue, overall retention). No individual metrics.
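For concreteness, the split is just a weighted sum. A minimal sketch, assuming attainment is scored as a fraction of goal hit (that scoring scale is an assumption, not part of the plan above):

```python
def bonus_payout(target_bonus: float, team_attainment: float,
                 company_attainment: float) -> float:
    """70% weighted on team outcomes, 30% on company outcomes.
    Attainment is a fraction of goal hit (e.g., 0.9 = 90% of target)."""
    return round(target_bonus * (0.70 * team_attainment + 0.30 * company_attainment), 2)

# e.g., a $20,000 target bonus, team at 90% of outcome goals, company at 80%:
print(bonus_payout(20_000, 0.90, 0.80))  # 17400.0
```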
Promotions
“Shipped features customers loved and used” becomes the criterion. Not “shipped on time” or “led three projects.”
Performance reviews
Template changes to:
What customer problem did you help solve?
What were the outcome metrics? (Include adoption %, time to value, SEQ, retention lift)
What did you learn that will make the next thing better?
The transition plan (because you can’t flip overnight)
Quarter 1
Announce new metrics for pilot team only. Rest of org stays on old metrics. Pilot team comp is 50% old metrics, 50% new metrics (hedge during transition).
Quarter 2
Pilot team comp is 100% new metrics. Start measuring new metrics for three expansion teams, but don’t tie comp yet (tracking period).
Quarter 3
Expansion teams move to 50% old, 50% new metrics. Pilot team shows results to broader org.
Quarter 4
Expansion teams move to 100% new metrics. Half the org is now tracking new metrics.
Quarter 5–6
Scale new metrics to entire product org.
You can’t skip the transition. Moving everyone overnight triggers panic and resistance. Gradual rollout with visible results gives doubters time to see it works.
Navigating the Political Landmines
The VPs will resist. They’ll say you need revenue metrics. They’ll say engineering needs delivery metrics. They’ll say this is too risky.
Before you push back, find an executive sponsor. Ideally the CEO, but at minimum someone with real authority who believes craft matters. Without air cover, the next part can get you fired.
How to identify your executive sponsor
Look for someone who:
Has asked “why don’t customers use our features?” in the last 90 days
Has direct P&L or product authority (not a staff role)
Has fired or threatened to fire someone over product quality
Has mentioned a competitor and said “their product feels better than ours”
Has at least four years’ tenure (they’ve seen enough cycles to value durability)
The conversation to have with them
“I pulled the data on our last 10 shipped features. Average adoption: 14%. Three features have under 5% adoption. That’s $2.3 million in engineering time building things customers don’t use. Our retention trails [competitor] by 18 points.
The current metrics got us here: launches, velocity, features shipped. If we keep measuring the same things, we’ll keep getting the same results.
I want to run a pilot. One small team, one customer problem, new metrics: adoption, time to value, satisfaction, retention. Two quarters. If outcomes improve, we expand. If not, we kill it and I’ll own the failure.
I need you to protect them from sales escalations and exec requests for two quarters. Can you do that?”
If they say yes, you have a sponsor. If they hedge (“sounds interesting, let me think about it”), they’re not your sponsor. Keep looking.
How to build a coalition among peers
Don’t present to all VPs at once. You’ll get crossfire and they’ll protect their turf.
Instead: One-on-one conversations with each VP. Customize the pitch:
VP Engineering
“Our velocity looks high, but we’re rebuilding the same things every six months because we didn’t build them well the first time. Technical debt is compounding. This pilot will slow initial velocity but reduce rework long-term.”
VP Sales
“We keep building custom features to close deals, then 80% of customers don’t use them. That’s not a moat. This pilot will build features customers actually adopt, which improves retention and gives you referenceable customers.”
VP Customer Success
“We’re drowning in tickets because features are confusing. SEQ scores will force us to ship things customers can actually use, which reduces support load.”
Get each VP to yes individually before the group meeting. By the time you present to everyone, you have allies in the room.
What to do if CEO says yes but VPs undermine it
Month 1
VP asks pilot team to “quickly add” a feature for a deal. You email CEO: “Pilot team received request from [VP] to work on [feature] for sales deal. This dilutes focus on [problem]. Confirming pilot team priority: [problem] through [date], correct?”
Month 2
VP complains pilot team is moving too slow. You email CEO: “Pilot team shipped first iteration. Early metrics: 35% adoption, SEQ 5.8. [VP] is concerned about velocity. Want to confirm we’re optimizing for outcomes, not speed. Align?”
Month 3
VP tries to add metrics to pilot team. You email CEO: “Pilot team metrics: adoption, time to value, SEQ, retention. [VP] wants to add velocity tracking. This muddies accountability. Can you reconfirm pilot metrics with [VP]?”
You’re not tattling. You’re giving your sponsor clear decision points. Every time a VP undermines the pilot, your sponsor reinforces the boundary or admits they can’t protect you. Either way, you know where you stand.
How to use board members or advisors as leverage
If you have no internal executive sponsor, look external.
Board member who’s built product companies
“I’m trying to transform how we build product. Current approach: 12% feature adoption, retention trails competitors by 15 points. Can I walk you through a pilot plan and get your feedback?”
If they like it: “Would you be willing to mention this to [CEO] as something worth trying?” Board signal often unlocks CEO attention.
Advisor who scaled craft-focused companies
Same approach. “Can you share what worked at [their company]?” Then: “Would you talk to our CEO about this?”
External validation from someone the CEO respects can substitute for internal sponsorship, but only temporarily. You still need to convert an internal exec within two quarters or the pilot dies when the advisor’s attention shifts.
Protection Playbook: Rituals and Responses
Once you have executive sponsorship, protection becomes a set of repeatable moves you execute weekly.
Weekly rituals that create boundaries:
Monday: Review escalation requests. Any sales custom feature request? Engineering lead emails: “Thanks for flagging. This doesn’t align with [team’s problem statement]. Parking in the general backlog for future consideration.”
Tuesday: Check exec calendar invites. Did someone invite the team to a strategy session, roadmap review, or planning meeting outside their scope? Product lead declines: “Team is focused exclusively on [problem] through [date]. Can sync after we measure outcomes.”
Thursday: Review what the team said yes to this week. Did they commit to anything that dilutes focus? Kill it in standup: “That’s not our problem to solve. Who owns that? Let’s hand it off.”
The four fights you’ll have (and what to say)
Sales: “This feature blocks a $500k deal.”
You: “What’s the evidence customers will adopt it after we close the deal? Last three sales-driven features had 8% adoption. That’s $1.2 million in engineering time for features customers don’t use. Show me adoption evidence, or it goes in the backlog.”
Exec: “Why can’t they also work on [strategic initiative]?”
You: “They’re accountable for improving retention for [customer segment] by 10 points. We measure that in six months. Splitting focus means we measure nothing. Which outcome matters more?”
Engineering: “Pilot team shipped one thing this quarter. We used to ship four features per quarter. This is killing our velocity.”
You: “We used to ship four features with 12% average adoption. Pilot team shipped one feature with 58% adoption. Customers care about the second number. Velocity means improving customer outcomes, not deploying code nobody uses.”
Team reverts to specs:
When product starts writing a detailed spec: “What’s the evidence customers will adopt this? Don’t write a spec until you’ve talked to 8–10 customers, prototyped three approaches, tested the simplest one, and confirmed customers will actually use it. Then write the spec for what you validated, not what you guessed.”
Monthly check-ins to maintain air cover
Month 1: Meet with executive sponsor. Show pilot team focus: “One problem. Four metrics. No distractions. Will have data in 60 days.”
Month 2: Meet with executive sponsor. Show early signals: “Shipped first iteration. 35% adoption in two weeks versus 12% historical. SEQ scores 5.8 out of 7. Still early, but directionally good.”
Month 3: Meet with executive sponsor + VPs who’ve been pushing back. Show results: “Adoption: 58%. Time to value: down from 18 days to 6. SEQ scores: 6.2. Retention: +12 points for customers who adopted. This is what craft looks like.”
If you can’t execute these moves weekly, don’t start the pilot. The team will drown in organizational pressure within six weeks.
What End-to-End Ownership Actually Looks Like
Stop writing specs. Seriously. If product is writing detailed specs that get handed to design that get handed to engineering, you’re still running a feature factory.
Instead
Product brings a customer problem with evidence (not a solution). Template:
Problem statement: [Customer segment] can’t [achieve outcome] because [obstacle].
Evidence: Talked to [number] customers. [X]% confirmed this is their top-3 problem. Current behavior: [what they do instead]. Cost of current approach: [time/money wasted].
Success metrics: Adoption target [%], time to value target [days], SEQ target [score out of 7], retention lift target [points].
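One way to keep the template honest is to treat it as structured data the team must fill in completely before work starts. A minimal sketch; the field names and example values are illustrative:

```python
from dataclasses import dataclass

@dataclass
class ProblemBrief:
    """Structured version of the template above; all fields are illustrative."""
    segment: str                   # [Customer segment]
    blocked_outcome: str           # what they can't [achieve outcome]
    obstacle: str                  # because [obstacle]
    customers_interviewed: int     # evidence: how many you talked to
    pct_confirming_top3: float     # share confirming it's a top-3 problem
    current_behavior: str          # what they do instead
    cost_of_workaround: str        # time/money wasted
    adoption_target: float         # e.g., 0.55 = 55%
    ttv_target_days: int           # time-to-value target
    seq_target: float              # out of 7
    retention_lift_target: float   # points

# Hypothetical example, loosely modeled on the integration story later in this piece:
brief = ProblemBrief(
    segment="new customers in onboarding",
    blocked_outcome="integrate the product into their workflow",
    obstacle="they don't know what data to map",
    customers_interviewed=10,
    pct_confirming_top3=0.9,
    current_behavior="manual CSV exports",
    cost_of_workaround="hours per week, then churn",
    adoption_target=0.55,
    ttv_target_days=7,
    seq_target=6.0,
    retention_lift_target=10,
)
```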
The team (product, design, engineering together) explores solutions:
Monday: Product shares problem and evidence. Team brainstorms 5–8 solution directions. No filtering yet.
Tuesday: Designer sketches three approaches. Engineer does quick feasibility check on each (1–2 hours, not detailed design). Product identifies 3–5 customers to test with.
Wednesday: Team picks simplest approach that could work. Engineer builds quick prototype (clickable, not production code).
Thursday: Product tests prototype with 3 customers. Records sessions. Team watches together.
Friday: Team decides: ship it, iterate it, or kill it. If ship: engineer scopes production build. If iterate: designer revises based on feedback. If kill: back to Tuesday with different approach.
They decide together what to build based on what will drive outcomes. No handoffs. No “design is done, now engineering starts.” Overlapping work, continuous collaboration.
They build it together
Designer in daily standups. Not just “here’s the status,” but “here’s what I’m seeing in build and here’s how we could adjust.”
Product reviewing builds in staging every two days, not just at the end. Catches misalignments early.
Engineer flagging technical constraints that affect UX. “This approach requires an 8-second load time. That’ll kill adoption. Can we simplify?”
They measure together whether it worked
Week 1 post-launch: Review adoption curve. 15% adoption in first week—on track for 50%+ by week 4, or lagging?
Week 4 post-launch: Review full metrics. Adoption: target hit? Time to value: target hit? SEQ: target hit?
Week 8 post-launch: Review retention lift. Are customers who adopted this sticking around at higher rates?
Retrospective: What worked? What didn’t? What will we do differently next time?
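The week-1 check above (is 15% in the first week on track for 50%+ by week 4?) implies some projection. A minimal sketch, assuming a naive linear extrapolation; real adoption curves flatten, so a passing result is an optimistic signal, not proof:

```python
def on_track(week1_adoption: float, target: float, target_week: int = 4) -> bool:
    """Naive linear extrapolation of week-1 adoption out to the target week.
    Real curves decelerate, so treat a pass as an upper bound, not a guarantee."""
    return week1_adoption * target_week >= target

print(on_track(0.15, 0.50))  # True: 15% x 4 weeks = 60% projected, above the 50% target
```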
Hiring and Training for Craft
Most people don’t have the skills for end-to-end ownership. You’ll need to hire differently or train differently or both.
For hiring
Designers: Give take-home projects that require understanding technical constraints.
Project: “Design a feature that needs to work on web, iOS, and Android. Explain which elements can be shared across platforms and which need platform-specific approaches. Show your design system thinking.”
Red flag answer: Designs three completely different approaches without acknowledging platform constraints.
Good answer: Identifies shared components, explains platform-specific adaptations, shows awareness of engineering effort.
Engineers: Give take-home projects that require thinking about user experience.
Project: “Build a feature that lets users upload and preview files. Optimize for ease of use, not just for technical correctness. Explain your UX decisions.”
Red flag answer: Builds feature with no upload progress, no error states, no empty states. Pure happy path.
Good answer: Considers edge cases from user perspective. Progress indicators. Clear error messages. Graceful failures.
Product managers: Look for facilitators, not deciders.
Interview question: “Tell me about a time your team disagreed about what to build. How did you resolve it?”
Red flag answer: “I made the call based on my experience.”
Good answer: “I brought everyone together. We listed our assumptions. We ran a quick test to validate the key assumption. The data showed which approach would work. We aligned around that.”
For training
Designers: Pair with engineers for one week. Sit with them during implementation. See what’s easy and what’s hard to build. Watch trade-offs in real time.
Engineers: Attend five customer interviews. Watch users struggle with the product. See what confuses them. Take notes on friction points. Report back to team.
Product managers: Use the product daily for two weeks. Document every friction point. Time how long tasks take. Record every moment of confusion. Share with team.
The goal is to make everyone care about the whole problem, not just their piece.
The Person Who Can Say No
Someone needs authority to kill things that won’t drive customer outcomes. Not veto power over everything. Authority specifically to say “this won’t get adopted” or “this won’t improve retention” or “customers won’t get value from this.”
This is usually a product leader, but it could be anyone with:
Direct access to customer outcome data
Understanding of the full product, not just one area
Political cover to piss people off
Their job is to look at every proposed feature and ask:
What’s the evidence customers need this?
Acceptable: “Talked to 12 customers. 9 said this is top-3 problem. Showed them rough prototype, 8 said they’d use it weekly.”
Not acceptable: “Sales says customers are asking for it.” (Sales hears what closes deals, not what drives adoption.)
What’s the adoption hypothesis? (How many customers will use it?)
Acceptable: “Feature targets customers in [segment]. That’s 2,400 customers. Based on problem severity and prototype feedback, expect 55–65% adoption.”
Not acceptable: “Everyone will love this.”
What’s the value hypothesis? (What outcome improves if they use it?)
Acceptable: “Customers who adopt this will complete [workflow] in 4 minutes instead of 18 minutes. Expect retention to improve by 8–12 points based on correlation between time-to-value and retention in similar features.”
Not acceptable: “It’ll make the product better.”
Can we test this more cheaply than building it?
Acceptable: “Yes. We can prototype in Figma, test with 8 customers, validate adoption hypothesis before engineering starts.”
Not acceptable: “We need to build it to see if it works.”
If those questions don’t have good answers, the feature doesn’t get built. Even if sales needs it. Even if a competitor has it. Even if an executive wants it.
This person will be hated
Sales will complain they’re blocking deals. Executives will complain they’re not responsive to strategy. Other product managers will complain they’re too rigid.
Protect them. Publicly.
When someone complains in a meeting: “What’s the evidence customers will adopt this and it will drive retention or growth?” If they don’t have evidence, case closed.
When someone escalates to you: “We’re optimizing for features customers actually use. Last quarter, pilot team: 58% adoption. Rest of org: 14% adoption. [This person] is making sure we build things that work. That’s the job.”
If you can’t protect this person, don’t hire them. They’ll burn out or leave within six months.
When to Start Small vs When to Burn It Down
Most orgs should start with one pilot team. But some are so broken that a pilot can’t work. Here’s how to tell:
Start with a pilot if
You can get budget for one autonomous team (typically $800k–$1.2M annually: 5–7 people fully loaded)
You have at least one executive who believes craft matters (will protect the team for two quarters)
Your product has discrete areas that one team could own (you’re not so entangled that everything touches everything)
You can protect that team from organizational pressure for two quarters (sales escalations, exec requests, resource-sharing demands)
Burn it down and start over if
Every decision requires five stakeholders to approve (one team can’t move without triggering coordination overhead)
Your product is so entangled that no team could own a complete problem (changing login flow requires touching 14 services owned by 8 teams)
Your best people are already leaving because they’re frustrated (lost 3+ high performers in last 6 months)
Leadership explicitly values speed over quality and won’t reconsider (“ship fast, fix later” is official strategy)
If you’re in the second category, you have two options: convince the CEO this is existential (good luck) or leave. There’s no middle ground.
What This Looks Like in Practice
Real example from a company that did this:
They had a team of 6 people. Product manager, three engineers, designer, data scientist. Their problem: customers who didn’t integrate within 30 days churned at 80%.
The feature factory approach
Product writes spec for integration mapping wizard based on sales feedback. Design makes mocks. Engineering builds it over 8 weeks. They ship. 7% of customers use it. Churn doesn’t improve. Feature gets abandoned.
Cost: 8 weeks × 6 people × $3,000 per week = $144,000 for something that didn’t work.
The craft approach
Week 1
Whole team talks to 10 customers who churned. Learns integration is confusing, but the wizard isn’t the answer. The real problem: customers don’t know what data to map. A wizard won’t fix that.
Week 2
Designer sketches three different approaches:
Wizard with better guidance
Pre-configured templates for common use cases
AI that suggests mappings
Engineer does a quick feasibility check. Templates are simplest. AI is a 6-month project, and they’re in a regulated industry, so it would require extra compliance work. The wizard still doesn’t solve the core problem. Team picks templates.
Week 3
Engineer prototypes templates in two days. Not production code, just enough to test. They test with 5 customers. Customers love it but want one change: ability to customize template after selecting it.
Week 4
Engineer adds customization to prototype. They ship to 10% of customers as beta.
Weeks 5–7
Monitor metrics:
60% of beta users select a template (adoption is working)
Time to integration drops from 12 days to 4 days (time to value is working)
SEQ scores average 6.2 out of 7 (ease is working)
Week 8
Ship to 100% of customers.
Week 12
Retention data comes in. Customers who integrated with new flow: 75% retention. Customers who integrated with old flow: 20% retention. Retention lift: +55 points.
Why this worked
They owned the problem. They could make tradeoffs (picked templates over wizard when they learned what actually worked). They tested fast (prototype in days, not months). They measured outcomes (adoption, time to value, SEQ, retention). No handoffs. No 6-month project. No feature nobody uses.
Cost: 3 weeks × 6 people × $3,000 per week = $54,000 for something that drove massive retention improvement.
That’s craft.
Next: Part 3 covers your detailed week-by-week execution playbook, the quarter-by-quarter transformation timeline, how to sustain craft through leadership changes, and when to give up.

