PHIL/CRUMM VOL.I · NO.001 · 2026.05.18
13m read · Services + Operating

T&M Punishes the Efficient

T&M punishes the efficient, hides AI cost in the wrong column, and trains your best people to smooth their timesheets. This is a structural problem that represents itself as an individual behavior problem.


Whose hour is it?

Imagine one engineer, three Claude agents, three clients, one wall-clock hour.

Each task is real work. Each task takes its full estimated time. The agents are not doing trivial things in parallel while a human does the real one. They are doing three substantive pieces of work at once, with the engineer steering, reviewing, intervening — the way the best people in my practice already work, and the way most of us will be working soon enough.

[Figure: One engineer routes three Claude agents across three concurrent client projects within a single wall-clock hour; three arrows labeled 'this hour?' point from the agents back at three separate client billing buckets, with no resolution.]
The bookkeeping problem made visible.

Whose hour is it?

Time-and-materials billing has one answer to that question and it is not a good one. Pick one. Bill the elapsed wall-clock to whichever client your conscience nominates. Don’t think too hard about it.

There is no honest answer inside the model. T&M was built on an assumption — one body, one task, one timer — and that assumption is failing in front of us, on Tuesday afternoons, in shared screens, in commit histories, in the part of the timesheet your senior engineers fill out last.

I spent the last four months in conference rooms with my practice leadership trying to figure out how to bill three agents at once. I did not get a clean answer. I got something better, which was clarity about why the question is unanswerable inside T&M, and what has to change for the question to stop being asked at all.

The model T&M was built for

Hour-of-labor was a defensible proxy for value-delivered when human throughput was roughly linear and there was no other way to fake it.

If a developer billed forty hours, you got roughly forty hours of developer-grade output. If they billed sixty, you got roughly sixty. The mapping was lossy but monotonic, and the proxy survived because it had to. Nothing else was countable. Story points were a planning fiction, not a billing artifact. Outcomes were too fuzzy. Hours were what the courts understood, what procurement understood, what your CFO understood. Hours were the lingua franca and they earned the role.

The proxy held because the world held. The world stopped holding.

This is a structural problem that represents itself as an individual behavior problem. When utilization slips, when a senior engineer “loses” a billable hour to mentoring, when an estimate gets revised down mid-project, when somebody quietly bills elapsed wall-clock for two agents running in tmux — none of these are character flaws. They are the structure failing in the only way it knows how to fail, which is by routing the failure through individual people and asking them to absorb it.

The structure made sense in 1998. It made sense in 2008. It made sense in 2018. It is failing in 2026. Hours stopped being the unit of value, and the meter that measures hours is now measuring the wrong thing while everybody inside the building gets graded on it.

The four ways T&M is now broken

There are four. Each is a no, but.

It punishes the efficient

Every hour you save with AI is an hour of revenue you didn’t bill.

Pick your best engineer. Give them Claude Code, Cursor, a tmux setup, and the kind of taste it takes to use those tools well. Now watch what happens to their billing. Better team, worse P&L. The incentive runs the wrong direction.

Pull the data from any T&M shop with a tenured practice and you’ll see the tell. Variance between estimates and actuals is high for junior people and low — strangely, suspiciously, structurally low — for senior people. I just fundamentally don’t believe that variance compression is a product of senior engineers being really good at guessing how long things take. It is a product of them being really good at managing the meter.

It hides AI cost in the wrong column

Agency billing rates were set against a labor cost curve. They were set against salaries and benefits and overhead and a margin target that had two decimal places.

They were not set against a two-hundred-dollar-per-seat Claude bill and a Cursor org plan and a compute line that grows with adoption. Those costs are real. They show up monthly. They scale with the engineers who are best at using them — which means they scale exactly with the people whose hours are simultaneously falling.

You can do the math two ways and neither one ends well. Charge a premium for AI-augmented work and you’re telling the client the meter is even more arbitrary than they suspected. Don’t charge a premium and you eat the cost on the margin line nobody re-priced. I have not met an agency doing anything other than eating it. The margin gets eaten from a column nobody priced, and the people who priced the labor line aren’t the people running the AI line.
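A minimal sketch of that squeeze, for one engineer-month under T&M. Every number here is a hypothetical placeholder except the two-hundred-dollar seat; the rate, the loaded labor cost, the hours, and the compute line are assumptions chosen only to show the direction of the effect, and `monthly_margin` is an illustrative helper rather than anyone's real accounting formula.

```python
def monthly_margin(billed_hours, rate, labor_cost, ai_cost):
    """Return (revenue, margin, margin_pct) for one engineer-month under T&M."""
    revenue = billed_hours * rate
    margin = revenue - labor_cost - ai_cost
    return revenue, margin, margin / revenue


# Hypothetical inputs, for illustration only.
RATE = 200.0            # assumed billing rate per hour
LABOR_COST = 18_000.0   # assumed loaded monthly cost of the engineer
AI_SEAT = 200.0         # the per-seat bill mentioned in the text
AI_COMPUTE = 300.0      # assumed usage-based compute line

# Before AI: more hours billed, no AI line.
before = monthly_margin(billed_hours=140, rate=RATE,
                        labor_cost=LABOR_COST, ai_cost=0.0)

# After AI: same work done in fewer billed hours, plus the AI cost line.
after = monthly_margin(billed_hours=100, rate=RATE,
                       labor_cost=LABOR_COST, ai_cost=AI_SEAT + AI_COMPUTE)

for label, (revenue, margin, pct) in (("before", before), ("after", after)):
    print(f"{label}: revenue={revenue:,.0f} margin={margin:,.0f} ({pct:.1%})")
```

With these made-up inputs the same engineer goes from a margin of 10,000 to 1,500. Most of the drop comes from the hours that were never billed; the AI line takes the rest from a column the rate card never priced.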

Estimates get more wrong, not less, mid-project

The team that delivers in week eight is materially faster than the team that estimated in week zero.

This was always somewhat true — teams learn the domain mid-project. AI has made it dramatically true. The statement of work described a slower team than the team executing the work, and the gap widens every sprint. The estimate is a fixed point. The team is not.

What happens next is predictable and human. People smooth hours. Not in a dishonest way — in a make-the-meter-make-sense way. The engineer who finishes a five-hour task in two hours bills four, because that’s what the SOW expected, because billing two would invite questions about the estimate, because the next task on the backlog needs to be available for them to roll into and the meter doesn’t know how to express “rolled in early.” The hours arrive at the line item the SOW called for them to arrive at. The work didn’t actually happen that way. You have trained your best people to be quietly dishonest with you, and they know they’re doing it, and you know they’re doing it, and the meter keeps running.

It can’t price parallelism

Return to the cold open. One engineer. Three agents. Three clients. One hour.

One body, one task, one timer is the assumption underneath every T&M invoice ever written. Strip it and the bookkeeping collapses. Bill elapsed to one client and you’ve stolen from the other two. Bill task-time to all three and you’ve billed three hours of wall-clock for one. The contractor you hired to do the same kind of work for you is contractually forbidden from billing this way. Pull a standard agency-to-subcontractor agreement and you will find a clause — sometimes two — that says the sub may not bill the same hour to multiple engagements, may not bill in parallel, may not invoice for unattended automation. Those clauses are in there because the agency wrote them, because the agency understood, in the specific case where it was the buyer, that the meter cannot survive parallelism without becoming a fraud vector. The same agency turns around and invoices its own clients on a model that has no such clause, because there is no counterparty in that direction with the leverage to demand one. The asymmetry is its own confession.
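The two bookkeeping options above can be sketched in a few lines. This is only an illustration of the arithmetic; the `AgentTask` type and both function names are hypothetical, and the one-hour task times simply restate the cold open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentTask:
    client: str
    task_hours: float  # time the task itself took


def bill_elapsed_to_one(tasks, wall_clock_hours, chosen_client):
    """Option 1: bill the whole wall-clock hour to a single client."""
    return {
        t.client: (wall_clock_hours if t.client == chosen_client else 0.0)
        for t in tasks
    }


def bill_task_time_to_all(tasks):
    """Option 2: bill every client the full time its task took."""
    return {t.client: t.task_hours for t in tasks}


# One engineer, three agents, three clients, one wall-clock hour.
tasks = [
    AgentTask("Client A", 1.0),
    AgentTask("Client B", 1.0),
    AgentTask("Client C", 1.0),
]
WALL_CLOCK = 1.0

option_1 = bill_elapsed_to_one(tasks, WALL_CLOCK, "Client A")
option_2 = bill_task_time_to_all(tasks)

print("elapsed to one:", option_1, "total billed:", sum(option_1.values()))
print("task time to all:", option_2, "total billed:", sum(option_2.values()))
```

Option 1 invoices one hour and leaves two clients with real work and no line item. Option 2 invoices three hours against one hour of wall-clock. Neither function is wrong as code; the model gives them nothing correct to compute.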

The cultural cost

“I’m out of hours” has become a collaboration killer.

People won’t help across projects because their meter doesn’t tick. Two-week retros pivot on whether somebody can absorb an unplanned hour without their utilization slipping. A senior engineer who notices a problem in another team’s code will sometimes pretend not to, because flagging it means owning it, and owning it means a billing conversation with two practice leads who will both lose.

The people doing these things are not bad. They are not lazy. They are not gaming. They are responding rationally to a structure that grades them on a number, and the number rewards a specific kind of behavior, and the behavior is corrosive to the company that depends on them not behaving that way.

This is a structural problem that represents itself as an individual behavior problem. The structure makes good people look like they’re lying. They aren’t. They’re answering the question the structure asked them, which is how do I make my line on the spreadsheet add up — and the only way to make it add up under T&M with AI in the toolchain is to bend something.

The leadership response is to manage the symptom. Talk about culture. Send a memo. Run a workshop on “billing integrity.” The memo is well-meant and the workshop is well-attended and the structure is unchanged the next morning, and the number is still wrong, and the smoothing continues. You can’t culture-train your way out of a meter that’s measuring the wrong thing.

“Clients won’t accept anything else”

The reflexive objection comes up in every conversation about getting off T&M. It is almost always wrong, but not for the reason most people offering the counter-argument think.

Clients already accept everything else. Day rates. Retainers. Fixed-fee. Milestone billing. Capacity blocks. Anyone selling capacity sells these. Large-enterprise procurement, on one extreme, audits every hour. Two-person studios on the other end need hour-by-hour visibility. Almost everything between those two extremes is already negotiable, and most agencies have some contracts that already work this way. The friction isn’t client appetite. It’s agency muscle memory.

Here is the harder version of the objection, which I heard in a leadership room earlier this spring: if you sell capacity instead of hours, you’d better make sure you’re delivering enough.

That isn’t a complaint about the model. It’s a real constraint on it. The hourly meter, for all its problems, is a continuous accountability mechanism — the client knows what they’re paying for in fifteen-minute increments. Switch to allocation billing and you lose that. The client is now buying a sprint, and if the sprint underdelivers, the client has paid for less than they thought they’d get.

The answer to that objection is not to wave it off. It’s to acknowledge that switching off T&M raises the bar on what you have to prove every two weeks. The accountability mechanism moves from the timesheet to the increment. You demonstrate value by shipping things, on a cadence, that the client can see and evaluate. That is harder than running a meter. It is also the entire point.

And no, you don’t escape by productizing

Y Combinator’s Spring 2026 Requests for Startups (RFS) asked for AI-powered agencies with SaaS-like margins.

The framing is half-right. The delivery model is broken. Hours are the wrong unit. AI changes the labor math in ways that haven’t fully landed in agency P&Ls. All of that is true and the RFS deserves credit for naming it.

The naive read is wrong. Strip the humans, keep the meter, print SaaS margins. That misidentifies the bug.

In services, value and commoditizability are inversely correlated. The work clients pay agency rates for is judgment under ambiguity and accountability for outcomes — neither of which survives a price card. The moment you can put a clean number next to “build a checkout flow,” the platform you’d build it on has already built a better one and given it away. The work that is price-cardable has either already been compressed to commodity prices by the SaaS that owns the category, or will be.

Productized services is the attempt to sell the commoditizable slice at non-commoditized prices. It is a brief arbitrage, not a business model. It misidentifies the bug — the bug isn’t “humans are in the loop”; the bug is “the meter measures the wrong thing.” SaaS margins come from distribution economics, not from removing humans from delivery. And if you go productized, you’ve signed up to compete with funded SaaS on their home turf, which is not a fight your services P&L was designed to win.

What replaces it

I am inside the change while making the argument for it. That is the most honest version of this essay I can write, and it is probably the only version worth writing.

The unit of sale becomes capacity against a prioritized backlog over a fixed window. Hours become an internal velocity instrument — useful for steering, never the meter on the wall. Scope flexes inside the window. The window does not. The client is buying a guarantee of delivery on a date, with a known team, and a backlog that they help prioritize. If the work goes faster, they get more of the backlog. If it goes slower, the must-haves still land and the could-haves slide. The meter on the wall stops being hours; it becomes increments shipped.
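As a rough sketch of "scope flexes, the window does not": the function below fills a fixed window from a prioritized backlog, must-haves first. The `Item` type, the effort units, and the sample backlog are all hypothetical, and the sketch assumes the must-haves were sized to fit the window even at a slow pace, which is a planning commitment rather than something code can guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    effort: float    # internal estimate, in whatever capacity unit the team uses
    must_have: bool


def plan_window(backlog, capacity):
    """Fill one fixed window: must-haves first, then could-haves in backlog order.

    Returns (shipped, slid). Capacity models how much the team actually gets
    through in the window; the window length itself never changes.
    """
    # sorted() is stable, so the client's priority order survives within each group.
    ordered = sorted(backlog, key=lambda item: not item.must_have)
    shipped, slid, remaining = [], [], capacity
    for item in ordered:
        if item.effort <= remaining:
            shipped.append(item.name)
            remaining -= item.effort
        else:
            slid.append(item.name)
    return shipped, slid


# Hypothetical backlog, already in the client's priority order.
backlog = [
    Item("checkout fix", 3, True),
    Item("reporting export", 2, False),
    Item("auth audit", 4, True),
    Item("dark mode", 3, False),
]

for pace, capacity in (("slower", 7), ("as planned", 8), ("faster", 12)):
    shipped, slid = plan_window(backlog, capacity)
    print(f"{pace}: shipped={shipped} slid={slid}")
```

In all three runs both must-haves ship. Only the faster run pulls the could-haves into the same window, which is the trade the client is buying.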

The operating-model specifics — sprint cadences, capacity caps, riskshare structures, exchange-for-free swaps — are workable and well-documented in the agile-contracts literature. They are not the hard part. The hard part is what comes next.

The hard part isn’t the meter, it’s the accounting

Utilization. Gross margin. Revenue recognition. Individual contributor compensation. Weekly health metrics. Quarterly forecasting. All of it currently hangs off the billable hour.

Most “we moved off T&M” announcements quietly revert. I’ve watched a half-dozen of them in the last decade. They revert because the leadership team changed the meter on the wall and didn’t change the meters underneath the wall, and the underneath-meters are what determine whether anyone in the company believes the change is real.

Your sales team is still grading itself on bookings denominated in hours. Your delivery leads are still grading themselves on utilization percentages computed from hours. Your CFO is still recognizing revenue against hours billed. Your compensation system still pays people partly on their personal utilization. Three months in, somebody asks why margin looks weird, and somebody else says let’s just track hours alongside the new model so we can compare, and four months in the new model is theater and the meter is back.

Each of those underneath-meters is its own load-bearing migration. Utilization has to be redefined against allocations rather than billed hours, which means the denominator changes and so does every dashboard built on top of it — and the people who built those dashboards are not the people changing the pricing model. Revenue recognition rules under ASC 606 are workable for fixed-window allocation billing but the workable answer is not the answer your finance team is using today, and changing it touches audit. Compensation tied to personal utilization has to be unwound on a quarterly cadence that doesn’t disrupt people mid-cycle, which means the comp change lags the pricing change by at least two quarters, which means for two quarters your delivery org is being paid on the old meter while being asked to operate on the new one. Sales commissions denominated in bookings-hours have to be reconverted to bookings-capacity, and the salespeople notice immediately. Forecasting models built on a smooth hours-billed signal have to be rebuilt against the lumpier sprint-allocation signal, which the CFO will not love. None of these are pricing decisions. All of them have to land before the pricing decision is real.
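To show what "the denominator changes" means for utilization, here is one possible pair of definitions. Both function names and the sample numbers are hypothetical; the essay does not prescribe a formula, and a real redefinition would have to agree with whatever finance and comp settle on.

```python
def utilization_billed(billed_hours, available_hours):
    """Old meter: share of available hours that landed on an invoice."""
    return billed_hours / available_hours


def utilization_allocated(allocations, available_capacity=1.0):
    """New meter (one possible definition): share of a person's capacity
    committed to client windows, regardless of hours logged."""
    return sum(allocations.values()) / available_capacity


# Hypothetical week: an engineer split evenly across two client windows,
# who finished the committed work in 25 of 40 available hours.
old = utilization_billed(billed_hours=25, available_hours=40)
new = utilization_allocated({"Client A": 0.5, "Client B": 0.5})

print(f"billed-hours utilization: {old:.1%}")
print(f"allocation utilization:   {new:.1%}")
```

The same week reads as 62.5% on the old meter and 100% on the new one. Any dashboard, bonus, or forecast still wired to the first function will report a problem that the second function says does not exist.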

The pricing change is the easy part. The accounting change — utilization redefined, comp restructured, revenue recognition rewritten, weekly rhythm rebuilt, forecasting reframed, dashboards re-instrumented — is the part that determines whether anyone in the company actually changes how they work. Most agencies that announce they’re getting off T&M are announcing the pricing change without the accounting change. They are announcing pricing theater and they don’t know it yet.

The agencies who win the next five years

The meter is the bug. Unbugging it does not save you from the squeeze.

AI is compressing what clients are willing to pay for delivery, and that compression is coming for the entire services category whether or not you change how you bill. Customers know AI makes the work faster. They are not going to keep paying old prices for new speeds. The compression lands on you either way.

The difference — and this is the difference — is who’s holding the pen when it lands. The agency that fixes the meter gets to design the compression on its own terms: allocations priced, scope flexed, margin known, AI cost-of-revenue priced into the line, the operating model rebuilt to make all of it legible. The agency that doesn’t fix the meter gets the same compression done to it on its customers’ terms — sliced off the topline as billed hours quietly fall, with the AI cost line eating margin from the other side, with senior engineers smoothing their timesheets to make the numbers look right, with the structure routing the failure through individual people until the people leave.

[Figure: A feedback loop diagram showing how customer expectation of AI-driven speed flows through agency adaptation, falling billable hours, revenue compression, and increased cost-of-sale, then loops back to customer expectation. A second branch shows the meter-fix not breaking the loop, but giving the agency control over where the compression lands.]
The loop the meter-fix doesn't break.

The services business that wins the next five years is not the one that bills the most hours. It is the one whose pricing model stops paying people to be slow, and whose accounting model stops grading them on whether they were.

Fix the meter. Then fix the meters underneath the meter. Then prove, two weeks at a time, that the increment is worth what you charged for it.

That is the work. That is the only work.

