Designing Entitlements: The Quiet Work Behind Pricing That Scales

Pricing conversations tend to start with tiers, price points, and packaging names. The question that actually determines whether your pricing holds up at scale gets skipped: what happens when a customer runs out?

Teams rarely answer that intentionally. They pick a default, ship it, and deal with the fallout later when a customer hits a wall mid-campaign, gets a surprise invoice, or quietly stops using the product to avoid overages.

I’ve been thinking about this in the context of AI-first products, where usage is more variable and more tied to customer outcomes than in most previous categories. Entitlement design sits at the intersection of product, pricing, and customer trust. Get it wrong and you’re fixing it at renewal.

Most SaaS products package features and capabilities into plans: a Starter, a Pro, an Enterprise. Each plan gives you access to more than the one below it. Within each plan, what you get comes in two kinds. Some entitlements are binary: you either have access or you don’t. A sandbox environment, a security package, priority support. Others are usage-based and come with an amount attached. A Starter plan might include 100 AI tokens, Pro gets 200, Enterprise gets 500. Together these are a plan’s entitlements, and the usage-based ones are the specific limits that determine how much a customer can actually do with what they bought.
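To make the two kinds of entitlement concrete, here is a minimal sketch of how a plan could be modeled. The token amounts come from the example above; the class name, the field names, and which binary features sit on which plan are assumptions made for illustration, not a description of any particular billing system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plan:
    name: str
    # Binary entitlements: the customer either has access or does not.
    features: frozenset[str] = frozenset()
    # Usage-based entitlements: entitlement name -> included amount per period.
    limits: dict[str, int] = field(default_factory=dict)

    def has_feature(self, feature: str) -> bool:
        return feature in self.features

    def remaining(self, entitlement: str, used: int) -> int:
        """Units left before the limit is hit (never negative)."""
        return max(self.limits.get(entitlement, 0) - used, 0)


# Token amounts follow the example in the text; the feature assignments
# below are hypothetical.
STARTER = Plan("Starter", limits={"ai_tokens": 100})
PRO = Plan("Pro", frozenset({"sandbox"}), {"ai_tokens": 200})
ENTERPRISE = Plan(
    "Enterprise",
    frozenset({"sandbox", "security_package", "priority_support"}),
    {"ai_tokens": 500},
)
```

Keeping the two kinds in separate fields makes the later question explicit: only the entries in limits can ever be depleted.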

What happens when a customer hits those limits is where most pricing frameworks go quiet.

What You Do When a Customer Hits a Limit

When a customer depletes a usage-based entitlement, you have three paths:

Force an upgrade to a higher plan. The simplest path on paper. A customer on Pro hits their limit, so you move them to Enterprise. Revenue goes up, billing stays clean. The problem is timing. Hitting a limit mid-campaign or mid-quarter is not when a customer wants a sales conversation. It creates friction at exactly the wrong moment and can sour a relationship that was otherwise healthy.

Let them continue and bill for overages. Feels customer-friendly because there’s no hard stop. But the goodwill evaporates the moment an unexpected charge shows up on an invoice. Budget owners get surprised, procurement gets involved, and what started as a usage question becomes a trust issue. It also creates unpredictable revenue on your end, which makes forecasting harder.

Sell pre-purchased packages at volume discounts. This is the right model for usage-based entitlements. Customers get discounts at higher volumes, so they’re incentivized to use more without being penalized. AI usage is a good example: there’s no natural ceiling, and the per-unit price should decrease as commitment increases. Volume bundles don’t need a product name the way add-ons do, but they need the same purchase flexibility. A customer might need to buy additional capacity ahead of a big push, not just at contract start. Volume bundles should be available to purchase anytime.
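As a rough illustration of the volume-discount idea, the sketch below prices a pre-purchased bundle so that the per-unit price falls as the committed volume rises. The tier boundaries and prices are invented for the example, and prices are kept in integer cents to avoid floating-point rounding.

```python
# Hypothetical volume tiers: (minimum units in the bundle, price per unit in cents).
VOLUME_TIERS = [(0, 100), (500, 80), (1000, 60)]


def unit_price_cents(units: int) -> int:
    """Per-unit price for a bundle of this size (the highest tier reached wins)."""
    price = VOLUME_TIERS[0][1]
    for minimum, tier_price in VOLUME_TIERS:
        if units >= minimum:
            price = tier_price
    return price


def bundle_cost_cents(units: int) -> int:
    """Total cost of pre-purchasing a bundle of the given size."""
    if units <= 0:
        raise ValueError("a bundle must contain at least one unit")
    return units * unit_price_cents(units)


for size in (100, 500, 1000):
    print(size, unit_price_cents(size), bundle_cost_cents(size))
```

One design consequence worth noticing: with this simple all-units scheme, a 500-unit bundle costs less in total than a 499-unit one. Whether that cliff is acceptable or whether each tier should only discount the units above its threshold is a packaging decision, not something the sketch settles.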

Binary entitlements have no depletion to manage. The right model is an add-on: a support package, a security package, a sandbox, or other bundled feature sets sold at a flat fee that go-to-market (GTM) teams or Deal Desk can discount as needed. The feature boundary is clear, and customers understand what they’re buying.

Picking whichever option is easiest to implement is itself a choice, and usually not a good one. The option you land on shapes customer behavior, affects net revenue retention (NRR), and signals what you actually believe about how customers should use your product.

Observability Has to Come Before Billing

You cannot charge customers for usage they cannot see. It sounds obvious, but it’s violated constantly. Companies build entitlements, implement overage billing, and then realize their customers have no way to monitor consumption in real time. The result is exactly the bill shock problem that erodes trust and drives churn.

Before you turn on any usage-based billing, customers need visibility into where they are relative to their limits. That means a real dashboard, with thresholds and alerts, so a customer knows they’re at 75% of their limit before they kick off a big deployment.

The same goes internally. Your customer success managers (CSMs) need to know before the customer does. A customer at 90% of their entitlement is one your team should already be talking to. If they’re finding out about depletion from incoming complaints, your tooling is behind your billing model, and that’s a recipe for bad renewals.
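The threshold logic behind both the customer dashboard and the internal view can be very small. Here is a minimal sketch, using the 75% and 90% figures from the examples above; the event names and the decision about who receives which alert are assumptions.

```python
# Thresholds as whole percentages; the 75 and 90 values mirror the examples
# in the text, and the event names are hypothetical.
THRESHOLDS = [
    (75, "customer_warning"),  # show on the customer's dashboard and alert them
    (90, "csm_outreach"),      # the account team should already be in touch
    (100, "depleted"),         # the entitlement is used up
]


def usage_events(used: int, limit: int) -> list[str]:
    """Return every threshold event the customer has reached so far."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Integer comparison (used * 100 >= limit * percent) avoids float rounding.
    return [event for percent, event in THRESHOLDS if used * 100 >= limit * percent]


print(usage_events(150, 200))  # exactly 75% of a 200-unit limit
print(usage_events(185, 200))  # past 90%
```

In practice these events would feed whatever alerting and CRM tooling you already have; the point is that the same usage number drives both the customer-facing and the internal signal, so neither side finds out from an invoice.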

The Grandfathering Problem Is Real and Often Ignored

When you implement new entitlements, one question almost always comes up: what to do about existing customers who signed contracts before the entitlement existed. If there was no language in their agreement about usage limits, you can’t just start charging them. You need a transition plan.

The customers who have been with you longest are often your heaviest users. They’re also the ones who will feel it most if a new limit shows up without adequate notice and a clear path forward. Getting this wrong can turn your most loyal customers into your loudest critics.

Build the observability layer first, share usage data with affected customers before any billing goes live, and use that data collaboratively to help them find the right tier or package.
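One way to use that data collaboratively is to show each affected customer which plan their observed usage would have fit, before anything is billed. The sketch below is a simplified assumption of how that could work: it reuses the illustrative token limits from earlier and recommends the smallest plan that covers the customer's peak month. Sizing on the peak rather than the average is a choice made here for the example.

```python
from typing import Optional

# Illustrative per-period token limits, taken from the earlier plan example.
PLAN_LIMITS = {"Starter": 100, "Pro": 200, "Enterprise": 500}


def recommend_plan(monthly_usage: list[int]) -> Optional[str]:
    """Smallest plan whose limit covers the customer's peak observed month.

    Returns None when usage exceeds every plan, which signals a volume
    bundle or a custom conversation rather than a standard tier.
    """
    if not monthly_usage:
        raise ValueError("need at least one month of observed usage")
    peak = max(monthly_usage)
    for name, limit in sorted(PLAN_LIMITS.items(), key=lambda item: item[1]):
        if peak <= limit:
            return name
    return None


print(recommend_plan([80, 150, 120]))
print(recommend_plan([300, 620, 410]))
```

Nothing here charges anyone. It only turns usage history into a starting point for the conversation with a long-standing customer.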

Pricing Is Always a Hypothesis About Customer Behavior

Entitlement design forces you to make bets about how customers will behave: what a “typical” customer uses, where the outliers are, what signals correlate with high usage before it becomes a problem. Most of those bets are made without enough data, which is why the first version of any entitlement framework is almost always wrong.

Build a framework flexible enough to adapt as you learn, transparent enough that customers trust it, and simple enough that your GTM team can actually explain it.

Customers don’t think about entitlements. They think about whether the product works for them. Bad entitlement design is the reason they’re on the phone with your CSM at renewal.