Question 1

How long does this take?

Accepted Answer

1–2 weeks. Compact tier ships in 1 week; multi-vendor in 2. Bundle with the AI Security Review — Cost + Security under one MSA is the most common BST entry-point combination.

Question 2

What does this cost?

Accepted Answer

$10K–$20K depending on tier. Time-and-materials with cap.

Compact tier: $10K / 1 week (single workload, single vendor, < $20K/mo)

Standard tier: $14K / 1.5 weeks (2–4 workloads, single vendor, $20K–$100K/mo)

Multi-vendor tier: $18K–$20K / 2 weeks (mixed-vendor, > $100K/mo)

Question 3

Why not a fixed price?

Accepted Answer

BST does not offer fixed-bid contracts. The cap gives you the main protection of a fixed price: you cannot pay more than the cap, and if we finish early you pay only for hours worked. See the BST Engagement Model for the full doctrine.
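The capped time-and-materials arithmetic can be sketched in a few lines. This is a minimal illustration, not BST's billing logic: the function name and the $250/hour rate are hypothetical, and only the $10K cap comes from the Compact tier above.

```python
def capped_invoice(hours_worked: float, hourly_rate: float, cap: float) -> float:
    """Bill hours at the hourly rate, but never more than the agreed cap."""
    if hours_worked < 0 or hourly_rate < 0 or cap < 0:
        raise ValueError("hours, rate, and cap must be non-negative")
    return min(hours_worked * hourly_rate, cap)


# Hypothetical rate of $250/hour against the $10K Compact-tier cap.
finished_early = capped_invoice(hours_worked=32, hourly_rate=250.0, cap=10_000.0)
ran_long = capped_invoice(hours_worked=48, hourly_rate=250.0, cap=10_000.0)

print(finished_early)  # 8000.0: below the cap, so only hours worked are billed
print(ran_long)        # 10000.0: 48 hours would be $12,000, so the cap applies
```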

Question 4

We're already on Vercel AI Gateway / OpenRouter / LiteLLM. Does that change anything?

Accepted Answer

It accelerates the assessment, because gateways already provide most of the attribution data. The fee tier doesn’t change, but the work tends to land at the lower end of the hours range. If the gateway itself needs governance work (audit log, kill-switch, blocked-paths), that routes to the Engineering Practice cluster.

Question 5

We are regulated (HIPAA, SOC 2, ITAR). Can you do this?

Accepted Answer

Yes, for the assessment. We work from billing data and code; we don’t handle PHI/PII during the assessment. For implementation engagements that touch live data, we add a BAA. PHI-touching AI workloads usually pair with the Healthcare AI Compliance Review.

Question 6

What if your projected savings don't materialize?

Accepted Answer

Every finding is sized with explicit assumptions and a sensitivity range. If the assumptions hold and the levers are pulled correctly, the projected savings follow from the math. We’re transparent about which assumptions matter most: the ROI calculator (yours to keep) projects current vs. optimized cost at 0.5x / 1x / 2x volume so you can stress-test the model yourself.
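A rough sketch of that kind of stress test, assuming cost scales linearly with volume and that savings are expressed as a low-to-high fraction of spend. The function name, the $40K/month baseline, and the 20% to 40% savings range are all hypothetical and are not taken from the actual ROI calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Projection:
    volume_multiplier: float
    current_cost: float
    optimized_best: float   # cost if savings reach the high end of the range
    optimized_worst: float  # cost if savings reach only the low end


def project_costs(monthly_cost, savings_low, savings_high,
                  multipliers=(0.5, 1.0, 2.0)):
    """Project current vs. optimized monthly cost at each volume multiplier.

    Assumes cost is linear in volume, which is a simplification.
    """
    if not 0.0 <= savings_low <= savings_high <= 1.0:
        raise ValueError("savings fractions must satisfy 0 <= low <= high <= 1")
    rows = []
    for m in multipliers:
        current = monthly_cost * m
        rows.append(Projection(
            volume_multiplier=m,
            current_cost=current,
            optimized_best=current * (1.0 - savings_high),
            optimized_worst=current * (1.0 - savings_low),
        ))
    return rows


# Hypothetical inputs: $40K/month today, savings somewhere between 20% and 40%.
rows = project_costs(40_000.0, savings_low=0.20, savings_high=0.40)
for row in rows:
    print(f"{row.volume_multiplier}x  current ${row.current_cost:,.0f}  "
          f"optimized ${row.optimized_best:,.0f} to ${row.optimized_worst:,.0f}")
```

Changing the savings range or the multipliers shows how sensitive the projection is to each assumption.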

Question 7

Will this find anything if we already cache?

Accepted Answer

Almost always, yes. The most common finding among "we already cache" clients is a cache-hit-rate problem, usually a variable in the prefix invalidating cache on every request. Other recurring findings:

No tier routing: every request hits the most expensive model

Batch API unused for non-realtime workloads

No vendor-mix evaluation: paying retail to one vendor when much of the traffic could go elsewhere
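The prefix-invalidation finding can be illustrated with a toy simulation. It assumes a cache keyed on the exact prompt prefix, which is a simplification and not any vendor's implementation; the prompt text and function names are hypothetical.

```python
import hashlib

SYSTEM_RULES = "You are a support assistant. Follow the policy below.\n"


def build_prompt_volatile(request_id: int, question: str) -> tuple[str, str]:
    """Per-request value sits inside the prefix, so every prefix is unique."""
    prefix = f"request_id={request_id}\n{SYSTEM_RULES}"
    return prefix, question


def build_prompt_stable(request_id: int, question: str) -> tuple[str, str]:
    """Per-request value moved after the shared prefix, which stays identical."""
    prefix = SYSTEM_RULES
    suffix = f"request_id={request_id}\n{question}"
    return prefix, suffix


def hit_rate(builder, n_requests: int) -> float:
    """Fraction of requests whose prefix was already seen (toy exact-match cache)."""
    seen = set()
    hits = 0
    for i in range(n_requests):
        prefix, _suffix = builder(i, f"question {i}")
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / n_requests


print(hit_rate(build_prompt_volatile, 100))  # 0.0: the request id changes every prefix
print(hit_rate(build_prompt_stable, 100))    # 0.99: only the first request misses
```

Both builders send the same information; only the position of the per-request value changes.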

Question 8

What if our spend is below $5K/month?

Accepted Answer

We’ll tell you the engineering effort exceeds the savings and suggest you come back at scale. We won’t take the work; that’s the published decline criterion. The ROI calculator pre-screens by monthly AI spend bracket, and for anything below $5K the form returns a polite decline matching that criterion.
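A pre-screen by spend bracket might look like the sketch below. It combines the $5K decline threshold with the spend ranges listed under pricing; the function name and return labels are hypothetical, and the real form's logic is not published here. Workload count and vendor mix also affect the tier and are left out for brevity.

```python
def prescreen(monthly_spend_usd: float) -> str:
    """Map monthly AI spend to a bracket, using spend only (an assumption)."""
    if monthly_spend_usd < 0:
        raise ValueError("monthly spend must be non-negative")
    if monthly_spend_usd < 5_000:
        return "decline"       # below the published decline criterion
    if monthly_spend_usd < 20_000:
        return "compact"       # < $20K/mo
    if monthly_spend_usd <= 100_000:
        return "standard"      # $20K to $100K/mo
    return "multi-vendor"      # > $100K/mo


for spend in (3_000, 12_000, 60_000, 250_000):
    print(spend, prescreen(spend))
```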

Most AI bills carry substantial recoverable waste. We find it.

You shipped AI. Finance is asking what it costs at 2x volume. Your team is guessing.

What you get

Cost baseline report

Optimization roadmap

ROI calculator (yours to keep)

90-day implementation plan

Walkthrough call

How it works

Workload mapping

Lever analysis

Modeling & roadmap

Delivery

Pricing

Frequently asked questions

Often combined with

AI Security Review

Healthcare AI Compliance Review

Know what your AI bill could be.