Inference Control Plane
Decide where every AI request
should run
Software that runs in your cloud. Routes AI workloads across your self-hosted and hosted models. Your data never leaves your environment.
What Makes This Different
Not a gateway. A decision engine.
Four capabilities that commodity routing tools don't have.
Normalized Cost Engine
Other tools compare per-token sticker prices.
We compute cost-to-successful-completion across your self-hosted GPUs and your hosted API accounts. That means idle overhead, queue delays, retry probability, power costs, and amortization — all normalized into one comparable number. Your backends, our math.
Policy-Constrained Routing
Other tools treat policy as a scoring weight.
We enforce it as a hard gate. If a request violates a data residency, sensitivity, or compliance rule, that backend is excluded before scoring even begins. Cost never overrides policy.
Hybrid Staged Execution
Other tools route the whole request to one backend.
We split it across your backends. Run PII scanning or document retrieval on your local models, then send only sanitized snippets to your hosted API for synthesis. Raw data never leaves your environment. No other routing tool does this.
Ranked Fallback with Cost-to-Complete
Other tools retry the next backend in a static list.
We rank fallbacks dynamically using SLA fitness, real-time health, and the remaining cost to deliver a successful response. If the primary fails at 80% completion, the fallback accounts for that sunk cost.
Available on
AWS Marketplace
Install directly into your EKS cluster. Counts toward your AWS EDP commitment.
Get notified when listedInteractive Demo
See the decision engine in action
Choose a scenario and watch ICP evaluate backends, apply policies, estimate costs, and explain its routing decision.
Select a scenario
Decision trace
Select a scenario and click “Route this request”
Select a scenario and click “Route this request”
For Startups
Ship AI features fast without burning through your runway
Cut Inference Costs Immediately
Route simple queries to open-source models on your existing GPU. Complex queries auto-escalate to your hosted API accounts. VIP tiers get priority routing. You control every backend and every dollar.
One Endpoint, All Providers
OpenAI-compatible proxy works with LangChain, your custom code, and any framework. No vendor lock-in.
Budget Guardrails
Set daily spend caps. When limits are hit, traffic shifts to local models automatically. No surprise invoices.
For Enterprise
Deploy with confidence, protect your data, and build without limits
Your Cloud. Your Data. Always.
ICP runs entirely inside your VPC. Your prompts never leave your environment. Your provider credentials stay in your K8s secrets. Zero data exfiltration — verifiable by network policy.
SOC 2 & GDPR Ready
Access logging, audit trails, data export/purge APIs, and payload redaction built in. Every routing decision is fully explainable and auditable. Compliance controls, not afterthoughts.
Predictable Pricing
Free tier for evaluation. Usage-based after that — no per-seat fees, no hidden costs. Scale teams without scaling bills.
Pricing
Free to evaluate. Pay per request.
$1.00 per 1,000 routed requests. No seat fees, no base charge. Billed through your cloud marketplace — counts toward your EDP commitment.
Free
Full routing engine for evaluation and small teams. No credit card required.
Request access- Up to 1,000 routed requests / day
- All backend adapters (Bedrock, Azure, Vertex, self-hosted)
- Policy engine & cost normalization
- Hybrid staged execution
- Decision traces & explainability
- OpenAI-compatible proxy
- Runs entirely in your cloud
Usage
Pay only for what you route. No seat fees, no base charge. Billed directly through your cloud marketplace.
Get started- Unlimited routed requests
- Everything in Free
- ICP Agent — deploy models to your K8s
- Model catalog (Llama, Mistral, Qwen, Phi)
- Autoscaling & health monitoring
- Budget controls & spend caps
- AWS / Azure / GCP Marketplace billing
- Counts toward cloud EDP commitment
Enterprise
Volume discounts, dedicated support, and custom SLAs. Available as a private Marketplace offer.
Contact sales- Everything in Usage
- Volume discounts
- Dedicated support engineer
- Custom SLAs
- Deployment approval workflows
- SOC 2 & GDPR compliance features
- SSO / Entra ID integration
- Marketplace private offers
Coming soon
A second product is in development. Get notified when it launches.
Frequently Asked Questions
Answers to common questions about ICP and launching on your preferred cloud.
Start routing smarter today
Install in your cloud in minutes. Connect your backends. See cost savings on your first request. Your data never leaves your environment.
Contact
Get in touch
Questions, access requests, and sales inquiries — we respond within one business day.
Or email us directly: sales@simplegoose.com