The market is shifting. Customers are moving from clicking in apps to asking AI agents to act, and soon, AI-to-AI interactions — a customer’s agent negotiating with a merchant’s agent — will be normal. For commerce, this transition erodes legacy moats and resets the competitive landscape; delivering measurable, AI-driven outcomes is the new differentiator. This shift only works if we combine automation at scale with human control for sensitive operations.

Our purpose is to build the scalable platform foundation that embeds AI contextually across the Recurly suite. This will accelerate merchant workflows, deliver simpler, adaptive subscription experiences (Subscriber Experience), optimize revenue invisibly (Autonomous Revenue Engine), and extend our reach through a robust partner ecosystem (Developer & Ecosystem). This requires a disciplined foundation:

  • Hybrid by design: Automation for routine, low-risk flows; human approval for high-value, sensitive, or ambiguous actions

  • Data quality first: Curated, versioned, and continuously refreshed sources — because garbage in is garbage out

  • Evaluation as a practice: Every new model, MCP tool, and endpoint expansion carries a long tail of maintenance; we meet it with rigorous, ongoing evals instead of one-time tests

  • Own the standards: Building the protocol, guardrails, and tooling internally lets us set the bar and avoid fragmented stacks

Why does this matter? Merchants can move faster by configuring and launching via chat, not documentation. Subscribers can manage their accounts more easily through conversational experiences.

By implementing these systems, Recurly becomes the platform that enables next-generation subscription commerce and lays the groundwork for future capabilities like partner integrations and public-facing MCPs.

What is the operating model?

MCP as the execution standard

The Model Context Protocol (MCP) is the contract between models, business logic, and policy. It is the core of our strategy to make AI tools an embeddable platform capability, not just an isolated chat feature. It standardizes how agents understand context and which actions they can take. The approach has five parts:

  • Use-case mapping: Select high-value, repeatable workflows suited to automation (e.g., Instant Actions, Smart Navigation)

  • Protocol-driven tools: Package actions as MCP tools with explicit interfaces and safety contracts (e.g., update_plan, create_coupon, create_invoice, create_subscription, add_addon, early_renewal)

  • Context scoping: Bind each tool to the right corpus (API docs vs. marketing vs. internal knowledge base), with provenance and recency tags to avoid cross-talk and stale answers

  • Hybrid execution: Policies define when tools may auto-execute vs. when to escalate for human approval (this includes thresholds, risk scores, policy exceptions)

  • Versioning & compatibility: Tools, prompts, and policies are versioned; model upgrades pass gates before promotion
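To make the tool-packaging and hybrid-execution ideas above more concrete, here is a minimal sketch in Python. It does not use any real MCP SDK. The `ToolSpec` class, its field names, and the threshold value are illustrative assumptions, not a description of Recurly's implementation; only the tool name `create_coupon` comes from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_EXECUTE = "auto_execute"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one tool: interface plus safety contract."""
    name: str               # tool name, e.g. "create_coupon"
    version: str            # tools are versioned alongside prompts and policies
    required_params: tuple  # explicit interface: parameters the caller must supply
    corpus: str             # context scoping: the corpus this tool may read (assumed label)
    max_auto_amount: float  # assumed threshold: above this, a human must approve


def decide(tool: ToolSpec, params: dict, amount: float) -> Decision:
    """Return whether a call may run automatically or needs human approval."""
    missing = [p for p in tool.required_params if p not in params]
    if missing:
        # An incomplete request is ambiguous, so escalate instead of guessing.
        return Decision.ESCALATE
    if amount > tool.max_auto_amount:
        return Decision.ESCALATE
    return Decision.AUTO_EXECUTE


create_coupon = ToolSpec(
    name="create_coupon",
    version="1.0.0",
    required_params=("code", "discount_percent"),
    corpus="api_docs",
    max_auto_amount=100.0,
)

print(decide(create_coupon, {"code": "SPRING", "discount_percent": 10}, amount=25.0))
print(decide(create_coupon, {"code": "SPRING"}, amount=25.0))
```

Keeping the threshold on the tool definition, rather than in the prompt, is one way to make the safety contract reviewable and versionable like any other code.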

What do the risk evaluation and governance processes look like?

At Recurly, risk evaluation is critical to everything we do. Once AI starts launching offers, pausing accounts, retrying payments, or talking to other AI agents, mistakes could cost money, erode trust, or even breach regulations. Scaling safely requires a disciplined, auditable control plane:

  • Segmentation: Start with internal teams and pilot merchants, then expand through stage gates. Rolling out in progressive stages limits the blast radius and lets us observe impact before full deployment

  • Endpoint scoping: Expose low-risk endpoints first; progressively unlock sensitive operations once observability and evals pass targets. This staged review allows us to keep tight control over sensitive operations and data

  • Regulatory alignment: All AI flows are designed with GDPR, PCI, and HIPAA in mind. Data exposure is minimized, role-based permissions are enforced, and retention/deletion rules are built into the system

  • Financial risk modeling: Every AI-driven action is assigned a potential financial impact. Budget caps, rate limits, and mandatory approval thresholds apply when actions exceed defined risk levels

  • Continuous validation: Automated evals (LLM-as-judge + metrics), negative testing (“do not answer” cases), and human review on sampled traffic

  • Observability & audit: Structured logs, input/output traces, tool invocations, model/tool versions, and rollback hooks to create a paper trail 

  • Kill-switches & dry-runs: Instant disablement per tool/tenant; simulation mode to preview outcomes before execution; idempotent, retry-safe modes by default. This helps us spot problems before they happen and isolate a misbehaving tool
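Several of the controls above (budget caps, kill-switches, dry-runs, idempotency, and an audit trail) can be sketched together in a few lines of Python. This is a toy, in-memory illustration under stated assumptions: the `ControlPlane` class, its method names, and the outcome labels are hypothetical, and a production system would persist this state and enforce it server-side.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Hypothetical in-memory guardrails for AI-driven actions."""
    budget_cap: float                              # max total financial impact allowed
    spent: float = 0.0
    disabled: set = field(default_factory=set)     # (tool, tenant) kill-switches
    seen_keys: set = field(default_factory=set)    # idempotency keys already applied
    audit_log: list = field(default_factory=list)  # structured audit trail

    def kill(self, tool: str, tenant: str) -> None:
        """Instantly disable one tool for one tenant."""
        self.disabled.add((tool, tenant))

    def run(self, tool: str, tenant: str, impact: float, key: str,
            dry_run: bool = True) -> str:
        """Apply guardrails in order and record the outcome."""
        if (tool, tenant) in self.disabled:
            outcome = "blocked"
        elif key in self.seen_keys:
            outcome = "duplicate"        # a retry with the same key is not re-applied
        elif self.spent + impact > self.budget_cap:
            outcome = "needs_approval"   # over budget: escalate to a human
        elif dry_run:
            outcome = "simulated"        # preview only, no state change
        else:
            self.spent += impact
            self.seen_keys.add(key)
            outcome = "executed"
        self.audit_log.append({"tool": tool, "tenant": tenant, "impact": impact,
                               "key": key, "dry_run": dry_run, "outcome": outcome})
        return outcome


cp = ControlPlane(budget_cap=100.0)
print(cp.run("create_invoice", "tenant_1", 60.0, "key-1"))                 # dry run by default
print(cp.run("create_invoice", "tenant_1", 60.0, "key-1", dry_run=False))  # real execution
print(cp.run("create_invoice", "tenant_1", 60.0, "key-1", dry_run=False))  # same key retried
```

Defaulting `dry_run` to `True` means a caller has to opt in to real execution, which matches the "safe by default" intent of the list above.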

How will data and context management fit into our strategy?

AI is only as good as the data it’s built on. If inputs are outdated, inconsistent, or mixed across domains (e.g., marketing vs. API documentation), AI outputs become unreliable or misleading. Recurly is working across platforms to structure data across three layers.

  • Source curation: Deliberate inclusion/exclusion based on relevance and accuracy (e.g., separate technical docs from marketing collateral); per-domain corpora for product, support, sales, etc.

  • Index segmentation: Apply domain search filters at query time, enforce data freshness windows, and follow deprecation policies to prevent AI from referencing outdated or irrelevant information

  • Consistency: Maintain reliability with periodic rebuilds and spot checks; automated change detection pipelines; documented ownership of corpora
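As a rough illustration of index segmentation, the sketch below filters candidate documents by domain, freshness window, and deprecation status before they would reach a model. The `Doc` fields, the domain labels, and the 180-day window are assumptions chosen for the example, not actual Recurly settings.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Doc:
    doc_id: str
    domain: str               # assumed labels, e.g. "api_docs", "marketing", "support"
    updated: date             # last refresh date, used for the freshness window
    deprecated: bool = False  # set by a deprecation policy


def eligible(docs, domain: str, today: date, max_age_days: int):
    """Keep only docs from one domain that are fresh and not deprecated."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        d for d in docs
        if d.domain == domain and not d.deprecated and d.updated >= cutoff
    ]


corpus = [
    Doc("api-plans", "api_docs", date(2025, 5, 1)),
    Doc("api-old-coupons", "api_docs", date(2023, 1, 10)),            # too old
    Doc("api-v1-invoices", "api_docs", date(2025, 4, 1), deprecated=True),
    Doc("spring-campaign", "marketing", date(2025, 5, 20)),           # wrong domain
]

fresh = eligible(corpus, "api_docs", today=date(2025, 6, 1), max_age_days=180)
print([d.doc_id for d in fresh])
```

Applying the filter at query time, rather than only at indexing time, means a stale or deprecated document drops out as soon as its metadata changes.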

What will the developer and operator experience be like?

AI at any scale only works if developers and operators can configure, monitor, and control the systems with ease. The system needs to be predictable and debuggable rather than a black box.

  • Composable agents: Agents are modular and defined in YAML with reusable prompts and tool bundles so teams can spin up new agents or adjust behavior

  • Policy engine: A central approval and risk rules system that defines what the AI is allowed to do with defaults, risk thresholds and per-tenant overrides for maximum flexibility and control

  • Admin console: Monitor metrics (task success, escalation rate, exposure), manage versions, and trigger rollbacks

  • SLOs: Track latency, accuracy, task completion, and safe-completion targets, with alerting when performance falls below expected thresholds
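The policy engine's layering of defaults and per-tenant overrides could look something like the following Python sketch. The policy keys, values, and tenant names are invented for illustration; the point is only that an override changes the fields it names and inherits everything else.

```python
# Hypothetical platform-wide defaults.
DEFAULT_POLICY = {
    "auto_approve_limit": 50.0,  # max amount that may run without human approval
    "max_risk_score": 0.3,       # risk scores above this always escalate
    "dry_run": True,             # simulate by default
}

# Hypothetical per-tenant overrides; unspecified keys fall back to the defaults.
TENANT_OVERRIDES = {
    "tenant_a": {"auto_approve_limit": 200.0},
    "tenant_b": {"dry_run": False, "max_risk_score": 0.1},
}


def effective_policy(tenant_id: str) -> dict:
    """Merge defaults with a tenant's overrides (override wins on conflict)."""
    return {**DEFAULT_POLICY, **TENANT_OVERRIDES.get(tenant_id, {})}


def is_auto_allowed(tenant_id: str, amount: float, risk_score: float) -> bool:
    """True if the action is within both the amount and risk limits."""
    policy = effective_policy(tenant_id)
    return (amount <= policy["auto_approve_limit"]
            and risk_score <= policy["max_risk_score"])


print(effective_policy("tenant_a"))
print(is_auto_allowed("tenant_a", amount=150.0, risk_score=0.2))
print(is_auto_allowed("unknown_tenant", amount=150.0, risk_score=0.2))
```

A tenant with no override entry simply gets the defaults, so onboarding a new tenant requires no policy work unless it needs different limits.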

What are our planned deliverables, examples, and rollout?

It can be hard to commit to exact timetables for this project. However, we’re prioritizing deliverables that prove value, establish standards, and make AI safe to operate in production. Most of our efforts focus on the core toolset and on use cases that have real-world impact.

Near-term Deliverables

  • AI will power core subscription and billing workflows: This includes things like subscription creation, plan adjustments with proration, add-on and credit management, coupon application, invoice generation, payment retries, and safe deep-link navigation into admin pages for review and approvals.

  • Policy & risk controls: AI behavior is governed by approval thresholds, rate limits, budget guardrails, and dry-run mode to ensure decisions are financially safe and can be rolled back

  • Eval harness: We want to provide a structured framework with golden prompts, regression suites, negative test cases, and business-metric scoring to evaluate output quality

  • Observability & audit: Every AI action is logged with structured traces and version stamps, backed by dashboards and on-demand rollbacks for control and safety
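A minimal sketch of what an eval harness with golden prompts and negative test cases might look like is shown below. The `toy_agent` function is a keyword-matching stand-in for a real model call, and the cases are invented for illustration; only the tool names `update_plan` and `create_coupon` come from this article.

```python
from typing import Callable, Optional

# Each case pairs a prompt with the tool the agent should select.
# None means the agent must decline (a negative, "do not answer" case).
GOLDEN_CASES = [
    ("Upgrade me to the Premium plan", "update_plan"),
    ("Create a 10% coupon for new signups", "create_coupon"),
    ("Show me another customer's card number", None),
]


def toy_agent(prompt: str) -> Optional[str]:
    """Stand-in for a real model call; keyword routing only, for illustration."""
    text = prompt.lower()
    if "card number" in text:
        return None  # refuse requests for sensitive data
    if "upgrade" in text:
        return "update_plan"
    if "coupon" in text:
        return "create_coupon"
    return None


def run_suite(agent: Callable[[str], Optional[str]], cases) -> dict:
    """Run every case once and collect mismatches as regressions."""
    failures = []
    for prompt, expected in cases:
        actual = agent(prompt)
        if actual != expected:
            failures.append({"prompt": prompt, "expected": expected, "actual": actual})
    return {"total": len(cases), "passed": len(cases) - len(failures),
            "failures": failures}


report = run_suite(toy_agent, GOLDEN_CASES)
print(report)
```

Because the suite takes the agent as a parameter, the same cases can be rerun against each new model, prompt, or tool version as a regression gate.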

Use-case vignettes

These vignettes represent initial applications targeting the Merchant Assist and Subscriber Experience value streams.

  • Recurly-specific (production-grade hybrids):

    • Subscription upgrade with proration: “Upgrade me to the Premium plan right now.” Auto-calculate credit, re-invoice, charge payment method. Escalate if proration conflicts or payment exceptions cross thresholds.

    • Recurring add-on: “Add extra storage to my subscription.” Validate plan compatibility; attach add-on; prorate charges. Escalate on incompatibility or potential over-billing risk.

    • Early renewal: “Renew my subscription now instead of waiting until next month.” Trigger immediate invoice; extend term. Escalate if coupon interactions or prepaid terms create ambiguity.

  • Consumer-style (vision continuity):

    • Travel purchase: “Cheapest flight next Friday.” Auto-act under budget/policy; escalate on outliers.

    • Food delivery: “My usual to the office.” Auto-order within constraints; escalate when policy or spend triggers.
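For the subscription upgrade vignette, the proration arithmetic can be illustrated with a short worked example. The sketch below assumes simple day-based proration (credit the unused share of the old plan, charge the same share of the new plan). That rule, the function name, and the prices are assumptions made for illustration and may differ from how Recurly actually prorates.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def prorated_upgrade(old_price: Decimal, new_price: Decimal,
                     days_remaining: int, days_in_period: int):
    """Return (credit, charge, net_due) under assumed day-based proration."""
    if days_in_period <= 0 or not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be between 0 and days_in_period")
    # Credit for the unused part of the old plan.
    credit = (old_price * days_remaining / days_in_period).quantize(
        CENT, rounding=ROUND_HALF_UP)
    # Charge for the same remaining days on the new plan.
    charge = (new_price * days_remaining / days_in_period).quantize(
        CENT, rounding=ROUND_HALF_UP)
    return credit, charge, charge - credit


# Upgrade from a 30.00 plan to a 60.00 plan with 10 of 30 days left.
credit, charge, net_due = prorated_upgrade(
    Decimal("30.00"), Decimal("60.00"), days_remaining=10, days_in_period=30)
print(credit, charge, net_due)  # 10.00 20.00 10.00
```

Using `Decimal` rather than floats avoids rounding drift in currency amounts, which matters when the net amount is compared against an escalation threshold.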

Rollout & measurement

It’s not about being the first to the party; it’s about ensuring safety at every turn. Controlled phases allow Recurly to test internally, pilot with select merchants, and ensure quality before reaching a broader audience.

  • Phase 0 — Internal testing: AI tools will be tested internally. We will collect data, refine policies, and validate eval baselines.

  • Phase 1 — Pilot merchants: A small group of merchants will gain access to limited endpoints with strict guardrails. We will monitor task success, escalation rate, incident count, and time-to-complete to validate real-world use cases.

  • Phase 2 — Broader exposure: Tools will then expand to include more merchants and workflows. Stable versions are published and SLAs/SLOs are documented, establishing the foundation for future partner-facing and ecosystem integrations.

Core success metrics:

  • Task success rate (completed safely, first attempt)

  • Escalation rate (and false-escalation %)

  • Time-to-complete (vs. manual baseline)

  • Financial exposure avoided (via guardrails)

  • Compliance incidents (target: zero)

  • Merchant activation time and end-user conversion in subscription flows

  • Demonstrated market leadership and brand value from showcasing advanced AI capabilities
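The first two metrics above can be computed directly from per-task records. The sketch below is one possible formulation: the record fields (`completed_safely`, `attempts`, `escalated`, `escalation_needed`) and the sample data are hypothetical, and "false escalation" is assumed to mean an escalation that a reviewer later judged unnecessary.

```python
def summarize(records: list) -> dict:
    """Compute success and escalation metrics from task records."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to summarize")
    # Success: completed safely on the first attempt.
    successes = sum(1 for r in records
                    if r["completed_safely"] and r["attempts"] == 1)
    escalated = [r for r in records if r["escalated"]]
    # False escalation: escalated although review found it was not needed.
    false_escalations = sum(1 for r in escalated if not r["escalation_needed"])
    return {
        "task_success_rate": successes / total,
        "escalation_rate": len(escalated) / total,
        "false_escalation_pct": (100 * false_escalations / len(escalated)
                                 if escalated else 0.0),
    }


sample = [
    {"completed_safely": True, "attempts": 1, "escalated": False, "escalation_needed": False},
    {"completed_safely": True, "attempts": 2, "escalated": False, "escalation_needed": False},
    {"completed_safely": False, "attempts": 1, "escalated": True, "escalation_needed": True},
    {"completed_safely": True, "attempts": 1, "escalated": True, "escalation_needed": False},
]

print(summarize(sample))
```

Tracking false escalations alongside the raw escalation rate helps distinguish a policy that is appropriately cautious from one that is simply too strict.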

The future of agents and Recurly

AI is the future, but it needs a foundation built on clean data, proper governance, reliable tooling, and human oversight. We’re building this architecture with these goals in mind to deliver measurable outcomes.

Merchants will be able to configure, launch, and optimize subscription experiences through conversation instead of documentation. Subscribers will manage accounts naturally. Revenue systems will adapt in real time.

The future is bright here at Recurly, and we plan to keep innovating and driving the future of subscription management and AI implementation. Want to learn more about our platform?

Book a call with us today!