Unlock AI Potential with Kong AI Gateway
The advent of Artificial Intelligence, particularly the explosive growth of Large Language Models (LLMs), has ushered in an era of unprecedented innovation, reshaping industries from finance and healthcare to creative arts and education. Organizations worldwide are scrambling to integrate AI capabilities into their products, services, and internal operations, recognizing the profound competitive advantage it offers. However, the journey from conceptualizing AI integration to deploying robust, secure, and scalable AI-powered applications is fraught with complexities. Managing diverse AI models, ensuring data privacy, optimizing costs, and maintaining high performance across dynamic environments present a formidable challenge that transcends traditional API management paradigms.
At the heart of overcoming these modern infrastructure hurdles lies the sophisticated AI Gateway. More than just a simple proxy, an AI Gateway is a specialized layer designed to abstract, secure, manage, and optimize access to a multitude of AI services and models. Among the leading solutions, Kong AI Gateway stands out as a powerful, flexible, and enterprise-grade platform, meticulously engineered to unlock the full potential of AI for developers and organizations alike. This exhaustive guide will delve deep into the transformative power of AI Gateways, elucidate the unique capabilities of Kong, and provide a roadmap for building the next generation of intelligent applications. We will explore how Kong not only streamlines AI integration but also fortifies security, enhances performance, and provides crucial insights into AI consumption, making it an indispensable tool in the modern AI landscape.
The AI Revolution and Its Demands on Infrastructure
The narrative of Artificial Intelligence has been one of continuous evolution, marked by incremental breakthroughs culminating in today's extraordinary capabilities. From early expert systems and machine learning algorithms to the deep learning revolution ignited by neural networks, each wave of innovation has brought new possibilities and, concurrently, new demands on the underlying technological infrastructure. Today, we stand at the threshold of the Generative AI era, largely propelled by Large Language Models (LLMs) like GPT, Llama, and Gemini. These models possess an astonishing ability to understand, generate, and process human language at scales previously unimaginable, offering applications ranging from sophisticated content creation and hyper-personalized customer service to complex code generation and nuanced data analysis.
The allure of LLMs is undeniable. Businesses envision conversational interfaces that anticipate customer needs, content pipelines that generate high-quality text in moments, and development workflows accelerated by AI-powered coding assistants. However, integrating these powerful yet resource-intensive models into existing enterprise systems is far from trivial. Organizations face a multifaceted array of challenges that go beyond the capabilities of traditional API management:
- Complexity of Model Integration: The AI ecosystem is diverse, featuring a myriad of models from different providers, each with unique APIs, authentication mechanisms, and data formats. Integrating even a handful of these directly into applications can lead to brittle, hard-to-maintain codebases and significant vendor lock-in. Furthermore, the rapid pace of AI innovation means models are constantly updated, superseded, or replaced, requiring applications to adapt frequently.
- Scalability and Performance: AI workloads, particularly those involving real-time inference with LLMs, can be incredibly demanding. Latency is often a critical factor for user experience, especially in interactive applications. Ensuring that AI services can scale dynamically to meet fluctuating demand, without compromising performance or incurring exorbitant costs, requires sophisticated traffic management and optimization strategies.
- Security and Data Privacy: AI models, especially when handling sensitive user prompts or generating responses, are prime targets for various security vulnerabilities. Data breaches, prompt injection attacks, and unauthorized access to models or the data they process pose significant risks. Protecting intellectual property embedded in custom models and ensuring compliance with data privacy regulations (e.g., GDPR, CCPA) necessitates robust authentication, authorization, and data governance mechanisms at every layer of the interaction.
- Cost Management and Optimization: LLMs are expensive to operate, with costs often tied to token usage, compute resources, and API calls. Without a centralized mechanism to monitor, control, and optimize AI consumption, organizations can quickly find themselves facing runaway expenditures. Strategies for caching, intelligent routing, and resource allocation become paramount to financial sustainability.
- Prompt Engineering and Transformation: The quality of AI output is heavily dependent on the input prompt. Effective prompt engineering often involves complex templating, context injection, and pre-processing of user queries. Directly embedding this logic into every application can lead to redundancy and make it difficult to centralize prompt best practices or adapt to new prompt engineering techniques.
- Observability and Governance: Understanding how AI models are being used, by whom, and for what purpose is crucial for debugging, auditing, and making informed business decisions. Comprehensive logging, monitoring of AI-specific metrics (like token counts, inference latency, error rates), and analytical capabilities are essential for effective governance and continuous improvement.
These intricate demands underscore the inadequacy of traditional IT infrastructure components when confronted with the unique requirements of AI. A new architectural paradigm, one that specifically addresses these challenges, is not just beneficial but absolutely essential for unlocking the true potential of AI at scale. This is precisely where the concept of an AI Gateway, and specifically a robust solution like Kong AI Gateway, becomes invaluable. It acts as the intelligent intermediary, transforming raw AI capabilities into consumable, secure, and manageable services for the enterprise.
Understanding AI Gateways and API Gateways
To fully appreciate the innovations brought by an AI Gateway, it's crucial to first understand its foundational predecessor: the API Gateway. Both serve as critical architectural components in modern distributed systems, acting as a single entry point for a group of microservices or external APIs. However, their scope, specialized functionalities, and the specific challenges they address diverge significantly, especially when considering the unique landscape of Artificial Intelligence.
The Traditional Role of an API Gateway
An API Gateway is a management tool that sits in front of backend services, providing a single, unified entry point for external clients to access various functionalities. Think of it as the front desk of a large hotel: guests don't need to know the layout of every room or department; they just interact with the front desk, which then directs their requests to the appropriate internal service (e.g., concierge, housekeeping, room service).
Historically, traditional API Gateways have been instrumental in:
- Request Routing: Directing incoming requests to the correct backend service or microservice based on paths, headers, or other request attributes. This abstracts the internal service architecture from clients.
- Authentication and Authorization: Verifying client identities (e.g., API keys, OAuth tokens, JWTs) and ensuring they have the necessary permissions to access specific resources. This centralizes security policies.
- Rate Limiting and Throttling: Protecting backend services from overload or abuse by controlling the number of requests a client can make within a given timeframe. This ensures fair usage and system stability.
- Load Balancing: Distributing incoming traffic across multiple instances of a backend service to ensure high availability and optimal resource utilization.
- Traffic Management: Implementing policies for retries, circuit breakers, and timeouts to enhance resilience and prevent cascading failures.
- Caching: Storing responses from backend services to reduce latency and load for frequently requested data.
- Logging and Monitoring: Collecting data about API requests and responses for observability, debugging, and analytics.
- Protocol Transformation: Converting protocols (e.g., HTTP to gRPC) or data formats.
In essence, a traditional API Gateway simplifies client-side consumption of complex microservice architectures, enhances security, improves performance, and provides crucial operational insights. It's a cornerstone for building robust and scalable web APIs.
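To make these responsibilities concrete, here is a deliberately minimal Python sketch of a gateway-style request handler. It is a toy for illustration only, not how Kong (built on Nginx and configured declaratively) implements these features, and the routes, API keys, and limits are invented:

```python
# Toy illustration of the core responsibilities a traditional API gateway
# layers in front of backend services: routing, API-key authentication,
# and per-client rate limiting. All names and values here are invented.
import time
from collections import defaultdict

ROUTES = {"/billing": "http://billing.internal", "/users": "http://users.internal"}
API_KEYS = {"key-abc123": "mobile-app"}          # key -> client identity
RATE_LIMIT = 5                                    # requests per minute per client
_request_log = defaultdict(list)                  # client -> request timestamps

def handle(path: str, api_key: str) -> str:
    client = API_KEYS.get(api_key)
    if client is None:                            # authentication
        return "401 Unauthorized"
    now = time.time()
    window = [t for t in _request_log[client] if now - t < 60]
    if len(window) >= RATE_LIMIT:                 # rate limiting
        return "429 Too Many Requests"
    _request_log[client] = window + [now]
    for prefix, upstream in ROUTES.items():       # path-based routing
        if path.startswith(prefix):
            return f"200 proxied to {upstream}{path}"
    return "404 No route"

print(handle("/billing/invoices", "key-abc123"))  # 200 proxied to ...
```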
The Evolution to an AI Gateway: Addressing AI-Specific Needs
While a traditional API Gateway can certainly expose AI models as RESTful services, it often lacks the specialized functionalities required to truly manage, optimize, and secure the unique characteristics of AI workloads, particularly those involving LLMs. This is where the concept of an AI Gateway emerges as a critical evolution.
An AI Gateway extends the core functionalities of an API Gateway by adding AI-specific intelligence and features. It's not just about routing requests to an AI service; it's about understanding the nature of those requests (prompts), the characteristics of the AI models (token usage, latency profiles), and the specific security and cost implications inherent in AI interactions.
Here's how an AI Gateway, often referred to as an LLM Gateway when specifically dealing with Large Language Models, differs and adds value:
- Intelligent Routing and Model Orchestration: Beyond simple path-based routing, an AI Gateway can route requests based on the type of AI model required, its cost, performance characteristics, or even specific metadata embedded in the prompt. It can also orchestrate interactions with multiple models, chaining them together or dynamically selecting the best model for a given task (e.g., routing a complex query to a more powerful but expensive LLM, and a simple one to a smaller, cheaper model).
- Prompt Engineering and Transformation: An AI Gateway can act as a central hub for prompt management. It can apply standardized prompt templates, inject contextual information (e.g., user profiles, historical data), perform data masking for PII, or even translate prompts between different formats required by various AI providers. This decouples prompt logic from application code.
- AI-Specific Security and Governance: Beyond basic authentication, an AI Gateway can implement advanced security measures tailored for AI, such as scanning prompts and responses for sensitive data (PII), detecting and preventing prompt injection attacks, enforcing content moderation policies, and providing granular access control to specific AI capabilities within a model.
- Cost Management and Optimization for LLMs: This is a crucial differentiator. An AI Gateway can track token usage for LLMs, implement cost-aware rate limiting, apply intelligent caching strategies for common AI responses (semantic caching), and even perform cost-based routing to cheaper models when quality requirements allow. This transforms AI consumption from a black box expense into a manageable, optimizable resource.
- Enhanced Observability for AI: While traditional gateways log requests, an AI Gateway offers deep insights into AI interactions. It can log prompt details (without sensitive info), generated responses, token counts (input/output), inference latency, and model-specific error codes. This data is invaluable for debugging AI applications, monitoring model performance drift, and understanding usage patterns.
- Model Agnostic Abstraction: An AI Gateway provides a unified interface for interacting with various AI models from different vendors (OpenAI, Hugging Face, custom models). This insulates applications from underlying model changes, reducing vendor lock-in and simplifying future model migrations.
- Caching for AI Responses: Traditional caching works for static data. AI Gateways implement semantic caching, which understands that semantically similar prompts might yield identical or nearly identical responses, allowing for caching even with slight variations in input. This dramatically reduces latency and inference costs for frequently asked questions or common AI tasks.
In summary, while an API Gateway manages access to generic backend services, an AI Gateway (or LLM Gateway) is specifically engineered to handle the nuances of AI workloads. It adds a layer of intelligence and specialized functionality that is essential for cost-effective, secure, scalable, and high-performing AI integration. It is the architectural linchpin for organizations aiming to truly unlock the transformative power of AI without succumbing to its inherent complexities.
Kong AI Gateway: Architecture and Core Features
Kong Gateway, renowned for its open-source foundation, flexibility, and performance, has naturally evolved to meet the demands of the AI era. Kong AI Gateway leverages Kong's robust, battle-tested core capabilities and extends them with a suite of AI-specific plugins and features, positioning itself as a leading AI Gateway and LLM Gateway solution. Its architecture is designed for extreme extensibility, allowing organizations to tailor their AI infrastructure precisely to their unique needs.
At its core, Kong Gateway operates as a lightweight, fast, and scalable API proxy built on Nginx and LuaJIT. This foundation provides superior performance and flexibility. When integrated with its extensive plugin ecosystem, Kong transforms into a powerful AI management platform capable of handling the most demanding AI workloads.
Let's delve into the core features that make Kong AI Gateway an indispensable tool for managing AI:
1. Intelligent Routing and Load Balancing for AI Workloads
Kong's advanced routing capabilities are critical for AI. Beyond simple path or host-based routing, Kong AI Gateway can intelligently direct AI inference requests based on a variety of factors:
- Model Type and Version: Route requests for `sentiment-analysis-v2` to a specific backend while `translation-german-english` goes to another. This is crucial for A/B testing models or gradually rolling out new versions.
- Request Payload Content: Analyze the prompt itself to determine which model is best suited. For instance, route long, complex prompts to a powerful LLM and short, simple ones to a more economical model.
- User/Application Context: Direct requests from specific internal teams or external partners to dedicated AI resources, potentially with different performance guarantees or cost allocations.
- Cost Optimization: Implement cost-aware routing logic that prioritizes cheaper models or providers when quality requirements allow, dynamically switching to premium models only when necessary.
- Geographical Location: Route requests to AI services hosted in specific regions to comply with data residency requirements or minimize latency for geographically distributed users.
Combined with sophisticated load balancing algorithms (round-robin, least connections, consistent hashing), Kong ensures that AI services are highly available, performant, and efficiently utilized, preventing single points of failure and distributing heavy inference loads effectively.
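As an illustration of the routing logic described above, the following hedged Python sketch picks the cheapest model whose capability ceiling covers an estimated prompt complexity. The model names, prices, and word-count heuristic are invented assumptions; in Kong, such policies are expressed through configuration and plugins rather than application code.

```python
# Sketch of cost- and complexity-aware model routing. Model names, prices,
# and the complexity heuristic are illustrative assumptions only; a real
# gateway would use richer signals (classifiers, metadata, SLAs).
MODELS = [
    {"name": "small-llm",   "cost_per_1k_tokens": 0.0005, "max_complexity": 3},
    {"name": "mid-llm",     "cost_per_1k_tokens": 0.003,  "max_complexity": 7},
    {"name": "premium-llm", "cost_per_1k_tokens": 0.03,   "max_complexity": 10},
]

def estimate_complexity(prompt: str) -> int:
    """Crude proxy: longer prompts with question chains score higher (1-10)."""
    score = min(len(prompt.split()) // 50, 8) + min(prompt.count("?"), 2)
    return max(1, min(score, 10))

def route(prompt: str) -> str:
    """Pick the cheapest model whose capability ceiling covers the request."""
    need = estimate_complexity(prompt)
    eligible = [m for m in MODELS if m["max_complexity"] >= need]
    return min(eligible, key=lambda m: m["cost_per_1k_tokens"])["name"]

print(route("Translate 'good morning' to German"))           # small-llm
print(route("Analyze this long contract clause by clause " * 120))  # premium-llm
```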
2. Comprehensive Security and Access Control
Security is paramount when dealing with AI, especially with sensitive data flowing through prompts and responses. Kong AI Gateway provides a multi-layered security approach:
- Authentication: Supports a wide array of authentication methods, including API keys, OAuth 2.0, JWT (JSON Web Tokens), mTLS (mutual TLS), and OpenID Connect. This ensures only authorized applications and users can access your AI models. For LLMs, this means tightly controlling who can submit prompts and generate responses.
- Authorization: Beyond authentication, Kong allows for fine-grained authorization policies. You can define what specific AI models or even what capabilities within an AI model a user or application is allowed to access. For example, a marketing team might access a content generation LLM, while a finance team accesses a data analysis model.
- IP Restriction: Whitelist or blacklist IP addresses to control access to AI services, adding another layer of network security.
- Data Masking and PII Redaction: A critical feature for data privacy. Kong can be configured to automatically identify and mask Personally Identifiable Information (PII) within prompts before they reach the AI model, and potentially redact sensitive data from AI responses before they are sent back to the client. This significantly reduces data exposure risks and aids compliance with regulations like GDPR and CCPA (see the redaction sketch after this list).
- Prompt Injection Prevention: While not a silver bullet, Kong can implement rules and plugins to detect and potentially block common patterns associated with prompt injection attacks, adding a vital defensive layer.
- Content Moderation: Integrate with content moderation services or implement rules to filter out inappropriate, harmful, or policy-violating content in prompts and responses, ensuring responsible AI usage.
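The redaction sketch referenced above is a minimal, regex-only Python illustration of masking PII before a prompt leaves the trust boundary. The patterns (email, US-style SSN, 16-digit card number) are illustrative assumptions; production deployments typically combine such patterns with trained entity recognizers.

```python
# Minimal regex-based PII masking of the kind a gateway can apply before a
# prompt reaches an external model. Patterns are illustrative only.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD":  re.compile(r"\b(?:\d[ -]?){16}\b"),
}

def redact(prompt: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}_REDACTED]", prompt)
    return prompt

prompt = "Summarize the complaint from jane.doe@example.com (SSN 123-45-6789)."
print(redact(prompt))
# Summarize the complaint from [EMAIL_REDACTED] (SSN [SSN_REDACTED]).
```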
3. Granular Rate Limiting and Throttling
AI models, particularly LLMs, have resource constraints and often incur costs per token or per call. Effective rate limiting is essential for cost control, preventing abuse, and ensuring fair access:
- Per-Consumer/Per-Service Rate Limiting: Apply different rate limits based on the user, application, or specific AI service being accessed. A premium subscriber might have a higher token limit than a free-tier user.
- Cost-Aware Limiting: Implement rate limits based on estimated cost (e.g., limit to X tokens per minute, rather than just X requests per minute), which is critical for managing LLM expenses (see the token-budget sketch after this list).
- Dynamic Throttling: Adjust rate limits dynamically based on the current load of the backend AI service, preventing overload and maintaining service quality.
- Concurrency Limits: Control the number of simultaneous active requests to an AI model, preventing resource exhaustion for highly concurrent inference workloads.
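The token-budget sketch referenced above illustrates rate limiting denominated in tokens rather than requests. The window size and per-tier budgets are invented, and a real gateway would read actual token counts from provider response metadata rather than trusting client estimates.

```python
# Sketch of token-aware (cost-aware) rate limiting: the budget is spent in
# LLM tokens rather than request counts. Tiers and budgets are invented.
import time
from collections import defaultdict

TOKEN_BUDGET_PER_MINUTE = {"free-tier": 2_000, "premium": 50_000}
_usage = defaultdict(list)  # consumer -> [(timestamp, tokens_spent)]

def allow(consumer: str, tier: str, estimated_tokens: int) -> bool:
    now = time.time()
    recent = [(t, n) for t, n in _usage[consumer] if now - t < 60]
    spent = sum(n for _, n in recent)
    if spent + estimated_tokens > TOKEN_BUDGET_PER_MINUTE[tier]:
        return False                      # reject: would exceed token budget
    _usage[consumer] = recent + [(now, estimated_tokens)]
    return True

print(allow("alice", "free-tier", 1_500))   # True  (1,500 of 2,000 used)
print(allow("alice", "free-tier", 1_500))   # False (would exceed budget)
```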
4. Robust Observability, Analytics, and Cost Tracking
Understanding the usage patterns, performance, and costs of your AI infrastructure is non-negotiable for optimization and governance. Kong AI Gateway provides deep visibility:
- Detailed Logging: Comprehensive logging of every AI request and response, including prompt content (with redaction for sensitive info), generated output, model used, latency, token counts (input and output), and error codes. This is invaluable for debugging AI application logic and identifying model issues.
- Metrics and Monitoring: Integrates with popular monitoring systems (Prometheus, Datadog) to provide real-time metrics on AI inference latency, error rates, request volume, and token usage. This allows for proactive identification of performance degradation or cost spikes.
- AI Usage Analytics: By collecting rich data on AI interactions, Kong enables detailed analytics on which models are most popular, which applications consume the most tokens, and what the typical prompt characteristics are. This informs model selection, resource allocation, and cost optimization strategies.
- Cost Tracking: Specifically for LLMs, Kong can track token consumption by user, application, and model, providing the necessary data to accurately attribute costs, enforce budgets, and negotiate better terms with AI providers.
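To make cost tracking concrete, here is a small sketch of the kind of structured usage record a gateway can emit for every call, with a simple cost-attribution formula. The field names and pricing table are assumptions for illustration:

```python
# Sketch of a per-call usage record with cost attribution. The pricing
# table and field names are invented assumptions, not actual provider rates.
import json, time

PRICE_PER_1K = {"gpt-4": {"in": 0.03, "out": 0.06},
                "small-llm": {"in": 0.0005, "out": 0.0015}}

def usage_record(consumer, model, tokens_in, tokens_out, latency_ms):
    price = PRICE_PER_1K[model]
    cost = (tokens_in * price["in"] + tokens_out * price["out"]) / 1000
    return {
        "ts": time.time(), "consumer": consumer, "model": model,
        "tokens_in": tokens_in, "tokens_out": tokens_out,
        "latency_ms": latency_ms, "cost_usd": round(cost, 6),
    }

print(json.dumps(usage_record("marketing-app", "gpt-4", 820, 310, 1450)))
```

Aggregating these records per consumer and per model is what enables chargeback, budget enforcement, and informed negotiation with AI providers.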
5. Advanced Prompt Engineering and Transformation
The quality of AI output is heavily influenced by the prompt. Kong AI Gateway can centralize and manage prompt logic:
- Prompt Templating: Define and enforce standardized prompt templates across applications. This ensures consistency, simplifies prompt engineering efforts, and allows for rapid iteration on prompt strategies without modifying application code (see the templating sketch after this list).
- Context Injection: Automatically inject relevant contextual information (e.g., user profiles, conversation history, retrieved data from external sources) into prompts before they reach the AI model. This enhances the AI's ability to provide personalized and relevant responses.
- Request/Response Transformation: Modify request headers, query parameters, or the request body itself to conform to the specific API requirements of different AI models. Similarly, transform AI responses into a standardized format for client applications, abstracting away model-specific output formats.
- Semantic Router: Direct prompts to different AI services based on semantic understanding of the prompt itself, potentially routing "code generation" prompts to a coding LLM and "summarization" prompts to another.
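The templating sketch referenced above shows centralized prompt templates with server-side context injection, so applications submit only the raw user question. The template text and the CRM lookup stub are invented for illustration:

```python
# Sketch of centralized prompt templating with context injection, of the
# kind a gateway can apply so client applications send only the raw query.
SUPPORT_TEMPLATE = (
    "You are a support assistant for {product}.\n"
    "Customer tier: {tier}. Recent orders: {orders}.\n"
    "Answer concisely.\n\nCustomer question: {question}"
)

def fetch_context(user_id: str) -> dict:
    # Stand-in for a CRM/profile lookup the gateway performs server-side.
    return {"product": "AcmeCloud", "tier": "premium", "orders": "#1042, #1077"}

def build_prompt(user_id: str, question: str) -> str:
    return SUPPORT_TEMPLATE.format(question=question, **fetch_context(user_id))

print(build_prompt("u-9", "Why was my last invoice higher than usual?"))
```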
6. Caching for AI Responses (Semantic Caching)
Traditional caching is often ineffective for AI, as prompts can vary slightly while still requesting semantically similar information. Kong AI Gateway's approach to caching is more intelligent:
- Semantic Caching: Instead of exact-match caching, an AI Gateway can employ semantic caching. This involves analyzing the meaning of prompts to identify semantically similar requests that have already been processed. If a similar prompt was answered recently, the cached AI response can be returned, drastically reducing inference latency and, more importantly, cutting down on costly API calls to LLMs (a toy implementation follows this list).
- Configurable Cache Invalidation: Set policies for cache expiration based on time, model version updates, or specific data changes.
- Response Deduplication: Prevent redundant calls to AI models for identical or near-identical prompts, saving costs and improving response times.
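The toy implementation referenced above sketches the semantic-caching idea: cached entries are retrieved by similarity in an embedding space rather than by exact key match. The bag-of-words embed() function is a stand-in; real systems use an embedding model, a vector index, and a carefully tuned similarity threshold:

```python
# Toy semantic cache: a prompt hits the cache when it is "close enough"
# to a previously answered prompt. embed() is a deliberately crude
# bag-of-words stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

_cache = []        # list of (prompt embedding, cached response)
THRESHOLD = 0.8    # similarity threshold; tuned empirically in practice

def lookup(prompt: str):
    vec = embed(prompt)
    for cached_vec, response in _cache:
        if cosine(vec, cached_vec) >= THRESHOLD:
            return response               # semantic hit: skip the LLM call
    return None

def store(prompt: str, response: str):
    _cache.append((embed(prompt), response))

store("what is your refund policy", "Refunds are available within 30 days.")
print(lookup("what is your refund policy please"))  # hit despite wording change
```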
7. Model Agnostic Integration and Unified API
One of the most significant benefits of an AI Gateway is its ability to abstract away the underlying complexity of various AI models:
- Unified API Endpoint: Present a single, consistent API endpoint to client applications, regardless of whether the request is ultimately routed to OpenAI, Google Gemini, Hugging Face models, or a custom internal model (a client-side sketch follows this list).
- Reduced Vendor Lock-in: By abstracting the AI model layer, organizations can more easily switch between AI providers or integrate new models without requiring extensive changes to consumer applications. This fosters agility and protects against vendor lock-in.
- Simplified Development: Developers no longer need to learn the intricacies of each AI provider's API. They interact with a standardized interface provided by Kong, dramatically simplifying AI integration.
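The client-side sketch referenced above shows what this unification buys application developers: one endpoint, one credential, one payload shape. The gateway URL, header name, and payload fields are hypothetical:

```python
# What a unified gateway interface looks like from the application side.
# The URL, header name, and payload/response fields below are hypothetical.
import requests

GATEWAY_URL = "https://ai-gateway.example.com/v1/chat"

def ask(prompt: str, task: str = "general") -> str:
    resp = requests.post(
        GATEWAY_URL,
        headers={"apikey": "my-app-key"},       # gateway-issued credential
        json={"task": task, "prompt": prompt},  # provider-agnostic payload
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["output"]                # normalized response field

# The same call works whether the gateway routes to OpenAI, a Hugging Face
# model, or an internal deployment; swapping providers changes no app code.
# print(ask("Summarize our Q3 results in two sentences.", task="summarize"))
```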
8. Powerful Plugin Ecosystem and Extensibility
Kong's strength lies in its highly extensible plugin architecture. This allows organizations to build custom logic and integrate third-party services directly into the API Gateway flow:
- Custom Lua Plugins: Develop bespoke plugins in Lua to implement highly specific AI-related logic, such as integrating with proprietary internal systems, performing unique data transformations, or implementing custom AI governance rules.
- Third-Party Integrations: Easily integrate with external services for enhanced functionality, such as external data sources for prompt context, specialized security services, or advanced analytics platforms.
- Flexibility for Evolving AI Landscape: The plugin ecosystem ensures Kong AI Gateway can adapt quickly to new AI technologies, models, and industry best practices without requiring core product changes.
9. Developer Portal and Experience
To maximize the adoption of AI services within an organization or by external partners, a seamless developer experience is crucial:
- Self-Service API Discovery: A developer portal built on Kong allows developers to easily discover available AI models and services, understand their capabilities, and access documentation.
- Automated API Key Provisioning: Streamline the process of developers obtaining API keys or credentials to access AI services.
- Usage Monitoring for Developers: Provide developers with dashboards to monitor their own AI consumption, performance, and costs, fostering self-management and accountability.
Comparative Table: Traditional API Gateway vs. Kong AI Gateway for AI/LLM Workloads
| Feature Area | Traditional API Gateway Functionality | Kong AI Gateway Enhancements for AI/LLMs |
| --- | --- | --- |
| Routing | Path-, host-, and header-based routing to backend services | Model-, cost-, and prompt-aware routing; multi-model orchestration and failover |
| Security | API keys, OAuth 2.0, JWT, mTLS, IP restriction | PII redaction, prompt injection detection, content moderation |
| Rate Limiting | Request counts per client per timeframe | Token- and cost-aware limits, dynamic throttling, concurrency controls |
| Caching | Exact-match caching of responses | Semantic caching of AI responses based on prompt similarity |
| Observability | Request/response logs and latency metrics | Token counts, inference latency, per-model and per-consumer cost tracking |
| Abstraction | Protocol and data-format transformation | Unified, model-agnostic API across AI providers; prompt templating and context injection |
Value to Enterprises
Kong AI Gateway, by integrating these powerful features, offers profound value to enterprises:
- Accelerated AI Adoption: Simplifies the integration of AI models, allowing development teams to rapidly prototype and deploy AI-powered features.
- Enhanced Security Posture: Provides a critical layer of defense for AI models and data, mitigating risks associated with unauthorized access, data breaches, and prompt injection attacks.
- Significant Cost Optimization: Intelligent routing, caching, and detailed cost tracking lead to substantial savings on AI inference costs, especially for high-volume LLM usage.
- Improved Performance and Reliability: Advanced traffic management, load balancing, and robust scaling ensure AI services are consistently fast and available.
- Reduced Operational Complexity: Centralizes the management of diverse AI models, abstracting away individual API differences and simplifying day-to-day operations.
- Future-Proofing AI Investments: The model-agnostic approach and extensible architecture ensure that organizations can adapt to the rapidly evolving AI landscape without extensive re-architecting.
- Empowered Development Teams: A unified API and developer portal make it easier for developers to discover, integrate, and build upon AI capabilities, fostering innovation.
In essence, Kong AI Gateway transforms the challenge of AI integration into an opportunity for strategic growth, enabling organizations to build smarter, more secure, and more efficient AI-powered applications at scale.
Use Cases and Real-World Applications
The versatility and power of Kong AI Gateway manifest across a broad spectrum of real-world applications, offering tangible benefits for various industries and operational scenarios. By providing a unified, secure, and optimized interface to AI models, Kong enables businesses to innovate faster, reduce costs, and deliver superior intelligent experiences.
1. Enterprise AI Integration: Connecting Internal Systems with External LLMs
One of the most immediate and impactful use cases for Kong AI Gateway is facilitating the secure and efficient integration of external LLMs (like OpenAI's GPT-series, Google's Gemini, or Anthropic's Claude) with internal enterprise systems. Many organizations want to leverage these powerful models for tasks such as customer service automation, internal knowledge retrieval, or content generation, but face significant hurdles in doing so securely and scalably.
Scenario: A large financial institution wants to empower its customer service agents with an AI assistant that can quickly pull information from internal databases and summarize complex policy documents using an external LLM.
Kong AI Gateway's Role:
- Security: All agent requests for the AI assistant would first pass through Kong. Kong would authenticate the agent and ensure they have the necessary authorization to use the LLM and access specific internal data sources. PII redaction plugins on Kong would automatically scrub sensitive customer data from the agent's query before it's sent to the external LLM, ensuring compliance with strict financial regulations.
- Prompt Engineering: Kong can dynamically inject context from the internal CRM or knowledge base into the agent's prompt, creating a richer, more accurate query for the LLM without the agent needing to manually copy-paste information. It can also apply a standardized prompt template to ensure consistent output quality.
- Cost Management: Kong tracks token usage per agent and per LLM call, providing granular insights into consumption. It can also enforce rate limits or even reroute requests to a cheaper, smaller LLM for less critical queries if cost thresholds are being approached.
- Observability: Detailed logs of every interaction (redacted prompts, LLM responses, latency) are collected by Kong, allowing the institution to audit AI usage, troubleshoot issues, and monitor the performance of the AI assistant.
This ensures that the financial institution can harness the power of advanced LLMs without compromising sensitive data or incurring uncontrolled costs, all while providing a seamless experience for its agents.
2. Building AI-Powered Products: Providing Secure, Scalable Access to AI for Developers
For companies building AI into their core product offerings, Kong AI Gateway serves as the essential abstraction layer, allowing product developers to consume AI capabilities without deep knowledge of the underlying models or providers.
Scenario: A SaaS company develops a marketing platform that offers AI-powered features like blog post generation, social media caption creation, and ad copy optimization. These features rely on a combination of proprietary fine-tuned models and generic commercial LLMs.
Kong AI Gateway's Role:
- Unified API: Product developers interact with a single, consistent API endpoint provided by Kong, regardless of whether a feature uses a proprietary model (e.g., for niche industry-specific content) or a third-party LLM (e.g., for general creative writing). This greatly simplifies development and allows for future model swaps without impacting the product code.
- Model Orchestration: Kong can intelligently route requests based on the specific marketing task. A request for a blog post might go to a powerful LLM, while a simple headline generation request might go to a faster, lighter model. If a primary model fails or reaches its rate limit, Kong can automatically fail over to a secondary model.
- Caching: For common requests (e.g., "generate five variations of this short ad copy"), Kong's semantic caching can serve responses directly, drastically reducing latency for users and cutting down on inference costs.
- Access Control & Monetization: If the SaaS platform has different tiers, Kong can enforce access controls and rate limits based on user subscriptions. Premium users might get higher token limits or access to more advanced, costly models, which Kong meticulously tracks for billing purposes.
This architecture allows the SaaS company to rapidly iterate on AI features, maintain a high level of performance, control operational costs, and offer a robust, reliable AI experience to its customers.
3. Cost Optimization for LLM Usage: Strategic Resource Management
The operational costs of LLMs can quickly escalate without proper management. Kong AI Gateway provides critical tools for cost control and optimization.
Scenario: A large e-commerce company uses various LLMs across different departments for product descriptions, customer support chatbots, and internal documentation. They are experiencing unpredictable and rising AI costs.
Kong AI Gateway's Role:
- Token-Based Rate Limiting: Instead of just limiting requests, Kong can limit actual token consumption per user, team, or application, directly correlating usage with cost.
- Cost-Aware Routing: Implement logic that routes requests to the most cost-effective LLM available for a given task, considering quality requirements. For example, a basic chatbot query might go to a cheaper open-source model hosted internally, while a complex product recommendation request goes to a high-end commercial LLM.
- Intelligent Caching: For frequently asked questions in the chatbot or repetitive content generation tasks, Kong's semantic caching can serve responses instantly, eliminating redundant LLM calls and associated costs.
- Detailed Cost Analytics: Kong provides granular logs and metrics on token usage per model, per user, and per application. This data is invaluable for understanding cost drivers, setting budgets, and identifying areas for optimization. The e-commerce company can use this to charge back AI costs to specific departments or negotiate better terms with AI providers.
By implementing these strategies through Kong, the e-commerce company can significantly reduce its LLM operational expenses and gain full control over its AI budget.
4. Enhanced Security for Sensitive AI Workloads: Protecting Data and Models
AI workloads often involve sensitive data, making security a paramount concern. Kong AI Gateway provides a robust defense layer.
Scenario: A healthcare provider is developing an AI system to analyze patient medical records for diagnostic support. The system needs to use an external LLM for summarizing reports but cannot send raw patient data to a third-party service.
Kong AI Gateway's Role:
- Strict Access Control: Only authorized internal applications and users can access the diagnostic AI service through Kong, enforced by strong authentication (e.g., JWT, OAuth).
- PII Redaction/Masking: Before any patient data (even a summary) is sent to the external LLM, Kong automatically scans and redacts or masks all PII (names, dates of birth, medical record numbers, etc.). This ensures that no identifiable patient information leaves the healthcare provider's controlled environment.
- Prompt Injection Prevention: Kong can detect and block suspicious patterns in prompts that could indicate an attempt to manipulate the LLM or extract unauthorized information.
- Audit Trails: Every request to the AI service, including the sanitized prompt and the LLM's response, is logged by Kong. This provides a comprehensive audit trail for compliance purposes and helps in investigating any potential security incidents.
- Threat Detection Integration: Kong can integrate with external security tools to analyze traffic patterns for anomalies or known attack vectors targeting AI services.
This robust security framework allows the healthcare provider to leverage advanced AI capabilities while strictly adhering to patient privacy regulations and safeguarding sensitive medical data.
5. Multi-Model Orchestration and A/B Testing: Dynamic AI Strategy
As organizations mature in their AI adoption, they often need to experiment with multiple models, compare their performance, and dynamically switch between them. Kong AI Gateway simplifies this complex orchestration.
Scenario: A media company is exploring different LLMs for generating news summaries. They want to compare the quality and cost-effectiveness of GPT-4, Gemini Pro, and an internally fine-tuned open-source model.
Kong AI Gateway's Role:
- Dynamic Routing: Kong can route a percentage of summary requests to each model (e.g., 50% to GPT-4, 30% to Gemini Pro, 20% to the internal model) for A/B testing.
- Unified Response Format: Regardless of which LLM generates the summary, Kong can transform the response into a consistent format for the client application, making comparison easier.
- Performance and Cost Tracking: Kong meticulously logs the latency, token usage, and error rates for each model. This data is then used to evaluate the trade-offs between model quality, speed, and cost, allowing the media company to make data-driven decisions on which model to deploy more widely.
- Failover and Fallback: If one model experiences an outage or degraded performance, Kong can automatically reroute traffic to another available model, ensuring uninterrupted service.
By leveraging Kong AI Gateway, the media company can efficiently experiment with and optimize its AI strategy, dynamically adapting to the best-performing and most cost-effective models without refactoring its application code.
These use cases demonstrate that Kong AI Gateway is not just a theoretical concept but a practical, powerful tool that directly addresses the most pressing challenges in AI integration, security, performance, and cost management across diverse industries. It empowers organizations to move beyond basic API calls to sophisticated, enterprise-grade AI applications.
Implementing and Optimizing Kong AI Gateway
Deploying and optimizing Kong AI Gateway effectively requires careful planning and a deep understanding of best practices. Its flexibility allows for various deployment models, but maximizing its potential for AI workloads involves specific configuration and operational considerations.
1. Deployment Strategies: On-Prem, Cloud, and Hybrid
Kong AI Gateway can be deployed in diverse environments to meet specific organizational needs, compliance requirements, and existing infrastructure footprints.
- On-Premises Deployment: For organizations with stringent data sovereignty requirements, existing data centers, or a desire for complete control over their infrastructure, deploying Kong AI Gateway on-premises is a viable option.
- Advantages: Maximum control over hardware, network, and data. Can be optimized for specific, high-performance local AI models. Reduces reliance on external cloud providers for critical AI pathways.
- Considerations: Requires significant operational overhead for hardware provisioning, maintenance, and scaling. Integration with cloud-based AI services might incur latency and egress costs.
- Implementation: Typically involves deploying Kong on virtual machines or bare metal servers using Docker, Kubernetes, or directly installing the software.
- Cloud-Native Deployment (AWS, Azure, GCP): This is the most common and recommended approach for modern AI initiatives, leveraging the scalability, flexibility, and managed services of cloud providers.
- Advantages: High availability, automatic scaling, reduced operational burden through managed services (e.g., Kubernetes services like EKS, AKS, GKE). Seamless integration with cloud-native AI services and data storage.
- Considerations: Potential vendor lock-in, careful management of cloud costs (compute, network, data egress), and ensuring proper security group configurations.
- Implementation: Deploy Kong as a Kubernetes Ingress Controller or directly on managed Kubernetes services. Utilize cloud load balancers, auto-scaling groups, and cloud-native monitoring tools.
- Hybrid Deployment: A common scenario where some AI models and sensitive data remain on-premises, while others leverage public cloud AI services. Kong AI Gateway can bridge these environments.
- Advantages: Balances control and security for sensitive internal AI with the scalability and advanced features of public cloud AI offerings.
- Considerations: Increased network complexity (VPNs, direct connects), consistent security policies across environments, and managing data transfer between hybrid components.
- Implementation: Kong instances can be deployed in both environments, potentially with a centralized Kong Manager for unified control, and secure network tunnels connecting the two.
Regardless of the deployment strategy, using containerization (Docker) and orchestration (Kubernetes) is highly recommended for Kong AI Gateway. This provides consistency, simplifies deployment, and enables robust scaling and resilience.
2. Configuration Best Practices for AI Workloads
Optimizing Kong AI Gateway for AI workloads goes beyond basic API gateway configuration.
- Plugin Selection and Ordering: Carefully choose and order AI-specific plugins. For example, PII redaction should occur before the request reaches the AI model. Rate limiting for cost control might come after authentication but before caching.
- Example Order: Authentication -> PII Redaction -> Prompt Transformation -> Rate Limiting -> Caching -> Intelligent Routing -> Logging (a toy pipeline illustrating this ordering follows this list).
- Prompt Management Configuration: Centralize prompt templates and context injection logic within Kong. Use environment variables or configuration files for prompt variables. Implement version control for prompts.
- Caching Strategy:
- Configure semantic caching carefully. Define thresholds for semantic similarity and cache invalidation policies.
- Use cache keys that accurately reflect the AI request's uniqueness, potentially including model version, user ID (if responses are personalized), and a hash of the cleaned prompt.
- Monitor cache hit ratios to fine-tune your strategy.
- Rate Limiting for Cost: Implement token-based rate limits if possible (some AI providers expose token counts in response headers). Otherwise, use estimated token counts or carefully calibrated request-based limits.
- Health Checks for AI Models: Configure robust health checks for backend AI services. This ensures Kong only routes requests to healthy models, preventing errors and providing high availability. Include not just network reachability but also a basic inference test.
- Security Policies:
- Enforce strong authentication (JWT, OAuth) and authorization.
- Configure PII redaction and content moderation plugins with up-to-date rules.
- Regularly review access logs for suspicious activity.
- Centralized Kong Manager: For complex deployments with multiple Kong nodes and many AI services, use Kong Manager (or a similar control plane) to centralize configuration, monitor, and manage the entire gateway infrastructure.
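The toy pipeline referenced in the plugin-ordering item above shows stages running in a fixed, centrally managed order, so PII is scrubbed before any stage that might log or cache the prompt. The stage functions are stubs, and in Kong the equivalent ordering comes from plugin configuration, not application code:

```python
# Toy pipeline illustrating why plugin ordering matters: each stage runs
# in a fixed, centrally configured sequence. Stage bodies are stubs.
def authenticate(req):   req["client"] = "app-42"; return req
def redact_pii(req):     req["prompt"] = req["prompt"].replace("123-45-6789", "[SSN]"); return req
def apply_template(req): req["prompt"] = f"Answer briefly: {req['prompt']}"; return req
def rate_limit(req):     return req   # would reject over-budget consumers
def cache_lookup(req):   return req   # would short-circuit on a semantic hit

PIPELINE = [authenticate, redact_pii, apply_template, rate_limit, cache_lookup]

def process(req: dict) -> dict:
    for stage in PIPELINE:            # PII is scrubbed before any stage
        req = stage(req)              # that could log or cache the prompt
    return req

print(process({"prompt": "Is SSN 123-45-6789 valid?"}))
```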
3. Monitoring and Troubleshooting AI Gateway Performance
Effective observability is paramount for maintaining a healthy and cost-effective AI Gateway.
- Key Metrics to Monitor:
- Request Latency: Overall API latency, and specifically, latency introduced by AI inference calls and gateway processing. Break down latency by plugin.
- Error Rates: HTTP error codes (e.g., 4xx, 5xx) from both Kong and backend AI services. Monitor specific AI model errors.
- Throughput: Requests per second, token usage per second (for LLMs).
- Cache Hit Ratio: Percentage of AI requests served from cache, indicating the effectiveness of your caching strategy.
- Resource Utilization: CPU, memory, and network usage of Kong nodes.
- AI-Specific Costs: Track actual or estimated costs per token/request/model.
- Logging:
- Ensure detailed logging is enabled but also consider data privacy and storage costs. Redact sensitive information from logs.
- Integrate Kong logs with a centralized logging solution (ELK stack, Splunk, Datadog) for easy search and analysis.
- Alerting: Set up alerts for critical metrics, such as high error rates, sudden drops in cache hit ratio, increased latency, or unusual spikes in token usage.
- Distributed Tracing: Implement distributed tracing (e.g., OpenTelemetry, Jaeger) to trace a single AI request through Kong and into the backend AI model, identifying bottlenecks and performance issues across the entire distributed system.
- Synthetic Monitoring: Deploy synthetic transactions that regularly test your AI services through Kong, simulating real user behavior to detect issues before they impact actual users.
4. Scalability Considerations for AI Workloads
AI workloads can be bursty and demand high throughput. Kong AI Gateway needs to be designed for scalability.
- Horizontal Scaling of Kong Nodes: Deploy multiple Kong gateway instances behind a load balancer. Kong is stateless, making horizontal scaling straightforward.
- Kubernetes for Auto-scaling: Leverage Kubernetes Horizontal Pod Autoscalers (HPAs) to automatically scale Kong pods based on CPU utilization, request queue length, or custom metrics (e.g., token usage rate).
- Database Scalability: Kong relies on a database (PostgreSQL or Cassandra). Ensure your database is also scaled appropriately to handle the increased load from Kong nodes. Consider managed database services in the cloud.
- Network Capacity: Ensure adequate network bandwidth between Kong and your AI models, especially if models are hosted externally or in different data centers.
- Backend AI Model Scaling: While Kong manages access, the backend AI models themselves must also be scalable. Ensure your AI providers or internal model deployments can handle the traffic routed by Kong.
- Resource Allocation: Provide sufficient CPU and memory resources to Kong nodes, especially if performing complex prompt transformations or extensive data masking, as these operations can be CPU-intensive.
By meticulously planning deployment, configuring for AI specifics, continuously monitoring performance, and designing for scalability, organizations can unlock the full potential of Kong AI Gateway, establishing a robust, efficient, and cost-effective foundation for their AI initiatives.
The Future of AI Gateways and API Management
The landscape of Artificial Intelligence is evolving at an exhilarating pace, constantly introducing new paradigms, models, and application patterns. As AI becomes more pervasive, the role of an AI Gateway will not only remain critical but will also expand and deepen, becoming an even more intelligent and autonomous orchestrator of AI experiences. The future of AI Gateways and API Management will be characterized by greater intelligence, proactive capabilities, and a seamless blend of human oversight with AI-driven automation.
Emerging Trends in AI and Their Impact on Gateways
- AI-Native APIs and Autonomous Agents:
- Shift from APIs to Capabilities: Instead of just exposing an API endpoint for an LLM, future AI Gateways will manage access to higher-level capabilities or agents that can autonomously chain together multiple AI models, tools, and data sources to fulfill complex requests. For instance, an agent might receive a user request, decide which model to use for intent classification, then call another for data retrieval, and finally a third for response generation.
- Dynamic API Generation: Gateways might dynamically generate APIs or endpoints based on newly available AI models or newly defined agent capabilities, reducing manual configuration.
- Goal-Oriented Interaction: Future interactions will be more goal-oriented, with users specifying objectives and the AI Gateway (or agents behind it) intelligently orchestrating the necessary AI calls to achieve those goals.
- Ethical AI and Governance Automation:
- Enhanced Bias Detection and Mitigation: AI Gateways will integrate more sophisticated real-time bias detection and mitigation techniques, analyzing both prompts and responses for fairness, transparency, and potential harmful outputs.
- Automated Compliance: As AI regulations (e.g., EU AI Act) solidify, gateways will automate compliance checks, ensuring data privacy, consent management, and explainability (XAI) for AI interactions. This could involve logging specific model provenance, version, and training data characteristics.
- Explainable AI (XAI) Integration: Gateways might help capture and expose the reasoning paths of complex AI models or agents, providing insights into why an AI generated a particular response, which is critical for trust and auditing.
- Hyper-Personalization and Contextual Intelligence:
- Richer Context Injection: AI Gateways will become even more adept at dynamically enriching prompts with hyper-contextual data drawn from diverse internal and external sources (user profiles, real-time sensor data, historical interactions, environmental factors).
- Adaptive Learning: The gateway itself might learn from past interactions, adapting its routing decisions, caching strategies, and prompt transformations to deliver increasingly personalized and efficient AI experiences.
- Edge AI and Hybrid Architectures:
- Federated Learning Integration: Gateways will facilitate secure communication and model updates for federated learning scenarios, where AI models are trained on distributed data at the edge without centralizing sensitive information.
- Optimized Edge Inference: For low-latency applications, AI Gateways deployed at the edge will become crucial for routing requests to local, smaller models for initial processing, and only forwarding complex queries to powerful cloud LLMs when necessary.
The Evolving Role of Gateways in this Landscape
The AI Gateway of the future will move beyond being a mere proxy to become an intelligent, proactive, and adaptive component of the AI stack.
- Intelligent Orchestration Hub: It will serve as the central brain for AI interactions, deciding which models to use, how to chain them, what data to inject, and how to optimize for cost, performance, and ethical considerations, all in real time.
- AI Policy Enforcement Point: All AI governance, security, and compliance policies will be enforced at the gateway, acting as the ultimate gatekeeper for responsible AI deployment.
- Proactive Optimization Engine: Utilizing machine learning, the gateway will proactively identify performance bottlenecks, cost inefficiencies, and security threats, automatically adjusting its configuration to optimize the AI workload.
- Developer Empowerment Platform: It will continue to simplify AI consumption for developers, abstracting away increasing layers of complexity and allowing them to focus on building innovative applications.
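To ground the orchestration-hub idea, the sketch below walks through the chain described earlier: intent classification, then retrieval, then generation. Every function is a hypothetical placeholder for a model or tool call; none of this is a Kong API.

```python
def classify_intent(prompt: str) -> str:
    """Placeholder: in practice this would call a small, cheap classification model."""
    return "faq" if "how do i" in prompt.lower() else "general"

def retrieve_context(prompt: str) -> str:
    """Placeholder: in practice this would query a vector store or internal API."""
    return "relevant internal documentation snippets"

def generate_answer(prompt: str, context: str) -> str:
    """Placeholder: in practice this would call a large generation model."""
    return f"Answer to '{prompt}' grounded in: {context}"

def orchestrate(prompt: str) -> str:
    """One user request fans out into a chain of model/tool calls chosen at runtime."""
    intent = classify_intent(prompt)
    context = retrieve_context(prompt) if intent == "faq" else ""
    return generate_answer(prompt, context)

print(orchestrate("How do I rotate my API key?"))
```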
In this dynamic ecosystem, commercial offerings like Kong AI Gateway will continue to lead with cutting-edge features and enterprise-grade support, while the open-source community plays a crucial role in democratizing access to these capabilities. For instance, APIPark is an open-source AI gateway and API management platform designed to help developers and enterprises manage, integrate, and deploy AI and REST services with ease. It enables quick integration of more than 100 AI models, unifies the API format for AI invocation, and lets users encapsulate custom prompts as REST APIs. Its feature set, from end-to-end API lifecycle management and team-based API sharing to approval-based access control and high throughput (over 20,000 TPS on modest resources), underscores a broad industry commitment to simplifying AI integration. Such platforms highlight the collaborative effort to build more efficient, secure, and developer-friendly AI infrastructure.
Conclusion
The journey into the AI-first world is not merely about adopting powerful models; it's about building the intelligent infrastructure that can harness their potential responsibly, securely, and efficiently. Kong AI Gateway, with its robust architecture, comprehensive feature set, and deep extensibility, stands as an indispensable tool in this endeavor. It transforms the daunting task of integrating, managing, and optimizing diverse AI models into a streamlined, cost-effective process.
By acting as the intelligent intermediary, Kong AI Gateway empowers organizations to:
- Accelerate Innovation: Developers can focus on building innovative applications while the gateway abstracts away the complexities of the underlying AI models.
- Fortify Security: Critical security layers protect sensitive data, prevent attacks, and ensure compliance with evolving regulations.
- Optimize Performance and Cost: Intelligent routing, caching, and granular cost tracking ensure AI resources are utilized efficiently and economically.
- Ensure Scalability and Reliability: A high-performance, resilient architecture guarantees AI services are always available and can handle demand fluctuations.
- Future-Proof Investments: The flexible, model-agnostic approach allows organizations to adapt quickly to new AI advancements without extensive re-architecting.
As Artificial Intelligence continues its relentless march, becoming even more integral to business operations and consumer experiences, the role of a sophisticated AI Gateway will only grow in importance. Kong AI Gateway is not just a technology; it is a strategic imperative for any enterprise serious about unlocking the transformative power of AI and navigating the complexities of the intelligent future. It is the key to building smarter, more resilient, and more innovative applications that will define the next generation of digital experiences.
Frequently Asked Questions (FAQs)
1. What is an AI Gateway and how does it differ from a traditional API Gateway? An AI Gateway is an evolved form of a traditional API Gateway, specifically designed to manage, secure, and optimize access to Artificial Intelligence models, especially Large Language Models (LLMs). While a traditional API Gateway handles general backend service routing, authentication, and rate limiting, an AI Gateway adds specialized features like intelligent routing based on AI model types, prompt engineering, semantic caching for AI responses, PII redaction, token-based cost management, and AI-specific observability (e.g., token usage, inference latency). It provides a unified interface to diverse AI models, abstracting away their complexities and unique API formats.
2. Why is an AI Gateway like Kong essential for integrating LLMs into enterprise applications? LLMs present unique challenges in enterprise integration: high operational costs (per token), potential data privacy risks with sensitive prompts, varying APIs from different providers, and the need for sophisticated prompt engineering. Kong AI Gateway addresses these by offering features like cost-aware routing (to cheaper models), PII redaction before prompts leave the system, unified API abstraction for multiple LLM providers, and centralized prompt management. This ensures secure, cost-effective, and scalable access to LLMs, accelerating development while mitigating risks.
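As a concrete, hedged illustration of that unified abstraction, the sketch below drives Kong's Admin API (assumed here to listen on localhost:8001) to attach the ai-proxy plugin to a route. Plugin schemas vary across Kong versions, so treat the exact field names as indicative and verify them against your release's documentation.

```python
import requests

ADMIN = "http://localhost:8001"  # assumed Kong Admin API address

# 1. A service and route that clients will call. The upstream URL is a
#    placeholder, because ai-proxy substitutes the provider's real endpoint.
requests.post(f"{ADMIN}/services",
              json={"name": "llm", "url": "http://localhost:32000"}).raise_for_status()
requests.post(f"{ADMIN}/services/llm/routes",
              json={"name": "chat", "paths": ["/chat"]}).raise_for_status()

# 2. Attach ai-proxy so Kong translates /chat into OpenAI chat completions.
requests.post(f"{ADMIN}/services/llm/plugins", json={
    "name": "ai-proxy",
    "config": {
        "route_type": "llm/v1/chat",
        "auth": {"header_name": "Authorization",
                 "header_value": "Bearer YOUR_OPENAI_KEY"},  # stored in Kong, not clients
        "model": {"provider": "openai", "name": "gpt-4o"},
    },
}).raise_for_status()
```

Clients then POST OpenAI-style chat requests to /chat on the gateway; the provider credential lives in Kong's configuration rather than in application code.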
3. How does Kong AI Gateway help with cost optimization for AI models, especially LLMs? Kong AI Gateway offers several mechanisms for cost optimization. It can implement token-based rate limiting to directly control LLM consumption, preventing runaway costs. Its intelligent routing can direct requests to the most cost-effective AI model for a given task, dynamically choosing between premium and more economical options based on requirements. Crucially, Kong's semantic caching feature can store and serve responses to semantically similar prompts, drastically reducing the number of expensive inference calls to LLMs and significantly lowering operational expenses. Detailed logging and analytics also provide visibility into cost drivers, enabling informed budget management.
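Semantic caching is easiest to understand from a toy implementation. The sketch below shows the general technique rather than Kong's code: embed() is a stand-in for a real sentence-embedding model, and a new prompt is served from cache when its embedding is close enough, by cosine similarity, to a previously answered one.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; swap in a real sentence-embedding model so that
    semantically similar prompts actually land near each other."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

cache: list[tuple[np.ndarray, str]] = []  # (prompt embedding, cached response)

def semantic_lookup(prompt: str, threshold: float = 0.92) -> str | None:
    """Return a cached response if some earlier prompt is semantically close."""
    q = embed(prompt)
    for vec, response in cache:
        if float(np.dot(q, vec)) >= threshold:  # cosine similarity of unit vectors
            return response
    return None

def answer(prompt: str) -> str:
    if (hit := semantic_lookup(prompt)) is not None:
        return hit                              # cache hit: no inference call, no token cost
    response = f"LLM answer for: {prompt}"      # placeholder for the real LLM call
    cache.append((embed(prompt), response))
    return response
```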
4. What security features does Kong AI Gateway provide for protecting AI workloads and sensitive data? Kong AI Gateway provides multi-layered security for AI workloads. It offers robust authentication (API keys, OAuth, JWT) and fine-grained authorization to control access to AI models. For sensitive data, it can automatically perform PII (Personally Identifiable Information) redaction or masking in prompts before they are sent to external AI models, ensuring data privacy compliance. It also includes features for content moderation, detection of prompt injection attacks, and comprehensive audit logging of all AI interactions, providing a secure perimeter for your AI ecosystem.
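To illustrate the redaction step in isolation, here is a minimal regex-based sketch of PII masking applied to a prompt before it crosses the trust boundary. The patterns are deliberately simplistic; production redaction combines much broader pattern sets with NER-style detectors.

```python
import re

# Illustrative patterns only; real detectors cover far more formats and locales.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Mask PII in a prompt before it is forwarded to an external model."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label}]", prompt)
    return prompt

print(redact("Contact jane.doe@example.com or 555-123-4567 about SSN 123-45-6789."))
# -> Contact [REDACTED_EMAIL] or [REDACTED_PHONE] about SSN [REDACTED_SSN].
```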
5. Can Kong AI Gateway manage both proprietary internal AI models and third-party cloud AI services simultaneously? Yes, absolutely. One of Kong AI Gateway's key strengths is its model-agnostic approach. It provides a unified API layer that can expose and manage any AI model, whether it's a proprietary model developed and hosted internally (e.g., on-premises or in your private cloud) or a third-party service from providers like OpenAI, Google, or Hugging Face. This flexibility allows organizations to seamlessly integrate diverse AI capabilities into a single, cohesive infrastructure, reducing vendor lock-in and simplifying the developer experience. Kong can intelligently route requests to the appropriate backend AI service based on configuration, ensuring optimal utilization and consistent access.
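Continuing the earlier Admin API sketch, exposing a self-hosted model alongside the OpenAI-backed route is just another service and route; the internal URL below is a placeholder for whatever OpenAI-compatible runtime you host.

```python
import requests

ADMIN = "http://localhost:8001"  # assumed Kong Admin API address

# Internal model server (e.g., a self-hosted OpenAI-compatible runtime); placeholder URL.
requests.post(f"{ADMIN}/services", json={
    "name": "internal-llm",
    "url": "http://models.internal:8000/v1/chat/completions",
}).raise_for_status()
requests.post(f"{ADMIN}/services/internal-llm/routes", json={
    "name": "chat-internal", "paths": ["/chat/internal"],
}).raise_for_status()
# Clients now choose /chat (OpenAI via ai-proxy) or /chat/internal (self-hosted),
# both behind the same gateway, auth, and observability policies.
```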
You can securely and efficiently call the OpenAI API on APIPark in just two steps:
Step 1: Deploy the APIPark AI gateway in 5 minutes.
APIPark is built with Go (Golang), which gives it strong performance with low development and maintenance costs. You can deploy APIPark with a single command:
curl -sSO https://download.apipark.com/install/quick-start.sh; bash quick-start.sh

In practice, the deployment success screen appears within 5 to 10 minutes, after which you can log in to APIPark with your account.

Step 2: Call the OpenAI API.
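Once the gateway exposes an OpenAI-compatible endpoint, the client call typically looks like the hedged sketch below; the host, path, model name, and key are placeholders you would replace with the values shown in your APIPark console.

```python
import requests

GATEWAY_URL = "http://your-apipark-host:8080/v1/chat/completions"  # placeholder endpoint
GATEWAY_KEY = "YOUR_GATEWAY_API_KEY"  # issued by the gateway, not by OpenAI

resp = requests.post(
    GATEWAY_URL,
    headers={"Authorization": f"Bearer {GATEWAY_KEY}"},
    json={
        "model": "gpt-4o",  # the gateway maps this to its configured OpenAI credentials
        "messages": [{"role": "user", "content": "Hello from behind the gateway!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the gateway holds the provider key, rotating credentials, enforcing quotas, and auditing usage all happen centrally, with no change to application code.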

