Eyes on every model in production

AI Observability & Monitoring

32 companies tracked by our intelligence team

Market Overview

AI Observability & Monitoring is the operational nervous system of enterprise AI. With 32 tracked companies, this category provides the tooling organizations need to evaluate, monitor, and maintain AI systems once they move from development into production — addressing everything from model drift and hallucination detection to cost optimization and performance benchmarking.

The category has split into two primary segments. ML observability platforms (Arize AI, Arthur AI, Aporia) focus on traditional machine learning models — monitoring for data drift, feature importance changes, and prediction quality degradation. These platforms have matured significantly and are now standard infrastructure for organizations running ML in production. The second and faster-growing segment comprises LLM and GenAI evaluation platforms (Patronus AI, Braintrust, Galileo) that address the unique challenges of monitoring large language models: hallucination rates, response quality, prompt effectiveness, and safety compliance.
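
The data-drift monitoring these ML observability platforms perform often reduces to comparing a live feature distribution against a training-time baseline. A minimal sketch using the Population Stability Index (PSI), a common drift statistic — the function and variable names here are illustrative, not any vendor's API:

```python
import math
import random
from collections import Counter

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a live one.
    Values above ~0.2 are conventionally read as significant drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bucket_shares(values):
        # Clamp out-of-range live values into the edge buckets.
        counts = Counter(
            min(bins - 1, max(0, int((v - lo) / width))) for v in values
        )
        # A tiny floor keeps log() defined for empty buckets.
        return [(counts.get(i, 0) / len(values)) or 1e-6 for i in range(bins)]

    p, q = bucket_shares(expected), bucket_shares(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

random.seed(0)
baseline = [random.gauss(0.0, 1.0) for _ in range(1000)]
live = [random.gauss(1.5, 1.0) for _ in range(1000)]  # simulated mean shift
```

PSI is zero for identical distributions and grows with the shift; a production monitor would compute this per feature on a schedule and alert when it crosses a threshold.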

The managed detection and response (MDR) segment, represented by companies like Arctic Wolf, brings AI-powered analysis to security monitoring, using machine learning to surface genuine threats from the noise of security telemetry. Meanwhile, developer-focused platforms like AgentOps and Langfuse are building the monitoring infrastructure specifically for AI agent deployments — tracking agent sessions, tool usage, cost, and behavioral patterns.
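
The session records that agent-monitoring tools capture typically bundle each tool call with its latency and spend, then roll those up per run. A hypothetical sketch of such a record — the class and field names are assumptions for illustration, not AgentOps' or Langfuse's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    duration_s: float
    cost_usd: float

@dataclass
class AgentSession:
    """One agent run: an ordered log of tool calls with latency and spend."""
    session_id: str
    calls: list = field(default_factory=list)

    def record(self, name, duration_s, cost_usd):
        self.calls.append(ToolCall(name, duration_s, cost_usd))

    def summary(self):
        # Per-session rollup of the kind a monitoring dashboard would display.
        return {
            "session_id": self.session_id,
            "tool_calls": len(self.calls),
            "total_cost_usd": round(sum(c.cost_usd for c in self.calls), 6),
            "total_latency_s": round(sum(c.duration_s for c in self.calls), 3),
        }

session = AgentSession("demo-001")
session.record("web_search", 1.2, 0.002)
session.record("llm_completion", 3.4, 0.015)
```

In practice these platforms also capture inputs, outputs, and nested spans per call, which is what enables session replay and behavioral analysis.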

As enterprise AI deployments scale from dozens to thousands of models and agents, observability becomes non-negotiable. The ability to detect when an AI system begins behaving unexpectedly — before it causes a security incident, compliance violation, or customer-facing error — is what separates production-ready AI programs from experimental ones. We expect significant platform consolidation in this category as the major cloud and security vendors build or acquire monitoring capabilities.

All 32 AI Observability & Monitoring Companies

AgentOps
Developer platform for monitoring, testing, and debugging AI agents with session replays and cost tracking.
📍 San Francisco, CA Est. 2024
Aporia
ML observability and guardrails platform for monitoring AI models in production with real-time alerting.
📍 Tel Aviv, Israel Est. 2019
Arctic Wolf
Security operations platform delivering managed detection and response with AI-driven threat analysis.
📍 Eden Prairie, MN Est. 2012
Arize AI
Unified AI observability platform for evaluating, monitoring, and troubleshooting AI applications and agents.
📍 Berkeley, CA Est. 2020
Arthur AI
AI performance platform for monitoring, evaluating, and improving ML, GenAI, and agentic AI models at scale.
📍 New York, NY Est. 2019
Braintrust
Enterprise AI evaluation and observability platform for testing, scoring, and monitoring LLM applications.
📍 San Francisco, CA Est. 2023
Censius
AI observability platform providing monitoring, explainability, and compliance tools for ML models in production.
📍 San Francisco, CA Est. 2021
Confident AI
LLM evaluation and testing platform providing automated benchmarking, red teaming, and quality metrics for AI apps.
📍 San Francisco, CA Est. 2023
Darktrace
AI-powered cyber defense platform using self-learning AI for autonomous threat detection and response.
📍 Cambridge, UK Est. 2013
Datadog
Cloud monitoring platform with LLM observability, AI agent tracing, and ML model monitoring capabilities.
📍 New York, NY Est. 2010
Deepchecks
Open-source AI validation and testing platform for continuous evaluation of ML models and LLM applications.
📍 Tel Aviv, Israel Est. 2021
Dynatrace
Software intelligence platform with AI-powered observability including Davis AI for automated root cause analysis.
📍 Waltham, MA Est. 2005
Fiddler AI
Enterprise AI observability and security platform providing monitoring, guardrails, and control plane for AI agents.
📍 Palo Alto, CA Est. 2018
Galileo
AI evaluation and observability platform for detecting hallucinations, toxicity, and quality issues in LLM applications.
📍 San Francisco, CA Est. 2021
Helicone
Open-source LLM observability platform for logging, monitoring, and optimizing AI application performance.
📍 San Francisco, CA Est. 2023
Humanloop
Platform for evaluating, monitoring, and managing LLM prompts and AI applications with human feedback integration.
📍 London, UK Est. 2020
Kolena
ML testing and evaluation platform for systematic model validation with scenario-based testing and regression detection.
📍 San Francisco, CA Est. 2021
Langfuse
Open-source LLM engineering platform for tracing, evaluation, prompt management, and monitoring of AI applications.
📍 Berlin, Germany Est. 2023
LastMile AI
AI evaluation and debugging platform for testing and improving LLM applications with automated test generation.
📍 San Francisco, CA Est. 2023
Lunary
Open-source AI developer platform for monitoring, evaluating, and managing LLM applications in production.
📍 San Francisco, CA Est. 2023
New Relic
Observability platform with AI monitoring capabilities for tracking model performance and LLM application health.
📍 San Francisco, CA Est. 2008
Patronus AI
AI evaluation platform specializing in detecting hallucinations, toxicity, and safety violations in LLM outputs.
📍 San Francisco, CA Est. 2023
Portkey AI
AI gateway and observability platform for routing, monitoring, and managing LLM API calls with reliability controls.
📍 San Francisco, CA Est. 2023
PromptLayer
Prompt engineering platform for logging, versioning, and evaluating LLM prompts in production.
📍 New York, NY Est. 2022
RagaAI
AI testing and quality platform providing automated evaluation, guardrails, and red teaming for AI applications.
📍 San Francisco, CA Est. 2023
Scale AI
AI data platform providing model evaluation, safety research via SEAL lab, and red teaming for AI systems.
📍 San Francisco, CA Est. 2016
Traceloop
Open-source LLM observability platform built on OpenTelemetry for tracing and monitoring AI application performance.
📍 Tel Aviv, Israel Est. 2023
Truera
AI quality management platform providing explainability, monitoring, and testing for trustworthy AI deployments.
📍 Redwood City, CA Est. 2019
Vectra AI
AI-driven threat detection and response platform using behavioral analytics for network, cloud, and identity security.
📍 San Jose, CA Est. 2012
Vellum AI
LLM development platform for prompt engineering, evaluation, and monitoring with version control and collaboration.
📍 New York, NY Est. 2022
Weights & Biases
ML experiment tracking and model management platform with governance, evaluation, and collaboration tools.
📍 San Francisco, CA Est. 2017
WhyLabs
AI observability and security platform for monitoring models and data. Acquired by Apple in early 2025; its SaaS offering has since been shut down.
📍 Seattle, WA Est. 2019

Explore the Full Database

206 companies across 10 categories — search, filter, and analyze the AI security landscape.
