SambaStack Monitoring and Observability

SambaStack emits two primary telemetry surfaces to help you observe and operate your deployments:

Metrics (Prometheus) – Router-level metrics such as traffic, latency, queueing, and worker state. See Metrics.
Logs (Logging Events / Manifest Events) – Detailed per-request execution events from the model runtime and other services. See Logs.

Telemetry data generally falls into three types: metrics (numeric time series for aggregation and alerting), logs (discrete events for debugging and forensics), and traces (request path records for latency analysis and root cause investigation). Of these, metrics and logs are the two primary surfaces that SambaStack emits.

Monitoring stack

SambaStack emits these two telemetry surfaces so that you can observe and troubleshoot AI inference workloads running on SambaNova racks. With them you can collect, store, and visualize:

System and application logs (control plane and data plane)
Audit events and access traces
Usage metrics (QPS, latency, queue time, memory utilization)
User activity (active users, sessions)
Health and availability signals (node status, pod status, model health)

Many third-party tools and services can be used to build a monitoring and observability stack that fits your organization. The reference architecture described here is SambaNova’s suggested implementation, built on widely used open-source tools, and it is entirely optional. Many customers already have mature monitoring solutions, and the SambaStack monitoring architecture is modular, so you can:

Adopt the full stack as provided, or
Swap individual components with equivalents from your existing observability platform (Splunk, Datadog, Elasticsearch, New Relic, etc.)

Reference architectures use third-party products. There is no guarantee that the reference architectures will be kept in sync with version or command syntax changes in those products. Direct any issues not specific to SambaStack to the respective vendor.

Components

SambaStack’s reference monitoring stack uses four primary components:

Component	Tool	Description
Log Forwarder	Fluent Bit	Collects logs from Kubernetes pods, nodes, and system services. Parses, enriches, and forwards logs to OpenSearch.
Log Storage	OpenSearch	Stores logs and audit trails at scale. Provides search, filtering, and aggregation for log data.
Metrics Collection	Prometheus	Scrapes metrics from SambaStack services, Kubernetes components, and node exporters. Stores time-series data for monitoring and alerting.
Visualization	Grafana	Connects to Prometheus and OpenSearch. Provides dashboards for metrics and log exploration.

Architecture

                         VISUALIZATION
+--------------------------------------------------------------------+
|                            Grafana                                 |
|              (Dashboards, Alerts, Log Exploration)                 |
+--------------------------------------------------------------------+
                    |                           |
                    v                           v
+-------------------------------+  +-----------------------------+
|          Prometheus           |  |         OpenSearch          |
|       (Metrics Storage)       |  |        (Log Storage)        |
+-------------------------------+  +-----------------------------+
                    ^                           ^
                    |                           |
+-------------------------------+  +-----------------------------+
|        Node Exporter          |  |         Fluent Bit          |
|        (Host Metrics)         |  |      (Log Forwarding)       |
+-------------------------------+  +-----------------------------+
                    ^                           ^
                    |                           |
+--------------------------------------------------------------------+
|                          SAMBASTACK                                |
+--------------------------------------------------------------------+
|  Inference Router  |  Model Deployments  |  Kubernetes Pods/Nodes |
+--------------------------------------------------------------------+

Design principles

Modular integration — Each component exposes well-defined interfaces (Fluent Bit outputs, Prometheus remote_write, Grafana data sources). You can replace any component with an equivalent.
Kubernetes-native — All components run on and integrate with the Kubernetes cluster where SambaStack workloads are deployed.
Bring-your-own stack — Integrate with your existing log platform, metrics system, or visualization layer.
Security and compliance ready — Logging, metrics, and audit data can integrate with your existing SIEM and compliance tooling.

Component substitution

You can replace any component with an equivalent from your existing observability platform:

Component	Replaceable with	Requirements
OpenSearch	Elasticsearch, Splunk, Loki, Datadog Logs	Must accept logs via HTTP, gRPC, or Kafka
Fluent Bit	Fluentd, Vector, Logstash, Datadog Agent	Must support Kubernetes log collection
Prometheus	Managed Prometheus, Datadog, New Relic	Must scrape `/metrics` endpoints or accept `remote_write`
Grafana	Datadog dashboards, Kibana, custom tooling	Must integrate with both metrics and log sources
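
The `/metrics` requirement in the Prometheus row refers to the Prometheus text exposition format. As a rough illustration of what a scraper reads, the sketch below lists the distinct metric names in such a payload. The function name `metric_names` and the sample metrics are hypothetical; actual SambaStack metric names are not shown here and may differ.

```shell
#!/bin/sh
# List the distinct metric names found in Prometheus text exposition
# format read from stdin. Lines starting with '#' are HELP/TYPE comments.
metric_names() {
    grep -v '^#' | grep -v '^[[:space:]]*$' | sed 's/[{ ].*//' | sort -u
}

# Hypothetical sample payload, used only to demonstrate the filter.
cat <<'EOF' | metric_names
# HELP example_requests_total Total requests (hypothetical metric).
# TYPE example_requests_total counter
example_requests_total{model="a"} 12
example_requests_total{model="b"} 7
example_queue_depth 3
EOF
```

Against a live endpoint, the same filter could be fed from `curl -s` pointed at the service's `/metrics` URL. A replacement metrics backend needs to be able to read this format or accept the same data through `remote_write`.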

Prerequisites

Before deploying the monitoring stack, ensure you have:

Requirement	Description
Kubernetes cluster	A running cluster with SambaStack installed
kubectl	Configured with access to your target Kubernetes cluster
Helm (latest version)	For deploying all Helm charts
jq	For parsing JSON output during verification
Storage class	A valid storage class for persistent volumes (OpenSearch and Prometheus)
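
A quick way to confirm the command-line prerequisites above is to check that each tool is on your `PATH`. The following is a minimal sketch, not an official SambaStack check; the helper name `missing_tools` is hypothetical.

```shell
#!/bin/sh
# Print the names of any given tools that are not found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Trim the leading space; prints an empty line if nothing is missing.
    echo "${missing# }"
}

result=$(missing_tools kubectl helm jq)
if [ -n "$result" ]; then
    echo "Missing tools: $result"
else
    echo "All prerequisite tools found"
fi
```

Once the tools are present, `kubectl get storageclass` lists the storage classes available in the cluster, which covers the last row of the table.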

Directory structure

Create this directory structure before starting:

mkdir -p ~/.sambastack-observability/{opensearch,fluentbit,monitoring}

After completing all deployments, your directory should contain:

~/.sambastack-observability/
|-- monitoring-namespace.yaml
|-- opensearch/
|   |-- opensearch-initial-admin-password-secret.yaml
|   +-- opensearch-values.yaml
|-- fluentbit/
|   |-- append_tags.lua
|   |-- fluentbit-conf.conf
|   +-- fluentbit-values.yml
+-- monitoring/
    |-- grafana-initial-admin-credentials-secret.yaml
    |-- inference-router-sm.yaml
    +-- prometheus-grafana-values.yaml
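
To see which of the files listed above are still missing at any point during the deployment, a small check like the following may help. This is a sketch only; the function name `missing_files` is hypothetical, and the file list is copied from the tree above.

```shell
#!/bin/sh
# Print every expected file that is absent under the given base directory.
missing_files() {
    base="$1"
    for f in \
        monitoring-namespace.yaml \
        opensearch/opensearch-initial-admin-password-secret.yaml \
        opensearch/opensearch-values.yaml \
        fluentbit/append_tags.lua \
        fluentbit/fluentbit-conf.conf \
        fluentbit/fluentbit-values.yml \
        monitoring/grafana-initial-admin-credentials-secret.yaml \
        monitoring/inference-router-sm.yaml \
        monitoring/prometheus-grafana-values.yaml
    do
        [ -f "$base/$f" ] || echo "$f"
    done
}

missing_files "$HOME/.sambastack-observability"
```

An empty result means every expected file is in place. Immediately after creating the directories, all nine files are reported as missing.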

Deployment order

Deploy components in this order:

Step	Component	Guide
1	OpenSearch	Log Storage
2	Fluent Bit	Log Forwarding
3	Prometheus and Grafana	Monitoring

Resource requirements

Component	CPU	Memory	Storage
OpenSearch	2-4 cores	4-8 GB	50-100 GB
Fluent Bit (per node)	100m	128Mi	—
Prometheus	500m	2Gi	30Gi
Prometheus Operator	200m	256Mi	-
Grafana	250m	512Mi	—
Node Exporter (per node)	100m	128Mi	—
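
Because Fluent Bit and Node Exporter run on every node, the total footprint grows with cluster size. The sketch below adds up the low end of the table for a given node count. It rests on several assumptions that the table does not state: a single OpenSearch instance, 1 GB treated as 1024 Mi, and a hypothetical three-node cluster.

```shell
#!/bin/sh
# Fixed (cluster-wide) CPU in millicores, low end of the table:
# OpenSearch 2000 + Prometheus 500 + Prometheus Operator 200 + Grafana 250.
FIXED_CPU_M=2950
# Per-node CPU in millicores: Fluent Bit 100 + Node Exporter 100.
PER_NODE_CPU_M=200
# Fixed memory in Mi, low end: OpenSearch 4096 (assumes 4 GB = 4096 Mi)
# + Prometheus 2048 + Prometheus Operator 256 + Grafana 512.
FIXED_MEM_MI=6912
# Per-node memory in Mi: Fluent Bit 128 + Node Exporter 128.
PER_NODE_MEM_MI=256

# Each function takes the node count as its only argument.
total_cpu_m()  { echo $(( FIXED_CPU_M + PER_NODE_CPU_M * $1 )); }
total_mem_mi() { echo $(( FIXED_MEM_MI + PER_NODE_MEM_MI * $1 )); }

nodes=3   # hypothetical node count
echo "CPU: $(total_cpu_m "$nodes")m, memory: $(total_mem_mi "$nodes")Mi"
```

For three nodes this prints `CPU: 3550m, memory: 7680Mi`. Persistent storage (50-100 GB for OpenSearch, 30Gi for Prometheus) does not depend on node count and is left out of the calculation.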
