NNO Documentation

Date: 2026-03-30
Status: Partially Implemented (Phase 1). Structured logging is deployed; Logpush, CFAE, and dashboards are Phase 2.
Parent: System Architecture
Scope: All NNO core services + every provisioned platform Worker

Phase 1 implemented: packages/logger is live with the Logger class, the requestLogger Hono middleware, and x-trace-id propagation. The gateway also has requestIdMiddleware (which generates or propagates x-request-id) and tracingMiddleware (distributed tracing context via @neutrino-io/core/tracing) as deployed observability primitives. Phase 1 is deployed in six NNO core services: IAM, Registry, Billing, Provisioning, CLI Service, and Gateway.

Phase 2 not yet implemented: Cloudflare Logpush configuration, Analytics Engine (CFAE) bindings, SLOs, alerting, and observability dashboards are designed below and planned for Phase 2.

Overview

Neutrino's observability stack is built on three Cloudflare-native pillars:

Pillar	Technology	Purpose
Logs	Structured `console.log` → Cloudflare Logpush → R2 / external SIEM	Audit trail, debugging, security forensics
Metrics	Cloudflare Analytics Engine (CFAE)	Real-time operational signals, usage metering, SLO tracking
Traces	`x-trace-id` propagation + CFAE span events	Cross-service request correlation

Each pillar serves two audiences:

NNO operators — Visibility across all platforms, all services, all tenants. Used for platform health monitoring, incident response, and capacity planning.
Platform admins — Scoped visibility into their platform only. Accessible via the NNO Portal observability dashboard. No access to other platforms' data or NNO internal service internals.

1. Structured Logging [Phase 1]

1.1 Log Format

Every NNO service and every provisioned platform Worker emits logs as newline-delimited JSON (NDJSON). Cloudflare Workers capture console.log() output and include it in Logpush streams.

// packages/logger/src/logger.ts

export type LogLevel = "info" | "warn" | "error" | "debug";

export interface LogEntry {
  timestamp: string; // ISO 8601
  level: LogLevel;
  service: string; // e.g. 'registry' | 'provisioning'
  traceId?: string; // x-trace-id propagated across services
  requestId?: string; // x-request-id per request
  message: string;
  [key: string]: unknown; // arbitrary structured data spread into entry
}

The LogEntry is intentionally flat: the Logger spreads any extra data fields directly into the entry (via ...data) rather than nesting them under a data key. This keeps log lines compact and easily queryable.
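To make the flat shape concrete, here is a minimal sketch that builds an entry with the same spread pattern and prints the resulting JSON line. The helper name `buildEntry` and the sample values (`platformId`, `durationMs`) are assumptions for illustration and are not part of packages/logger.

```typescript
// Illustrative sketch only: mirrors the `...data` spread described above.
type LogLevel = "info" | "warn" | "error" | "debug";

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  service: string;
  traceId?: string;
  requestId?: string;
  message: string;
  [key: string]: unknown;
}

// Hypothetical helper, not an export of packages/logger.
function buildEntry(
  level: LogLevel,
  service: string,
  message: string,
  data?: Record<string, unknown>,
): LogEntry {
  return {
    timestamp: new Date().toISOString(),
    level,
    service,
    message,
    ...data, // extra fields land at the top level, not under a `data` key
  };
}

const entry = buildEntry("warn", "registry", "Query timeout", {
  platformId: "k3m9p2xw7q", // sample values, chosen for illustration
  durationMs: 612,
});

// One NDJSON line with durationMs and platformId next to level/service
console.log(JSON.stringify(entry));
```

One consequence of spreading `data` last is that a caller-supplied key such as `level` or `message` would overwrite the core field of the same name, so callers may want to avoid those keys.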

1.2 Logger Implementation

// packages/logger/src/logger.ts

export class Logger {
  constructor(
    private readonly service: string,
    private readonly traceId?: string,
    private readonly requestId?: string,
  ) {}

  private emit(
    level: LogLevel,
    message: string,
    data?: Record<string, unknown>,
  ): void {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      service: this.service,
      ...(this.traceId !== undefined && { traceId: this.traceId }),
      ...(this.requestId !== undefined && { requestId: this.requestId }),
      message,
      ...data, // extra fields are spread into the top-level entry
    };

    const output = JSON.stringify(entry);

    if (level === "error") {
      console.error(output);
    } else {
      console.log(output);
    }
  }

  info(message: string, data?: Record<string, unknown>): void {
    this.emit("info", message, data);
  }
  warn(message: string, data?: Record<string, unknown>): void {
    this.emit("warn", message, data);
  }
  error(message: string, data?: Record<string, unknown>): void {
    this.emit("error", message, data);
  }
  debug(message: string, data?: Record<string, unknown>): void {
    this.emit("debug", message, data);
  }
}

/** Convenience factory — creates a Logger without trace/request IDs */
export function createLogger(service: string): Logger {
  return new Logger(service);
}

Key differences from the previous documentation:

Constructor takes (service: string, traceId?: string, requestId?: string) — not a context object with version, platformId, etc.
Error-level logs use console.error(); all others use console.log().
Extra data fields are spread directly into the top-level log entry (not nested under data).
traceId and requestId are only included in the entry when defined (conditional spread).
No child(), request(), or fatal() methods exist.

1.3 Hono Request Logging Middleware

All NNO service Workers and platform Workers use a shared middleware that logs every request:

// packages/logger/src/middleware.ts

import type { MiddlewareHandler } from "hono";
import { Logger } from "./logger.js";
import { initTrace } from "./trace.js";

export function requestLogger(service: string): MiddlewareHandler {
  return async (c, next) => {
    const start = Date.now();
    const { traceId, requestId } = initTrace(c.req.raw);

    // Set on Hono context — available to all downstream handlers
    c.set("traceId", traceId);
    c.set("requestId", requestId);

    const logger = new Logger(service, traceId, requestId);
    c.set("logger", logger);

    logger.info("→ request", { method: c.req.method, path: c.req.path });

    await next();

    logger.info("← response", {
      method: c.req.method,
      path: c.req.path,
      status: c.res.status,
      duration: Date.now() - start,
    });
  };
}

Key differences from the previous documentation:

Takes service: string (not a Logger instance) and creates the Logger internally after extracting trace context.
Calls initTrace(c.req.raw) to extract/generate both traceId and requestId.
Sets three values on the Hono context: traceId, requestId, and logger — downstream handlers access the logger via c.get("logger").
Emits both a request log (→ request) and a response log (← response) with duration.

1.4 Metrics Recording

packages/logger/src/metrics.ts provides a lightweight helper for recording metric data points. In production it writes to the Cloudflare Analytics Engine dataset nno_metrics; in development (or when the binding is unavailable) it falls back to structured console.log output.

// packages/logger/src/metrics.ts

export interface MetricLabels {
  [key: string]: string;
}

export function recordMetric(
  name: string, // e.g. 'gateway.request.count'
  value: number, // count, latency ms, bytes, etc.
  labels: MetricLabels, // key/value dimensions for filtering
  env?: { NNO_METRICS?: AnalyticsDataset }, // CF Workers env binding
): void;

Behaviour:

Production (env.NNO_METRICS present): calls env.NNO_METRICS.writeDataPoint() with the metric name and label values as blobs, the numeric value as doubles, and the metric name as indexes. Wrapped in a try/catch so metric failures never throw.
Development (no binding): emits { metric, value, labels, timestamp } via console.log (not via Logger, to avoid circular dependency).

Note: recordMetric is not currently re-exported from the barrel index.ts. Import it directly: import { recordMetric } from '@neutrino-io/logger/metrics'.
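The behaviour described above can be sketched as follows. This is not the source of packages/logger/src/metrics.ts; it is a minimal reconstruction from the description. The `MetricsDataset` interface is a stand-in for the real binding type, and the ISO timestamp and JSON-stringified fallback line are assumptions.

```typescript
// Stand-in for the Analytics Engine binding type (assumption).
interface MetricsDataset {
  writeDataPoint(point: {
    blobs?: string[];
    doubles?: number[];
    indexes?: string[];
  }): void;
}

interface MetricLabels {
  [key: string]: string;
}

function recordMetric(
  name: string,
  value: number,
  labels: MetricLabels,
  env?: { NNO_METRICS?: MetricsDataset },
): void {
  const dataset = env?.NNO_METRICS;
  if (dataset) {
    try {
      dataset.writeDataPoint({
        blobs: [name, ...Object.values(labels)], // metric name + label values
        doubles: [value],
        indexes: [name],
      });
    } catch {
      // Metric failures must never break the request path.
    }
    return;
  }
  // No binding available: fall back to a structured console line.
  console.log(
    JSON.stringify({
      metric: name,
      value,
      labels,
      timestamp: new Date().toISOString(), // format is an assumption
    }),
  );
}

// Usage with a fake binding that records what was written.
const written: Array<{ blobs?: string[]; doubles?: number[]; indexes?: string[] }> = [];
const fakeDataset: MetricsDataset = {
  writeDataPoint: (point) => {
    written.push(point);
  },
};
recordMetric("gateway.request.count", 1, { route: "/health" }, { NNO_METRICS: fakeDataset });
```

Because label values are written positionally as blobs, callers would need to pass labels in a stable key order for the blob columns to stay queryable.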

Phase 2 additions to audit infrastructure: The Registry audit_log table gains dedicated columns actor_email TEXT, ip_address TEXT, user_agent TEXT for queryability. Historical rows will have NULL in these columns; the existing metadata JSON column already captures this data for pre-Phase 2 entries. The platform_lifecycle_events table (Phase 2) provides a dedicated lifecycle audit trail separate from the general audit_log — it records every platform status transition with actor, trigger type, and reason.

1.5 What Each Service Logs

NNO Registry

Event	Level	Key fields
Resource created/updated/deleted	`info`	`resourceType`, `resourceId`, `platformId`
Manifest fetched	`debug`	`platformId`, `entityId`, `featureCount`
Audit log write	`debug`	`action`, `actorId`
Query timeout (>500ms)	`warn`	`query`, `durationMs`
Internal error	`error`	`error.stack`

NNO Provisioning

Event	Level	Key fields
Job created	`info`	`jobId`, `operation`, `platformId`
Step started/completed	`info`	`jobId`, `step`, `durationMs`
Step failed	`error`	`jobId`, `step`, `error`, `willRollback`
Rollback started/completed	`warn`	`jobId`, `stepsToRollback`
CF API rate limit hit	`warn`	`endpoint`, `retryAfterMs`
CF API call	`debug`	`method`, `endpoint`, `status`, `durationMs`

NNO CLI Service

Event	Level	Key fields
Repo created	`info`	`platformId`, `repoUrl`
Feature config committed	`info`	`platformId`, `featureId`, `commitSha`
CF Pages build triggered	`info`	`platformId`, `buildId`
GitHub API error	`error`	`endpoint`, `status`, `error`

Platform Auth Workers

Event	Level	Key fields
Login success/failure	`info`	`platformId`, `userId`, `method`, `result`
Session created/invalidated	`info`	`platformId`, `userId`, `sessionId`
Permission denied	`warn`	`platformId`, `userId`, `permission`
2FA triggered	`info`	`platformId`, `userId`, `method`

Auth events are also written to the auth D1 audit_authentication and audit_authorization tables (90-day retention) by the existing audit middleware — the structured log provides real-time streaming; D1 provides queryable history.

Platform Feature Workers

Event	Level	Key fields
Request handled	`info`	`platformId`, `entityId`, `featureId`, `status`, `durationMs`
Auth validation failure	`warn`	`platformId`, `featureId`, `reason`
D1 query slow (>200ms)	`warn`	`featureId`, `query`, `durationMs`
Unhandled error	`error`	`featureId`, `error.stack`

2. Cloudflare Logpush [Phase 2]

Cloudflare Logpush streams real-time Worker logs (including console.log output and HTTP request fields) to a configurable destination.

2.1 Logpush Destinations

Audience	Destination	Retention
NNO internal (all services)	R2 bucket `nno-logs-internal`	90 days
Per-platform logs (for admins)	R2 bucket `nno-logs-{platformId}`	30 days
Security/SIEM (optional)	Datadog / Splunk / custom HTTPS endpoint	Per SIEM policy

2.2 NNO Internal Logpush Configuration

One Logpush job per NNO core service, configured via Cloudflare API at provisioning time:

// Called once during NNO core service deployment
async function createLogpushJob(
  accountId: string,
  cfApiToken: string,
  workerName: string,
  r2BucketName: string,
): Promise<void> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/logpush/jobs`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${cfApiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        name: `logpush-${workerName}`,
        destination_conf: `r2://${r2BucketName}/{DATE}/{HOUR}/{filename}`,
        dataset: "workers_trace_events",
        // Only forward events emitted by this Worker
        filter: JSON.stringify({
          where: {
            key: "ScriptName",
            operator: "eq",
            value: workerName,
          },
        }),
        logpull_options:
          "fields=Event,EventTimestampMs,Outcome,Logs,ScriptName,Ray",
        enabled: true,
      }),
    },
  );

  // Surface API failures instead of silently ignoring them
  if (!res.ok) {
    throw new Error(
      `Logpush job creation for ${workerName} failed with HTTP ${res.status}`,
    );
  }
}

2.3 Platform Logpush Configuration

NNO Provisioning creates a Logpush job for each platform's Workers during platform provisioning. All Workers belonging to a platform (auth, feature Workers) are collected into one per-platform R2 bucket:

nno-logs-{platformId}/
├── 2026-02-22/
│   ├── 00/
│   │   └── {platformId}-auth-prod-{uuid}.json.gz
│   ├── 01/
│   │   └── {platformId}-analytics-prod-{uuid}.json.gz
│   └── ...
└── 2026-02-23/
    └── ...

Platform admins can download log files directly from R2 via a signed URL generated by the NNO Portal. In Phase 2, a basic log search UI is provided in the Portal.
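As a rough sketch of how the Portal backend might locate files in this layout, the helpers below derive the bucket name and the date/hour prefix for a given point in time. The function names are hypothetical, and the use of UTC for the directory names is an assumption that the layout above does not state.

```typescript
// Hypothetical helpers for navigating the per-platform bucket layout.

/** Bucket name for a platform, following the nno-logs-{platformId} pattern. */
function logBucketFor(platformId: string): string {
  return `nno-logs-${platformId}`;
}

/** Date/hour prefix (assumed UTC), e.g. "2026-02-22/01/". */
function logPrefixFor(at: Date): string {
  const iso = at.toISOString(); // always UTC, e.g. 2026-02-22T01:15:00.000Z
  const date = iso.slice(0, 10); // YYYY-MM-DD
  const hour = iso.slice(11, 13); // HH
  return `${date}/${hour}/`;
}

const bucket = logBucketFor("k3m9p2xw7q");
const prefix = logPrefixFor(new Date("2026-02-22T01:15:00Z"));
console.log(`${bucket}/${prefix}`);
```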

2.4 Logpush Record Format

Each Logpush record from Cloudflare Workers contains:

{
  "ScriptName": "k3m9p2xw7q-r8n4t6y1z5-analytics-prod",
  "Ray": "8b2e4f1a2b3c4d5e",
  "Outcome": "ok",
  "EventTimestampMs": 1740220440123,
  "Logs": [
    {
      "Level": "log",
      "Message": [
        "{\"traceId\":\"abc\",\"service\":\"analytics\",\"level\":\"info\",\"message\":\"GET /api/data 200\",\"durationMs\":45}"
      ],
      "TimestampMs": 1740220440100
    }
  ]
}

The Logs[*].Message[0] field contains the JSON-stringified LogEntry emitted by the Worker's logger. NNO log tooling parses this to extract structured fields.
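The parsing step could look roughly like the sketch below. The interfaces cover only the fields shown in the record above, and `extractEntries` is a hypothetical name; the actual NNO log tooling is not shown in this document.

```typescript
// Only the fields from the example record are modelled here.
interface LogpushLogLine {
  Level: string;
  Message: unknown[];
  TimestampMs: number;
}

interface LogpushRecord {
  ScriptName: string;
  Ray?: string;
  Outcome: string;
  EventTimestampMs: number;
  Logs: LogpushLogLine[];
}

type ParsedEntry = Record<string, unknown>;

/** Hypothetical helper: pulls structured log entries out of one record. */
function extractEntries(record: LogpushRecord): ParsedEntry[] {
  const entries: ParsedEntry[] = [];
  for (const line of record.Logs) {
    const first = line.Message[0];
    if (typeof first !== "string") continue;
    try {
      const parsed: unknown = JSON.parse(first);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        entries.push(parsed as ParsedEntry);
      }
    } catch {
      // Not JSON (for example a plain-text console.log); skip it.
    }
  }
  return entries;
}

const sampleRecord: LogpushRecord = {
  ScriptName: "k3m9p2xw7q-r8n4t6y1z5-analytics-prod",
  Outcome: "ok",
  EventTimestampMs: 1740220440123,
  Logs: [
    {
      Level: "log",
      Message: ['{"traceId":"abc","service":"analytics","level":"info","message":"ok"}'],
      TimestampMs: 1740220440100,
    },
    { Level: "log", Message: ["plain text line"], TimestampMs: 1740220440101 },
  ],
};

const parsedEntries = extractEntries(sampleRecord);
console.log(parsedEntries.length);
```

Skipping unparseable messages rather than throwing keeps one stray plain-text `console.log` from discarding the rest of a record.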

3. Cloudflare Analytics Engine (Metrics) [Phase 2]

3.1 CFAE Datasets

One Analytics Engine dataset per logical domain. Dataset names follow the NNO naming convention:

Dataset	Workers that write to it	Key measurements
`nno-core-ops`	Registry, Provisioning, CLI Service, Stack Registry, IAM	Operation counts, durations, error rates per NNO service
`{platformId}-usage`	All Workers for a given platform	Per-feature invocation counts (reused from billing metering)
`{platformId}-perf`	All Workers for a given platform	Response latency percentiles per feature + endpoint
`nno-auth-events`	All Auth Workers across all platforms	Login events, session counts (anonymised), 2FA usage
`nno-builds`	NNO CLI Service	CF Pages build outcomes, durations

3.2 Data Point Schema

`nno-core-ops`

analytics.writeDataPoint({
  blobs: [
    service, // blob1: e.g. 'registry'
    operation, // blob2: e.g. 'GET /platforms'
    outcome, // blob3: 'success' | 'error' | 'timeout'
    platformId, // blob4: which platform the operation was for (or 'nno-internal')
  ],
  doubles: [
    1, // double1: request count (always 1 per data point)
    durationMs, // double2: response time in ms
    isError ? 1 : 0, // double3: error flag
  ],
  indexes: [service],
});

`{platformId}-perf`

analytics.writeDataPoint({
  blobs: [
    featureId, // blob1: e.g. 'analytics'
    endpoint, // blob2: e.g. 'GET /api/data'
    String(status), // blob3: HTTP status code
    entityId, // blob4: tenant
  ],
  doubles: [1, durationMs, status >= 500 ? 1 : 0],
  indexes: [featureId],
});
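A small pure helper can keep the column order of the perf schema in one place, so that writers and queries agree on what blob1 through blob4 and double1 through double3 mean. The sketch below is an assumption about how that might be organised; `buildPerfDataPoint` and `PerfSample` are hypothetical names.

```typescript
interface DataPoint {
  blobs: string[];
  doubles: number[];
  indexes: string[];
}

interface PerfSample {
  featureId: string;
  endpoint: string;
  status: number;
  entityId: string;
  durationMs: number;
}

/** Hypothetical helper mapping one request sample onto the perf schema. */
function buildPerfDataPoint(sample: PerfSample): DataPoint {
  return {
    blobs: [
      sample.featureId, // blob1
      sample.endpoint, // blob2
      String(sample.status), // blob3
      sample.entityId, // blob4
    ],
    doubles: [
      1, // double1: request count
      sample.durationMs, // double2: response time in ms
      sample.status >= 500 ? 1 : 0, // double3: error flag
    ],
    indexes: [sample.featureId],
  };
}

const perfPoint = buildPerfDataPoint({
  featureId: "analytics",
  endpoint: "GET /api/data",
  status: 503,
  entityId: "r8n4t6y1z5", // sample tenant id for illustration
  durationMs: 45,
});
console.log(JSON.stringify(perfPoint));
```

The result can be passed directly to `writeDataPoint`, and it can be unit-tested without an Analytics Engine binding.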

3.3 Querying CFAE

NNO Portal queries CFAE via the Analytics Engine SQL API:

// Error rate for all features on a platform (last 24h)
const featureErrorRateSql = `
  SELECT
    blob1                          AS feature_id,
    SUM(double1)                   AS total_requests,
    SUM(double3)                   AS error_count,
    AVG(double2)                   AS avg_duration_ms,
    quantileWeighted(0.95)(double2, double1) AS p95_duration_ms
  FROM   ${platformId}_perf
  WHERE  timestamp >= now() - INTERVAL '24' HOUR
  GROUP BY blob1
  ORDER BY total_requests DESC
`;

// Provisioning job error rate per operation (last 7 days)
const provisioningErrorRateSql = `
  SELECT
    blob2        AS operation,
    SUM(double1) AS total,
    SUM(double3) AS errors,
    ROUND(100.0 * SUM(double3) / SUM(double1), 2) AS error_pct
  FROM   nno_core_ops
  WHERE  blob1 = 'provisioning'
    AND  timestamp >= now() - INTERVAL '7' DAY
  GROUP BY blob2
`;
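Submitting such a query might look like the following sketch. It assumes the Analytics Engine SQL API endpoint under the account-scoped Cloudflare API path, with the SQL text sent as the POST body and a bearer token for authentication. The function names and the decision to return the raw JSON response are illustrative choices, not the Portal's actual code.

```typescript
/** Builds the HTTP request for an Analytics Engine SQL query (sketch). */
function buildSqlRequest(
  accountId: string,
  apiToken: string,
  sql: string,
): Request {
  return new Request(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}` },
      body: sql, // the SQL text is the request body
    },
  );
}

/** Runs the query and returns the parsed response body. */
async function runSqlQuery(
  accountId: string,
  apiToken: string,
  sql: string,
): Promise<unknown> {
  const res = await fetch(buildSqlRequest(accountId, apiToken, sql));
  if (!res.ok) {
    throw new Error(`Analytics Engine query failed with HTTP ${res.status}`);
  }
  return res.json();
}

// Building the request does not touch the network, so it is easy to inspect.
const exampleRequest = buildSqlRequest(
  "acc123", // placeholder account id
  "token-placeholder",
  "SELECT SUM(double1) AS total FROM nno_core_ops",
);
console.log(exampleRequest.method, exampleRequest.url);
```

Since the platform ID is interpolated into the table name, the backend would presumably validate it against known platform IDs before building the SQL string.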

4. Distributed Tracing [Phase 1]

Cloudflare Workers do not support OpenTelemetry natively (no spans, no trace context propagation built-in). NNO implements a lightweight tracing model using HTTP headers and CFAE span events.

4.1 Trace ID Propagation

An x-trace-id header is generated at the NNO Gateway and propagated through every downstream service call:

Client request
    → NNO Gateway         x-trace-id: <uuid>  x-request-id: <uuid>  (generated here via crypto.randomUUID())
    → NNO Registry        x-trace-id: <uuid>  x-request-id: <uuid>  (forwarded via withTraceHeaders)
    → NNO Provisioning    x-trace-id: <uuid>  x-request-id: <uuid>  (forwarded)
    → CF API call         (external — trace stops)

If a request already carries x-trace-id (e.g., from the NNO CLI), it is preserved and used throughout.

// packages/logger/src/trace.ts

export function initTrace(request: Request): {
  traceId: string;
  requestId: string;
} {
  const traceId = request.headers.get("x-trace-id") ?? crypto.randomUUID();
  const requestId =
    request.headers.get("x-request-id") ?? crypto.randomUUID();
  return { traceId, requestId };
}

export function withTraceHeaders(
  headers: Headers | [string, string][] | Record<string, string> | undefined,
  traceId: string,
  requestId: string,
): Headers {
  const result = new Headers(headers);
  result.set("x-trace-id", traceId);
  result.set("x-request-id", requestId);
  return result;
}

Key differences from the previous documentation:

No global TRACE_STORE — there is no module-level Map. Trace context is passed explicitly via function arguments and Hono context, not stored globally.
No currentTrace() — this function does not exist. Services access traceId/requestId from the Hono context (c.get("traceId"), c.get("requestId")) or pass them explicitly.
initTrace uses crypto.randomUUID() (Web Crypto API, available in all CF Workers), not nanoid. IDs have no tr_/req_ prefix.
initTrace reads both x-trace-id and x-request-id headers from the incoming request, falling back to crypto.randomUUID() for each.
withTraceHeaders requires explicit traceId and requestId arguments — it does not read from a global store. It sets both x-trace-id and x-request-id on the outgoing headers.
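A short usage sketch of the two helpers follows. The function bodies are copied from the listing above so that the example runs on its own; the request URL and the `accept` header are made up for illustration.

```typescript
// Copies of the helpers shown above, repeated so the sketch is self-contained.
function initTrace(request: Request): { traceId: string; requestId: string } {
  const traceId = request.headers.get("x-trace-id") ?? crypto.randomUUID();
  const requestId = request.headers.get("x-request-id") ?? crypto.randomUUID();
  return { traceId, requestId };
}

function withTraceHeaders(
  headers: Headers | [string, string][] | Record<string, string> | undefined,
  traceId: string,
  requestId: string,
): Headers {
  const result = new Headers(headers);
  result.set("x-trace-id", traceId);
  result.set("x-request-id", requestId);
  return result;
}

// An incoming request that already carries a trace ID (e.g. from the CLI).
const incoming = new Request("https://gateway.example/platforms", {
  headers: { "x-trace-id": "trace-from-cli" },
});

// The existing trace ID is preserved; the missing request ID is generated.
const trace = initTrace(incoming);

// Headers for an outbound call to a downstream service.
const outbound = withTraceHeaders(
  { accept: "application/json" },
  trace.traceId,
  trace.requestId,
);

console.log(outbound.get("x-trace-id"), outbound.get("x-request-id"));
```

In a Hono handler, the two IDs would come from `c.get("traceId")` and `c.get("requestId")` instead of a second call to `initTrace`.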

4.2 CFAE Span Events [Phase 2]

Not yet implemented. The spans.ts file does not exist in packages/logger. The emitSpan function and CFAE span dataset are planned for Phase 2 alongside the broader Analytics Engine integration. The design below is the target specification.

For operations spanning multiple async steps (provisioning jobs, stack activation pipeline), NNO will emit span events to CFAE:

// packages/logger/src/spans.ts  (Phase 2 — not yet implemented)

export function emitSpan(
  analytics: AnalyticsEngineDataset,
  span: {
    traceId: string;
    spanId: string;
    parentSpanId?: string;
    service: string;
    operation: string;
    startMs: number;
    endMs: number;
    outcome: "ok" | "error";
    platformId?: string;
  },
): void {
  analytics.writeDataPoint({
    blobs: [
      span.traceId,
      span.spanId,
      span.parentSpanId ?? "",
      span.service,
      span.operation,
      span.outcome,
      span.platformId ?? "nno-internal",
    ],
    doubles: [
      span.endMs - span.startMs, // duration in ms
      span.outcome === "error" ? 1 : 0,
    ],
    indexes: [span.traceId],
  });
}

With indexes: [traceId], all spans for a given trace will be fetchable efficiently:

-- All spans for a given trace (Phase 2)
SELECT blob4 AS service, blob5 AS operation,
       double1 AS duration_ms, blob6 AS outcome,
       blob2 AS span_id, blob3 AS parent_span_id
FROM   nno_core_ops_spans
WHERE  index1 = '3f1c9a2e-6b7d-4e8f-9a0b-1c2d3e4f5a6b'
ORDER BY timestamp ASC

4.3 Trace Correlation with Logs

Because every log entry includes traceId, a single trace ID allows correlation of:

All log lines across all NNO services that handled the request
All CFAE span events for the operation
The Logpush records for that specific Cloudflare Ray ID

This gives a complete picture of a single user action across the entire stack without a dedicated tracing backend.
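For the log side of that correlation, a minimal sketch is shown below: it takes NDJSON text (for example the concatenated log lines of several services), keeps the entries whose `traceId` matches, and orders them by timestamp. The function name and the sample lines are assumptions for illustration.

```typescript
type TraceLogEntry = Record<string, unknown>;

/** Hypothetical helper: all entries for one trace, oldest first. */
function entriesForTrace(ndjson: string, traceId: string): TraceLogEntry[] {
  const matches: TraceLogEntry[] = [];
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      const candidate = parsed as TraceLogEntry;
      if (candidate.traceId === traceId) matches.push(candidate);
    }
  }
  // ISO 8601 UTC timestamps sort chronologically as plain strings.
  return matches.sort((a, b) => {
    const x = String(a.timestamp);
    const y = String(b.timestamp);
    return x < y ? -1 : x > y ? 1 : 0;
  });
}

const sampleLines = [
  '{"timestamp":"2026-02-22T10:34:00.200Z","service":"registry","traceId":"t-1","message":"lookup"}',
  '{"timestamp":"2026-02-22T10:34:00.100Z","service":"gateway","traceId":"t-1","message":"request"}',
  '{"timestamp":"2026-02-22T10:34:00.150Z","service":"billing","traceId":"t-2","message":"other"}',
].join("\n");

const correlated = entriesForTrace(sampleLines, "t-1");
console.log(correlated.map((e) => e.service).join(" -> "));
```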

5. Key Metrics & SLOs [Phase 2]

5.1 NNO Core Service SLOs

Service	Metric	Target	Alert at
NNO Gateway	Request error rate (5xx)	< 0.1%	> 1%
NNO Gateway	p99 latency	< 500ms	> 1000ms
NNO Registry	Read p99 latency	< 100ms	> 300ms
NNO Registry	Write p99 latency	< 200ms	> 500ms
NNO Provisioning	Job success rate	> 99%	< 97%
NNO Provisioning	PROVISION_PLATFORM duration	< 120s	> 300s
NNO Provisioning	ACTIVATE_FEATURE duration	< 60s	> 180s
NNO CLI Service	Feature activation commit time	< 10s	> 30s
Stack Registry	Template publish p95	< 5s	> 15s
Stack Registry	Version validation duration	< 60s	> 300s

5.2 Platform Worker SLOs (per platform)

Metric	Target	Notes
Auth Worker p95 latency	< 200ms	Login, session validation
Feature Worker p95 latency	< 300ms	Per activated feature
Feature Worker error rate	< 0.5%	5xx responses
CF Pages build success rate	> 98%	Platform shell rebuild
CF Pages build duration	< 120s	Typical `pnpm install + build`

5.3 Platform Shell SLOs (client-facing)

Metric	Target	Notes
Shell TTFB (CF Pages CDN)	< 50ms	Static asset, fully cached
Auth session check latency	< 100ms	Cookie cache hit
Feature registry init time	< 50ms	Static imports, no network
Remote manifest fetch (Phase 2)	< 30ms	CDN-cached KV read

6. Alerting [Phase 2]

Alerts are sent via email (NNO email Worker) and optionally to a Slack webhook or PagerDuty.

6.1 Alert Configuration

// Stored in NNO Registry — per-platform alert config
interface AlertConfig {
  platformId: string;
  email: string; // platform admin email
  slackWebhookUrl?: string;
  pagerdutyKey?: string;
  alerts: {
    errorRateThreshold: number; // e.g. 0.05 = 5%
    latencyP99ThresholdMs: number; // e.g. 1000
    buildFailureAlert: boolean;
    provisioningFailureAlert: boolean;
    usageAlerts: boolean; // from billing metering
  };
}
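How these thresholds might be applied to one evaluation window is sketched below. The `WindowStats` shape, the `evaluateWindow` function, and the choice to treat an empty window as a 0% error rate are all assumptions; the alerting Worker itself is not specified in this document.

```typescript
// Subset of AlertConfig["alerts"] that this sketch uses.
interface AlertThresholds {
  errorRateThreshold: number; // fraction, e.g. 0.05 = 5%
  latencyP99ThresholdMs: number;
}

// Hypothetical aggregate for one evaluation window (e.g. 5 minutes).
interface WindowStats {
  totalRequests: number;
  errorCount: number;
  p99LatencyMs: number;
}

type Breach = "errorRate" | "latencyP99";

/** Returns the thresholds that the window exceeded. */
function evaluateWindow(
  thresholds: AlertThresholds,
  stats: WindowStats,
): Breach[] {
  const breaches: Breach[] = [];
  // Guard against division by zero when the window saw no traffic.
  const errorRate =
    stats.totalRequests > 0 ? stats.errorCount / stats.totalRequests : 0;
  if (errorRate > thresholds.errorRateThreshold) breaches.push("errorRate");
  if (stats.p99LatencyMs > thresholds.latencyP99ThresholdMs) {
    breaches.push("latencyP99");
  }
  return breaches;
}

// 124 errors out of 1000 requests is 12.4%, above a 5% threshold.
const breaches = evaluateWindow(
  { errorRateThreshold: 0.05, latencyP99ThresholdMs: 1000 },
  { totalRequests: 1000, errorCount: 124, p99LatencyMs: 400 },
);
console.log(breaches);
```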

6.2 Alert Categories

Category	Trigger	Severity	Default recipients
Platform Worker error spike	Feature Worker 5xx rate > 5% in 5-min window	High	Platform admin
Auth Worker down	Auth Worker serving 0 requests for 2 min	Critical	Platform admin + NNO ops
CF Pages build failure	Build exits non-zero	Medium	Platform admin
Provisioning job failed	Job reaches `FAILED` state	High	NNO ops team
Provisioning job timeout	Job running > 10 min	Medium	NNO ops team
Registry D1 latency	p99 read > 300ms for 5 min	High	NNO ops team
Stack Registry validation failure rate	> 20% of publish attempts in 1 hour	Low	NNO ops team
Usage threshold	50% / 75% / 90% / 100% of tier limit	Info → Critical	Platform admin

6.3 Alert Message Format

[NNO ALERT] Platform k3m9p2xw7q — analytics Worker error rate critical

Platform:  AcmeCorp (k3m9p2xw7q)
Feature:   analytics
Alert:     Worker error rate exceeded threshold
Threshold: 5%
Current:   12.4% (over last 5 minutes)
Time:      2026-02-22 10:34:00 UTC

Affected endpoints:
  POST /api/data/export — 45% error rate
  GET  /api/report      — 8% error rate

Action: Review logs at https://portal.nno.app/platforms/k3m9p2xw7q/observability
Trace a recent error: https://portal.nno.app/platforms/k3m9p2xw7q/logs?traceId=3f1c9a2e-6b7d-4e8f-9a0b-1c2d3e4f5a6b

—
NNO Platform Monitoring

7. Platform Admin Dashboard (NNO Portal) [Phase 2]

Platform admins access observability via NNO Portal → Observability (/platforms/{id}/observability).

7.1 Overview Tab

Health status — Traffic light per deployed Worker (auth, each feature) based on real-time error rate
Request volume — Chart: total requests/hour across all platform Workers (last 24h, Recharts)
Error rate — Chart: 5xx error rate per feature (last 24h)
Active builds — CF Pages build status card with link to CF dashboard

7.2 Feature Performance Tab

Per-feature breakdown:

Request rate (req/min), error rate (%), p50/p95/p99 latency
Top 10 slowest endpoints
Top 10 most-erroring endpoints

Data source: GET /api/observability/features?platformId={id}&range=24h, backed by a CFAE query on the {platformId}-perf dataset.

7.3 Logs Tab

Search by: time range, feature, log level, trace ID, user ID, free-text
Log viewer — Paginated list of structured log entries, expandable to show full JSON
Download — Export log files as .json.gz from R2 (signed URL, 15-min expiry)

Data source: Logpush files in nno-logs-{platformId} R2 bucket, read via the NNO Portal backend.

Phase 1: Download only. Phase 2: Real-time search via a lightweight log indexing Worker.

7.4 Audit Log Tab

All auth events (logins, logouts, org changes) from audit_authentication table
All authorization decisions from audit_authorization table
Filterable by event type, user, result (success/failure)
90-day retention, paginated

Data source: GET /api/auth/admin/audit (auth Worker endpoint, platform-admin only)

8. NNO Operator Dashboard (Internal) [Phase 2]

NNO operators access a richer view via the NNO Portal's internal tools section (/internal/observability).

8.1 Cross-Platform Health

Fleet overview — One row per active platform: Worker count, aggregate error rate, last build status, subscription tier
Incident heatmap — Time × platform grid; cells coloured by error rate

8.2 Provisioning Monitor

Active provisioning jobs with live step progress
Failed jobs with rollback status and error details
Historical job completion times (p50/p95 per operation type)
CF API quota usage (Workers scripts deployed today vs. 200/day limit)

8.3 Stack Registry Pipeline

Submission queue depth (PENDING + IN_REVIEW counts)
Automated validation pass/fail rate (last 7 days)
Average review cycle time (submission → approval)
Recently approved/rejected packages

8.4 Trace Explorer

Search by traceId across all CFAE span datasets
Renders a waterfall diagram of span durations across services
Links to Logpush records for the same trace

9. Retention Policies

Data store	Retention	Reasoning
CFAE metrics (all datasets)	90 days rolling	CF Analytics Engine limit
Logpush to R2 (internal)	90 days	NNO debugging and audit
Logpush to R2 (per platform)	30 days	Platform admin access
Auth D1 audit tables	90 days	Compliance requirement
Registry D1 `audit_log`	1 year	Platform provisioning audit
Billing `usage_snapshots`	2 years	Invoice dispute resolution
Billing `invoices`	7 years	Legal/accounting requirement

R2 lifecycle rules are configured at bucket creation to auto-delete objects beyond their retention window.

10. Wrangler Configuration

CFAE Binding (all NNO Workers + platform Workers)

# Added to every NNO service and every provisioned platform Worker template

[[analytics_engine_datasets]]
binding = "ANALYTICS"
dataset = "nno-core-ops"       # NNO internal services

# Platform Workers use:
# dataset = "{platformId}-usage"   (invocation metering — billing)
# dataset = "{platformId}-perf"    (latency / error rate — observability)

R2 Bindings (NNO logging Worker)

# services/nno-logging/wrangler.toml

[[r2_buckets]]
binding     = "INTERNAL_LOGS"
bucket_name = "nno-logs-internal"

# Per-platform log buckets are created dynamically at provisioning time:
# bucket_name = "nno-logs-{platformId}"

Secrets

Secret	Description
`CF_API_TOKEN`	Token with Logpush:Edit permission (for job creation at provisioning)
`CF_ACCOUNT_ID`	Cloudflare account ID
`ALERT_EMAIL`	NNO ops team email for internal alerts
`ALERT_SLACK_WEBHOOK`	Slack webhook for NNO ops alerts
`PAGERDUTY_KEY`	PagerDuty routing key for critical alerts

11. Implementation Phases

Phase 1 Current State

The following observability infrastructure is built and deployed:

Component	Status	Details
`packages/logger` — `Logger` class	✅ Live	Structured JSON log emission via `console.log`
`packages/logger` — `requestLogger` middleware	✅ Live	Hono middleware logging every HTTP request
`packages/logger` — `x-trace-id` propagation	✅ Live	Trace ID generated at gateway, forwarded to downstream services
`packages/logger` — `recordMetric` helper	✅ Live	Writes to CF Analytics Engine `nno_metrics` dataset (console fallback in dev)
Deployed in IAM	✅ Live	`services/iam` uses `Logger` + `requestLogger`
Deployed in Registry	✅ Live	`services/registry` uses `Logger` + `requestLogger`
Deployed in Billing	✅ Live	`services/billing` uses `Logger` + `requestLogger`
Deployed in Provisioning	✅ Live	`services/provisioning` uses `Logger` + `requestLogger`
Deployed in CLI Service	✅ Live	`services/cli` uses `Logger` + `requestLogger`
Deployed in Gateway	✅ Live	`services/gateway` has `requestIdMiddleware` + `tracingMiddleware` (via `@neutrino-io/core/tracing`)

The following are not yet implemented (Phase 2):

Cloudflare Logpush jobs and R2 log destinations
Cloudflare Analytics Engine (CFAE) dataset bindings and metric writes
SLO definitions and alert configuration
Platform admin observability dashboard (NNO Portal)
NNO operator internal dashboard

Implementation delta: Observability Phase 1 Plan.

Status: Partially implemented. Phase 1 structured logging is deployed; Phase 2 Logpush/CFAE/dashboards are planned.
Implementation target: packages/logger/ · services/logging/ · apps/console/
Related: System Architecture · NNO Provisioning · NNO Registry · NNO Billing & Metering · NNO Auth Model

NNO Observability
