Monitoring

Ensure your Init application runs smoothly in production with comprehensive monitoring, error tracking, and performance insights. This guide covers observability strategies and implementation.

Overview

Production monitoring provides:

  • Real-time visibility - Monitor application health and performance
  • Error tracking - Catch and resolve issues before users are affected
  • Performance insights - Optimize slow queries and bottlenecks
  • User analytics - Understand usage patterns and behavior
  • Alerting - Get notified of critical issues immediately

Monitoring Stack

Application Performance Monitoring (APM)

Sentry provides comprehensive error tracking and performance monitoring:

// lib/sentry.ts
import * as Sentry from "@sentry/nextjs";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV,

  // Performance monitoring
  tracesSampleRate: process.env.NODE_ENV === "production" ? 0.1 : 1.0,

  // Session Replay sampling (only takes effect in the browser config; see the
  // client-side sketch below)
  replaysSessionSampleRate: 0.1,
  replaysOnErrorSampleRate: 1.0,

  integrations: [
    // v7-style server integrations; SDK v8+ exposes these as functions,
    // e.g. Sentry.httpIntegration() and Sentry.postgresIntegration()
    new Sentry.Integrations.Http({ tracing: true }),
    new Sentry.Integrations.Postgres(),
  ],
});
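
Note that @sentry/nextjs conventionally splits configuration into per-runtime files (sentry.client.config.ts, sentry.server.config.ts), and the replay sample rates only apply in the browser, where the Replay integration must be enabled. A minimal client-side sketch, assuming SDK v8+ where integrations are plain functions:

// sentry.client.config.ts (loaded by @sentry/nextjs in the browser)
import * as Sentry from "@sentry/nextjs";

Sentry.init({
  // Client-side env vars must be NEXT_PUBLIC_-prefixed in Next.js
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  environment: process.env.NODE_ENV,

  tracesSampleRate: 0.1,

  // Session Replay sampling; requires the integration below
  replaysSessionSampleRate: 0.1,
  replaysOnErrorSampleRate: 1.0,

  integrations: [Sentry.replayIntegration()],
});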

Configuration

Add to next.config.js:

const { withSentryConfig } = require("@sentry/nextjs");

const nextConfig = {
  // Your existing config
};

module.exports = withSentryConfig(nextConfig, {
  silent: true,
  org: "your-org",
  project: "init-app",

  widenClientFileUpload: true,
  transpileClientSDK: true,
  tunnelRoute: "/monitoring",
  hideSourceMaps: true,
  disableLogger: true,
});

Custom Error Context

// lib/error-tracking.ts
import * as Sentry from "@sentry/nextjs";

export function captureUserContext(user: { id: string; email: string }) {
  Sentry.setUser({
    id: user.id,
    email: user.email,
  });
}

export function captureTeamContext(team: { id: string; name: string }) {
  Sentry.setTag("team.id", team.id);
  Sentry.setTag("team.name", team.name);
}

export function captureError(error: Error, context?: Record<string, any>) {
  Sentry.withScope((scope) => {
    if (context) {
      scope.setContext("custom", context);
    }
    Sentry.captureException(error);
  });
}
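
A usage sketch showing how these helpers combine in a server-side handler; the function and runBillingSync helper below are illustrative, not part of Init:

// Example usage in a server action or API route (names are illustrative)
import {
  captureError,
  captureTeamContext,
  captureUserContext,
} from "@/lib/error-tracking";

export async function syncTeamBilling(
  user: { id: string; email: string },
  team: { id: string; name: string },
) {
  captureUserContext(user);
  captureTeamContext(team);

  try {
    await runBillingSync(team.id); // hypothetical helper
  } catch (error) {
    captureError(error as Error, { feature: "billing-sync", teamId: team.id });
    throw error;
  }
}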

Infrastructure Monitoring

Vercel Analytics

For apps deployed on Vercel:

// app/layout.tsx
import { Analytics } from "@vercel/analytics/react";
import { SpeedInsights } from "@vercel/speed-insights/next";

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <html lang="en">
      <body>
        {children}
        <Analytics />
        <SpeedInsights />
      </body>
    </html>
  );
}

Custom Health Checks

// app/api/health/route.ts
import { NextRequest } from "next/server";
import { db } from "@repo/db/drizzle-client";

export async function GET(request: NextRequest) {
  const checks = {
    timestamp: new Date().toISOString(),
    status: "healthy",
    version: process.env.npm_package_version || "unknown",
    environment: process.env.NODE_ENV,
    checks: {} as Record<string, any>,
  };

  try {
    // Database health check
    const dbStart = Date.now();
    await db.execute("SELECT 1");
    checks.checks.database = {
      status: "ok",
      responseTime: Date.now() - dbStart,
    };

    // External API health checks
    if (process.env.OPENAI_API_KEY) {
      checks.checks.openai = await checkOpenAI();
    }

    return Response.json(checks);
  } catch (error) {
    checks.status = "unhealthy";
    checks.checks.error = error instanceof Error ? error.message : String(error);

    return Response.json(checks, { status: 503 });
  }
}

async function checkOpenAI() {
  const start = Date.now();

  try {
    const response = await fetch("https://api.openai.com/v1/models", {
      headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
      signal: AbortSignal.timeout(5000),
    });

    return {
      status: response.ok ? "ok" : "error",
      // Measure latency locally rather than relying on a response header
      responseTime: Date.now() - start,
    };
  } catch {
    return { status: "error", responseTime: Date.now() - start };
  }
}

Database Monitoring

Query Performance

Monitor slow queries with Supabase:

-- Enable query statistics (ALTER SYSTEM needs superuser; on managed Postgres
-- such as Supabase, adjust these settings from the dashboard instead)
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s
ALTER SYSTEM SET log_statement = 'all'; -- Very verbose; usually unnecessary once the duration threshold is set
SELECT pg_reload_conf();

-- Create monitoring view (requires the pg_stat_statements extension)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE OR REPLACE VIEW slow_queries AS
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  rows,
  100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
WHERE mean_exec_time > 100 -- Queries averaging > 100ms
ORDER BY mean_exec_time DESC;

Database Health Metrics

// lib/db-monitoring.ts
import { db } from "@repo/db/drizzle-client";

export async function getDatabaseMetrics() {
  const [connectionStats] = await db.execute(`
    SELECT 
      COUNT(*) as total_connections,
      COUNT(*) FILTER (WHERE state = 'active') as active_connections,
      COUNT(*) FILTER (WHERE state = 'idle') as idle_connections
    FROM pg_stat_activity 
    WHERE datname = current_database()
  `);

  // Keep all rows here; the query returns up to 10 tables
  const tableStats = await db.execute(`
    SELECT 
      schemaname,
      tablename,
      n_tup_ins as inserts,
      n_tup_upd as updates,
      n_tup_del as deletes,
      seq_scan as sequential_scans,
      seq_tup_read as sequential_reads,
      idx_scan as index_scans,
      idx_tup_fetch as index_reads
    FROM pg_stat_user_tables
    ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC
    LIMIT 10
  `);

  return {
    connections: connectionStats,
    tables: tableStats,
    timestamp: new Date().toISOString(),
  };
}

API Monitoring

tRPC Middleware

Monitor API performance with custom middleware:

// packages/api/src/trpc.ts
const performanceMiddleware = t.middleware(async ({ next, path, type }) => {
  const start = Date.now();
  const result = await next();
  const duration = Date.now() - start;

  // Log slow requests
  if (duration > 1000) {
    console.warn(`Slow ${type} request: ${path} took ${duration}ms`);
  }

  // Send metrics to your monitoring service (logMetric is sketched after this block)
  if (process.env.NODE_ENV === "production") {
    await logMetric({
      name: "trpc.request.duration",
      value: duration,
      tags: {
        procedure: path,
        type,
        status: result.ok ? "success" : "error",
      },
    });
  }

  return result;
});

// Apply to all procedures
export const monitoredProcedure = t.procedure.use(performanceMiddleware);
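
logMetric is not part of tRPC; it is assumed to be a small app-level helper. One possible sketch that forwards metrics to an HTTP collector (the file path, METRICS_ENDPOINT variable, and payload shape are assumptions):

// packages/api/src/metrics.ts (hypothetical helper used by the middleware above)
interface Metric {
  name: string;
  value: number;
  tags?: Record<string, string>;
}

export async function logMetric(metric: Metric) {
  try {
    // Replace with your metrics backend (Datadog, Axiom, a custom endpoint, ...)
    await fetch(process.env.METRICS_ENDPOINT ?? "http://localhost:3000/api/metrics", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...metric, timestamp: Date.now() }),
    });
  } catch {
    // Never let metrics delivery break the request path
  }
}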

Request Logging

// middleware.ts
import type { NextRequest } from "next/server";
import { NextResponse } from "next/server";

export function middleware(request: NextRequest) {
  const start = Date.now();
  const response = NextResponse.next();

  // Log request details
  response.headers.set("x-request-id", crypto.randomUUID());

  if (process.env.NODE_ENV === "production") {
    // Async logging (don't await). Note: NextResponse.next() does not wait for
    // the route handler, so this duration covers middleware work only.
    void logRequest({
      method: request.method,
      url: request.url,
      userAgent: request.headers.get("user-agent"),
      timestamp: new Date().toISOString(),
      duration: Date.now() - start,
    });
  }

  return response;
}

async function logRequest(data: any) {
  try {
    // Send to your logging service
    await fetch("https://your-logging-service.com/api/logs", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(data),
    });
  } catch {
    // Fail silently to avoid affecting user experience
  }
}
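
To avoid logging static assets, the standard Next.js matcher config can scope the middleware; the pattern below is one common choice and can be adjusted to your routes:

// middleware.ts (continued)
export const config = {
  // Skip Next.js internals and static files
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};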

User Experience Monitoring

Web Vitals

Track Core Web Vitals automatically:

// lib/analytics.ts
// web-vitals v3+ exposes on* callbacks (and INP has replaced FID as a Core Web Vital)
import { onCLS, onFCP, onINP, onLCP, onTTFB } from "web-vitals";

function sendToAnalytics(metric: any) {
  // Send to your analytics service
  if (typeof window !== "undefined") {
    navigator.sendBeacon(
      "/api/analytics",
      JSON.stringify({
        name: metric.name,
        value: metric.value,
        id: metric.id,
        url: window.location.href,
        timestamp: Date.now(),
      }),
    );
  }
}

// Measure all Web Vitals
onCLS(sendToAnalytics);
onINP(sendToAnalytics);
onFCP(sendToAnalytics);
onLCP(sendToAnalytics);
onTTFB(sendToAnalytics);
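
Because this module registers its listeners at import time, it has to be loaded from a client component. One minimal wrapper (the component name is illustrative) is shown below; render it once in app/layout.tsx alongside the Vercel Analytics components:

// components/web-vitals.tsx
"use client";

// Importing the module for its side effects registers the Web Vitals listeners once in the browser.
import "@/lib/analytics";

export function WebVitals() {
  return null;
}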

Custom Performance Metrics

// lib/performance.ts
export class PerformanceTracker {
  private static instance: PerformanceTracker;

  static getInstance() {
    if (!this.instance) {
      this.instance = new PerformanceTracker();
    }
    return this.instance;
  }

  trackPageLoad(pageName: string) {
    if (typeof window !== "undefined") {
      const navigation = performance.getEntriesByType(
        "navigation",
      )[0] as PerformanceNavigationTiming | undefined;

      if (!navigation) return;

      this.sendMetric({
        name: "page.load",
        // Full page load: navigation start to the end of the load event
        value: navigation.loadEventEnd - navigation.startTime,
        tags: { page: pageName },
      });
    }
  }

  trackUserInteraction(action: string, target: string) {
    this.sendMetric({
      name: "user.interaction",
      value: 1,
      tags: { action, target },
    });
  }

  trackAPICall(
    endpoint: string,
    duration: number,
    status: "success" | "error",
  ) {
    this.sendMetric({
      name: "api.call",
      value: duration,
      tags: { endpoint, status },
    });
  }

  private buffer: any[] = [];

  private sendMetric(metric: any) {
    // Buffer metrics and send in batches
    this.buffer.push(metric);

    if (this.buffer.length >= 10) {
      void this.flush();
    }
  }

  private async flush() {
    if (this.buffer.length === 0) return;

    try {
      await fetch("/api/metrics", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(this.buffer),
      });
      this.buffer = [];
    } catch (error) {
      console.error("Failed to send metrics:", error);
    }
  }
}

// Usage in components
export function usePerformanceTracking() {
  const tracker = PerformanceTracker.getInstance();

  return {
    trackInteraction: tracker.trackUserInteraction.bind(tracker),
    trackPageLoad: tracker.trackPageLoad.bind(tracker),
  };
}
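
The tracker posts batches to /api/metrics, which is not defined elsewhere in this guide. A minimal receiving route could look like the sketch below; the forwarding target is an assumption, and structured console output is often enough to start with when your platform has a log drain:

// app/api/metrics/route.ts
export async function POST(request: Request) {
  const metrics = await request.json();

  if (!Array.isArray(metrics)) {
    return Response.json({ error: "Expected an array of metrics" }, { status: 400 });
  }

  // Forward to your metrics backend here (Datadog, Axiom, ClickHouse, ...)
  console.log(JSON.stringify({ type: "client-metrics", count: metrics.length, metrics }));

  return Response.json({ ok: true });
}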

Alerting

Error Rate Alerts

// lib/alerts.ts
import { sql } from "drizzle-orm";
import { db } from "@repo/db/drizzle-client";

export async function checkErrorRate() {
  const last24h = new Date(Date.now() - 24 * 60 * 60 * 1000);

  // Values interpolated into the sql`` template are parameterized by drizzle
  const errorCount = await db.execute(sql`
    SELECT COUNT(*) as error_count
    FROM error_logs
    WHERE created_at > ${last24h}
  `);

  const totalRequests = await db.execute(sql`
    SELECT COUNT(*) as total_requests
    FROM request_logs
    WHERE created_at > ${last24h}
  `);

  // .rows assumes the node-postgres driver; with postgres-js the result is the rows array itself
  const errorRate =
    Number(errorCount.rows[0].error_count) /
    Number(totalRequests.rows[0].total_requests);

  if (errorRate > 0.05) {
    // 5% error rate threshold
    await sendAlert({
      type: "error_rate",
      message: `High error rate detected: ${(errorRate * 100).toFixed(2)}%`,
      severity: "high",
    });
  }
}

export async function sendAlert(alert: any) {
  // Send to Slack, Discord, email, etc.
  await fetch(process.env.WEBHOOK_URL!, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: `🚨 **${alert.type}**: ${alert.message}`,
      severity: alert.severity,
    }),
  });
}

Performance Alerts

// lib/performance-alerts.ts
import { sql } from "drizzle-orm";
import { db } from "@repo/db/drizzle-client";
import { sendAlert } from "@/lib/alerts";

export async function checkPerformanceMetrics() {
  const metrics = await getAverageResponseTimes();

  const slowEndpoints = metrics.filter((m) => m.avg_response_time > 2000);

  if (slowEndpoints.length > 0) {
    await sendAlert({
      type: "performance",
      message: `Slow endpoints detected: ${slowEndpoints.map((e) => e.endpoint).join(", ")}`,
      data: slowEndpoints,
    });
  }
}

async function getAverageResponseTimes() {
  const result = await db.execute(sql`
    SELECT 
      endpoint,
      AVG(response_time) as avg_response_time,
      COUNT(*) as request_count
    FROM request_logs 
    WHERE created_at > NOW() - INTERVAL '1 hour'
    GROUP BY endpoint
    HAVING AVG(response_time) > 1000
    ORDER BY avg_response_time DESC
  `);

  // Return plain rows so callers can filter them directly
  return result.rows;
}
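
These checks only run if something invokes them on a schedule. One option, assuming deployment on Vercel with a cron entry in vercel.json pointing at this path (the route path and CRON_SECRET convention are assumptions), is a small route that runs both:

// app/api/cron/alerts/route.ts
import { checkErrorRate } from "@/lib/alerts";
import { checkPerformanceMetrics } from "@/lib/performance-alerts";

export async function GET(request: Request) {
  // Vercel Cron sends "Authorization: Bearer <CRON_SECRET>" when CRON_SECRET is set
  if (request.headers.get("authorization") !== `Bearer ${process.env.CRON_SECRET}`) {
    return Response.json({ error: "Unauthorized" }, { status: 401 });
  }

  await Promise.all([checkErrorRate(), checkPerformanceMetrics()]);

  return Response.json({ ok: true });
}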

Dashboards

Custom Dashboard

// app/admin/dashboard/page.tsx
import { getDashboardMetrics } from "@/lib/monitoring";

export default async function AdminDashboard() {
  const metrics = await getDashboardMetrics();

  return (
    <div className="p-6">
      <h1>System Dashboard</h1>

      <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6">
        <MetricCard
          title="Active Users"
          value={metrics.activeUsers}
          change="+12%"
        />
        <MetricCard
          title="Response Time"
          value={`${metrics.avgResponseTime}ms`}
          change="-5ms"
        />
        <MetricCard
          title="Error Rate"
          value={`${metrics.errorRate}%`}
          change="-0.2%"
        />
        <MetricCard
          title="Database Connections"
          value={metrics.dbConnections}
        />
      </div>

      <div className="mt-8 grid grid-cols-1 lg:grid-cols-2 gap-6">
        <RecentErrors errors={metrics.recentErrors} />
        <SlowQueries queries={metrics.slowQueries} />
      </div>
    </div>
  );
}

function MetricCard({ title, value, change }: any) {
  return (
    <div className="bg-white p-6 rounded-lg shadow">
      <h3 className="text-sm font-medium text-gray-500">{title}</h3>
      <div className="mt-2 flex items-baseline">
        <p className="text-2xl font-semibold text-gray-900">{value}</p>
        {change && (
          <p className={`ml-2 text-sm ${change.startsWith('+') ? 'text-red-600' : 'text-green-600'}`}>
            {change}
          </p>
        )}
      </div>
    </div>
  );
}

Metrics API

// app/api/admin/metrics/route.ts
import { getDatabaseMetrics } from "@/lib/db-monitoring";

export async function GET() {
  try {
    const [dbMetrics, errorStats, performanceStats] = await Promise.all([
      getDatabaseMetrics(),
      getErrorStats(),
      getPerformanceStats(),
    ]);

    return Response.json({
      database: dbMetrics,
      errors: errorStats,
      performance: performanceStats,
      timestamp: new Date().toISOString(),
    });
  } catch (error) {
    return Response.json({ error: "Failed to fetch metrics" }, { status: 500 });
  }
}

async function getErrorStats() {
  // Placeholder values; replace with real queries against your error_logs table
  // (e.g. counts grouped by message over the last 24 hours)
  return {
    totalErrors: 42,
    errorRate: 0.02,
    topErrors: [
      { message: "Database connection timeout", count: 15 },
      { message: "Invalid API key", count: 8 },
      { message: "Rate limit exceeded", count: 5 },
    ],
  };
}

async function getPerformanceStats() {
  // Placeholder values; replace with aggregates from your request_logs table
  return {
    avgResponseTime: 250,
    p95ResponseTime: 800,
    p99ResponseTime: 1500,
    slowestEndpoints: [
      { endpoint: "/api/chat", avgTime: 1200 },
      { endpoint: "/api/team/create", avgTime: 800 },
    ],
  };
}

Log Management

Structured Logging

// lib/logger.ts
export interface LogEntry {
  level: "info" | "warn" | "error" | "debug";
  message: string;
  timestamp: string;
  context?: Record<string, any>;
  data?: any;
  error?: { name: string; message: string; stack?: string };
  userId?: string;
  teamId?: string;
  requestId?: string;
}

export class Logger {
  private context: Record<string, any> = {};

  setContext(context: Record<string, any>) {
    this.context = { ...this.context, ...context };
  }

  info(message: string, data?: any) {
    this.log("info", message, data);
  }

  warn(message: string, data?: any) {
    this.log("warn", message, data);
  }

  error(message: string, error?: Error | any) {
    this.log("error", message, error);
  }

  private log(level: LogEntry["level"], message: string, data?: any) {
    const entry: LogEntry = {
      level,
      message,
      timestamp: new Date().toISOString(),
      context: this.context,
    };

    if (data) {
      if (data instanceof Error) {
        entry.error = {
          name: data.name,
          message: data.message,
          stack: data.stack,
        };
      } else {
        entry.data = data;
      }
    }

    // Console output for development
    if (process.env.NODE_ENV === "development") {
      console.log(JSON.stringify(entry, null, 2));
    }

    // Send to logging service in production
    if (process.env.NODE_ENV === "production") {
      this.sendToLoggingService(entry);
    }
  }

  private async sendToLoggingService(entry: LogEntry) {
    try {
      // Implementation depends on your logging service
      // Examples: DataDog, LogRocket, CloudWatch, etc.
    } catch {
      // Fail silently to avoid affecting application
    }
  }
}

export const logger = new Logger();

// Usage
logger.setContext({ userId: "123", teamId: "abc" });
logger.info("User logged in");
logger.error("Database connection failed", error);

Environment Variables

# Monitoring
SENTRY_DSN=your-sentry-dsn
SENTRY_ORG=your-org
SENTRY_PROJECT=your-project

# Analytics
NEXT_PUBLIC_GA_MEASUREMENT_ID=G-XXXXXXXXXX
VERCEL_ANALYTICS_ID=your-vercel-analytics-id

# Alerting
WEBHOOK_URL=your-slack-webhook-url
ALERT_EMAIL=admin@your-domain.com

# Health Checks
HEALTH_CHECK_TOKEN=your-secret-token
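
HEALTH_CHECK_TOKEN is listed above but not used in the health route shown earlier. If you want to keep that endpoint private, a simple guard can be added at the top of the existing handler; the bearer-header scheme below is an assumption:

// app/api/health/route.ts (addition at the top of the existing GET handler)
export async function GET(request: NextRequest) {
  const token = request.headers.get("authorization")?.replace("Bearer ", "");

  if (process.env.HEALTH_CHECK_TOKEN && token !== process.env.HEALTH_CHECK_TOKEN) {
    return Response.json({ error: "Unauthorized" }, { status: 401 });
  }

  // ...existing checks from the health route above
}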

Best Practices

1. Monitoring Strategy

  • Start simple - Begin with basic error tracking and health checks
  • Add gradually - Implement more sophisticated monitoring as you grow
  • Focus on user impact - Monitor metrics that affect user experience
  • Set meaningful thresholds - Avoid alert fatigue with smart thresholds

2. Performance Monitoring

  • Track Core Web Vitals for user experience
  • Monitor database query performance
  • Set up alerting for slow API endpoints
  • Use performance budgets for assets

3. Error Handling

  • Capture errors with sufficient context
  • Group similar errors to reduce noise
  • Set up escalation rules for critical errors
  • Monitor error trends over time

4. Data Privacy

  • Sanitize sensitive data from logs (see the sketch after this list)
  • Implement log retention policies
  • Use anonymous user IDs when possible
  • Comply with privacy regulations
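
A minimal sketch of scrubbing obviously sensitive fields before a log entry leaves the process; the file name and field list are illustrative and should match your own data model:

// lib/log-sanitize.ts
const SENSITIVE_KEYS = ["password", "token", "apikey", "authorization", "email"];

export function sanitize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sanitize);

  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, v]) =>
        SENSITIVE_KEYS.some((s) => key.toLowerCase().includes(s))
          ? [key, "[REDACTED]"]
          : [key, sanitize(v)],
      ),
    );
  }

  return value;
}

// Usage: logger.info("User signed up", sanitize(formData));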

Troubleshooting

High Error Rates

-- Check recent errors
SELECT message, COUNT(*), MAX(created_at)
FROM error_logs
WHERE created_at > NOW() - INTERVAL '1 hour'
GROUP BY message
ORDER BY COUNT(*) DESC;

Performance Issues

-- Find slow queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY mean_exec_time DESC;

Memory Leaks

// Monitor memory usage (relevant for long-running Node servers; serverless
// functions are too short-lived for this to be meaningful)
setInterval(() => {
  const memUsage = process.memoryUsage();
  console.log({
    rss: Math.round((memUsage.rss / 1024 / 1024) * 100) / 100,
    heapTotal: Math.round((memUsage.heapTotal / 1024 / 1024) * 100) / 100,
    heapUsed: Math.round((memUsage.heapUsed / 1024 / 1024) * 100) / 100,
    external: Math.round((memUsage.external / 1024 / 1024) * 100) / 100,
  });
}, 30000); // Every 30 seconds

Next Steps

After setting up monitoring:

  1. Production checklist - Complete the production readiness guide
  2. User analytics - Implement user behavior tracking
  3. Feedback collection - Set up user feedback systems
  4. Growth strategies - Plan for scaling and growth