System Status

Real-time monitoring dashboard displaying application health, uptime, cache performance, and external API status.

System Status Page (`app/system-status/`)

Purpose

Provides operational visibility for:

Application uptime and availability
Cache performance metrics (hit rate, utilization)
External dependency status (NJ OGIS APIs)
Overall system health

Route

/system-status - System Status Monitoring Dashboard

Features

Real-Time Metrics

Application Uptime: Time since application started
System Status: Healthy / Degraded / Unhealthy
Cache Statistics: Size, utilization, hit rate, evictions
API Reachability: NJ OGIS API status and response times
Auto-Refresh: Updates every 30 seconds

Visual Indicators

Status Badge: Green (healthy), yellow (degraded), red (unhealthy)
Progress Bars: Cache utilization percentage
Charts: Hit rate trends over time (if implemented)
Icons: ✅ Reachable, ❌ Unreachable, ⚠️ Degraded
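
The badge colors and icons listed above can be kept in one lookup table so the color, icon, and text label never drift apart. This is a minimal sketch; the names `SystemStatus`, `StatusBadge`, and `getStatusBadge` are assumptions for illustration and are not taken from the actual component.

```typescript
type SystemStatus = "healthy" | "degraded" | "unhealthy";

interface StatusBadge {
  color: "green" | "yellow" | "red";
  icon: string;
  label: string; // text rendered next to the icon, so color is never the only cue
}

// Hypothetical mapping that mirrors the status badge and icon lists above.
const STATUS_BADGES: Record<SystemStatus, StatusBadge> = {
  healthy: { color: "green", icon: "✅", label: "Healthy" },
  degraded: { color: "yellow", icon: "⚠️", label: "Degraded" },
  unhealthy: { color: "red", icon: "❌", label: "Unhealthy" },
};

function getStatusBadge(status: SystemStatus): StatusBadge {
  return STATUS_BADGES[status];
}
```

Typing the table as `Record<SystemStatus, StatusBadge>` makes the compiler flag any status level that is added later without a matching badge.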

Cache Metrics

Displays cache performance metrics:

interface CacheMetrics {
  size: number; // Current items in cache
  maxSize: number; // Maximum capacity
  utilization: number; // Percentage full
  hits: number; // Total cache hits
  misses: number; // Total cache misses
  hitRate: number; // Hit percentage
  evictions: number; // LRU/LFU evictions
}
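
The two percentage fields can be derived from the raw counters. The sketch below assumes percentages on a 0 to 100 scale, rounded to one decimal place; the helper names `percentage`, `computeUtilization`, and `computeHitRate` are hypothetical and the real cache adapter may compute these differently.

```typescript
// Returns part/total as a percentage rounded to one decimal place.
// A non-positive total yields 0 rather than NaN or Infinity.
function percentage(part: number, total: number): number {
  if (total <= 0) return 0;
  return Math.round((part / total) * 1000) / 10;
}

// utilization: how full the cache is.
function computeUtilization(size: number, maxSize: number): number {
  return percentage(size, maxSize);
}

// hitRate: share of lookups that were served from the cache.
function computeHitRate(hits: number, misses: number): number {
  return percentage(hits, hits + misses);
}
```

For example, `computeUtilization(150, 1000)` gives 15, and a cache that has not served any lookups yet reports a hit rate of 0.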

External API Status

Shows reachability of NJ State APIs:

Status: Reachable / Unreachable
Response Time: Average latency in milliseconds
Last Check: Timestamp of most recent health check
Error Message: Details if unreachable
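
One way to produce the four fields above is to summarize a list of recent health check results. This is a sketch under assumptions: the `HealthCheckSample` and `ApiStatusSummary` shapes, the field names, and the choice to average only successful checks are all illustrative and are not confirmed by the backend.

```typescript
// Hypothetical record of a single health check against an external API.
interface HealthCheckSample {
  ok: boolean;
  responseTimeMs: number;
  checkedAt: string; // ISO 8601 timestamp
  error?: string;
}

interface ApiStatusSummary {
  status: "reachable" | "unreachable";
  averageResponseTimeMs: number | null; // null when no check succeeded
  lastCheck: string | null;
  errorMessage?: string;
}

function summarizeApiChecks(samples: HealthCheckSample[]): ApiStatusSummary {
  if (samples.length === 0) {
    return {
      status: "unreachable",
      averageResponseTimeMs: null,
      lastCheck: null,
      errorMessage: "No health checks recorded",
    };
  }
  const latest = samples[samples.length - 1];
  // Assumption: failed checks are excluded from the latency average.
  const successful = samples.filter((s) => s.ok);
  const average =
    successful.length > 0
      ? Math.round(
          successful.reduce((sum, s) => sum + s.responseTimeMs, 0) / successful.length,
        )
      : null;
  return {
    status: latest.ok ? "reachable" : "unreachable",
    averageResponseTimeMs: average,
    lastCheck: latest.checkedAt,
    // Only attach an error message when the most recent check failed.
    ...(latest.ok ? {} : { errorMessage: latest.error ?? "Unknown error" }),
  };
}
```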

UI Layout

┌─────────────────────────────────────┐
│ System Status Dashboard             │
├─────────────────────────────────────┤
│ Overall Status: ✅ Healthy          │
│ Uptime: 2d 5h 30m                   │
│                                     │
│ Cache Performance                   │
│ ├─ Size: 150 / 1000 (15%)          │
│ ├─ Hit Rate: 78%                    │
│ └─ Evictions: 23                    │
│                                     │
│ External APIs                       │
│ └─ NJ OGIS: ✅ Reachable (234ms)   │
│                                     │
│ Last Updated: 10:30:15              │
│ [Refresh Now] [Auto-refresh: ON]   │
└─────────────────────────────────────┘
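
The uptime line in the mockup uses a compact "2d 5h 30m" format. A minimal sketch of a formatter follows, assuming the API reports uptime in seconds; the function name `formatUptime` and the input unit are assumptions, not confirmed by the source.

```typescript
// Formats an uptime in seconds as "2d 5h 30m", dropping leading zero units.
function formatUptime(totalSeconds: number): string {
  const seconds = Math.max(0, Math.floor(totalSeconds));
  const days = Math.floor(seconds / 86400);
  const hours = Math.floor((seconds % 86400) / 3600);
  const minutes = Math.floor((seconds % 3600) / 60);

  if (days > 0) return `${days}d ${hours}h ${minutes}m`;
  if (hours > 0) return `${hours}h ${minutes}m`;
  return `${minutes}m`;
}
```

With this sketch, `formatUptime(192600)` produces the "2d 5h 30m" value shown in the layout.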

Data Source

The page fetches its data from the /api/system-status endpoint:

async function fetchSystemStatus() {
  const response = await fetch("/api/system-status?checkApis=true");
  const data: HealthCheckResponse = await response.json();
  return data;
}
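
The annotation `const data: HealthCheckResponse` is not checked at runtime, so a malformed payload would pass through silently. The sketch below shows a type guard that could be applied to the parsed JSON. The `HealthCheckResponse` shape used here is hypothetical: the fields `status`, `uptimeSeconds`, and `timestamp` are assumptions, and the real type defined by the System Status API may differ.

```typescript
type SystemStatus = "healthy" | "degraded" | "unhealthy";

// Hypothetical payload shape, for illustration only.
interface HealthCheckResponse {
  status: SystemStatus;
  uptimeSeconds: number;
  timestamp: string;
}

// Narrows an unknown value (e.g. the result of response.json())
// to HealthCheckResponse by checking each assumed field.
function isHealthCheckResponse(value: unknown): value is HealthCheckResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const statusOk =
    v.status === "healthy" || v.status === "degraded" || v.status === "unhealthy";
  return statusOk && typeof v.uptimeSeconds === "number" && typeof v.timestamp === "string";
}
```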

Auto-Refresh

The dashboard refreshes automatically every 30 seconds:

useEffect(() => {
  const interval = setInterval(() => {
    fetchAndUpdateStatus();
  }, 30000); // 30 seconds

  return () => clearInterval(interval);
}, []);

Users can:

Manually Refresh: Click "Refresh Now" button
Toggle Auto-Refresh: Turn on/off automatic updates
Adjust Interval: Configure refresh frequency (10s / 30s / 60s)
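
The three controls above (manual refresh, toggle, adjustable interval) can be modeled outside React as a small controller. This is a framework-agnostic sketch, not the page's actual implementation; the class `AutoRefresher`, the `Timers` interface, and the method names are hypothetical. Timer functions are injectable so the logic can be exercised without waiting for real time to pass.

```typescript
type RefreshIntervalSeconds = 10 | 30 | 60;

interface Timers {
  setInterval: (fn: () => void, ms: number) => unknown;
  clearInterval: (handle: unknown) => void;
}

const defaultTimers: Timers = {
  setInterval: (fn, ms) => setInterval(fn, ms),
  clearInterval: (handle) => clearInterval(handle as ReturnType<typeof setInterval>),
};

class AutoRefresher {
  private handle: unknown = null;
  private intervalSeconds: RefreshIntervalSeconds = 30; // default from the text above
  private readonly refresh: () => void;
  private readonly timers: Timers;

  constructor(refresh: () => void, timers: Timers = defaultTimers) {
    this.refresh = refresh;
    this.timers = timers;
  }

  get enabled(): boolean {
    return this.handle !== null;
  }

  // Starts (or restarts) the timer with the current interval.
  start(): void {
    this.stop();
    this.handle = this.timers.setInterval(this.refresh, this.intervalSeconds * 1000);
  }

  stop(): void {
    if (this.handle !== null) {
      this.timers.clearInterval(this.handle);
      this.handle = null;
    }
  }

  toggle(): void {
    if (this.enabled) {
      this.stop();
    } else {
      this.start();
    }
  }

  // Changing the interval restarts the timer only if auto-refresh is on.
  setIntervalSeconds(seconds: RefreshIntervalSeconds): void {
    this.intervalSeconds = seconds;
    if (this.enabled) this.start();
  }

  // "Refresh Now": runs the callback immediately, independent of the timer.
  refreshNow(): void {
    this.refresh();
  }
}
```

In a React component, an instance could be created inside `useEffect` and stopped in the cleanup function, in the same way the interval is cleared in the hook above.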

Status Levels

✅ Healthy

All systems operational:

Application running
Cache functioning normally
External APIs reachable
No errors or warnings

⚠️ Degraded

Partial functionality:

Application operational
External APIs unreachable (main features work via cache)
Cache nearing capacity (> 90%)

❌ Unhealthy

Critical issues:

Application not responding
Cache disabled or full
Persistent errors
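
The three levels above can be expressed as a single function that checks the unhealthy conditions first, then the degraded ones. This is a sketch of that ordering only; the `HealthInputs` fields and the reading of "full" as utilization of 100% or more are assumptions, and the backend may apply different rules.

```typescript
type SystemStatus = "healthy" | "degraded" | "unhealthy";

// Hypothetical inputs, one per condition listed in the status levels.
interface HealthInputs {
  appResponding: boolean;
  cacheEnabled: boolean;
  cacheUtilization: number; // percentage, 0 to 100
  apisReachable: boolean;
  persistentErrors: boolean;
}

function deriveStatus(inputs: HealthInputs): SystemStatus {
  // Critical issues take precedence over everything else.
  if (
    !inputs.appResponding ||
    !inputs.cacheEnabled ||
    inputs.cacheUtilization >= 100 ||
    inputs.persistentErrors
  ) {
    return "unhealthy";
  }
  // Partial functionality: APIs down or cache above the 90% mark.
  if (!inputs.apisReachable || inputs.cacheUtilization > 90) {
    return "degraded";
  }
  return "healthy";
}
```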

Use Cases

Operations Monitoring

The ops team monitors the dashboard for:

Early warning of API outages
Cache performance degradation
Application stability trends

Incident Response

During incidents, the dashboard helps to:

Quickly identify affected components
Verify fix deployment success
Track recovery progress

Performance Tuning

Developers use metrics to:

Optimize cache size and eviction strategy
Identify API performance bottlenecks
Validate configuration changes

Cache Performance Insights

The dashboard helps answer:

Is caching effective? A high hit rate (> 70%) indicates that most lookups are served from the cache
Is the cache too small? A high eviction rate suggests increasing the cache size
Are there hot addresses? Uneven access patterns may benefit from a different eviction strategy
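
The first two questions can be answered from the aggregate counters alone. The sketch below uses the 70% hit rate guideline from the text; the eviction threshold (`maxEvictionsPerLookup`, default 0.1) is an arbitrary assumption chosen for illustration, and the names `CacheCounters`, `CacheInsights`, and `analyzeCache` are hypothetical. Detecting hot addresses would need per-key access counts, which these counters do not provide.

```typescript
interface CacheCounters {
  hits: number;
  misses: number;
  evictions: number;
}

interface CacheInsights {
  hitRate: number; // percentage of lookups served from cache
  cachingEffective: boolean;
  considerLargerCache: boolean;
}

function analyzeCache(
  counters: CacheCounters,
  maxEvictionsPerLookup = 0.1, // assumed threshold, tune for your workload
): CacheInsights {
  const lookups = counters.hits + counters.misses;
  const hitRate = lookups > 0 ? (counters.hits / lookups) * 100 : 0;
  const evictionsPerLookup = lookups > 0 ? counters.evictions / lookups : 0;
  return {
    hitRate,
    cachingEffective: hitRate > 70,
    considerLargerCache: evictionsPerLookup > maxEvictionsPerLookup,
  };
}
```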

Accessibility

Screen Reader Announcements: Status changes announced via ARIA live regions
Keyboard Navigation: Full keyboard access to all controls
Color + Icons: Status is never conveyed by color alone
Focus Management: Focus on critical status changes
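
For the live-region announcements, one possible approach is to compute the message and the `aria-live` politeness level whenever the status changes. This is a sketch, not the page's actual code: the `Announcement` type, the function name `statusAnnouncement`, the message wording, and the choice of "assertive" only for the unhealthy state are all assumptions.

```typescript
type SystemStatus = "healthy" | "degraded" | "unhealthy";

interface Announcement {
  message: string;
  politeness: "polite" | "assertive"; // value for the aria-live attribute
}

// Returns the text to place in an ARIA live region, or null when
// nothing should be announced (first render or unchanged status).
function statusAnnouncement(
  previous: SystemStatus | null,
  current: SystemStatus,
): Announcement | null {
  if (previous === null || previous === current) return null;
  return {
    message: `System status changed from ${previous} to ${current}`,
    // Interrupt the user only for the most severe state.
    politeness: current === "unhealthy" ? "assertive" : "polite",
  };
}
```

Returning null for unchanged status keeps the 30-second auto-refresh from re-announcing the same state to screen reader users.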

Responsive Design

Mobile: Stacked metrics, simplified charts
Tablet: 2-column layout
Desktop: Full dashboard with detailed metrics and charts

Testing

The status page is tested with:

Unit Tests: Component rendering with mock data
Integration Tests: Fetching from real API endpoint
E2E Tests: Playwright tests verifying dashboard updates

E2E Test Example:

import { test, expect } from "@playwright/test";

test("should display system status", async ({ page }) => {
  await page.goto("/system-status");

  // Verify status displayed. Playwright locators are strict, so .first()
  // avoids an error when a pattern matches more than one element.
  await expect(page.getByText(/system status/i).first()).toBeVisible();
  await expect(page.getByText(/healthy|degraded|unhealthy/i).first()).toBeVisible();

  // Verify metrics
  await expect(page.getByText(/uptime/i).first()).toBeVisible();
  await expect(page.getByText(/cache/i).first()).toBeVisible();

  // Test manual refresh; /refresh now/ does not match the auto-refresh toggle
  await page.getByRole("button", { name: /refresh now/i }).click();
  await expect(page.getByText(/updated/i)).toBeVisible();
});

Performance

Initial Load: < 1s (includes API health check)
Refresh: < 500ms (cached data + quick health check)
Auto-Refresh Impact: Minimal (background fetch)

Historical Data (Future Enhancement)

Potential additions:

Time Series Charts: Uptime trends, hit rate over time
Event Log: Record of status changes and incidents
Alerts: Email/Slack notifications for status changes
SLA Tracking: Uptime percentage for reporting

Related Documentation

System Status API - Backend health check endpoint
Cache Adapter - Cache implementation and metrics
NJ API Adapters - External API integration
Error Types - Error handling and status mapping