metabuilder/frontends/nextjs/docs/ERROR_HANDLING.md
johndoe6345789 b253d582e5 feat: Phase 5.2 - Implement Error Boundaries with Retry Logic
Implement comprehensive error handling system for improved production reliability with error boundaries, automatic retry logic, and user-friendly error categorization.

## Features Added

### 1. RetryableErrorBoundary Component (NEW)
- Enhanced React error boundary with automatic retry logic
- Catches component errors and displays fallback UI
- Automatic retry for transient failures (network, timeout, 5xx)
- Exponential backoff between retries (1s → 2s → 4s → 8s max)
- Retry countdown display with progress indication
- Error categorization with visual indicators (icons, colors)
- User-friendly error messages based on error type
- Developer-friendly error details in development mode
- Support contact information in UI
- Configurable via props (maxAutoRetries, delays, support email)

### 2. Error Categorization System (ENHANCED)
- Automatic error categorization into 10 types:
  - Network (🌐): Network failures, offline, connection errors
  - Authentication (🔐): Auth/session errors (401)
  - Permission (🚫): Access denied (403)
  - Validation (⚠️): Invalid input (400)
  - Not Found (🔍): Resource not found (404)
  - Conflict: Duplicate/conflict (409)
  - Rate Limit (⏱️): Too many requests (429)
  - Server (🖥️): Server errors (5xx)
  - Timeout: Request timeout (408)
  - Unknown (⚠️): All other errors

- Automatic retry eligibility detection
- Suggested recovery actions per category
- Color-coded UI based on error type

### 3. Enhanced Error Reporting Service
- Error categorization with HTTP status code detection
- Pattern-based error type detection
- Retry eligibility determination
- Context-specific user messages
- Query errors by category
- Track error history (last 100 errors)
- Production monitoring hook (placeholder for Sentry/DataDog)

### 4. Async Error Boundary Utilities (NEW)
- withAsyncErrorBoundary(): Wrap async operations with retry logic
- fetchWithErrorBoundary(): Fetch with automatic retry
- tryAsyncOperation(): Safe async wrapper that never throws
- useAsyncErrorHandler(): React hook for async error handling
- Exponential backoff with configurable delays
- Timeout support
- Error reporting and callbacks

### 5. Root Layout Integration
- Wrapped Providers component with RetryableErrorBoundary
- Automatic error recovery at application root
- 3 automatic retry attempts with exponential backoff
- Support contact information displayed

## Files Created

1. frontends/nextjs/src/components/RetryableErrorBoundary.tsx
   - Main retryable error boundary component
   - ~450 lines with full error UI, retry logic, and categorization
   - withRetryableErrorBoundary() HOC for easy component wrapping

2. frontends/nextjs/src/lib/async-error-boundary.ts
   - Async operation wrappers with retry logic
   - ~200 lines with multiple utility functions
   - Integration with error reporting service

3. frontends/nextjs/docs/ERROR_HANDLING.md
   - Comprehensive error handling guide
   - 400+ lines of documentation
   - Usage examples, best practices, common scenarios
   - Error recovery strategies per category
   - API reference for all components and utilities

4. frontends/nextjs/src/lib/error-reporting.test.ts
   - 100+ lines of unit tests
   - Tests for error categorization
   - Tests for retry eligibility
   - Tests for user messages
   - Tests for error history and queries

## Files Modified

1. frontends/nextjs/src/lib/error-reporting.ts
   - Added ErrorCategory type with 10 categories
   - Added error categorization logic
   - Added retry eligibility detection
   - Added suggested action generation
   - Enhanced getUserMessage() with category-specific messages
   - Added getErrorsByCategory() and getRetryableErrors() methods
   - Added extractStatusCode() helper

2. frontends/nextjs/src/app/providers/providers-component.tsx
   - Wrapped children with RetryableErrorBoundary
   - Configured 3 automatic retries
   - Enabled support info display

## Key Behaviors

### Automatic Retry Flow
1. Component error occurs or async operation fails
2. Error is caught and categorized
3. If retryable (network, timeout, 5xx):
   - Schedule automatic retry with exponential backoff
   - Display countdown: "Retrying in Xs..."
   - Retry operation
4. If successful:
   - Reset error state, show success
5. If all retries exhausted:
   - Show error UI with manual retry button

### Error Message Examples
- Network Error: "Network error. Please check your internet connection and try again."
- Auth Error: "Your session has expired. Please log in again."
- Permission Error: "You do not have permission to perform this action."
- Rate Limit: "Too many requests. Please wait a moment and try again."
- Server Error: "A server error occurred. Our team has been notified. Please try again later."

### Retry Configuration
- Max Auto-Retries: 3
- Initial Delay: 1000ms
- Max Delay: 8000ms
- Backoff Multiplier: 2
- Retryable Codes: 408, 429, 500, 502, 503, 504

## Production Readiness

- Error categorization covers all common scenarios
- User messages are clear and actionable
- Retry logic uses proven exponential backoff
- Development mode shows full error details
- Production mode shows user-friendly messages
- Support contact information included
- Comprehensive documentation provided
- Unit tests for core categorization logic

## Migration Notes

Existing ErrorBoundary component remains unchanged for backward compatibility.
New RetryableErrorBoundary is recommended for:
- Root layout
- Admin tools (Schema Editor, Workflow Manager, Database Manager, Script Editor)
- API integration layers
- Dynamic component renderers

## Next Steps (Phase 5.3+)

1. Wrap admin tool packages with RetryableErrorBoundary
2. Add error boundaries around data table components
3. Integrate with Sentry/DataDog monitoring
4. Add error analytics dashboard
5. A/B test error messages for improvement

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-21 02:15:43 +00:00


Error Handling & Boundaries Guide

This document describes the comprehensive error handling system in MetaBuilder, including error boundaries, retry logic, error categorization, and recovery strategies.

Overview

MetaBuilder implements a production-grade error handling system with:

  • Error Boundaries: React Error Boundaries that catch and display component errors
  • Retryable Error Boundaries: Enhanced error boundaries with automatic retry for transient failures
  • Error Categorization: Automatic categorization of errors (network, auth, permission, etc.)
  • Retry Logic: Exponential backoff with configurable retry strategies
  • User-Friendly Messages: Context-specific error messages for different error types
  • Error Recovery: Suggested actions based on error category

Components

1. ErrorBoundary

Basic React error boundary component for catching component rendering errors.

Location: src/components/ErrorBoundary.tsx

Features:

  • Catches JavaScript errors in child component tree
  • Displays user-friendly error UI with technical details in dev mode
  • Manual retry and page reload buttons
  • Error count tracking
  • Error reporting integration

Usage:

import { ErrorBoundary } from '@/components/ErrorBoundary'

export function App() {
  return (
    <ErrorBoundary context={{ component: 'App' }}>
      <YourComponent />
    </ErrorBoundary>
  )
}

With HOC:

import { withErrorBoundary } from '@/components/ErrorBoundary'

const ProtectedComponent = withErrorBoundary(YourComponent)

2. RetryableErrorBoundary

Enhanced error boundary with automatic retry logic for transient failures.

Location: src/components/RetryableErrorBoundary.tsx

Features:

  • Catches component errors
  • Automatic retry for retryable errors (network, timeout, 5xx)
  • Exponential backoff between retries
  • Retry countdown display
  • Error categorization with visual indicators
  • Color-coded UI based on error type
  • Manual retry and page reload options
  • Support contact information
  • Development mode error details

Error Types with Visual Indicators:

| Category | Icon | Color | When Used |
|---|---|---|---|
| Network | 🌐 | Orange | Network failures, offline |
| Authentication | 🔐 | Pink | Auth/session errors (401) |
| Permission | 🚫 | Red | Access denied (403) |
| Validation | ⚠️ | Yellow | Invalid input (400) |
| Not Found | 🔍 | Blue | Resource not found (404) |
| Conflict | | Orange | Duplicate/conflict (409) |
| Rate Limit | ⏱️ | Light Blue | Too many requests (429) |
| Server | 🖥️ | Red | Server errors (5xx) |
| Timeout | | Orange | Request timeout (408) |

Usage:

import { RetryableErrorBoundary } from '@/components/RetryableErrorBoundary'

export function AdminTools() {
  return (
    <RetryableErrorBoundary
      componentName="AdminTools"
      maxAutoRetries={3}
      initialRetryDelayMs={1000}
      maxRetryDelayMs={8000}
      showSupportInfo
      supportEmail="support@metabuilder.dev"
    >
      <SchemaEditor />
      <WorkflowManager />
    </RetryableErrorBoundary>
  )
}

With HOC:

import { withRetryableErrorBoundary } from '@/components/RetryableErrorBoundary'

const ProtectedComponent = withRetryableErrorBoundary(YourComponent, {
  componentName: 'AdminPanel',
  maxAutoRetries: 3,
})

Props:

interface RetryableErrorBoundaryProps {
  children: ReactNode
  fallback?: ReactNode                    // Custom fallback UI
  onError?: (error: Error, errorInfo: ErrorInfo) => void  // Error callback (ErrorInfo is React's type)
  context?: Record<string, unknown>      // Error reporting context
  maxAutoRetries?: number                // Max auto-retries (default: 3)
  initialRetryDelayMs?: number           // Initial retry delay (default: 1000ms)
  maxRetryDelayMs?: number               // Max retry delay (default: 8000ms)
  componentName?: string                 // Component name for debugging
  showSupportInfo?: boolean              // Show support contact (default: true)
  supportEmail?: string                  // Support email address
}

Error Categorization

The error reporting system automatically categorizes errors into 10 types:

Network

  • Indicators: "network", "fetch", "offline" in message
  • Retryable: Yes
  • Suggested Action: "Check your internet connection and try again"

Authentication

  • Indicators: 401 status, "auth", "unauthorized" in message
  • Retryable: No
  • Suggested Action: "Log in again or refresh your credentials"

Permission

  • Indicators: 403 status, "permission", "forbidden" in message
  • Retryable: No
  • Suggested Action: "Contact your administrator for access"

Validation

  • Indicators: 400 status, "validation", "invalid" in message
  • Retryable: No
  • Suggested Action: "Please verify your input and try again"

Not Found

  • Indicators: 404 status, "not found" in message
  • Retryable: No
  • Suggested Action: "The requested resource no longer exists"

Conflict

  • Indicators: 409 status, "conflict", "duplicate" in message
  • Retryable: No
  • Suggested Action: "This resource already exists. Please use a different name"

Rate Limit

  • Indicators: 429 status, "rate", "too many" in message
  • Retryable: Yes
  • Suggested Action: "Too many requests. Please wait a moment and try again"

Server

  • Indicators: 5xx status, "server" in message
  • Retryable: Yes (for 500, 502, 503, 504, as listed under Retryable Status Codes)
  • Suggested Action: "The server is experiencing issues. Please try again later"

Timeout

  • Indicators: 408 status, "timeout" in message
  • Retryable: Yes
  • Suggested Action: "Request took too long. Please try again"

Unknown

  • Indicators: All other errors
  • Retryable: No
  • Suggested Action: "Please try again or contact support"
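
The rules above can be sketched as a single lookup: check the HTTP status first, then fall back to message patterns. This is a minimal illustration, not the code in src/lib/error-reporting.ts. The function names (categorize, extractStatus), the category identifiers other than 'network', 'auth', 'permission', 'server', and 'timeout', and the order in which patterns are tried are all assumptions.

```typescript
// Hypothetical category identifiers; only 'network', 'auth', 'permission',
// 'server', and 'timeout' appear elsewhere in this guide.
type ErrorCategory =
  | 'network' | 'auth' | 'permission' | 'validation' | 'not-found'
  | 'conflict' | 'rate-limit' | 'server' | 'timeout' | 'unknown'

interface Categorized {
  category: ErrorCategory
  isRetryable: boolean
}

// Status codes this guide lists as retryable.
const RETRYABLE_STATUS = new Set<number>([408, 429, 500, 502, 503, 504])
// Categories treated as retryable when no status code is available.
const RETRYABLE_CATEGORIES = new Set<ErrorCategory>(['network', 'timeout', 'rate-limit', 'server'])

// Pull a three-digit HTTP status out of a message such as "503: Service Unavailable".
function extractStatus(message: string): number | undefined {
  const match = /\b([1-5]\d{2})\b/.exec(message)
  return match ? Number(match[1]) : undefined
}

function categoryFromStatus(status: number): ErrorCategory | undefined {
  if (status === 400) return 'validation'
  if (status === 401) return 'auth'
  if (status === 403) return 'permission'
  if (status === 404) return 'not-found'
  if (status === 408) return 'timeout'
  if (status === 409) return 'conflict'
  if (status === 429) return 'rate-limit'
  if (status >= 500 && status <= 599) return 'server'
  return undefined
}

// Message patterns are tried in order; the first match wins.
const MESSAGE_PATTERNS: Array<[RegExp, ErrorCategory]> = [
  [/timeout/i, 'timeout'],
  [/network|fetch|offline/i, 'network'],
  [/unauthorized|auth/i, 'auth'],
  [/permission|forbidden/i, 'permission'],
  [/validation|invalid/i, 'validation'],
  [/not found/i, 'not-found'],
  [/conflict|duplicate/i, 'conflict'],
  [/\brate\b|too many/i, 'rate-limit'],
  [/server/i, 'server'],
]

function categorize(error: Error, statusCode?: number): Categorized {
  const status = statusCode ?? extractStatus(error.message)
  if (status !== undefined) {
    const byStatus = categoryFromStatus(status)
    if (byStatus) return { category: byStatus, isRetryable: RETRYABLE_STATUS.has(status) }
  }
  const matched = MESSAGE_PATTERNS.find(([pattern]) => pattern.test(error.message))
  const category: ErrorCategory = matched ? matched[1] : 'unknown'
  return { category, isRetryable: RETRYABLE_CATEGORIES.has(category) }
}
```

Checking the status before the message keeps an error such as "403: network policy violation" in the permission category rather than network.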

Error Reporting

ErrorReporting Service

The errorReporting singleton handles error categorization, reporting, and user-friendly messages.

Location: src/lib/error-reporting.ts

Key Methods:

// Report an error
const report = errorReporting.reportError(error, context)

// Get user-friendly message
const message = errorReporting.getUserMessage(error, category)

// Query errors
const allErrors = errorReporting.getErrors()
const networkErrors = errorReporting.getErrorsByCategory('network')
const retryableErrors = errorReporting.getRetryableErrors()

// Clear error history
errorReporting.clearErrors()

Error Report Structure:

interface ErrorReport {
  id: string                        // Unique error ID
  message: string                   // Error message
  code?: string                     // Error code (if applicable)
  statusCode?: number               // HTTP status code (if applicable)
  category: ErrorCategory           // Error category
  stack?: string                    // Stack trace
  context: ErrorReportContext       // Additional context
  timestamp: Date                   // When error occurred
  isDevelopment: boolean            // Development mode flag
  isRetryable: boolean              // Can this error be retried?
  suggestedAction?: string          // Suggested recovery action
}
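
The commit notes for this guide mention that error history is limited to the last 100 errors. One way such a bounded store could back the query methods above is sketched here. The class name ErrorHistory, the add method, and the trimmed-down StoredReport shape are illustrative assumptions, not the service's actual internals.

```typescript
// Reduced version of the report shape, kept small for the example.
interface StoredReport {
  id: string
  message: string
  category: string
  isRetryable: boolean
  timestamp: Date
}

class ErrorHistory {
  private reports: StoredReport[] = []
  private readonly maxSize: number

  constructor(maxSize: number = 100) {
    this.maxSize = maxSize
  }

  add(report: StoredReport): void {
    this.reports.push(report)
    // Drop the oldest entries once the cap is exceeded.
    if (this.reports.length > this.maxSize) {
      this.reports.splice(0, this.reports.length - this.maxSize)
    }
  }

  // Return a copy so callers cannot mutate the stored history.
  getErrors(): StoredReport[] {
    return [...this.reports]
  }

  getErrorsByCategory(category: string): StoredReport[] {
    return this.reports.filter(report => report.category === category)
  }

  getRetryableErrors(): StoredReport[] {
    return this.reports.filter(report => report.isRetryable)
  }

  clearErrors(): void {
    this.reports = []
  }
}
```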

Hook for Components:

import { useErrorReporting } from '@/lib/error-reporting'

export function MyComponent() {
  const { reportError, getUserMessage } = useErrorReporting()

  const handleError = (error: Error) => {
    const report = reportError(error, { component: 'MyComponent' })
    console.log(`Error: ${report.message}, Retryable: ${report.isRetryable}`)
  }
}

Async Error Boundary

Utilities for wrapping async operations with error boundaries, retry logic, and error reporting.

Location: src/lib/async-error-boundary.ts

Key Functions:

withAsyncErrorBoundary

Wraps an async operation with retry logic.

import { withAsyncErrorBoundary } from '@/lib/async-error-boundary'

try {
  const data = await withAsyncErrorBoundary(
    () => fetch('/api/data').then(r => r.json()),
    {
      maxRetries: 3,
      initialDelayMs: 100,
      maxDelayMs: 5000,
      timeoutMs: 10000,
      context: { action: 'fetchData' },
      onError: (error, attempt) => {
        console.log(`Attempt ${attempt} failed:`, error.message)
      },
      onRetry: (attempt, error) => {
        console.log(`Retrying attempt ${attempt}`)
      },
      onRetrySuccess: (attempt) => {
        console.log(`Succeeded after ${attempt} retries`)
      },
    }
  )
} catch (error) {
  console.error('All retries exhausted:', error)
}
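
To make the behavior behind these options concrete, here is a minimal sketch of a retry wrapper with exponential backoff and an optional timeout. It is not the implementation in src/lib/async-error-boundary.ts: the names retryWithBackoff and withTimeout are hypothetical, and the sketch retries every error and skips error reporting, whereas the real utility is described as retrying only retryable errors.

```typescript
interface RetryOptions {
  maxRetries?: number
  initialDelayMs?: number
  maxDelayMs?: number
  timeoutMs?: number
  onRetry?: (attempt: number, error: Error) => void
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms))

// Reject if the operation does not settle within timeoutMs.
function withTimeout<T>(operation: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Operation timeout after ${timeoutMs}ms`)),
      timeoutMs,
    )
    operation.then(
      value => { clearTimeout(timer); resolve(value) },
      error => { clearTimeout(timer); reject(error) },
    )
  })
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const { maxRetries = 3, initialDelayMs = 100, maxDelayMs = 5000, timeoutMs, onRetry } = options
  let lastError: Error = new Error('Operation was never attempted')

  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const pending = operation()
      return await (timeoutMs === undefined ? pending : withTimeout(pending, timeoutMs))
    } catch (caught) {
      lastError = caught instanceof Error ? caught : new Error(String(caught))
      if (attempt === maxRetries) break
      // Double the delay on each retry, capped at maxDelayMs.
      const delayMs = Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs)
      if (onRetry) onRetry(attempt + 1, lastError)
      await sleep(delayMs)
    }
  }
  throw lastError
}
```

The wrapper rethrows the last error once retries are exhausted, which is why the usage example above sits inside a try/catch.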

fetchWithErrorBoundary

Fetch with automatic retry and error handling.

import { fetchWithErrorBoundary } from '@/lib/async-error-boundary'

const response = await fetchWithErrorBoundary('/api/data', {}, {
  maxRetries: 3,
  timeoutMs: 10000,
})

tryAsyncOperation

Safe async wrapper that never throws.

import { tryAsyncOperation } from '@/lib/async-error-boundary'

const result = await tryAsyncOperation(
  () => fetch('/api/data').then(r => r.json()),
  { maxRetries: 3 }
)

if (result.success) {
  console.log('Data:', result.data)
} else {
  console.error('Failed:', result.error)
}
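
The success/error shape in this example suggests a discriminated union as the return type. The sketch below shows how a never-throwing wrapper could produce it. AsyncResult and tryAsync are hypothetical names, and retries are omitted to keep the focus on the result shape.

```typescript
// Either a value or an error, distinguished by the `success` flag.
type AsyncResult<T> =
  | { success: true; data: T }
  | { success: false; error: Error }

async function tryAsync<T>(operation: () => Promise<T>): Promise<AsyncResult<T>> {
  try {
    return { success: true, data: await operation() }
  } catch (caught) {
    // Normalize non-Error throwables so callers always receive an Error.
    const error = caught instanceof Error ? caught : new Error(String(caught))
    return { success: false, error }
  }
}
```

Because the failure is returned rather than thrown, callers branch on result.success instead of wrapping every call in try/catch.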

useAsyncErrorHandler

Hook for React components.

import { useAsyncErrorHandler } from '@/lib/async-error-boundary'

export function MyComponent() {
  const { execute, fetchWithRetry, tryOperation } = useAsyncErrorHandler()

  const handleFetch = async () => {
    try {
      const result = await execute(
        () => fetch('/api/data').then(r => r.json()),
        { maxRetries: 3 }
      )
    } catch (error) {
      console.error('Failed:', error)
    }
  }
}

Retry Logic

Exponential Backoff Algorithm

The system uses exponential backoff with jitter to retry failed operations. The base delay, before any jitter is applied, is:

delay = min(initialDelay * backoffMultiplier ^ (attempt - 1), maxDelay)

Here attempt is 1-based, so the first retry waits initialDelay.

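Default Configuration (async utilities; RetryableErrorBoundary defaults to a 1000ms initial delay and an 8000ms max delay):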

  • Initial Delay: 100ms
  • Max Delay: 5000ms
  • Backoff Multiplier: 2
  • Max Retries: 3

Example Retry Schedule:

  • Attempt 1: 100ms
  • Attempt 2: 200ms
  • Attempt 3: 400ms
  • Attempt 4: 800ms (later attempts keep doubling, capped at 5000ms)
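
The schedule above can be reproduced with a few lines of arithmetic. In this sketch the function name backoffDelay and the BackoffOptions shape are assumptions, the attempt number is treated as 1-based so that attempt 1 waits the initial delay, and jitter is left out because the formula does not specify it.

```typescript
interface BackoffOptions {
  initialDelayMs: number
  maxDelayMs: number
  backoffMultiplier: number
}

// Defaults documented for the async utilities.
const DEFAULT_BACKOFF: BackoffOptions = {
  initialDelayMs: 100,
  maxDelayMs: 5000,
  backoffMultiplier: 2,
}

// attempt is 1-based: attempt 1 waits initialDelayMs.
function backoffDelay(attempt: number, options: BackoffOptions = DEFAULT_BACKOFF): number {
  const exponent = Math.max(0, attempt - 1)
  const uncapped = options.initialDelayMs * Math.pow(options.backoffMultiplier, exponent)
  return Math.min(uncapped, options.maxDelayMs)
}

const schedule = [1, 2, 3, 4, 5, 6, 7].map(attempt => backoffDelay(attempt))
// schedule is [100, 200, 400, 800, 1600, 3200, 5000]
```

With the RetryableErrorBoundary defaults (1000ms initial, 8000ms max), the same function yields 1000, 2000, 4000, 8000, and stays at 8000 afterwards.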

Retryable Status Codes

Only certain HTTP status codes trigger automatic retry:

  • 408: Request Timeout
  • 429: Too Many Requests (Rate Limit)
  • 500: Internal Server Error
  • 502: Bad Gateway
  • 503: Service Unavailable
  • 504: Gateway Timeout

Non-retryable codes (all other 4xx): the request fails immediately without retrying.
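
As an illustration of this rule, the sketch below retries a request only when the response status is in the list above and fails immediately otherwise. It is not fetchWithErrorBoundary itself: fetchWithStatusRetry, ResponseLike, and FetchLike are hypothetical names, the fetch function is injected so the example runs without a network, and rejected requests (for example, being offline) are not retried here.

```typescript
// Minimal response shape so the sketch does not depend on DOM typings.
interface ResponseLike {
  ok: boolean
  status: number
  statusText: string
}
type FetchLike = (url: string) => Promise<ResponseLike>

const RETRYABLE_STATUS_CODES = new Set<number>([408, 429, 500, 502, 503, 504])

async function fetchWithStatusRetry(
  url: string,
  fetchImpl: FetchLike,
  maxRetries: number = 3,
  initialDelayMs: number = 100,
  maxDelayMs: number = 5000,
): Promise<ResponseLike> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetchImpl(url)
    if (response.ok) return response

    const retryable = RETRYABLE_STATUS_CODES.has(response.status)
    if (!retryable || attempt === maxRetries) {
      // Non-retryable status, or retries exhausted: surface the failure.
      throw new Error(`${response.status}: ${response.statusText}`)
    }
    const delayMs = Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs)
    await new Promise<void>(resolve => setTimeout(resolve, delayMs))
  }
  throw new Error('maxRetries must be non-negative')
}
```

The thrown message keeps the "status: text" form so that message-based categorization can still recover the status code.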

Best Practices

1. Wrap Root Components

Always wrap root layout components with error boundaries:

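// Root layout
import type { ReactNode } from 'react'
import { RetryableErrorBoundary } from '@/components/RetryableErrorBoundary'

export default function RootLayout({ children }: { children: ReactNode }) {
  return (
    <html>
      <body>
        <RetryableErrorBoundary componentName="RootLayout">
          {children}
        </RetryableErrorBoundary>
      </body>
    </html>
  )
}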

2. Granular Error Boundaries

Use smaller error boundaries around critical features:

export function AdminPanel() {
  return (
    <div>
      <RetryableErrorBoundary componentName="SchemaEditor">
        <SchemaEditor />
      </RetryableErrorBoundary>
      <RetryableErrorBoundary componentName="WorkflowManager">
        <WorkflowManager />
      </RetryableErrorBoundary>
    </div>
  )
}

3. Async Operations

Wrap async operations with withAsyncErrorBoundary so that transient failures are retried and each failed attempt can be surfaced to the user:

const handleSave = async () => {
  try {
    const result = await withAsyncErrorBoundary(
      () => api.save(data),
      {
        maxRetries: 2,
        context: { action: 'save' },
        onError: (error) => {
          toast.error(errorReporting.getUserMessage(error))
        },
      }
    )
  } catch (error) {
    console.error('Save failed:', error)
  }
}

4. Development vs Production

Error details are automatically managed:

  • Development: Full error messages and stack traces shown
  • Production: User-friendly messages, technical details hidden

No code changes needed; set NODE_ENV=production to enable production mode.

5. Error Context

Always provide context for error reporting:

<RetryableErrorBoundary
  context={{
    userId: currentUser.id,
    tenantId: currentTenant.id,
    feature: 'schemaEditor',
  }}
>
  <SchemaEditor />
</RetryableErrorBoundary>

Error Recovery Strategies by Type

Network Errors

  • Automatic: Retry with exponential backoff
  • Manual: User clicks "Try Again"
  • If Persists: Show "Check your connection" and support contact

Authentication Errors

  • Automatic: No automatic retry
  • Manual: User logs in again
  • If Persists: Redirect to login page

Permission Errors

  • Automatic: No automatic retry
  • Manual: Contact administrator
  • If Persists: Show permission request UI

Validation Errors

  • Automatic: No automatic retry
  • Manual: User fixes input and retries
  • If Persists: Show validation error details

Rate Limit Errors

  • Automatic: Retry with longer exponential backoff
  • Manual: User waits and retries
  • If Persists: Show rate limit exceeded message

Server Errors

  • Automatic: Retry with exponential backoff
  • Manual: User clicks "Try Again"
  • If Persists: Show server error and support contact
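
The strategies above can also be read as a lookup table keyed by category. The sketch below encodes them that way. RECOVERY_STRATEGIES, nextRecoveryStep, and the 'rate-limit' identifier are illustrative assumptions; the strings are taken from the lists above.

```typescript
type RecoverableCategory = 'network' | 'auth' | 'permission' | 'validation' | 'rate-limit' | 'server'

interface RecoveryStrategy {
  autoRetry: boolean
  manual: string
  ifPersists: string
}

const RECOVERY_STRATEGIES: Record<RecoverableCategory, RecoveryStrategy> = {
  network: { autoRetry: true, manual: 'User clicks "Try Again"', ifPersists: 'Show "Check your connection" and support contact' },
  auth: { autoRetry: false, manual: 'User logs in again', ifPersists: 'Redirect to login page' },
  permission: { autoRetry: false, manual: 'Contact administrator', ifPersists: 'Show permission request UI' },
  validation: { autoRetry: false, manual: 'User fixes input and retries', ifPersists: 'Show validation error details' },
  'rate-limit': { autoRetry: true, manual: 'User waits and retries', ifPersists: 'Show rate limit exceeded message' },
  server: { autoRetry: true, manual: 'User clicks "Try Again"', ifPersists: 'Show server error and support contact' },
}

// Pick the next step: automatic retry while attempts remain, otherwise the manual action.
function nextRecoveryStep(category: RecoverableCategory, autoRetriesLeft: number): string {
  const strategy = RECOVERY_STRATEGIES[category]
  if (strategy.autoRetry && autoRetriesLeft > 0) {
    return 'Retry automatically with exponential backoff'
  }
  return strategy.manual
}
```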

Monitoring & Analytics

Error Tracking

Development:

const errors = errorReporting.getErrors()
const networkErrors = errorReporting.getErrorsByCategory('network')

Production:

// TODO: Implement monitoring integration (Sentry, DataDog, etc.)
// See sendToMonitoring() in error-reporting.ts

Error Statistics

Query errors by category:

const categories = ['network', 'auth', 'server', 'timeout']
categories.forEach(category => {
  const errors = errorReporting.getErrorsByCategory(category)
  console.log(`${category}: ${errors.length} errors`)
})

Common Error Scenarios

Scenario 1: Network Timeout

User Action: Click "Save"
↓
API Request Timeout (408)
↓
Error Category: timeout
↓
Is Retryable: Yes
↓
Action: Automatic retry in 1s
↓
Retry Succeeds: User sees success message
OR
All Retries Fail: Show "Request took too long" + retry button

Scenario 2: Permission Denied

User Action: Access Admin Panel
↓
API Returns 403 Forbidden
↓
Error Category: permission
↓
Is Retryable: No
↓
Show: "You do not have permission" + contact admin
↓
No Automatic Retry

Scenario 3: Server Error

User Action: Load Dashboard
↓
API Returns 503 Service Unavailable
↓
Error Category: server
↓
Is Retryable: Yes
↓
Action: Automatic retry with exponential backoff
↓
Shows: "Retrying in 2s..." countdown
↓
Success or Exhausted: User sees result

Testing Error Boundaries

Manual Testing

  1. Throw an error in component render:

if (someCondition) {
  throw new Error('Test error')
}

  2. Test async errors using withAsyncErrorBoundary:

const result = await withAsyncErrorBoundary(
  () => Promise.reject(new Error('Test error')),
  { maxRetries: 1 }
)

  3. Trigger different error categories:

  • Network: throw new Error('Network error')
  • Auth: throw new Error('401: Unauthorized')
  • Server: throw new Error('500: Internal Server Error')

Automated Testing

See e2e/ directory for Playwright tests covering:

  • Error boundary activation
  • Retry logic and countdown
  • Error categorization
  • Recovery actions

Future Enhancements

  • Integration with Sentry/DataDog for production monitoring
  • Error aggregation dashboard
  • Automatic error recovery rules
  • A/B testing of error messages
  • Error analytics and reporting
  • Integration with support ticketing system
  • Offline error queue for offline-first scenarios

API Reference

See inline documentation in:

  • src/components/ErrorBoundary.tsx - Basic error boundary
  • src/components/RetryableErrorBoundary.tsx - Retryable error boundary
  • src/lib/error-reporting.ts - Error reporting service
  • src/lib/async-error-boundary.ts - Async error utilities
  • src/lib/api/retry.ts - Low-level retry logic