Files
WizardMerge/.github/issues/10-testing-quality.md
2025-12-27 03:11:55 +00:00

11 KiB

title, labels, assignees, milestone
title labels assignees milestone
Phase 2.7: Comprehensive Testing & Quality Assurance
testing
quality
phase-2
high-priority
Phase 2 - Intelligence & Usability

Overview

Establish comprehensive testing infrastructure and quality assurance processes to ensure WizardMerge is reliable, performant, and correct. This includes unit tests, integration tests, performance benchmarks, and fuzzing.

Phase 2.7 - Testing & Quality

Motivation

As WizardMerge grows more complex with semantic merging, SDG analysis, and multi-platform support, we need:

  • Confidence that changes don't break existing functionality
  • Performance metrics to prevent regressions
  • Edge case coverage to handle real-world scenarios
  • Quality documentation and examples

Testing Strategy

1. Unit Tests

Coverage Target: >90% code coverage

Backend (C++)

  • Three-way merge algorithm

    • Test all merge cases (clean merge, conflicts, auto-resolution)
    • Test edge cases (empty files, binary files, large files)
    • Test different line endings (LF, CRLF)
  • Semantic mergers

    • JSON merger tests (objects, arrays, nested structures)
    • YAML merger tests (comments, anchors, multi-document)
    • XML merger tests (namespaces, attributes, DTD)
    • Package file merger tests (version conflicts, dependencies)
  • AST mergers

    • Python: imports, functions, classes
    • JavaScript: ES6 modules, React components
    • Java: classes, methods, annotations
    • C++: includes, namespaces, templates
  • SDG analysis

    • Dependency graph construction
    • Edge classification
    • Conflict analysis
    • Suggestion generation
  • Git integration

    • Git CLI operations
    • Repository detection
    • Branch operations
    • PR/MR fetching

Framework: Google Test (gtest)

// Example unit test
TEST(ThreeWayMergeTest, NonOverlappingChanges) {
  ThreeWayMerge merger;
  std::string base = "line1\nline2\nline3\n";
  std::string ours = "line1\nline2_modified\nline3\n";
  std::string theirs = "line1\nline2\nline3_modified\n";
  
  auto result = merger.merge(base, ours, theirs);
  
  ASSERT_TRUE(result.success);
  ASSERT_FALSE(result.has_conflicts);
  EXPECT_EQ(result.merged_content, "line1\nline2_modified\nline3_modified\n");
}

Frontends

Qt6 (C++):

  • UI component tests
  • QML integration tests
  • Model-view tests

Framework: Qt Test

Next.js (TypeScript):

  • Component tests (React Testing Library)
  • API client tests
  • Integration tests
  • E2E tests (Playwright or Cypress)

Framework: Jest, React Testing Library, Playwright

// Example component test
import { render, screen, fireEvent } from '@testing-library/react';
import ConflictPanel from './ConflictPanel';

test('renders conflict and resolves with "ours"', () => {
  const conflict = { id: 1, ours: 'code A', theirs: 'code B' };
  const onResolve = jest.fn();
  
  render(<ConflictPanel conflict={conflict} onResolve={onResolve} />);
  
  const oursButton = screen.getByText('Keep Ours');
  fireEvent.click(oursButton);
  
  expect(onResolve).toHaveBeenCalledWith(1, 'ours');
});

2. Integration Tests

Test interactions between components:

  • Backend + Git

    • Clone repo, create branch, commit changes
    • Fetch PR/MR data, apply merge, create branch
  • Backend + Frontend

    • API calls from UI
    • WebSocket updates
    • File upload/download
  • End-to-end scenarios

    • User resolves conflict via UI
    • CLI resolves PR conflicts
    • Batch resolution of multiple files

Framework:

  • C++: Integration test suite with real Git repos
  • Next.js: Playwright for E2E testing
// Example E2E test (Playwright)
test('resolve conflict via web UI', async ({ page }) => {
  await page.goto('http://localhost:3000');
  
  // Upload conflicted file
  await page.setInputFiles('input[type=file]', 'test_conflict.txt');
  
  // Wait for merge analysis
  await page.waitForSelector('.conflict-panel');
  
  // Click "Keep Ours"
  await page.click('button:has-text("Keep Ours")');
  
  // Verify resolution
  const resolved = await page.textContent('.merged-content');
  expect(resolved).toContain('code A');
  expect(resolved).not.toContain('<<<<<<<');
});

3. Performance Benchmarks

Goals:

  • Merge time: <100ms for files up to 10MB
  • API response: <500ms for typical PRs
  • UI rendering: <50ms for typical conflicts
  • SDG analysis: <500ms for files up to 2000 lines

Benchmark Suite:

// Benchmark framework: Google Benchmark
static void BM_ThreeWayMerge_SmallFile(benchmark::State& state) {
  std::string base = generate_file(100);  // 100 lines
  std::string ours = modify_lines(base, 10);
  std::string theirs = modify_lines(base, 10);
  
  ThreeWayMerge merger;
  for (auto _ : state) {
    auto result = merger.merge(base, ours, theirs);
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(BM_ThreeWayMerge_SmallFile);

static void BM_ThreeWayMerge_LargeFile(benchmark::State& state) {
  std::string base = generate_file(10000);  // 10k lines
  std::string ours = modify_lines(base, 100);
  std::string theirs = modify_lines(base, 100);
  
  ThreeWayMerge merger;
  for (auto _ : state) {
    auto result = merger.merge(base, ours, theirs);
    benchmark::DoNotOptimize(result);
  }
}
BENCHMARK(BM_ThreeWayMerge_LargeFile);

Metrics to Track:

  • Execution time (median, p95, p99)
  • Memory usage
  • CPU usage
  • Throughput (files/second)

Regression Detection:

  • Run benchmarks on every commit
  • Alert if performance degrades >10%
  • Track performance over time

4. Fuzzing

Find edge cases and bugs with fuzz testing:

Targets:

  • Three-way merge algorithm
  • JSON/YAML/XML parsers
  • Git URL parsing
  • API input validation

Framework: libFuzzer, AFL++, or OSS-Fuzz

// Example fuzz target
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  std::string input(reinterpret_cast<const char*>(data), size);
  
  ThreeWayMerge merger;
  try {
    // Try to crash the merger with random input
    auto result = merger.merge(input, input, input);
  } catch (...) {
    // Catch exceptions to continue fuzzing
  }
  
  return 0;
}

Goals:

  • Find crashes and hangs
  • Discover edge cases not covered by unit tests
  • Improve input validation
  • Run continuously in CI

5. Test Data & Fixtures

Real-World Test Cases:

  • Collect conflicts from popular open-source projects
  • Build test dataset with various conflict types
  • Include edge cases (large files, binary files, unusual encodings)
  • Categorize by difficulty (simple, medium, complex)

Test Repositories:

tests/
├── fixtures/
│   ├── conflicts/
│   │   ├── simple/
│   │   │   ├── 01-non-overlapping.txt
│   │   │   ├── 02-identical-changes.txt
│   │   │   └── ...
│   │   ├── medium/
│   │   │   ├── 01-json-merge.json
│   │   │   ├── 02-python-imports.py
│   │   │   └── ...
│   │   └── complex/
│   │       ├── 01-sdg-analysis-needed.cpp
│   │       ├── 02-multi-file-dependencies.zip
│   │       └── ...
│   ├── repositories/
│   │   ├── test-repo-1/  # Git repo for integration tests
│   │   ├── test-repo-2/
│   │   └── ...
│   └── api-responses/
│       ├── github-pr-123.json
│       ├── gitlab-mr-456.json
│       └── ...

6. Continuous Integration

CI Pipeline:

# .github/workflows/ci.yml
name: CI

on: [push, pull_request]

jobs:
  test-backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build backend
        run: cd backend && ./build.sh
      - name: Run unit tests
        run: cd backend/build && ctest --output-on-failure
      - name: Upload coverage
        uses: codecov/codecov-action@v3
  
  test-frontend-nextjs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: oven-sh/setup-bun@v1
      - name: Install dependencies
        run: cd frontends/nextjs && bun install
      - name: Run tests
        run: cd frontends/nextjs && bun test
      - name: E2E tests
        run: cd frontends/nextjs && bun run test:e2e
  
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build backend
        run: cd backend && ./build.sh
      - name: Run benchmarks
        run: cd backend/build && ./benchmarks
      - name: Check for regressions
        run: python scripts/check_benchmark_regression.py
  
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build fuzzer
        run: cd backend && cmake -DFUZZING=ON . && make
      - name: Run fuzzer (5 minutes)
        run: ./backend/build/fuzzer -max_total_time=300

7. Code Quality Tools

  • Static Analysis: clang-tidy, cppcheck
  • Code Coverage: gcov, lcov (C++), Istanbul (JS)
  • Linting: cpplint (C++), ESLint (JS), Prettier
  • Memory Safety: Valgrind, AddressSanitizer
  • Security Scanning: CodeQL (already in use )
# Run all quality checks
./scripts/quality-check.sh

8. Documentation & Examples

  • API Documentation: Doxygen (C++), JSDoc (JS)
  • User Guide: Step-by-step examples
  • Developer Guide: Architecture, contributing
  • Example Conflicts: Tutorials for common scenarios
  • Video Demos: Screen recordings of key features

Implementation Steps

Phase 1: Unit Tests (3 weeks)

  • Set up test frameworks
  • Write unit tests for core algorithms
  • Achieve 80% code coverage
  • CI integration

Phase 2: Integration Tests (2 weeks)

  • Set up test repositories
  • Write integration tests
  • E2E tests for frontends
  • CI integration

Phase 3: Performance Benchmarks (1 week)

  • Set up benchmark framework
  • Write benchmark suite
  • Baseline measurements
  • Regression detection

Phase 4: Fuzzing (1 week)

  • Set up fuzzing infrastructure
  • Write fuzz targets
  • Run continuous fuzzing
  • Fix discovered issues

Phase 5: Quality Tools (1 week)

  • Integrate static analysis
  • Set up code coverage
  • Memory safety checks
  • CI integration

Phase 6: Documentation (2 weeks)

  • Generate API docs
  • Write user guide
  • Create examples
  • Video demos

Acceptance Criteria

  • >90% code coverage for backend
  • >80% code coverage for frontends
  • All unit tests pass
  • All integration tests pass
  • Performance benchmarks meet targets
  • Zero crashes from fuzzing (after fixes)
  • Documentation complete and accurate
  • CI pipeline green on all commits

Priority

HIGH - Quality and reliability are essential for user trust

Estimated Effort

10 weeks (can be done in parallel with feature development)

Dependencies

  • Core features implemented (Phase 1 and 2)
  • #TBD (Phase 2 completion)
  • #TBD (All feature implementation issues)

Success Metrics

  • 0 critical bugs in production
  • <1% test failure rate
  • 95% user satisfaction with stability
  • Performance targets met consistently

Test Coverage Goals

Component Coverage Target
Three-way merge 95%
Semantic mergers 90%
AST mergers 90%
SDG analysis 85%
Git integration 90%
API endpoints 95%
UI components 80%
Overall 90%