Agent Development Guide for DBAL
This document provides guidance for AI agents and automated tools working with the DBAL codebase.
Architecture Philosophy
The DBAL is designed as a language-agnostic contract system that separates:
- API Definition (in YAML) - The source of truth
- Development Implementation (TypeScript) - Fast iteration, testing, debugging
- Production Implementation (C++) - Security, performance, isolation
- Shared Test Vectors - Guarantees behavioral consistency
Key Principles for Agents
1. API Contract is Source of Truth
Always start with the API definition when adding features:
1. Define entity in api/schema/entities/
2. Define operations in api/schema/operations/
3. Generate TypeScript types: python tools/codegen/gen_types.py
4. Generate C++ types: python tools/codegen/gen_types.py --lang=cpp
5. Implement in adapters
6. Add conformance tests
Never add fields, operations, or entities directly in TypeScript or C++ without updating the YAML schemas first.
2. TypeScript is for Development Speed
The TypeScript implementation prioritizes:
- Fast iteration - Quick to modify and test
- Rich ecosystem - npm packages, debugging tools
- Easy prototyping - Try ideas quickly
Use TypeScript for:
- New feature development
- Schema iteration
- Integration testing
- Developer debugging
3. C++ is for Production Security
The C++ implementation prioritizes:
- Security - Process isolation, sandboxing, no user code execution
- Performance - Optimized queries, connection pooling
- Stability - Static typing, memory safety
- Auditability - All operations logged
The C++ daemon provides:
- Credential protection (user code never sees DB URLs/passwords)
- Query validation and sanitization
- Row-level security enforcement
- Resource limits and quotas
4. Conformance Tests Guarantee Parity
Every operation must have conformance tests that run against both implementations:
# common/contracts/conformance_cases.yaml
- name: "User CRUD operations"
  setup:
    - create_user:
        username: "testuser"
        email: "test@example.com"
  tests:
    - create:
        entity: Post
        input: { title: "Test", author_id: "$setup.user.id" }
        expect: { status: "success" }
    - read:
        entity: Post
        input: { id: "$prev.id" }
        expect: { title: "Test" }
CI/CD runs these tests on both TypeScript and C++ implementations. If they diverge, the build fails.
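The parity check can be pictured with a small runner. This is only a sketch under stated assumptions: the step format (`op`, `entity`, `input`), the `resolve` helper, and the in-memory adapter are hypothetical stand-ins, not the format or code used by tools/conformance/run_all.py. It shows how a `$prev.id` reference might be resolved and how two implementations' results could be compared.

```python
import re
from typing import Any, Callable

# Hypothetical adapter signature: (operation, entity, input) -> result dict.
Adapter = Callable[[str, str, dict], dict]

# Matches references such as "$prev.id".
REF = re.compile(r"^\$(\w+)\.(.+)$")


def resolve(value: Any, context: dict) -> Any:
    """Replace "$name.path" strings with values from earlier results."""
    if isinstance(value, dict):
        return {k: resolve(v, context) for k, v in value.items()}
    if isinstance(value, str):
        match = REF.match(value)
        if match:
            node = context[match.group(1)]
            for key in match.group(2).split("."):
                node = node[key]
            return node
    return value


def run_case(steps: list, adapter: Adapter) -> list:
    """Run the steps in order and return each step's result."""
    context: dict = {}
    results = []
    for step in steps:
        result = adapter(step["op"], step["entity"], resolve(step["input"], context))
        context["prev"] = result  # makes "$prev.<field>" available to the next step
        results.append(result)
    return results


def make_memory_adapter() -> Adapter:
    """Toy in-memory backend standing in for a real implementation."""
    rows: dict = {}

    def adapter(op: str, entity: str, data: dict) -> dict:
        if op == "create":
            row = {"id": f"{entity.lower()}_{len(rows) + 1}", **data}
            rows[row["id"]] = row
            return row
        if op == "read":
            return rows[data["id"]]
        raise ValueError(f"unsupported operation: {op}")

    return adapter


steps = [
    {"op": "create", "entity": "Post", "input": {"title": "Test"}},
    {"op": "read", "entity": "Post", "input": {"id": "$prev.id"}},
]

# Two independent backends play the roles of the TS and C++ implementations.
ts_results = run_case(steps, make_memory_adapter())
cpp_results = run_case(steps, make_memory_adapter())
parity = ts_results == cpp_results
```

A real runner would call the actual TypeScript client and C++ daemon instead of two copies of the toy adapter, and would fail the build when `parity` is false.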
Development Workflow for Agents
Adding a New Entity
# 1. Create entity schema
cat > api/schema/entities/comment.yaml << EOF
entity: Comment
version: "1.0"
fields:
  id: { type: uuid, primary: true, generated: true }
  content: { type: text, required: true }
  post_id: { type: uuid, required: true, foreign_key: { entity: Post, field: id } }
  author_id: { type: uuid, required: true }
  created_at: { type: datetime, generated: true }
EOF
# 2. Create operations
cat > api/schema/operations/comment.ops.yaml << EOF
operations:
  create:
    input: [content, post_id, author_id]
    output: Comment
    acl_required: ["comment:create"]
  list:
    input: [post_id]
    output: Comment[]
    acl_required: ["comment:read"]
EOF
# 3. Generate types
python tools/codegen/gen_types.py
# 4. Implement adapters (both TS and C++)
# - ts/src/adapters/prisma/mapping.ts
# - cpp/src/adapters/prisma/prisma_adapter.cpp
# 5. Add conformance tests
cat > common/contracts/comment_tests.yaml << EOF
- name: "Comment CRUD"
  operations:
    - action: create
      entity: Comment
      input: { content: "Great post!", post_id: "post_1", author_id: "user_1" }
      expected: { status: success }
EOF
# 6. Run conformance
python tools/conformance/run_all.py
Modifying an Existing Entity
# 1. Update YAML schema
vim api/schema/entities/user.yaml
# Add: avatar_url: { type: string, optional: true }
# 2. Regenerate types
python tools/codegen/gen_types.py
# 3. Create migration (if using Prisma)
# Run in a subshell so later steps still start from the repository root
(cd backends/prisma && npx prisma migrate dev --name add_avatar_url)
# 4. Update adapters to handle new field
# Both ts/src/adapters/prisma/mapping.ts and C++ version
# 5. Add tests
# Update common/contracts/user_tests.yaml
# 6. Verify conformance
python tools/conformance/run_all.py
Adding a Backend Adapter
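# 1. Define capabilities
# capabilities.yaml is shared by all adapters, so edit it in place rather than
# overwriting it; add the new entry under the existing "adapters:" key
vim api/schema/capabilities.yaml
# Add:
# adapters:
#   mongodb:
#     transactions: true
#     joins: false
#     full_text_search: true
#     ttl: true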
# 2. Create TypeScript adapter
mkdir -p ts/src/adapters/mongodb
cat > ts/src/adapters/mongodb/index.ts << EOF
export class MongoDBAdapter implements DBALAdapter {
  async create(entity: string, data: any): Promise<any> {
    // Implementation
  }
}
EOF
# 3. Create C++ adapter
mkdir -p cpp/src/adapters/mongodb
# Implement MongoDBAdapter class
# 4. Register adapter
# Update ts/src/core/client.ts and cpp/src/client.cpp
# 5. Test conformance
python tools/conformance/run_all.py --adapter=mongodb
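Capability flags are useful only if callers consult them before using a feature. The following is a minimal sketch of such a check; the `Capabilities` class, the `require` helper, and the `UnsupportedCapability` error are hypothetical names chosen for illustration, and the flag values mirror the mongodb entry shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    transactions: bool
    joins: bool
    full_text_search: bool
    ttl: bool


# Values mirror the mongodb entry for api/schema/capabilities.yaml.
CAPABILITIES = {
    "mongodb": Capabilities(
        transactions=True, joins=False, full_text_search=True, ttl=True
    ),
}


class UnsupportedCapability(Exception):
    """Raised when an adapter is asked for a feature it does not declare."""


def require(adapter: str, feature: str) -> None:
    """Raise if the named adapter does not declare support for the feature."""
    caps = CAPABILITIES[adapter]
    if not getattr(caps, feature):
        raise UnsupportedCapability(f"{adapter} does not support {feature}")


require("mongodb", "transactions")  # passes silently

try:
    require("mongodb", "joins")
    joins_supported = True
except UnsupportedCapability:
    joins_supported = False
```

Failing early with a dedicated error keeps a query that needs joins from reaching a backend that cannot execute it.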
File Organization Rules
api/ (Language-Agnostic Contracts)
api/
├── schema/
│ ├── entities/ # One file per entity
│ │ ├── user.yaml
│ │ ├── post.yaml
│ │ └── comment.yaml
│ ├── operations/ # One file per entity
│ │ ├── user.ops.yaml
│ │ ├── post.ops.yaml
│ │ └── comment.ops.yaml
│ ├── errors.yaml # Single file for all errors
│ └── capabilities.yaml # Single file for all adapter capabilities
Rules:
- One entity per file
- Use lowercase with underscores for filenames
- Version every entity (semantic versioning)
- Document breaking changes in comments
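The naming and versioning rules above lend themselves to an automated check. This sketch assumes filenames look like `comment.yaml` or `comment.ops.yaml` and that a version is `MAJOR.MINOR` with an optional `PATCH` part (the entity example in this guide uses "1.0"); both patterns and the `check_schema_file` helper are illustrative assumptions, not part of the existing tooling.

```python
import re

# Lowercase letters, digits and underscores, with an optional ".ops" suffix.
FILENAME = re.compile(r"^[a-z][a-z0-9_]*(\.ops)?\.yaml$")
# MAJOR.MINOR with an optional PATCH component.
VERSION = re.compile(r"^\d+\.\d+(\.\d+)?$")


def check_schema_file(filename: str, version: str) -> list[str]:
    """Return a list of rule violations; an empty list means the file passes."""
    problems = []
    if not FILENAME.match(filename):
        problems.append(f"bad filename: {filename}")
    if not VERSION.match(version):
        problems.append(f"bad version: {version}")
    return problems


ok = check_schema_file("comment.yaml", "1.0")
bad = check_schema_file("Comment-New.yaml", "v1")
```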
ts/ (TypeScript Implementation)
ts/src/
├── core/ # Core abstractions
│ ├── client.ts # Main DBAL client
│ ├── types.ts # Generated from YAML
│ └── errors.ts # Error classes
├── adapters/ # One directory per backend
│ ├── prisma/
│ ├── sqlite/
│ └── mongodb/
├── query/ # Query builder (backend-agnostic)
└── runtime/ # Config, secrets, telemetry
Rules:
- Keep files under 300 lines
- One class per file
- Use barrel exports (index.ts)
- No circular dependencies
cpp/ (C++ Implementation)
cpp/
├── include/dbal/ # Public headers
├── src/ # Implementation
├── tests/ # Tests
└── CMakeLists.txt
Rules:
- Header guards: #ifndef DBAL_CLIENT_HPP
- Namespace: dbal::
- Use modern C++17 features
- RAII for resource management
common/ (Shared Test Vectors)
common/
├── fixtures/ # Sample data
│ ├── seed/
│ └── datasets/
├── golden/ # Expected results
└── contracts/ # Conformance test definitions
├── user_tests.yaml
├── post_tests.yaml
└── conformance_cases.yaml
Rules:
- YAML for test definitions
- JSON for fixtures
- One test suite per entity
- Include edge cases
Code Generation
Automated Type Generation
The DBAL uses Python scripts to generate TypeScript and C++ types from YAML schemas:
# tools/codegen/gen_types.py
from pathlib import Path

def generate_typescript_types(schema_dir: Path, output_file: Path):
    """Generate TypeScript interfaces from YAML schemas"""

def generate_cpp_types(schema_dir: Path, output_dir: Path):
    """Generate C++ structs from YAML schemas"""
When to regenerate:
- After modifying any YAML in api/schema/
- Before running tests
- As part of CI/CD pipeline
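To make the generation step concrete, here is a rough sketch of how an entity definition might be turned into a TypeScript interface. It is not the actual gen_types.py: the `TS_TYPES` mapping, the `to_typescript` helper, and the rule that fields without `required` or `generated` become optional are all assumptions, and the entity is given as an already-parsed dictionary so the example needs no YAML library.

```python
# Assumed mapping from schema field types to TypeScript types.
TS_TYPES = {"uuid": "string", "text": "string", "string": "string", "datetime": "Date"}


def to_typescript(entity: dict) -> str:
    """Render one parsed entity schema as a TypeScript interface."""
    lines = [f"export interface {entity['entity']} {{"]
    for name, spec in entity["fields"].items():
        # Assumption: a field is always present if it is required or generated.
        optional = "" if spec.get("required") or spec.get("generated") else "?"
        lines.append(f"  {name}{optional}: {TS_TYPES[spec['type']]};")
    lines.append("}")
    return "\n".join(lines)


# A subset of the Comment entity, plus one optional field for illustration.
comment = {
    "entity": "Comment",
    "fields": {
        "id": {"type": "uuid", "primary": True, "generated": True},
        "content": {"type": "text", "required": True},
        "edited_note": {"type": "string"},
        "created_at": {"type": "datetime", "generated": True},
    },
}

generated = to_typescript(comment)
print(generated)
```

Because the output is derived entirely from the schema, regenerating after every schema change keeps hand-written adapter code compiling against the current contract.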
Manual Code vs Generated Code
Generated (Never edit manually):
- ts/src/core/types.ts - Entity interfaces
- ts/src/core/errors.ts - Error classes
- cpp/include/dbal/types.hpp - Entity structs
- cpp/include/dbal/errors.hpp - Error types
Manual (Safe to edit):
- Adapter implementations
- Query builder
- Client facade
- Utility functions
Testing Strategy
1. Unit Tests (Per Implementation)
# TypeScript (subshell keeps the working directory at the repository root)
(cd ts && npm run test:unit)
# C++
(cd cpp && ./build/tests/unit_tests)
Test individual functions and classes in isolation.
2. Integration Tests (Per Implementation)
# TypeScript (subshell keeps the working directory at the repository root)
(cd ts && npm run test:integration)
# C++
(cd cpp && ./build/tests/integration_tests)
Test adapters against real databases (with Docker).
3. Conformance Tests (Cross-Implementation)
# Both implementations
python tools/conformance/run_all.py
Critical: These must pass for both TS and C++. If they diverge, it's a bug.
4. Security Tests (C++ Only)
cd cpp && ./build/tests/security_tests
Test sandboxing, ACL enforcement, SQL injection prevention.
Security Considerations for Agents
What NOT to Do
- ❌ Never expose database credentials to user code
- ❌ Never allow user code to construct raw SQL queries
- ❌ Never skip ACL checks
- ❌ Never trust user input without validation
- ❌ Never log sensitive data (passwords, tokens, PII)
What TO Do
- ✅ Always validate input against schema
- ✅ Always enforce row-level security
- ✅ Always use parameterized queries
- ✅ Always log security-relevant operations
- ✅ Always test with malicious input
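The rule about parameterized queries and the rule about testing with malicious input can be illustrated together. This sketch uses Python's built-in sqlite3 module with an in-memory database; the `users` table and the `find_user` helper are made up for the example and are not part of the DBAL codebase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("testuser",))


def find_user(connection: sqlite3.Connection, username: str) -> list:
    """Look up a user by name with a bound parameter, never string formatting."""
    # The "?" placeholder passes the value separately from the SQL text,
    # so quotes inside the value cannot change the structure of the query.
    cursor = connection.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


legit = find_user(conn, "testuser")
# A classic injection attempt is treated as a literal (non-matching) username.
attack = find_user(conn, "testuser' OR '1'='1")
```

With string concatenation the second call would match every row; with a bound parameter it matches none.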
Sandboxing Requirements (C++ Daemon)
The C++ daemon must:
- Run with minimal privileges (drop root, use dedicated user)
- Restrict file system access (no write outside /var/lib/dbal/)
- Limit network access (only to DB, no outbound internet)
- Enforce resource limits (CPU, memory, connections)
- Validate all RPC calls (schema conformance, ACL checks)
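Schema conformance for an incoming call can be checked before any query is built. The sketch below is one possible shape for that check, not the daemon's implementation: `validate_input`, the `PY_TYPES` mapping, and the rule that generated fields may not be supplied by the caller are assumptions. The schema dictionary copies the Comment entity fields from earlier in this guide.

```python
# Assumed mapping from schema field types to Python types.
PY_TYPES = {"uuid": str, "text": str, "string": str, "datetime": str}

# Fields copied from the Comment entity schema earlier in this guide.
COMMENT_SCHEMA = {
    "entity": "Comment",
    "fields": {
        "id": {"type": "uuid", "primary": True, "generated": True},
        "content": {"type": "text", "required": True},
        "post_id": {"type": "uuid", "required": True},
        "author_id": {"type": "uuid", "required": True},
        "created_at": {"type": "datetime", "generated": True},
    },
}


def validate_input(schema: dict, data: dict) -> list[str]:
    """Return validation errors for a create request; empty means valid."""
    fields = schema["fields"]
    errors = []
    for name, value in data.items():
        spec = fields.get(name)
        if spec is None:
            errors.append(f"unknown field: {name}")
        elif spec.get("generated"):
            errors.append(f"field is generated and cannot be set: {name}")
        elif not isinstance(value, PY_TYPES[spec["type"]]):
            errors.append(f"wrong type for field: {name}")
    for name, spec in fields.items():
        if spec.get("required") and not spec.get("generated") and name not in data:
            errors.append(f"missing required field: {name}")
    return errors


good = validate_input(
    COMMENT_SCHEMA,
    {"content": "Great post!", "post_id": "post_1", "author_id": "user_1"},
)
bad = validate_input(COMMENT_SCHEMA, {"content": 42, "is_admin": True})
```

Rejecting unknown fields outright, rather than ignoring them, follows the fail-fast principle and keeps unexpected data out of the adapters.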
ACL Enforcement
Every operation must check:
// C++ daemon
bool DBALDaemon::authorize(const Request& req) {
    User user = req.user();
    std::string entity = req.entity();
    std::string operation = req.operation();
    // 1. Check entity-level permission
    if (!acl_.hasPermission(user, entity, operation)) {
        return false;
    }
    // 2. For mutations, also check access to the specific row
    if (operation == "update" || operation == "delete") {
        return acl_.canAccessRow(user, entity, req.id());
    }
    return true;
}
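The two-stage check can also be modeled in a few lines to experiment with policies outside the daemon. This is an illustrative model only: the `Acl` class, role-based grants, and the owner-only row rule are assumptions made for the example, and the real `acl_` object may decide access differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Acl:
    # Set of (role, entity, operation) tuples that are allowed.
    grants: set = field(default_factory=set)
    # Maps (entity, row_id) to the id of the owning user.
    owners: dict = field(default_factory=dict)

    def has_permission(self, user: dict, entity: str, operation: str) -> bool:
        return (user["role"], entity, operation) in self.grants

    def can_access_row(self, user: dict, entity: str, row_id: str) -> bool:
        # Assumed row rule: only the owner may change a row.
        return self.owners.get((entity, row_id)) == user["id"]


def authorize(acl: Acl, user: dict, entity: str, operation: str,
              row_id: Optional[str] = None) -> bool:
    """Entity-level check first, then a row-level check for mutations."""
    if not acl.has_permission(user, entity, operation):
        return False
    if operation in ("update", "delete"):
        return acl.can_access_row(user, entity, row_id)
    return True


acl = Acl(
    grants={("editor", "Post", "read"), ("editor", "Post", "update")},
    owners={("Post", "post_1"): "user_1"},
)
alice = {"id": "user_1", "role": "editor"}
bob = {"id": "user_2", "role": "editor"}

alice_can_update = authorize(acl, alice, "Post", "update", "post_1")
bob_can_update = authorize(acl, bob, "Post", "update", "post_1")
```

Both users hold the same entity-level grant, so only the row-level stage separates them, which is why skipping it for mutations would be a security bug.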
CI/CD Integration
GitHub Actions Workflow
name: DBAL CI/CD
on: [push, pull_request]
jobs:
  typescript:
    runs-on: ubuntu-latest
    # Each run step starts in the repository root, so set the directory once per job
    defaults:
      run:
        working-directory: dbal/development
    steps:
      - uses: actions/checkout@v3
      - run: npm ci
      - run: npm run test:unit
      - run: npm run test:integration
  cpp:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: dbal/production
    steps:
      - uses: actions/checkout@v3
      - run: cmake -B build && cmake --build build
      - run: ./build/tests/unit_tests
      - run: ./build/tests/integration_tests
  conformance:
    needs: [typescript, cpp]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: python dbal/shared/tools/conformance/run_all.py
Pre-commit Hooks
#!/bin/bash
# .git/hooks/pre-commit
cd dbal/shared/api/schema || exit 1
if git diff --cached --name-only | grep -q "\.yaml$"; then
  echo "YAML schema changed, regenerating types..."
  python ../../tools/codegen/gen_types.py
  # Paths below are relative to dbal/shared/api/schema; adjust them if the
  # generated files live elsewhere in your checkout
  git add ../ts/src/core/types.ts
  git add ../cpp/include/dbal/types.hpp
fi
Deployment Architecture
Development Environment
┌─────────────────┐
│ Spark App (TS) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ DBAL Client (TS)│
└────────┬────────┘
│ (direct)
▼
┌─────────────────┐
│ Prisma Client │
└────────┬────────┘
│
▼
┌─────────────────┐
│ SQLite / DB │
└─────────────────┘
Production Environment
┌─────────────────┐
│ Spark App (TS) │
└────────┬────────┘
│ gRPC
▼
┌─────────────────┐
│ DBAL Client (TS)│
└────────┬────────┘
│ gRPC/WS
▼
┌─────────────────┐ ┌─────────────────┐
│ DBAL Daemon(C++)│────▶│ Network Policy │
│ [Sandboxed] │ │ (Firewall) │
└────────┬────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ Prisma Client │
└────────┬────────┘
│
▼
┌─────────────────┐
│ PostgreSQL │
└─────────────────┘
Docker Compose Example
version: '3.8'
services:
  dbal-daemon:
    build: ./dbal/production
    container_name: dbal-daemon
    ports:
      - "50051:50051"
    environment:
      - DBAL_MODE=production
      - DBAL_SANDBOX=strict
      # Example credentials only; inject real secrets at deploy time
      - DATABASE_URL=postgresql://user:pass@postgres:5432/db
    volumes:
      - ./config:/config:ro
    security_opt:
      - no-new-privileges:true
    read_only: true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    networks:
      - default    # reachable by clients on the published port
      - internal   # needed to reach postgres
  postgres:
    image: postgres:15
    container_name: dbal-postgres
    environment:
      # Must match the user, password, and database in DATABASE_URL
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=db
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - internal
networks:
  internal:
    internal: true
volumes:
  postgres-data:
Troubleshooting for Agents
Problem: Types out of sync with schema
Solution:
python tools/codegen/gen_types.py
Problem: Conformance tests failing
Diagnosis:
# Run verbose
python tools/conformance/run_all.py --verbose
# Compare outputs
diff common/golden/ts_results.json common/golden/cpp_results.json
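A plain text diff of two JSON files can be noisy when key order differs. As an alternative, here is a sketch of a structural comparison. It assumes each results file is a JSON object keyed by test name, which this guide does not confirm; the `diff_results` helper and the inline sample data are illustrative.

```python
import json


def diff_results(ts: dict, cpp: dict) -> dict:
    """Map each diverging test name to its (ts_value, cpp_value) pair."""
    names = sorted(set(ts) | set(cpp))
    return {n: (ts.get(n), cpp.get(n)) for n in names if ts.get(n) != cpp.get(n)}


# Inline samples stand in for the contents of the two result files.
ts_results = json.loads('{"create": {"status": "success"}, "read": {"title": "Test"}}')
cpp_results = json.loads('{"read": {"title": "test"}, "create": {"status": "success"}}')

divergent = diff_results(ts_results, cpp_results)
```

In practice the two dictionaries would be loaded with `json.load` from the files under common/golden/, and a test missing from one side shows up with `None` in that position.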
Problem: C++ daemon won't start in production
Check:
- Permissions: ls -la /var/lib/dbal/
- Ports: netstat -tlnp | grep 50051
- Logs: journalctl -u dbal-daemon
- Database connectivity: nc -zv postgres 5432
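When nc is not installed in a minimal container, the connectivity check can be scripted with the standard library instead. This is a small sketch; the `can_connect` helper is a hypothetical name, and the host and port in the usage comment are the ones from the nc command above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Usage, equivalent to `nc -zv postgres 5432`:
#   can_connect("postgres", 5432)
```

A successful TCP connection only shows that the port is reachable; authentication and database selection still need to be verified with a real client.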
Problem: Security audit failing
Review:
- No hardcoded secrets
- All queries use parameters
- ACL checks on every operation
- Audit logs enabled
Best Practices Summary
- ✅ Schema first - Define in YAML, generate code
- ✅ Test both - TS and C++ must pass conformance tests
- ✅ Security by default - ACL on every operation
- ✅ Documentation - Update README when adding features
- ✅ Versioning - Semantic versioning for API changes
- ✅ Backward compatibility - Support N-1 versions
- ✅ Fail fast - Validate early, error clearly
- ✅ Audit everything - Log security-relevant operations
- ✅ Principle of least privilege - Minimal permissions
- ✅ Defense in depth - Multiple layers of security
Resources
- API Schema Reference: api/schema/README.md
- TypeScript Guide: ts/README.md
- C++ Guide: cpp/README.md
- Security Guide: docs/SECURITY.md
- Contributing: docs/CONTRIBUTING.md