# Agent Development Guide for DBAL

This document provides guidance for AI agents and automated tools working with the DBAL codebase.

## Architecture Philosophy

The DBAL is designed as a **language-agnostic contract system** that separates:

1. **API Definition** (in YAML) - The source of truth
2. **Development Implementation** (TypeScript) - Fast iteration, testing, debugging
3. **Production Implementation** (C++) - Security, performance, isolation
4. **Shared Test Vectors** - Guarantees behavioral consistency

## Key Principles for Agents

### 1. API Contract is Source of Truth

**Always start with the API definition** when adding features:

```
1. Define entity in api/schema/entities/
2. Define operations in api/schema/operations/
3. Generate TypeScript types: python tools/codegen/gen_types.py
4. Generate C++ types: python tools/codegen/gen_types.py --lang=cpp
5. Implement in adapters
6. Add conformance tests
```

**Never** add fields, operations, or entities directly in TypeScript or C++ without updating the YAML schemas first.

### 2. TypeScript is for Development Speed

The TypeScript implementation prioritizes:
- **Fast iteration** - Quick to modify and test
- **Rich ecosystem** - npm packages, debugging tools
- **Easy prototyping** - Try ideas quickly

Use TypeScript for:
- New feature development
- Schema iteration
- Integration testing
- Developer debugging

### 3. C++ is for Production Security

The C++ implementation prioritizes:
- **Security** - Process isolation, sandboxing, no user code execution
- **Performance** - Optimized queries, connection pooling
- **Stability** - Static typing, memory safety
- **Auditability** - All operations logged

The C++ daemon provides:
- Credential protection (user code never sees DB URLs/passwords)
- Query validation and sanitization
- Row-level security enforcement
- Resource limits and quotas

### 4. Conformance Tests Guarantee Parity

Every operation **must** have conformance tests that run against both implementations:

```yaml
# common/contracts/conformance_cases.yaml
- name: "User CRUD operations"
  setup:
    - create_user:
        username: "testuser"
        email: "test@example.com"
  tests:
    - create:
        entity: Post
        input: { title: "Test", author_id: "$setup.user.id" }
        expect: { status: "success" }
    - read:
        entity: Post
        input: { id: "$prev.id" }
        expect: { title: "Test" }
```

CI/CD runs these tests on **both** TypeScript and C++ implementations. If they diverge, the build fails.

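As an illustration of how shared cases could be replayed against two implementations, here is a minimal Python sketch. It is not the real `tools/conformance/run_all.py`: the `InMemoryImpl` class, the `run_case` and `resolve` helpers, and the step layout (`action`, `entity`, `input`, `expect`) are all assumptions made for the example, and both "implementations" are the same in-memory stand-in.

```python
from typing import Any, Dict, List


class InMemoryImpl:
    """Hypothetical stand-in for one DBAL implementation (TS or C++)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.next_id = 1

    def create(self, entity: str, data: Dict[str, Any]) -> Dict[str, Any]:
        row = {"id": f"{entity.lower()}_{self.next_id}", **data}
        self.next_id += 1
        self.rows[row["id"]] = row
        return {"status": "success", **row}

    def read(self, entity: str, data: Dict[str, Any]) -> Dict[str, Any]:
        row = self.rows.get(data["id"])
        return {"status": "success", **row} if row else {"status": "not_found"}


def resolve(value: Any, prev: Dict[str, Any]) -> Any:
    """Replace "$prev.<field>" placeholders with values from the previous result."""
    if isinstance(value, str) and value.startswith("$prev."):
        return prev[value[len("$prev."):]]
    return value


def run_case(impl: InMemoryImpl, steps: List[Dict[str, Any]]) -> List[bool]:
    """Run each step and record whether every expected key matched."""
    prev: Dict[str, Any] = {}
    outcomes = []
    for step in steps:
        args = {k: resolve(v, prev) for k, v in step["input"].items()}
        result = getattr(impl, step["action"])(step["entity"], args)
        outcomes.append(all(result.get(k) == v for k, v in step["expect"].items()))
        prev = result
    return outcomes


steps = [
    {"action": "create", "entity": "Post", "input": {"title": "Test"},
     "expect": {"status": "success"}},
    {"action": "read", "entity": "Post", "input": {"id": "$prev.id"},
     "expect": {"title": "Test"}},
]

# Parity holds when both implementations produce identical outcomes.
ts_outcomes = run_case(InMemoryImpl(), steps)
cpp_outcomes = run_case(InMemoryImpl(), steps)
parity = ts_outcomes == cpp_outcomes
```

In a real runner the two `InMemoryImpl` instances would be replaced by clients for the TypeScript and C++ implementations, and a `parity` of `False` would fail the build.
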
## Development Workflow for Agents

### Adding a New Entity

```bash
# 1. Create entity schema
cat > api/schema/entities/comment.yaml << EOF
entity: Comment
version: "1.0"
fields:
  id: { type: uuid, primary: true, generated: true }
  content: { type: text, required: true }
  post_id: { type: uuid, required: true, foreign_key: { entity: Post, field: id } }
  author_id: { type: uuid, required: true }
  created_at: { type: datetime, generated: true }
EOF

# 2. Create operations
cat > api/schema/operations/comment.ops.yaml << EOF
operations:
  create:
    input: [content, post_id, author_id]
    output: Comment
    acl_required: ["comment:create"]
  list:
    input: [post_id]
    output: Comment[]
    acl_required: ["comment:read"]
EOF

# 3. Generate types
python tools/codegen/gen_types.py

# 4. Implement adapters (both TS and C++)
# - ts/src/adapters/prisma/mapping.ts
# - cpp/src/adapters/prisma/prisma_adapter.cpp

# 5. Add conformance tests
cat > common/contracts/comment_tests.yaml << EOF
- name: "Comment CRUD"
  operations:
    - action: create
      entity: Comment
      input: { content: "Great post!", post_id: "post_1", author_id: "user_1" }
      expected: { status: success }
EOF

# 6. Run conformance
python tools/conformance/run_all.py
```

### Modifying an Existing Entity

```bash
# 1. Update YAML schema
vim api/schema/entities/user.yaml
# Add: avatar_url: { type: string, optional: true }

# 2. Regenerate types
python tools/codegen/gen_types.py

# 3. Regenerate Prisma schema + create migration (if using Prisma)
node ../../shared/tools/codegen/gen_prisma_schema.js
npx prisma migrate dev --schema ../../prisma/schema.prisma --name add_avatar_url

# 4. Update adapters to handle new field
# Both ts/src/adapters/prisma/mapping.ts and C++ version

# 5. Add tests
# Update common/contracts/user_tests.yaml

# 6. Verify conformance
python tools/conformance/run_all.py
```

### Adding a Backend Adapter

```bash
# 1. Define capabilities
# Note: `cat >` overwrites the file. In an existing checkout, merge the
# mongodb entry into api/schema/capabilities.yaml instead.
cat > api/schema/capabilities.yaml << EOF
adapters:
  mongodb:
    transactions: true
    joins: false
    full_text_search: true
    ttl: true
EOF

# 2. Create TypeScript adapter
mkdir -p ts/src/adapters/mongodb
cat > ts/src/adapters/mongodb/index.ts << EOF
export class MongoDBAdapter implements DBALAdapter {
  async create(entity: string, data: any): Promise<any> {
    // Implementation
  }
}
EOF

# 3. Create C++ adapter
mkdir -p cpp/src/adapters/mongodb
# Implement MongoDBAdapter class

# 4. Register adapter
# Update ts/src/core/client.ts and cpp/src/client.cpp

# 5. Test conformance
python tools/conformance/run_all.py --adapter=mongodb
```

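The capability flags declared in step 1 are only useful if callers consult them before relying on a feature. The following Python sketch shows one way such a check might look. The `CAPABILITIES` dictionary mirrors the YAML above, but `require` and `UnsupportedCapability` are hypothetical names, not part of the DBAL codebase.

```python
from typing import Dict

# Hypothetical in-memory copy of api/schema/capabilities.yaml.
CAPABILITIES: Dict[str, Dict[str, bool]] = {
    "mongodb": {"transactions": True, "joins": False,
                "full_text_search": True, "ttl": True},
}


class UnsupportedCapability(Exception):
    """Raised when an operation needs a feature the adapter lacks."""


def require(adapter: str, capability: str) -> None:
    """Fail early if the adapter does not declare the capability."""
    if not CAPABILITIES.get(adapter, {}).get(capability, False):
        raise UnsupportedCapability(f"{adapter} does not support {capability}")


require("mongodb", "transactions")  # declared as true, so this passes silently

try:
    require("mongodb", "joins")
    joins_supported = True
except UnsupportedCapability:
    joins_supported = False  # a caller could fall back to separate lookups here
```

Unknown adapters and undeclared capabilities are treated as unsupported, which keeps the check conservative by default.
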
## File Organization Rules

### api/ (Language-Agnostic Contracts)

```
api/
├── schema/
│   ├── entities/           # One file per entity
│   │   ├── user.yaml
│   │   ├── post.yaml
│   │   └── comment.yaml
│   ├── operations/         # One file per entity
│   │   ├── user.ops.yaml
│   │   ├── post.ops.yaml
│   │   └── comment.ops.yaml
│   ├── errors.yaml         # Single file for all errors
│   └── capabilities.yaml   # Single file for all adapter capabilities
```

**Rules:**
- One entity per file
- Use lowercase with underscores for filenames
- Version every entity (semantic versioning)
- Document breaking changes in comments

### ts/ (TypeScript Implementation)

```
ts/src/
├── core/              # Core abstractions
│   ├── client.ts      # Main DBAL client
│   ├── types.ts       # Generated from YAML
│   └── errors.ts      # Error classes
├── adapters/          # One directory per backend
│   ├── prisma/
│   ├── sqlite/
│   └── mongodb/
├── query/             # Query builder (backend-agnostic)
└── runtime/           # Config, secrets, telemetry
```

**Rules:**
- Keep files under 300 lines
- One class per file
- Use barrel exports (index.ts)
- No circular dependencies

### cpp/ (C++ Implementation)

```
cpp/
├── include/dbal/      # Public headers
├── src/               # Implementation
├── tests/             # Tests
└── CMakeLists.txt
```

**Rules:**
- Header guards: `#ifndef DBAL_CLIENT_HPP`
- Namespace: `dbal::`
- Use modern C++17 features
- RAII for resource management

### common/ (Shared Test Vectors)

```
common/
├── fixtures/          # Sample data
│   ├── seed/
│   └── datasets/
├── golden/            # Expected results
└── contracts/         # Conformance test definitions
    ├── user_tests.yaml
    ├── post_tests.yaml
    └── conformance_cases.yaml
```

**Rules:**
- YAML for test definitions
- JSON for fixtures
- One test suite per entity
- Include edge cases

## Code Generation

### Automated Type Generation

The DBAL uses Python scripts to generate TypeScript and C++ types from YAML schemas:

```python
# tools/codegen/gen_types.py
def generate_typescript_types(schema_dir: Path, output_file: Path):
    """Generate TypeScript interfaces from YAML schemas"""

def generate_cpp_types(schema_dir: Path, output_dir: Path):
    """Generate C++ structs from YAML schemas"""
```

**When to regenerate:**
- After modifying any YAML in `api/schema/`
- Before running tests
- As part of CI/CD pipeline

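To make the generation step concrete, the sketch below renders one entity definition as a TypeScript interface. It is an illustration only: the `TS_TYPES` mapping and the `to_ts_interface` helper are assumptions, the entity is passed as an already-parsed dictionary rather than read from YAML, and the real `gen_types.py` may work quite differently.

```python
from typing import Dict

# Assumed mapping from schema types to TypeScript types (illustrative only).
TS_TYPES = {"uuid": "string", "text": "string", "string": "string",
            "datetime": "Date"}


def to_ts_interface(entity: str, fields: Dict[str, dict]) -> str:
    """Render one entity definition as a TypeScript interface."""
    lines = [f"export interface {entity} {{"]
    for name, spec in fields.items():
        optional = "?" if spec.get("optional") else ""
        lines.append(f"  {name}{optional}: {TS_TYPES[spec['type']]};")
    lines.append("}")
    return "\n".join(lines)


# Field specs in the same shape as the entity YAML shown earlier.
user_fields = {
    "id": {"type": "uuid", "primary": True, "generated": True},
    "avatar_url": {"type": "string", "optional": True},
}
print(to_ts_interface("User", user_fields))
```

Because the output is a pure function of the schema, running the generator twice on an unchanged schema should yield identical files, which is what makes "never edit generated files" enforceable in CI.
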
### Manual Code vs Generated Code

**Generated (Never edit manually):**
- `ts/src/core/types.ts` - Entity interfaces
- `ts/src/core/errors.ts` - Error classes
- `cpp/include/dbal/types.hpp` - Entity structs
- `cpp/include/dbal/errors.hpp` - Error types

**Manual (Safe to edit):**
- Adapter implementations
- Query builder
- Client facade
- Utility functions

## Testing Strategy

### 1. Unit Tests (Per Implementation)

```bash
# TypeScript
cd ts && npm run test:unit

# C++
cd cpp && ./build/tests/unit_tests
```

Test individual functions and classes in isolation.

### 2. Integration Tests (Per Implementation)

```bash
# TypeScript
cd ts && npm run test:integration

# C++
cd cpp && ./build/tests/integration_tests
```

Test adapters against real databases (with Docker).

### 3. Conformance Tests (Cross-Implementation)

```bash
# Both implementations
python tools/conformance/run_all.py
```

**Critical:** These must pass for both TS and C++. If they diverge, it's a bug.

### 4. Security Tests (C++ Only)

```bash
cd cpp && ./build/tests/security_tests
```

Test sandboxing, ACL enforcement, SQL injection prevention.

## Security Considerations for Agents

### What NOT to Do

❌ **Never** expose database credentials to user code
❌ **Never** allow user code to construct raw SQL queries
❌ **Never** skip ACL checks
❌ **Never** trust user input without validation
❌ **Never** log sensitive data (passwords, tokens, PII)

### What TO Do

✅ **Always** validate input against schema
✅ **Always** enforce row-level security
✅ **Always** use parameterized queries
✅ **Always** log security-relevant operations
✅ **Always** test with malicious input

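The difference between parameterized and interpolated queries is easiest to see with a malicious input. The sketch below uses Python's standard `sqlite3` module and a throwaway `users` table (both chosen for the example, not taken from the DBAL adapters) to show why binding parameters matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("testuser",))

malicious = "testuser' OR '1'='1"

# Parameter binding treats the input as data, so the injection attempt matches nothing.
safe_rows = conn.execute(
    "SELECT id FROM users WHERE username = ?", (malicious,)
).fetchall()

# String interpolation (shown only as the anti-pattern) lets the input rewrite the query.
unsafe_rows = conn.execute(
    f"SELECT id FROM users WHERE username = '{malicious}'"
).fetchall()
```

Here `safe_rows` is empty, while `unsafe_rows` contains every row in the table because the injected `OR '1'='1'` clause is always true.
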
### Sandboxing Requirements (C++ Daemon)

The C++ daemon must:

1. **Run with minimal privileges** (drop root, use dedicated user)
2. **Restrict file system access** (no write outside /var/lib/dbal/)
3. **Limit network access** (only to DB, no outbound internet)
4. **Enforce resource limits** (CPU, memory, connections)
5. **Validate all RPC calls** (schema conformance, ACL checks)

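For the schema-conformance half of requirement 5, a validator can reject a call before it reaches any adapter. The Python sketch below is a simplified illustration: the field specs are copied from the Comment entity example, while `validate_input` is a hypothetical helper that checks only unknown and missing fields, not types.

```python
from typing import Any, Dict, List

# Field specs copied from the Comment entity example; the validator itself is hypothetical.
COMMENT_FIELDS = {
    "content": {"type": "text", "required": True},
    "post_id": {"type": "uuid", "required": True},
    "author_id": {"type": "uuid", "required": True},
}


def validate_input(fields: Dict[str, dict], data: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the input conforms."""
    errors = [f"unknown field: {name}" for name in data if name not in fields]
    errors += [
        f"missing required field: {name}"
        for name, spec in fields.items()
        if spec.get("required") and name not in data
    ]
    return errors


errors = validate_input(COMMENT_FIELDS, {"content": "Great post!", "is_admin": True})
```

Rejecting unknown fields such as `is_admin` outright, rather than silently dropping them, follows the "fail fast" practice and makes probing attempts visible in logs.
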
### ACL Enforcement

Every operation must check:

```cpp
// C++ daemon
bool DBALDaemon::authorize(const Request& req) {
    const User user = req.user();
    const std::string entity = req.entity();
    const std::string operation = req.operation();

    // 1. Check entity-level permission
    if (!acl_.hasPermission(user, entity, operation)) {
        return false;
    }

    // 2. Apply row-level filter
    if (operation == "update" || operation == "delete") {
        return acl_.canAccessRow(user, entity, req.id());
    }

    return true;
}
```

## CI/CD Integration

### GitHub Actions Workflow

```yaml
name: DBAL CI/CD

on: [push, pull_request]

jobs:
  typescript:
    runs-on: ubuntu-latest
    # Each `run` step starts in the repository root, so set the directory once per job.
    defaults:
      run:
        working-directory: dbal/development
    steps:
      - uses: actions/checkout@v3
      - run: npm ci
      - run: npm run test:unit
      - run: npm run test:integration

  cpp:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: dbal/production
    steps:
      - uses: actions/checkout@v3
      - run: cmake -B build && cmake --build build
      - run: ./build/tests/unit_tests
      - run: ./build/tests/integration_tests

  conformance:
    needs: [typescript, cpp]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: python dbal/shared/tools/conformance/run_all.py
```

### Pre-commit Hooks

```bash
#!/bin/bash
# .git/hooks/pre-commit (the shebang must be the first line of the file)
cd dbal/shared/api/schema || exit 1
if git diff --cached --name-only | grep -q "\.yaml$"; then
  echo "YAML schema changed, regenerating types..."
  python ../../tools/codegen/gen_types.py
  git add ../ts/src/core/types.ts
  git add ../cpp/include/dbal/types.hpp
fi
```

## Deployment Architecture

### Development Environment

```
┌─────────────────┐
│  Spark App (TS) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ DBAL Client (TS)│
└────────┬────────┘
         │ (direct)
         ▼
┌─────────────────┐
│  Prisma Client  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   SQLite / DB   │
└─────────────────┘
```

### Production Environment

```
┌─────────────────┐
│  Spark App (TS) │
└────────┬────────┘
         │ gRPC
         ▼
┌─────────────────┐
│ DBAL Client (TS)│
└────────┬────────┘
         │ gRPC/WS
         ▼
┌─────────────────┐     ┌─────────────────┐
│ DBAL Daemon(C++)│────▶│ Network Policy  │
│   [Sandboxed]   │     │   (Firewall)    │
└────────┬────────┘     └─────────────────┘
         │
         ▼
┌─────────────────┐
│  Prisma Client  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   PostgreSQL    │
└─────────────────┘
```

### Docker Compose Example

```yaml
version: '3.8'

services:
  dbal-daemon:
    build: ./dbal/production
    container_name: dbal-daemon
    ports:
      - "50051:50051"
    environment:
      - DBAL_MODE=production
      - DBAL_SANDBOX=strict
      - DATABASE_URL=postgresql://user:pass@postgres:5432/db
    volumes:
      - ./config:/config:ro
    security_opt:
      - no-new-privileges:true
    read_only: true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    networks:
      - default    # reachable by clients on the published port
      - internal   # reaches postgres, which is only on the internal network

  postgres:
    image: postgres:15
    container_name: dbal-postgres
    environment:
      - POSTGRES_PASSWORD=secure_password
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - internal

networks:
  internal:
    internal: true

volumes:
  postgres-data:
```

## Troubleshooting for Agents

### Problem: Types out of sync with schema

**Solution:**
```bash
python tools/codegen/gen_types.py
```

### Problem: Conformance tests failing

**Diagnosis:**
```bash
# Run verbose
python tools/conformance/run_all.py --verbose

# Compare outputs
diff common/golden/ts_results.json common/golden/cpp_results.json
```

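A plain `diff` on JSON can be noisy when key order or formatting differs between the two files. As an alternative, the sketch below compares results case by case. It assumes, purely for illustration, that each results file is a JSON object keyed by test case name; the actual layout of `ts_results.json` and `cpp_results.json` may differ, and `diverging_cases` is a hypothetical helper.

```python
import json
from typing import Dict, List


def diverging_cases(ts_results: Dict[str, dict], cpp_results: Dict[str, dict]) -> List[str]:
    """Return the names of cases whose results differ or exist on one side only."""
    names = sorted(set(ts_results) | set(cpp_results))
    return [n for n in names if ts_results.get(n) != cpp_results.get(n)]


# Inline JSON stands in for the two result files; their real layout is an assumption.
ts = json.loads('{"Comment CRUD": {"status": "success"}, "User CRUD": {"status": "success"}}')
cpp = json.loads('{"Comment CRUD": {"status": "error"}, "User CRUD": {"status": "success"}}')

print(diverging_cases(ts, cpp))
```

With the sample data, only `Comment CRUD` is reported, which narrows the investigation to one case in one implementation.
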
### Problem: C++ daemon won't start in production

**Check:**
1. Permissions: `ls -la /var/lib/dbal/`
2. Ports: `netstat -tlnp | grep 50051`
3. Logs: `journalctl -u dbal-daemon`
4. Database connectivity: `nc -zv postgres 5432`

### Problem: Security audit failing

**Review:**
- No hardcoded secrets
- All queries use parameters
- ACL checks on every operation
- Audit logs enabled

## Best Practices Summary

1. ✅ **Schema first** - Define in YAML, generate code
2. ✅ **Test both** - TS and C++ must pass conformance tests
3. ✅ **Security by default** - ACL on every operation
4. ✅ **Documentation** - Update README when adding features
5. ✅ **Versioning** - Semantic versioning for API changes
6. ✅ **Backward compatibility** - Support N-1 versions
7. ✅ **Fail fast** - Validate early, error clearly
8. ✅ **Audit everything** - Log security-relevant operations
9. ✅ **Principle of least privilege** - Minimal permissions
10. ✅ **Defense in depth** - Multiple layers of security

## Resources

- **API Schema Reference**: [api/schema/README.md](api/schema/README.md)
- **TypeScript Guide**: [ts/README.md](ts/README.md)
- **C++ Guide**: [cpp/README.md](cpp/README.md)
- **Security Guide**: [docs/SECURITY.md](../docs/SECURITY.md)
- **Contributing**: [docs/CONTRIBUTING.md](../docs/CONTRIBUTING.md)