Phase 02: DocType Engine & Database

Objective: Implement the metadata engine that dynamically creates database tables from Pydantic models and provides generic CRUD operations.

[!IMPORTANT] Database Agnostic: All database implementations MUST work with both SQLite and PostgreSQL. Avoid database-specific features (e.g., PostgreSQL's ARRAY). Use portable SQL types that work across databases.
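
As a rough illustration of the portability rule, a list-valued field can be serialized to JSON text instead of being stored in a PostgreSQL ARRAY column. The sketch below uses the standard-library `sqlite3` module for brevity. The `article` table and the `save_tags`/`load_tags` helpers are hypothetical, and a SQLAlchemy-based implementation would more likely rely on its generic JSON column type.

```python
import json
import sqlite3


def save_tags(conn: sqlite3.Connection, doc_id: str, tags: list) -> None:
    """Store a list as JSON text, which any SQL database can hold."""
    conn.execute(
        "INSERT INTO article (id, tags) VALUES (?, ?)",
        (doc_id, json.dumps(tags)),
    )


def load_tags(conn: sqlite3.Connection, doc_id: str) -> list:
    """Read the JSON text back into a Python list (empty if the row is missing)."""
    row = conn.execute(
        "SELECT tags FROM article WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


# Hypothetical table: only portable column types (TEXT) are used.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id TEXT PRIMARY KEY, tags TEXT NOT NULL)")
save_tags(conn, "a1", ["db", "orm"])
```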

1. Schema Mapper (Pydantic → SQLAlchemy)

1.1 Field Registry & Type Mapping

1.2 Schema Mapper Implementation

TDD Focus: Write tests/adapters/db/test_schema_mapper.py first!

[!IMPORTANT] Primary Key Design Decision: Use id (UUID) as primary key, name as unique index.

Rationale: Allows document renames without cascading foreign key updates

Frappe Issue: Using name as PK causes expensive updates on rename

Standard Practice: Auto-generated surrogate keys are the conventional choice for primary keys in relational schemas

Flexibility: Framework users can rename documents cheaply
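
The rename argument can be seen in a small experiment. This is only a sketch using the standard-library `sqlite3` module; the `customer` and `invoice` tables are hypothetical and the UUID is stored as TEXT to stay portable. Because the foreign key points at `id`, renaming a document updates a single row and leaves referencing rows untouched.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customer (
        id   TEXT PRIMARY KEY,        -- UUID, never changes
        name TEXT NOT NULL UNIQUE     -- human-readable, may be renamed
    );
    CREATE TABLE invoice (
        id          TEXT PRIMARY KEY,
        customer_id TEXT NOT NULL REFERENCES customer(id)
    );
    """
)

customer_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO customer (id, name) VALUES (?, ?)", (customer_id, "CUST-0001")
)
conn.execute(
    "INSERT INTO invoice (id, customer_id) VALUES (?, ?)",
    (str(uuid.uuid4()), customer_id),
)

# The rename is a single-row UPDATE; the invoice row is not modified.
cursor = conn.execute(
    "UPDATE customer SET name = ? WHERE id = ?", ("ACME Corp", customer_id)
)
renamed_rows = cursor.rowcount

# The invoice still resolves to the renamed customer through the stable id.
linked_name = conn.execute(
    "SELECT c.name FROM invoice i JOIN customer c ON c.id = i.customer_id"
).fetchone()[0]
```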

1.3 Table Registry

Create TableRegistry class to store created tables
Add methods:
- register_table(doctype_name: str, table: Table)
- get_table(doctype_name: str) -> Table
- table_exists(doctype_name: str) -> bool
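
A minimal sketch of such a registry follows. The three method names come from the list above; everything else is an assumption, including the choice to reject duplicate registrations and the use of `Any` as a stand-in for SQLAlchemy's `Table` so the example runs without third-party packages.

```python
from typing import Any, Dict


class TableRegistry:
    """Keeps a mapping of DocType name -> created table object."""

    def __init__(self) -> None:
        self._tables: Dict[str, Any] = {}

    def register_table(self, doctype_name: str, table: Any) -> None:
        # Assumption: registering the same DocType twice is treated as a bug.
        if doctype_name in self._tables:
            raise ValueError(f"Table for {doctype_name!r} is already registered")
        self._tables[doctype_name] = table

    def get_table(self, doctype_name: str) -> Any:
        try:
            return self._tables[doctype_name]
        except KeyError:
            raise KeyError(
                f"No table registered for DocType {doctype_name!r}"
            ) from None

    def table_exists(self, doctype_name: str) -> bool:
        return doctype_name in self._tables


registry = TableRegistry()
registry.register_table("Customer", "customer-table-placeholder")
```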

2. Database Connection Setup

2.1 Connection Factory & Virtual DocTypes

Create src/framework_m/adapters/db/connection.py
Implement ConnectionFactory (supports Multiple Bindings):
- Load db_binds from config (e.g., legacy, timescale)
- Create generic async engine (supports Postgres, SQLite for testing)
- Maintain map of engine_name -> AsyncEngine

2.2 Session Factory (SQL Support)

Add session factory:
- async def get_session(bind: str = "default") -> AsyncSession
- Support VirtualDocType (SQL Bind) by accepting bind_key from DocType Meta
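
One possible shape for a bind-aware session factory is sketched below. It is not the framework's implementation: `FakeSession` stands in for `AsyncSession`, the `makers` mapping is a hypothetical substitute for per-bind session makers, and `get_session` is written as an async context manager that commits on success and rolls back on error.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Callable, Dict, List


class FakeSession:
    """Stand-in for AsyncSession that records what happened to it."""

    def __init__(self, bind: str) -> None:
        self.bind = bind
        self.events: List[str] = []

    async def commit(self) -> None:
        self.events.append("commit")

    async def rollback(self) -> None:
        self.events.append("rollback")

    async def close(self) -> None:
        self.events.append("close")


class SessionFactory:
    def __init__(self, makers: Dict[str, Callable[[], FakeSession]]) -> None:
        self._makers = makers  # bind name -> callable producing a session

    @asynccontextmanager
    async def get_session(self, bind: str = "default") -> AsyncIterator[FakeSession]:
        try:
            maker = self._makers[bind]
        except KeyError:
            raise LookupError(f"Unknown bind {bind!r}") from None
        session = maker()
        try:
            yield session
            await session.commit()    # reached only if the body did not raise
        except Exception:
            await session.rollback()  # undo partial work, then re-raise
            raise
        finally:
            await session.close()


async def demo():
    factory = SessionFactory(
        {
            "default": lambda: FakeSession("default"),
            "legacy": lambda: FakeSession("legacy"),
        }
    )
    async with factory.get_session() as ok_session:
        pass
    failed_session = None
    try:
        async with factory.get_session("legacy") as session:
            failed_session = session
            raise RuntimeError("boom")
    except RuntimeError:
        pass
    return ok_session.events, failed_session.events
```

A Virtual DocType with a SQL bind would pass its `bind_key` as the `bind` argument.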

2.3 Non-SQL Virtual DocTypes (Custom Repositories)

Create concept of RepositoryOverride:
- Allow DocType to define repository_class in Meta
- RepositoryFactory instantiates this instead of GenericRepository
Create tests/adapters/db/test_repository_factory.py:
- Implement a Mock FileRepository that reads from JSON file
- Register it for a VirtualDoc
- Verify VirtualDoc.get() calls FileRepository.get()
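
The override mechanism might look roughly like the following. Only `repository_class`, `RepositoryFactory`, `GenericRepository`, `FileRepository`, and `VirtualDoc` are named in the plan; the `source_file` attribute, the `create` method, and the JSON layout (a mapping of document name to fields) are assumptions made for this sketch.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class GenericRepository:
    """Placeholder for the SQL-backed default repository."""

    def __init__(self, doctype: type) -> None:
        self.doctype = doctype


class FileRepository:
    """Hypothetical read-only repository backed by a JSON file."""

    def __init__(self, doctype: type) -> None:
        self.doctype = doctype
        self.path = Path(doctype.Meta.source_file)

    def get(self, name: str) -> Optional[Dict[str, Any]]:
        records = json.loads(self.path.read_text(encoding="utf-8"))
        return records.get(name)


class RepositoryFactory:
    def create(self, doctype: type) -> Any:
        # Use Meta.repository_class when present, else the generic default.
        meta = getattr(doctype, "Meta", None)
        repo_cls = getattr(meta, "repository_class", None) or GenericRepository
        return repo_cls(doctype)


# Test fixture: a JSON file holding one document.
data_file = Path(tempfile.mkdtemp()) / "docs.json"
data_file.write_text(json.dumps({"DOC-1": {"title": "Hello"}}), encoding="utf-8")


class VirtualDoc:
    class Meta:
        repository_class = FileRepository
        source_file = str(data_file)


class Note:
    """Ordinary DocType without an override."""


virtual_repo = RepositoryFactory().create(VirtualDoc)
default_repo = RepositoryFactory().create(Note)
```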

3. Generic Repository Implementation

Create tests/adapters/db/test_generic_repository.py
Define tests for CRUD operations (mock AsyncSession)
Create src/framework_m/adapters/db/generic_repository.py
Implement GenericRepository[T] class
Constructor dependencies (session is NOT stored here):
- model: Type[T]
- table: Table
- controller_class: Type[BaseController] | None
- event_bus: EventBusProtocol | None (InMemoryEventBus for dev)

[!NOTE] All CRUD methods accept session: AsyncSession as the FIRST argument. The caller (Service/Controller) owns the session via UnitOfWork.
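
To make the ownership rule concrete, here is a minimal sketch of a repository skeleton and a mock-based check. The constructor parameters follow the list above; the `delete` method, its tuple-shaped placeholder statement, and the use of `Any` for the session and table types are assumptions so the example runs with the standard library only. A SQLAlchemy version would build a real delete statement instead.

```python
import asyncio
from typing import Any, Generic, Optional, Type, TypeVar
from unittest.mock import AsyncMock

T = TypeVar("T")


class GenericRepository(Generic[T]):
    def __init__(
        self,
        model: Type[T],
        table: Any,
        controller_class: Optional[type] = None,
        event_bus: Optional[Any] = None,
    ) -> None:
        # Note: no session attribute. The caller passes one per call.
        self.model = model
        self.table = table
        self.controller_class = controller_class
        self.event_bus = event_bus

    async def delete(self, session: Any, id: str) -> None:
        # Placeholder statement; the repository executes but never commits.
        await session.execute(("DELETE", self.table, id))


async def demo():
    session = AsyncMock()  # stand-in for AsyncSession
    repo = GenericRepository(model=dict, table="customer")
    await repo.delete(session, "42")
    return session.execute.await_args, session.commit.await_count
```

The check that `commit` is never awaited captures the intent of the note: committing is the caller's job.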

3.1 CRUD Operations

3.2 Lifecycle Hook Integration

Create helper method _call_hook(hook_name: str):
- Check if controller exists
- Check if hook method exists on controller
- Call hook method if present
- Handle exceptions and rollback on error
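
The four steps above can be sketched as follows. This is an illustration under assumptions: it is written as a free function `call_hook` that receives the session and controller explicitly, it supports both sync and async hooks, and `HookError` and `InvoiceController` are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, List, Optional
from unittest.mock import AsyncMock


class HookError(RuntimeError):
    """Hypothetical wrapper for failures raised inside a lifecycle hook."""


async def call_hook(session: Any, controller: Optional[Any], hook_name: str) -> bool:
    """Run controller.<hook_name>() if it exists; return True when it ran."""
    if controller is None:
        return False
    hook = getattr(controller, hook_name, None)
    if not callable(hook):
        return False
    try:
        result = hook()
        if inspect.isawaitable(result):  # async hooks return an awaitable
            await result
    except Exception as exc:
        await session.rollback()
        raise HookError(f"Hook {hook_name!r} failed") from exc
    return True


class InvoiceController:
    def __init__(self) -> None:
        self.calls: List[str] = []

    async def before_save(self) -> None:
        self.calls.append("before_save")

    def validate(self) -> None:
        raise ValueError("total must be positive")


async def demo():
    session = AsyncMock()  # stand-in for AsyncSession
    controller = InvoiceController()
    ran = await call_hook(session, controller, "before_save")
    missing = await call_hook(session, controller, "on_submit")
    try:
        await call_hook(session, controller, "validate")
        failed = False
    except HookError:
        failed = True
    return ran, missing, failed, session.rollback.await_count
```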

4. Migration System

4.1 Alembic Integration

Initialize Alembic in project:
```
alembic init alembic
```
Implemented via MigrationManager.init(), which creates the Alembic directory structure
Configure alembic.ini:
- Set database URL from environment
- Configure migration file location
Create alembic/env.py:
- Import MetaRegistry
- Import all registered DocTypes
- Set target_metadata from SchemaMapper tables
Auto-generated with async support via MigrationManager.init()

4.2 Auto-Migration Detection

4.3 CLI Commands

Add migration commands to CLI:
- m migrate - run pending migrations
- m migrate:create <name> - create new migration
- m migrate:rollback - rollback last migration
- m migrate:status - show migration status
Also added: m migrate:history, m migrate:init

5. Repository Factory

Create src/framework_m/adapters/db/repository_factory.py
Implement RepositoryFactory class:
- create_generic_repository(doctype_name: str) -> GenericRepository
- Look up DocType from MetaRegistry
- Look up Table from TableRegistry
- Look up Controller from MetaRegistry
- Create and return GenericRepository instance
- Support event_bus parameter for domain events
- Support custom repository overrides for Virtual DocTypes

6. Engine, Session Factory & Unit of Work

[!IMPORTANT] Anti-Pattern to Avoid: Do NOT tie transaction lifecycle to HTTP request lifecycle. Repositories receive sessions; they do NOT create or own them.

6.1 Engine & Session Factory Setup

Create src/framework_m/adapters/db/connection.py (ConnectionFactory)
Implement create_engine(url: str) -> AsyncEngine
- Pool configuration (pool_size, max_overflow, timeout, recycle, pre_ping)
- Environment variable expansion (${VAR} syntax)
Implement SessionFactory (returns AsyncSession context managers)
- Support multiple binds (for Virtual DocTypes with SQL binds)
- Configuration from ConnectionFactory
- Auto commit/rollback in context manager
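
The `${VAR}` expansion step could be handled with a small helper like the one below. This is a sketch, not the framework's code: the function name `expand_env` is hypothetical, and raising on an unset variable (rather than substituting an empty string) is an assumption chosen so that misconfiguration fails at startup.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${VAR} in value with its value from env (default: os.environ)."""
    source = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            raise KeyError(f"Environment variable {name!r} is not set")
        return source[name]

    return _VAR_PATTERN.sub(replace, value)


url = expand_env(
    "postgresql+asyncpg://${DB_USER}:${DB_PASS}@localhost/app",
    {"DB_USER": "m", "DB_PASS": "s3cret"},
)
```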

6.2 Unit of Work (`UnitOfWork`)

6.3 Multi-Source Coordination (Outbox Pattern)

6.4 Startup Sequence (Schema Sync)

Create src/framework_m/adapters/db/__init__.py
Implement startup sequence (called once at app boot, NOT per-request):
- Initialize database engine (NOT session)
- Discover all DocTypes via MetaRegistry
- Create/sync tables via SchemaMapper
- Register tables in TableRegistry
- Run auto-migration (if enabled)
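
The ordering of these steps can be expressed as one function that receives its collaborators as callables. The sketch below is only meant to show the sequence; `run_startup` and its parameter names are hypothetical, and the demo callables stand in for the real engine factory, MetaRegistry, SchemaMapper, and TableRegistry.

```python
from typing import Any, Callable, Iterable, List, Optional


def run_startup(
    create_engine: Callable[[], Any],
    discover_doctypes: Callable[[], Iterable[type]],
    create_table: Callable[[type], Any],
    register_table: Callable[[str, Any], None],
    auto_migrate: Optional[Callable[[Any], None]] = None,
) -> Any:
    """Run the boot sequence once and return the engine (no session is opened)."""
    engine = create_engine()
    for doctype in discover_doctypes():
        table = create_table(doctype)
        register_table(doctype.__name__, table)
    if auto_migrate is not None:  # auto-migration is optional
        auto_migrate(engine)
    return engine


# Demo collaborators that only record the order in which they are called.
log: List[str] = []


class Customer:
    pass


class Invoice:
    pass


def make_engine() -> str:
    log.append("engine")
    return "engine-placeholder"


def record_table(name: str, table: Any) -> None:
    log.append(f"register {name} -> {table}")


def migrate(engine: Any) -> None:
    log.append("migrate")


engine = run_startup(
    create_engine=make_engine,
    discover_doctypes=lambda: [Customer, Invoice],
    create_table=lambda doctype: f"tab_{doctype.__name__.lower()}",
    register_table=record_table,
    auto_migrate=migrate,
)
```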

7. Testing

7.1 Unit Tests

Test SchemaMapper:
- Test type mapping for all supported types (test_schema_mapper.py)
- Test primary key creation (test_table_has_id_column_as_primary_key)
- Test nullable fields (TestSchemaMapperNullableFields)
- Test enum mapping (TestSchemaMapperEnums)
Test GenericRepository:
- Test CRUD operations with mock data (test_generic_repository.py)
- Test lifecycle hook calls (TestControllerHooks)
- Test transaction rollback on error (TestTransactionRollback)

7.2 Integration Tests

8. Error Handling

9. Performance Optimizations

Validation Checklist

Before moving to Phase 03, verify:

Can create tables from Pydantic models dynamically
CRUD operations work with lifecycle hooks
Migrations are generated and applied correctly
All integration tests pass against a real PostgreSQL instance (skipped if Docker is unavailable)
No direct SQLAlchemy imports in domain layer
Repository implements RepositoryProtocol correctly

Anti-Patterns to Avoid

❌ Don't: Use raw SQL queries in business logic
✅ Do: Use repository methods and let SQLAlchemy handle queries

❌ Don't: Hardcode supported field types
✅ Do: Use FieldRegistry to allow plugins to add types (e.g. GeoLocation)

❌ Don't: Store metadata in database like Frappe
✅ Do: Generate tables from code-first Pydantic models

❌ Don't: Use synchronous database calls
✅ Do: Use async/await throughout

❌ Don't: Hardcode table names or column names
✅ Do: Derive from Pydantic model metadata
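
As an illustration of the FieldRegistry point above, a registry that plugins can extend might look like this. It is a sketch under assumptions: the `register` and `resolve` method names are hypothetical, and SQL types are represented as plain strings, whereas a real mapper would presumably map to SQLAlchemy column types.

```python
from typing import Dict


class FieldRegistry:
    """Maps Python field types to portable SQL type names."""

    def __init__(self) -> None:
        self._types: Dict[type, str] = {
            str: "TEXT",
            int: "INTEGER",
            float: "FLOAT",
            bool: "BOOLEAN",
        }

    def register(self, python_type: type, sql_type: str) -> None:
        """Called by plugins to add (or override) a type mapping."""
        self._types[python_type] = sql_type

    def resolve(self, python_type: type) -> str:
        # Walk the MRO so subclasses inherit their parent's mapping.
        for cls in python_type.__mro__:
            if cls in self._types:
                return self._types[cls]
        raise TypeError(f"No SQL type registered for {python_type.__name__}")


class GeoLocation:
    """Hypothetical plugin-provided field type."""


class Email(str):
    """Subclass of str; resolves to the str mapping."""


fields = FieldRegistry()
fields.register(GeoLocation, "JSON")
```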
