Testing

Test Structure

tests/
├── conftest.py              # Shared fixtures + Fakes
├── unit/                    # No external deps
│   ├── test_models.py       # Domain models
│   ├── test_pipeline.py     # Pipeline orchestration
│   ├── test_processors.py   # Post-processors
│   ├── test_formatters.py   # Output formatters
│   ├── test_storage.py      # Storage fake
│   └── test_api.py          # FastAPI endpoints
└── integration/             # Requires Docker services

Running Tests

just test                    # All tests
just test-unit               # Unit tests with coverage
just test-integration        # Integration tests

Test Philosophy

Fakes over Mocks

We use in-memory implementations of ports (Fakes) instead of unittest.mock:

class FakeExtractor(ExtractorPort):
    async def extract(self, file_content, mime_type):
        return ExtractionResult(text="Fake text", engine_used="fake")

Benefits:

  • Fakes implement the real interface, so they fail as soon as the port changes instead of silently drifting
  • No brittle mock assertions
  • Reusable across tests
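As a minimal sketch of how a Fake might be used in a unit test: `ExtractorPort` and `ExtractionResult` below are simplified stand-ins for the project's real definitions (which may differ), and `extract_text` and the `calls` list are hypothetical additions made only for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionResult:
    # Stand-in for the project's result model (assumed fields only).
    text: str
    engine_used: str


class ExtractorPort(ABC):
    # Stand-in for the project's port; assumed here to be an abstract base class.
    @abstractmethod
    async def extract(self, file_content: bytes, mime_type: str) -> ExtractionResult:
        ...


class FakeExtractor(ExtractorPort):
    """In-memory Fake that records calls instead of running a real engine."""

    def __init__(self) -> None:
        self.calls: list[tuple[bytes, str]] = []

    async def extract(self, file_content, mime_type):
        self.calls.append((file_content, mime_type))
        return ExtractionResult(text="Fake text", engine_used="fake")


async def extract_text(extractor: ExtractorPort, content: bytes, mime_type: str) -> str:
    # Hypothetical application-layer function under test.
    result = await extractor.extract(content, mime_type)
    return result.text.strip()


def test_extract_text_uses_port() -> None:
    fake = FakeExtractor()
    text = asyncio.run(extract_text(fake, b"%PDF-1.7", "application/pdf"))
    # Assert on observable state rather than on mock call expectations.
    assert text == "Fake text"
    assert fake.calls == [(b"%PDF-1.7", "application/pdf")]
```

If the port is an abstract base class as assumed here, adding a new abstract method to it makes `FakeExtractor()` raise `TypeError` at instantiation until the Fake is updated, which is the kind of early failure the first benefit describes.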

Test Pyramid

Layer         Proportion   What                   How
Unit          ~70%         Domain + Application   Fakes
Integration   ~25%         Adapters + API         Tika container
E2E           ~5%          Full pipeline          Docker Compose

Test Markers

@pytest.mark.unit           # Fast, no external deps
@pytest.mark.integration    # Needs Docker services
@pytest.mark.e2e            # Full stack

Run specific markers:

uv run pytest -m unit
uv run pytest -m integration
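Custom markers generally need to be registered so that pytest does not warn about unknown marks. The project may already declare them in its pytest configuration; if not, one possible approach is a `pytest_configure` hook in `conftest.py`. The marker descriptions and the `MARKERS` name below are assumptions for illustration.

```python
# conftest.py (sketch): register the three markers used by the test suite.
MARKERS = (
    "unit: fast tests with no external dependencies",
    "integration: tests that need Docker services",
    "e2e: tests that exercise the full stack",
)


def pytest_configure(config):
    # addinivalue_line appends to the "markers" ini option, which is the
    # same effect as listing the markers in the pytest configuration file.
    for line in MARKERS:
        config.addinivalue_line("markers", line)
```

With the markers registered, running pytest with `--strict-markers` turns a misspelled marker name into an error instead of a silently unselected test.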

Coverage

Coverage targets: 80% overall and 90% for the domain layer.

just test-unit   # Includes --cov report