
Testing

WebMACS has three test suites: backend (Python), controller (Python), and frontend (TypeScript).


Test Structure

├── backend/tests/
│   ├── conftest.py               # Fixtures (async session, test client, auth)
│   ├── test_auth.py              # Login, JWT validation, logout
│   ├── test_datapoints.py        # Datapoints batch, latest, CRUD
│   ├── test_events.py            # Events CRUD + duplicates
│   ├── test_hardening.py         # OTA download, returning(), subprocess retry
│   ├── test_polling_safeguards.py # Batch cap, rule-eval opt, broadcast throttle
│   └── test_webhook_safeguards.py # Webhook throttle, concurrency, rate limiter
├── controller/tests/
│   ├── conftest.py               # Fixtures (mock API client)
│   ├── test_api_client.py        # HTTP client + retry logic (incl. 429)
│   ├── test_hardware.py          # Hardware abstraction tests
│   └── test_polling_safeguards.py # Per-sensor throttle, dedup, chunking
└── frontend/src/
    ├── composables/__tests__/    # useFormatters, useNotification
    ├── stores/__tests__/         # auth, events, experiments, datapoints
    ├── services/__tests__/       # api, websocket
    └── types/__tests__/          # TypeScript enum validation

Running Tests

All Tests

just test

By Component

just test-backend       # Backend only
just test-controller    # Controller only
just test-frontend      # Frontend only

With Coverage

cd backend && uv run pytest tests/ -v --cov --cov-report=html

Open htmlcov/index.html to view the coverage report.
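
For a quick summary in the terminal instead of HTML, pytest-cov's term-missing report lists the uncovered line numbers:

cd backend && uv run pytest tests/ --cov --cov-report=term-missing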


Backend Test Setup

Tests use an in-memory SQLite database by default (overridable via DATABASE_URL env var).
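
To run the suite against a real database instead, point DATABASE_URL at it; the asyncpg-style URL below is illustrative:

cd backend && DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/webmacs_test uv run pytest tests/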

Key Fixtures (conftest.py)

| Fixture      | Scope    | Description                                               |
| ------------ | -------- | --------------------------------------------------------- |
| engine       | session  | Async SQLAlchemy engine                                   |
| session      | function | Async session (auto-rolled-back)                          |
| client       | function | httpx.AsyncClient against the test app                    |
| auth_headers | function | {"Authorization": "Bearer ..."} for authenticated requests |
| admin_user   | function | Pre-seeded admin user                                     |
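
A minimal sketch of how such fixtures can be wired with pytest-asyncio, httpx, and async SQLAlchemy; the app import is a placeholder, and the real conftest.py will differ:

import pytest_asyncio
from httpx import ASGITransport, AsyncClient
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

from app.main import app  # placeholder import; the actual module path differs

@pytest_asyncio.fixture
async def engine():
    # In-memory SQLite unless DATABASE_URL overrides it
    # (function-scoped here for simplicity; the real suite scopes this per session)
    eng = create_async_engine("sqlite+aiosqlite:///:memory:")
    yield eng
    await eng.dispose()

@pytest_asyncio.fixture
async def session(engine):
    # Run each test inside a transaction that is rolled back afterwards
    async with engine.connect() as conn:
        trans = await conn.begin()
        factory = async_sessionmaker(bind=conn, expire_on_commit=False)
        async with factory() as s:
            yield s
        await trans.rollback()

@pytest_asyncio.fixture
async def client():
    # httpx drives the ASGI app in-process, so no running server is needed
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as c:
        yield c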

Example Test

@pytest.mark.asyncio
async def test_create_experiment(client, auth_headers):
    response = await client.post(
        "/api/v1/experiments",
        json={"name": "Test Run"},
        headers=auth_headers,
    )
    assert response.status_code == 201
    data = response.json()
    assert data["name"] == "Test Run"
    assert data["started_on"] is not None

Controller Test Setup

Controller tests mock the HTTP client to avoid needing a running backend:

@pytest.fixture
def mock_api_client(mocker):
    client = mocker.MagicMock(spec=APIClient)
    client.post = mocker.AsyncMock(return_value={"status": "ok"})
    return client
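
A test can then drive the code under test against the mock and assert on the calls it received. Poller and poll_once below are hypothetical names standing in for the real controller API:

@pytest.mark.asyncio
async def test_poll_posts_batch(mock_api_client):
    poller = Poller(api=mock_api_client)  # hypothetical class under test
    await poller.poll_once()
    mock_api_client.post.assert_awaited_once()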

Frontend Tests

cd frontend
npm run test -- --run     # Single run
npm run test -- --watch   # Watch mode

The frontend suite uses Vitest with Vue Test Utils.


CI Integration

GitHub Actions runs the full test matrix:

| Job      | Python | OS            | Services      |
| -------- | ------ | ------------- | ------------- |
| python   | 3.13   | ubuntu-latest | PostgreSQL 17 |
| frontend | —      | ubuntu-latest | —             |

Coverage reports are uploaded to Codecov.


Test Markers

@pytest.mark.unit          # Fast, no external deps
@pytest.mark.integration   # Needs database
@pytest.mark.e2e           # Full stack

Run specific markers:

uv run pytest -m unit
uv run pytest -m integration
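
Marker expressions accept boolean operators, so suites can be combined or excluded:

uv run pytest -m "unit or integration"
uv run pytest -m "not e2e"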

Load Testing

WebMACS includes a built-in sensor scaling load test to determine system capacity.

Running the Load Test

python3 scripts/load_test.py

The script authenticates against the API, creates test sensors linked to the active plugin instance, sends batch datapoints at 2 Hz (matching the real controller), and measures latency, throughput, and error rates per stage.
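
Conceptually, each stage behaves like the sketch below; the endpoint path, payload fields, and summary math are illustrative rather than lifted from scripts/load_test.py:

import asyncio
import statistics
import time

import httpx

async def run_stage(client: httpx.AsyncClient, sensor_ids, duration=30.0, frequency=2.0):
    # Post one batch per tick at the requested frequency, recording latency in ms
    latencies, errors, accepted = [], 0, 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        batch = [{"sensor_id": sid, "value": 1.0} for sid in sensor_ids]
        start = time.monotonic()
        resp = await client.post("/api/v1/datapoints/batch", json=batch)
        latencies.append((time.monotonic() - start) * 1000)
        if resp.status_code >= 400:
            errors += 1
        else:
            accepted += len(batch)
        await asyncio.sleep(1.0 / frequency)
    # quantiles(n=100) returns 99 cut points; index 94 approximates P95
    return {
        "dp_per_s": accepted / duration,
        "p95_ms": statistics.quantiles(latencies, n=100)[94],
        "error_rate": errors / max(len(latencies), 1),
    }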

Options

| Flag        | Default               | Description                             |
| ----------- | --------------------- | --------------------------------------- |
| --stages    | 10,50,100,250,500     | Comma-separated sensor counts per stage |
| --duration  | 30                    | Seconds per stage                       |
| --frequency | 2.0                   | Batches per second (Hz)                 |
| --base-url  | http://localhost:8000 | Backend URL                             |

Examples:

# Quick smoke test
python3 scripts/load_test.py --stages 10,50 --duration 10

# Extended stress test
python3 scripts/load_test.py --stages 10,50,100,250,500,1000 --duration 60

# Against a remote server
python3 scripts/load_test.py --base-url http://192.168.1.50:8000

Metrics Collected

Per stage, the script reports:

  • Throughput — actual datapoints/second accepted by the backend
  • Batch Latency — P50, P95, P99 of POST /datapoints/batch response time
  • Dashboard Latency — P95 of GET /datapoints/latest (simulates frontend polling)
  • Error Rate — percentage of failed batch requests

Stage Status Icons

| Icon | Meaning                                      |
| ---- | -------------------------------------------- |
| ✅   | P95 < 200 ms, 0% errors                      |
| ⚠️   | P95 200–1000 ms, or nonzero errors below 2%  |
| ❌   | P95 > 1 s, or errors above 2%                |

Reference Results (Single Worker, Docker on Mac)

| Sensors | dp/s | P50    | P95    | P99    | Dashboard P95 | Errors |
| ------- | ---- | ------ | ------ | ------ | ------------- | ------ |
| 10      | 20   | 12 ms  | 17 ms  | 20 ms  | 43 ms         | 0%     |
| 50      | 100  | 27 ms  | 228 ms | 247 ms | 44 ms         | 0%     |
| 100     | 199  | 43 ms  | 243 ms | 248 ms | 46 ms         | 0%     |
| 250     | 498  | 92 ms  | 353 ms | 362 ms | 52 ms         | 0%     |
| 500     | 990  | 186 ms | 532 ms | 552 ms | 83 ms         | 0%     |

Sweet Spot

With the default single-worker configuration, 10 sensors provide the best experience with P95 < 20 ms. The system remains stable and error-free up to 500+ sensors (~1000 dp/s), though latency increases.

Scaling Beyond 500 Sensors

To improve high-sensor performance, consider:

  • Adding uvicorn workers: --workers 4 (see the example after this list)
  • PostgreSQL connection pooling (PgBouncer)
  • Database partitioning for the datapoints table
  • Reducing the controller poll interval
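
For example, starting the backend with four workers (the app.main:app module path is illustrative):

uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4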

Next Steps