# Testing Strategy

This page describes the testing structure, conventions, and current coverage for
quicprochat. All tests run with `cargo test --workspace` (or `just test`) and
must pass before any code is merged.

For the coding standards that tests must follow, see
[Coding Standards](coding-standards.md).

---

## Test Organisation

### Unit Tests

Unit tests live alongside the code they test, in `#[cfg(test)] mod tests` blocks
at the bottom of each source file. They test individual functions and types in
isolation.
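
As an illustration of this layout, the sketch below shows a source file containing a small length-prefixed codec with its test module at the bottom. The `encode` and `decode` functions and the `FrameError` type are hypothetical stand-ins chosen for the example; the actual `codec` API may look different.

```rust
// Hypothetical stand-in for a codec module; not the real quicprochat API.

/// Errors returned by `decode`.
#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    /// Fewer than four bytes, so the length prefix is incomplete.
    MissingLengthPrefix,
    /// The prefix announces more payload bytes than are present.
    Truncated { expected: usize, actual: usize },
}

/// Prepends a 4-byte big-endian length to `payload`.
pub fn encode(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload too large for a u32 prefix");
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Returns the payload of the first frame in `frame`.
pub fn decode(frame: &[u8]) -> Result<&[u8], FrameError> {
    let (prefix, rest) = match frame {
        [a, b, c, d, rest @ ..] => ([*a, *b, *c, *d], rest),
        _ => return Err(FrameError::MissingLengthPrefix),
    };
    let expected = u32::from_be_bytes(prefix) as usize;
    rest.get(..expected)
        .ok_or(FrameError::Truncated { expected, actual: rest.len() })
}

// The test module sits at the bottom of the same file and is only compiled
// when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn encode_decode_round_trip_preserves_payload() {
        let frame = encode(b"hello");
        assert_eq!(decode(&frame), Ok(&b"hello"[..]));
    }

    #[test]
    fn empty_payload_produces_length_zero_frame() {
        assert_eq!(encode(b""), vec![0, 0, 0, 0]);
    }
}
```

Because the module is declared inside the file it tests, `use super::*;` gives the tests access to private items as well, which is the main reason for keeping unit tests next to the code.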

**quicprochat-core (96 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| `codec` | 7 | Length-prefixed frame encoding/decoding, edge cases (empty payload, max size, partial frame, exact boundary) |
| `keypair` | 3 | Ed25519 keypair generation, public key extraction, deterministic re-derivation |
| `group` | 2 | Group round-trip (create + add + join + send + recv), `group_id` lifecycle |
| `hybrid_kem` | 11 | Encapsulate/decapsulate round-trip, key generation, combiner correctness, wrong-key rejection, serialisation |
| `opaque_auth` | 12 | OPAQUE registration + login full flow, bad password rejection |
| `mls_*` | 61 | MLS key schedule, member add/remove, Welcome processing, key exhaustion |

**quicprochat-rpc (18 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| `framing` | 8 | Wire framing round-trips, method ID encoding, length-prefix correctness |
| `dispatch` | 10 | Handler dispatch, method not found, middleware chain, timeout enforcement |

**quicprochat-sdk (30 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| `client` | 15 | `QpqClient` connect, send, receive, event broadcast |
| `conversation_store` | 15 | `ConversationStore` CRUD, pagination, message ordering |

**quicprochat-server (65 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| `auth` | 20 | OPAQUE registration, login, session management, rate limiting |
| `node_service` | 20 | KeyPackage upload/fetch, message enqueue/deliver, sealed sender |
| `storage` | 15 | `FileBackedStore` and `SqlStore` CRUD, MLS entity serialisation |
| `federation` | 10 | Federation peer relay, mTLS validation, domain routing |

**quicprochat-kt (21 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| `merkle_log` | 21 | Merkle tree insertion, consistency proofs, root hash correctness |

**quicprochat-p2p (34 tests):**

| Module | Tests | What they cover |
|--------|-------|----------------|
| iroh mesh | 34 | P2P peer discovery, relay, mesh join/leave |

### Integration and E2E Tests

E2E tests live in `crates/quicprochat-client/tests/e2e.rs` (20 tests) and
exercise the full client-server stack in-process. Each test spawns a real server
using `tokio::spawn`, runs client operations against it, and asserts on the
results.

**quicprochat-client unit (16 tests):**

| File | What it covers |
|------|---------------|
| `src/lib.rs` | CLI command parsing, client state machine, error formatting |

**quicprochat-client E2E (20 tests):**

| Test | What it covers |
|------|---------------|
| `auth_failure` | Rejected OPAQUE login (wrong password) |
| `message_ordering` | Sequential message delivery order preserved |
| `opaque_flow` | Full OPAQUE registration + login round-trip |
| `key_exhaustion` | Behaviour when KeyPackage queue is empty |
| `rate_limit` | Rate limiting rejects excess requests |
| `mls_group_round_trip` | Full MLS group: create, add member, send, receive |
| `keypackage_single_use` | KeyPackage consumed on first fetch |
| and 13 more | Additional protocol scenarios |

### Test Pattern

All E2E tests follow the same pattern:

```rust
use std::time::Duration;

#[tokio::test]
async fn test_something() {
    // 1. Acquire shared lock to avoid port conflicts
    let _lock = AUTH_LOCK.lock().await;

    // 2. Start server in background
    //    (`config` and `server_addr` are placeholders; their construction is omitted)
    let server_handle = tokio::spawn(async move {
        server::run(config).await.expect("server failed");
    });

    // 3. Wait for server to be ready
    tokio::time::sleep(Duration::from_millis(100)).await;

    // 4. Run client operations
    let result = client::do_something(server_addr).await;

    // 5. Assert, with a message that shows the error on failure
    assert!(result.is_ok(), "client operation failed: {result:?}");

    // 6. Cleanup
    server_handle.abort();
}
```

This pattern ensures tests are self-contained and do not require an external
server process.

---

## Running Tests

### Full Workspace

```bash
just test
# or
cargo test --workspace
```

This runs all unit, integration, and doc tests across all nine crates (301 tests total).

### E2E Tests (serialised)

The E2E test suite shares an `AUTH_LOCK` (a `tokio::sync::Mutex`) to prevent
port-binding conflicts when tests run in parallel. Always run E2E tests with a
single thread:

```bash
cargo test -p quicprochat-client --test e2e -- --test-threads 1
```

Running without `--test-threads 1` may cause intermittent bind errors if two
tests try to use the same port concurrently.
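
The serialising effect of a shared lock can be illustrated without a server. The sketch below is a simplified analogue that uses `std::sync::Mutex` and threads in place of `tokio::sync::Mutex` and async tests; `PORT_LOCK`, `run_exclusive`, and `peak_concurrency` are hypothetical names that do not exist in the test suite.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

// Hypothetical analogue of `AUTH_LOCK`: one process-wide lock shared by every test.
static PORT_LOCK: Mutex<()> = Mutex::new(());

/// Runs `body` while holding the shared lock, so at most one caller is inside
/// `body` at any time.
pub fn run_exclusive<T>(body: impl FnOnce() -> T) -> T {
    // A std mutex is poisoned if a holder panics; recover the guard so one
    // failing test does not cascade into all later ones.
    let _guard = PORT_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    body()
}

/// Spawns `workers` threads that each enter the critical section once and
/// returns the largest number of threads ever observed inside it together.
pub fn peak_concurrency(workers: usize) -> usize {
    static ACTIVE: AtomicUsize = AtomicUsize::new(0);
    static PEAK: AtomicUsize = AtomicUsize::new(0);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            thread::spawn(|| {
                run_exclusive(|| {
                    let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
                    PEAK.fetch_max(now, Ordering::SeqCst);
                    // Stand-in for the time a test keeps the port bound.
                    thread::sleep(Duration::from_millis(5));
                    ACTIVE.fetch_sub(1, Ordering::SeqCst);
                })
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    PEAK.load(Ordering::SeqCst)
}
```

With the lock in place, `peak_concurrency(8)` reports a peak of one thread inside the critical section, however many workers are started. Code that runs outside the guarded region is not protected.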

### Single Crate

```bash
cargo test -p quicprochat-core
cargo test -p quicprochat-rpc
cargo test -p quicprochat-sdk
cargo test -p quicprochat-server
cargo test -p quicprochat-kt
cargo test -p quicprochat-p2p
```

### Single Test

```bash
cargo test -p quicprochat-core -- codec::tests::test_round_trip
cargo test -p quicprochat-client --test e2e -- opaque_flow --test-threads 1
```

### With Output

```bash
cargo test --workspace -- --nocapture
```

---

## Current Results

All 301 tests pass on branch `v2`.

| Crate | Unit / Integration Tests | E2E Tests | Total |
|-------|--------------------------|-----------|-------|
| `quicprochat-core` | 96 | -- | 96 |
| `quicprochat-rpc` | 18 | -- | 18 |
| `quicprochat-sdk` | 30 | -- | 30 |
| `quicprochat-server` | 65 | -- | 65 |
| `quicprochat-kt` | 21 | -- | 21 |
| `quicprochat-p2p` | 34 | -- | 34 |
| `quicprochat-client` | 16 unit + 1 doctest | 20 | 37 |
| **Total** | **281** | **20** | **301** |

---

## Test Conventions

### Naming

Test functions use descriptive names that state what is being tested and the
expected outcome:

```rust
#[test]
fn encode_decode_round_trip_preserves_payload() { ... }

#[test]
fn empty_payload_produces_length_zero_frame() { ... }

#[test]
fn fetch_consumes_keypackage_single_use() { ... }
```

### Assertions

- Compare values with `assert_eq!` rather than `assert!(a == b)`, so that a
  failure prints both the expected and the actual value.
- Use `assert!(result.is_ok(), "descriptive message: {result:?}")` for
  `Result` checks.
- For crypto operations, assert on specific error variants, not just
  `is_err()`.
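
The difference between checking `is_err()` and checking a specific variant can be sketched as follows. `OpenError`, `seal`, and `open` are hypothetical, and the XOR "cipher" is a toy that exists only to produce two distinct failure modes; it is not real cryptography and does not reflect the project's crypto code.

```rust
/// Hypothetical error type; the real crates define their own variants.
#[derive(Debug, PartialEq, Eq)]
pub enum OpenError {
    WrongKey,
    Truncated,
}

/// Toy sealing: XOR every byte with `key`, then append a 1-byte checksum of
/// the plaintext. NOT real cryptography.
pub fn seal(key: u8, plain: &[u8]) -> Vec<u8> {
    let mut out: Vec<u8> = plain.iter().map(|b| b ^ key).collect();
    out.push(plain.iter().fold(0u8, |acc, b| acc.wrapping_add(*b)));
    out
}

/// Reverses `seal`, rejecting empty input and checksum mismatches.
pub fn open(key: u8, sealed: &[u8]) -> Result<Vec<u8>, OpenError> {
    let (tag, body) = sealed.split_last().ok_or(OpenError::Truncated)?;
    let plain: Vec<u8> = body.iter().map(|b| b ^ key).collect();
    let checksum = plain.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    if checksum == *tag {
        Ok(plain)
    } else {
        Err(OpenError::WrongKey)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn open_with_wrong_key_returns_wrong_key_error() {
        let sealed = seal(7, b"abc");
        let result = open(9, &sealed);
        // `assert!(result.is_err())` would also pass for `Truncated`,
        // so name the variant that is actually expected.
        assert!(
            matches!(result, Err(OpenError::WrongKey)),
            "expected WrongKey, got {:?}",
            result
        );
    }
}
```

Asserting on the variant means a regression that turns a wrong-key rejection into some other failure is caught instead of silently passing.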

### No External Dependencies

Tests must not depend on external services, external network access, or
filesystem state outside the test's temporary directory. The `tokio::spawn`
pattern for E2E tests ensures everything runs in-process.

### Determinism

Tests must be deterministic. If randomness is needed (e.g., key generation),
the test must not depend on specific random values, only on the properties of
the output (correct length, successful round-trip, etc.).
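
A minimal sketch of such a property-style check, using only the standard library: `generate_key`, `to_hex`, and `from_hex` are hypothetical helpers, and the randomness is borrowed from `RandomState` purely for illustration. It is not a CSPRNG and is not how the project generates keys.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical stand-in for key generation: 32 bytes that vary between runs.
/// Not cryptographically secure; it only reuses the random seed that
/// `RandomState` carries.
pub fn generate_key() -> [u8; 32] {
    let mut key = [0u8; 32];
    for chunk in key.chunks_mut(8) {
        let word = RandomState::new().build_hasher().finish();
        chunk.copy_from_slice(&word.to_le_bytes());
    }
    key
}

/// Lower-case hex encoding.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Hex decoding; returns `None` for odd lengths or unparsable digit pairs.
pub fn from_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| hex.get(i..i + 2).and_then(|pair| u8::from_str_radix(pair, 16).ok()))
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn generated_key_survives_hex_round_trip() {
        let key = generate_key();
        // Assert on properties (length, round-trip), never on the bytes themselves.
        let hex = to_hex(&key);
        assert_eq!(hex.len(), 64);
        assert_eq!(from_hex(&hex), Some(key.to_vec()));
    }
}
```

The test passes for any key value, so it stays deterministic even though its input changes on every run.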

### No `.unwrap()` in Test Setup

`.unwrap()` is acceptable in test assertions, but a bare `.unwrap()` in test
setup panics without saying what the test was trying to do. Use
`expect("descriptive message")` on setup operations so failures report clearly.
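
For example, setup code that prepares a scratch directory might look like the sketch below. The helper names and the `qpq-test-` directory prefix are assumptions made for this example, not existing project utilities.

```rust
use std::fs;
use std::path::PathBuf;

/// Creates a fresh scratch directory for one test and returns its path.
/// Hypothetical helper; `name` should be unique per test.
pub fn scratch_dir(name: &str) -> PathBuf {
    let dir = std::env::temp_dir().join(format!("qpq-test-{}-{}", std::process::id(), name));
    // Ignore the result: the directory normally does not exist yet.
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).expect("failed to create scratch directory for test");
    dir
}

/// Writes `contents` to a file in a scratch directory and reads it back.
/// Every setup and teardown step names itself in its `expect` message.
pub fn write_and_read_back(name: &str, contents: &[u8]) -> Vec<u8> {
    let dir = scratch_dir(name);
    let path = dir.join("store.bin");
    fs::write(&path, contents).expect("failed to write store file during test setup");
    let read = fs::read(&path).expect("failed to read store file back");
    fs::remove_dir_all(&dir).expect("failed to remove scratch directory");
    read
}
```

If the write fails, the panic message identifies the setup step as well as the underlying I/O error, which a bare `.unwrap()` would not. Keeping all files under the system temporary directory also satisfies the rule on filesystem state.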

---

## Planned Testing Enhancements

### Fuzzing Targets (M5+)

Fuzz testing for parser and deserialisation code:

- **Protobuf message parser:** Feed arbitrary bytes to `prost::Message::decode`
  on each generated type and verify it either parses correctly or returns a
  typed error (no panics, no undefined behaviour).
- **MLS message handler:** Feed arbitrary `MLSMessage` bytes to the
  `GroupMember::receive_message` path.

Tool: `cargo-fuzz` with libFuzzer.
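
The property a parser fuzz target checks can be sketched with the standard library alone. The toy `decode` function and `DecodeError` type below are hypothetical; a real target would call the generated Protobuf types or the MLS receive path instead, and `check_one_input` would be invoked from the `fuzz_target!` macro provided by `libfuzzer-sys`.

```rust
/// Hypothetical typed error for the toy parser.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    LengthMismatch,
}

/// Toy parser: a 1-byte length followed by exactly that many payload bytes.
pub fn decode(data: &[u8]) -> Result<&[u8], DecodeError> {
    let (len, rest) = data.split_first().ok_or(DecodeError::Truncated)?;
    if rest.len() != usize::from(*len) {
        return Err(DecodeError::LengthMismatch);
    }
    Ok(rest)
}

/// Body of the fuzz target. With cargo-fuzz this would be wired up as
/// `fuzz_target!(|data: &[u8]| { check_one_input(data); });`.
pub fn check_one_input(data: &[u8]) {
    // Both `Ok` and a typed `Err` are acceptable outcomes; what the fuzzer
    // looks for is a panic or crash inside `decode`.
    if let Ok(payload) = decode(data) {
        // Accepted input must re-encode to exactly the bytes that were parsed.
        let mut again = vec![payload.len() as u8];
        again.extend_from_slice(payload);
        assert_eq!(again, data, "decode/encode round trip changed the bytes");
    }
}
```

Keeping the check in an ordinary function means the same property can also be exercised from a regular unit test with a handful of fixed inputs.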

### Golden-Wire Fixtures (M5+)

Serialised test vectors for regression testing across versions:

- Capture the wire bytes of known-good Protobuf messages at the current version.
- Store as `.bin` files in `tests/fixtures/`.
- Each test deserialises the fixture and verifies the expected field values.
- When the wire format changes, fixtures are updated with a version bump.

This catches accidental wire-format regressions that would break client-server
compatibility.
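
A fixture test could take roughly the following shape. To stay self-contained, this sketch inlines the fixture bytes and decodes them with a hand-written varint reader. A real test would more likely load the file with `include_bytes!` and decode it with the generated `prost` types. The three bytes are the standard Protobuf encoding of a message whose field 1 is the varint 150; all function and constant names here are hypothetical.

```rust
/// Stand-in for a `.bin` fixture: Protobuf field 1 (wire type 0) = 150.
const FIXTURE_V1: &[u8] = &[0x08, 0x96, 0x01];

/// Reads one base-128 varint, returning the value and the bytes consumed.
pub fn read_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A 64-bit varint occupies at most ten bytes.
    for (i, byte) in bytes.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Decodes a message consisting of exactly one varint field and returns
/// `(field_number, value)`.
pub fn decode_single_varint_field(bytes: &[u8]) -> Option<(u64, u64)> {
    let (key, key_len) = read_varint(bytes)?;
    if key & 0x7 != 0 {
        return None; // only wire type 0 (varint) is handled in this sketch
    }
    let (value, value_len) = read_varint(&bytes[key_len..])?;
    if key_len + value_len != bytes.len() {
        return None; // trailing bytes mean the fixture is not what we expect
    }
    Some((key >> 3, value))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn fixture_v1_decodes_to_expected_field() {
        assert_eq!(decode_single_varint_field(FIXTURE_V1), Some((1, 150)));
    }
}
```

If a later change alters the field number or encoding, the stored bytes no longer decode to the expected values and the test fails, which is the regression signal the fixtures are meant to provide.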

### N-1 Compatibility Tests (M5+)

Test that a client built at version N can communicate with a server built at
version N-1 (and vice versa):

- Build two versions of the binary (current and previous release).
- Run the older server with the newer client and verify all RPCs succeed.
- Run the newer server with the older client and verify graceful degradation.

### Criterion Benchmarks (M5)

Performance benchmarks using [Criterion.rs](https://docs.rs/criterion/):

- Key generation latency (Ed25519, X25519, ML-KEM-768).
- MLS operations (KeyPackage generation, Welcome processing).
- Group-add latency scaling: 2, 10, 100, 1000 members.
- Protobuf serialise/deserialise throughput.

Benchmarks run separately from tests (`cargo bench`) and are not part of the
CI gate, but are tracked for regression detection.

### Docker-based E2E Tests (Phase 5)

End-to-end tests using `testcontainers-rs`:

- Spin up server container from the Docker image.
- Run client operations from the test process against the containerised server.
- Verify real network boundaries, container startup, and multi-process
  interactions.

---

## Cross-references

- [Coding Standards](coding-standards.md) -- quality requirements for test code
- [Milestones](../roadmap/milestones.md) -- which tests were added at each milestone
- [Production Readiness WBS](../roadmap/production-readiness.md) -- Phase 5 (E2E Harness and Security Tests)