MLS (RFC 9420)

The Messaging Layer Security protocol (RFC 9420) is the core cryptographic layer in quicprochat. It provides authenticated group key agreement with forward secrecy and post-compromise security -- properties that distinguish quicprochat from a simple transport-encrypted relay. This is the most detailed page in the Protocol Deep Dives section because MLS is the most complex layer in the stack.

The implementation lives in quicprochat-core/src/group.rs and quicprochat-core/src/keystore.rs, using the openmls 0.5 crate.

Background: what problem MLS solves

Before MLS, group messaging systems had two main approaches:

  1. Pairwise encryption (Signal/Double Ratchet): Each pair of group members maintains an independent encrypted session. A message to a group of n members requires n - 1 separate encryptions. Adding or removing a member requires O(n) operations by each member. The total work for a group operation is O(n^2).

  2. Server-side fan-out with shared key: All members share a single group key. The server decrypts and re-encrypts for each member. This is not end-to-end encrypted -- the server sees plaintext.

MLS takes a fundamentally different approach: it uses a ratchet tree (a binary tree of Diffie-Hellman key pairs) to derive group keys. This gives:

  • O(log n) scaling: A group operation (add, remove, update) requires only O(log n) DH operations, one per level of the tree, rather than work proportional to the number of members.
  • Forward secrecy: Each epoch uses a fresh key derived from the ratchet tree. Compromising the current key does not reveal past messages.
  • Post-compromise security (PCS): After a member's key is compromised, a single Update Commit operation re-randomises the compromised node's path in the tree, restoring confidentiality for all subsequent messages.
  • End-to-end encryption: The server (Delivery Service) never sees plaintext. It routes opaque MLS blobs by recipient key without parsing them.
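
The scaling difference can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a perfectly balanced tree and counts one DH operation per tree level, as described above.

```rust
/// Separate encryptions needed to send one message to a group of `n`
/// members when every pair keeps its own session (pairwise approach).
fn pairwise_encryptions(n: u64) -> u64 {
    n.saturating_sub(1)
}

/// Levels between a leaf and the root in a balanced binary tree with `n`
/// leaves, i.e. ceil(log2(n)). This approximates the number of DH
/// operations a tree-based group update needs.
fn tree_path_len(n: u64) -> u32 {
    if n <= 1 {
        0
    } else {
        u64::BITS - (n - 1).leading_zeros()
    }
}
```

For a 100-member group, `pairwise_encryptions(100)` is 99 while `tree_path_len(100)` is 7.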

Ciphersuite

quicprochat uses:

MLS_128_DHKEMX25519_AES128GCM_SHA256_Ed25519

| Component | Algorithm | Purpose |
|---|---|---|
| HPKE KEM | DHKEM(X25519, HKDF-SHA256) | Key encapsulation for Welcome messages and tree operations |
| AEAD | AES-128-GCM | Symmetric encryption of application messages |
| Hash | SHA-256 | Key derivation, transcript hashing, tree hashing |
| Signature | Ed25519 | Credential binding, Commit signing, KeyPackage signing |

This ciphersuite provides 128-bit classical security. Post-quantum protection is delegated to the Hybrid KEM layer, which wraps MLS payloads at the transport level and is planned for M5.

The GroupMember state machine

The central type is GroupMember, defined in quicprochat-core/src/group.rs. It wraps an openmls MlsGroup, a persistent crypto backend (StoreCrypto), and the user's long-term Ed25519 identity keypair.

Lifecycle diagram

GroupMember::new(identity)
  |
  ├── generate_key_package()      → TLS-encoded KeyPackage bytes
  |                                  (upload to Authentication Service)
  |
  ├── create_group(group_id)      → Epoch 0; caller is sole member
  |     |
  |     └── add_member(kp_bytes)  → (commit_bytes, welcome_bytes)
  |           |                      merge_pending_commit() called internally
  |           |
  |           ├── [commit_bytes → existing members via DS]
  |           └── [welcome_bytes → new member via DS]
  |
  └── join_group(welcome_bytes)   → Join via Welcome; epoch matches inviter
        |
        ├── send_message(plaintext) → MLS PrivateMessage bytes
        |
        └── receive_message(bytes)  → Some(plaintext) for Application messages
                                      None for Commits (state updated internally)
                                      None for Proposals (stored for later Commit)
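
The lifecycle in the diagram above can be modelled as a tiny state machine. This is a toy sketch, not the real GroupMember: the `Lifecycle` and `LifecycleError` types are hypothetical, and rejecting a second `create_group` or `join_group` on an active member is an assumption of the sketch rather than documented behaviour.

```rust
/// Toy model of the lifecycle: tracks only whether a group is active and
/// which epoch it is in. Hypothetical types, not the real GroupMember.
#[derive(Debug, PartialEq)]
enum Lifecycle {
    NoGroup,
    Active { epoch: u64 },
}

#[derive(Debug, PartialEq)]
enum LifecycleError {
    AlreadyInGroup,
    NoActiveGroup,
}

impl Lifecycle {
    /// create_group: the caller becomes the sole member at epoch 0.
    fn create_group(&mut self) -> Result<(), LifecycleError> {
        match self {
            Lifecycle::NoGroup => {
                *self = Lifecycle::Active { epoch: 0 };
                Ok(())
            }
            // Assumption of this sketch: one group per member object.
            Lifecycle::Active { .. } => Err(LifecycleError::AlreadyInGroup),
        }
    }

    /// add_member: the Commit is merged immediately, so the epoch advances.
    /// Returns the new epoch.
    fn add_member(&mut self) -> Result<u64, LifecycleError> {
        match self {
            Lifecycle::Active { epoch } => {
                *epoch += 1;
                Ok(*epoch)
            }
            Lifecycle::NoGroup => Err(LifecycleError::NoActiveGroup),
        }
    }

    /// join_group: the joiner starts at the inviter's current epoch.
    fn join_group(&mut self, inviter_epoch: u64) -> Result<(), LifecycleError> {
        match self {
            Lifecycle::NoGroup => {
                *self = Lifecycle::Active { epoch: inviter_epoch };
                Ok(())
            }
            Lifecycle::Active { .. } => Err(LifecycleError::AlreadyInGroup),
        }
    }

    /// Sending and receiving only make sense once a group is active.
    fn can_send(&self) -> bool {
        matches!(self, Lifecycle::Active { .. })
    }
}
```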

Construction

pub fn new(identity: Arc<IdentityKeypair>) -> Self

Creates a new GroupMember with:

  • A fresh StoreCrypto backend using an ephemeral (in-memory) key store.
  • The provided Ed25519 identity keypair (used as the MLS Signer).
  • No active group (self.group = None).

For state persistence across restarts, use:

pub fn new_with_state(
    identity: Arc<IdentityKeypair>,
    key_store: DiskKeyStore,
    group: Option<MlsGroup>,
) -> Self

This constructor accepts a pre-existing DiskKeyStore (loaded from disk) and an optional MlsGroup that the caller has already restored from saved state. The MlsGroupConfig is rebuilt with use_ratchet_tree_extension(true).

MLS group configuration

The group configuration is built once at construction time:

let config = MlsGroupConfig::builder()
    .use_ratchet_tree_extension(true)
    .build();

The critical setting is use_ratchet_tree_extension(true): this embeds the full ratchet tree inside Welcome messages so that new members can reconstruct the group state without a separate tree-fetching step. The trade-off is larger Welcome messages, but this simplifies the protocol by eliminating a round-trip to a tree distribution service.

Key operations

generate_key_package()

pub fn generate_key_package(&mut self) -> Result<Vec<u8>, CoreError>

Generates a fresh, single-use MLS KeyPackage and returns it as TLS-encoded bytes.

What happens internally:

  1. A CredentialWithKey is created from the identity keypair. The credential type is Basic -- the credential body is the raw Ed25519 public key bytes, and the signature_key field is the same public key.

  2. KeyPackage::builder().build() is called with:

    • CryptoConfig::with_default_version(CIPHERSUITE) -- specifies the MLS ciphersuite.
    • &self.backend -- the StoreCrypto provider. During build, openmls generates an HPKE init keypair and stores the private key in the backend's key store.
    • self.identity.as_ref() -- the Signer (Ed25519 private key) used to sign the KeyPackage.
    • The CredentialWithKey binding the credential to the signature key.
  3. The KeyPackage is serialised via tls_serialize_detached() (TLS presentation language encoding, as specified by RFC 9420).

Critical invariant: The HPKE init private key is stored in self.backend's key store. The same GroupMember instance (or one reconstructed with the same DiskKeyStore) must later call join_group(), because new_from_welcome() looks up the init private key by reference to decrypt the Welcome. If a different GroupMember instance (with a fresh key store) tries to join, the lookup fails and the Welcome cannot be decrypted.

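Why KeyPackages are single-use: Each KeyPackage contains a unique HPKE init public key. If the same KeyPackage were used for two different group joins, the same init private key would have to decrypt both Welcomes, so it could not be discarded after the first join, and a later compromise of that key would expose every Welcome encrypted to it. That weakens forward secrecy. See ADR-005: Single-Use KeyPackages.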
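
The lookup-by-reference invariant and the single-use rule can be illustrated with a toy key store. This is a minimal sketch, not the quicprochat or openmls implementation: `InitKeyStore` and its methods are hypothetical, and removing the key on first use is an assumption made here to show the single-use idea.

```rust
use std::collections::HashMap;

/// Toy stand-in for the key store: maps a KeyPackage reference to the
/// matching HPKE init private key. All names here are hypothetical.
#[derive(Default)]
struct InitKeyStore {
    keys: HashMap<Vec<u8>, Vec<u8>>,
}

impl InitKeyStore {
    /// Called when a KeyPackage is generated.
    fn store(&mut self, reference: &[u8], private_key: &[u8]) {
        self.keys.insert(reference.to_vec(), private_key.to_vec());
    }

    /// Called when a Welcome arrives: removes and returns the key, so a
    /// second Welcome for the same KeyPackage finds nothing.
    fn take(&mut self, reference: &[u8]) -> Option<Vec<u8>> {
        self.keys.remove(reference)
    }
}
```

A store that never saw the KeyPackage returns `None` from `take`, which corresponds to the failed Welcome decryption described above; the store that generated it returns the key exactly once.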

create_group(group_id)

pub fn create_group(&mut self, group_id: &[u8]) -> Result<(), CoreError>

Creates a new MLS group at epoch 0 with the caller as the sole member.

Parameters:

  • group_id: Any non-empty byte string. By convention, quicprochat uses the SHA-256 digest of a human-readable group name.

What happens internally:

  1. A CredentialWithKey is created (same as generate_key_package).
  2. MlsGroup::new_with_group_id() is called with the backend, signer, config, group ID, and credential.
  3. The resulting MlsGroup is stored in self.group.

After this call, the group exists at epoch 0 with one member. Use add_member() to invite additional members.

add_member(key_package_bytes)

pub fn add_member(
    &mut self,
    key_package_bytes: &[u8],
) -> Result<(Vec<u8>, Vec<u8>), CoreError>

Adds a new member to the group by their TLS-encoded KeyPackage. Returns (commit_bytes, welcome_bytes).

What happens internally:

  1. KeyPackage deserialisation and validation: The raw bytes are deserialised via KeyPackageIn::tls_deserialize(). Note the In suffix -- openmls 0.5 distinguishes between KeyPackage (trusted, locally-generated) and KeyPackageIn (untrusted, received from the network). The validate() method verifies the Ed25519 signature on the KeyPackage and returns a trusted KeyPackage.

    let key_package: KeyPackage =
        KeyPackageIn::tls_deserialize(&mut key_package_bytes.as_ref())?
            .validate(self.backend.crypto(), ProtocolVersion::Mls10)?;
    
  2. Commit + Welcome creation: group.add_members() produces three outputs:

    • commit_out (MlsMessageOut): A Commit message that existing members process to update their state.
    • welcome_out (MlsMessageOut): A Welcome message that bootstraps the new member into the group.
    • _group_info: A GroupInfo for external commits (not used here).
  3. Merge pending commit: group.merge_pending_commit() applies the Commit to the local state, advancing the epoch. This is called immediately because the creator of the Commit is also a group member.

  4. Serialisation: Both commit_out and welcome_out are serialised to bytes via .to_bytes().

Caller responsibilities:

  • Send commit_bytes to all existing group members via the Delivery Service. (In the two-party case, where the creator is the only existing member, commit_bytes can be discarded -- the creator has already merged the Commit locally.)
  • Send welcome_bytes to the new member via the Delivery Service.
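
The caller responsibilities above amount to a small routing decision. The sketch below shows one way to express it; `MemberKey` and `plan_delivery` are hypothetical names, and the 32-byte key simply mirrors the Ed25519 public key used for routing elsewhere on this page.

```rust
/// Hypothetical recipient identifier (an Ed25519 public key).
type MemberKey = [u8; 32];

/// Decide who receives which blob after add_member().
/// `existing` is the membership before the add, including the committer.
fn plan_delivery(
    committer: MemberKey,
    existing: &[MemberKey],
    new_member: MemberKey,
    commit_bytes: &[u8],
    welcome_bytes: &[u8],
) -> Vec<(MemberKey, Vec<u8>)> {
    let mut out: Vec<(MemberKey, Vec<u8>)> = existing
        .iter()
        // The committer has already merged the Commit locally.
        .filter(|m| **m != committer)
        .map(|m| (*m, commit_bytes.to_vec()))
        .collect();
    // The new member only ever receives the Welcome.
    out.push((new_member, welcome_bytes.to_vec()));
    out
}
```

In the two-party case the plan contains only the Welcome, which matches the note that the Commit can be discarded.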

join_group(welcome_bytes)

pub fn join_group(&mut self, welcome_bytes: &[u8]) -> Result<(), CoreError>

Joins an existing group from a TLS-encoded Welcome message.

Prerequisites:

  • generate_key_package() must have been called on this same instance (or one with the same DiskKeyStore) so that the HPKE init private key is available in the backend.

What happens internally:

  1. Deserialisation: The bytes are deserialised as MlsMessageIn, then the inner body is extracted. The into_welcome() method is feature-gated in openmls 0.5, so the implementation uses msg_in.extract() with a match on MlsMessageInBody::Welcome.

    let welcome = match msg_in.extract() {
        MlsMessageInBody::Welcome(w) => w,
        _ => return Err(CoreError::Mls("expected a Welcome message".into())),
    };
    
  2. Group construction: MlsGroup::new_from_welcome() is called with:

    • &self.backend -- to look up the HPKE init private key.
    • &self.config -- group configuration (ratchet tree extension enabled).
    • The Welcome message.
    • ratchet_tree = None -- because use_ratchet_tree_extension = true means the tree is embedded in the Welcome's GroupInfo extension. openmls extracts it automatically.
  3. The resulting MlsGroup is stored in self.group.

send_message(plaintext)

pub fn send_message(&mut self, plaintext: &[u8]) -> Result<Vec<u8>, CoreError>

Encrypts plaintext as an MLS Application message (PrivateMessage variant).

What happens internally:

  1. group.create_message() is called with the backend, signer, and plaintext.
  2. The resulting MlsMessageOut is serialised to bytes via .to_bytes().

The output is a TLS-encoded MLS message ready for delivery. The Delivery Service treats it as an opaque blob.

receive_message(bytes)

pub fn receive_message(&mut self, bytes: &[u8]) -> Result<Option<Vec<u8>>, CoreError>

Processes an incoming TLS-encoded MLS message.

Return values:

  • Ok(Some(plaintext)) -- for Application messages (PrivateMessage). The caller receives the decrypted plaintext.
  • Ok(None) -- for Commit messages. The group state is updated internally (epoch advances) via merge_staged_commit().
  • Ok(None) -- for Proposal messages. The proposal is stored via store_pending_proposal() for inclusion in a future Commit.
  • Ok(None) -- for External Join Proposal messages. Also stored as a pending proposal.

What happens internally:

  1. Deserialisation: Bytes are deserialised as MlsMessageIn, then extracted as either PrivateMessage or PublicMessage. The extraction uses manual pattern matching because into_protocol_message() is feature-gated in openmls 0.5:

    let protocol_message = match msg_in.extract() {
        MlsMessageInBody::PrivateMessage(m) => ProtocolMessage::PrivateMessage(m),
        MlsMessageInBody::PublicMessage(m) => ProtocolMessage::PublicMessage(m),
        _ => return Err(CoreError::Mls("not a protocol message".into())),
    };
    
  2. Processing: group.process_message() decrypts (for PrivateMessage) or verifies (for PublicMessage) the message and returns a ProcessedMessage.

  3. Content dispatch: The ProcessedMessageContent is matched:

    • ApplicationMessage: Plaintext bytes are extracted and returned.
    • StagedCommitMessage: The staged commit is merged, advancing the epoch.
    • ProposalMessage / ExternalJoinProposalMessage: The proposal is stored for later.
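
The content dispatch can be sketched with stand-in types. `Content` and `ToyGroup` below are hypothetical and carry no cryptography; clearing pending proposals when a commit is merged is an assumption of this sketch.

```rust
/// Toy mirror of the four content kinds handled by receive_message().
/// These are stand-ins, not the openmls types.
enum Content {
    Application(Vec<u8>),
    StagedCommit,
    Proposal(Vec<u8>),
    ExternalJoinProposal(Vec<u8>),
}

#[derive(Default)]
struct ToyGroup {
    epoch: u64,
    pending_proposals: Vec<Vec<u8>>,
}

impl ToyGroup {
    /// Returns Some(plaintext) only for application content.
    fn dispatch(&mut self, content: Content) -> Option<Vec<u8>> {
        match content {
            Content::Application(plaintext) => Some(plaintext),
            Content::StagedCommit => {
                // Merging a commit advances the epoch.
                self.epoch += 1;
                // Assumption: proposals from the old epoch no longer apply.
                self.pending_proposals.clear();
                None
            }
            Content::Proposal(p) | Content::ExternalJoinProposal(p) => {
                // Kept for inclusion in a later Commit.
                self.pending_proposals.push(p);
                None
            }
        }
    }
}
```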

The StoreCrypto backend

The StoreCrypto struct (in quicprochat-core/src/keystore.rs) implements OpenMlsCryptoProvider, which openmls requires for all cryptographic operations:

pub struct StoreCrypto {
    crypto: RustCrypto,
    key_store: DiskKeyStore,
}

It couples two things:

  1. RustCrypto: The openmls_rust_crypto crate's implementation of MLS cryptographic primitives (HPKE, AEAD, hashing, signing). It implements the OpenMlsCrypto and OpenMlsRand traits, so it serves as both the CryptoProvider and the RandProvider of the OpenMlsCryptoProvider implementation.

  2. DiskKeyStore: A key-value store that maps opaque byte keys to serialised MLS entities (HPKE private keys, epoch secrets, etc.). This is the critical piece -- openmls stores HPKE init private keys here during KeyPackage::builder().build() and retrieves them during MlsGroup::new_from_welcome().

Why the backend must persist

This is the most important implementation detail in the entire MLS layer:

When generate_key_package() is called, openmls generates an HPKE init keypair and stores the private key in the DiskKeyStore under a reference derived from the init public key. When join_group() is later called with a Welcome message, new_from_welcome() decrypts the Welcome using that stored private key.

If the DiskKeyStore is lost between these two calls, the Welcome cannot be decrypted.

This means:

  • For ephemeral usage (tests, demos), DiskKeyStore::ephemeral() (in-memory HashMap) works as long as the same GroupMember instance is used throughout.
  • For persistent usage (real clients), DiskKeyStore::persistent(path) must be used. It serialises the HashMap to disk via bincode on every store and delete operation.

DiskKeyStore implementation

pub struct DiskKeyStore {
    path: Option<PathBuf>,
    values: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
}

  • Ephemeral mode (path = None): Pure in-memory. Fast but not restart-safe.
  • Persistent mode (path = Some(path)): Flushes the entire HashMap to disk on every mutation. This is simple but not optimised -- a production system would use an append-only log or embedded database.

The OpenMlsKeyStore trait implementation:

  • store(): Serialises the value via serde_json, inserts into the HashMap, then flushes to disk.
  • read(): Deserialises from the HashMap via serde_json.
  • delete(): Removes from the HashMap, then flushes to disk.
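
The flush-on-every-mutation behaviour can be sketched with the standard library alone. This is a toy under stated assumptions: `ToyDiskStore` is hypothetical, it skips the RwLock and the OpenMlsKeyStore trait, it treats an unreadable file as an empty store, and its length-prefixed file layout is invented for the example (the real store uses bincode and serde_json as described above).

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Toy store that rewrites its whole map to disk after every mutation.
struct ToyDiskStore {
    path: Option<PathBuf>,
    values: HashMap<Vec<u8>, Vec<u8>>,
}

impl ToyDiskStore {
    fn ephemeral() -> Self {
        Self { path: None, values: HashMap::new() }
    }

    /// Load an existing file if it can be read, otherwise start empty.
    fn persistent(path: PathBuf) -> io::Result<Self> {
        let mut values = HashMap::new();
        if let Ok(bytes) = fs::read(&path) {
            let mut rest = bytes.as_slice();
            while !rest.is_empty() {
                let key = read_chunk(&mut rest)?;
                let value = read_chunk(&mut rest)?;
                values.insert(key, value);
            }
        }
        Ok(Self { path: Some(path), values })
    }

    fn store(&mut self, key: &[u8], value: &[u8]) -> io::Result<()> {
        self.values.insert(key.to_vec(), value.to_vec());
        self.flush()
    }

    fn read(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.values.get(key)
    }

    fn delete(&mut self, key: &[u8]) -> io::Result<()> {
        self.values.remove(key);
        self.flush()
    }

    /// Rewrite the whole map; a no-op in ephemeral mode.
    fn flush(&self) -> io::Result<()> {
        let Some(path) = &self.path else { return Ok(()) };
        let mut out = Vec::new();
        for (k, v) in &self.values {
            for chunk in [k, v] {
                out.extend_from_slice(&(chunk.len() as u32).to_le_bytes());
                out.extend_from_slice(chunk);
            }
        }
        fs::write(path, out)
    }
}

/// Read one u32-length-prefixed chunk and advance the slice past it.
fn read_chunk(rest: &mut &[u8]) -> io::Result<Vec<u8>> {
    let bad = || io::Error::new(io::ErrorKind::InvalidData, "truncated store file");
    let cur: &[u8] = *rest;
    if cur.len() < 4 {
        return Err(bad());
    }
    let len = u32::from_le_bytes([cur[0], cur[1], cur[2], cur[3]]) as usize;
    let body = &cur[4..];
    if body.len() < len {
        return Err(bad());
    }
    *rest = &body[len..];
    Ok(body[..len].to_vec())
}
```

Reopening the same path returns the stored entries, while a fresh ephemeral store starts empty. Rewriting the whole file on each mutation is what makes this simple and also what makes it unsuitable for large stores.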

openmls 0.5 API gotchas

Several openmls 0.5 API patterns are non-obvious and worth documenting:

KeyPackageIn vs KeyPackage

openmls 0.5 separates untrusted wire types (*In suffix) from validated types. KeyPackage only derives TlsSerialize; KeyPackageIn derives TlsDeserialize. To go from bytes to a trusted KeyPackage:

KeyPackageIn::tls_deserialize(&mut bytes.as_ref())?
    .validate(backend.crypto(), ProtocolVersion::Mls10)?
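
The untrusted-then-validated split is a general Rust pattern that can be shown without openmls. In the sketch below every type is hypothetical, and an XOR checksum stands in for the real Ed25519 signature check purely to keep the example dependency-free.

```rust
/// Untrusted wire form: anything parsed from the network.
struct PackageIn {
    payload: Vec<u8>,
    checksum: u8,
}

/// Validated form. In this sketch it is only ever created by
/// PackageIn::validate().
#[derive(Debug, PartialEq)]
struct Package {
    payload: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum PackageError {
    Empty,
    BadChecksum,
}

impl PackageIn {
    /// Parsing only splits the bytes (payload, then one checksum byte);
    /// it does not establish trust.
    fn deserialize(bytes: &[u8]) -> Result<Self, PackageError> {
        let (checksum, payload) = bytes.split_last().ok_or(PackageError::Empty)?;
        Ok(Self { payload: payload.to_vec(), checksum: *checksum })
    }

    /// Consumes the untrusted value and returns the trusted one on success.
    /// The XOR checksum is a placeholder for signature verification.
    fn validate(self) -> Result<Package, PackageError> {
        let expected = self.payload.iter().fold(0u8, |acc, b| acc ^ b);
        if expected == self.checksum {
            Ok(Package { payload: self.payload })
        } else {
            Err(PackageError::BadChecksum)
        }
    }
}
```

Because `validate` takes `self` by value, code that needs a `Package` cannot accidentally keep using the unchecked `PackageIn`.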

Feature-gated methods

Several convenient methods (into_welcome(), into_protocol_message()) are feature-gated behind openmls feature flags that quicprochat does not enable. The workaround is to use msg_in.extract() and pattern-match on the MlsMessageInBody enum variants.
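
The workaround is ordinary enum matching, shown here with stand-in types. `Body` and `Protocol` are hypothetical mirrors of the message body variants named above, and `String` stands in for the crate's error type.

```rust
/// Toy message body enum with more variants than any one caller wants.
enum Body {
    Welcome(Vec<u8>),
    PrivateMessage(Vec<u8>),
    PublicMessage(Vec<u8>),
    KeyPackage(Vec<u8>),
}

#[derive(Debug, PartialEq)]
enum Protocol {
    Private(Vec<u8>),
    Public(Vec<u8>),
}

/// Hand-written equivalent of a Welcome-only conversion.
fn into_welcome(body: Body) -> Result<Vec<u8>, String> {
    match body {
        Body::Welcome(w) => Ok(w),
        _ => Err("expected a Welcome message".to_string()),
    }
}

/// Hand-written equivalent of a protocol-message conversion.
fn into_protocol(body: Body) -> Result<Protocol, String> {
    match body {
        Body::PrivateMessage(m) => Ok(Protocol::Private(m)),
        Body::PublicMessage(m) => Ok(Protocol::Public(m)),
        _ => Err("not a protocol message".to_string()),
    }
}
```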

MlsGroup and Send

MlsGroup holds internal state that may not be Send depending on the crypto backend. In quicprochat, StoreCrypto uses RwLock (which is Send + Sync), so GroupMember is Send. However, all MLS operations must use the same backend instance, so GroupMember should not be cloned across tasks.

Ratchet tree embedding

The ratchet tree is embedded in Welcome messages via the use_ratchet_tree_extension(true) configuration. This means:

  1. When add_member() creates a Welcome, the full ratchet tree is included as a GroupInfo extension.
  2. When join_group() calls new_from_welcome() with ratchet_tree = None, openmls extracts the tree from the extension automatically.

The trade-off:

  • Pro: No need for a separate tree distribution service or additional round-trips.
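  • Con: Welcome messages grow with the group size. A tree with n leaves has about 2n - 1 nodes, so growth is roughly linear in n and approaches O(n log n) only when many parent nodes carry long unmerged-leaf lists.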

For quicprochat's target group sizes (2-100 members), this trade-off is acceptable.
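
A back-of-envelope estimate helps judge that trade-off. The sketch below assumes a full binary tree with exactly one leaf per member and no blank nodes; the function names are hypothetical, and the byte counts are caller-supplied placeholders, not measurements.

```rust
/// Nodes in a full binary tree with `members` leaves:
/// n leaves plus n - 1 parent nodes.
fn tree_nodes(members: u64) -> u64 {
    if members == 0 {
        0
    } else {
        2 * members - 1
    }
}

/// Very rough Welcome size: fixed overhead plus a per-node cost.
/// Both byte counts are placeholders chosen by the caller.
fn estimate_welcome_bytes(members: u64, overhead: u64, bytes_per_node: u64) -> u64 {
    overhead + tree_nodes(members) * bytes_per_node
}
```

With a placeholder of 100 bytes per node and 500 bytes of overhead, a 100-member group gives `estimate_welcome_bytes(100, 500, 100)` = 20400 bytes.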

Wire format

All MLS messages are serialised using TLS presentation language encoding (tls_codec). The TLS-encoded byte vectors are what the transport layer (QUIC + TLS 1.3) and the Delivery Service see. The DS routes these blobs without parsing them.

The key wire message types:

| MLS Type | Envelope MsgType | Direction |
|---|---|---|
| KeyPackage | keyPackageUpload | Client -> AS |
| Welcome | mlsWelcome | Inviter -> DS -> Joinee |
| Commit (PublicMessage) | mlsCommit | Committer -> DS -> Members |
| Application (PrivateMessage) | mlsApplication | Sender -> DS -> Recipient |
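
Routing opaque blobs without parsing them can be pictured as per-recipient queues. This is a toy model only: `ToyDeliveryService` and its methods are hypothetical and say nothing about how the real Delivery Service stores or authenticates messages.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy Delivery Service: per-recipient FIFO queues of opaque blobs,
/// keyed by a 32-byte public key. It never inspects the bytes it stores.
#[derive(Default)]
struct ToyDeliveryService {
    queues: HashMap<[u8; 32], VecDeque<Vec<u8>>>,
}

impl ToyDeliveryService {
    /// Queue a blob for one recipient.
    fn enqueue(&mut self, recipient: [u8; 32], blob: Vec<u8>) {
        self.queues.entry(recipient).or_default().push_back(blob);
    }

    /// Drain everything queued for `recipient`, oldest first.
    fn fetch(&mut self, recipient: &[u8; 32]) -> Vec<Vec<u8>> {
        self.queues
            .remove(recipient)
            .map(Vec::from)
            .unwrap_or_default()
    }
}
```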

Example: two-party round-trip

The following sequence shows a complete Alice-and-Bob scenario, matching the two_party_mls_round_trip test in group.rs:

1. Alice = GroupMember::new(alice_identity)
2. Bob   = GroupMember::new(bob_identity)

3. bob_kp = Bob.generate_key_package()
   → Bob's backend now holds the HPKE init private key

4. Alice.create_group(b"test-group")
   → Alice is sole member at epoch 0

5. (commit, welcome) = Alice.add_member(&bob_kp)
   → Alice's epoch advances to 1
   → commit is for existing members (Alice already merged it)
   → welcome is for Bob

6. Bob.join_group(&welcome)
   → Bob's backend retrieves the HPKE init key to decrypt the Welcome
   → Bob is now at the same epoch as Alice

7. ct = Alice.send_message(b"hello bob")
   → MLS PrivateMessage encrypted under the group key

8. pt = Bob.receive_message(&ct)
   → pt == Some(b"hello bob")

9. ct = Bob.send_message(b"hello alice")
10. pt = Alice.receive_message(&ct)
    → pt == Some(b"hello alice")

Credential model

quicprochat uses MLS Basic credentials. The credential body is the raw Ed25519 public key bytes (32 bytes), and the signature_key is the same public key:

let credential = Credential::new(
    self.identity.public_key_bytes().to_vec(),
    CredentialType::Basic,
)?;

CredentialWithKey {
    credential,
    signature_key: self.identity.public_key_bytes().to_vec().into(),
}

This means the MLS identity is the Ed25519 key. There is no X.509 certificate chain or other PKI. The trust model is:

  • Peers trust identity keys obtained out-of-band (e.g., verified via QR code, secure channel, or TOFU).
  • The Authentication Service stores KeyPackages indexed by Ed25519 public key.
  • The Delivery Service routes by Ed25519 public key.

A future milestone may introduce X.509 credentials for integration with external PKI.
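
The TOFU option in the trust model can be sketched as a pin store. Everything here is hypothetical (`PinStore`, `TrustDecision`, keying by a display name); it only illustrates the idea of pinning an identity key on first contact and flagging later changes.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum TrustDecision {
    FirstUse,
    Match,
    Mismatch,
}

/// Toy trust-on-first-use pin store keyed by a peer's display name.
#[derive(Default)]
struct PinStore {
    pins: HashMap<String, [u8; 32]>,
}

impl PinStore {
    /// Pin the key the first time a peer is seen; afterwards, compare.
    fn check(&mut self, peer: &str, identity_key: [u8; 32]) -> TrustDecision {
        match self.pins.entry(peer.to_string()) {
            Entry::Vacant(slot) => {
                slot.insert(identity_key);
                TrustDecision::FirstUse
            }
            Entry::Occupied(slot) if *slot.get() == identity_key => TrustDecision::Match,
            // A changed key is reported, never silently re-pinned.
            Entry::Occupied(_) => TrustDecision::Mismatch,
        }
    }
}
```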

Further reading