Improve Maildir performance for large messages #479
Conversation
resp, err := client.Get(endpoint(cfg.HTTPURL, "/api/v3/messages?limit=1"))
if err == nil {
	resp.Body.Close()
	if resp.StatusCode >= 200 && resp.StatusCode < 500 {
The readiness check passes on any status < 500, including 404 or 403, which could mean the server is up but the API path doesn't exist yet. Change the condition to only accept 2xx.
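A minimal sketch of the stricter check this comment asks for. The helper names `readyStatus` and `isReady` are hypothetical, and the `httptest` server only stands in for MailHog; the PR's actual readiness loop is not shown here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// readyStatus reports whether a status code is in the 2xx range.
// Hypothetical helper, not part of the PR.
func readyStatus(code int) bool {
	return code >= 200 && code < 300
}

// isReady issues one GET and accepts only a 2xx answer, so a 403 or 404
// from a half-started server no longer counts as ready.
func isReady(client *http.Client, url string) bool {
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return readyStatus(resp.StatusCode)
}

func main() {
	// Stand-in server: one known path answers 200, everything else 404.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api/v3/messages" {
			w.WriteHeader(http.StatusOK)
			return
		}
		http.NotFound(w, r)
	}))
	defer srv.Close()

	fmt.Println(isReady(srv.Client(), srv.URL+"/api/v3/messages?limit=1"))
	fmt.Println(isReady(srv.Client(), srv.URL+"/not-registered-yet"))
}
```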
b.WriteString("Content-Transfer-Encoding: base64\r\n")
b.WriteString("Content-Disposition: attachment; filename=\"large.bin\"\r\n")
b.WriteString("\r\n")
// This allocates the full raw payload (~8 MiB) and then a separate base64-encoded
// slice (~11 MiB) per message. Since the raw data is just one repeated byte, we don't
// need to materialise it at all — a repeatReader + io.LimitedReader + base64.NewEncoder
// streaming into a lineWrapper gets this down to ~32 KiB regardless of attachment size.
raw := bytes.Repeat([]byte{byte('A' + index%26)}, attachmentBytes)
encoded := make([]byte, base64.StdEncoding.EncodedLen(len(raw)))
base64.StdEncoding.Encode(encoded, raw)
writeWrappedBase64(&b, encoded)
return checkDownload(client, cfg, smallID)
}},
}
counts and firstErr share the same mutex mu, but they have different
access patterns — firstErr is checked on every tick in the main loop
(under mu.Lock), while counts is only read after wg.Wait().
Consider a separate mutex or atomic for firstErr to avoid the main
loop contending with workers on every tick.
Summary
This PR improves Maildir-backed MailHog performance for large messages and long-running instances.
The main change is to keep the existing /api/v1 and /api/v2 contracts compatible, while adding optimized /api/v3 endpoints for the web UI and high-volume polling use cases:
- /api/v1 and /api/v2 continue returning full legacy message payloads for API clients.
- /api/v3/messages and /api/v3/search return compact metadata only.
- /api/v3/messages/{id} returns a bounded preview for large messages.
- /api/v3/messages/{id}/body returns body preview chunks for progressive UI loading.
- /api/v3/messages/{id}/download streams the original message without loading the whole message into memory.
- /api/v3/messages?older_than=... supports deleting messages older than a relative age.
- /api/v3/messages?created_before=... supports deleting messages created before an absolute cutoff.
The web UI now uses /api/v3 for list/search/websocket/preview/download paths, so large attachments do not inflate list/search responses.
Why
Polling workloads that wait for test emails can generate many repeated list/search requests. With large messages or attachments in Maildir, the previous behavior could repeatedly read and serialize full message bodies and MIME payloads for list/search responses. That makes responses much larger than the data needed for polling and puts pressure on memory in long-running containers.
What Changed
- go install github.com/mailhog/MailHog@latest.
- DELETE /api/v3/messages?older_than=1h
- DELETE /api/v3/messages?older_than=1d
- DELETE /api/v3/messages?older_than=1w
- DELETE /api/v3/messages?created_before=2026-05-22T12:34:56Z
- DELETE /api/v3/messages?created_before=1700000000000
- master and v* tags
Before And After Examples
Before: list/search could include large raw payloads
Legacy list/search responses could include full message content that is not needed for polling:
{
  "total": 50,
  "count": 50,
  "items": [
    {
      "ID": "message-id",
      "Raw": {
        "From": "...",
        "To": ["..."],
        "Data": "full SMTP DATA including large MIME attachments"
      },
      "Content": {
        "Headers": { "Subject": ["Large attachment"] },
        "Body": "full body / MIME payload",
        "Size": 104857600
      },
      "MIME": { "Parts": ["large parsed MIME tree"] }
    }
  ]
}
That shape is still available from /api/v1 and /api/v2 for compatibility.
After: compact v3 list/search for polling and UI
{
  "total": 23,
  "count": 23,
  "start": 0,
  "items": [
    {
      "ID": "message-id",
      "From": { "Mailbox": "sender", "Domain": "example.com" },
      "To": [ { "Mailbox": "user", "Domain": "example.com" } ],
      "Created": "2026-05-22T12:34:56Z",
      "Content": {
        "Headers": { "Subject": ["Large attachment"] },
        "Size": 104857600
      }
    }
  ]
}
The v3 compact contract intentionally omits:
- Raw
- Content.Body
- MIME
After: progressive body preview
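A client might page through the body preview endpoint roughly as sketched here, using the NextOffset and HasMore fields of the response shape in this section. The `offset` and `limit` query parameter names are assumptions (the description only shows the response fields), and `fetchPreview`, `bodyChunk`, and `fakeServer` are hypothetical names. The stand-in server only mimics the chunking, not MailHog itself.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
	"strings"
)

// bodyChunk holds the response fields the paging loop needs.
type bodyChunk struct {
	Content struct {
		Body string
	}
	NextOffset int
	HasMore    bool
}

// fetchPreview follows NextOffset until HasMore is false.
// The "offset" and "limit" query parameter names are assumptions.
func fetchPreview(client *http.Client, base, id string, limit int) (string, error) {
	var sb strings.Builder
	offset := 0
	for {
		q := url.Values{"offset": {strconv.Itoa(offset)}, "limit": {strconv.Itoa(limit)}}
		resp, err := client.Get(base + "/api/v3/messages/" + url.PathEscape(id) + "/body?" + q.Encode())
		if err != nil {
			return "", err
		}
		if resp.StatusCode != http.StatusOK {
			resp.Body.Close()
			return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
		var chunk bodyChunk
		err = json.NewDecoder(resp.Body).Decode(&chunk)
		resp.Body.Close()
		if err != nil {
			return "", err
		}
		sb.WriteString(chunk.Content.Body)
		// Stop at the end, or if the server fails to advance the offset.
		if !chunk.HasMore || chunk.NextOffset <= offset {
			return sb.String(), nil
		}
		offset = chunk.NextOffset
	}
}

// fakeServer stands in for MailHog and serves slices of a fixed body.
func fakeServer(body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		off, _ := strconv.Atoi(r.URL.Query().Get("offset"))
		lim, _ := strconv.Atoi(r.URL.Query().Get("limit"))
		if lim < 1 {
			lim = 1
		}
		if off < 0 || off > len(body) {
			off = len(body)
		}
		end := off + lim
		if end > len(body) {
			end = len(body)
		}
		var c bodyChunk
		c.Content.Body = body[off:end]
		c.NextOffset = end
		c.HasMore = end < len(body)
		json.NewEncoder(w).Encode(c)
	}))
}

func main() {
	srv := fakeServer("hello, progressive preview")
	defer srv.Close()
	got, err := fetchPreview(srv.Client(), srv.URL, "message-id", 8)
	fmt.Println(got, err)
}
```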
{
  "ID": "message-id",
  "Content": {
    "Headers": { "Content-Type": ["text/html; charset=utf-8"] },
    "Body": "<preview chunk up to the requested limit>",
    "Size": 12582912
  },
  "Offset": 0,
  "NextOffset": 1048576,
  "Limit": 1048576,
  "MaxSize": 10485760,
  "HasMore": true,
  "Truncated": false,
  "Source": "mime"
}
After: streaming download
The response is streamed as message/rfc822 from storage, without loading the full message into memory.
After: deletion filters
Relative age:
Absolute cutoff:
older_than intentionally does not accept timestamps; timestamp-like values belong in created_before.
Compatibility Notes
- /api/v1/messages behavior is preserved.
- /api/v2/messages, /api/v2/search, and /api/v2/websocket continue returning full legacy messages.
- /api/v3.
- /api/v3.
Docker Image Naming
The workflow publishes to:
For the upstream repository this resolves to:
It also publishes branch/ref, tag, latest for master, and sha-<commit> tags.
Local Action Log
Implemented and split into commits:
Key implementation steps:
Validation Log
GOPATH and unit test validation:
Docker build:
600MB memory-limit performance smoke:
Smoke output:
Observed memory sample during smoke:
Also checked:
Both passed with no reported issues.