Fang- commented Jan 23, 2026

Adds a compression handler, configured to support gzip and brotli for responses of non-trivial size.

h2o itself takes care of applying compression only to content types where this makes sense. For now, we're deferring to h2o's defaults, which are essentially a whitelist containing the most common text-based MIME types.

We lightly adjust the h2o_add_header_by_str call so that h2o tokenizes our headers appropriately, letting it detect the content type.
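
As a rough mental model of the behaviour described above (not h2o's actual code), the decision boils down to a size threshold plus a content-type whitelist. Everything in this sketch is an assumption made for illustration: the `should_compress` helper, the `MIN_COMPRESS_SIZE` value, and the short type list, which is only a sample of the kind of text-based types such a whitelist would hold.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample of text-based types; the real default list is longer. */
static const char *const compressible_types[] = {
    "text/html", "text/css", "text/plain",
    "application/json", "application/javascript",
};

/* Hypothetical threshold standing in for "non-trivial size". */
enum { MIN_COMPRESS_SIZE = 100 };

/* Returns 1 if the type is whitelisted, ignoring any "; charset=..." suffix. */
static int is_compressible_type(const char *content_type)
{
    size_t len = strcspn(content_type, "; ");
    size_t count = sizeof(compressible_types) / sizeof(compressible_types[0]);
    for (size_t i = 0; i < count; ++i) {
        if (strlen(compressible_types[i]) == len &&
            memcmp(compressible_types[i], content_type, len) == 0)
            return 1;
    }
    return 0;
}

/* Compress only when the type is known, whitelisted, and the body is big enough. */
static int should_compress(const char *content_type, size_t body_len)
{
    if (content_type == NULL)
        return 0; /* no detectable content type: leave the response as-is */
    if (body_len < MIN_COMPRESS_SIZE)
        return 0; /* too small to be worth the overhead */
    return is_compressible_type(content_type);
}
```

Under these assumptions, a ~23 kB `application/json` scry result qualifies, while an image or a ten-byte text response is passed through untouched.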

With this change, we see bytes-over-the-wire reduced as expected.
A cache refresh of the groups web client on a ship with very little content (so, primarily .js file downloads) goes down from ~10 MB to ~3 MB transferred (~30% of the original size).
Filling a channel with small posts by the same author gives a "recent posts" JSON scry result of ~23 kB, but only uses ~3 kB over the wire (~13% of the original size!).

Happy to bikeshed the compression configuration here. I just picked values that seemed fairly middle-of-the-road.

Fang- added the "help wanted" label Jan 23, 2026
pkova commented Jan 23, 2026

I stepped through this codepath to see what's going on.

The operative branch for this scenario is right here:

    if ((content_type_index = h2o_find_header(&req->res.headers, H2O_TOKEN_CONTENT_TYPE, -1)) != -1 &&
        (mime = h2o_mimemap_get_type_by_mimetype(req->pathconf->mimemap, req->res.headers.entries[content_type_index].value, 0)) !=
            NULL)
        req->res.mime_attr = &mime->data.attr;
    else
        req->res.mime_attr = &h2o_mime_attributes_as_is;

Notably h2o_find_header does a pointer comparison to find the header!

    ssize_t h2o_find_header(const h2o_headers_t *headers, const h2o_token_t *token, ssize_t cursor)
    {
        for (++cursor; cursor < headers->size; ++cursor) {
            if (headers->entries[cursor].name == &token->buf) {
                return cursor;
            }
        }
        return -1;
    }

So even though we have the right header for h2o by value, h2o_find_header never matches it, we fall through to the h2o_mime_attributes_as_is branch, and that disables all compression. There's another function right below it, h2o_find_header_by_str, that does what we want. Too bad we can't choose...
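
To make the difference concrete, here is a stripped-down model of the two lookups. The types and names (`name_buf_t`, `token_t`, `find_by_token`, `find_by_str`, `TOKEN_CONTENT_TYPE`) are simplified stand-ins invented for this sketch, not h2o's real definitions; only the comparison strategy mirrors the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for h2o's types; names are illustrative only. */
typedef struct { const char *base; size_t len; } name_buf_t;
typedef struct { name_buf_t buf; } token_t;
typedef struct { const name_buf_t *name; const char *value; } header_t;

/* One well-known token, standing in for H2O_TOKEN_CONTENT_TYPE. */
static const token_t TOKEN_CONTENT_TYPE = {{"content-type", 12}};

/* Identity lookup: matches only entries pointing at the token's own buffer. */
static long find_by_token(const header_t *entries, size_t size, const token_t *token)
{
    for (size_t i = 0; i < size; ++i) {
        if (entries[i].name == &token->buf)
            return (long)i;
    }
    return -1;
}

/* Value lookup: matches any entry whose name bytes are equal. */
static long find_by_str(const header_t *entries, size_t size, const char *name, size_t len)
{
    for (size_t i = 0; i < size; ++i) {
        if (entries[i].name->len == len && memcmp(entries[i].name->base, name, len) == 0)
            return (long)i;
    }
    return -1;
}
```

A header whose name is a separate copy of the bytes "content-type" is found by `find_by_str` but missed by `find_by_token`; only a header whose name pointer is `&TOKEN_CONTENT_TYPE.buf` satisfies the identity check.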

But wait!

The library will do the string interning for us if we let it. All it takes is a one-character change right here, setting the maybe_token argument to 1:

      h2o_add_header_by_str(&rec_u->pool, &rec_u->res.headers,
                            hed_u->nam_c, hed_u->nam_w, 1, // <- THIS ONE RIGHT HERE (maybe_token)
                            0, hed_u->val_c, hed_u->val_w);
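
Roughly, the flag decides whether the header name gets swapped for the library's canonical token before it is stored. The sketch below models that idea with made-up types and a two-entry token table; `add_header_by_str`, `lookup_token`, `TOKENS`, and the fixed-capacity `headers_t` are all assumptions for illustration and much simpler than the real thing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for h2o's types; names are illustrative only. */
typedef struct { const char *base; size_t len; } name_buf_t;
typedef struct { name_buf_t buf; } token_t;
typedef struct { const name_buf_t *name; const char *value; } header_t;

enum { MAX_HEADERS = 16 };
typedef struct {
    header_t entries[MAX_HEADERS];
    name_buf_t owned[MAX_HEADERS]; /* storage for names that are not tokens */
    size_t size;
} headers_t;

/* Tiny hypothetical token table; the real one covers many header names. */
static const token_t TOKENS[] = {
    {{"content-length", 14}},
    {{"content-type", 12}},
};

/* Returns the canonical token for a lowercase name, or NULL if unknown. */
static const token_t *lookup_token(const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof(TOKENS) / sizeof(TOKENS[0]); ++i) {
        if (TOKENS[i].buf.len == len && memcmp(TOKENS[i].buf.base, name, len) == 0)
            return &TOKENS[i];
    }
    return NULL;
}

/* With maybe_token set, known names are stored as pointers to the shared token. */
static size_t add_header_by_str(headers_t *headers, const char *name, size_t len,
                                int maybe_token, const char *value)
{
    assert(headers->size < MAX_HEADERS);
    size_t i = headers->size++;
    const token_t *token = maybe_token ? lookup_token(name, len) : NULL;
    if (token != NULL) {
        headers->entries[i].name = &token->buf; /* interned: identity checks succeed */
    } else {
        headers->owned[i].base = name;          /* private name: matches by value only */
        headers->owned[i].len = len;
        headers->entries[i].name = &headers->owned[i];
    }
    headers->entries[i].value = value;
    return i;
}
```

In this model, adding "content-type" with the flag at 0 stores a private name that a pointer comparison will never match, while the same call with the flag at 1 stores the shared token, which is what lets the content type be detected.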

Fang- changed the title from "http: force-enable h2o's response compression" to "http: enable h2o's response compression" Jan 23, 2026
Fang- marked this pull request as ready for review January 23, 2026 19:06
Fang- requested a review from a team as a code owner January 23, 2026 19:06
Fang- commented Jan 23, 2026

Brilliant, thank you @pkova. I took the liberty of force-pushing a cleaner diff so I don't dirty the blame needlessly.

Passes smoke tests. Haven't tried anything crazy, but this should now restrict itself to h2o's text type defaults.
