Geodocs.dev

Brotli vs Gzip Compression for AI Crawlers

Brotli (br) is a Google-developed compression algorithm that produces 14-20% smaller text payloads than Gzip and is supported by all modern browsers and major AI crawlers via standard Accept-Encoding negotiation. The right pattern is not Brotli vs Gzip; it is Brotli with Gzip fallback, negotiated per request. CPU cost differs sharply by compression level: Brotli at level 4 is comparable to Gzip level 8; Brotli at level 11 is far more expensive but achievable as static pre-compression.

TL;DR

Serve Brotli to any client that advertises br in Accept-Encoding; fall back to Gzip for clients that do not. Use Brotli level 4 for dynamic compression and level 11 for static pre-compression. AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Bingbot, Googlebot) honor standard Accept-Encoding negotiation; you do not need crawler-specific compression rules. Cloudflare's measured rollout reduced HTML bandwidth by ~53% after switching from Gzip to Brotli at the edge.

At a glance

Dimension | Gzip | Brotli
Algorithm | DEFLATE (LZ77 + Huffman) | LZ77 + Huffman + 120 KB static dictionary
Compression ratio (HTML/CSS/JS) | Baseline | 14-20% better for JS, ~30% better for CSS
Compression speed | Fast at low/medium levels | Slower at high levels
Decompression speed | Fast | Comparable to Gzip
Browser support | ~100% | ~96.3% globally (caniuse)
AI crawler support | Universal | Yes, via standard Accept-Encoding
Encoding token | gzip | br
HTTPS required | No | Yes (browsers only request br over HTTPS)
Practical level for dynamic | 6 | 4
Practical level for static pre-compression | 9 | 11
File size minimum to compress | 48 bytes (Cloudflare) | 50 bytes (Cloudflare)

Compression ratio benchmarks

Real-world measurements consistently put Brotli ahead of Gzip on text content:

  • Akamai broad benchmark: Brotli median 82% reduction vs Gzip 78% across mixed assets, with larger margins on HTML/CSS/JS individually.
  • Cloudflare CSS test: Gzip level 8 = 59.21% reduction; Brotli level 4 = 59.58%; Brotli level 11 = 66.94%.
  • Cloudflare full rollout (Deco CMS report): Brotli at the edge cut HTML transfer ~53% vs Gzip across one production day.
  • Static-site benchmarks (DEV community, DoHost): Brotli typically 14-20% smaller for JavaScript, up to 30% smaller for CSS.

The gap widens as compression level increases, but Brotli compression CPU cost grows non-linearly above level 6.

CPU cost reality

Brotli's compression speed at level 11 (max) is ~30-60x slower than at level 4. Decompression speed is roughly equivalent to Gzip's at every level, so the asymmetric cost lives entirely on the compressing server.

Practical rules:

  • Dynamic responses (HTML, JSON): Use Brotli level 4. CPU cost approximates Gzip level 6-8 with similar or smaller output.
  • Static assets (precompressed CSS/JS): Use Brotli level 11 at build time, store the .br artifact alongside the source, and serve via Nginx brotli_static on or equivalent.
  • Mixed origin / CDN: Most CDNs (Cloudflare, Fastly, Akamai, AWS CloudFront) re-compress responses at the edge. If the CDN serves Brotli-from-origin, level 11 origin compression is preserved end to end.
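
The static pre-compression rule above can be sketched as a small build step. This is an illustrative Python sketch, not from the article: the `brotli` package is a third-party assumption (pip install Brotli), gzip is stdlib, and the .br artifact is simply skipped if the package is absent.

```python
import gzip
from pathlib import Path

try:
    import brotli  # third-party assumption: pip install Brotli
except ImportError:
    brotli = None  # fall back to shipping only the .gz artifact

def precompress(path: Path) -> None:
    """Write path.gz (Gzip level 9) and path.br (Brotli quality 11) next to the source."""
    data = path.read_bytes()
    # Gzip at its maximum static level for the fallback artifact.
    path.with_suffix(path.suffix + ".gz").write_bytes(
        gzip.compress(data, compresslevel=9))
    if brotli is not None:
        # Quality 11 is too slow for dynamic responses but fine as a one-time build cost.
        path.with_suffix(path.suffix + ".br").write_bytes(
            brotli.compress(data, quality=11))
```

A server configured with brotli_static on / gzip_static on then picks the matching artifact per request.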

Accept-Encoding negotiation

The whole comparison resolves at request time via the standard Accept-Encoding header. The client lists supported encodings; the server picks one and sends Content-Encoding to confirm.

Typical AI crawler request:

GET /article HTTP/2
Host: example.com
User-Agent: GPTBot/1.0 (+https://openai.com/gptbot)
Accept-Encoding: gzip, br
Accept: text/html,*/*

Server response when Brotli is selected:

HTTP/2 200
Content-Type: text/html; charset=utf-8
Content-Encoding: br
Vary: Accept-Encoding
Content-Length: 12453

Key rules:

  • The Vary: Accept-Encoding header is mandatory whenever you serve different bodies based on Accept-Encoding. Without it, intermediate caches will serve the wrong encoding.
  • Accept-Encoding ordering does not imply preference; the server chooses. Most servers prefer br when present.
  • If the bot advertises only gzip, send Gzip. Never force Brotli on a bot that did not request it.
  • Some buggy crawlers and proxies advertise br they cannot actually decode. In production logs, this appears as Brotli responses being followed by retry requests with Accept-Encoding: gzip. Log and downgrade these clients if observed.

AI crawler-specific behavior

Most major AI crawlers honor standard Accept-Encoding negotiation. Documented patterns:

  • GPTBot, OAI-SearchBot, ChatGPT-User: Advertise gzip, br in Accept-Encoding. OpenAI does not publish an explicit compression spec but has not been observed rejecting Brotli.
  • ClaudeBot, Claude-User: Honor Accept-Encoding negotiation; Brotli served correctly in production logs.
  • PerplexityBot, Perplexity-User: Honor Accept-Encoding negotiation.
  • Googlebot: Officially supports both gzip and Brotli; advertises both.
  • Bingbot: Supports both; documents .gz sitemap compression explicitly.

No major AI crawler requires Gzip-only or Brotli-only. The negotiation is automatic.

Decision matrix

Situation | Recommended encoding
Static HTML, CSS, JS in CDN | Brotli level 11 pre-compression + Gzip fallback
Dynamic HTML (server-rendered) | Brotli level 4 dynamic + Gzip level 6 fallback
API JSON responses to AI crawlers | Brotli level 4 + Gzip fallback
Sitemap.xml.gz | Gzip (file extension convention)
llms.txt / llms-full.txt | Brotli (AI crawlers benefit most)
Plain HTTP (no TLS) origin | Gzip only (browsers do not request br over HTTP)
Internal services without HTTPS | Gzip
Embedded / IoT clients with limited memory | Gzip (smaller decoder footprint)

CDN / server configuration patterns

Cloudflare

Brotli is on by default for all plans as of 2024. Use Compression Rules to enable Brotli-from-origin (level 11 preserved end to end) on Pro plans and above.

Nginx

brotli on;
brotli_static on;       # serve precompressed .br files
brotli_comp_level 4;    # dynamic compression level
brotli_types text/html text/css application/javascript application/json image/svg+xml;

gzip on;
gzip_static on;
gzip_comp_level 6;
gzip_types text/html text/css application/javascript application/json image/svg+xml;

The brotli_static on directive lets you precompress assets with brotli -q 11 file.css at build time and ship the resulting .br alongside the original. Nginx then serves the static .br whenever the client advertises br.

Vercel / Netlify

Both platforms apply Brotli by default for static assets and Gzip for dynamic. No configuration is required for AI crawler compatibility.

Apache

Use mod_brotli and mod_deflate. Order the AddOutputFilterByType directives so Brotli wins when both are accepted.

Common mistakes

  • Forgetting Vary: Accept-Encoding. Causes shared caches to serve the wrong encoding to the wrong client.
  • Compressing already-compressed assets. Do not compress JPEG/PNG/WebP/MP4/PDF/WOFF2 files. They are already compressed; recompression wastes CPU and can slightly bloat the output.
  • Brotli level 11 dynamic. Will saturate origin CPU under load. Reserve level 11 for build-time pre-compression.
  • Disabling Brotli on AI crawlers "to be safe". Modern AI crawlers handle Brotli correctly. Disabling it costs you ~15% bandwidth with zero benefit.
  • Mixing pre-compressed and dynamic on the same path. If /index.html exists as both .html and .html.br, ensure your server picks the static .br only when content matches.
  • Skipping HTTPS on the origin. Brotli is HTTPS-only by browser convention; even though servers can send br over HTTP, browsers will not request it. AI crawlers follow the same convention, since they crawl over TLS by default.

How to apply

  1. Audit your origin and CDN: confirm Brotli is enabled and Gzip remains as fallback.
  2. For static assets, add a build step that produces .br artifacts at level 11.
  3. Set Vary: Accept-Encoding on all compressible responses.
  4. Verify in CDN logs that AI crawler user-agents (GPTBot, ClaudeBot, PerplexityBot) receive Content-Encoding: br for HTML.
  5. Spot-check curl -H "Accept-Encoding: br, gzip" -I https://example.com/article for Content-Encoding: br.
  6. Monitor CPU: if origin compression CPU spikes, dial dynamic Brotli level from 4 to 3.
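
Steps 3-5 of the checklist can be folded into a small log-audit helper. This is an illustrative sketch (nothing here is a published API); it checks one logged request/response pair for the two most common misconfigurations:

```python
def audit(accept_encoding: str, response_headers: dict[str, str]) -> list[str]:
    """Flag compression misconfigurations for one logged response."""
    headers = {k.lower(): v for k, v in response_headers.items()}
    offered = {t.split(";")[0].strip().lower() for t in accept_encoding.split(",")}
    encoding = headers.get("content-encoding", "identity")
    issues = []
    # Step 4: a client that advertised br should receive Brotli.
    if "br" in offered and encoding != "br":
        issues.append(f"client offered br but got {encoding}")
    # Step 3: any compressed body must carry Vary: Accept-Encoding.
    if encoding != "identity" and "accept-encoding" not in headers.get("vary", "").lower():
        issues.append("compressed response missing Vary: Accept-Encoding")
    return issues
```

Run it over CDN log samples for GPTBot, ClaudeBot, and PerplexityBot user agents; an empty list means the response matched the expectations above.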

FAQ

Q: Do AI crawlers support Brotli?

Yes. GPTBot, ClaudeBot, PerplexityBot, Googlebot, and Bingbot all advertise br in Accept-Encoding and decode Brotli responses correctly. Standard negotiation is the only configuration needed.

Q: Should I disable Gzip if I have Brotli?

No. Always keep Gzip as a fallback. Roughly 4% of clients globally still cannot decode Brotli, and some legacy AI tooling (custom crawlers) may not advertise br.

Q: What Brotli level should I use?

Level 4 for dynamic responses, level 11 for static pre-compression. Levels above 4 sharply increase CPU cost but only marginally improve ratio for dynamic content.

Q: Does Brotli require HTTPS?

Browsers only request br over HTTPS by convention. AI crawlers follow the same pattern in practice. The protocol does not technically require TLS, but in practice TLS is the floor.

Q: Why does my CDN strip Content-Length when compressing?

Cloudflare and similar CDNs may omit Content-Length when applying dynamic transforms to avoid mismatches. To preserve the header, set cache-control: no-transform on the origin response.

Q: Should I use Zstandard instead of Brotli for AI crawlers?

Zstandard (zstd) compresses ~42% faster than Brotli with ~11% better ratio than Gzip, but browser and AI crawler support is below 50%. As of 2026, treat Zstandard as opt-in and ship Brotli as the default.

Sources

  • ioriver, "GZIP vs Brotli Compression Performance" — verified 2026-05-03 — supports Akamai 82% vs 78% benchmark. https://www.ioriver.io/blog/gzip-vs-brotli-compression-performance
  • DoHost, "Brotli vs. Gzip Benchmarking" (March 2026) — verified 2026-05-03 — supports 14-20% JS / ~30% CSS ratio gap. https://dohost.us/index.php/2026/03/16/brotli-vs-gzip-benchmarking-compression-ratios-for-javascript-and-css/
  • CRAN, "Text Compression in R: brotli, gzip, xz and bz2" — verified 2026-05-03 — supports decompression-speed parity. https://cran.r-project.org/web/packages/brotli/vignettes/benchmarks.html
  • caniuse.com, "Brotli Accept-Encoding/Content-Encoding" — verified 2026-05-03 — supports 96.31% global support. https://caniuse.com/brotli
  • Stack Overflow, "Does Chrome support Brotli?" — verified 2026-05-03 — supports HTTPS-only browser request convention. https://stackoverflow.com/questions/40583809/does-chrome-support-brotli-accept-encoding-does-not-contain-br
  • Cloudflare Speed Docs, "Content compression" — verified 2026-05-03 — supports 48-byte gzip / 50-byte Brotli minimum and Content-Length handling. https://developers.cloudflare.com/speed/optimization/content/compression/
  • Cloudflare Blog, "All the way up to 11: Serve Brotli from origin" — verified 2026-05-03 — supports 66.94% level-11 reduction benchmark. https://blog.cloudflare.com/this-is-brotli-from-origin/
  • Cloudflare Blog, "Results of experimenting with Brotli for dynamic web content" — verified 2026-05-03 — supports level-4 dynamic compression rationale. https://blog.cloudflare.com/results-experimenting-brotli/
  • Cloudflare Community, "Brotli ON by Default and Brotli Toggle removal" — verified 2026-05-03 — supports default-on Brotli at Cloudflare. https://community.cloudflare.com/t/important-brotli-on-by-default-and-brotli-toggle-removal/652757
  • Deco CMS, "How we reduced 53% of bandwidth at Cloudflare" — verified 2026-05-03 — supports 53% HTML bandwidth reduction post-rollout. https://www.decocms.com/blog/post/gzip-to-brotli-compression-cloudflare
  • Cloudflare Blog, "New standards for a faster and more private Internet" — verified 2026-05-03 — supports Zstandard 42% faster / 11.3% smaller comparison. https://blog.cloudflare.com/new-standards/
  • Netty GitHub Issue #15815 — verified 2026-05-03 — supports buggy-client downgrade pattern. https://github.com/netty/netty/issues/15815
  • DEV Community, "Brotli vs. Gzip for Web Performance In Static Sites" — verified 2026-05-03 — supports browser support and feature comparison table. https://dev.to/lovestaco/brotli-vs-gzip-for-web-performance-in-static-sites-2nhk
  • OpenAI, "Overview of OpenAI Crawlers" — verified 2026-05-03 — supports GPTBot user-agent context. https://developers.openai.com/api/docs/bots
