What is Compression?
Quick Definition
Compression is the process of reducing file size by encoding data more efficiently. PDFs use various compression methods for text, images, and other content to minimize file size while maintaining acceptable quality.
How PDF Compression Works
PDFs can contain multiple types of content (text, vector graphics, raster images, fonts, and metadata), and each type can be compressed with a different algorithm. Text and vector data typically use lossless compression (Flate, also called Deflate), which reduces size without any quality loss. Images can use either lossless compression (Flate, optionally with PNG-style predictors) or lossy compression (JPEG), which achieves smaller sizes by discarding some visual information.
The PDF format applies compression to individual stream objects within the file, not to the file as a whole, so several methods can coexist. A single PDF might use JPEG compression for photographs, Flate compression for text streams, and CCITT compression for black-and-white scanned pages.
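As a rough illustration of lossless Flate compression, the sketch below runs Python's standard zlib module over a made-up, repetitive content stream. The stream contents and variable names are assumptions for the example; a real PDF writer would also record the filter and the compressed length in the stream dictionary.

```python
import zlib

# Hypothetical content stream: the same text-drawing operators repeated,
# which is the kind of redundancy Flate handles well.
content_stream = b"BT /F1 12 Tf 72 720 Td (Hello, PDF compression) Tj ET\n" * 200

# zlib produces Deflate data in the wrapper that PDF's Flate filter uses.
compressed = zlib.compress(content_stream, 9)
restored = zlib.decompress(compressed)

print(len(content_stream), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == content_stream)
```

Because the method is lossless, the restored bytes match the original exactly, whatever compression level is chosen.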
Lossy vs Lossless Compression
- Lossless compression: Reduces file size without discarding any data. The decompressed file is identical to the original. Used for text, vector graphics, and when perfect quality is required. Examples: Flate, LZW, CCITT.
- Lossy compression: Achieves higher compression ratios by discarding less important visual information. The decompressed file is similar but not identical to the original. Used for photographs and images where some quality loss is acceptable. Examples: JPEG, JPEG2000.
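The difference between the two families can be sketched with a toy model. This is not how JPEG works internally: the `quantize` helper and the synthetic sample data are assumptions chosen only to show that a lossy step changes the data by a bounded amount, while a lossless step changes nothing.

```python
import zlib

def quantize(samples, step):
    """Toy lossy step: round each 8-bit sample to the nearest multiple of step."""
    return [min(255, round(s / step) * step) for s in samples]

# Synthetic 8-bit "pixel" values, generated deterministically for the example.
samples = [(i * 7 + (i * i) % 13) % 256 for i in range(4096)]

# Lossless: the decompressed data equals the original.
lossless = zlib.compress(bytes(samples), 9)
round_trip = list(zlib.decompress(lossless))

# Lossy: detail is discarded first, then the simpler data is compressed.
approx = quantize(samples, 16)
lossy = zlib.compress(bytes(approx), 9)
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print("lossless bytes:", len(lossless), "exact:", round_trip == samples)
print("lossy bytes:", len(lossy), "max sample error:", max_error)
```

The quantized data has far fewer distinct values, so it should compress to fewer bytes, at the cost of an error of up to half the step size per sample.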
Common Compression Methods in PDFs
- Flate (Deflate): Lossless compression for text, vector graphics, fonts, and images with flat colors. Uses the same Deflate algorithm as ZIP and zlib.
- JPEG: Lossy compression for color and grayscale photographs. Widely supported and efficient for photographic content.
- JPEG2000: Advanced lossy or lossless compression. Better quality than JPEG at high compression ratios but less widely supported.
- CCITT: Lossless compression optimized for black-and-white scanned documents and fax images.
- JBIG2: Advanced compression for black-and-white images. Can be lossy or lossless.
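Inside a PDF file, each of these methods is identified by a filter name in the stream dictionary (for example, FlateDecode for Flate and DCTDecode for JPEG). The sketch below counts those names in raw PDF bytes. It is a heuristic only: the function name and regular expressions are assumptions for this example, and dictionaries stored inside compressed object streams are not visible to this kind of scan.

```python
import re
from collections import Counter

# Filter names from the PDF specification, mapped to the methods listed above.
FILTER_TO_METHOD = {
    "FlateDecode": "Flate (Deflate)",
    "DCTDecode": "JPEG",
    "JPXDecode": "JPEG2000",
    "CCITTFaxDecode": "CCITT",
    "JBIG2Decode": "JBIG2",
    "LZWDecode": "LZW",
}

# Matches both "/Filter /Name" and "/Filter [/Name1 /Name2]".
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
_NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")

def count_compression_methods(pdf_bytes):
    """Count how often each known compression filter appears in raw PDF bytes."""
    counts = Counter()
    for match in _FILTER_RE.finditer(pdf_bytes):
        for name in _NAME_RE.findall(match.group(1)):
            method = FILTER_TO_METHOD.get(name.decode("ascii"))
            if method:
                counts[method] += 1
    return counts

# Hand-written fragment standing in for a real file.
sample = (
    b"<< /Length 10 /Filter /FlateDecode >> "
    b"<< /Filter [/ASCII85Decode /DCTDecode] >> "
    b"<< /Filter/CCITTFaxDecode >>"
)
print(count_compression_methods(sample))
```

For dependable results on real files, a PDF parsing library that resolves object streams is a better choice than a byte-level scan.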
Why Compression Matters
Uncompressed PDFs can be enormous. A single high-resolution photograph can occupy 50+ MB uncompressed: a 24-megapixel RGB image at 8 bits per channel needs about 72 MB. Compression makes PDFs practical for email, web distribution, and storage. A well-compressed PDF maintains visual quality while often reducing file size by 50-90%, depending on the content.
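The uncompressed size of a raster image follows directly from its dimensions and bit depth. The helper below is a minimal sketch with a hypothetical function name; it ignores row padding and any file overhead.

```python
def uncompressed_image_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Raw size of an image in bytes, before any compression is applied."""
    return width_px * height_px * channels * bits_per_channel // 8

# A 6000 x 4000 (24-megapixel) RGB photograph at 8 bits per channel.
photo = uncompressed_image_bytes(6000, 4000)

# An A4 page scanned at 300 dpi (2480 x 3508 pixels), in color and as 1-bit black and white.
scan_color = uncompressed_image_bytes(2480, 3508)
scan_bilevel = uncompressed_image_bytes(2480, 3508, channels=1, bits_per_channel=1)

print(photo, scan_color, scan_bilevel)
```

The color scan is 24 times larger than the 1-bit version before compression, which is one reason black-and-white scans get their own methods such as CCITT and JBIG2.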
However, excessive compression degrades quality. Over-compressed JPEG images show visible artifacts (blockiness, color banding). Finding the right balance between file size and quality is essential.
Compression and PDF Standards
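PDF/A allows only certain compression methods to ensure long-term readability. JPEG, Flate, CCITT, and JBIG2 are permitted. LZW is prohibited, and JPEG2000 is allowed only from PDF/A-2 onward, because PDF/A-1 is based on PDF 1.4, which predates JPEG2000 support. Lossy JBIG2 is permitted by the standard, but some archiving guidelines advise against it because it can substitute similar-looking characters.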
PDF/X standards for print production typically use lossless or high-quality lossy compression to maintain print quality.
Recompressing PDFs
PDFs can be recompressed to reduce file size further. This involves decompressing the content, applying more aggressive compression settings, and re-encoding. However, recompressing already-compressed JPEG images with lossy compression compounds quality loss. Each recompression cycle degrades the image further.
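The compounding effect can be illustrated with a toy model in which each lossy pass rounds sample values to a different step size. This is a simplified stand-in for JPEG re-encoding, not an implementation of it; the helper names, sample values, and step sizes are assumptions for the example.

```python
def quantize(samples, step):
    """Toy lossy pass: round each sample to the nearest multiple of step."""
    return [min(255, round(s / step) * step) for s in samples]

def total_error(original, degraded):
    """Sum of absolute differences between original and degraded samples."""
    return sum(abs(a - b) for a, b in zip(original, degraded))

def recompress_chain(samples, steps):
    """Apply one lossy pass per step and record the total error after each pass."""
    errors = []
    current = samples
    for step in steps:
        current = quantize(current, step)
        errors.append(total_error(samples, current))
    return errors

original = [14, 33, 52, 101, 200]

# Three successive passes, each applied to the previous pass's output.
chain_errors = recompress_chain(original, [10, 7, 12])

# A single pass at the final setting, applied to the original data.
direct_error = total_error(original, quantize(original, 12))

print("error after each pass:", chain_errors)   # [10, 21, 24]
print("error from one direct pass:", direct_error)  # 18
```

In this example the chained result ends up further from the original than a single pass at the final setting, which is why it is usually better to recompress from the original source images when they are available.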
Common Use Cases
- Email attachments: Reducing file size to meet email size limits
- Web distribution: Faster downloads and reduced bandwidth
- Archival storage: Minimizing storage requirements for large document collections
- Mobile viewing: Smaller files load faster on mobile devices
Related Concepts
- Resolution — Image quality affecting compression
- Compress vs Optimize — Different approaches to reducing file size
- PDF Too Large — Reducing oversized PDFs
- Compress for Email — Preparing PDFs for email