Complete Guide to Font Subsetting
Master font subsetting to reduce file sizes by 50-90%, improve performance, and optimize web fonts for production websites
In Simple Terms
Font subsetting removes unused characters from font files, reducing size by 50-90%. A full font with 1,200+ glyphs (53 KB) shrinks to 14 KB with a Basic Latin subset.
Use Latin Extended for most sites (47% smaller, supports Western European languages). Use Basic Latin for English-only controlled content (74% smaller).
Tools: Glyphhanger (recommended, scans websites automatically), pyftsubset (advanced control), or Font Squirrel (beginner-friendly GUI). Always keep original fonts and test thoroughly.
Font subsetting is the process of creating a smaller font file that contains only a specific subset of characters from the original complete font, rather than the full character set. A typical professional font includes 1,000-2,000 glyphs covering multiple languages, special symbols, and advanced typography features. Through subsetting, you can reduce this to the 200-600 characters your website actually uses, achieving 50-90% file size reduction without any visual compromise for your users.
This dramatic size reduction directly translates to faster page loads, better Core Web Vitals scores, reduced bandwidth costs, and improved mobile experience. For a typical website using four font variations (Regular, Italic, Bold, Bold Italic), subsetting can reduce total font weight from 200-300 KB down to 40-80 KB—a difference of 1-2 seconds on typical mobile connections. These performance gains compound across millions of page views, making subsetting one of the highest-ROI optimizations in web development.
This comprehensive guide teaches font subsetting from fundamentals to advanced techniques. You'll learn what subsetting is, why it's crucial for modern web performance, different subsetting strategies (language-based, content-based, custom), tools and workflows, implementation best practices, potential pitfalls, and how to balance performance gains against flexibility needs. Whether optimizing a single landing page or an enterprise application, this guide provides the knowledge to implement effective font subsetting.
What is Font Subsetting?
The Concept Explained
Font subsetting removes unused glyphs (characters) from font files to create smaller versions:
Full Font (Roboto Regular):
- • Total glyphs: 1,294 characters
- • Basic Latin: A-Z, a-z, 0-9
- • Latin Extended: À, É, Ñ, Ø
- • Cyrillic: А, Б, В, Г
- • Greek: Α, Β, Γ, Δ
- • Vietnamese: Ạ, Ả, Ã
- • Symbols: €, ™, ©, arrows
- • WOFF2 size: 53 KB
Subset Font (Basic Latin):
- • Total glyphs: 218 characters
- • A-Z (uppercase)
- • a-z (lowercase)
- • 0-9 (numbers)
- • Basic punctuation
- • Common symbols
- • No extended characters
- • WOFF2 size: 14 KB (74% smaller!)
How Subsetting Works
The subsetting process extracts only specified characters:
- Character Selection: You specify which characters to include (A-Z, a-z, 0-9, etc.) using Unicode ranges or explicit character lists
- Glyph Extraction: Subsetting tool extracts only the selected glyph outlines from the original font file
- Table Optimization: Font tables (cmap, glyf, glyph positioning) are rebuilt containing only necessary data
- Feature Preservation: OpenType features like ligatures and kerning are maintained for included glyphs
- Output: Result is a complete, valid font file that's dramatically smaller but functionally identical for your character set
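The character-selection step can be illustrated with a small sketch. It expands a Unicode range specification, written the way the subsetting tools accept it, into the set of code points it selects. The helper name `parse_unicode_ranges` is hypothetical and only meant for illustration; real tools do this parsing internally.

```python
def parse_unicode_ranges(spec):
    """Expand a spec like 'U+0000-00FF,U+0131' into a set of code points."""
    codepoints = set()
    for part in spec.split(","):
        part = part.strip().upper()
        if part.startswith("U+"):
            part = part[2:]
        if not part:
            continue
        start, _, end = part.partition("-")
        lo = int(start, 16)
        hi = int(end, 16) if end else lo  # a single code point is a range of one
        codepoints.update(range(lo, hi + 1))
    return codepoints


selected = parse_unicode_ranges("U+0000-00FF, U+0131, U+0152-0153")
print(len(selected))                # how many code points were requested
print(ord("\u00e9") in selected)    # is é (U+00E9) part of the selection?
```

Note that the number of requested code points is an upper bound: the subset only contains glyphs the original font actually has for those code points.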
What Gets Removed
- Unused glyphs: All characters not explicitly included (accented letters, foreign scripts, special symbols)
- Related features: Ligatures involving removed characters (if you remove either "f" or "i", the "fi" ligature is also removed)
- Kerning pairs: Letter-spacing adjustments for removed character combinations
- Metadata: Often stripped to reduce size (copyright notices, designer info, version details)
- Hinting data: Sometimes simplified or removed depending on subsetting settings
Critical: Subsetting is Irreversible
Once you create a subset, removed characters cannot be added back:
- • Always keep original font files as your master source
- • Never subset from a subset (you'll lose more data)
- • Regenerate subsets from originals when you need different characters
- • Document which characters are included in each subset
- • Version control your original fonts for long-term projects
Why Subset Fonts?
Performance Benefits
Real-World Example:
Blog using 4 font files (Regular, Italic, Bold, Bold Italic):
Before Subsetting:
- • 4 full fonts: 212 KB total
- • Load time (4G): ~1.4s
- • LCP: 4.2s
- • PageSpeed: 72
After Latin Subsetting:
- • 4 subset fonts: 56 KB total
- • Load time (4G): ~0.4s
- • LCP: 2.3s
- • PageSpeed: 93
Result: 156 KB saved (74% reduction), 1 second faster load, 1.9s better LCP, +21 PageSpeed points
Business Impact
- Higher conversion rates: Every 100ms improvement can increase conversions by up to 1% (Amazon, Walmart studies)
- Lower bounce rates: 53% of mobile users abandon pages taking longer than 3 seconds (Google)
- Better SEO: Page speed is a ranking factor, especially for mobile searches
- Reduced costs: 156 KB × 100,000 visitors = 15.6 GB of bandwidth saved, assuming each visitor downloads the fonts once with no cache hit
- Mobile experience: Dramatic improvement on slower 3G/4G connections in emerging markets
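The bandwidth figure above is simple arithmetic, sketched here so you can plug in your own numbers. It assumes every visit downloads all four font files without a cache hit, which overstates savings for returning visitors; the function name is hypothetical.

```python
def bandwidth_saved_gb(saved_kb_per_visit, visits):
    """Total savings in decimal gigabytes (1 GB = 1,000,000 KB)."""
    return saved_kb_per_visit * visits / 1_000_000


before_kb, after_kb = 212, 56   # totals from the blog example above
saved_kb = before_kb - after_kb
print(saved_kb, "KB saved per uncached visit")
print(bandwidth_saved_gb(saved_kb, 100_000), "GB per 100,000 visits")
```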
Size Reduction by Strategy
| Subsetting Strategy | Characters | Size | Reduction |
|---|---|---|---|
| Full Font | 1,294 | 53 KB | Baseline |
| Latin Extended | 640 | 28 KB | 47% smaller |
| Basic Latin | 218 | 14 KB | 74% smaller |
| Custom Minimal | 128 | 9 KB | 83% smaller |
| Logo/Display Only | 30-50 | 4-6 KB | 88-92% smaller |
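The Reduction column follows directly from the Size column. A minimal sketch of the calculation, using the table's own figures, is useful when you want to report the same metric for your own fonts:

```python
def reduction_percent(full_kb, subset_kb):
    """Percentage saved relative to the full font, rounded to a whole number."""
    return round((full_kb - subset_kb) / full_kb * 100)


FULL_KB = 53  # full Roboto Regular WOFF2 size from the table
sizes_kb = {"Latin Extended": 28, "Basic Latin": 14, "Custom Minimal": 9}

for name, size in sizes_kb.items():
    print(f"{name}: {reduction_percent(FULL_KB, size)}% smaller")
```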
Subsetting Approaches and Strategies
1. Language-Based Subsetting (Most Common)
Include character ranges for specific language support:
Basic Latin (English-only):
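Unicode Range: U+0000-00FF, U+0131, U+0152-0153
Includes: A-Z, a-z, 0-9, basic punctuation, common symbols. Note that U+0000-00FF spans both the Basic Latin and Latin-1 Supplement blocks, so this range also carries the most common Western European accented letters (é, ñ, ü).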
Size: ~14 KB WOFF2 (74% reduction)
Best for: English-only blogs, US-focused businesses, controlled content
Latin Extended (Western European):
Unicode Range: U+0000-00FF, U+0100-017F, U+0180-024F
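Includes: Basic Latin and Western European accents for French (é, è, ç), German (ä, ö, ü), Spanish (ñ), and Portuguese, plus the Latin Extended-A and -B letters used in many Central and Eastern European languages (ł, ő, č, ş)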
Size: ~28 KB WOFF2 (47% reduction)
Best for: Most websites, international names, user-generated content
Multi-Language Strategy:
Create separate subsets for different language groups (Latin, Cyrillic, Greek, Vietnamese) and load only what's needed per page
Best for: International sites serving distinct regional markets
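One way to choose between these ranges is to check a sample of real content against each candidate and see which characters fall outside it. The sketch below does that with the ranges listed above; the sample string and the function name `uncovered` are made up for illustration.

```python
from collections import Counter

# Ranges taken from the subsets described above.
LATIN_1 = [(0x0000, 0x00FF)]                 # Basic Latin + Latin-1 Supplement
LATIN_EXT = LATIN_1 + [(0x0100, 0x024F)]     # plus Latin Extended-A and -B


def uncovered(text, ranges):
    """Count characters in text whose code point is outside every range."""
    misses = Counter()
    for ch in text:
        cp = ord(ch)
        if not any(lo <= cp <= hi for lo, hi in ranges):
            misses[ch] += 1
    return misses


# "Café in Łódź, naïve résumé", written with escapes to avoid encoding surprises
sample = "Caf\u00e9 in \u0141\u00f3d\u017a, na\u00efve r\u00e9sum\u00e9"
print(uncovered(sample, LATIN_1))    # characters that need a wider subset
print(uncovered(sample, LATIN_EXT))  # empty when the wider subset is enough
```

If the first counter is empty for a representative sample of your content, the smaller subset is likely sufficient; otherwise the listed characters show what the wider subset would add.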
2. Content-Based Subsetting (Maximum Optimization)
Analyze your actual website content to include only characters you use:
Process:
- Scan all website content (HTML, CMS database, static files)
- Extract every unique character actually present
- Create subset containing exactly those characters
- Result: Minimal file size with full coverage of the content that exists at scan time (new content may need a fresh scan)
Tools:
- • Glyphhanger: Scans live websites, generates optimal character lists automatically
- • Custom scripts: Parse CMS database or static site content programmatically
- • Manual analysis: Review content files to identify needed characters
Best for: Static sites, controlled content, maximum performance requirements
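For the custom-script route, the scan-and-extract steps can be sketched with the Python standard library alone. This is a simplified illustration, not how Glyphhanger works internally: it collects the visible text of an HTML string, ignores script and style contents, and collapses the unique characters into a range list that could be passed to a subsetter. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects characters from visible text, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chars = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chars.update(data)


def to_unicode_ranges(chars):
    """Collapse characters into a compact 'U+XXXX-YYYY,U+ZZZZ' list."""
    ranges, start, prev = [], None, None
    for cp in sorted(ord(c) for c in set(chars)):
        if start is None:
            start = prev = cp
        elif cp == prev + 1:
            prev = cp                      # extend the current run
        else:
            ranges.append((start, prev))   # close the run and start a new one
            start = prev = cp
    if start is not None:
        ranges.append((start, prev))
    return ",".join(
        f"U+{lo:04X}" if lo == hi else f"U+{lo:04X}-{hi:04X}" for lo, hi in ranges
    )


collector = TextCollector()
collector.feed("<h1>abc</h1><style>p{color:red}</style><p>xyz!</p>")
collector.close()
print(to_unicode_ranges(collector.chars))
```

In a real project you would feed every rendered page or CMS export through the collector before building the range list, and rerun the scan whenever content changes.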
3. Use-Case-Specific Subsetting
Display/Heading Fonts:
If font only used for specific headings or logos, subset to those exact letters
Example: Logo says "ACME Corp" → subset to "ACMEorp" (7 unique letters) = 4-5 KB file (90%+ reduction)
Number-Only Fonts:
For pricing tables, statistics, dashboards: subset to 0-9, currency symbols, decimal point
Result: 3-4 KB font file, instant load
Icon Fonts:
Most icon fonts ship with 500+ icons. Subset to the 20-30 you actually use.
Reduction: Typically 85-95% smaller
4. Progressive Enhancement Strategy
Load minimal subset initially, extended characters on-demand:
/* Basic Latin loads first */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin.woff2') format('woff2');
  unicode-range: U+0000-00FF;
}
/* Extended Latin loads only if page uses these characters */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin-ext.woff2') format('woff2');
  unicode-range: U+0100-017F;
}
The browser automatically downloads the extended subset only when the page contains those characters.
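To make the selection logic concrete, here is a simplified model of that decision, written as a sketch rather than a description of any browser's actual implementation. It uses the two files and ranges from the CSS above and reports which files a given page text would trigger; real browsers also take into account which elements actually use the font family.

```python
# File names and ranges mirror the @font-face rules above.
FACES = {
    "/fonts/roboto-latin.woff2": [(0x0000, 0x00FF)],
    "/fonts/roboto-latin-ext.woff2": [(0x0100, 0x017F)],
}


def files_to_download(page_text, faces=FACES):
    """Return font files whose unicode-range matches at least one character."""
    needed = []
    for url, ranges in faces.items():
        if any(lo <= ord(ch) <= hi for ch in page_text for lo, hi in ranges):
            needed.append(url)
    return needed


print(files_to_download("Hello, world"))
# "Dobry wieczór, Łukasz" contains Ł (U+0141), which only the extended file covers
print(files_to_download("Dobry wiecz\u00f3r, \u0141ukasz"))
```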
Tools and Methods
Glyphhanger (Recommended for Most Users)
Command-line tool that scans websites and generates optimal subsets automatically.
Installation:
npm install -g glyphhanger
Basic Usage:
# Scan website and create subset
glyphhanger https://yoursite.com --subset=font.ttf --formats=woff2,woff

# Subset to specific Unicode range
glyphhanger --whitelist=U+0000-00FF --subset=font.ttf --formats=woff2

# Subset to specific characters
glyphhanger --whitelist="ABCDEFabcdef0123456789" --subset=font.ttf
Pros: Automated, accurate, supports multiple formats
Cons: Requires Node.js and command-line knowledge; the --subset option also relies on pyftsubset (pip install fonttools)
pyftsubset (FontTools - Advanced Users)
Python-based tool with extensive options and fine control.
Installation:
pip install fonttools brotli
Basic Latin Subset:
pyftsubset font.ttf \
  --output-file=font-subset.woff2 \
  --flavor=woff2 \
  --layout-features='*' \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153"
Pros: Most powerful, flexible, industry standard
Cons: Steeper learning curve, Python required
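When several font files need the same treatment, the command above can be assembled from a script. The sketch below only builds the argument list, using the same flags shown in the example; the file name `Roboto-Regular.ttf`, the `subsets` output directory, and the function name are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def pyftsubset_command(font_path, unicodes, out_dir="subsets"):
    """Build the argument list for one pyftsubset run (does not execute it)."""
    font = Path(font_path)
    output = Path(out_dir) / f"{font.stem}-subset.woff2"
    return [
        "pyftsubset",
        font.as_posix(),
        f"--output-file={output.as_posix()}",
        "--flavor=woff2",
        "--layout-features=*",   # no shell quoting needed in an argument list
        f"--unicodes={unicodes}",
    ]


cmd = pyftsubset_command("fonts/Roboto-Regular.ttf", "U+0000-00FF,U+0131,U+0152-0153")
print(" ".join(cmd))

# To actually run it, fonttools and brotli must be installed and the output
# directory must exist:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with the `*` in `--layout-features`.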
Font Squirrel Webfont Generator (Beginners)
Online GUI tool requiring no installation or command-line knowledge.
URL:
fontsquirrel.com/tools/webfont-generator
Process:
- Upload your font file
- Select "Expert" mode
- Choose character sets to include (Basic Latin, Latin Extended, etc.)
- Select formats (WOFF2, WOFF)
- Download subset fonts
Pros: Easy, no technical skills needed, visual interface
Cons: Manual process, uploads fonts to third party, less control
Tool Comparison
| Tool | Skill Level | Automation | Best For |
|---|---|---|---|
| Glyphhanger | Intermediate | High | Most projects |
| pyftsubset | Advanced | Medium | Custom workflows |
| Font Squirrel | Beginner | Low | One-off subsets |
Implementation Guide
Complete Subsetting Workflow
1. Analyze Content Needs
   - Determine primary language(s) your site serves
   - Check if you have user-generated content
   - Identify special characters needed (currency symbols, etc.)
2. Choose Subsetting Strategy
   - Conservative: Latin Extended (good for most sites)
   - Aggressive: Basic Latin (English-only, controlled content)
   - Custom: Content-based scanning (maximum optimization)
3. Create Subsets
   # Using Glyphhanger
   glyphhanger --whitelist=U+0000-00FF,U+0100-017F \
     --subset=*.ttf --formats=woff2,woff
4. Test Thoroughly
   - Check all pages for missing characters (boxes)
   - Test with actual content, especially user input
   - Verify in Chrome, Safari, Firefox
5. Deploy and Monitor
   - Upload subset fonts to server
   - Update @font-face declarations
   - Measure performance improvement
   - Monitor for missing character issues
CSS Implementation Example
/* Subset font with unicode-range */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin.woff2') format('woff2'),
       url('/fonts/roboto-latin.woff') format('woff');
  font-weight: 400;
  font-style: normal;
  font-display: swap;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153;
}
/* Fallback to system fonts for missing characters */
body {
  font-family: 'Roboto', -apple-system, BlinkMacSystemFont,
               'Segoe UI', Arial, sans-serif;
}
Tradeoffs and Considerations
Risks and Limitations
1. Missing Characters Display as Boxes
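If the subset doesn't include a character, the browser renders it in a fallback font, which looks inconsistent next to the surrounding text, or as an empty box (☐) when no fallback font has the glyph either.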
Mitigation: Use conservative subsetting (Latin Extended), comprehensive fallback fonts
2. User-Generated Content Problems
Comments, reviews, forum posts may contain unexpected characters (emoji, foreign text).
Mitigation: Include Latin Extended, rely on fallback fonts for edge cases
3. Maintenance Overhead
Content changes may require regenerating subsets with different characters.
Mitigation: Automate subsetting in build process, keep originals for regeneration
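One possible way to keep that automation cheap is to regenerate subsets only when the required character set actually changes. The sketch below stores a fingerprint of the character set between builds and compares against it; the state file format and both function names are assumptions for illustration, not part of any subsetting tool.

```python
import hashlib
import json
from pathlib import Path


def charset_fingerprint(chars):
    """Stable hash of a character set, independent of input order."""
    canonical = "".join(sorted(set(chars)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(chars, state_file):
    """Return True if the character set differs from the previous build.

    The new fingerprint is written to state_file whenever it changes.
    """
    state_path = Path(state_file)
    current = charset_fingerprint(chars)
    previous = None
    if state_path.exists():
        stored = json.loads(state_path.read_text(encoding="utf-8"))
        previous = stored.get("fingerprint")
    if previous == current:
        return False
    state_path.write_text(json.dumps({"fingerprint": current}), encoding="utf-8")
    return True
```

A build script could call `needs_regeneration` with the characters found in the current content and skip the subsetting step when it returns False.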
When NOT to Subset
- • Truly international sites: Serving many languages with unpredictable content
- • Heavy user-generated content: Forums, social networks where users contribute diverse text
- • Technical documentation: May need math symbols, code characters, special notation
- • Limited technical resources: If maintenance burden outweighs performance gain
- • Variable font with few axes: If a single variable font is already small enough
Best Practices
- • Start conservative (Latin Extended), optimize further if traffic justifies
- • Always keep original font files for regenerating subsets
- • Use --layout-features='*' in pyftsubset to preserve ligatures and kerning
- • Document which characters are included in each subset
- • Test thoroughly across all page types and content
- • Implement comprehensive fallback font stacks
- • Consider unicode-range for progressive loading
- • Automate subsetting in your build pipeline
Advanced Subsetting Techniques
Unicode-Range Progressive Loading
Create multiple subsets and let browser download only what's needed:
/* Core characters - always loads */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-core.woff2') format('woff2');
  unicode-range: U+0020-007F; /* ASCII */
}
/* Extended Latin - loads only if page uses these */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-extended.woff2') format('woff2');
  unicode-range: U+0100-017F; /* Latin Extended-A */
}
/* Symbols - loads only if needed */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-symbols.woff2') format('woff2');
  unicode-range: U+2000-206F; /* Punctuation */
}
Automated Build Integration
Integrate subsetting into your build process:
Example: npm script
{
  "scripts": {
    "subset-fonts": "glyphhanger --whitelist=U+0000-017F --subset=fonts/*.ttf --formats=woff2,woff",
    "build": "npm run subset-fonts && next build"
  }
}
Content Monitoring
Set up monitoring to detect missing characters in production:
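// Detect characters that no declared Roboto unicode-range covers.
// document.fonts.check() only reports load status, not glyph coverage,
// so this compares code points against the declared ranges instead.
// Assumes every Roboto @font-face sets unicode-range to match its subset.
document.fonts.ready.then(() => {
  const ranges = [...document.fonts]
    .filter(face => face.family.replace(/["']/g, '') === 'Roboto')
    .flatMap(face => face.unicodeRange.split(',').map(range => {
      const [lo, hi = lo] = range.trim().replace(/^U\+/i, '').split('-');
      return [parseInt(lo, 16), parseInt(hi, 16)];
    }));
  const text = document.body.innerText.replace(/\s/g, '');
  const uniqueChars = [...new Set(text)];
  uniqueChars.forEach(char => {
    const codePoint = char.codePointAt(0);
    if (!ranges.some(([lo, hi]) => codePoint >= lo && codePoint <= hi)) {
      console.warn('Missing glyph:', char);
      // Send to monitoring service
    }
  });
});
Summary: Mastering Font Subsetting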
Font subsetting is one of the highest-impact web performance optimizations, achieving 50-90% file size reduction with proper implementation. Start with Latin Extended subsetting for broad coverage and good performance gains. For controlled content sites, progress to Basic Latin or content-based subsetting for maximum optimization. Always maintain original fonts, test thoroughly, and implement comprehensive fallback strategies.
Tools like Glyphhanger make subsetting accessible, while techniques like unicode-range progressive loading enable advanced optimization. The key is balancing performance gains against character coverage needs for your specific use case. When done right, subsetting delivers faster page loads, better Core Web Vitals, and improved user experience without sacrificing typography quality.

Written & Verified by
Sarah Mitchell
Product Designer, Font Specialist
