Complete Guide to Font Subsetting

Master font subsetting to reduce file sizes by 50-90%, improve performance, and optimize web fonts for production websites

TL;DR

Font subsetting removes unused characters from font files, reducing size by 50-90%. A full font with 1,200+ glyphs (53 KB) becomes 14 KB with a Basic Latin subset.

Use Latin Extended for most sites (47% smaller, supports Western European languages). Use Basic Latin for English-only controlled content (74% smaller).

Tools: Glyphhanger (recommended, scans websites automatically), pyftsubset (advanced control), or Font Squirrel (beginner-friendly GUI). Always keep original fonts and test thoroughly.

Font subsetting is the process of creating a smaller font file that contains only a specific subset of characters from the original complete font, rather than the full character set. A typical professional font includes 1,000-2,000 glyphs covering multiple languages, special symbols, and advanced typography features. Through subsetting, you can reduce this to the 200-600 characters your website actually uses, achieving 50-90% file size reduction without any visual compromise for your users.

This dramatic size reduction directly translates to faster page loads, better Core Web Vitals scores, reduced bandwidth costs, and improved mobile experience. For a typical website using four font variations (Regular, Italic, Bold, Bold Italic), subsetting can reduce total font weight from 200-300 KB down to 40-80 KB—a difference of 1-2 seconds on typical mobile connections. These performance gains compound across millions of page views, making subsetting one of the highest-ROI optimizations in web development.

This comprehensive guide teaches font subsetting from fundamentals to advanced techniques. You'll learn what subsetting is, why it's crucial for modern web performance, different subsetting strategies (language-based, content-based, custom), tools and workflows, implementation best practices, potential pitfalls, and how to balance performance gains against flexibility needs. Whether optimizing a single landing page or an enterprise application, this guide provides the knowledge to implement effective font subsetting.

What is Font Subsetting?

The Concept Explained

Font subsetting removes unused glyphs (characters) from font files to create smaller versions:

Full Font (Roboto Regular):

  • Total glyphs: 1,294 characters
  • Basic Latin: A-Z, a-z, 0-9
  • Latin Extended: À, É, Ñ, Ø
  • Cyrillic: А, Б, В, Г
  • Greek: Α, Β, Γ, Δ
  • Vietnamese: Ạ, Ả, Ã
  • Symbols: €, ™, ©, arrows
  • WOFF2 size: 53 KB

Subset Font (Latin Basic):

  • Total glyphs: 218 characters
  • A-Z (uppercase)
  • a-z (lowercase)
  • 0-9 (numbers)
  • Basic punctuation
  • Common symbols
  • No extended characters
  • WOFF2 size: 14 KB (74% smaller!)

How Subsetting Works

The subsetting process extracts only specified characters:

  1. Character Selection: You specify which characters to include (A-Z, a-z, 0-9, etc.) using Unicode ranges or explicit character lists
  2. Glyph Extraction: Subsetting tool extracts only the selected glyph outlines from the original font file
  3. Table Optimization: Font tables (cmap for character mapping, glyf for outlines, GPOS for glyph positioning) are rebuilt containing only necessary data
  4. Feature Preservation: OpenType features like ligatures and kerning are maintained for included glyphs
  5. Output: Result is a complete, valid font file that's dramatically smaller but functionally identical for your character set
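
As a rough illustration of the character-selection step, the sketch below expands a Unicode-range string into the set of code points a subsetting tool would be asked to keep. The function name `parse_unicode_ranges` is hypothetical and is not part of any tool mentioned here; the range string is the Basic Latin range used later in this guide.

```python
def parse_unicode_ranges(spec: str) -> set[int]:
    """Expand a spec such as 'U+0000-00FF,U+0131' into a set of code points."""
    code_points: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # Drop the optional 'U+' prefix, then split 'start-end' if present
        body = part[2:] if part.upper().startswith("U+") else part
        start_hex, _, end_hex = body.partition("-")
        start = int(start_hex, 16)
        end = int(end_hex, 16) if end_hex else start
        code_points.update(range(start, end + 1))
    return code_points


selected = parse_unicode_ranges("U+0000-00FF,U+0131,U+0152-0153")
print(len(selected))        # how many code points were requested
print(0x0152 in selected)   # is the OE ligature (U+0152) requested?
```

The number of requested code points is usually higher than the glyph count of the resulting subset, since fonts typically carry no glyphs for control characters and the tool can only keep glyphs the original font actually contains.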

What Gets Removed

  • Unused glyphs: All characters not explicitly included (accented letters, foreign scripts, special symbols)
  • Related features: Ligatures involving removed characters (if you remove "f" or "i", the "fi" ligature is also removed)
  • Kerning pairs: Letter-spacing adjustments for removed character combinations
  • Metadata: Often stripped to reduce size (copyright notices, designer info, version details)
  • Hinting data: Sometimes simplified or removed depending on subsetting settings

Critical: Subsetting is Irreversible

Once you create a subset, removed characters cannot be added back:

  • Always keep original font files as your master source
  • Never subset from a subset (you'll lose more data)
  • Regenerate subsets from originals when you need different characters
  • Document which characters are included in each subset
  • Version control your original fonts for long-term projects

Why Subset Fonts?

Performance Benefits

Real-World Example:

Blog using 4 font files (Regular, Italic, Bold, Bold Italic):

Before Subsetting:

  • 4 full fonts: 212 KB total
  • Load time (4G): ~1.4s
  • LCP: 4.2s
  • PageSpeed: 72

After Latin Subsetting:

  • 4 subset fonts: 56 KB total
  • Load time (4G): ~0.4s
  • LCP: 2.3s
  • PageSpeed: 93

Result: 156 KB saved (74% reduction), 1 second faster load, 1.9s better LCP, +21 PageSpeed points

Business Impact

  • Higher conversion rates: Every 100ms improvement can increase conversions by up to 1% (Amazon, Walmart studies)
  • Lower bounce rates: 53% of mobile users abandon pages taking longer than 3 seconds (Google)
  • Better SEO: Page speed is a ranking factor, especially for mobile searches
  • Reduced costs: 156 KB saved per visit × 100,000 visitors = 15.6 GB less bandwidth, assuming each visitor downloads the fonts once
  • Mobile experience: Dramatic improvement on slower 3G/4G connections in emerging markets
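
The arithmetic behind the blog example and the bandwidth figure can be reproduced with a few lines. This is only a sketch: the helper names are made up for illustration, decimal units are assumed (1 GB = 1,000,000 KB), and caching is ignored, so every visitor is counted as one full font download.

```python
def reduction_percent(before_kb: float, after_kb: float) -> float:
    """Size reduction as a percentage of the original size."""
    return (before_kb - after_kb) / before_kb * 100


def bandwidth_saved_gb(saved_kb_per_visit: float, visitors: int) -> float:
    """Total savings in GB, assuming one uncached download per visitor."""
    return saved_kb_per_visit * visitors / 1_000_000


before_kb, after_kb = 212, 56          # figures from the blog example
saved_kb = before_kb - after_kb

print(saved_kb)                                        # KB saved per visit
print(round(reduction_percent(before_kb, after_kb)))   # percent, rounded
print(bandwidth_saved_gb(saved_kb, 100_000))           # GB per 100,000 visitors
```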

Size Reduction by Strategy

Subsetting Strategy | Characters | Size | Reduction
Full Font | 1,294 | 53 KB | Baseline
Latin Extended | 640 | 28 KB | 47% smaller
Basic Latin | 218 | 14 KB | 74% smaller
Custom Minimal | 128 | 9 KB | 83% smaller
Logo/Display Only | 30-50 | 4-6 KB | 88-92% smaller

Subsetting Approaches and Strategies

1. Language-Based Subsetting (Most Common)

Include character ranges for specific language support:

Basic Latin (English-only):

Unicode Range: U+0000-00FF, U+0131, U+0152-0153

Includes: A-Z, a-z, 0-9, basic punctuation, and common symbols, plus the Latin-1 Supplement accented letters (é, ñ, ü) that fall inside U+00A0-00FF

Size: ~14 KB WOFF2 (74% reduction)

Best for: English-only blogs, US-focused businesses, controlled content

Latin Extended (Western European):

Unicode Range: U+0000-00FF, U+0100-017F, U+0180-024F

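Includes: everything in the U+0000-00FF range, which already carries French (é, è, ç), German (ä, ö, ü), Spanish (ñ), and Portuguese accents, plus the Latin Extended-A and -B letters used in Central and Eastern European languages (ł, č, ş, ő)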

Size: ~28 KB WOFF2 (47% reduction)

Best for: Most websites, international names, user-generated content

Multi-Language Strategy:

Create separate subsets for different language groups (Latin, Cyrillic, Greek, Vietnamese) and load only what's needed per page

Best for: International sites serving distinct regional markets

2. Content-Based Subsetting (Maximum Optimization)

Analyze your actual website content to include only characters you use:

Process:

  1. Scan all website content (HTML, CMS database, static files)
  2. Extract every unique character actually present
  3. Create subset containing exactly those characters
  4. Result: Minimal file size, perfect coverage

Tools:

  • Glyphhanger: Scans live websites, generates optimal character lists automatically
  • Custom scripts: Parse CMS database or static site content programmatically
  • Manual analysis: Review content files to identify needed characters

Best for: Static sites, controlled content, maximum performance requirements
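
For the custom-script route, the process above might look roughly like the following sketch. It uses only the Python standard library, handles static HTML strings (content rendered by JavaScript would not be seen), and the names `TextExtractor`, `unique_chars`, and `to_unicode_ranges` are hypothetical. The output is a Unicode-range list in the form that the pyftsubset `--unicodes` option accepts.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects characters from text nodes, skipping <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.chars: set[str] = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chars.update(data)


def unique_chars(html: str) -> set[str]:
    """Every distinct character found in the text of an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chars


def to_unicode_ranges(chars) -> str:
    """Compress characters into a 'U+XXXX-YYYY,...' list of ranges."""
    ranges: list[list[int]] = []
    for cp in sorted({ord(c) for c in chars}):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1][1] = cp          # extend the current run
        else:
            ranges.append([cp, cp])     # start a new run
    return ",".join(
        f"U+{a:04X}" if a == b else f"U+{a:04X}-{b:04X}" for a, b in ranges
    )


page = "<html><style>p{}</style><body><p>abc caf\u00e9</p></body></html>"
print(to_unicode_ranges(unique_chars(page)))
```

In a real project the same functions would be fed every page or content export, and the union of all characters would be used to build the subset.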

3. Use-Case-Specific Subsetting

Display/Heading Fonts:

If a font is only used for specific headings or logos, subset it to those exact letters

Example: Logo says "ACME Corp" → subset to "ACMEorp" plus the space (8 unique characters) = 4-5 KB file (90%+ reduction)

Number-Only Fonts:

For pricing tables, statistics, dashboards: subset to 0-9, currency symbols, decimal point

Result: 3-4 KB font file, instant load

Icon Fonts:

Most icon fonts ship with 500+ icons. Subset to the 20-30 you actually use.

Reduction: Typically 85-95% smaller
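
Character lists for these narrow use cases are easy to derive programmatically. The sketch below is illustrative only: `chars_for_text` is a hypothetical helper, and the currency and separator symbols in `number_chars` are an assumed selection that should be adjusted to what a given pricing table actually displays.

```python
import string


def chars_for_text(text: str) -> str:
    """Unique characters of a fixed string, in first-seen order."""
    return "".join(dict.fromkeys(text))


# Logo subset: only the characters that appear in the logo text
logo_chars = chars_for_text("ACME Corp")

# Number-only subset: digits plus an assumed set of currency/separator symbols
number_chars = string.digits + "$\u20ac\u00a3.,%"

print(repr(logo_chars), len(logo_chars))
print(number_chars)
```

Either string can then be passed to a subsetting tool as an explicit character list.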

4. Progressive Enhancement Strategy

Load minimal subset initially, extended characters on-demand:

/* Basic Latin loads first */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin.woff2') format('woff2');
  unicode-range: U+0000-00FF;
}

/* Extended Latin loads only if page uses these characters */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin-ext.woff2') format('woff2');
  unicode-range: U+0100-017F;
}

The browser downloads the extended subset only when the page contains characters in that range.
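
To make the selection logic concrete, here is a deliberately simplified model of how the two @font-face rules above would be matched against page text. It mirrors the file names and ranges from the CSS, but it is not how a browser is implemented: real matching also takes font-family, weight, style, and whether the text is actually rendered in that family into account.

```python
# Ranges mirror the two @font-face rules above
FACES = {
    "roboto-latin.woff2": [(0x0000, 0x00FF)],
    "roboto-latin-ext.woff2": [(0x0100, 0x017F)],
}


def in_ranges(code_point: int, ranges: list[tuple[int, int]]) -> bool:
    """True if the code point falls inside any (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def faces_needed(text: str) -> list[str]:
    """Font files whose unicode-range matches at least one character."""
    return [
        name
        for name, ranges in FACES.items()
        if any(in_ranges(ord(ch), ranges) for ch in text)
    ]


print(faces_needed("Hello, world"))          # plain ASCII text
print(faces_needed("\u0141\u00f3d\u017a"))   # the Polish city name Lodz with diacritics
```

In this model the ASCII string needs only the first file, while the Polish word needs both, because two of its letters sit in U+0100-017F.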

Tools and Methods

Glyphhanger (Recommended for Most Users)

Command-line tool that scans websites and generates optimal subsets automatically.

Installation:

npm install -g glyphhanger

Basic Usage:

# Scan website and create subset
glyphhanger https://yoursite.com --subset=font.ttf --formats=woff2,woff

# Subset to specific Unicode range
glyphhanger --whitelist=U+0000-00FF --subset=font.ttf --formats=woff2

# Subset to specific characters
glyphhanger --whitelist="ABCDEFabcdef0123456789" --subset=font.ttf

Pros: Automated, accurate, supports multiple formats
Cons: Requires Node.js and command-line knowledge; the --subset option also relies on pyftsubset (fonttools) being installed

pyftsubset (FontTools - Advanced Users)

Python-based tool with extensive options and fine control.

Installation:

pip install fonttools brotli

Basic Latin Subset:

pyftsubset font.ttf \
  --output-file=font-subset.woff2 \
  --flavor=woff2 \
  --layout-features='*' \
  --unicodes="U+0000-00FF,U+0131,U+0152-0153"

Pros: Most powerful, flexible, industry standard
Cons: Steeper learning curve, Python required
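
When several font files need the same treatment, the command above can be wrapped in a small script. The following is a sketch under assumptions: the `fonts/` directory, the `-subset.woff2` naming scheme, and both function names are hypothetical, and only the flags already shown above are used.

```python
import subprocess
from pathlib import Path

# Same range as the Basic Latin command above
LATIN_RANGE = "U+0000-00FF,U+0131,U+0152-0153"


def build_pyftsubset_args(font_path: Path, unicodes: str = LATIN_RANGE) -> list[str]:
    """Argument list for one pyftsubset call, writing WOFF2 next to the source."""
    output = font_path.with_name(f"{font_path.stem}-subset.woff2")
    return [
        "pyftsubset",
        str(font_path),
        f"--output-file={output}",
        "--flavor=woff2",
        "--layout-features=*",   # no shell involved, so no quoting needed
        f"--unicodes={unicodes}",
    ]


def subset_directory(font_dir: Path) -> None:
    """Run pyftsubset for every .ttf in font_dir (requires fonttools and brotli)."""
    for font_path in sorted(font_dir.glob("*.ttf")):
        subprocess.run(build_pyftsubset_args(font_path), check=True)


# Show the command that would run for one (hypothetical) font file
print(build_pyftsubset_args(Path("fonts/Roboto-Regular.ttf")))
```

Calling `subset_directory(Path("fonts"))` would then process the whole directory, always reading from the original files rather than from earlier subsets.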

Font Squirrel Webfont Generator (Beginners)

Online GUI tool requiring no installation or command-line knowledge.

URL:

fontsquirrel.com/tools/webfont-generator

Process:

  1. Upload your font file
  2. Select "Expert" mode
  3. Choose character sets to include (Basic Latin, Latin Extended, etc.)
  4. Select formats (WOFF2, WOFF)
  5. Download subset fonts

Pros: Easy, no technical skills needed, visual interface
Cons: Manual process, uploads fonts to third party, less control

Tool Comparison

Tool | Skill Level | Automation | Best For
Glyphhanger | Intermediate | High | Most projects
pyftsubset | Advanced | Medium | Custom workflows
Font Squirrel | Beginner | Low | One-off subsets

Implementation Guide

Complete Subsetting Workflow

  1. Analyze Content Needs
    • Determine primary language(s) your site serves
    • Check if you have user-generated content
    • Identify special characters needed (currency symbols, etc.)
  2. Choose Subsetting Strategy
    • Conservative: Latin Extended (good for most sites)
    • Aggressive: Basic Latin (English-only, controlled content)
    • Custom: Content-based scanning (maximum optimization)
  3. Create Subsets
    # Using Glyphhanger
    glyphhanger --whitelist=U+0000-00FF,U+0100-017F \
      --subset=*.ttf --formats=woff2,woff
  4. Test Thoroughly
    • Check all pages for missing characters (boxes)
    • Test with actual content, especially user input
    • Verify in Chrome, Safari, Firefox
  5. Deploy and Monitor
    • Upload subset fonts to server
    • Update @font-face declarations
    • Measure performance improvement
    • Monitor for missing character issues
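
Part of the testing step can be automated before deployment by comparing content against the ranges a subset was built from. The sketch below assumes the Latin Extended ranges from step 3; `uncovered_characters` is a hypothetical helper, and it checks only the requested ranges, not whether the font truly contains a glyph for every code point inside them.

```python
# Ranges passed to --whitelist in step 3
LATIN_EXTENDED = [(0x0000, 0x00FF), (0x0100, 0x017F)]


def uncovered_characters(text: str, ranges: list[tuple[int, int]]) -> list[str]:
    """Non-whitespace characters in text that fall outside every range."""
    missing = {
        ch
        for ch in text
        if not ch.isspace()
        and not any(low <= ord(ch) <= high for low, high in ranges)
    }
    return [f"U+{ord(ch):04X} {ch}" for ch in sorted(missing)]


# Sample content with a euro sign and a right arrow
sample = "Price: 20 \u20ac for the na\u00efve caf\u00e9 \u2192 order"
print(uncovered_characters(sample, LATIN_EXTENDED))
```

Here the accented letters pass, while the euro sign and the arrow are reported, which suggests either widening the subset or confirming that the fallback font renders them acceptably.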

CSS Implementation Example

/* Subset font with unicode-range */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-latin.woff2') format('woff2'),
       url('/fonts/roboto-latin.woff') format('woff');
  font-weight: 400;
  font-style: normal;
  font-display: swap;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153;
}

/* Fallback to system fonts for missing characters */
body {
  font-family: 'Roboto', -apple-system, BlinkMacSystemFont, 
               'Segoe UI', Arial, sans-serif;
}

Tradeoffs and Considerations

Risks and Limitations

1. Missing Characters Display as Boxes

If the subset doesn't include a character, it appears as ☐ or renders in a fallback font, which can look inconsistent next to the surrounding text.

Mitigation: Use conservative subsetting (Latin Extended), comprehensive fallback fonts

2. User-Generated Content Problems

Comments, reviews, forum posts may contain unexpected characters (emoji, foreign text).

Mitigation: Include Latin Extended, rely on fallback fonts for edge cases

3. Maintenance Overhead

Content changes may require regenerating subsets with different characters.

Mitigation: Automate subsetting in build process, keep originals for regeneration

When NOT to Subset

  • Truly international sites: Serving many languages with unpredictable content
  • Heavy user-generated content: Forums, social networks where users contribute diverse text
  • Technical documentation: May need math symbols, code characters, special notation
  • Limited technical resources: If maintenance burden outweighs performance gain
  • Variable font with few axes: If a single variable font is already small enough

Best Practices

  • Start conservative (Latin Extended), optimize further if traffic justifies
  • Always keep original font files for regenerating subsets
  • Use --layout-features='*' in pyftsubset to preserve ligatures and kerning
  • Document which characters are included in each subset
  • Test thoroughly across all page types and content
  • Implement comprehensive fallback font stacks
  • Consider unicode-range for progressive loading
  • Automate subsetting in your build pipeline

Advanced Subsetting Techniques

Unicode-Range Progressive Loading

Create multiple subsets and let the browser download only what's needed:

/* Core characters - needed on virtually every page */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-core.woff2') format('woff2');
  unicode-range: U+0020-007F;  /* ASCII */
}

/* Extended Latin - loads only if page uses these */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-extended.woff2') format('woff2');
  unicode-range: U+0100-017F;  /* Latin Extended-A */
}

/* Symbols - loads only if needed */
@font-face {
  font-family: 'Roboto';
  src: url('/fonts/roboto-symbols.woff2') format('woff2');
  unicode-range: U+2000-206F;  /* Punctuation */
}

Automated Build Integration

Integrate subsetting into your build process:

Example: npm script

{
  "scripts": {
    "subset-fonts": "glyphhanger --whitelist=U+0000-017F --subset=fonts/*.ttf --formats=woff2,woff",
    "build": "npm run subset-fonts && next build"
  }
}

Content Monitoring

Set up monitoring to detect missing characters in production:

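// Heuristic check for glyphs that fall back to another font.
// document.fonts.check() only reports whether the needed faces are loaded,
// not whether they contain a glyph, so compare rendered widths instead.
document.fonts.ready.then(() => {
  const ctx = document.createElement('canvas').getContext('2d');
  const width = (font, char) => {
    ctx.font = font;
    return ctx.measureText(char).width;
  };
  const uniqueChars = [...new Set(document.body.innerText)]
    .filter(char => char.trim() !== '');  // skip whitespace

  uniqueChars.forEach(char => {
    // If Roboto supplies the glyph, the width is the same whichever fallback follows it
    const viaSerif = width('16px Roboto, serif', char);
    const viaMono = width('16px Roboto, monospace', char);
    if (viaSerif !== viaMono) {
      console.warn('Possibly missing glyph:', char);
      // Send to monitoring service
    }
  });
});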

Summary: Mastering Font Subsetting

Font subsetting is one of the highest-impact web performance optimizations, achieving 50-90% file size reduction with proper implementation. Start with Latin Extended subsetting for broad coverage and good performance gains. For controlled content sites, progress to Basic Latin or content-based subsetting for maximum optimization. Always maintain original fonts, test thoroughly, and implement comprehensive fallback strategies.

Tools like Glyphhanger make subsetting accessible, while techniques like unicode-range progressive loading enable advanced optimization. The key is balancing performance gains against character coverage needs for your specific use case. When done right, subsetting delivers faster page loads, better Core Web Vitals, and improved user experience without sacrificing typography quality.

Written & Verified by

Sarah Mitchell

Product Designer, Font Specialist