Building a website that serves readers in multiple languages is straightforward once you understand two CSS primitives that most developers overlook: the unicode-range @font-face descriptor and the :lang() pseudo-class selector. Together, they form the backbone of every well-engineered multilingual font system, from small bilingual blogs to large-scale international publishing platforms.
The fundamental challenge of multilingual typography is that different writing systems have vastly different requirements. A Latin font covering English, French, and German needs roughly 200-400 glyphs and weighs 15-40KB as WOFF2. A Japanese font covering all common kanji needs 6,000-10,000 glyphs and runs to several megabytes even when compressed. Serving both from the same @font-face declaration without splitting would force every visitor, including English-only readers, to download the entire Japanese font. That is not acceptable performance.
The solution is selective loading: declare multiple @font-face rules each covering a specific Unicode range, and let the browser download only the subsets whose characters appear in the rendered page. Pair this with :lang() selectors to switch font families entirely for different language contexts, and you have a system that is both bandwidth-efficient and typographically excellent in every supported language.
This guide walks through the complete technical implementation: how unicode-range works at the browser level, how to write effective :lang() stacks for the major writing systems, performance strategies for sites that mix Latin, CJK, Arabic, and Indic scripts, and a practical checklist for shipping multilingual font support correctly the first time.
Understanding Multilingual Font Requirements
Different writing systems impose fundamentally different demands on web font infrastructure. Understanding these differences determines which loading strategy is appropriate for your site.
Latin and Latin Extended Scripts
Latin-based languages—English, French, German, Spanish, Portuguese, Polish, Czech, Vietnamese, and many others—share a core alphabet augmented with diacritical marks. Basic Latin covers ASCII (U+0020-U+007E). Latin-1 Supplement (U+0080-U+00FF) adds accented characters for Western European languages. Latin Extended-A and -B (U+0100-U+024F) add characters for Central/Eastern European languages. Vietnamese requires the Latin Extended Additional block (U+1E00-U+1EFF) for its stacked tone marks.
A single well-subsetted WOFF2 covering all Latin ranges typically weighs 25-60KB—small enough that splitting by language within Latin is rarely worth the complexity. Use a single Latin font file covering U+0000-U+024F and U+1E00-U+1EFF for broad European coverage.
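To check whether a given string stays inside that single Latin file, the two ranges can be tested directly. This is a minimal sketch; the helper name isCoveredByLatinFile is made up for illustration and is not part of any library.

```javascript
// Ranges taken from the recommendation above: U+0000-024F and U+1E00-1EFF.
const LATIN_RANGES = [
  [0x0000, 0x024f], // Basic Latin through Latin Extended-B
  [0x1e00, 0x1eff], // Latin Extended Additional (Vietnamese)
];

// Hypothetical helper: true if every code point in `text` falls in a Latin range.
function isCoveredByLatinFile(text) {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const covered = LATIN_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi);
    if (!covered) return false;
  }
  return true;
}

console.log(isCoveredByLatinFile("Vi\u1EC7t Nam")); // precomposed Vietnamese letter
console.log(isCoveredByLatinFile("\u65E5\u672C")); // two kanji
```

Text stored in decomposed form (a base letter followed by combining marks from U+0300-036F) falls outside these two ranges, so normalize content to NFC or add that block to the subset.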
CJK Scripts: Chinese, Japanese, Korean
The CJK Unified Ideographs block (U+4E00-U+9FFF) holds roughly 21,000 characters shared between Chinese, Japanese, and Korean (20,902 in the original Unicode repertoire), though each language uses a different subset with different preferred glyph forms. Simplified and Traditional Chinese draw on largely different characters within the block, and where they share a code point the preferred glyph shape often differs; Japanese adds hiragana (U+3040-U+309F), katakana (U+30A0-U+30FF), and its own kanji selection; Korean uses Hangul syllables (U+AC00-U+D7A3, 11,172 characters).
This glyph count makes CJK fonts categorically different from Latin fonts. A full Japanese font runs to several megabytes even as WOFF2, which is not viable to serve as a single file. The solution is either Google Fonts automatic slicing (which splits each CJK font into a hundred or more small subsets, each with its own unicode-range) or manual subsetting to the specific characters your content uses.
Arabic, Hebrew, and RTL Scripts
Arabic (U+0600-U+06FF) and Hebrew (U+0590-U+05FF) are right-to-left scripts with additional complexity: Arabic letters take different forms depending on their position in a word (initial, medial, final, isolated). This requires OpenType features—calt (contextual alternates), init, medi, fina, isol—to be present in the font and rendered by the browser's text shaping engine.
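A well-subsetted Arabic WOFF2 covering the Arabic block plus Arabic Presentation Forms typically weighs 50-120KB. RTL scripts also need their base direction declared in addition to font setup: set dir="rtl" in the HTML, and treat CSS direction: rtl as a backstop rather than a replacement, since direction belongs to the content and should survive without styles.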
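The positional forms live in the font and the shaping engine, not in the text. Well-formed Arabic text stores only the base letter, and the legacy Presentation Forms code points fold back to that letter under compatibility normalization. A small illustration using the standard String.prototype.normalize method:

```javascript
// U+0628 is ARABIC LETTER BEH. U+FE8F..U+FE92 are its legacy
// isolated, final, initial, and medial presentation forms.
const BEH = "\u0628";
const presentationForms = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"];

// NFKC normalization folds each presentation form back to the
// base letter that well-formed text should contain.
const normalized = presentationForms.map((form) => form.normalize("NFKC"));

console.log(normalized.every((ch) => ch === BEH));
```

This is why a font without init, medi, fina, and isol lookups cannot be rescued by markup: the text carries one code point per letter and the font must supply the variants.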
Indic Scripts: Devanagari, Tamil, Bengali, and Others
Indic scripts (Devanagari U+0900-U+097F, Tamil U+0B80-U+0BFF, Bengali U+0980-U+09FF, and others) require complex text shaping involving consonant conjuncts, vowel sign reordering, and mark positioning. The OpenType layout features a given script relies on (half, pres, blws, abvs, pstf, and akhn, among others) must be present in the font and correctly implemented. Without them, Devanagari text will render as disconnected letters rather than properly formed words. Indic WOFF2 subsets are generally 40-100KB due to the moderate glyph count combined with complex glyph tables.
When a Single Pan-Unicode Font Makes Sense
Google's Noto project covers virtually every writing system in a single design language, delivered as a family of per-script fonts (Noto Sans JP, Noto Sans Devanagari, and so on) rather than one file. When your site has low traffic in secondary languages or your content management system generates text in many scripts unpredictably, using Noto Sans with Google Fonts automatic subsetting is a pragmatic choice. The trade-off is typographic character: a pan-Unicode family is dependable everywhere, but a typeface chosen for one language (Hiragino Sans for Japanese, IBM Plex Arabic for Arabic) may suit native readers better. Use language-optimized fonts for your primary content languages and Noto as a fallback for everything else.
Using unicode-range for Efficient Loading
The unicode-range CSS descriptor tells the browser which Unicode code points a @font-face covers. The browser inspects every character in the rendered page and downloads only the font files whose unicode-range intersects with characters actually present. A page containing only English text will not download a CJK font file, even if the CJK @font-face is declared in the stylesheet.
This is distinct from the CSS font-family fallback mechanism. With unicode-range, multiple @font-face rules can share the same family name but cover different character ranges. The browser assembles the complete font virtually from these separate files, selecting the appropriate file per character.
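The per-character selection described above can be modeled in a few lines. The sketch below is a simplification under stated assumptions: it ignores weight and style matching, does not handle wildcard ranges such as U+4??, and the file names are hypothetical. It does follow the rule that when ranges overlap, the rule declared later is checked first.

```javascript
// Parse a unicode-range value such as "U+3040-309F, U+20AC" into [lo, hi] pairs.
function parseUnicodeRange(value) {
  return value.split(",").map((part) => {
    const [lo, hi = lo] = part.trim().replace(/^U\+/i, "").split("-");
    return [parseInt(lo, 16), parseInt(hi, 16)];
  });
}

// Given @font-face descriptors in stylesheet order, return the URLs a browser
// would need for `text`. When ranges overlap, the later declaration wins.
function filesNeeded(faces, text) {
  const needed = new Set();
  for (const ch of new Set(text)) {
    const cp = ch.codePointAt(0);
    for (let i = faces.length - 1; i >= 0; i--) {
      const ranges = parseUnicodeRange(faces[i].unicodeRange);
      if (ranges.some(([lo, hi]) => cp >= lo && cp <= hi)) {
        needed.add(faces[i].url);
        break;
      }
    }
  }
  return [...needed].sort();
}

// Hypothetical file names for illustration.
const faces = [
  { url: "/fonts/sitefont-latin.woff2", unicodeRange: "U+0000-00FF, U+0100-024F" },
  { url: "/fonts/sitefont-japanese.woff2", unicodeRange: "U+3040-309F, U+30A0-30FF, U+4E00-9FFF" },
];

console.log(filesNeeded(faces, "Hello"));
console.log(filesNeeded(faces, "Hello \u65E5\u672C\u8A9E"));
```

Characters that no rule covers produce no download at all; the browser moves on to the next family in the font-family list.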
Latin and CJK Split Example
This pattern splits a font family across three files: Latin (small, always needed for UI), Japanese CJK (large, only downloaded if kanji or kana appear), and Korean Hangul (medium, only if Hangul appears). All three share the family name SiteFont.
/* Latin: covers Basic Latin, Latin-1 Supplement,
Latin Extended-A/B, Latin Extended Additional */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-latin.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+0000-00FF, U+0100-024F, U+1E00-1EFF,
U+2000-206F, U+2074, U+20AC, U+2122,
U+2191, U+2193, U+2212, U+2215, U+FEFF;
}
/* Japanese: hiragana, katakana, and common kanji */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-japanese.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+3040-309F, /* Hiragana */
U+30A0-30FF, /* Katakana */
U+4E00-9FFF, /* CJK Unified Ideographs */
U+3400-4DBF, /* CJK Extension A */
U+FF00-FFEF; /* Halfwidth/Fullwidth Forms */
}
/* Korean: Hangul syllables and Jamo */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-korean.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+1100-11FF, /* Hangul Jamo */
U+AC00-D7AF; /* Hangul Syllables */
}
body {
font-family: 'SiteFont', system-ui, sans-serif;
}
Arabic unicode-range Split
Arabic requires its own font file with the correct OpenType shaping tables. The unicode-range should include not just the Arabic block but also Arabic Supplement (U+0750-U+077F) for additional letters used by other Arabic-script languages, plus the Arabic Presentation Forms blocks: legacy code points for positional variants that still appear in older content.
/* Arabic font with full shaping support */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-arabic.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+0600-06FF, /* Arabic */
U+0750-077F, /* Arabic Supplement */
U+08A0-08FF, /* Arabic Extended-A */
U+FB50-FDFF, /* Arabic Presentation Forms-A */
U+FE70-FEFF; /* Arabic Presentation Forms-B */
}
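/* Persian/Farsi: a separate family name, applied with :lang(fa).
unicode-range selects by character, not by language, so a second
'SiteFont' rule covering U+0600-06FF would override the Arabic
file above for every Arabic-script character on the page. */
@font-face {
font-family: 'SiteFont Persian';
src: url('/fonts/sitefont-persian.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+0600-06FF, U+0750-077F,
U+200C-200D; /* ZWNJ and ZWJ for Persian */
}
:lang(fa) {
font-family: 'SiteFont Persian', 'SiteFont', system-ui, sans-serif;
}
Devanagari (Hindi, Sanskrit, Marathi)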
/* Devanagari for Hindi, Sanskrit, Marathi */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-devanagari.woff2') format('woff2');
font-weight: 400;
font-style: normal;
font-display: swap;
unicode-range: U+0900-097F, /* Devanagari */
U+A8E0-A8FF, /* Devanagari Extended */
U+1CD0-1CFF, /* Vedic Extensions */
U+200C-200D, /* ZWNJ, ZWJ (essential for conjuncts) */
U+20B9, /* Indian Rupee sign ₹ */
U+25CC; /* Dotted circle (placeholder) */
}
The ZWNJ (U+200C, Zero Width Non-Joiner) and ZWJ (U+200D, Zero Width Joiner) are invisible control characters critical to correct Indic rendering. Placed after a virama, ZWNJ prevents the conjunct from forming and leaves the virama visible, while ZWJ requests the half-form of the consonant instead of the full conjunct. Always include them in Devanagari and other Indic unicode-range declarations.
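The three sequences below show where the joiners sit in the underlying text. This is only an illustration of the code point order; how each sequence actually renders depends on the font and the shaping engine.

```javascript
const KA = "\u0915";     // DEVANAGARI LETTER KA
const SSA = "\u0937";    // DEVANAGARI LETTER SSA
const VIRAMA = "\u094D"; // DEVANAGARI SIGN VIRAMA (halant)
const ZWNJ = "\u200C";
const ZWJ = "\u200D";

const sequences = {
  conjunct: KA + VIRAMA + SSA,              // default: shaped as one conjunct
  explicitVirama: KA + VIRAMA + ZWNJ + SSA, // ZWNJ: visible virama, no conjunct
  halfForm: KA + VIRAMA + ZWJ + SSA,        // ZWJ: half-form of KA, then SSA
};

// Print each sequence as a list of code points.
for (const [name, text] of Object.entries(sequences)) {
  const codePoints = [...text].map(
    (c) => "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
  console.log(name, codePoints.join(" "));
}
```

If U+200C and U+200D are missing from the subset's unicode-range, the joiners are resolved from a different font than the letters around them, which can break shaping across the boundary.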
Language-Specific Font Stacks with :lang()
While unicode-range controls which font files load, the :lang() CSS pseudo-class controls which font families apply to specific language contexts. These two mechanisms solve different problems and work best together.
The :lang() selector matches elements whose language is set via the lang HTML attribute—either on the element itself or inherited from an ancestor. It supports language subtags: :lang(zh) matches both lang="zh-Hans" and lang="zh-Hant", while :lang(zh-Hans) matches only Simplified Chinese. Use the most specific subtag your content requires.
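The subtag matching can be sketched as a prefix test. This is a simplified model of the basic rule only (identical, or identical up to a hyphen, compared case-insensitively); it does not cover the wildcard ranges that newer selector drafts allow, and the function name is illustrative.

```javascript
// Simplified :lang() matching: the range matches a tag that is identical to it
// or that starts with it followed by "-". Comparison is case-insensitive.
function langMatches(range, tag) {
  const r = range.toLowerCase();
  const t = tag.toLowerCase();
  return t === r || t.startsWith(r + "-");
}

console.log(langMatches("zh", "zh-Hans"));      // broad range, specific tag
console.log(langMatches("zh-Hans", "zh-Hant")); // sibling subtags
console.log(langMatches("zh-Hans", "zh-CN"));   // script range, region tag
```

The last case is the reason stylesheets often list both script and region forms: content tagged lang="zh-CN" is not matched by :lang(zh-Hans), even though both usually mean Simplified Chinese.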
Complete :lang() Font Stack Example
This example shows a real-world :lang() implementation covering Japanese, Simplified Chinese, Traditional Chinese, Korean, Arabic, Persian, Hindi, and Thai. Each stack lists a web font first, then high-quality system fonts that ship with relevant operating systems, then a generic fallback.
/* Japanese */
:lang(ja) {
font-family: 'Noto Sans JP', 'Hiragino Kaku Gothic ProN',
'Yu Gothic', 'Meiryo', sans-serif;
word-break: normal; /* CJK text already wraps between characters; break-all would also split embedded Latin words */
line-height: 1.8; /* More space needed for CJK */
}
/* Simplified Chinese */
:lang(zh-Hans),
:lang(zh-CN) {
font-family: 'Noto Sans SC', 'PingFang SC',
'Microsoft YaHei', 'Source Han Sans CN',
sans-serif;
word-break: normal;
line-height: 1.8;
}
/* Traditional Chinese */
:lang(zh-Hant),
:lang(zh-TW),
:lang(zh-HK) {
font-family: 'Noto Sans TC', 'PingFang TC',
'Microsoft JhengHei', 'Source Han Sans TW',
sans-serif;
word-break: normal;
line-height: 1.8;
}
/* Korean */
:lang(ko) {
font-family: 'Noto Sans KR', 'Apple SD Gothic Neo',
'Malgun Gothic', 'Nanum Gothic', sans-serif;
word-break: keep-all; /* Korean: break only at spaces, not inside words */
line-height: 1.7;
}
/* Arabic (:lang(ar) matches Arabic only; Persian, Urdu, and
other Arabic-script languages need their own rules) */
:lang(ar) {
font-family: 'Noto Naskh Arabic', 'IBM Plex Arabic',
'Cairo', 'Amiri', 'Arial', sans-serif;
direction: rtl; /* Backstop: also set dir="rtl" in the HTML */
text-align: right;
line-height: 1.9; /* Arabic script needs more vertical space */
}
/* Persian/Farsi */
:lang(fa) {
font-family: 'Vazirmatn', 'Noto Naskh Arabic',
'Tahoma', sans-serif;
direction: rtl;
text-align: right;
line-height: 1.9;
}
/* Hindi (Devanagari) */
:lang(hi) {
font-family: 'Noto Sans Devanagari', 'Hind',
'Mangal', 'Kokila', sans-serif;
line-height: 1.8; /* Devanagari marks extend above/below */
}
/* Thai */
:lang(th) {
font-family: 'Noto Sans Thai', 'Sarabun',
'Tahoma', 'Leelawadee UI', sans-serif;
line-height: 1.9; /* Thai stacks tone marks above letters */
}
Applying lang Attributes in HTML
The :lang() selector only works if the corresponding HTML lang attribute is set. For multilingual content, set the base language on the <html> element and override it on specific sections or inline elements that use a different language.
<!-- Base language set on html element -->
<html lang="en">
<!-- Override for a Japanese paragraph -->
<p lang="ja">日本語のテキストがここに表示されます。</p>
<!-- Mixed content: English page with Arabic quote -->
<blockquote lang="ar" dir="rtl">
الخط العربي جميل وله تاريخ عريق
</blockquote>
<!-- CMS-generated content with dynamic lang -->
<article lang="{{ page.language }}">
{{ page.content }}
</article>
Adjusting Typography Per Script
Beyond font-family, :lang() lets you tune spacing and layout properties that differ by script. Letter-spacing that improves Latin readability often looks wrong for Arabic, whose letters join contextually. Line-breaking rules differ too: Chinese and Japanese can break between almost any two characters by default, while Korean is often set with word-break: keep-all so that lines break only at spaces. Line-height needs to increase for scripts with tall stacks like Thai or Tibetan.
/* Latin: letter-spacing is fine */
:lang(en),
:lang(fr),
:lang(de) {
letter-spacing: 0.01em;
}
/* CJK: never add letter-spacing to CJK text */
:lang(ja),
:lang(zh),
:lang(ko) {
letter-spacing: 0;
}
/* Arabic: letter-spacing disrupts joining glyphs */
:lang(ar),
:lang(fa),
:lang(ur) {
letter-spacing: 0;
font-feature-settings: 'calt' 1, 'clig' 1;
}
/* Devanagari: keep kerning on. Conjunct and mark-positioning
features are applied by the shaping engine by default. */
:lang(hi),
:lang(mr),
:lang(sa) {
font-feature-settings: 'kern' 1;
}
Performance Optimization for Multilingual Sites
Multilingual font loading is one of the most significant performance challenges for international websites. A page serving Japanese, Arabic, and Latin content without optimization could easily require 10-30MB of font data. With the right techniques, that drops to under 500KB of actually-needed font data per page.
Strategy 1: Preload the Primary Script Font
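Preload only the font file for the primary language of each page. For a Japanese page, preload the Japanese CJK subset. For an Arabic page, preload the Arabic font. Do not preload all language fonts on every page, because that defeats the purpose of unicode-range splitting. The preload href must match the src URL in the corresponding @font-face rule exactly; otherwise the preloaded file goes unused and the font is fetched again.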
<!-- In <head>, preload for a Japanese page -->
<link rel="preload"
href="/fonts/noto-sans-jp-400.woff2"
as="font"
type="font/woff2"
crossorigin="anonymous">
<!-- Latin preload for English/European pages -->
<link rel="preload"
href="/fonts/inter-latin-400.woff2"
as="font"
type="font/woff2"
crossorigin="anonymous">
Strategy 2: Use Google Fonts for CJK Auto-Slicing
Google Fonts automatically splits each CJK font into a hundred or more small slices, each with its own unicode-range. A page with 400 unique kanji downloads only the slices that contain those characters rather than the entire multi-megabyte font. This is the most bandwidth-efficient approach for CJK and is difficult to replicate with self-hosted fonts.
/* Google Fonts handles CJK subsetting automatically.
Use the display=swap parameter for font-display: swap */
/* In HTML <head> (preferred: the browser discovers the stylesheet sooner):
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Noto+Sans+JP:wght@400;700&family=Noto+Sans+SC:wght@400;700&family=Inter:wght@400;700&display=swap">
*/
/* Or via @import in CSS. Keep the URL on one line:
a line break inside a quoted CSS string is invalid. */
@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+JP:wght@400;700&family=Noto+Sans+SC:wght@400;700&family=Inter:wght@400;700&display=swap');
Strategy 3: font-display: swap for Non-Blocking Render
Always use font-display: swap (or optional for non-critical scripts) on all @font-face declarations. This prevents invisible text (FOIT) by rendering immediately with a system font, then swapping to the web font once loaded. For above-the-fold content, pair with preload links to minimize the swap delay.
/* Primary font: swap (text shown immediately) */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-latin.woff2') format('woff2');
font-display: swap;
unicode-range: U+0000-00FF, U+0100-024F;
}
/* Secondary language font: optional
(browser may skip if load time is too long) */
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-arabic.woff2') format('woff2');
font-display: optional;
unicode-range: U+0600-06FF, U+FB50-FDFF, U+FE70-FEFF;
}
Strategy 4: Self-Hosted Font Subsetting with pyftsubset
For sites with known content, subset fonts to only the characters that actually appear in your content using the pyftsubset tool from the fonttools library. A CJK font subset to the 3,000 most common characters (covering 99%+ of typical web content) weighs 300-800KB instead of several megabytes.
# Install fonttools (brotli is required for WOFF2 output)
pip install fonttools brotli
# Subset NotoSansJP to hiragana, katakana, and the whole
# CJK Unified Ideographs block. To keep only the most common
# kanji, pass an explicit character list with --text-file instead.
pyftsubset NotoSansJP-Regular.ttf \
--unicodes="U+3040-309F,U+30A0-30FF,U+4E00-9FFF" \
--flavor=woff2 \
--output-file=NotoSansJP-subset.woff2
# Subset for specific content using a text file
pyftsubset NotoSansJP-Regular.ttf \
--text-file=content-characters.txt \
--flavor=woff2 \
--output-file=NotoSansJP-content.woff2
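One way to produce the character list for --text-file is to collect the distinct characters from your content. The sketch below assumes the page text is already available as an array of strings; how you export it from your CMS is up to you, and the function name is illustrative.

```javascript
// Collect the distinct characters used across a set of strings, sorted by
// code point, ready to be written to a file such as content-characters.txt.
function uniqueCharacters(texts) {
  const seen = new Set();
  for (const text of texts) {
    for (const ch of text) {
      if (!/[\n\r\t]/.test(ch)) seen.add(ch); // line breaks and tabs need no glyph
    }
  }
  return [...seen]
    .sort((a, b) => a.codePointAt(0) - b.codePointAt(0))
    .join("");
}

// Two sample strings standing in for real page content.
const pages = ["\u65E5\u672C\u8A9E", "\u65E5\u672C \u306E \u672C"];
console.log(uniqueCharacters(pages));
```

In Node.js the result can be saved with fs.writeFileSync("content-characters.txt", uniqueCharacters(pages)). Re-run the step whenever content changes, since any character missing from the subset falls back to another font.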
Strategy 5: Lazy-Load Non-Primary Scripts
For sites where non-primary language content is below the fold or in collapsible sections, defer font loading until the content is visible. Use an Intersection Observer to add the @font-face rules only when the multilingual section scrolls into view. This prevents any font loading cost for users who never scroll to that section.
// Lazy-load Japanese font when section is visible
const observer = new IntersectionObserver((entries) => {
entries.forEach(entry => {
if (entry.isIntersecting) {
const style = document.createElement('style');
style.textContent = `
@font-face {
font-family: 'SiteFont';
src: url('/fonts/sitefont-japanese.woff2')
format('woff2');
font-display: swap;
unicode-range: U+3040-30FF, U+4E00-9FFF;
}
`;
document.head.appendChild(style);
observer.disconnect();
}
});
});
const japaneseSection = document.querySelector('[lang="ja"]');
if (japaneseSection) observer.observe(japaneseSection);
Common Font Combinations for Major Languages
This table summarizes recommended web fonts, system font fallbacks, and unicode-range values for the most commonly supported languages. All listed web fonts are available via Google Fonts unless noted otherwise.
| Language / Script | Primary Web Font | System Fallback | Key unicode-range |
|---|---|---|---|
| English (Latin) | Inter, Roboto, Source Sans 3 | system-ui, -apple-system | U+0000-00FF |
| French, German, Spanish | Inter, Lato, Open Sans | system-ui, Arial | U+0000-024F |
| Polish, Czech, Romanian | Inter, Nunito, Raleway | system-ui, Calibri | U+0100-024F |
| Vietnamese | Be Vietnam Pro, Nunito | system-ui, Arial | U+1E00-1EFF |
| Russian / Cyrillic | Inter, Roboto, PT Sans | Arial, Tahoma | U+0400-04FF |
| Greek | Inter, Roboto, Source Sans 3 | Arial, Helvetica | U+0370-03FF |
| Japanese | Noto Sans JP, M PLUS Rounded 1c | Hiragino Kaku Gothic, Yu Gothic | U+3040-30FF, U+4E00-9FFF |
| Simplified Chinese | Noto Sans SC, ZCOOL XiaoWei | PingFang SC, Microsoft YaHei | U+4E00-9FFF, U+3400-4DBF |
| Traditional Chinese | Noto Sans TC, Noto Serif TC | PingFang TC, Microsoft JhengHei | U+4E00-9FFF, U+F900-FAFF |
| Korean | Noto Sans KR, Nanum Gothic | Apple SD Gothic Neo, Malgun Gothic | U+AC00-D7AF, U+1100-11FF |
| Arabic | Noto Naskh Arabic, IBM Plex Arabic, Cairo | Tahoma, Arial | U+0600-06FF, U+FB50-FDFF |
| Hebrew | Heebo, Rubik, Frank Ruhl Libre | Arial, Tahoma | U+0590-05FF, U+FB1D-FB4F |
| Hindi (Devanagari) | Noto Sans Devanagari, Hind, Mukta | Mangal, Kokila | U+0900-097F |
| Thai | Noto Sans Thai, Sarabun, Kanit | Leelawadee UI, Tahoma | U+0E00-0E7F |
Implementation Checklist
Follow these steps in order when adding multilingual font support to a new or existing website. Each step builds on the previous one and can be verified independently before moving on.
Audit Your Language Requirements
Identify every language and script your site serves. Use analytics to find what languages your actual visitors use. Prioritize fonts for the top 3-5 languages by traffic; add others as secondary concerns. Create a list of Unicode ranges needed per language using the table above as a reference.
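Analytics tell you who visits; a quick tally of the scripts in your own content tells you what you actually render. The following is a rough sketch that assumes you can export a sample of page text as a string. It uses Unicode property escapes, which current browsers and Node.js support, and the list of scripts is only a starting point.

```javascript
// Scripts to tally; the names are Unicode Script property values.
const SCRIPTS = ["Latin", "Han", "Hiragana", "Katakana", "Hangul",
  "Arabic", "Hebrew", "Devanagari", "Thai"];

// Build one matcher per script using Unicode property escapes.
const MATCHERS = SCRIPTS.map((name) => [name, new RegExp(`\\p{Script=${name}}`, "u")]);

// Count code points per script in a sample of site content.
// Spaces, digits, and punctuation belong to no listed script and are skipped.
function countScripts(text) {
  const counts = {};
  for (const ch of text) {
    const hit = MATCHERS.find(([, re]) => re.test(ch));
    if (hit) counts[hit[0]] = (counts[hit[0]] || 0) + 1;
  }
  return counts;
}

console.log(countScripts("Hello \u65E5\u672C\u8A9E \u0645\u0631\u062D\u0628\u0627"));
```

Scripts that show up with meaningful counts are the ones that need a dedicated font file and unicode-range; scripts that appear only occasionally are candidates for a system-font fallback.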
Set lang Attributes Correctly Across Your Site
Ensure the <html lang="..."> attribute reflects each page's primary language. For multilingual pages or embedded foreign-language content, add lang attributes to the specific elements. Without lang attributes, :lang() selectors will not function and screen readers will mispronounce content.
Choose Fonts and Prepare Subsets
For Latin and Cyrillic, select a high-quality variable or static font and subset it to U+0000-024F plus any extended ranges needed. For CJK, use Google Fonts (Noto Sans JP/SC/TC/KR) for automatic slicing, or use pyftsubset to create a frequency-ordered subset if self-hosting. For Arabic and Indic, verify the font has the required OpenType shaping tables before deploying.
Write @font-face Declarations with unicode-range
Create @font-face rules for each subset, sharing the same font-family name across all rules for the same typeface family. Include font-display: swap on every declaration. Add crossorigin="anonymous" to every font preload link, self-hosted fonts included: browsers fetch fonts in anonymous CORS mode, so a preload without the attribute is not reused and the font downloads twice. Test by temporarily removing individual font files to verify the browser correctly falls through to the next font in the stack.
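When the number of subsets grows, generating the rules from a list keeps the family name and descriptors consistent. This is a minimal sketch: the fontFaceRule helper and the subset list are hypothetical, and the output is plain CSS text for you to write into a stylesheet.

```javascript
// Build one @font-face rule as CSS text. `family`, `url`, and `ranges`
// are supplied by the caller; nothing here checks that the file exists.
function fontFaceRule({ family, url, ranges, weight = 400, display = "swap" }) {
  return [
    "@font-face {",
    `  font-family: '${family}';`,
    `  src: url('${url}') format('woff2');`,
    `  font-weight: ${weight};`,
    "  font-style: normal;",
    `  font-display: ${display};`,
    `  unicode-range: ${ranges.join(", ")};`,
    "}",
  ].join("\n");
}

// Hypothetical subset list mirroring the SiteFont examples.
const subsets = [
  { family: "SiteFont", url: "/fonts/sitefont-latin.woff2", ranges: ["U+0000-024F", "U+1E00-1EFF"] },
  { family: "SiteFont", url: "/fonts/sitefont-korean.woff2", ranges: ["U+1100-11FF", "U+AC00-D7AF"] },
];

const css = subsets.map(fontFaceRule).join("\n");
console.log(css);
```

Keep the list in stylesheet order, because the order decides which rule wins when two ranges overlap.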
Write :lang() CSS Rules
Add :lang() declarations to your global stylesheet for each supported language. Include font-family with web font first, system fonts second, and generic fallback last. Adjust line-height, word-break, direction, and letter-spacing as appropriate for each script. Group related language rules together and add comments explaining why each typographic adjustment is needed.
Add Preload Links for Critical Fonts
Add <link rel="preload"> for the primary language font on each page type. For a server-rendered multilingual application, generate preload links dynamically based on the page's primary language. Do not preload fonts for secondary languages or scripts that appear below the fold—let unicode-range handle on-demand loading.
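For server-rendered pages, the preload tag can be chosen from the page's language code. The mapping and file names below are assumptions for illustration; substitute the URLs from your own @font-face rules so that the two match exactly.

```javascript
// Hypothetical mapping from a page's primary language to its font file.
const PRIMARY_FONT = new Map([
  ["ja", "/fonts/sitefont-japanese.woff2"],
  ["ko", "/fonts/sitefont-korean.woff2"],
  ["ar", "/fonts/sitefont-arabic.woff2"],
  ["hi", "/fonts/sitefont-devanagari.woff2"],
]);
const DEFAULT_FONT = "/fonts/sitefont-latin.woff2";

// Return the single preload tag for a page, keyed on the primary subtag
// of its language ("ja-JP" -> "ja"). Unknown languages get the Latin file.
function preloadTag(pageLang) {
  const primary = pageLang.toLowerCase().split("-")[0];
  const href = PRIMARY_FONT.get(primary) ?? DEFAULT_FONT;
  return `<link rel="preload" href="${href}" as="font" type="font/woff2" crossorigin="anonymous">`;
}

console.log(preloadTag("ja-JP"));
console.log(preloadTag("fr"));
```

Emitting exactly one tag per page keeps the preload budget small; every other subset is still fetched on demand through unicode-range.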
Measure and Verify in Browser DevTools
Open Network DevTools and filter by font. Load an English-only page and verify only Latin font subsets download. Load a Japanese page and verify the Japanese font subset downloads while others do not. Check font loading timing and ensure no render-blocking font requests. Use Lighthouse to verify LCP and CLS scores are not degraded by font loading.
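The same check can be scripted from the Resource Timing data the browser already records. This sketch assumes entries shaped like PerformanceResourceTiming objects (a name URL and a transferSize in bytes); the sample data is invented so the function can be run outside a browser.

```javascript
// Summarize font downloads from resource timing entries. In a browser,
// pass performance.getEntriesByType("resource"); plain objects with
// `name` and `transferSize` work the same way for testing.
function fontDownloads(entries) {
  return entries
    .filter((e) => /\.(woff2?|ttf|otf)(\?|$)/i.test(e.name))
    .map((e) => ({ url: e.name, kilobytes: Math.round(e.transferSize / 1024) }));
}

// Sample data standing in for a real page load (values are made up).
const sample = [
  { name: "https://example.com/fonts/sitefont-latin.woff2", transferSize: 30720 },
  { name: "https://example.com/app.js", transferSize: 51200 },
];

console.log(fontDownloads(sample));
```

A transferSize of 0 usually means the file came from cache, or that a cross-origin server did not send a Timing-Allow-Origin header, so test with an empty cache and treat zeros on third-party hosts with caution.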
Test Rendering with Native Speakers
Automated tests cannot catch typographic quality issues. Have native speakers review rendering for each language, checking specifically: conjunct formation in Indic scripts, Arabic letter joining and direction, CJK glyph shapes (Simplified vs Traditional Chinese), Korean syllable boundaries, and Thai tone mark stacking. Small errors in these areas signal an incorrect font or missing OpenType features.
Written & Verified by
Sarah Mitchell
Product Designer, Font Specialist
