Indic Script Font Guide: Devanagari, Tamil, Bengali & More

Indic scripts are among the most typographically demanding writing systems on the web. Unlike Latin text, they require a shaping engine to transform encoded characters into correctly rendered glyphs, and the font must carry the complete OpenType layout tables to make that happen. This guide covers everything from conjunct mechanics to safe subsetting.

TL;DR — Key Takeaways

  • Indic scripts require complex shaping via HarfBuzz; a font without layout tables will show disconnected letters with visible halant marks instead of readable conjuncts
  • Fonts must include features like akhn, half, pres, blws, and mark, and the shaper applies them in a fixed processing order
  • Noto Sans variants cover every major Indic script; for Devanagari specifically, Hind and Mukta offer better typographic texture at body sizes
  • When subsetting, never remove individual code points from an Indic block; always include the entire Unicode range for the script
  • Test rendering with real conjuncts: क्ष (ksha), त्र (tra), श्र (shra), ज्ञ (gya), not just isolated characters

Written & Verified by

Sarah Mitchell

Product Designer, Font Specialist

Understanding Indic Script Complexity

Latin typography operates on a straightforward model: each character maps to one glyph, glyphs stack left to right, and the font engine renders them in Unicode order with minimal intervention. Indic scripts are categorically different. Every major Indic script — Devanagari, Tamil, Bengali, Telugu, Kannada, Malayalam, Gujarati, Gurmukhi, and Odia — requires a shaping engine to mediate between the encoded Unicode sequence and the final glyph sequence that gets painted on screen.

Three mechanisms drive this complexity and every font must support all three:

Consonant Conjuncts

When two consonants appear without an intervening vowel, they fuse into a conjunct ligature. The individual consonants disappear and a single composite glyph takes their place. Devanagari has hundreds of conjuncts; the most common are क्ष (ksha = क + ् + ष), त्र (tra = त + ् + र), श्र (shra = श + ् + र), and ज्ञ (gya = ज + ् + ञ). The halant character (्, U+094D) signals that the conjunct should form.
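The consonant + halant + consonant pattern described above can be reproduced with a few lines of Python. This is a minimal sketch for building test strings; the helper name `conjunct_sequence` and the `TEST_CONJUNCTS` dictionary are illustrative, not part of any library.

```python
VIRAMA = "\u094d"  # Devanagari halant (virama)

def conjunct_sequence(*consonants):
    """Return the encoded form of a conjunct: consonants joined by halant."""
    return VIRAMA.join(consonants)

# The four common conjuncts named in the text, built from their parts.
TEST_CONJUNCTS = {
    "ksha": conjunct_sequence("\u0915", "\u0937"),  # KA + SSA
    "tra": conjunct_sequence("\u0924", "\u0930"),   # TA + RA
    "shra": conjunct_sequence("\u0936", "\u0930"),  # SHA + RA
    "gya": conjunct_sequence("\u091c", "\u091e"),   # JA + NYA
}

for name, text in TEST_CONJUNCTS.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{name}: {text} ({codepoints})")
```

Each string is three code points long; whether it displays as a single fused glyph depends entirely on the font and shaper that render it.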

Vowel Sign Reordering

Indic vowel signs (matras) do not always appear where they are encoded. The Tamil vowel sign ெ (U+0BC6, e) appears visually to the left of its base consonant even though it is encoded after it. Bengali has split vowels that place glyphs both before and after the base. The shaping engine reorders these signs as part of shaping so that the glyphs come out in the correct visual sequence.
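The gap between encoded order and visual order can be inspected with Python's standard `unicodedata` module. The sketch below prints the logical order of Tamil கெ and shows that the Bengali split vowel U+09CB canonically decomposes into a left part and a right part; the variable names are illustrative.

```python
import unicodedata

# Tamil "ke": the consonant is encoded first, the vowel sign second,
# even though the vowel sign is drawn to the left of the consonant.
tamil_ke = "\u0b95\u0bc6"
for ch in tamil_ke:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Bengali VOWEL SIGN O is a split vowel. Its canonical decomposition
# exposes the two pieces that end up on either side of the base.
bengali_o = "\u09cb"
parts = unicodedata.normalize("NFD", bengali_o)
for ch in parts:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```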

Above/Below Base Marks

Dependent vowels and consonant modifiers attach before, above, below, or after the base consonant. In Devanagari, ि (i) is a spacing sign drawn to the left of the base, with its hook arching over it; ु (u) and ू (uu) sit below the base, and the anusvara (ं) sits above it. Above- and below-base marks rely on the font's GPOS table, through the abvm, blwm, mark, and mkmk features, to be placed precisely relative to base glyphs of varying widths.
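Unicode general categories give a quick, if rough, way to tell which signs occupy their own horizontal space (category Mc) and which are non-spacing marks that depend on anchor positioning (category Mn). The sketch below is only a classification aid; the `mark_kind` helper and the `SIGNS` table are illustrative names.

```python
import unicodedata

SIGNS = {
    "vowel sign i": "\u093f",
    "vowel sign u": "\u0941",
    "vowel sign uu": "\u0942",
    "anusvara": "\u0902",
    "chandrabindu": "\u0901",
}

def mark_kind(ch):
    """Classify a character as a spacing mark, non-spacing mark, or neither."""
    category = unicodedata.category(ch)
    return {"Mc": "spacing", "Mn": "non-spacing"}.get(category, "not a mark")

for label, ch in SIGNS.items():
    print(f"U+{ord(ch):04X} {label}: {mark_kind(ch)}")
```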

To illustrate the gap, consider the Devanagari word for "scripture": शास्त्र (shastra). The Unicode sequence is: श (U+0936) + ा (U+093E) + स (U+0938) + ् (U+094D) + त (U+0924) + ् (U+094D) + र (U+0930). That is seven code points. But the correct rendering requires the shaper to: (1) combine स + ् + त into the conjunct स्त, (2) combine स्त + ् + र into the full conjunct स्त्र, and (3) attach the ा matra to the initial श. The result is a two-syllable word (शा + स्त्र) rendered from a seven-code-point string, and the font must carry the substitution rules for every step.
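The code point breakdown above is easy to verify. This short sketch lists the code points of शास्त्र with their Unicode names using only the standard library; `describe` is an illustrative helper name.

```python
import unicodedata

# "shastra", written with escapes so the sequence is unambiguous.
word = "\u0936\u093e\u0938\u094d\u0924\u094d\u0930"

def describe(text):
    """Return (code point, Unicode name) pairs in logical order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for codepoint, name in describe(word):
    print(codepoint, name)
print(len(word), "code points")
```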

A Latin font contains no Devanagari glyphs at all. If no fallback font is available, every Devanagari code point renders as the .notdef glyph (the empty "tofu" box); in practice the browser usually falls back to a system font instead. But even a font that contains all the Devanagari code points will fail to render conjuncts if it lacks the required OpenType GSUB lookup tables. Both pieces must be present: the glyph data and the layout logic.

Required OpenType Features for Indic Scripts

The OpenType specification defines a set of feature tags applied by the shaping engine in a specific order. For Indic scripts, the Universal Shaping Engine (USE) and the older Indic-specific shaping model both require a particular sequence of GSUB (substitution) and GPOS (positioning) lookups. Missing or misordered features cause incorrect rendering — not just visually suboptimal text, but linguistically wrong text.

The processing order matters. In HarfBuzz's Indic shaper, nukt and akhn run first (akhn forms Akhand conjuncts that cannot be broken), then rphf handles reph forms, rkrf handles rakar forms, blwf handles below-base forms, half forms half consonants, pstf handles post-base forms, and vatu and cjct form the remaining conjuncts. Contextual presentation features (pres, abvs, blws, psts, haln) follow, and positioning (abvm, blwm, mark, mkmk) comes last. HarfBuzz applies these in the correct order automatically, but only if the font declares them.

| Feature Tag | Feature Name | Description | Example |
| --- | --- | --- | --- |
| akhn | Akhand | Forms mandatory ligatures that cannot be broken by any other rule. Applied before the other conjunct-forming GSUB features. | क + ् + ष → क्ष |
| rphf | Reph Form | Converts an initial ra + halant into the above-base reph form. The reph is encoded first but drawn at the top of the syllable cluster. | र + ् + क → र्क |
| half | Half Forms | Replaces a consonant + halant sequence with a half-form glyph when no specific conjunct exists. The consonant loses its right-side vertical stroke. | प + ् + त → प्त |
| blwf | Below-Base Forms | Creates the subscript form of a consonant that appears below the base in consonant clusters. Used for ra, va in certain scripts. | ट + ् + र → ट्र |
| pstf | Post-Base Forms | Replaces consonants that appear in post-base position with their post-base variant, such as ya-phala in Bengali. Devanagari rakar forms such as क्र are handled by rkrf instead. | ক + ্ + য → ক্য |
| pres | Pre-Base Substitutions | Applies contextual substitutions to pre-base components of a syllable. Handles ligatures and variant forms that appear before the base consonant. | ि before a wide conjunct → wider ि variant (स्ति) |
| abvs | Above-Base Substitutions | Applies contextual substitutions to above-base components. Used for mark variants and ligatures that depend on the base shape or on neighbouring marks. | reph + ी → combined form (र्की) |
| blws | Below-Base Substitutions | Applies contextual substitutions to below-base components, including below-base vowel sign variants and subscript consonant forms. | क + ु → कु |
| haln | Halant Forms | Produces the visible halant form when a consonant cluster is intentionally split. Used for explicit halant display in educational and linguistic contexts. | क + ् → क् |
| cjct | Conjunct Forms | Forms optional conjunct ligatures beyond those mandated by akhn. Applied after mandatory rules to generate additional ligature combinations. | त + ् + त → त्त |
| mark | Mark Positioning | GPOS feature that positions combining marks relative to their base glyphs using anchor points. Ensures marks attach correctly regardless of base width. | कं: anusvara anchored above क |
| mkmk | Mark-to-Mark Positioning | Positions combining marks relative to other combining marks (not the base). Required when marks stack, for example an anusvara over an above-base vowel sign. | कैं: anusvara positioned relative to ै |

Processing Order is Mandatory

HarfBuzz applies these features in a fixed order defined by the OpenType shaping specification. A font that declares half lookups under the wrong feature tag or places them in the wrong order in the GSUB table will produce incorrect output. If your font is rendering conjuncts incorrectly despite having the correct glyph data, inspect the feature lookup ordering with a tool like fonttools.
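One way to start that inspection is to list the feature tags a font declares and compare them against a checklist. The sketch below assumes fontTools is installed; it uses `TTFont` and the GSUB/GPOS `FeatureList` records. The `EXPECTED` checklist is an assumption for a typical Devanagari font, not a requirement of the specification, and the font path in the usage note is hypothetical.

```python
# Illustrative checklist for a Devanagari font; adjust it per script and font.
EXPECTED = ["akhn", "rphf", "blwf", "half", "cjct",
            "pres", "abvs", "blws", "haln", "mark", "mkmk"]

def declared_features(font_path):
    """Collect every feature tag declared in the font's GSUB and GPOS tables."""
    from fontTools.ttLib import TTFont  # imported lazily; requires fontTools

    font = TTFont(font_path)
    tags = set()
    for table_tag in ("GSUB", "GPOS"):
        if table_tag in font:
            feature_list = font[table_tag].table.FeatureList
            if feature_list is not None:
                tags.update(rec.FeatureTag for rec in feature_list.FeatureRecord)
    return tags

def missing_features(declared, expected=EXPECTED):
    """Return checklist tags the font does not declare, in checklist order."""
    return [tag for tag in expected if tag not in declared]
```

A call such as `missing_features(declared_features("hind-devanagari-400.ttf"))` returns the tags worth investigating. A missing tag is not automatically a bug, because fonts differ in which features they use to reach the same result.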

Implementing Devanagari with CSS

The correct CSS implementation combines @font-face declarations with unicode-range to ensure that the Devanagari font loads only when Devanagari characters appear on the page. The Latin portions of the text continue to use your Latin font. This is the same technique used by Google Fonts internally to split their Noto Devanagari delivery into smaller on-demand files.

/* Step 1: Declare the Devanagari font face */
@font-face {
  font-family: 'Hind';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url('/fonts/hind-devanagari-400.woff2') format('woff2');
  /* Devanagari Unicode block + extended Devanagari */
  unicode-range: U+0900-097F, U+1CD0-1CFF, U+200C-200D,
                 U+20B9, U+25CC, U+A8E0-A8FF;
}

@font-face {
  font-family: 'Hind';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url('/fonts/hind-devanagari-700.woff2') format('woff2');
  unicode-range: U+0900-097F, U+1CD0-1CFF, U+200C-200D,
                 U+20B9, U+25CC, U+A8E0-A8FF;
}

/* Step 2: Declare the Latin companion */
@font-face {
  font-family: 'Hind';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url('/fonts/hind-latin-400.woff2') format('woff2');
  unicode-range: U+0000-00FF, U+0131, U+0152-0153,
                 U+02BB-02BC, U+02C6, U+02DA, U+02DC,
                 U+2000-206F, U+2074, U+20AC, U+2122,
                 U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
}

/* Step 3: Apply using font-family stack + :lang() */
body {
  font-family: 'Hind', system-ui, sans-serif;
}

/* Language-specific overrides */
:lang(hi),
:lang(mr),
:lang(ne) {
  font-family: 'Hind', 'Noto Sans Devanagari', sans-serif;
  /* Devanagari text has taller ascenders than Latin */
  line-height: 1.8;
}

/* Step 4: Set the lang attribute in HTML so :lang() rules and language-specific forms apply */
/* <html lang="hi"> for primary Hindi pages */
/* <span lang="hi">नमस्ते</span> for inline Devanagari */

Critical: Set the lang Attribute

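HarfBuzz chooses the shaping model from the Unicode script of the text, so Devanagari characters are shaped with the Indic model with or without a lang attribute. What lang does control is the OpenType language system the browser requests (which enables language-specific forms through locl, such as Marathi or Nepali variants), along with font fallback, hyphenation, and the :lang() selectors used above. Always set lang="hi" (Hindi), lang="mr" (Marathi), lang="ta" (Tamil), or lang="bn" (Bengali) on the <html> element or the innermost relevant container.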

Unicode Range Breakdown

  • U+0900–097F — Devanagari core block
  • U+1CD0–1CFF — Vedic extensions
  • U+200C–200D — ZWNJ and ZWJ (control joiners)
  • U+20B9 — Indian Rupee sign ₹
  • U+25CC — Dotted circle (placeholder)
  • U+A8E0–A8FF — Devanagari extended
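To check whether a piece of content stays inside these ranges, a small membership test is enough. The following is a minimal sketch using the six ranges listed above; `DEVANAGARI_RANGES` and `uncovered` are illustrative names.

```python
# The unicode-range values from the @font-face rule, as inclusive pairs.
DEVANAGARI_RANGES = [
    (0x0900, 0x097F),  # Devanagari core block
    (0x1CD0, 0x1CFF),  # Vedic extensions
    (0x200C, 0x200D),  # ZWNJ and ZWJ
    (0x20B9, 0x20B9),  # Indian Rupee sign
    (0x25CC, 0x25CC),  # Dotted circle
    (0xA8E0, 0xA8FF),  # Devanagari extended
]

def uncovered(text, ranges=DEVANAGARI_RANGES):
    """Return the sorted characters of text that fall outside every range."""
    return sorted({
        ch for ch in text
        if not any(low <= ord(ch) <= high for low, high in ranges)
    })

print(uncovered("\u0928\u092e\u0938\u094d\u0924\u0947"))  # namaste
print(uncovered("\u20b9 5"))
```

Characters reported by `uncovered` will be served by another @font-face rule or by a fallback font, which is expected for Latin digits and spaces.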

Subsetting Indic Fonts Safely

Font subsetting removes unused glyphs from a font file to reduce its size. For Latin fonts, this is straightforward: scan your content, identify the characters used, and strip everything else. For Indic fonts, this approach will break your rendering. Conjuncts require glyphs that may not appear as standalone characters in your content — a page that never uses the word क्षमा (forgiveness) may still need the conjunct क्ष glyph if any word on the page forms that cluster dynamically.

The safe rule is: include the entire Unicode block for your script, never individual code points. For Devanagari, that means always including U+0900–097F (128 code points), plus the Vedic extensions if your content includes Sanskrit or classical texts. This block-level inclusion ensures that every possible conjunct and vowel combination has the glyph data it needs.

Do NOT do this

# Dangerous: subsets to only characters
# found in content — breaks conjuncts
pyftsubset font.otf \
  --text-file=content.txt \
  --output-file=subset.woff2 \
  --flavor=woff2

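pyftsubset keeps the conjunct glyphs reachable from the characters in content.txt, but any consonant or sign missing from that file is dropped together with every conjunct that depends on it. Text added later (dynamic content, user input, a new article) then falls back to another font mid-word.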

Safe subsetting approach

# Safe: always keep the full Unicode block
pyftsubset font.otf \
  --unicodes="U+0900-097F,U+1CD0-1CFF,U+200C-200D,U+20B9,U+25CC,U+A8E0-A8FF" \
  --layout-features="*" \
  --output-file=subset.woff2 \
  --flavor=woff2

The --layout-features="*" flag preserves every feature in the GSUB/GPOS tables. Keep it on for Indic fonts.

Preserve OpenType Tables

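When using pyftsubset (part of fontTools), pass --layout-features="*". By default, pyftsubset keeps a fixed list of common features plus those required for script shaping, and drops the rest, such as stylistic sets and other discretionary features. Passing "*" keeps everything the font ships with. The real danger is an explicit, Latin-oriented feature list (for example --layout-features=kern,liga): for Indic scripts that deletes the akhn, half, and pres lookups, producing a font that renders code points but not conjuncts. Use our Font Subsetter tool, which applies the correct flags automatically.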
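For build scripts, the same command can be assembled in Python and run through `subprocess`. This is a sketch that only wraps the flags already shown in this section; it assumes `pyftsubset` is on the PATH, and the function names and file paths are hypothetical.

```python
import subprocess

DEVANAGARI_UNICODES = (
    "U+0900-097F,U+1CD0-1CFF,U+200C-200D,U+20B9,U+25CC,U+A8E0-A8FF"
)

def pyftsubset_args(font_path, out_path, unicodes=DEVANAGARI_UNICODES):
    """Build the argument list for a block-level, feature-preserving subset."""
    return [
        "pyftsubset",
        font_path,
        f"--unicodes={unicodes}",
        "--layout-features=*",  # keep every GSUB/GPOS feature
        f"--output-file={out_path}",
        "--flavor=woff2",
    ]

def subset_font(font_path, out_path):
    """Run pyftsubset; raises CalledProcessError if the command fails."""
    subprocess.run(pyftsubset_args(font_path, out_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the `*` value, which a shell would otherwise try to expand.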

Browser Rendering and HarfBuzz

HarfBuzz is the text shaping engine used by Chrome, Firefox, Android, and most Linux desktop environments. It is the de facto standard for Indic text rendering on the web. Safari (macOS and iOS) uses Apple's CoreText engine, which implements the same OpenType Indic specifications but with its own code. The two engines produce functionally identical output for well-formed fonts, but edge cases differ.

When a browser encounters Indic text, the rendering pipeline proceeds as follows:

  1. Unicode Bidi Algorithm

    The browser runs the Unicode Bidirectional Algorithm to detect text direction. Indic scripts are all left-to-right, but mixed Indic/RTL content (e.g., Devanagari + Arabic) requires correct embedding levels.

  2. Script Segmentation

    The engine breaks the text into runs by script. A string mixing Devanagari and Latin characters produces two separate runs, each shaped by the appropriate font and feature set.

  3. Syllable Segmentation

    HarfBuzz identifies syllable boundaries within each Indic run using a script-aware syllable state machine driven by each character's Indic syllabic category. This is where ZWNJ (U+200C) and ZWJ (U+200D) influence cluster formation.

  4. OpenType Feature Application

    HarfBuzz's Indic shaper applies the basic GSUB features one at a time in a fixed order: nukt → akhn → rphf → rkrf → pref → blwf → abvf → half → pstf → vatu → cjct. The presentation features (init, pres, abvs, blws, psts, haln) are then applied together. GPOS mark positioning follows.

  5. Glyph Positioning

    After substitution, the engine runs GPOS to position glyphs. Indic mark positioning (vowel signs, anusvara, chandrabindu) depends on anchor points in the font. Without anchors, marks are drawn at their default offsets instead of being fitted to the base, producing collisions.
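The script segmentation step in the pipeline above can be approximated in a few lines. This is a deliberately simplified sketch, not how HarfBuzz or any browser implements it: it recognises only Devanagari and basic Latin by code point range, and it attaches neutral characters (spaces, digits, punctuation) to the run they touch. The function names and the script labels are illustrative.

```python
def script_of(ch):
    """Return a rough script label, or None for neutral characters."""
    cp = ord(ch)
    if 0x0900 <= cp <= 0x097F:
        return "Deva"
    if ch.isalpha() and cp < 0x0250:
        return "Latn"
    return None

def script_runs(text):
    """Split text into (script, substring) runs; neutrals join a neighbour."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and (script is None or runs[-1][0] in (None, script)):
            prev_script, prev_text = runs[-1]
            runs[-1] = (prev_script or script, prev_text + ch)
        else:
            runs.append((script, ch))
    return runs

for script, chunk in script_runs("Hind \u0928\u092e\u0938\u094d\u0924\u0947 ok"):
    print(script, repr(chunk))
```

Each resulting run would then be handed to the shaper with the font and feature set that match its script.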

Chrome vs. Firefox vs. Safari — Known Differences

Chrome (HarfBuzz)

  • Full USE (Universal Shaping Engine) support
  • Best coverage of Indic script variations
  • COLRv1 color font support
  • Generally the reference implementation

Firefox (HarfBuzz)

  • Also uses HarfBuzz, though the bundled version can differ from Chrome's
  • Occasional differences in font fallback selection
  • Strong Indic rendering; matches Chrome on most conjuncts
  • Some mark positioning edge cases differ

Safari (CoreText)

  • CoreText implements the same spec with its own code
  • Relies more heavily on system-level Indic fonts
  • Some web font substitutions behave differently
  • Test carefully on macOS and iOS

Cross-Script Vertical Metrics and Line Height

Mixing Indic and Latin text in a single paragraph creates a vertical metrics mismatch that often produces uncomfortable or broken line spacing. Devanagari text has visible above-base marks (anusvara, chandrabindu, reph) that extend well above the cap height of Latin letters. Bengali and Malayalam have even taller above-base structures. If your line-height is set for Latin text, the taller Indic glyphs will clip or collide with the line above.

Line Height Recommendations by Script

| Context | Recommended line-height | Reason |
| --- | --- | --- |
| Latin only | 1.5 | Standard comfortable reading |
| Devanagari only | 1.7–1.8 | Above-base vowel signs and reph need vertical clearance |
| Bengali / Odia | 1.8–2.0 | Tall above-base mark structures |
| Tamil | 1.6–1.7 | Moderate above-base marks |
| Mixed Latin + Devanagari | 1.8 | Must accommodate the tallest script in the line |

/* Strategy 1: Set line-height per language */
:lang(hi), :lang(mr), :lang(ne) {
  line-height: 1.8;
}

:lang(bn), :lang(as) {
  line-height: 2.0;
}

:lang(ta) {
  line-height: 1.7;
}

/* Strategy 2: Use a generous global line-height for mixed pages */
.mixed-script-content {
  line-height: 1.8;
}

/* Strategy 3: Override OS/2 font metrics with CSS descriptors */
/* Use this when the font's built-in metrics cause clipping */
@font-face {
  font-family: 'HindAdjusted';
  src: url('/fonts/hind-devanagari-400.woff2') format('woff2');
  unicode-range: U+0900-097F;
  /* Override ascender to prevent clipping of above-base marks */
  ascent-override: 105%;
  descent-override: 30%;
  line-gap-override: 0%;
}

The CSS ascent-override, descent-override, and line-gap-override descriptors let you adjust the vertical metrics the browser uses for line box calculation without modifying the font file itself. They are supported in Chromium-based browsers and Firefox; check current Safari support before relying on them. They are particularly useful for fallback fonts in Indic font stacks where the fallback font's metrics cause layout shifts compared to your primary font.
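Override percentages are usually derived from a font's own metrics, since each descriptor is expressed as a percentage of the font size. The sketch below converts values given in font units into descriptor strings. The sample numbers are hypothetical and chosen to reproduce the 105% / 30% / 0% values used in the CSS above; which metrics table a browser actually reads (hhea or OS/2) varies, so treat the inputs as an assumption.

```python
def override_percent(font_units, units_per_em):
    """Convert a metric in font units to a percentage of the em size."""
    return round(font_units * 100 / units_per_em, 1)

def metric_overrides(ascender, descender, line_gap, units_per_em):
    """Build CSS descriptor values; descender is given as a negative number."""
    return {
        "ascent-override": f"{override_percent(ascender, units_per_em)}%",
        "descent-override": f"{override_percent(abs(descender), units_per_em)}%",
        "line-gap-override": f"{override_percent(line_gap, units_per_em)}%",
    }

# Hypothetical metrics for a 1000-unit em.
for name, value in metric_overrides(1050, -300, 0, 1000).items():
    print(f"{name}: {value};")
```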

Ready to Subset Your Indic Font?

Our Font Subsetter preserves all OpenType layout tables when subsetting Indic fonts — the safe approach that text-based subsetting breaks.
