Languages¶
TeXSmith speaks more than Markdown—it speaks your language. Noto fonts ship in, so LATEX stops tripping over glyphs while browsers casually fall back. Most scripts just work out of the box; typographic nuances for highly specialized scripts (Arabic, Japanese, etc.) can be layered in if or when you need them.
Below, we render a dialect sampler through the article template and lay two PDF pages side-by-side (layout: 2x1). Click to fetch the PDF.
Source:
---
press:
subtitle: Demo on fonts fallback support
columns: 2
paragraph:
indent: 0
spacing: 0.5em
slots:
abstract: Abstract
---
# Foreign Script Support
## Abstract
TeXSmith can render documents in multiple languages and scripts, including those with complex typesetting requirements. Below are examples of various
languages and scripts supported by TeXSmith, showcasing its versatility in
handling diverse linguistic content.
## Japanese (日本語)
Japanese is the national language of Japan and is spoken by over 125 million people, primarily within the country.
月日は百代の過客にして、行きかふ年も旅人なり。
舟に乗り、馬に乗りて、初秋の風に吹かれつつ、
道のべの小草を分け、山川を越えて。
## Korean (한국어)
Korean is spoken by about 80 million people worldwide and is the official language of both South Korea and North Korea.
이 몸이 죽고 죽어 일백 번 고쳐 죽어
백골이 진토되어 넋이라도 있고 없고
임 향한 일편단심이야 가실 줄이 있으랴
## Syriac (ܠܫܢܐ ܣܘܪܝܝܐ)
Syriac is an ancient Aramaic language historically used in the Middle East, especially in Christian liturgical and literary traditions.
ܛܘܒܝܗܘܢ ܠܡܣܟܢ̈ܐ ܒܪܘܚܐ، ܕܕܝܠܗܘܢ ܗܝ ܡܠܟܘܬܐ ܕܫܡܝ̈ܐ. ܛܘܒܝܗܘܢ ܠܐܒܝ̈ܠܐ، ܕܗܢܘܢ ܢܬܒܝܐܘܢ. ܛܘܒܝܗܘܢ ܠܡܟܝ̈ܟܐ، ܕܗܢܘܢ ܢܪܬܘܢ ܠܐܪܥܐ. ܛܘܒܝܗܘܢ ܠܐܝܠܝܢ ܕܟܦܢܝܢ ܘܨܗܝܢ ܠܟܐܢܘܬܐ، ܕܗܢܘܢ ܢܣܒܥܘܢ.
## Greek (ελληνικά)
Greek is the official language of Greece and Cyprus and is spoken by roughly 13 million people, with a literary tradition dating back more than three millennia.
Άντρα, Μούσα, πες μου τον πολυμήχανο, που πάρα πολλά
ταλαιπωρήθηκε, αφού κατέστρεψε την ιερή πολιτεία της Τροίας·
και είδε τις πόλεις πολλών ανθρώπων και γνώρισε τον τρόπο τους,
και πολλές λύπες έπαθε στη θάλασσα μέσα στην καρδιά του,
προσπαθώντας να σώσει και τη δική του ζωή
και την επιστροφή των συντρόφων του.
## Chinese Trad. (繁體中文)
Traditional Chinese characters are used mainly in Taiwan, Hong Kong, and Macau by tens of millions of speakers of Mandarin, Cantonese, and other Sinitic languages.
少无适俗韵,性本爱丘山。
误落尘网中,一去三十年。
羁鸟恋旧林,池鱼思故渊。
开荒南野际,守拙归园田。
久在樊笼里,复得返自然。
## Arabic
Arabic is spoken by over 400 million people across the Middle East and North Africa and serves as the liturgical language of Islam.
قِفا نَبْكِ مِنْ ذِكرَى حبيبٍ ومَنزِلِ
بِسِقطِ اللِّوَى بَيْنَ الدَّخول فَحَوْمَلِ
فَتُوضِحَ فَالمِقراةِ لَم يَعفُ رَسمُها
لِما نَسَجَتها مِن جَنُوبٍ وشَمألِ
تَرَى بَعَرَ الأرْآمِ في عَرَصاتِها
وَقِيْعانِها كَأنَّهُ حَبُّ فُلْفُلِ
## Devanagari (देवनागरी)
The Devanagari script is used for several major South Asian languages, including Hindi, Marathi, and Sanskrit, representing over 600 million speakers combined.
न जायते म्रियते वा कदाचिन्
नायं भूत्वा भविता वा न भूयः।
अजो नित्यः शाश्वतोऽयं पुराणो
न हन्यते हन्यमाने शरीरे॥
## Urdu (اردو)
Urdu is spoken by more than 170 million people, mainly in Pakistan and India, and serves as Pakistan's national language.
عشق ایک مِیرُی آفت ہے
دل کو لے کر ڈوب ہی جاتی ہے
درد کی لہر اٹھتی جائے
اشک بھی ساتھ بہہ ہی جاتے ہیں
## Sinhala (සිංහල)
Sinhala is the primary language of Sri Lanka and is spoken by about 17 million native speakers.
සඳ එළිය පුරා රැය නිල් සිලුවෙන් හැඬේලා
සඳහනෙකු මෙන් සිත් තුළ හඬක් වී පාවෙයි।
සිහිනෙන් පාරවල් හරහා ලංවීය මල් සුවඳක්
සරත් රැයේ සුවිසෙලිනි මනමේ තනුවේ රැදී.
## Khmer (ខ្មែរ)
Khmer is the official language of Cambodia and is spoken by more than 16 million people.
យប់ស្ថប់ត្រង់ ព្រៃស្រពន់សូរ ស្ទឹងស្ទូងតាមស្រមោលដើមឈើ
ខ្យល់អូលីល មកពីភ្នំឆ្ងាយ ប៉ះមកដល់ចិត្តអើយឆ្ងាយណា
សូរសត្វបក្សី លេបលាន់តាមសែន ក្លិនផ្កាឈូករាត្រីរំលេច។
## Telugu (తెలుగు)
Telugu is a major Dravidian language of India, spoken by over 80 million people, primarily in the states of Andhra Pradesh and Telangana.
నన్నయ భట్టుని వాక్కులు నదీవేగమువలె ప్రవహించి
పాండవ కథామృతము లోకమును నింపెను।
ధర్మమునకు మార్గదర్శి పాండవుని గాథలతో
తెలుగు భాషకు తొలిమకుటం కీర్తి ప్రసరించెనె.
## Georgian (ქართული)
Georgian is the official language of Georgia, with around 4-5 million speakers, and is the largest member of the Kartvelian language family.
ქართული ენა კავკასიონის მთებს შორის დაბადებული ძველი ხმავია.
ქრონიკებში, ხელნაწერებში და ხალხურ სიმღერებში ეს ენა ხალხის ისტორიას მოჰყვება.
თბილისის ქუჩის საუბრიდან სოფლის სუფრის ტოსტებამდე,
ქართული სიტყვები მეგობრობისა და სტუმართმოყვარეობის სულს ატარებს.
## Armenian (Հայերեն)
Armenian is the official language of Armenia and a key language of the global Armenian diaspora, spoken by roughly 6–7 million people.
Հայերենը միայն պետական լեզու չէ՝ այլ նաեւ հոգեւոր հիշողության տանող ուղի է։
Դարերի ընթացքում գրված տարեգրությունները՝ սաղմոսները եւ ժողովրդական երգերը՝
այս լեզվով են պահպանել ժողովրդի ուրախությունն ու ցավը։
## Russian (русский)
Russian is an East Slavic language spoken by about 258 million people worldwide and is an official language in Russia, Belarus, Kazakhstan, and Kyrgyzstan.
Мой дядя самых честных правил,
когда не в шутку занемог,
он уважать себя заставил
и лучше выдумать не мог.
## Hebrew (עברית)
Hebrew is a Semitic language spoken by over 9 million people worldwide and is the official language of Israel.
הַלֵּילָה אֲנִי חוֹלֵם עַל שָׂפוֹת הָעֵת בָּאוּ בַּחוֹף
קוֹל הַגַּלִּים מְשַׁחֵק עִם הָרֶגַע
וְהַלֵּב כּוֹסֵף לְבֵית כָּל שִׁירַי.
## Thai (ไทย)
Thai is the national language of Thailand and is spoken by more than 60 million people.
สายลมพัดผ่านทุ่งข้าวเขียวขจี
กลิ่นดอกไม้โชยมากับเสียงระฆังวัด
หัวใจคิดถึงรอยยิ้มและคำทักทายอบอุ่นของผู้คน.
## Bengali (বাংলা)
Bengali is spoken by over 250 million people, mainly in Bangladesh and the Indian state of West Bengal.
এই নদী মাঠ আর গাছের ছায়া
মানুষের হাসি আর শোকের গান
সব মিলিয়ে বাংলা ভাষা হৃদয়ে রাখে প্রাণের গল্প।
## Tibetan
བོད་ཡིག་གི་ཚིག་སྨོན་རྒྱལ་ཁབ་གི་སྙིང་ནས་བྱུང་
རླུང་དང་སྣུམ་གྱི་སྒྲ་དང་མཉམ་དུ་སྐད་ཆ་སྣང་བ་བསྐུར།
ལམ་སེང་གི་སྒྲོན་མ་བཞུགས་སྒོམ་དང་མཉམ་བཞག་པའི་ཡིད་ཀྱི་གདུང་འདུག།
## Tamil (தமிழ்)
காற்றின் மணம் நனைந்து வரும்
பாட்டின் சொல் மனதில் நிற்கும்
தமிழ் மண்ணின் நெஞ்சில் வாழும் மரபின் இனிமை.
## Amharic (አማርኛ)
Amharic is a Semitic language spoken mainly in Ethiopia.
አማርኛ በኢትዮጵያ የመንግሥት ቋንቋ እና የብዙ ብሔሮች መገናኛ መንገድ ናት።
መዝሙሮች፣ ታሪካዊ ዘገባዎች እና የዕለት ተዕለት ተወላጅ ቃላት ውስጥ ይህች ቋንቋ የሕይወትን ድምጽ ታትማለች።
## Burmese (မြန်မာ)
Burmese is the official language of Myanmar.
and over 40 million total speakers use it.
မြန်မာဘာသာကို ရှေးအခေတ်ကပင် ဗိမာန်၊
ရာဇဝင်စာအုပ်တွေထဲက ဗျာဒိတ်စာနဲ့အတူ
ဆက်ခံလာကြတယ်။
Build it locally:
texsmith build dialects.md -tarticle --build
Fallbacks¶
Font fallback is… complicated—almost passionately so. In a browser, the mechanism is straightforward: if a glyph is missing in the first font, the engine walks the font stack and tries the next one, and the next one, until it eventually finds something that contains the character. For obscure Unicode blocks or emoji, Chrome and other modern browsers ultimately fall back to Google’s Noto super-family, which covers most of the Unicode universe.
LATEX, on the other hand, does not provide automatic font fallback in the same way. With classic pdfLaTeX there is no real fallback mechanism at all; if a glyph isn’t in the selected font, you simply get a missing-glyph warning or a tofu box.
LuaLaTeX changes the game: thanks to the luaotfload package, it can perform font fallback, but you must explicitly define fallback fonts in your document or template.
However, for this to work, those fallback fonts need to actually exist on your TEX system. And this is where it gets tricky: installing Noto fonts via TEX Live (tlmgr) or via system packages like apt install fonts-noto is often incomplete, because distributions usually ship only a subset of the full Noto collection. For full Unicode coverage (emoji, rare historical scripts, CJK extensions, etc.), you often need to manually download and install the complete font set from Google’s official Noto Fonts repository.