Character Encoding Guide

A web browser must know which character set (character encoding) a webpage uses in order to render and display it correctly. The encoding can be declared with a meta tag in the head section of the page:


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

In this example, the page uses the UTF-8 character encoding, which is the recommended encoding for HTML5 documents.
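
To see why the declared charset matters, here is a minimal Python sketch of what happens when the same bytes are decoded with the right and the wrong charset. The sample word is an assumption chosen for illustration, and the code only imitates the decoding step a browser performs.

```python
# Illustrative sample text (an assumption, not taken from a real page).
text = "café"

# The bytes a server would send if the page is saved as UTF-8.
raw = text.encode("utf-8")

# Decoding with the charset the page actually uses recovers the text.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes with a different charset gives garbled output.
as_cp1252 = raw.decode("windows-1252")

print(as_utf8)
print(as_cp1252)
```

The second line of output shows the accented letter replaced by two unrelated characters, which is the kind of garbling a reader sees when the charset is missing or wrong.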

Note: All HTML 4 processors support UTF-8, and all HTML5 and XML processors support both UTF-8 and UTF-16.

Here is a partial list of the charsets most commonly used by websites in various languages.

Note: UTF-8 can be used for all languages and is the recommended charset on the Internet.

Charset               Language
charset=big5          Chinese Traditional (Big5)
charset=euc-kr        Korean (EUC)
charset=iso-8859-1    Western Alphabet (ISO)
charset=iso-8859-2    Central European Alphabet (ISO)
charset=iso-8859-3    Latin 3 Alphabet (ISO)
charset=iso-8859-4    Baltic Alphabet (ISO)
charset=iso-8859-5    Cyrillic Alphabet (ISO)
charset=iso-8859-6    Arabic Alphabet (ISO)
charset=iso-8859-7    Greek Alphabet (ISO)
charset=iso-8859-8    Hebrew Alphabet (ISO)
charset=koi8-r        Cyrillic Alphabet (KOI8-R)
charset=shift_jis     Japanese (Shift-JIS)
charset=euc-jp        Japanese (EUC)
charset=utf-8         Universal Alphabet (UTF-8)
charset=windows-1250  Central European Alphabet (Windows)
charset=windows-1251  Cyrillic Alphabet (Windows)
charset=windows-1252  Western Alphabet (Windows)
charset=windows-1253  Greek Alphabet (Windows)
charset=windows-1254  Turkish Alphabet (Windows)
charset=windows-1255  Hebrew Alphabet (Windows)
charset=windows-1256  Arabic Alphabet (Windows)
charset=windows-1257  Baltic Alphabet (Windows)
charset=windows-1258  Vietnamese Alphabet (Windows)
charset=windows-874   Thai (Windows)
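
The difference between the language-specific charsets in the list and UTF-8 can be checked with a short Python sketch. The sample strings and the helper name can_encode are assumptions made for illustration. Python's codec registry is also not identical to the label list a browser accepts, so treat this as a rough check rather than a browser test.

```python
import codecs

# A few charset labels from the list, paired with illustrative sample text.
samples = {
    "iso-8859-7": "Ελληνικά",    # Greek
    "windows-1251": "Русский",   # Cyrillic
    "euc-kr": "한국어",           # Korean
    "big5": "中文",               # Chinese Traditional
}

def can_encode(text, charset):
    """Return True if every character of text exists in charset (hypothetical helper)."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

for label, text in samples.items():
    # codecs.lookup raises LookupError if Python does not know the label.
    codec_name = codecs.lookup(label).name
    print(label, codec_name, can_encode(text, label), can_encode(text, "utf-8"))

# A legacy charset only covers its own script: Cyrillic text does not fit in the Greek charset.
print(can_encode("Русский", "iso-8859-7"))
```

Each legacy charset can encode the sample written in its own script, while UTF-8 can encode all of them. This is why a single UTF-8 declaration is enough for a multilingual site.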