Windows-1255
MIME / IANA | windows-1255 |
---|---|
Alias(es) | cp1255 (Code page 1255) |
Language(s) | Hebrew, English |
Created by | Microsoft |
Standard | WHATWG Encoding Standard |
Classification | extended ASCII, Windows-125x |
Other related encoding(s) | ISO-8859-8 |
Windows-1255 is a code page used under Microsoft Windows to write Hebrew. It is an almost compatible superset of ISO 8859-8: most symbols occupy the same positions, with two exceptions. Byte A4 is the sheqel sign in Windows-1255 but the generic currency sign in ISO 8859-8, and byte DF is undefined in Windows-1255 but the double low line in ISO 8859-8. Windows-1255 also adds vowel points and other signs in lower positions.
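The two exceptions can be checked with a short Python sketch. It assumes that Python's built-in `cp1255` and `iso8859_8` codecs are acceptable stand-ins for the two encodings; the helper name `decode_byte` is illustrative only.

```python
from typing import Optional


def decode_byte(value: int, encoding: str) -> Optional[str]:
    """Decode a single byte, returning None if the codec leaves it undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


# 0xA4: sheqel sign in Windows-1255, generic currency sign in ISO 8859-8.
a4_windows = decode_byte(0xA4, "cp1255")
a4_iso = decode_byte(0xA4, "iso8859_8")

# 0xDF: undefined in Windows-1255, double low line in ISO 8859-8.
df_windows = decode_byte(0xDF, "cp1255")
df_iso = decode_byte(0xDF, "iso8859_8")

# The Hebrew letters (0xE0 to 0xFA) decode identically under both codecs.
letters_agree = all(
    decode_byte(b, "cp1255") == decode_byte(b, "iso8859_8")
    for b in range(0xE0, 0xFB)
)
```

Returning `None` for an undefined byte, rather than letting the exception propagate, keeps the comparison to a single expression per position.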
IBM uses code page 1255 (CCSID 1255, euro sign extended CCSID 5351, and the further extended CCSID 9447) for Windows-1255.[1][2][3][4]
Modern applications prefer Unicode to Windows-1255, especially on the Internet.[5] In practice this means UTF-8, the dominant encoding for web pages; UTF-16 is also used, though not on the Internet, for security reasons. As of January 2019, Windows-1255 was used by less than 0.1% of websites.[6]
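Moving legacy data to Unicode is usually a decode followed by a re-encode. The following is a minimal sketch, assuming Python's built-in `cp1255` codec and assuming that undefined bytes should be replaced with U+FFFD rather than rejected; the function name `windows1255_to_utf8` is hypothetical.

```python
def windows1255_to_utf8(data: bytes) -> bytes:
    """Re-encode Windows-1255 bytes as UTF-8.

    Assumption: bytes the codec leaves undefined become U+FFFD
    (the replacement character) instead of raising an error.
    """
    text = data.decode("cp1255", errors="replace")
    return text.encode("utf-8")


legacy = b"\xf9\xec\xe5\xed"  # shin, lamed, vav, final mem in Windows-1255
converted = windows1255_to_utf8(legacy)
```

Each Hebrew letter takes one byte in Windows-1255 and two bytes in UTF-8, so the converted data is larger than the input.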
Character set
The following table shows Windows-1255. Each character is shown with its Unicode equivalent.
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
8x | € |  | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ |  | ‹ |  |  |  |  |
9x |  | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ |  | › |  |  |  |  |
Ax | NBSP | ¡ | ¢ | £ | ₪ | ¥ | ¦ | § | ¨ | © | × | « | ¬ | SHY | ® | ¯ |
Bx | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | ÷ | » | ¼ | ½ | ¾ | ¿ |
Cx | ְ | ֱ | ֲ | ֳ | ִ | ֵ | ֶ | ַ | ָ | ֹ | ֺ | ֻ | ּ | ֽ | ־ | ֿ |
Dx | ׀ | ׁ | ׂ | ׃ | װ | ױ | ײ | ׳ | ״ |  |  |  |  |  |  |  |
Ex | א | ב | ג | ד | ה | ו | ז | ח | ט | י | ך | כ | ל | ם | מ | ן |
Fx | נ | ס | ע | ף | פ | ץ | צ | ק | ר | ש | ת |  |  | LRM | RLM |  |
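Positions with no assigned character can be enumerated programmatically. This is a rough sketch that assumes Python's built-in `cp1255` codec; its mapping may differ from the table above at individual positions, so the printed list should be treated as a property of that codec rather than of the code page itself. The function name `undefined_positions` is illustrative.

```python
def undefined_positions(encoding: str = "cp1255") -> list[int]:
    """Return the byte values in 0x80-0xFF that the codec cannot decode."""
    missing = []
    for value in range(0x80, 0x100):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


gaps = undefined_positions()
print(" ".join(f"{value:02X}" for value in gaps))
```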
Usage
Windows-1255 Hebrew is always in logical order (as opposed to visual order). Microsoft's Hebrew products (Windows, Office and Internet Explorer) brought logically ordered Hebrew into common use. As a result, Windows-1255 became the Hebrew code page found most often on the Web: it ousted the visually ordered ISO-8859-8 and was preferred to the logically ordered ISO-8859-8-I because it provides for vowel points.
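The difference between the two orders shows up directly in the stored bytes. The sketch below assumes Python's built-in `cp1255` codec and models visual order by simply reversing a single all-Hebrew word; real visual-order text also has to handle digits, Latin runs and line breaks, which this example ignores.

```python
word = "\u05e9\u05dc\u05d5\u05dd"  # shin, lamed, vav, final mem

# Logical order: bytes follow reading order, so shin (0xF9) is stored first.
logical = word.encode("cp1255")

# Visual order (simplified): bytes follow the left-to-right display order,
# which for one all-Hebrew word is the reverse of the reading order.
visual = logical[::-1]

print(logical.hex(" "))
print(visual.hex(" "))
```

Software that reads logical-order text leaves the reordering to the display layer, which is why the same bytes can be searched and sorted in reading order.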
Relation to Unicode
The Unicode Hebrew block (U+0590–U+05FF) follows Windows-1255: letters and vowel points keep the same relative positions as in Windows-1255. Unicode goes further by encoding cantillation marks in lower positions. Unicode Hebrew is always in logical order.
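The shared relative positions amount to a constant offset between a byte value and its code point. The following sketch, assuming Python's built-in `cp1255` codec, collects that offset for the vowel-point range and the letter range; the names `OFFSET` and `offsets` are illustrative.

```python
OFFSET = 0x05B0 - 0xC0  # distance from byte 0xC0 to code point U+05B0


def offsets(first: int, last: int) -> set[int]:
    """Collect (code point - byte value) for every defined byte in a range."""
    found = set()
    for value in range(first, last + 1):
        try:
            char = bytes([value]).decode("cp1255")
        except UnicodeDecodeError:
            continue  # skip positions the codec leaves undefined
        found.add(ord(char) - value)
    return found


point_offsets = offsets(0xC0, 0xD3)   # vowel points and punctuation marks
letter_offsets = offsets(0xE0, 0xFA)  # alef through tav
```

Both sets contain the single value 0x4F0. The ligatures and geresh marks at 0xD4 to 0xD8 are left out of the ranges because they map to U+05F0 to U+05F4 and so do not share this offset.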
For modern applications, UTF-8 or UTF-16 is the preferred encoding.
See also
- 7-bit Hebrew under ISO 646
- Code page 862
- ISO 8859-8
- LMBCS-3
References
1. "Code page 1255 information document". Archived from the original on 2016-03-04.
2. "CCSID 1255 information document". Archived from the original on 2016-03-27.
3. "CCSID 5351 information document". Archived from the original on 2014-11-29.
4. "CCSID 9447 information document". Archived from the original on 2016-03-26.
5. John, Nicholas A. (2013). "The Construction of the Multilingual Internet: Unicode, Hebrew, and Globalization". Journal of Computer-Mediated Communication. 18 (3): 321–338. doi:10.1111/jcc4.12015. ISSN 1083-6101. Background: the problem of Hebrew and the Internet.
6. "Usage Statistics of Windows-1255 for Websites, January 2019". w3techs.com. Retrieved 2019-01-17.
7. Unicode mapping table for Windows 1255.
8. Unicode mappings of Windows 1255 with "best fit".
9. Code Page CPGID 01255 (PDF), IBM.
10. Code Page CPGID 01255 (txt), IBM.
11. International Components for Unicode (ICU), ibm-1255_P100-1995.ucm, 2002-12-03.
12. International Components for Unicode (ICU), ibm-1251_P100-1995.ucm, 2002-12-03.
13. International Components for Unicode (ICU), ibm-5351_P100-1998.ucm, 2002-12-03.