A character encoding is a system that assigns a unique numeric value, known as a code point, to each character in a writing script. This allows character data to be stored, transmitted, and processed by computers. Beyond natural language symbols, character sets can also include control characters and whitespace.
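The mapping between characters and code points can be illustrated with a minimal Python sketch. It assumes Unicode as the character set; the names `samples` and `code_points` are illustrative and not part of any standard.

```python
# Minimal sketch: look up the Unicode code point of a few characters.
# `samples` and `code_points` are illustrative names chosen for this example.
samples = ["A", "\u00e9", "\u4e2d", "\t"]  # Latin letter, accented letter, CJK ideograph, tab (a control character)

# ord() returns the code point of a one-character string; chr() is its inverse.
code_points = {ch: ord(ch) for ch in samples}

for ch, cp in code_points.items():
    # ascii() keeps the output printable even on consoles that are not UTF-8.
    print(f"{ascii(ch)} -> U+{cp:04X} (decimal {cp})")

# Round trip: converting the code point back yields the original character.
assert all(chr(cp) == ch for ch, cp in code_points.items())
```

The tab character shows that control characters and whitespace receive code points in the same way as letters do.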
Common examples of character encoding systems include Morse code, Baudot code, ASCII (American Standard Code for Information Interchange), and Unicode. Unicode, a comprehensive and extensible encoding system, has superseded many older character encodings. UTF-8, a variable-width encoding defined by the Unicode Standard, is the most prevalent character encoding on the World Wide Web, used by nearly 99% of webpages as of 2026. It is designed for backward compatibility with ASCII: every ASCII character is encoded as a single byte with the same binary value it has in ASCII.
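The variable-width property and the ASCII compatibility of UTF-8 can be checked with a short Python sketch. The helper `utf8_length` and the sample characters are assumptions made for illustration only.

```python
# Minimal sketch: UTF-8 is variable-width and byte-compatible with ASCII.
# `utf8_length` is a hypothetical helper written for this example.

def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode the string `ch`."""
    return len(ch.encode("utf-8"))

# ASCII text produces identical bytes under both encodings.
ascii_text = "Hello"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Characters outside ASCII take two, three, or four bytes.
widths = {
    ch: utf8_length(ch)
    for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]
}

for ch, n in widths.items():
    # Print the code point rather than the glyph so the output is console-safe.
    print(f"U+{ord(ch):04X} -> {n} byte(s) in UTF-8")
```

Because ASCII bytes keep their values, a file containing only ASCII characters is also a valid UTF-8 file, byte for byte.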
Tried using the Scala REPL on Windows, Chinese words now showing …