- Mar 2024
-
chiselapp.com
-
$@ charCode
The character ($) is used to denote a character literal in Pharo, and (@) is the specific character that follows the ($) sign. charCode is a message that can be sent to a character to obtain its numeric Unicode code point value.
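For comparison outside Pharo, the same lookup can be sketched in Python, where the built-in `ord` plays the role the note attributes to charCode. The variable names are illustrative only.

```python
# '@' is the character written as $@ in Pharo.
at_sign = "@"

# ord() returns the numeric Unicode code point of a one-character string.
code_point = ord(at_sign)

# chr() goes the other way, from code point back to character.
round_trip = chr(code_point)

print(code_point, hex(code_point), round_trip)
```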
-
- Jan 2024
-
en.wikipedia.org
-
U+003E > 62 076 Greater-than sign
\u003c
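The columns of the table row above (code point, glyph, decimal, octal) can be recomputed with a short Python sketch. It also shows that the escape in the note, \u003c, is the neighbouring less-than sign rather than the greater-than sign from the row. The names `gt`, `lt`, and `row` are illustrative.

```python
gt = "\u003e"  # GREATER-THAN SIGN, the character in the quoted row
lt = "\u003c"  # LESS-THAN SIGN, the escape written in the note

# Rebuild the row: U+ notation, glyph, decimal value, three-digit octal value.
row = (f"U+{ord(gt):04X}", gt, ord(gt), f"{ord(gt):03o}")

print(row)
print(lt, ord(lt))
```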
-
- Dec 2023
-
cuis-smalltalk.github.io
-
In Cuis-Smalltalk, strings with characters not part of the ASCII table are usually instances of UnicodeString. In the same way, you may get an instance of UnicodeSymbol and not Symbol, or UnicodeCodePoint and not Character. You usually don’t need to care about this. The ASCII and Unicode classes provide the same services.
See Unicode Support in Cuis Smalltalk and its presentation at Smalltalks 2022. See also the 2023-04-05 Cuis Smalltalk Meeting
-
- Aug 2023
-
www.unicode.org
-
- Apr 2023
-
www.unicode.org
- Mar 2023
-
github.com
-
user = User.new(password: "あ" * 25) # 25 characters, 75 bytes
characters vs. bytes
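The character and byte counts in the Ruby comment can be checked with a small Python sketch. It assumes the bytes are counted in UTF-8, which is what the "75 bytes" figure implies (three bytes per "あ").

```python
password = "あ" * 25

# Length in characters (code points).
char_count = len(password)

# Length in bytes once encoded as UTF-8 (assumed encoding).
byte_count = len(password.encode("utf-8"))

print(char_count, byte_count)
```

A length limit expressed in bytes therefore rejects this string much earlier than a limit expressed in characters would.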
-
- Dec 2022
-
www.zhihu.com
-
What is the difference between Unicode and UTF-8?
-
- Nov 2022
-
developer.mozilla.org
-
The btoa() function takes a JavaScript string as a parameter. In JavaScript strings are represented using the UTF-16 character encoding: in this encoding, strings are represented as a sequence of 16-bit (2 byte) units. Every ASCII character fits into the first byte of one of these units, but many other characters don't. Base64, by design, expects binary data as its input. In terms of JavaScript strings, this means strings in which each character occupies only one byte. So if you pass a string into btoa() containing characters that occupy more than one byte, you will get an error, because this is not considered binary data:
-
If you need to encode Unicode text as ASCII using btoa(), one option is to convert the string such that each 16-bit unit occupies only one byte.
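The constraint described above can be imitated in Python as a rough sketch. `btoa_like` is a hypothetical helper, not a real API: it accepts only characters that fit in one byte, as the passage says btoa() does. `encode_unicode` shows one common workaround, encoding the text as UTF-8 bytes first. That is a different technique from the 16-bit-unit conversion the quoted passage mentions.

```python
import base64


def btoa_like(text: str) -> str:
    """Hypothetical stand-in for btoa(): every character must fit in one byte."""
    # Latin-1 maps code points 0..255 to single bytes and rejects anything higher.
    raw = text.encode("latin-1")
    return base64.b64encode(raw).decode("ascii")


def encode_unicode(text: str) -> str:
    """Workaround sketch: turn the text into UTF-8 bytes before Base64 encoding."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


print(btoa_like("hello"))
print(encode_unicode("あ"))
```

Calling `btoa_like("あ")` raises `UnicodeEncodeError`, which mirrors the error the passage describes for characters that occupy more than one byte.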
-
-
en.wikipedia.org
-
Thus the replacement character is now only seen for encoding errors, such as invalid UTF-8.
-
At one time the replacement character was often used when there was no glyph available in a font for that character. However, most modern text rendering systems instead use a font's .notdef character, which in most cases is an empty box (or "?" or "X" in a box[5]), sometimes called a "tofu" (this browser displays ). There is no Unicode code point for this symbol.
-
The replacement character � (often displayed as a black rhombus with a white question mark) is a symbol found in the Unicode standard at code point U+FFFD in the Specials table. It is used to indicate problems when a system is unable to render a stream of data to a correct symbol.[4] It is usually seen when the data is invalid and does not match any character:
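The "invalid UTF-8" case can be reproduced with a minimal Python sketch. The byte string is made up for illustration; the byte 0xFF never occurs in well-formed UTF-8, so a lenient decoder substitutes U+FFFD for it.

```python
import unicodedata

# 0xFF is not valid anywhere in UTF-8, so this byte string is malformed.
invalid = b"abc\xffdef"

# errors="replace" asks the decoder to emit U+FFFD instead of raising.
decoded = invalid.decode("utf-8", errors="replace")

print(decoded)
print(unicodedata.name("\ufffd"))
```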
-
-
stackoverflow.com
-
By the way, I am not talking about � (replacement character). This one is displayed when a Unicode character could not be correctly decoded from a data stream. It does not necessarily produce the same glyph:
-
replacement glyph
-
U+25A1 □ WHITE SQUARE may be used to represent a missing ideograph
apparently distinct from: Unicode replacement character (U+FFFD)
-
-
www.w3.org
-
The character exists in Unicode/ISO 10646, but not in the character encoding used for the document. In this case, use Numeric Character References (NCRs, example: 噸).
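In the quoted sentence the example NCR shows up as the character it stands for. A short Python sketch, using only the standard library, shows how such a reference can be derived from a character and resolved again. The variable names are illustrative.

```python
import html

ch = "噸"  # the character shown in the quoted example

# Hexadecimal NCR built from the character's code point.
hex_ncr = f"&#x{ord(ch):X};"

# Python's "xmlcharrefreplace" error handler produces a decimal NCR
# for any character the target encoding (here ASCII) cannot represent.
dec_ncr = ch.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(hex_ncr, dec_ncr)
print(html.unescape(hex_ncr), html.unescape(dec_ncr))
```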
-
-
stackoverflow.com
-
However after doing a bit of testing I see that this character is not used to represent missing glyphs on either my Windows 7 computer or the Android phone I've tested with (Motorola Atrix).
-
The Unicode replacement character sounds promising when reading about it on Wikipedia: It is used to indicate problems when a system is not able to render a stream of data to a correct symbol. It is most commonly seen when a font does not contain a character, but is also seen when the data is invalid and does not match any character
-
-
apple.stackexchange.com
-
All glyphs are Unicode glyphs!
-
- Oct 2022
-
www.unicode.org
-
Of course, if super-intelligent Aliens will arrive on our planet, bearing a writing system with billions characters, I will withdraw this proposal and donate the name "UTF-64" to the Unicode Consortium.
-
- Aug 2022
-
www.runoob.com
-
Unicode was developed on the basis of the Universal Character Set (UCS) standard and is also published in book form[1].
UTF-8 is one of the encoding schemes for the Unicode character set.
-
- Apr 2022
-
-
css ul { list-style-type: none; } ul li:before { content:"\2713"; }
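The CSS escape "\2713" is a hexadecimal code point. As a quick check, this Python sketch resolves the same number to a character and looks up its Unicode name.

```python
import unicodedata

# CSS "\2713" means the character with hexadecimal code point 2713.
marker = chr(0x2713)
marker_name = unicodedata.name(marker)

print(marker, marker_name)
```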
-
- Mar 2022
-
code.visualstudio.com
-
ambiguous and invisible Unicode characters
'е' != 'e'
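The two look-alike letters in the note can be told apart by code point and name. The Python sketch below writes the first one as an escape so the difference survives copy and paste. It assumes the first letter is Cyrillic U+0435, which is the usual example of this confusion.

```python
import unicodedata

cyrillic = "\u0435"  # looks like "e" but belongs to the Cyrillic script
latin = "e"          # U+0065

are_equal = cyrillic == latin
names = [unicodedata.name(c) for c in (cyrillic, latin)]

print(are_equal)
print(names)
```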
-
- Dec 2021
-
shkspr.mobi
-
Here are the single characters which can be normalised down to a valid TLD. They're mostly country codes, but there are a few interesting exceptions:
- ㏕ - US Military
- ℡ - .tel registry
- № - Norway
- ㍳ - Australia
- ㍷ - Dominica
- ㎀ - Panama
- ㎁ - Namibia
- ㎃ - Morocco
- ㎊ - French Polynesia
- ㎋ - Norfolk Island
- ㎏ - Kyrgyzstan
- ㎖ - Mali
- ㎙ - Federated States of Micronesia
- ﬁ - Finland
- ㎜ - Myanmar
- ㎝ - Cameroon
- ㎞ & ㏎ - Comoros
- ㎰ - Palestine
- ㎳ - Montserrat
- ㎷ & ㎹ - Republic of Maldives
- ㎺ - Palau
- ㎽ & ㎿ - Malawi
- ㏄ - Cocos (Keeling) Islands
- ㏅ - Democratic Republic of Congo
- ㏉ - Guyana
- ㏗ - Philippines
- ㏘ - Saint Pierre and Miquelon
- ㏚ - Puerto Rico
- ㏛ - Suriname
- ㏜ - El Salvador
- ℠ - San Marino
- ™ - Turkmenistan
- ﬆ & ﬅ - São Tomé and Príncipe
- ㎇ - Great Britain (Obsolete)
- ß - South Sudan (Not available)
- ㏌ - India and Indiana (subdomain of .us)
- Ⅵ & ⅵ - Virgin Islands and Virginia (subdomain of .us)
- ﬂ - Florida (subdomain of .us)
- ㎚ - New Mexico (subdomain of .us)
- ㎵ - Nevada (subdomain of .us)
- ㍵ - As part of .ovh
-
Nestling among the "Letterlike Symbols" are two curious entries. Both of these are single characters:
- Telephone symbol - ℡
- Numero Sign - №
What's interesting is both .tel and .no are Top-Level-Domains (TLD) on the Domain Name System (DNS).
So my contact site - https://edent.tel/ - can be written as - https://edent.℡/
And the Norwegian domain name registry NORID can be accessed at https://www.norid.№/
Copy and paste those links - they work in any browser!
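The "normalised down" step can be approximated in Python with compatibility normalization (NFKC) followed by lowercasing. This is a sketch under that assumption: the exact processing browsers apply to domain names involves more rules than this, and `to_tld` is a hypothetical helper name.

```python
import unicodedata


def to_tld(symbol: str) -> str:
    """Approximate the mapping: NFKC compatibility normalization, then lowercase."""
    return unicodedata.normalize("NFKC", symbol).lower()


for symbol in ("℡", "№", "™", "℠"):
    print(symbol, "->", to_tld(symbol))

# The same mapping applied to a whole host name from the quoted text.
host = to_tld("edent.℡")
print(host)
```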
-
- Jun 2021
-
en.wikipedia.org
-
Through a linkpin called "Property Value Alias", Unicode has made a 1:1 connection between a script defined, and its ISO 15924 standard.
-
- Apr 2021
-
en.wikipedia.org
-
The use of U+212B 'Angstrom sign', which was encoded due to round-trip mapping compatibility with an East-Asian character encoding, is discouraged, and the preferred representation is U+00C5 'capital letter A with ring above', which has the same glyph.
Is there a difference in semantic meaning between the two? And if so, what is it? 
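One way to probe the question is to look at how Unicode normalization treats the two code points. The Python sketch below suggests they are canonically equivalent: normalizing the Angstrom sign yields the letter, so the standard treats them as the same text and the difference is historical rather than semantic.

```python
import unicodedata

angstrom_sign = "\u212b"   # ANGSTROM SIGN
a_with_ring = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE

raw_equal = angstrom_sign == a_with_ring

# NFC maps the Angstrom sign onto the preferred letter.
nfc_of_sign = unicodedata.normalize("NFC", angstrom_sign)

# NFD decomposes both to "A" followed by COMBINING RING ABOVE.
nfd_equal = (
    unicodedata.normalize("NFD", angstrom_sign)
    == unicodedata.normalize("NFD", a_with_ring)
)

print(raw_equal, nfc_of_sign == a_with_ring, nfd_equal)
```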
-
-
en.wikipedia.org
- Feb 2021
-
copyheart.org
-
But the circle on its own doesn’t seem to be available as a nonspacing diacritic in Unicode. Bugger.
-
- Sep 2020
-
developer.mozilla.org
-
The value of dotAll is a Boolean and true if the "s" flag was used; otherwise, false. The "s" flag indicates that the dot special character (".") should additionally match the following line terminator ("newline") characters in a string, which it would not match otherwise:
- U+000A LINE FEED (LF) ("\n")
- U+000D CARRIAGE RETURN (CR) ("\r")
- U+2028 LINE SEPARATOR
- U+2029 PARAGRAPH SEPARATOR
This effectively means the dot will match any character on the Unicode Basic Multilingual Plane (BMP). To allow it to match astral characters, the "u" (unicode) flag should be used. Using both flags in conjunction allows the dot to match any Unicode character, without exceptions.
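The reason astral characters need the "u" flag is that they take two 16-bit code units in UTF-16, while BMP characters take one. The sketch below only counts code units in Python. It does not reproduce JavaScript regular expression behaviour, and `utf16_units` is a hypothetical helper.

```python
def utf16_units(text: str) -> int:
    """Count the 16-bit code units needed to store text in UTF-16."""
    # Each UTF-16 code unit is two bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


line_separator = "\u2028"   # on the BMP
astral = "\U0001F600"       # outside the BMP

bmp_units = utf16_units(line_separator)
astral_units = utf16_units(astral)

print(bmp_units, astral_units)
```

A pattern that works one code unit at a time sees the astral character as two separate items, which is the situation the "u" flag addresses.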
-
-
- Jun 2020
-
codepoints.net
-
UTF-8: EF BF BC
UTF-16: FF FC
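The quoted byte sequences correspond to U+FFFC, which can be confirmed by encoding that code point in Python. The sketch assumes big-endian byte order for the UTF-16 form, since that matches the "FF FC" shown.

```python
import unicodedata

ch = "\ufffc"

# bytes.hex(" ") (Python 3.8+) renders the bytes separated by spaces.
utf8_hex = ch.encode("utf-8").hex(" ").upper()
utf16_hex = ch.encode("utf-16-be").hex(" ").upper()

print(unicodedata.name(ch))
print("UTF-8:", utf8_hex)
print("UTF-16:", utf16_hex)
```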
-
- Feb 2020
-
www.w3.org
- Oct 2018
- Sep 2018
-
www.discoversdk.com
-
- Sep 2015
-
www.unicode.org
-
GMail
Gmail now uses the same set of emojis as other Google properties (Android, Hangouts) http://gmailblog.blogspot.com/2015/06/express-yourself-in-email-hundreds-more.html
-
- Apr 2015
-
-
This part of the Character Model for the World Wide Web covers string matching—the process by which a specification or implementation defines whether two string values are the same or different from one another.
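Whether two strings count as "the same" depends on the comparison chosen. The Python sketch below contrasts a raw code point comparison with a normalized one and a caseless one. The sample strings are illustrative and are not taken from the specification.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Raw comparison: different code point sequences, so not equal.
raw_equal = composed == decomposed

# Comparison after normalizing both sides to NFC.
nfc_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# Caseless comparison using Unicode case folding.
caseless_equal = "Straße".casefold() == "STRASSE".casefold()

print(raw_equal, nfc_equal, caseless_equal)
```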
-