14 Matching Annotations
  1. Dec 2021
    1. Here are the single characters which can be normalised down to a valid TLD. They're mostly country codes, but there are a few interesting exceptions:

      ㏕ - US Military
      ℡ - .tel registry
      № - Norway
      ㍳ - Australia
      ㍷ - Dominica
      ㎀ - Panama
      ㎁ - Namibia
      ㎃ - Morocco
      ㎊ - French Polynesia
      ㎋ - Norfolk Island
      ㎏ - Kyrgyzstan
      ㎖ - Mali
      ㎙ - Federated States of Micronesia
      ﬁ - Finland
      ㎜ - Myanmar
      ㎝ - Cameroon
      ㎞ & ㏎ - Comoros
      ㎰ - Palestine
      ㎳ - Montserrat
      ㎷ & ㎹ - Republic of Maldives
      ㎺ - Palau
      ㎽ & ㎿ - Malawi
      ㏄ - Cocos (Keeling) Islands
      ㏅ - Democratic Republic of Congo
      ㏉ - Guyana
      ㏗ - Philippines
      ㏘ - Saint Pierre and Miquelon
      ㏚ - Puerto Rico
      ㏛ - Suriname
      ㏜ - El Salvador
      ℠ - San Marino
      ™ - Turkmenistan
      ﬆ & ﬅ - São Tomé and Príncipe
      ㎇ - Great Britain (Obsolete)
      ß - South Sudan (Not available)
      ㏌ - India and Indiana (subdomain of .us)
      Ⅵ & ⅵ - Virgin Islands and Virginia (subdomain of .us)
      ﬂ - Florida (subdomain of .us)
      ㎚ - New Mexico (subdomain of .us)
      ㎵ - Nevada (subdomain of .us)
      ㍵ - As part of .ovh
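
      A minimal sketch of how entries like these could be checked, assuming the folding involved is approximated by Unicode compatibility normalisation (NFKC) followed by lower-casing. Browsers apply the IDNA (UTS #46) mapping, which may differ in edge cases, and the helper name `toTldCandidate` is hypothetical:

      ```javascript
      // Hypothetical helper: fold one character to the ASCII label it may normalise to.
      // Assumption: NFKC + lower-casing approximates the mapping applied to hostnames.
      function toTldCandidate(ch) {
        return ch.normalize("NFKC").toLowerCase();
      }

      // A few characters taken from the list above.
      for (const ch of ["℡", "№", "㎏", "™"]) {
        console.log(`${ch} -> .${toTldCandidate(ch)}`);
      }
      ```

      Whether the resulting label is a delegated TLD still has to be checked separately, for example against the IANA root zone list.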
      
    2. Nestling among the "Letterlike Symbols" are two curious entries. Both of these are single characters:

      • Telephone symbol - ℡
      • Numero Sign - №

      What's interesting is that both .tel and .no are top-level domains (TLDs) in the Domain Name System (DNS).

      So my contact site - https://edent.tel/ - can be written as - https://edent.℡/

      And the Norwegian domain name registry NORID can be accessed at https://www.norid.№/

      Copy and paste those links - they work in any browser!
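
      One way to observe this mapping outside a browser address bar is sketched below. It assumes a JavaScript runtime whose WHATWG `URL` parser applies the IDNA (UTS #46) mapping to hostnames, as current browsers and Node.js are expected to do:

      ```javascript
      // Parse the symbol-bearing URLs and read back the hostname the parser settled on.
      const contact = new URL("https://edent.℡/");
      const registry = new URL("https://www.norid.№/");

      console.log(contact.hostname);
      console.log(registry.hostname);
      ```

      On such a runtime the two hostnames should come back as edent.tel and www.norid.no, which is the same folding the browser performs before it issues the DNS lookup.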

  2. Jun 2021
    1. Through a linchpin called "Property Value Alias", Unicode has made a 1:1 connection between each script it defines and that script's ISO 15924 code.
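
      A small illustration of that connection, assuming a JavaScript engine with Unicode property escapes (ES2018 or later). The `Script` property accepts either the long script name or its short alias, and the short alias is the four-letter ISO 15924 code:

      ```javascript
      // "Greek" is the long script name; "Grek" is its property value alias,
      // which is also the script's ISO 15924 code.
      const byName = /^\p{Script=Greek}+$/u;
      const byCode = /^\p{Script=Grek}+$/u;

      const sample = "αβγ";
      console.log(byName.test(sample), byCode.test(sample)); // true true

      // The same text is not Latin, whichever spelling of the script is used.
      console.log(/^\p{Script=Latn}+$/u.test(sample)); // false
      ```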
  3. Apr 2021
  4. Feb 2021
    1. But the circle on its own doesn’t seem to be available as a nonspacing diacritic in Unicode. Bugger.
  5. Sep 2020
    1. The value of dotAll is a Boolean and true if the "s" flag was used; otherwise, false. The "s" flag indicates that the dot special character (".") should additionally match the following line terminator ("newline") characters in a string, which it would not match otherwise:

      • U+000A LINE FEED (LF) ("\n")
      • U+000D CARRIAGE RETURN (CR) ("\r")
      • U+2028 LINE SEPARATOR
      • U+2029 PARAGRAPH SEPARATOR

      This effectively means the dot will match any character on the Unicode Basic Multilingual Plane (BMP). To allow it to match astral characters, the "u" (unicode) flag should be used. Using both flags in conjunction allows the dot to match any Unicode character, without exceptions.
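
      The behaviour described here can be checked with a short sketch, assuming an engine that supports the "s" flag (ES2018 or later). The variable names are illustrative only:

      ```javascript
      const newline = "a\nb";
      const astral = "\u{1F4A9}"; // one code point outside the BMP (two UTF-16 code units)

      const plain = /^a.b$/;
      const withS = /^a.b$/s;

      console.log(plain.dotAll, plain.test(newline)); // false false
      console.log(withS.dotAll, withS.test(newline)); // true true

      // Without "u" the dot consumes a single UTF-16 code unit,
      // so one dot cannot span the surrogate pair.
      console.log(/^.$/s.test(astral));  // false
      console.log(/^.$/su.test(astral)); // true
      ```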
  6. Jun 2020
  7. Feb 2020
  8. Oct 2018
  9. Sep 2018
  10. Sep 2015
  11. Apr 2015
    1. This part of the Character Model for the World Wide Web covers string matching—the process by which a specification or implementation defines whether two string values are the same or different from one another.
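
      A minimal sketch of why such a definition is needed, using only canonical normalisation (NFC). The specification covers further concerns, such as case folding, that are not shown here, and the helper name `sameAfterNfc` is hypothetical:

      ```javascript
      const composed = "\u00e9";    // é as a single code point
      const decomposed = "e\u0301"; // e followed by COMBINING ACUTE ACCENT

      // Hypothetical helper: compare two strings after NFC normalisation.
      function sameAfterNfc(a, b) {
        return a.normalize("NFC") === b.normalize("NFC");
      }

      console.log(composed === decomposed);            // false
      console.log(sameAfterNfc(composed, decomposed)); // true
      ```

      The two strings render identically, yet a code-point-by-code-point comparison treats them as different until both are normalised to the same form.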