Annotated Ada Reference Manual (Ada 202y Draft 1)

A.3.5 The Package Wide_Characters.Handling

1/3
{AI05-0185-1} The package Wide_Characters.Handling provides operations for classifying Wide_Characters and case folding for Wide_Characters. 

Static Semantics

2/3
{AI05-0185-1} The library package Wide_Characters.Handling has the following declaration:
3/5
{AI05-0185-1} {AI05-0266-1} {AI12-0414-1} package Ada.Wide_Characters.Handling
   with Pure is
4/3
{AI05-0266-1}    function Character_Set_Version return String;
5/3
   function Is_Control (Item : Wide_Character) return Boolean;
6/3
   function Is_Letter (Item : Wide_Character) return Boolean;
7/3
   function Is_Lower (Item : Wide_Character) return Boolean;
8/3
   function Is_Upper (Item : Wide_Character) return Boolean;
8.1/5
{AI12-0260-1}    function Is_Basic (Item : Wide_Character) return Boolean;
9/3
   function Is_Digit (Item : Wide_Character) return Boolean;
10/3
   function Is_Decimal_Digit (Item : Wide_Character) return Boolean
      renames Is_Digit;
11/3
   function Is_Hexadecimal_Digit (Item : Wide_Character) return Boolean;
12/3
   function Is_Alphanumeric (Item : Wide_Character) return Boolean;
13/3
   function Is_Special (Item : Wide_Character) return Boolean;
14/3
   function Is_Line_Terminator (Item : Wide_Character) return Boolean;
15/3
   function Is_Mark (Item : Wide_Character) return Boolean;
16/3
   function Is_Other_Format (Item : Wide_Character) return Boolean;
17/3
   function Is_Punctuation_Connector (Item : Wide_Character) return Boolean;
18/3
   function Is_Space (Item : Wide_Character) return Boolean;
18.1/5
{AI12-0004-1}    function Is_NFKC (Item : Wide_Character) return Boolean;
19/3
   function Is_Graphic (Item : Wide_Character) return Boolean;
20/3
   function To_Lower (Item : Wide_Character) return Wide_Character;
   function To_Upper (Item : Wide_Character) return Wide_Character;
20.1/5
{AI12-0260-1}    function To_Basic (Item : Wide_Character) return Wide_Character;
21/3
   function To_Lower (Item : Wide_String) return Wide_String;
   function To_Upper (Item : Wide_String) return Wide_String;
21.1/5
{AI12-0260-1}    function To_Basic (Item : Wide_String) return Wide_String;
22/3
end Ada.Wide_Characters.Handling;
23/3
{AI05-0185-1} The subprograms defined in Wide_Characters.Handling are locale independent.
24/3
function Character_Set_Version return String;
25/3
{AI05-0266-1} Returns an implementation-defined identifier that identifies the version of the character set standard that is used for categorizing characters by the implementation.
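As a minimal sketch of how this function can be used, the following program displays the identifier. Because the result is implementation-defined, the example only prints it and does not compare it against any fixed value; the procedure name Show_Character_Set_Version is an assumption made for illustration.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Show_Character_Set_Version is
   --  The identifier is implementation-defined, so it is only displayed
   --  here and never parsed or compared with a fixed string.
   Version : constant String :=
     Ada.Wide_Characters.Handling.Character_Set_Version;
begin
   Ada.Text_IO.Put_Line ("Character set version: " & Version);
end Show_Character_Set_Version;
```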
26/3
function Is_Control (Item : Wide_Character) return Boolean;
27/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as other_control; otherwise returns False.
28/3
function Is_Letter (Item : Wide_Character) return Boolean;
29/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as letter_uppercase, letter_lowercase, letter_titlecase, letter_modifier, letter_other, or number_letter; otherwise returns False.
30/3
function Is_Lower (Item : Wide_Character) return Boolean;
31/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as letter_lowercase; otherwise returns False.
32/3
function Is_Upper (Item : Wide_Character) return Boolean;
33/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as letter_uppercase; otherwise returns False.
33.1/5
function Is_Basic (Item : Wide_Character) return Boolean;
33.2/5
{AI12-0260-1} {AI12-0450-1} Returns True if the Wide_Character designated by Item has no Decomposition Mapping in the code charts of ISO/IEC 10646:2020; otherwise returns False.
33.a/5
Implementation Note: Decomposition Mapping is defined in Clause 33 of ISO/IEC 10646:2020. A machine-readable (and normative!) version of this can be found as Character Decomposition Mapping, which is field 5 of the file http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt (field 5 is the 6th item on each line, as Unicode counts fields from zero). 
34/3
function Is_Digit (Item : Wide_Character) return Boolean;
35/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as number_decimal; otherwise returns False.
36/3
function Is_Hexadecimal_Digit (Item : Wide_Character) return Boolean;
37/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as number_decimal, or is in the range 'A' .. 'F' or 'a' .. 'f'; otherwise returns False.
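The rule above accepts every number_decimal character, not only '0' .. '9', while the letters are limited to the two ASCII ranges. A small sketch, under the assumption that U+0660 (ARABIC-INDIC DIGIT ZERO) is categorized as number_decimal and U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A) is outside 'A' .. 'F'; the procedure and helper names are hypothetical:

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Check_Hexadecimal_Digits is
   use Ada.Wide_Characters.Handling;

   --  Characters chosen for this sketch only.
   Arabic_Indic_Zero : constant Wide_Character :=
     Wide_Character'Val (16#0660#);
   Fullwidth_A       : constant Wide_Character :=
     Wide_Character'Val (16#FF21#);

   --  Hypothetical helper that prints one labelled result.
   procedure Report (Label : String; Item : Wide_Character) is
   begin
      Ada.Text_IO.Put_Line
        (Label & " => " & Boolean'Image (Is_Hexadecimal_Digit (Item)));
   end Report;
begin
   Report ("'7'   ", '7');                --  number_decimal
   Report ("'b'   ", 'b');                --  in 'a' .. 'f'
   Report ("'G'   ", 'G');                --  a letter outside 'A' .. 'F'
   Report ("U+0660", Arabic_Indic_Zero);  --  number_decimal, not ASCII
   Report ("U+FF21", Fullwidth_A);        --  not in 'A' .. 'F'
end Check_Hexadecimal_Digits;
```

Under the rule as stated, the first, second, and fourth calls would be expected to report TRUE and the other two FALSE; the actual results depend on the character set version supported by the implementation.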
38/3
function Is_Alphanumeric (Item : Wide_Character) return Boolean;
39/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as letter_uppercase, letter_lowercase, letter_titlecase, letter_modifier, letter_other, number_letter, or number_decimal; otherwise returns False.
40/3
function Is_Special (Item : Wide_Character) return Boolean;
41/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as graphic_character, but not categorized as letter_uppercase, letter_lowercase, letter_titlecase, letter_modifier, letter_other, number_letter, or number_decimal; otherwise returns False.
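Read together with the definitions of Is_Graphic and Is_Alphanumeric, this rule implies that Is_Special (C) is equivalent to Is_Graphic (C) and not Is_Alphanumeric (C). The following sketch checks that relationship over the first 256 code points; the range and the procedure name are arbitrary choices made for illustration.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Check_Special_Partition is
   use Ada.Wide_Characters.Handling;

   Mismatches : Natural := 0;
begin
   --  Scan code points 0 .. 255 (an arbitrary range for this sketch).
   for Code in 0 .. 16#FF# loop
      declare
         C : constant Wide_Character := Wide_Character'Val (Code);
      begin
         --  Count every character for which the two formulations differ.
         if Is_Special (C) /= (Is_Graphic (C) and not Is_Alphanumeric (C))
         then
            Mismatches := Mismatches + 1;
         end if;
      end;
   end loop;

   Ada.Text_IO.Put_Line ("Mismatches:" & Natural'Image (Mismatches));
end Check_Special_Partition;
```

An implementation that follows the three definitions should report no mismatches.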
42/3
function Is_Line_Terminator (Item : Wide_Character) return Boolean;
43/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as separator_line or separator_paragraph, or if Item is a conventional line terminator character (Line_Feed, Line_Tabulation, Form_Feed, Carriage_Return, Next_Line); otherwise returns False.
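As an illustration of the rule above, the following sketch counts line terminators in a Wide_String that mixes a conventional terminator (Line_Feed), Next_Line, and a separator_line character (U+2028). The procedure name and the sample text are assumptions made for this example.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Count_Line_Terminators is
   use Ada.Wide_Characters.Handling;

   LF  : constant Wide_Character := Wide_Character'Val (16#000A#);
   NEL : constant Wide_Character := Wide_Character'Val (16#0085#);
   LS  : constant Wide_Character := Wide_Character'Val (16#2028#);

   --  Sample text with three different kinds of line terminator.
   Text : constant Wide_String := "one" & LF & "two" & NEL & "three" & LS;

   Count : Natural := 0;
begin
   for C of Text loop
      if Is_Line_Terminator (C) then
         Count := Count + 1;
      end if;
   end loop;

   Ada.Text_IO.Put_Line ("Line terminators found:" & Natural'Image (Count));
end Count_Line_Terminators;
```

Under the rule as stated, all three of LF, NEL, and LS qualify, so the count would be expected to be 3.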
44/3
function Is_Mark (Item : Wide_Character) return Boolean;
45/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as mark_non_spacing or mark_spacing_combining; otherwise returns False.
46/3
function Is_Other_Format (Item : Wide_Character) return Boolean;
47/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as other_format; otherwise returns False.
48/3
function Is_Punctuation_Connector (Item : Wide_Character) return Boolean;
49/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as punctuation_connector; otherwise returns False.
50/3
function Is_Space (Item : Wide_Character) return Boolean;
51/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as separator_space; otherwise returns False.
51.1/5
function Is_NFKC (Item : Wide_Character) return Boolean;
51.2/5
{AI12-0004-1} {AI12-0263-1} {AI12-0439-1} {AI12-0450-1} Returns True if the Wide_Character designated by Item can be present in a string normalized to Normalization Form KC (as defined by Clause 22 of ISO/IEC 10646:2020); otherwise returns False.
51.a/5
Reason: Wide_Characters for which this function returns False are not allowed in identifiers (see 2.3) even if they are categorized as letters or digits. 
51.b/5
Implementation Note: This function returns False if the Unicode property NFKC Quick Check (NFKC_QC in the files) has the value No. See the Implementation Notes in 2.3 for the source of this property. 
51.c/5
Discussion: A string for which Is_NFKC is True for every character is not necessarily in Normalization Form KC, as Is_NFKC also returns True for characters whose removal by normalization depends on the characters around them. Ada does not provide a full normalization operation (it is complex and expensive).
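A per-character check can therefore only rule strings out. The sketch below assumes a compiler that already provides Is_NFKC, and assumes that U+FB01 (LATIN SMALL LIGATURE FI) has NFKC Quick Check value No; May_Be_NFKC is a hypothetical helper, not part of the package.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Check_NFKC_Candidates is
   use Ada.Wide_Characters.Handling;

   --  Character chosen for this sketch: it has a compatibility
   --  decomposition, so it is assumed not to survive NFKC normalization.
   Ligature_Fi : constant Wide_Character := Wide_Character'Val (16#FB01#);

   --  Hypothetical helper. A False result means S cannot be in
   --  Normalization Form KC; a True result is necessary but not
   --  sufficient, as explained in the Discussion above.
   function May_Be_NFKC (S : Wide_String) return Boolean is
     (for all C of S => Is_NFKC (C));
begin
   Ada.Text_IO.Put_Line
     ("plain ASCII : " & Boolean'Image (May_Be_NFKC ("file")));
   Ada.Text_IO.Put_Line
     ("with U+FB01 : " & Boolean'Image (May_Be_NFKC (Ligature_Fi & "le")));
end Check_NFKC_Candidates;
```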
52/3
function Is_Graphic (Item : Wide_Character) return Boolean;
53/3
{AI05-0185-1} Returns True if the Wide_Character designated by Item is categorized as graphic_character; otherwise returns False.
54/3
function To_Lower (Item : Wide_Character) return Wide_Character;
55/5
{AI05-0185-1} {AI05-0266-1} {AI05-0299-1} {AI12-0263-1} {AI12-0450-1} Returns the Simple Lowercase Mapping as defined by documents referenced in Clause 2 of ISO/IEC 10646:2020 of the Wide_Character designated by Item. If the Simple Lowercase Mapping does not exist for the Wide_Character designated by Item, then the value of Item is returned.
55.a/5
Discussion: {AI12-0263-1} {AI12-0450-1} The “documents referenced” means Unicode, Chapter 4 (specifically, section 4.2 — Case). The case mappings come from Unicode as ISO/IEC 10646:2020 does not include complete case mappings. See the Implementation Notes in subclause 1.1.4 for machine-readable versions of both Uppercase and Lowercase mappings. 
56/3
function To_Lower (Item : Wide_String) return Wide_String;
57/3
{AI05-0185-1} Returns the result of applying the To_Lower conversion to each Wide_Character element of the Wide_String designated by Item. The result is the null Wide_String if the value of the formal parameter is the null Wide_String. The lower bound of the result Wide_String is 1.
58/3
function To_Upper (Item : Wide_Character) return Wide_Character;
59/5
{AI05-0185-1} {AI05-0266-1} {AI05-0299-1} {AI12-0263-1} {AI12-0450-1} Returns the Simple Uppercase Mapping as defined by documents referenced in Clause 2 of ISO/IEC 10646:2020 of the Wide_Character designated by Item. If the Simple Uppercase Mapping does not exist for the Wide_Character designated by Item, then the value of Item is returned.
60/3
function To_Upper (Item : Wide_String) return Wide_String;
61/3
{AI05-0185-1} Returns the result of applying the To_Upper conversion to each Wide_Character element of the Wide_String designated by Item. The result is the null Wide_String if the value of the formal parameter is the null Wide_String. The lower bound of the result Wide_String is 1.
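The statement about the lower bound matters when the argument is a slice. In the following sketch (names are illustrative), the slice passed to To_Upper has bounds 5 .. 8, while the result has bounds 1 .. 4.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Show_Result_Bounds is
   use Ada.Wide_Characters.Handling;

   Source : constant Wide_String := "ada wide strings";

   --  Source (5 .. 8) is the slice "wide"; a slice keeps the index
   --  range of the string it is taken from.
   Upper  : constant Wide_String := To_Upper (Source (5 .. 8));
begin
   Ada.Text_IO.Put_Line
     ("Argument bounds:" & Integer'Image (5) & " .." & Integer'Image (8));
   Ada.Text_IO.Put_Line
     ("Result bounds  :" & Integer'Image (Upper'First)
      & " .." & Integer'Image (Upper'Last));
end Show_Result_Bounds;
```

Code that indexes the result should therefore use the result's own 'First and 'Last (or 'Range) rather than reuse the indices of the argument.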
61.1/5
function To_Basic (Item : Wide_Character) return Wide_Character;
61.2/5
{AI12-0260-1} {AI12-0450-1} Returns the Wide_Character whose code point is given by the first value of the Decomposition Mapping of Item in the code charts of ISO/IEC 10646:2020, if any; otherwise returns Item.
61.3/5
function To_Basic (Item : Wide_String) return Wide_String;
61.4/5
{AI12-0260-1} Returns the result of applying the To_Basic conversion to each Wide_Character element of the Wide_String designated by Item. The result is the null Wide_String if the value of the formal parameter is the null Wide_String. The lower bound of the result Wide_String is 1.
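A short sketch of Is_Basic and To_Basic, assuming a compiler that already provides these Ada 2022 routines. It relies on U+00E9 (LATIN SMALL LETTER E WITH ACUTE) having the Decomposition Mapping 0065 0301, whose first value is the code point of 'e'; the procedure name and sample word are illustrative.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Strip_Accent_Sketch is
   use Ada.Wide_Characters.Handling;

   E_Acute : constant Wide_Character := Wide_Character'Val (16#00E9#);

   --  "caf" followed by e with acute accent.
   Word  : constant Wide_String := "caf" & E_Acute;
   Basic : constant Wide_String := To_Basic (Word);
begin
   --  'e' has no Decomposition Mapping; U+00E9 has one.
   Ada.Text_IO.Put_Line
     ("Is_Basic ('e')    : " & Boolean'Image (Is_Basic ('e')));
   Ada.Text_IO.Put_Line
     ("Is_Basic (U+00E9) : " & Boolean'Image (Is_Basic (E_Acute)));

   --  To_Basic is applied element by element, so only the last
   --  character of Word is expected to change.
   Ada.Text_IO.Put_Line
     ("To_Basic (Word) = ""cafe"" : " & Boolean'Image (Basic = "cafe"));
end Strip_Accent_Sketch;
```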

Implementation Advice

62/3
{AI05-0266-1} The string returned by Character_Set_Version should include either “10646:” or “Unicode”.
62.a.1/3
Implementation Advice: The string returned by Wide_Characters.Handling.Character_Set_Version should include either “10646:” or “Unicode”.
62.a/5
Discussion: {AI12-0263-1} {AI12-0450-1} The intent is that the returned string include the year for 10646 (as in "10646:2020") and the version number for Unicode (as in "Unicode 13.0"). We don't try to specify that further, so we don't need to decide how to represent Corrigenda for 10646, nor which of these is preferred. (Giving a Unicode version is more accurate, as the case folding and mapping rules always come from a Unicode version [10646 just tells one to look at Unicode to get those], and the character classifications ought to be the same for equivalent versions; but we don't want to talk about non-ISO standards in an ISO standard.) 
63/5
NOTE 1   {AI05-0266-1} {AI12-0440-1} {AI12-0450-1} The results returned by these functions can depend on which particular version of ISO/IEC 10646 is supported by the implementation (see 2.1).
64/5
NOTE 2   {AI05-0286-1} {AI12-0449-1} The case insensitive equality comparison routines provided in A.4.10 are also available for wide strings (see A.4.7).

Extensions to Ada 2005

64.a/3
{AI05-0185-1} {AI05-0266-1} The package Wide_Characters.Handling is new. 

Incompatibilities With Ada 2012

64.b/5
{AI12-0004-1} {AI12-0260-1} Added the classification routines Is_Basic and Is_NFKC and the conversion routine To_Basic. Therefore, a use clause conflict is possible; see the introduction of Annex A for more on this topic. 

Ada 2005 and 2012 Editions sponsored in part by Ada-Europe