@c characters.texinfo
@node Characters
@section Characters

Characters are objects that represent printed characters such as
letters and digits.  All Scheme implementations must support at least the
ASCII character repertoire: that is, Unicode characters U+0000 through
U+007F.  Implementations may support any other Unicode characters they
see fit, and may also support non-Unicode characters.  Except
as otherwise specified, the result of applying any of the following
procedures to a non-Unicode character is implementation-dependent.

Characters are written using the notation @code{#\}@svar{character}
or @code{#\}@svar{character name} or @code{#\x}@svar{hex scalar value}.

The following character names must be supported by all implementations
with the given values.  Implementations may add other names provided they
cannot be interpreted as hex scalar values preceded by @code{x}.

@lisp
#\alarm      @r{; U+0007}
#\backspace  @r{; U+0008}
#\delete     @r{; U+007F}
#\escape     @r{; U+001B}
#\newline    @r{; the linefeed character, U+000A}
#\null       @r{; the null character, U+0000}
#\return     @r{; the return character, U+000D}
#\space      @r{; the preferred way to write a space}
#\tab        @r{; the tab character, U+0009}
@end lisp

Here are some additional examples:

@lisp
#\a          @r{; lower case letter}
#\A          @r{; upper case letter}
#\(          @r{; left parenthesis}
#\           @r{; the space character}
#\x03BB      @r{; @theultimate{} (if character is supported)}
#\iota       @r{; @greekiota{} (if character and name are supported)}
@end lisp

Case is significant in @code{#\}@svar{character}, and in
@code{#\}@svar{character name}, but not in
@code{#\x}@svar{hex scalar value}.  If @svar{character} in
@code{#\}@svar{character} is alphabetic,
then any character immediately following @svar{character} cannot be
one that can appear in an identifier.
This rule resolves the ambiguous
case where, for example, the sequence of characters @samp{#\space}
could be taken to be either a representation of the space character or a
representation of the character @code{#\s} followed by a representation
of the symbol @code{pace}.

Characters written in the @code{#\} notation are self-evaluating.  That
is, they do not have to be quoted in programs.

Some of the procedures that operate on characters ignore the difference
between upper case and lower case.  The procedures that ignore case have
@samp{-ci} (for ``case insensitive'') embedded in their names.

@deffn procedure char? obj

Returns @code{#t} if @var{obj} is a character, otherwise returns
@code{#f}.

@end deffn

@deffn procedure char=? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx procedure char<? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx procedure char>? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx procedure char<=? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx procedure char>=? @vari{char} @varii{char} @variii{char}@dots{}

These procedures return @code{#t} if the results of passing their
arguments to @code{char->integer} are respectively equal, monotonically
increasing, monotonically decreasing, monotonically non-decreasing, or
monotonically non-increasing.

These predicates are required to be transitive.

@end deffn

@deffn {char library procedure} char-ci=? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx {char library procedure} char-ci<? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx {char library procedure} char-ci>? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx {char library procedure} char-ci<=? @vari{char} @varii{char} @variii{char}@dots{}
@deffnx {char library procedure} char-ci>=? @vari{char} @varii{char} @variii{char}@dots{}

These procedures are similar to @code{char=?} et cetera, but they treat
upper case and lower case letters as the same.  For example,
@code{(char-ci=? #\A #\a)} returns @code{#t}.

Specifically, these procedures behave as if @code{char-foldcase} were
applied to their arguments before they were compared.

@end deffn

@deffn {char library procedure} char-alphabetic? char
@deffnx {char library procedure} char-numeric? char
@deffnx {char library procedure} char-whitespace? char
@deffnx {char library procedure} char-upper-case? letter
@deffnx {char library procedure} char-lower-case? letter

These procedures return @code{#t} if their arguments are alphabetic,
numeric, whitespace, upper case, or lower case characters, respectively,
otherwise they return @code{#f}.

Specifically, they must return @code{#t} when applied to characters with
the Unicode properties Alphabetic, Numeric_Type=Decimal, White_Space,
Uppercase, and Lowercase respectively, and @code{#f} when applied to
any other Unicode characters.  Note that many Unicode characters are
alphabetic but neither upper nor lower case.

@end deffn

@deffn {char library procedure} digit-value char

This procedure returns the numeric value (0 to 9) of its argument if it
is a numeric digit (that is, if @code{char-numeric?} returns
@code{#t}), or @code{#f} on any other character.

@lisp
(digit-value #\3)      @result{} 3
(digit-value #\x0664)  @result{} 4
(digit-value #\x0AE6)  @result{} 0
(digit-value #\x0EA6)  @result{} #f
@end lisp
@end deffn

@deffn procedure char->integer char
@deffnx procedure integer->char n

Given a Unicode character, @code{char->integer} returns an exact
integer between 0 and @code{#xD7FF} or between @code{#xE000} and
@code{#x10FFFF} which is equal to the Unicode scalar value of that
character.
Given a non-Unicode character, it returns an exact integer
greater than @code{#x10FFFF}.  This is true independent of whether the
implementation uses the Unicode representation internally.

Given an exact integer that is the value returned by a character when
@code{char->integer} is applied to it, @code{integer->char} returns
that character.

@end deffn

@deffn {char library procedure} char-upcase char
@deffnx {char library procedure} char-downcase char
@deffnx {char library procedure} char-foldcase char

The @code{char-upcase} procedure, given an argument that is the lowercase
member of a Unicode casing pair, returns the uppercase member of the pair,
provided that both characters are supported by the Scheme implementation.
Note that language-sensitive casing pairs are not used.  If the argument
is not the lowercase member of such a pair, it is returned.

The @code{char-downcase} procedure, given an argument that is
the uppercase member of a Unicode casing pair, returns the lowercase
member of the pair, provided that both characters are supported by the
Scheme implementation.  Note that language-sensitive casing pairs are
not used.  If the argument is not the uppercase member of such a pair,
it is returned.

The @code{char-foldcase} procedure applies the Unicode simple
case-folding algorithm to its argument and returns the result.  Note that
language-sensitive folding is not used.  If the character that results
from folding is not supported by the implementation, the argument is
returned.
See UAX #29 [@ref{uax29}] (part of the Unicode Standard)
for details.

Note that many Unicode lowercase characters do not have uppercase
equivalents.

@end deffn
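
The following informal sketch illustrates the case-conversion procedures
described above.  It is not part of the normative text.  It assumes that
the @code{(scheme char)} library has been imported, and the last line
additionally assumes that the implementation supports U+00DF, a lowercase
letter with no simple uppercase mapping in Unicode.

@lisp
(char-upcase #\a)     @result{} #\A
(char-downcase #\A)   @result{} #\a
(char-foldcase #\A)   @result{} #\a
(char-upcase #\A)     @result{} #\A    @r{; already upper case}
(char-downcase #\3)   @result{} #\3    @r{; not a member of a casing pair}
(char-upcase #\xDF)   @result{} #\xDF  @r{; no uppercase equivalent}
@end lisp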