@node Strings
@section Strings

@cindex escape sequence

Strings are sequences of characters. Strings are written as sequences
of characters enclosed within quotation marks (@code{"}). Within a
string literal, various escape sequences represent characters other
than themselves. Escape sequences always start with a backslash
(@code{\}):

@c We can't use a two-column table here because user-defined macros
@c are apparently "unreliable" in tables.
@itemize @bullet

@item
@code{\a} : alarm, U+0007

@item
@code{\b} : backspace, U+0008

@item
@code{\t} : character tabulation, U+0009

@item
@code{\n} : linefeed, U+000A

@item
@code{\r} : return, U+000D

@item
@code{\"} : double quote, U+0022

@item
@code{\\} : backslash, U+005C

@item
@code{\|} : vertical line, U+007C

@item
@code{\}@svar{intraline whitespace}*@svar{line ending}
@svar{intraline whitespace}* : nothing

@item
@code{\x}@svar{hex scalar value}@code{;} : specified character
(note the terminating semi-colon).

@end itemize

The result is unspecified if any other character in a string occurs
after a backslash.

Except for a line ending, any character outside of an escape sequence
stands for itself in the string literal. A line ending which is
preceded by @code{\}@svar{intraline whitespace} expands to nothing
(along with any trailing intraline whitespace), and can be used to
indent strings for improved legibility. Any other line ending has the
same effect as inserting a @code{\n} character into the string.

Examples:

@example
"The word \"recursion\" has many meanings."
"Another example:\ntwo lines of text"
"Here's text \
   containing just one line"
"\x03B1; is named GREEK SMALL LETTER ALPHA."
@end example

@cindex valid indexes

The @define{length} of a string is the number of characters that it
contains.
This number is an exact, non-negative integer that is fixed
when the string is created. The @define{valid indexes} of a string are the
exact non-negative integers less than the length of the string. The
first character of a string has index 0, the second has index 1, and so
on.

Some of the procedures that operate on strings ignore the difference
between upper and lower case. The names of the versions that ignore
case end with @samp{-ci} (for ``case insensitive'').

Implementations may forbid certain characters from appearing in
strings. However, with the exception of @code{#\null}, ASCII characters
must not be forbidden. For example, an implementation might support the
entire Unicode repertoire, but only allow characters U+0001 to U+00FF
(the Latin-1 repertoire without @code{#\null}) in strings.

It is an error to pass such a forbidden character to
@code{make-string}, @code{string}, @code{string-set!}, or
@code{string-fill!}, as part of the list passed to @code{list->string},
or as part of the vector passed to @code{vector->string} (see
@ref{Vectors}), or in UTF-8 encoded form within a bytevector passed to
@code{utf8->string} (see @ref{Bytevectors}). It is also an error for a
procedure passed to @code{string-map} (see @ref{Control features}) to
return a forbidden character, or for @code{read-string} (see
@ref{Input}) to attempt to read one.

@deffn procedure string? obj

Returns @code{#t} if @var{obj} is a string, otherwise returns @code{#f}.

@end deffn

@deffn procedure make-string k
@deffnx procedure make-string k char

The @code{make-string} procedure returns a newly allocated string of
length @var{k}. If @var{char} is given, then all the characters of the
string are initialized to @var{char}, otherwise the contents of the
string are unspecified.
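
As a brief illustration (a sketch, not part of the normative text), the
following calls show both forms of @code{make-string}. Because the
contents of a string created without @var{char} are unspecified, the
second call inspects only its length.

@lisp
(make-string 3 #\x)              @result{} "xxx"
(string-length (make-string 5))  @result{} 5
(make-string 0 #\x)              @result{} ""
@end lisp
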
@end deffn

@deffn procedure string char@dots{}

Returns a newly allocated string composed of the arguments. It is
analogous to @code{list}.

@end deffn

@deffn procedure string-length string

Returns the number of characters in the given @var{string}.

@end deffn

@deffn procedure string-ref string k

It is an error if @var{k} is not a valid index of @var{string}.

The @code{string-ref} procedure returns character @var{k} of
@var{string} using zero-origin indexing.

There is no requirement for this procedure to execute in constant time.

@end deffn

@deffn procedure string-set! string k char

It is an error if @var{k} is not a valid index of @var{string}.

The @code{string-set!} procedure stores @var{char} in element @var{k}
of @var{string}. There is no requirement for this procedure to execute
in constant time.

@lisp
(define (f) (make-string 3 #\*))
(define (g) "***")
(string-set! (f) 0 #\?)  @result{} @r{unspecified}
(string-set! (g) 0 #\?)  @result{} @r{error}
(string-set! (symbol->string 'immutable)
             0
             #\?)  @result{} @r{error}
@end lisp

@end deffn

@deffn procedure string=? @vari{string} @varii{string} @variii{string}@dots{}

Returns @code{#t} if all the @var{string}s are the same length and
contain exactly the same characters in the same positions, otherwise
returns @code{#f}.

@end deffn

@deffn {char library procedure} string-ci=? @vari{string} @varii{string} @variii{string}@dots{}

Returns @code{#t} if, after case-folding, all the @var{string}s are the
same length and contain the same characters in the same positions,
otherwise returns @code{#f}. Specifically, this procedure behaves as
if @code{string-foldcase} were applied to its arguments before
comparing them.

@end deffn

@deffn procedure string<? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx {char library procedure} string-ci<? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx procedure string>? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx {char library procedure} string-ci>? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx procedure string<=? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx {char library procedure} string-ci<=? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx procedure string>=? @vari{string} @varii{string} @variii{string}@dots{}
@deffnx {char library procedure} string-ci>=? @vari{string} @varii{string} @variii{string}@dots{}

These procedures return @code{#t} if their arguments are (respectively):
monotonically increasing, monotonically decreasing, monotonically
non-decreasing, or monotonically non-increasing.

These predicates are required to be transitive.

These procedures compare strings in an implementation-defined way.
One approach is to make them the lexicographic extensions to strings of
the corresponding orderings on characters. In that case, @code{string<?}
would be the lexicographic ordering on strings induced by the ordering
@code{char<?} on characters, and if the two strings differ in length but
are the same up to the length of the shorter string, the shorter string
would be considered to be lexicographically less than the longer string.
However, it is also permitted to use the natural ordering imposed by the
implementation's internal representation of strings, or a more complex
locale-specific ordering.

In all cases, a pair of strings must satisfy exactly one of
@code{string<?}, @code{string=?}, and @code{string>?}, and must satisfy
@code{string<=?} if and only if they do not satisfy @code{string>?}
and @code{string>=?} if and only if they do not satisfy @code{string<?}.
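
The following sketch assumes an implementation that takes the
lexicographic approach described above, with characters ordered by
@code{char<?}. Under another permitted ordering, some of these results
may differ.

@lisp
(string<? "abc" "abd")         @result{} #t
(string<? "abc" "abcd")        @result{} #t  ; a proper prefix sorts first
(string<? "a" "b" "c")         @result{} #t
(string<? "a" "c" "b")         @result{} #f
(string<=? "abc" "abc" "abd")  @result{} #t
@end lisp
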
The @samp{-ci} procedures behave as if they applied @code{string-foldcase}
to their arguments before invoking the corresponding procedures without
@samp{-ci}.

@end deffn

@deffn {char library procedure} string-upcase string
@deffnx {char library procedure} string-downcase string
@deffnx {char library procedure} string-foldcase string

These procedures apply the Unicode full string uppercasing, lowercasing,
and case-folding algorithms to their arguments and return the result.
In certain cases, the result differs in length from the argument.
If the result is equal to the argument in the sense of @code{string=?},
the argument may be returned. Note that language-sensitive mappings
and foldings are not used.

The Unicode Standard prescribes special treatment of the Greek letter
@greekcapitalsigma{}, whose normal lower-case form is @greeksmallsigma{}
but which becomes @greekfinalsigma{} at the end of a word. See UAX #44
[@ref{uax44}]
(part of the Unicode Standard) for details. However, implementations of
@code{string-downcase} are not required to provide this behavior, and may
choose to change @greekcapitalsigma{} to @greeksmallsigma{} in all cases.

@end deffn

@deffn procedure substring string start end

The @code{substring} procedure returns a newly allocated string formed
from the characters of @var{string} beginning with index @var{start}
(inclusive) and ending with index @var{end} (exclusive). This is
equivalent to calling @code{string-copy} with the same arguments, but is
provided for backward compatibility and stylistic flexibility.

@end deffn

@deffn procedure string-append string@dots{}

Returns a newly allocated string whose characters are the concatenation
of the characters in the given @var{string}s.
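
For illustration, a few calls to @code{string-append}, including the
zero-argument case, which yields an empty string:

@lisp
(string-append)                 @result{} ""
(string-append "foo" "bar")     @result{} "foobar"
(string-append "a" "" "b" "c")  @result{} "abc"
@end lisp
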
@end deffn

@deffn procedure string->list string
@deffnx procedure string->list string start
@deffnx procedure string->list string start end
@deffnx procedure list->string list

It is an error if any element of @var{list} is not a character.

The @code{string->list} procedure returns a newly allocated list of
the characters of @var{string} between @var{start} and @var{end}.
@code{list->string} returns a newly allocated string formed from the
elements in the list @var{list}. In both procedures, order is preserved.
@code{string->list} and @code{list->string} are inverses so far as
@code{equal?} is concerned.

@end deffn

@deffn procedure string-copy string
@deffnx procedure string-copy string start
@deffnx procedure string-copy string start end

Returns a newly allocated copy of the part of the given @var{string}
between @var{start} and @var{end}.

@end deffn

@deffn procedure string-copy! to at from
@deffnx procedure string-copy! to at from start
@deffnx procedure string-copy! to at from start end

It is an error if @var{at} is less than zero or greater than the length of
@var{to}. It is also an error if @code{(- (string-length }@var{to}@code{)
}@var{at}@code{)} is less than @code{(- }@var{end} @var{start}@code{)}.

Copies the characters of string @var{from} between @var{start} and
@var{end} to string @var{to}, starting at @var{at}. The order in
which characters are copied is unspecified, except that if the source
and destination overlap, copying takes place as if the source is first
copied into a temporary string and then into the destination. This can
be achieved without allocating storage by making sure to copy in the
correct direction in such circumstances.

@lisp
(define a "12345")
(define b (string-copy "abcde"))
(string-copy! b 1 a 0 2)
b @result{} "a12de"
@end lisp

@end deffn

@deffn procedure string-fill! string fill
@deffnx procedure string-fill! string fill start
@deffnx procedure string-fill! string fill start end

It is an error if @var{fill} is not a character.

The @code{string-fill!} procedure stores @var{fill} in the elements of
@var{string} between @var{start} and @var{end}.

@end deffn
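
As a small sketch of @code{string-fill!}, the following fills a whole
string and then a sub-range. The variable @code{s} is introduced only for
this illustration, and the example assumes the usual convention that
@var{start} is inclusive and @var{end} is exclusive.

@lisp
(define s (make-string 5 #\-))
(string-fill! s #\*)      ; s now holds "*****"
(string-fill! s #\x 1 3)  ; only indexes 1 and 2 are replaced
s @result{} "*xx**"
@end lisp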