1.8 Character strings

A character string or string literal (string for short) is a sequence of zero or more characters (byte or word sized), enclosed in single quotes, and on a single line of the program source code: no literal carriage return or linefeed characters can appear in the string.

A string with nothing between the quotes ('') is an empty string.

_________________________________________________________________________________________________________
Character strings

  string-literal:    one or more of ( quoted-string | control-string )

  quoted-string:     '  zero or more string-character  '

  string-character:  any character except ' or CR
                     | '' (two consecutive single quotes)

  control-string:    one or more of ( # unsigned-integer )
_________________________________________________________________________________________________________

The string consists of standard 8-bit ASCII characters or Unicode (normally UTF-8 encoded) characters. A control string, written as # followed by the numeric code of the character, can be used to specify characters which cannot be typed on a keyboard, such as #27 for the escape character.
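As an illustration of control strings, the following minimal sketch declares a few constants built from them and inspects the result. The program and constant names are chosen for this example only; they do not come from the text above.

```pascal
program ControlStrings;

const
  Esc   = #27;              { the escape character, ordinal value 27 }
  CrLf  = #13#10;           { two adjacent control strings form one two-character string }
  Mixed = 'Tab:'#9'done';   { quoted and control strings can be mixed in one literal }

begin
  WriteLn(Ord(Esc));        { prints 27 }
  WriteLn(Length(CrLf));    { prints 2 }
  WriteLn(Length(Mixed));   { prints 9: 4 + 1 + 4 characters }
end.
```

No concatenation operator is needed between the quoted parts and the control strings: they are written directly next to each other and together form a single string literal.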

The single quote character can be embedded in the string by typing it twice. The C construct of escaping characters in the string (using a backslash) is not supported in Pascal.
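A short sketch of the doubling rule, and of the fact that a backslash has no special meaning, might look as follows. The identifiers are illustrative only.

```pascal
program QuoteDemo;

const
  S = 'It''s a Pascal string';   { the doubled quote yields one quote character }
  Q = '''';                      { a literal holding exactly one single quote }

begin
  WriteLn(S);               { prints: It's a Pascal string }
  WriteLn(Q);               { prints: ' }
  WriteLn('C:\temp\new');   { backslashes are ordinary characters: prints C:\temp\new }
end.
```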

The following are valid string constants:

  'This is a pascal string'
  ''
  'a'
  'A tabulator character: '#9' is easy to embed'

The following is an invalid string:

  'the string starts here
   and continues here'

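The above string must be typed with explicit line-ending characters. On systems that use a CR/LF line ending, such as Windows, it is typed as: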

  'the string starts here'#13#10'   and continues here'

or

  'the string starts here'#10'   and continues here'

on Unix-like systems (including Mac OS X), and as

  'the string starts here'#13'   and continues here'

on a classic Mac-like operating system.
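When the same source must compile on several platforms, hard-coding one of the sequences above can be avoided. The sketch below assumes the LineEnding constant provided by the Free Pascal system unit, which holds the line ending of the target platform. It is not part of the string syntax described in this section, and the program name is illustrative.

```pascal
program LineBreakDemo;

begin
  { LineEnding is assumed to come from the Free Pascal system unit;
    it expands to the line-ending sequence of the target platform. }
  WriteLn('the string starts here' + LineEnding + '   and continues here');
end.
```

Unlike control strings, LineEnding is an ordinary constant, so it has to be joined to the quoted strings with the + operator.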

It is possible to use other character sets in strings. The codepage of the source file must then be specified with the {$CODEPAGE XXX} directive or with the -Fc command line option for the compiler, and the characters in a string will be interpreted as characters from the specified codepage.
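A minimal sketch of the directive in use is given below. It assumes that the source file is actually saved in UTF-8 encoding and that UTF8 is an accepted codepage name. The program name and the string are illustrative, and what appears on screen depends on the console encoding.

```pascal
program CodepageDemo;

{$codepage UTF8}   { assumption: this source file is stored as UTF-8 }

const
  Greeting = 'Grüße';   { non-ASCII characters, interpreted using the codepage above }

begin
  WriteLn(Greeting);
end.
```

Specifying the codepage with the -Fc command line option instead of the directive should have the same effect, and keeps the source files free of encoding directives.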