
Strings

Besides numbers, Python can also manipulate strings, enclosed in single quotes or double quotes:


>>> 'foo bar'
'foo bar'
>>> 'doesn\'t'
"doesn't"
>>> "doesn't"
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
>>>
The interpreter writes a string the same way it is typed for input: inside quotes, with quotes and other special characters escaped by backslashes, so that the precise value is shown. The string is enclosed in double quotes if it contains a single quote and no double quotes; otherwise it is enclosed in single quotes. (The print statement, described later, can be used to write strings without quotes or escapes.)
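
As a small sketch of the quoting rule, the built-in function repr() returns the same quoted form that the interpreter echoes. The variable names here are only illustrative, and the parenthesized print calls assume an interpreter that accepts that form.

```python
# repr() returns the quoted, escaped form that the interpreter echoes.
plain = 'foo bar'
apostrophe = 'doesn\'t'
both = '"Isn\'t," she said.'

shown_plain = repr(plain)            # single quotes by default
shown_apostrophe = repr(apostrophe)  # double quotes: contains ' but no "
shown_both = repr(both)              # single quotes, with the ' escaped

print(shown_apostrophe)   # the quoted form
print(apostrophe)         # the bare characters, no quotes or escapes
```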

Strings can be concatenated (glued together) with the + operator, and repeated with *:


>>> word = 'Help' + 'A'
>>> word
'HelpA'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
>>>
Strings can be subscripted (indexed); as in C, the first character of a string has subscript (index) 0.

There is no separate character type; a character is simply a string of size one. As in Icon, substrings can be specified with slice notation: two indices separated by a colon.


>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[2:4]
'lp'
>>>
Slice indices have useful defaults: an omitted first index defaults to zero, and an omitted second index defaults to the size of the string being sliced.


>>> word[:2]    # The first two characters
'He'
>>> word[2:]    # All but the first two characters
'lpA'
>>>
Here's a useful invariant of slice operations: s[:i] + s[i:] equals s.


>>> word[:2] + word[2:]
'HelpA'
>>> word[:3] + word[3:]
'HelpA'
>>>
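
The invariant can also be checked mechanically at every split point of a string. The following is a minimal sketch; the function name holds_for_all_splits is hypothetical, and it relies on the built-in len() to find the size of the string.

```python
def holds_for_all_splits(s):
    # Hypothetical helper: check s[:i] + s[i:] == s for every i
    # from 0 up to and including the size of s.
    for i in range(len(s) + 1):
        if s[:i] + s[i:] != s:
            return False
    return True

word = 'HelpA'
ok = holds_for_all_splits(word)
```

The check also succeeds for the empty string, where the only split point is 0.
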
Degenerate slice indices are handled gracefully: an index that is too large is replaced by the string size, and a slice whose upper bound is smaller than its lower bound is an empty string.


>>> word[1:100]
'elpA'
>>> word[10:]
''
>>> word[2:1]
''
>>>
Indices may be negative numbers, to start counting from the right. For example:


>>> word[-1]     # The last character
'A'
>>> word[-2]     # The last-but-one character
'p'
>>> word[-2:]    # The last two characters
'pA'
>>> word[:-2]    # All but the last two characters
'Hel'
>>>
But note that -0 is really the same as 0, so it does not count from the right!


>>> word[-0]     # (since -0 equals 0)
'H'
>>>
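
The same pitfall applies to slices: word[-n:] gives the last n characters only when n is not zero. One possible workaround, sketched here with a hypothetical helper named last_n, is to count from the left instead.

```python
def last_n(s, n):
    # Hypothetical helper: the last n characters of s, for n >= 0.
    # Counting from the left avoids the -0 pitfall; max() keeps the
    # start index from going negative when n exceeds the size of s.
    start = max(len(s) - n, 0)
    return s[start:]

word = 'HelpA'
naive = word[-0:]          # same as word[0:], so the whole string
careful = last_n(word, 0)  # the empty string
```
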
Out-of-range negative slice indices are truncated to the start of the string, but this does not apply to single-element (non-slice) indices, which raise an error:


>>> word[-100:]
'HelpA'
>>> word[-10]    # error
Traceback (innermost last):
  File "<stdin>", line 1
IndexError: string index out of range
>>>
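
The rules for negative and out-of-range slice indices can be summarized in code. This is only a sketch of the behavior described in this section, not the interpreter's actual implementation; the names normalize and manual_slice are hypothetical.

```python
def normalize(index, size):
    # Hypothetical helper: map one slice index into the range 0..size.
    if index < 0:
        index = index + size      # negative indices count from the right
    if index < 0:
        return 0                  # still negative: truncate to the start
    if index > size:
        return size               # too large: replace by the string size
    return index

def manual_slice(s, i, j):
    # Rebuild s[i:j] one character at a time from the normalized bounds.
    lo = normalize(i, len(s))
    hi = normalize(j, len(s))
    result = ''
    k = lo
    while k < hi:                 # nothing is copied when hi <= lo
        result = result + s[k]
        k = k + 1
    return result

word = 'HelpA'
```

With these definitions, manual_slice(word, 1, 100) yields 'elpA' and manual_slice(word, -100, 5) yields 'HelpA', matching the built-in slices shown earlier.
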
The best way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n. For example:


 +---+---+---+---+---+ 
 | H | e | l | p | A |
 +---+---+---+---+---+ 
 0   1   2   3   4   5 
-5  -4  -3  -2  -1
The first row of numbers gives the position of the indices 0...5 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j, respectively.

For nonnegative indices, the length of a slice is the difference of the indices, provided both are within bounds and the first is not larger than the second. For example, the length of word[1:3] is 2.

The built-in function len() returns the length of a string:


>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34
>>>
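
Using len(), the rule for the length of a slice can be checked for every in-bounds pair of indices. This is a minimal sketch; the list name mismatches is only illustrative.

```python
word = 'HelpA'
n = len(word)

# Collect every in-bounds pair (i, j), with i <= j, for which the
# slice length differs from j - i. The list should stay empty.
mismatches = []
for i in range(n + 1):
    for j in range(i, n + 1):
        if len(word[i:j]) != j - i:
            mismatches.append((i, j))

# Outside the bounds the rule no longer applies: the upper index is
# replaced by the string size, so this slice has 4 characters, not 99.
clipped = len(word[1:100])
```
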




guido@cwi.nl