More str operations and methods

Escape sequences

When working with multi-line strings, you learned about the newline character: '\n'.

In [1]:
s = """Line one,
line two,
line three."""
In [2]:
s
Out[2]:
'Line one,\nline two,\nline three.'

The backslash character, \, is called the escape character. It can be combined with other characters to form escape sequences. For example, if we want to include a single quote inside a single-quoted string, we use the escape character:

In [3]:
# A failed attempt to include a single quote within a single-quoted string:
'can't'
  File "<ipython-input-3-a5c8cd786e02>", line 2
    'can't'
         ^
SyntaxError: invalid syntax

Now we use the escape character to indicate that the quote is a character within the string, not part of the string's opening or closing quotation marks:

In [4]:
'can\'t'
Out[4]:
"can't"

As the output above shows, we can use double quotes to include single quotes in a string without using the escape character. We can do the same with the types of quotes swapped:

In [5]:
'He said, "I will be there soon".'
Out[5]:
'He said, "I will be there soon".'
In [6]:
"He said, \"I will be there soon\"."
Out[6]:
'He said, "I will be there soon".'

Here is a summary of the escape sequences that we will use most:

  • \': single quote
  • \": double quote
  • \\: backslash
  • \n: newline
  • \t: tab
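As a minimal sketch of the summary above, the following uses each escape sequence once. The variable names and sample text are made up for illustration.

```python
# Each escape sequence below stands for one character inside the string.
quote_single = 'It\'s here'         # \' is a single quote
quote_double = "She said \"hi\""    # \" is a double quote
path = 'C:\\temp'                   # \\ is one backslash
two_lines = 'first\nsecond'         # \n is a newline
columns = 'name\tage'               # \t is a tab

# print shows the characters themselves rather than the escape sequences.
print(quote_single)
print(quote_double)
print(path)
print(two_lines)
print(columns)
```

Evaluating one of these variables on its own in the shell shows the escaped form, while print shows the resulting characters.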

String functions and operations

A string represents a sequence of characters. A string has a length, which we can find using the built-in function len:

In [7]:
len('happy')
Out[7]:
5

Although it is written using two symbols, an escape sequence counts as a single character:

In [8]:
len('\'')
Out[8]:
1

There are also several operators that can be applied to strings, including + (concatenation):

In [10]:
'hi' + 'there'
Out[10]:
'hithere'
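Two details of concatenation are easy to miss, so here is a short sketch with illustrative names. The + operator joins strings exactly as given, so a space has to be supplied explicitly, and both operands must be strings. The try/except statement is used only to keep the example running after the error.

```python
# The space between the words is a string of its own.
greeting = 'hi' + ' ' + 'there'
print(greeting)

# Combining a str and an int with + raises a TypeError,
# so we convert the number to a str first.
try:
    label = 'room ' + 12
except TypeError:
    label = 'room ' + str(12)
print(label)
```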

Another string operator is in, which we use to check whether one string is a substring of another:

In [11]:
'ha' in 'happy'
Out[11]:
True
In [12]:
'hat' in 'happy'
Out[12]:
False
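A few more cases may help pin down what counts as a substring. This is a small sketch with illustrative variable names: the match is case-sensitive, the characters must be adjacent and in order, and the empty string is a substring of every string.

```python
word = 'happy'

lower_match = 'ha' in word     # same case, adjacent characters: True
upper_match = 'HA' in word     # different case: False
scattered = 'hpy' in word      # characters present but not adjacent: False
empty_match = '' in word       # the empty string is in every string: True
missing = 'z' not in word      # not in gives the opposite answer: True

print(lower_match, upper_match, scattered, empty_match, missing)
```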

As with numeric values, we can apply comparison operators to strings:

In [13]:
'a' == 'a'
Out[13]:
True
In [14]:
'a' == 'A'
Out[14]:
False
In [16]:
'a' != 'b'
Out[16]:
True
In [15]:
'A' < 'C'
Out[15]:
True
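The examples above do not say how the order is decided. Python compares strings character by character using each character's Unicode code point, which the built-in function ord reports. The sketch below, with illustrative variable names, shows a few consequences.

```python
# ord gives the code point used when characters are compared.
code_upper_a = ord('A')   # 65
code_lower_a = ord('a')   # 97

# Every uppercase English letter comes before every lowercase one.
uppercase_first = 'Z' < 'a'

# Strings are compared one character at a time, from the left.
word_order = 'apple' < 'banana'

# When one string is a prefix of the other, the shorter one is smaller.
prefix_first = 'app' < 'apple'

# Digits inside strings are compared as characters, not as numbers.
digits_as_text = '10' < '9'

print(code_upper_a, code_lower_a)
print(uppercase_first, word_order, prefix_first, digits_as_text)
```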

String indexing

Each character in the string has an index, representing its position within the string. We start counting from 0. For example:

In [17]:
s = 'happy'
In [18]:
s[0]
Out[18]:
'h'
In [19]:
s[1]
Out[19]:
'a'
In [20]:
s[2]
Out[20]:
'p'
In [21]:
s[3]
Out[21]:
'p'
In [22]:
s[4]
Out[22]:
'y'

If we try to access an index that doesn't exist, an error occurs:

In [23]:
s[5]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-23-b70192e38e48> in <module>()
----> 1 s[5]

IndexError: string index out of range
In [24]:
s[12]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-24-2adddb99d733> in <module>()
----> 1 s[12]

IndexError: string index out of range
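Put another way, when counting from 0 the last valid index is len(s) - 1. The following sketch checks this and catches the error rather than letting it stop the program. The try/except statement and the variable names are additions for illustration.

```python
s = 'happy'

# Counting from 0, the last valid index is one less than the length.
last_valid = len(s) - 1
print(s[last_valid])

# Indexing at len(s) is one position too far.
try:
    s[len(s)]
except IndexError as error:
    message = str(error)
    print('IndexError:', message)
```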

Negative indexes

An alternative to counting from 0, left to right, is to count from -1, right to left:

In [25]:
s[-1]
Out[25]:
'y'
In [26]:
s[-2]
Out[26]:
'p'
In [27]:
s[-3]
Out[27]:
'p'
In [28]:
s[-4]
Out[28]:
'a'
In [29]:
s[-5]
Out[29]:
'h'

As we saw with non-negative indices, an error occurs when we try to index into a string at a position that doesn't exist:

In [30]:
s[-6]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-30-2d41c3755976> in <module>()
----> 1 s[-6]

IndexError: string index out of range
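One way to relate the two numbering schemes: for a string of length n, index -k refers to the same character as index n - k. The loop below is a small sketch that pairs up both ways of reaching each character. The for loop and the names n, k, and pairs are illustrative choices.

```python
s = 'happy'
n = len(s)

# For k = 1, 2, ..., n, s[-k] and s[n - k] are the same character.
pairs = []
for k in range(1, n + 1):
    pairs.append((s[-k], s[n - k]))

print(pairs)
```

This also shows why s[-6] fails for 'happy': the valid negative indices run only from -1 down to -len(s).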

String slicing

Using indexing, we can produce a new single-character string. We can use slicing to produce new strings of 0 or more characters, by providing start and stop indices. For example, here we take a slice from start index 2 up to but not including stop index 4:

In [31]:
'computer'[2:4]
Out[31]:
'mp'

Here are a few more examples using both positive and negative indices, including cases where the slice is empty:

In [32]:
'computer'[1:6]
Out[32]:
'omput'
In [33]:
'computer'[3:4]
Out[33]:
'p'
In [34]:
'computer'[3:3]
Out[34]:
''
In [35]:
'computer'[-4:-2]
Out[35]:
'ut'
In [36]:
'computer'[-2:-4]
Out[36]:
''
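The examples above always give both indices. As a short sketch of two further behaviours of slicing: either index can be left out, and a slice that runs past the end of the string is shortened rather than causing an error. The variable names are illustrative.

```python
word = 'computer'

first_three = word[:3]    # omitted start means "from the beginning": 'com'
from_three = word[3:]     # omitted stop means "through the end": 'puter'
whole = word[:]           # both omitted gives the whole string
clipped = word[5:100]     # a stop past the end is allowed: 'ter'

print(first_three, from_three, whole, clipped)
```

This differs from indexing, where 'computer'[100] would raise an IndexError.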

String methods

We've seen Python's built-in functions. In addition to those functions, each Python type can also have a set of functions defined in it, called methods. For example, we can see the str methods by calling dir on type str (for now, ignore the ones with underscores in their names):

In [37]:
dir(str)
Out[37]:
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']
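To skip the underscore names automatically, one option is to filter the list. This sketch uses a list comprehension and the str method startswith, neither of which has been introduced yet, so treat it as a preview. The name public_names is an illustrative choice, and the exact list depends on the Python version.

```python
# Keep only the names that do not begin with an underscore.
public_names = [name for name in dir(str) if not name.startswith('_')]

print(len(public_names))
print(public_names[:5])
```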

We can then find out how to use each method by calling help:

In [38]:
help(str.upper)   # Note: we use the type (str) and the method name (upper), separated by a .
Help on method_descriptor:

upper(...)
    S.upper() -> str
    
    Return a copy of S converted to uppercase.

Now we can call the method:

In [39]:
'hello'.upper()
Out[39]:
'HELLO'
In [40]:
'acdc'.upper()
Out[40]:
'ACDC'
In [41]:
'c4m'.upper()
Out[41]:
'C4M'

In place of the str object, we can use a variable that refers to a str object, such as s:

In [44]:
s = 'hello'
s.upper()
Out[44]:
'HELLO'

Notice that method upper returned a new uppercase string, but the original str object is unchanged:

In [45]:
s
Out[45]:
'hello'

Strings are immutable, so they cannot be modified.
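To see immutability directly, the sketch below tries to change one character by assigning to an index, which Python rejects. It then builds a new string instead. The try/except statement and the variable names are additions for illustration.

```python
s = 'hello'

# Assigning to an index is not allowed for strings.
try:
    s[0] = 'H'
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# Instead, we build a new string from the pieces we want.
capitalized = 'H' + s[1:]

print(s)            # the original is unchanged
print(capitalized)
```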

General form of method call

To call a method, we use this general form:

object.method(arguments)

For all str methods, the object on the left-hand side of the dot will always be a str.
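As a brief sketch of the general form, here are a few calls that pass arguments. The methods find, replace, and startswith are chosen only as examples. The last line uses the type name, as in the help call earlier, which gives the same result as word.upper().

```python
word = 'happy'

position = word.find('p')            # index of the first 'p': 2
swapped = word.replace('p', 'b')     # a new string: 'habby'
begins = word.startswith('ha')       # True

# Calling through the type passes the string as the first argument.
via_type = str.upper(word)

print(position, swapped, begins, via_type)
```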

Recall from the earlier example that calling the method upper on s returned a new string, 'HELLO', while the string that s refers to was left unchanged.

Practice Exercise: Exploring str methods

Call help on each str method to learn about it and then make an educated guess as to what each of the following method calls produces. Once you've done that, check your work by executing the method calls in the Python shell.

robot = 'R2D2'

  1. robot.isupper()
  2. robot.isalpha()
  3. robot.isalnum()
  4. robot.isdigit()
  5. robot.lower()
  6. robot.index('2')
  7. robot.index('2', 2)
  8. robot.count(2)