The String Data Type

A string is a sequence of characters. In Chapter 2 you learned that a string literal is a sequence of characters in quotation marks. Python also allows strings to be delimited by single quotes (apostrophes). There's no difference; just be sure to use a matching set. Strings can be stored in variables, just like numbers. Here are some examples illustrating these two forms of string literals.

>>> str1 = "Hello"
>>> str2 = 'spam'
>>> print str1, str2
Hello spam
>>> type(str1)
<type 'string'>
>>> type(str2)
<type 'string'>

You already know how to print strings. Some programs also need to get string input from the user (e.g., a name). Getting string-valued input requires a bit of care. Remember that the input statement treats whatever the user types as an expression to be evaluated. Consider the following interaction.

>>> firstName = input("Please enter your name: ")
Please enter your name: John
Traceback (innermost last):
    firstName = input("Please enter your name: ")
  File "<string>", line 0, in ?
NameError: John

Something has gone wrong here. Can you see what the problem is?

Remember, an input statement is just a delayed expression. When I entered the name, "John", this had the exact same effect as executing this assignment statement:

firstName = John

This statement says, "look up the value of the variable John and store that value in firstName." Since John was never given a value, Python cannot find any variable with that name and responds with a NameError. One way to fix this problem is to type quotes around a string input so that it evaluates as a string literal.

>>> firstName = input("Please enter your name: ")
Please enter your name: "John"
>>> print "Hello", firstName
Hello John

This works, but it is not a very satisfactory solution. We shouldn't have to burden the users of our programs with details like typing quotes around their names.
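The NameError can be reproduced without typing anything at the keyboard. The following is a minimal sketch, assuming the evaluating input statement treats the typed text the way the built-in eval function does; the helper name simulate_input is hypothetical and not part of Python.

```python
# Sketch only: simulate_input is a hypothetical helper that mimics an
# input statement which evaluates whatever the user types.
def simulate_input(typed_text):
    """Evaluate typed_text as a Python expression and return the result."""
    return eval(typed_text, {})

# Typing John (no quotes) asks Python to look up a variable named John.
try:
    simulate_input("John")
    outcome = "no error"
except NameError:
    outcome = "NameError"

# Typing "John" (with quotes) evaluates as a string literal.
quoted = simulate_input('"John"')
```

Here outcome ends up as "NameError", while quoted holds the string John, matching the two interactions shown above.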

Python provides a better mechanism. The raw_input function is exactly like input except it does not evaluate the expression that the user types. The input is simply handed to the program as a string of text. Revisiting our example, here is how it looks with raw_input:

>>> firstName = raw_input("Please enter your name: ")
Please enter your name: John
>>> print "Hello", firstName
Hello John

Notice that this example works as expected without having to type quotes around the input. If you want to get textual input from the user, raw_input is the way to do it.
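The sketch below goes slightly beyond the text and rests on one assumption: the reader may be running a newer interpreter. In Python 3, raw_input no longer exists and input itself hands back the typed text without evaluating it. The names read_text and make_greeting are hypothetical helpers chosen for illustration.

```python
# Sketch only: pick whichever function returns the typed text unevaluated.
try:
    read_text = raw_input   # available in the Python version used in this chapter
except NameError:
    read_text = input       # Python 3: input already returns plain text

def make_greeting(first_name):
    """Build the greeting that the interactive example prints."""
    return "Hello " + first_name

# Interactive use (commented out so the sketch runs without a keyboard):
# print(make_greeting(read_text("Please enter your name: ")))
```

Either way, the program receives the name as a string, so the user never has to type quotes.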

So far, we have seen how to get strings as input, assign them to variables and print them out. That's enough to write a parrot program, but not to do any serious text-based computing. For that, we need some string operations. The rest of this section takes you on a tour of the more important Python string operations. In the following section, we'll put these ideas to work in some example programs.

While the idea of numeric operations may be old hat to you from your math studies, you may not have thought about string operations before. What kinds of things can we do with strings?

For starters, remember what a string is: a sequence of characters. One thing we might want to do is access the individual characters that make up the string. In Python, this can be done through the operation of indexing. We can think of the positions in a string as being numbered, starting from the left with 0. Figure 4.1 illustrates with the string "Hello Bob".

  H   e   l   l   o       B   o   b
  0   1   2   3   4   5   6   7   8

Figure 4.1: Indexing of the string "Hello Bob"

Indexing is used in string expressions to access a specific character position in the string. The general form for indexing is <string>[<expr>]. The value of the expression determines which character is selected from the string. Here are some interactive indexing examples:
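As a minimal sketch of indexing, assume a variable named greet holds the string from Figure 4.1 (the slicing examples later in this section use the same name). The other variable names are chosen here for illustration only.

```python
greet = "Hello Bob"

first = greet[0]          # 'H', the character at position 0
third = greet[2]          # 'l'
gap = greet[5]            # ' ', the space also occupies a position

# The index may be any int-valued expression, not just a literal.
x = 8
computed = greet[x - 2]   # 'B', the character at position 6

# The string has 9 characters, so the last one sits at position 8.
n = len(greet)
last = greet[n - 1]       # 'b'
```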

Notice that, in a string of n characters, the last character is at position n - 1, because the indexes start at 0.

Indexing returns a string containing a single character from a larger string. It is also possible to access a contiguous sequence of characters or substring from a string. In Python, this is accomplished through an operation called slicing. You can think of slicing as a way of indexing a range of positions in the string. Slicing takes the form <string>[<start>:<end>]. Both start and end should be int-valued expressions. A slice produces the substring starting at the position given by start and running up to, but not including, position end.

Continuing with our interactive example, here are some slices.

>>> greet[:5]
'Hello'
>>> greet[5:]
' Bob'
>>> greet[:]
'Hello Bob'

The last three examples show that if either expression is missing, the start and end of the string are the assumed defaults. The final expression actually hands back the entire string.
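Slices with both positions written out behave the same way. The following is a minimal sketch, again assuming greet holds "Hello Bob"; the remaining variable names are illustrative.

```python
greet = "Hello Bob"

first_three = greet[0:3]   # 'Hel': positions 0, 1 and 2, but not 3
tail = greet[5:9]          # ' Bob': positions 5 through 8
name = greet[6:9]          # 'Bob'

# Leaving out a position falls back to the start or end of the string.
same_tail = greet[5:]      # ' Bob', identical to greet[5:9]
whole = greet[:]           # 'Hello Bob'
```

Because the end position is excluded, greet[0:3] has exactly 3 - 0 = 3 characters.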

Indexing and slicing are useful operations for chopping strings into smaller pieces. The string data type also supports operations for putting strings together. Two handy operators are concatenation (+) and repetition (*). Concatenation builds a string by "gluing" two strings together. Repetition builds a string by multiple concatenations of a string with itself. Another useful function is len, which tells how many characters are in a string. Here are some examples:

>>> "spam" + "eggs"
'spameggs'
>>> "Spam" + "And" + "Eggs"
'SpamAndEggs'
>>> 3 * "spam"
'spamspamspam'
>>> "spam" * 5
'spamspamspamspamspam'
>>> (3 * "spam") + ("eggs" * 5)
'spamspamspameggseggseggseggseggs'
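The len function fits naturally with these operators. The sketch below uses illustrative variable names and shows how the length of a result follows from the lengths of its pieces.

```python
breakfast = "spam" + "eggs"        # concatenation: 'spameggs'
chant = 3 * "spam"                 # repetition: 'spamspamspam'

word_length = len("spam")          # 4
breakfast_length = len(breakfast)  # 8, that is 4 + 4
chant_length = len(chant)          # 12, that is 3 * 4
title_length = len("SpamAndEggs")  # 11
```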

These basic string operations are summarized in Table 4.1.

Operator             Meaning
+                    Concatenation
*                    Repetition
<string>[ ]          Indexing
len(<string>)        Length
<string>[ : ]        Slicing

Table 4.1: Python string operations
