String in Python

This isn't the first time we are encountering strings since we started learning Python. Many of the previous tutorials used strings in examples or discussed them, so they shouldn't come as a surprise. Nonetheless, this chapter will give you more insight into how strings are used, manipulated and implemented in the Python world. We will also check out some handy operations for manipulating strings. So, without wasting time, let's jump right into it.


What is a String?

A string can be defined as a sequence of characters, and that's the most basic explanation of a string that you can provide. This definition contains two important terms: sequence and characters. If you are here after finishing the last tutorial, we have already explained there what a sequence data type is and how strings are a type of sequence. Just for revision: in Python, a sequence is a data type made up of an ordered collection of elements, such as integers, floats or strings.

Note: Every existing character is assigned a unique code. This coding convention is known as Unicode. It covers characters from almost every language and, in fact, emoji too (yes, emoji are characters as well).
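To make the idea of one unique code per character concrete, here is a minimal sketch using Python's built-in ord() and chr() functions. It assumes Python 3, where every string is a Unicode string; the sample characters and variable names are chosen only for illustration.

```python
# ord() returns the Unicode code point of a single character;
# chr() goes the other way. (Assumes Python 3.)
code_of_a = ord("A")
char_from_code = chr(65)
code_of_emoji = ord("\U0001F600")  # the "grinning face" emoji

print(code_of_a)        # 65
print(char_from_code)   # A
print(code_of_emoji)    # 128512
```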

Hence, a string can be considered a special type of sequence in which all the elements are characters. For example, the string "Hello, World" is basically the sequence ['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd'], and its length can be calculated by counting the number of characters in the sequence, which is 12.

Note: Yes, the space and the comma count too. Everything inside the quotes is a character, and each one adds 1 to the length.

Many programming languages have a separate data type dedicated to single characters, but Python has no character data type. Instead, a character is simply treated as a string of length 1.
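These points can be checked directly in the interpreter. The following is a small sketch (the variable names are just for illustration) showing that a string behaves like a sequence of characters and that a single character is itself a string of length 1.

```python
greeting = "Hello, World"

# list() splits the string into its individual characters
characters = list(greeting)
print(characters)
# ['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd']

# len() counts every character, including the comma and the space
length = len(greeting)
print(length)  # 12

# There is no separate character type: one character is still a str
single = "H"
print(type(single) is str, len(single))  # True 1
```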


Declaration of Strings

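A string is declared by enclosing text in quotes (single or double) and assigning it to a variable:

>>> mystring = "This is not my first String"
>>> print(mystring)
This is not my first String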


You can access each individual character of a string too. Just like accessing an element of any sequence, we use index numbers for this purpose. To access the first character of mystring, we can do the following:

>>> print(mystring[0])
T

Since T is the first character of our string This is not my first String, it has the index number 0 (zero). Similarly, for the following characters we can use the indices 1, 2, 3 and so on, i.e., in order to access the ith character we have to use the index i-1.

There is another trick for accessing the elements of a sequence from its end. For example, if you want to access the last element, just do the following:

>>> print(mystring[-1])
g

Writing -1 as the index means that you are asking for the 1st element from the end. Similarly, to access the 2nd last element use -2 as the index, for the 3rd last use -3 and so on, i.e., for the ith element from the end use -i as the index. That settles the general rule for accessing each character of a string from both the front and the back. Note that a positive index means you are counting from the front, while a negative index means you are counting from the rear end.



We can summarize what we have covered so far in a simple table. Consider the string PYTHON. Each of its characters can be accessed in two ways: from the front or from the rear end.

Characters        P    Y    T    H    O    N
Forward Index     0    1    2    3    4    5
Backward Index   -6   -5   -4   -3   -2   -1
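The same table can be reproduced in code. This is a minimal sketch, assuming the string is stored in a variable named word (a name chosen here for illustration). It relies on the fact that the backward index of a character is its forward index minus the length of the string.

```python
word = "PYTHON"

rows = []
for forward in range(len(word)):
    backward = forward - len(word)   # e.g. 0 - 6 = -6
    # Both indices point at the same character
    assert word[forward] == word[backward]
    rows.append((word[forward], forward, backward))
    print(word[forward], forward, backward)
```

The first printed row is P 0 -6 and the last one is N 5 -1, matching the table.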


Escape Sequence

Suppose you want a string to store a quote by Mahatma Gandhi.

"You must be the change you wish to see in the world" - Gandhi

This is the exact line you want to display in the console, including the quotes surrounding the sentence. As you go ahead and print the statement, you will see that it isn't that simple.

If the whole line is simply wrapped in another pair of double quotes, Python will instantly return a syntax error. This is because of those extra double quotes that we added. In IDLE you would notice that Gandhi's quoted text appears in black, while " - Gandhi" appears in green. If you have used IDLE enough, you might know that all the characters inside a string are highlighted in green (it can be another colour depending on the text editor, Python version, OS etc.). This clearly means that Python isn't treating the You must be the change you wish to see in the world part of the sentence as a string. We can therefore conclude that once we open a quote and close it to declare a string, whatever we write after the closing quote is no longer part of the string; Python tries to read it as ordinary code.

For the quotation above, we started the string with two double quotes and wrote You must be the change you wish to see in the world right after them. Since the first double quote was already closed by the second one before this phrase, Python treated the entire sentence as code that it could not understand. After the phrase, another double quote opened, followed by - Gandhi and finally the closing double quote. Since the - Gandhi part sits within a pair of double quotes, it is a perfectly legitimate string.

Now you understand the problem that we face if we put double quotes inside a string that is itself delimited by double quotes. Let's see how we can actually have a quote in a string. Well, there are two ways to do so:

  1. The first one is a bit of a compromise. You can use single quotes inside double quotes, like:
    >>> print("'You must be the change you wish to see in the world' - Gandhi")
    'You must be the change you wish to see in the world' - Gandhi

    Hence, it's legitimate to use single quotes inside double quotes. The reverse works as well, i.e.,

    >>> print('"You must be the change you wish to see in the world" - Gandhi')
    "You must be the change you wish to see in the world" - Gandhi

    is also valid, because a string delimited by single quotes ends only at the next single quote. The compromise is that the outer quotes must always be of a different kind than the inner ones.

  2. The second one is for those who hate to compromise and want to use double quotes both inside and around the string. For you people, there is something called an escape sequence, which starts with a backslash (\). You can use it like:
    >>> print("\"You must be the change you wish to see in the world\" - Gandhi")
    "You must be the change you wish to see in the world" - Gandhi

    Can you guess what happened? We used a backslash at two places, just before each of the quotes that we want to print. If you want to tell the interpreter to treat a quote as an ordinary character of the string and not as the end of the string, just put a backslash before it. Also remember that one backslash escapes only one character. For example, in order to print 5 double quotes we have to use 5 backslashes, one before each quote, like this:

    >>> print("\"\"\"\"\"")
    """""
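The quoting options discussed above can be compared side by side. The following is a small sketch (the variable names are illustrative) that builds the quoted text in three ways and shows that a backslash is needed only when the quote character inside the string matches the one that delimits it.

```python
# 1. Single quotes inside a double-quoted string
single_inside = "'You must be the change you wish to see in the world' - Gandhi"

# 2. Double quotes inside a single-quoted string
double_inside = '"You must be the change you wish to see in the world" - Gandhi'

# 3. Double quotes escaped inside a double-quoted string
escaped = "\"You must be the change you wish to see in the world\" - Gandhi"

print(single_inside)
print(double_inside)
print(escaped)

# Options 2 and 3 produce exactly the same string
print(double_inside == escaped)  # True

# The backslashes are not stored: five escaped quotes are five characters
print(len("\"\"\"\"\""))  # 5
```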

Input and Output for String

Input and output methods have already been discussed in detail in the Input and Output tutorial. It is recommended to go through that tutorial if you haven't already.


Operations on String

String handling in Python probably requires the least effort of all, since the most common string operations are available as simple built-in operators. Let's see how we can play around with strings.

  1. Concatenation: No, wait! What? This word may sound a bit complex to absolute beginners, but all it means is joining two strings. For example, joining "Hello" with "World" makes "HelloWorld". Yes, that's it.
    >>> print("Hello" + "World")
    HelloWorld

    Yes, a plus sign + is enough to do the trick. When used with strings, the + sign joins the two strings. Let's have one more example:

    >>> s1 = "Name Python "
    >>> s2 = "had been adapted "
    >>> s3 = "from Monty Python"
    >>> print(s1 + s2 + s3)
    Name Python had been adapted from Monty Python



  2. Repetition: Suppose we want to write the same text multiple times on the console, like repeating "Hi!" 100 times. One option is to write it all manually, like "Hi!Hi!Hi!..." a hundred times, or we can just do the following:
    >>> print("Hi!"*100)

    Suppose you want the user to input some number n and, based on that, print a text on the console n times. How can you do it? It's simple. Use the input() function to get a number from the user, convert it to an integer with int(), store it in a variable n, and then just multiply the text by n. The int() call matters because in Python 3 input() always returns a string.

    >>> n = int(input("Number of times you want the text to repeat: "))
    Number of times you want the text to repeat: 5
    >>> print("Text"*n)
    TextTextTextTextText

  3. Check existence of a character or a sub-string in a string: The in operator is used for this. For example, if there is a text India won the match and you want to check whether won exists in it or not, go to IDLE and try the following:
    >>> "won" in "India won the match"
    True

    Among the other data types in Python there is the Boolean data type, which can have one of two possible values: True or False. Since we are checking whether something exists in a string, the possible outcomes are either Yes, it exists or No, it doesn't, and therefore either True or False is returned. This should also give you an idea of where to use the Boolean data type while writing programs.

    Go ahead and try out some more examples of the in operator, where you take a string and a sub-string or character as input from the user, one by one, and then print True or False using the in operator.

  4. not in operator: This is just the opposite of the in operator. You're pretty smart if you guessed that right. It is used in much the same way as the in operator.
    >>> "won" not in "India won the match"
    False
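The four operations above can be combined in one short script. This is only a sketch: the helper name repeat_text and the sample strings are assumptions made for illustration, and the repeat count is passed as an argument instead of being read with input() so that the example runs without user interaction.

```python
def repeat_text(text, times):
    """Return text repeated the given number of times (hypothetical helper)."""
    return text * times

# Concatenation with +
joined = "Hello" + "World"
print(joined)  # HelloWorld

# Repetition with *
repeated = repeat_text("Text", 5)
print(repeated)  # TextTextTextTextText

# Membership tests with in / not in
sentence = "India won the match"
has_won = "won" in sentence
lacks_won = "won" not in sentence
print(has_won, lacks_won)  # True False

# The check is case-sensitive
print("Won" in sentence)  # False
```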



Converting String to Int or Float datatype and vice versa

This is a very common source of confusion among beginners: a number enclosed in quotes becomes a string in Python, and if you then try to perform mathematical operations on it, you will get an error or an unexpected result.

numStr = '123'

In the statement above, 123 is not a number but a string.

Hence, in such a situation, to convert a numeric string into the float or int data type, we can use the float() and int() functions.

numStr = '123'
numFloat = float(numStr)
numInt = int(numFloat)


And then you can easily perform mathematical operations on the numeric value.

Similarly, to convert an int or float variable to a string, we can use the str() function.

num = 123
# so simple
numStr = str(num)
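Here is a short sketch of the conversions described above. The variable names follow the ones used earlier where possible, and the rest are illustrative. It also shows the kind of error that appears when arithmetic is attempted on a numeric string before converting it.

```python
numStr = '123'

# Adding a number to the raw string fails because '123' is text
try:
    numStr + 1
except TypeError as error:
    print("TypeError:", error)

numInt = int(numStr)      # 123
numFloat = float(numStr)  # 123.0
print(numInt + 1)         # 124
print(numFloat / 2)       # 61.5

# Converting back gives a string again, so + now concatenates
backToStr = str(numInt)
print(backToStr + "4")    # 1234
```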

Slicing

Slicing is yet another string operation. It lets you extract a part of any string based on a start index and an end index. For example, if we have the string This is Python tutorial and we want to extract a part of it, or just a single character, we can use slicing. First let's get familiar with its syntax:

string_name[starting_index : finishing_index : character_iterate]
  • string_name is the name of the variable holding the string.
  • starting_index is the index of the first character that you want in your sub-string.
  • finishing_index is one more than the index of the last character that you want in your sub-string.
  • character_iterate is the step, i.e., how many positions to move forward after each selected character. To understand this, let us consider that we have a string Hello Brother! and we want to use the slicing operation on it to extract a sub-string. This is our code:
    >>> text = "Hello Brother!"
    >>> text[0:10:2]
    'HloBo'

    Now text[0:10:2] means that we want to extract a sub-string starting from index 0 (the beginning of the string) up to, but not including, index 10, and the last parameter means that we want every second character, counting from the starting index. Hence the output is HloBo.

    H is at index 0. Then, skipping e, the second character from H is printed, which is l. Then, skipping the second l, the second character from the first l is printed, which is o, and so on.
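One way to picture the third parameter is to compare a slice with an explicit loop over the same indices. The following is a minimal sketch of that idea; the variable names are chosen for illustration.

```python
text = "Hello Brother!"

# Slice: start at index 0, stop before index 10, take every 2nd character
sliced = text[0:10:2]

# The same characters collected one by one with range(start, stop, step)
picked = ""
for index in range(0, 10, 2):   # 0, 2, 4, 6, 8
    picked = picked + text[index]

print(sliced)            # HloBo
print(picked == sliced)  # True
```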


It will be clearer with a few more examples:

Let's take a string with 10 characters, ABCDEFGHIJ. The index numbers will begin at 0 and end at 9.

A B C D E F G H I J
0 1 2 3 4 5 6 7 8 9


Now store this string in a variable s and try the following command:

>>> s = "ABCDEFGHIJ"
>>> print(s[0:5:1])
ABCDE

Here slicing will be done from the 0th character to the 4th character (5-1), moving 1 character in each jump.

Now, remove the last number and its colon and just write this:

>>> print(s[0:5])

You'll see that both outputs are the same, because the step defaults to 1 when it is omitted.


You can practice by changing the values. Also try changing the value of character_iterate to some value n; the slice will then contain every nth character from the starting index up to the finishing index.
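To experiment with the step value, a small loop can print the result for several values of n. This is a sketch using the 10-character string from above; the dictionary name results is only illustrative.

```python
s = "ABCDEFGHIJ"

results = {}
for n in (1, 2, 3):
    # Every nth character from index 0 up to, but not including, index 9
    results[n] = s[0:9:n]
    print(n, results[n])

# 1 ABCDEFGHI
# 2 ACEGI
# 3 ADG
```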
