Introduction to Strings in Python
In this blog, you will learn about a very important and widely used datatype: the string.
The string is a simple sequence of characters enclosed in delimiters. We will learn what kinds of delimiters are used in Python. There are many ways to represent strings with different delimiters. E.g., "Python", 'Python' and """Python""" are all strings.
In Python 3, strings are immutable sequences of Unicode characters (code points), not arrays of bytes; raw bytes have their own type, bytes. Python also does not have a separate character data type; a single character is simply a string with a length of 1.
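Here is a small sketch of the "no character type" point, using only built-ins. The variable names are made up for illustration.

```python
# Variable names are illustrative only.
word = "Python"
first = word[0]           # indexing a string gives back another string

print(type(first))        # <class 'str'>
print(len(first))         # 1

# len() counts characters (code points), not bytes.
smile = "微笑"
print(len(smile))                  # 2
print(len(smile.encode("utf-8")))  # 6
```

So a "character" is just a string of length 1, and the number of bytes only shows up once the string is encoded.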
Let’s learn how to create strings!
Creating a String
Strings are one of the primary and most common data types in Python. The first program we write in any programming language prints "Hello World!", and that is a string! 🙂
Strings in Python can be created using delimiters. Delimiters can be single quotes or double quotes, or even triple quotes.
These quotes are called delimiters because they tell Python where the string starts and where it ends.
Let’s play with strings! 🙂
str_var = 'Co-Learning Lounge'  ## String represented in single quotes
print(str_var)

"""
output:
Co-Learning Lounge
"""
Let’s try creating one string without delimiters.
str_var = ColearningLounge
print(str_var)

"""
output:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-582f6480c6ad> in <module>
----> 1 str_var = ColearningLounge
      2 print(str_var)

NameError: name 'ColearningLounge' is not defined
"""
It throws a NameError. When Python executes this statement, it interprets ColearningLounge as a variable name, because that’s how we represent variables, right?
Any text that follows the identifier rules can be a variable name, and the same text can also be the content of a string.
So how do we tell the two apart? Because a string can be of any length (a word, phrase, sentence, paragraph, document, etc.), it makes sense to represent a string with some special notation: the delimiters.
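To make the variable-versus-string difference concrete, here is a minimal sketch. The names community and lounge are hypothetical, and lounge is deliberately never defined.

```python
community = "Co-Learning Lounge"   # quotes: this is a string value

print(community)      # no quotes: Python looks up the variable
print("community")    # quotes: Python prints the text itself

# A bare word that was never assigned raises NameError at run time.
try:
    lounge            # hypothetical, undefined name
except NameError as err:
    error_text = str(err)

print("NameError:", error_text)
```

The same word behaves completely differently depending on whether it sits inside delimiters.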
str_var = "Co-Learning Lounge"  ## String represented in double quotes
print(str_var)

"""
output:
Co-Learning Lounge
"""
Let’s understand the difference between assigning a string with single/double quotes and with triple quotes.
# Creating a String
# with triple Quotes

## As a comment
'''
Co-Learning Lounge is amazing community
'''

String1 = '''Co-Learning Lounge is amazing community'''
print("String with the use of Triple Quotes: ")
print(String1)

# Creating String with triple
# Quotes allows multiple lines
String1 = '''Co-Learning Lounge
is amazing
community'''
print("\nCreating a multiline String: ")
print(String1)

"""
output:
String with the use of Triple Quotes: 
Co-Learning Lounge is amazing community

Creating a multiline String: 
Co-Learning Lounge
is amazing
community
"""
So if we have to create a multi-line string, we can use triple quotes to do that easily.
Let’s compare all three types of delimiters in one example!
## Checking the datatype of value using "type" function
print(type('Co-Learning Lounge'))
print(type("Co-Learning Lounge"))
print(type('''Co-Learning Lounge is amazing community'''))
print(type('''Co-Learning Lounge
is amazing
.....
community'''))

"""
output:
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
"""
In the above example, the last one is a multi-line string.
Let’s try this multi-line string with double or single quotes!
## Multi line string with double or single quote as a delimiter won't work
print(type("Co-Learning Lounge        ## <-- NEWLINE \\n
is amazing community"))

"""
output:
  File "<ipython-input-1-26f631c030bf>", line 1
    print(type("Co-Learning Lounge
                                  ^
SyntaxError: EOL while scanning string literal
"""
It doesn’t work, as we can see from the error (Python 3.10 and later word it as "unterminated string literal"). A single- or double-quoted string cannot simply run on to the next line; only triple quotes allow that.
Now, another question! Can a string be written across several lines only with triple quotes?
The answer is a big NO! Writing such strings with triple quotes is easy but not the only way.
We can use the escape character (\) at the end of a line. Let’s try!
print(type("Co-Learning Lounge \
is amazing
community"))

"""
output:
  File "<ipython-input-6-f50cbbc9561f>", line 1
    print(type("Co-Learning Lounge \
                                    ^
SyntaxError: EOL while scanning string literal
"""
Oh god! still error! Did you understand why?
God: “Because you have only put the escape character at the end of the first line!”
Thanks, god! 🙂
Let’s trace what Python did! When we ran the above example, Python started reading the string; at the end of the first line, it saw the escape character, so it carried on reading the string on the next line. So far, we are good.
At the end of the second line, it finds neither a closing quote nor another escape character, so it stops with an error. (A backslash at the end of a line is our way of telling Python that the string continues on the next line, so every line except the last needs one.) Note that the backslash and the line break are both dropped, which is why the output below is a single line.
print("Co-Learning Lounge \
is amazing \
community")

"""
output:
Co-Learning Lounge is amazing community
"""
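A backslash at the end of a line only continues the source code; it does not put a line break into the string. The sketch below compares it with triple quotes and with the \n escape sequence. The variable names are illustrative.

```python
# Backslash continuation: three source lines, but one line of text.
continued = "Co-Learning Lounge \
is amazing \
community"

# Triple quotes: the line breaks become part of the string.
triple = """Co-Learning Lounge
is amazing
community"""

# \n: a line break written inside ordinary double quotes.
escaped = "Co-Learning Lounge\nis amazing\ncommunity"

print("\n" in continued)   # False
print(triple == escaped)   # True
print(len(continued.splitlines()), len(triple.splitlines()))  # 1 3
```

So if you want real line breaks in the value, use triple quotes or \n; if you only want to wrap the source code, the trailing backslash is enough.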
When a string is defined with a delimiter (Single, Double or Triple Quotes), it has to end with the same delimiter.
We live in a civilized society, and you can’t start one string with double quotes and try to end it with single or triple quotes.
Go through the following statements to get a better idea.
str_var = 'Co-Learning Lounge"
print(str_var)

"""
output:
  File "<ipython-input-5-79c01a4ac324>", line 1
    str_var = 'Co-Learning Lounge"
                                  ^
SyntaxError: EOL while scanning string literal
"""
str_var = "Co-Learning Lounge'
print(str_var)

"""
output:
  File "<ipython-input-16-b49841155818>", line 1
    str_var = "Co-Learning Lounge'
                                  ^
SyntaxError: EOL while scanning string literal
"""
str_var = "Co-Learning Lounge'''
print(str_var)

"""
output:
  File "<ipython-input-17-30849380a57f>", line 1
    str_var = "Co-Learning Lounge'''
                                    ^
SyntaxError: EOL while scanning string literal
"""
Also, the actual string can contain any special character, e.g., #, %, *, as well as characters from other scripts.
valid_str = "Co-Learning Lounge is the #1 community!"
valid_str_1 = "×Pýŧħøŋ× "
valid_str_2 = "नमस्ते"
valid_str_3 = "微笑"
print(valid_str)
print(valid_str_1)
print(valid_str_2)
print(valid_str_3)

"""
output:
Co-Learning Lounge is the #1 community!
×Pýŧħøŋ× 
नमस्ते
微笑
"""
But be careful when you use the delimiter’s own quote character inside the actual string. It throws a SyntaxError because Python thinks that the string ends at the second " and doesn’t know how to interpret the rest of the line.
Let’s try to understand with examples!
## Here we have used double quotes inside of a string that is delimited by double quotes
str_var = "Community name is "Co-Learning Lounge""
print(str_var)

"""
output:
  File "<ipython-input-4-f12898be0832>", line 2
    str_var = "Community name is "Co-Learning Lounge""
                                   ^
SyntaxError: invalid syntax
"""
str_var = "Community name is "
Co-Learning Lounge""
print(str_var)

"""
output:
  File "<ipython-input-5-326e8807f652>", line 2
    Co-Learning Lounge""
                ^
SyntaxError: invalid syntax
"""
In the above example, Python starts reading the string and stops when it finds another matching quote (it presumes the string ends there). This is why, in an editor with syntax highlighting, the words after the closing quote aren’t in red (string); they are black (interpreted as a variable).
Now the problem is that there will be many strings where you want to show double quotes within the string, for example to quote someone. Like Shakespeare said, “This is not that!”
We have to put this in double quotes because Shakespeare said it.
There are multiple ways to get around this! Let’s see the same example again!
str_var = 'Community name is "Co-Learning Lounge"'
print(str_var)

"""
output:
Community name is "Co-Learning Lounge"
"""
Eureka! It runs fine now.
This means we can use a different delimiter, because Python treats a string as finished only when it finds the same kind of quote that opened it.
We can put double quotes within the string, and start and end delimiters can be single quotes, or we can keep single quotes within strings when delimiters are double or triple quotes.
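As a quick sketch of the three combinations (the variable names are made up for illustration):

```python
# Single quote inside a double-quoted string.
single_inside = "I'm an active member"

# Double quotes inside a single-quoted string.
double_inside = 'Community name is "Co-Learning Lounge"'

# Both kinds of quote inside a triple-quoted string.
both_inside = '''I'm a member of "Co-Learning Lounge"'''

print(single_inside)
print(double_inside)
print(both_inside)
```

Triple quotes are handy when the text contains both kinds of quote and you would rather not escape anything.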
This is one way of going around. Let’s check another way!
We can use escape characters when we have to use the same quotes within the string.
Let’s see how!
str_var = "Community name is \"Co-Learning Lounge\""
print(str_var)

"""
output:
Community name is "Co-Learning Lounge"
"""
In the above example, the escape character tells the Python interpreter to escape the next character, and this is why it works fine.
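One thing worth checking for yourself: the backslash only exists in the source code, not in the resulting string. A minimal sketch with illustrative variable names:

```python
with_escapes = "Community name is \"Co-Learning Lounge\""
with_single = 'Community name is "Co-Learning Lounge"'

# Both spellings produce exactly the same string.
print(with_escapes == with_single)   # True

# No backslash is stored in the value.
print("\\" in with_escapes)          # False
```

So escaping and switching delimiters are two ways of writing the same value.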
To escape such characters, we have to put \ character every time. Let’s try with only one.
str_var = "Community name is \"Co-Learning Lounge""
print(str_var)

"""
output:
  File "<ipython-input-1-0cc4ea939d92>", line 1
    str_var = "Community name is \"Co-Learning Lounge""
                                                       ^
SyntaxError: EOL while scanning string literal
"""
The escape character can be used with any kind of delimiter.
## Here we have used single quotes inside of a string that is delimited by single quotes
str_var = 'Community is amazing hence I\'m active member of the community'
print(str_var)

"""
output:
Community is amazing hence I'm active member of the community
"""
str_var = "Community is amazing hence I'm active member of the community"
print(str_var)

"""
output:
Community is amazing hence I'm active member of the community
"""
After Python reads the first delimiter, all of the characters after it are considered a part of the string until a second matching delimiter is read. This is why you can use a single quote in a string delimited by double quotes and vice versa.
Escape sequences use the escape character (\) to put characters into a string that Python would otherwise misread, such as a quote that matches the delimiter.
Escape Sequencing in Python
Printing a string that contains both single and double quotes causes a SyntaxError if it is delimited by either of them, because the matching quote inside ends the string early. To write such a string, use either triple quotes or escape sequences. An escape sequence starts with a backslash, which tells Python to interpret the next character differently. If single quotes delimit the string, every single quote inside it must be escaped, and the same goes for double quotes.
# Python Program for
# Escape Sequencing
# of String

# Escaping Single Quote
String1 = 'Community is amazing hence I\'m active member of the community'
print("\nEscaping Single Quote: ")
print(String1)

# Escaping Double Quotes
String1 = "Community name is \"Co-Learning Lounge\""
print("\nEscaping Double Quotes: ")
print(String1)

# Printing Paths with the
# use of Escape Sequences
String1 = "C:\\Python\\learner\\"
print("\nEscaping Backslashes: ")
print(String1)

"""
output:

Escaping Single Quote: 
Community is amazing hence I'm active member of the community

Escaping Double Quotes: 
Community name is "Co-Learning Lounge"

Escaping Backslashes: 
C:\Python\learner\
"""
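Quotes and backslashes are not the only things we can escape. Two other escape sequences you will meet all the time are \n (new line) and \t (tab). Here is a small sketch; the variable names are illustrative.

```python
tabbed = "name:\tCo-Learning Lounge"      # \t inserts a tab
two_lines = "first line\nsecond line"     # \n inserts a line break
backslash = "C:\\Python"                  # \\ is one literal backslash

print(tabbed)
print(two_lines)
print(backslash)               # C:\Python

# Each escape sequence is two characters in the source
# but a single character in the string.
print(len("\n"), len("\t"), len("\\"))   # 1 1 1
```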
To ignore the escape sequences in a string, prefix it with r or R; this marks it as a raw string, in which backslashes are kept as literal characters instead of starting escape sequences.
# Printing "Co-learning Lounge" in HEX
String1 = "This is \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67\x20\x4c\x6f\x75\x6e\x67\x65 in \x48\x45\x58"
print("Printing in HEX with the use of Escape Sequences: ")
print(String1)

# Escaping every backslash so the
# hex codes are printed literally
String1 = "This is \\x43\\x6f\\x2d\\x6c\\x65\\x61\\x72\\x6e\\x69\\x6e\\x67 in \\x48\\x45\\x58"
print("\nPrinting in HEX with the use of Escape Sequences: ")
print(String1)

# Using raw String to
# ignore Escape Sequences
String1 = r"This is \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67 in \x48\x45\x58"
print("\nPrinting Raw String in HEX Format: ")
print(String1)

"""
output:
Printing in HEX with the use of Escape Sequences: 
This is Co-learning Lounge in HEX

Printing in HEX with the use of Escape Sequences: 
This is \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67 in \x48\x45\x58

Printing Raw String in HEX Format: 
This is \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67 in \x48\x45\x58
"""
String1 = r"C:\Python\learner\ \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67"
print(String1)

"""
output:
C:\Python\learner\ \x43\x6f\x2d\x6c\x65\x61\x72\x6e\x69\x6e\x67
"""
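Two details about raw strings are easy to miss, so here is a hedged sketch. The variable names are illustrative, and the trailing-backslash workaround shown is just one common option.

```python
normal = "\n"     # one character: a line break
raw = r"\n"       # two characters: a backslash and the letter n

print(len(normal), len(raw))   # 1 2
print(raw == "\\n")            # True

# A raw string cannot end with a single backslash, so one workaround
# is to add the trailing backslash as a normal (escaped) literal.
path = r"C:\Python\learner" + "\\"
print(path)                    # C:\Python\learner\
```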
Bonus:
If we convert these strings to boolean, we get the following results:
An empty string is false.
A non-empty string is true.
print(bool(''), bool(""), bool(""""""))

"""
output:
False False False
"""
print(bool('foo'), bool(" "), bool(''' '''))

"""
output:
True True True
"""
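In practice, this rule is mostly used inside an if statement rather than by calling bool() directly. A minimal sketch, where describe is a hypothetical helper written just for this example:

```python
def describe(text):
    """Hypothetical helper: report whether a string is empty."""
    if text:               # same check as bool(text)
        return "non-empty"
    return "empty"

print(describe(""))      # empty
print(describe(" "))     # non-empty (a space is still a character)
print(describe("foo"))   # non-empty
```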
How to represent a really long string?
long_str = "This planet has—or rather had—a problem, which was this: most of the people living on it were unhappy for pretty much of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movements of small green pieces of paper, which is odd because on the whole it wasn't the small green pieces of paper that were unhappy."
print(long_str)

"""
output:
This planet has—or rather had—a problem, which was this: most of the people living on it were unhappy for pretty much of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movements of small green pieces of paper, which is odd because on the whole it wasn't the small green pieces of paper that were unhappy.
"""
The above code doesn’t look pretty, right?
There are a couple of ways to tackle this. One way is to break the string up across multiple lines and put a backslash (\) at the end of all but the last line.
long_str = "This planet has—or rather had—a problem, \
which was this: most of the people living on it were unhappy for pretty much of the time. \
Many solutions were suggested for this problem, \
but most of these were largely concerned with the movements of small green pieces of paper, \
which is odd because on the whole \
it wasn't the small green pieces of paper that were unhappy."
print(long_str)

"""
output:
This planet has—or rather had—a problem, which was this: most of the people living on it were unhappy for pretty much of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movements of small green pieces of paper, which is odd because on the whole it wasn't the small green pieces of paper that were unhappy.
"""
This way works, but it is a little more work, because adding a backslash at the end of every line is tedious and easy to get wrong.
There is another way! We can use triple quotes as a delimiter.
Triple quotes are a convenient way to create readable long strings. Keep in mind that the line breaks you type become part of the string.
long_str = """This planet has—or rather had—a problem,
which was this: most of the people living on it were unhappy for pretty much of the time.
Many solutions were suggested for this problem,
but most of these were largely concerned with the movements of small green pieces of paper,
which is odd because on the whole
it wasn't the small green pieces of paper that were unhappy."""
print(long_str)

"""
output:
This planet has—or rather had—a problem,
which was this: most of the people living on it were unhappy for pretty much of the time.
Many solutions were suggested for this problem,
but most of these were largely concerned with the movements of small green pieces of paper,
which is odd because on the whole
it wasn't the small green pieces of paper that were unhappy.
"""
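There is one more option that is not covered above but is worth knowing: Python joins string literals that sit next to each other, so a long string can be split into pieces inside parentheses. This is only a sketch with a shortened text, and no backslashes or line breaks end up in the value.

```python
# Adjacent string literals inside parentheses are joined into one string.
long_str = (
    "Many solutions were suggested for this problem, "
    "but most of these were largely concerned with the movements "
    "of small green pieces of paper."
)

print(long_str)
print("\n" in long_str)   # False: the value is a single line
```

Notice the space at the end of each piece; Python does not add one for you when it joins the literals.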
In this blog, we have understood what a string is, the multiple ways to create one with different delimiters, how escape sequences work, and how to represent longer strings.
In the next blog, we will dive into string functions and operations on strings. If you have read this far, please visit the following blogs on strings for an easy and more adventurous dive into the topic.