Regular Expressions in Python
In our last tutorial, we learned how to open a file to read and write data. If you have not read it yet, I recommend reading that article first. Other articles in our Python learning series are:
Understanding Exception Handling in Python, Python OOPS concept with Class, Object and Constructor,
concept of variables, data types, conditional control, functional control, Module and packages, and Data Structure (List, Dictionary, and Set) of Python.
In this tutorial, we will learn about regular expressions (regex). Regular expressions are used for pattern matching. A regular expression uses the backslash '\' to escape special characters, so that they are treated as ordinary characters rather than as regex operators. It is better to write the pattern as a raw string, i.e. r'<pattern>'. A regular expression can contain both special characters and normal characters. To use regular expressions in Python, we need to import the built-in module "re". Let's see which special characters are available in regex and what they mean in Python.
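'.' (dot): matches any character except a newline
'^': matches the start of the string (and the start of each line in multiline mode)
'$': matches the end of the string or just before a trailing newline (and the end of each line in multiline mode)
'*': 0 or more repetitions of the preceding RE
'+': 1 or more repetitions of the preceding RE
'?': 0 or 1 repetition of the preceding RE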
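Before moving on to the examples, here is a small sketch of why raw strings are recommended for patterns. It uses the '\b' word-boundary symbol, which is covered later in this tutorial, and sample strings that I made up for illustration. In a normal string literal Python itself interprets '\b' as a backspace character, so the pattern never reaches the regex engine in the form we intended.

```python
import re

# In a normal string literal, '\b' is the backspace character,
# so the regex engine never sees the word-boundary escape.
text = 'cat concat cat'
plain = re.findall('\bcat\b', text)   # pattern contains two backspace characters
raw = re.findall(r'\bcat\b', text)    # pattern contains the regex escape \b
print(plain)  # []
print(raw)    # ['cat', 'cat']

# Escaping a special character: '\.' matches a literal dot only.
unescaped = re.findall(r'a.b', 'a.b axb')
escaped = re.findall(r'a\.b', 'a.b axb')
print(unescaped)  # ['a.b', 'axb']
print(escaped)    # ['a.b']
```

The raw-string version finds the two standalone words, while the plain version finds nothing because the text contains no backspace characters.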
I hope the above explanation is clear. Let's look at an example for a better understanding.
import re
# '.' matches the first character of the string
m = re.match(r'.', 'abcd')
print(m.group())
output: a
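One thing to keep in mind with examples like this: re.match returns None when the pattern does not match, and calling .group() on None raises an AttributeError. A minimal sketch of a safer pattern is shown here; the helper name first_char is only an illustration and is not part of the re module.

```python
import re

def first_char(text):
    """Return the character matched by '.', or None if there is no match."""
    m = re.match(r'.', text)
    # re.match returns None on failure, so check before calling .group()
    if m is None:
        return None
    return m.group()

print(first_char('abcd'))   # a
print(first_char(''))       # None
print(first_char('\nabc'))  # None, because '.' does not match a newline
```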
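Let's see examples of '^' and '$' matching.
import re
# '^' anchors the match to the start of the string
m = re.match(r'^.', 'abcd')
print(m.group())
output: a
import re
# re.search is used here because '$' anchors the match to the end of the string
m = re.search(r'.$', 'abcd')
print(m.group())
output: d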
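The examples above use re.match for '^' and re.search for '$'. Here is a short sketch of the difference between the two functions, and of how the re.MULTILINE flag changes what '^' and '$' mean. The two-line sample text is my own assumption, chosen only for illustration.

```python
import re

text = 'first line\nsecond line'

# re.match only tries the pattern at position 0;
# re.search scans the whole string for the first match.
no_match = re.match(r'line', text)
found = re.search(r'line', text).group()
print(no_match)  # None
print(found)     # line

# Without flags, '^' anchors at the start of the string only.
starts_default = re.findall(r'^.', text)
# With re.MULTILINE, '^' also matches right after each newline.
starts_multiline = re.findall(r'^.', text, re.MULTILINE)
# Likewise, '$' then matches at the end of every line.
ends_multiline = re.findall(r'.$', text, re.MULTILINE)
print(starts_default)    # ['f']
print(starts_multiline)  # ['f', 's']
print(ends_multiline)    # ['e', 'e']
```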
Let's see examples of matching with *, + and ?
import re
# '*': zero or more 'a' before 'bc'
print(re.match(r'a*bc', 'aaabc').group())
print(re.match(r'a*bc', 'bc').group())
# '+': one or more 'a' before 'bc'
print(re.match(r'a+bc', 'abc').group())
print(re.match(r'a+bc', 'aaaabc').group())
# '?': zero or one 'a' before 'bc'
print(re.match(r'a?bc', 'abc').group())
print(re.match(r'a?bc', 'bc').group())
output:
aaabc
bc
abc
aaaabc
abc
bc
Let's explore some of the other special characters that are used in regular expressions.
'*?', '+?', '??': non-greedy versions of '*', '+' and '?'; they match as little as possible
'{m}': matches exactly m copies of the previous RE
'{m,n}': matches from m to n copies of the previous RE, as many as possible
'{m,n}?': non-greedy version of '{m,n}'; matches as few copies as possible
'\': escapes special characters
'[]': indicates a set of characters, listed individually or as a range
'|': alternation. A|B matches either A or B
'(...)': indicates the start and end of a group
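The characters in this list can be tried out with a few one-liners. The following is a minimal sketch, and the sample strings are my own and chosen only for illustration.

```python
import re

# '{m}': exactly three 'a' characters followed by 'b'
exact = re.match(r'a{3}b', 'aaab').group()
print(exact)       # aaab

# '{m,n}' is greedy and takes as many copies as allowed;
# '{m,n}?' is non-greedy and takes as few as allowed.
greedy = re.match(r'a{2,4}', 'aaaaa').group()
lazy = re.match(r'a{2,4}?', 'aaaaa').group()
print(greedy)      # aaaa
print(lazy)        # aa

# '[]': a set of individual characters, or a range
letters = re.findall(r'[abc]', 'a1b2c3d4')
digits = re.findall(r'[0-9]', 'a1b2c3d4')
print(letters)     # ['a', 'b', 'c']
print(digits)      # ['1', '2', '3', '4']

# '|': match either alternative
animals = re.findall(r'cat|dog', 'cat, dog, bird')
print(animals)     # ['cat', 'dog']

# '(...)': groups can be read back by number; group(0) is the whole match
m = re.match(r'(a+)(b+)', 'aaabb')
print(m.group(0))  # aaabb
print(m.group(1))  # aaa
print(m.group(2))  # bb
```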
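Let's see non-greedy matching with an example.
import re
# greedy: '.*' consumes as much as possible, up to the last 'b'
print(re.match(r'a.*b', 'aaabaaab').group())
# non-greedy: '.*?' stops at the first 'b'
print(re.match(r'a.*?b', 'aaabaaab').group())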
output:
aaabaaab
aaab
Now let's move on to some of the special symbols of regular expressions. These symbols are among the most frequently used in day-to-day programming.
Regex: Special symbols
'\number' (for example '\1'): matches the contents of the group with the same number. Groups are numbered from 1, up to 99
'\A': matches only at the start of the string
'\b': matches the empty string at the beginning or end of a word
'\B': matches the empty string, but not at the beginning or end of a word
'\d': matches any digit
'\D': matches any non-digit
'\s': matches any whitespace character, such as [ \t\r\n\f\v]
'\S': matches any non-whitespace character
'\w': matches alphanumeric characters and the underscore
'\W': matches any character that is not alphanumeric or an underscore
'\Z': matches only at the end of the string
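To make these symbols more concrete, here is a small sketch that exercises most of them. The sample strings are assumptions that I picked for illustration; any similar text would do.

```python
import re

# '\d' and '\D': digits and non-digits
numbers = re.findall(r'\d+', 'room 42, floor 7')
others = re.findall(r'\D+', 'room 42, floor 7')
print(numbers)  # ['42', '7']
print(others)   # ['room ', ', floor ']

# '\w': alphanumeric characters and underscore
words = re.findall(r'\w+', 'snake_case and CamelCase!')
print(words)    # ['snake_case', 'and', 'CamelCase']

# '\s': split on any run of whitespace
parts = re.split(r'\s+', 'a b\tc\nd')
print(parts)    # ['a', 'b', 'c', 'd']

# '\b' matches at a word edge, '\B' only inside a word
at_edge = re.search(r'\bis\b', 'this is it').start()
inside = re.search(r'\Bis', 'this is it').start()
print(at_edge)  # 5
print(inside)   # 2

# '\A' and '\Z': start and end of the whole string
at_start = re.search(r'\Aab', 'abcab').start()
at_end = re.search(r'ab\Z', 'abcab').start()
print(at_start) # 0
print(at_end)   # 3

# '\1' refers back to group 1, here a repeated character
doubled = re.search(r'(\w)\1', 'hello').group()
print(doubled)  # ll
```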
Let's see a few examples using re.search, re.split and re.findall.
import re
# re.search returns the first occurrence anywhere in the string
print(re.search(r'bc', 'abcdbcbc').group())
output: bc
import re
# split on every 'a'
print(re.split(r'a', 'abababbbaccca'))
# with a capturing group, the separators are kept in the result
print(re.split(r'(a)', 'abababbbaccca'))
output:
['', 'b', 'b', 'bbb', 'ccc', '']
['', 'a', 'b', 'a', 'b', 'a', 'bbb', 'a', 'ccc', 'a', '']
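import re
# re.findall returns all non-overlapping matches as a list
print(re.findall(r'bc', 'abcddbccccbc'))
# raw string so that '\d' reaches the regex engine unchanged
print(re.findall(r'\d{2,3}', 'a000b00000'))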
output:
['bc', 'bc', 'bc']
['000', '000', '00']
You can practice regular expressions much more on your own. I hope the explanation was clear.
Summary: We now have a good understanding of regular expressions in Python. We have seen a few examples and played around with them to clear up doubts. If you still have issues or doubts, please add a comment and I will try to reply as soon as possible. In my next Python tutorial, we will learn about databases with examples.
Happy coding 🙂
I am an enthusiastic Android developer who builds solid Android apps. I have a keen interest in developing for Android and have published apps to the Google Play Store. I am always open to learning new technologies. For any help, drop us a line anytime at contact@mobologicplus.com