Python Regular Expressions
For video lectures, check out
www.facebook.com/CSxFunda
What is a regular expression?
“A string that defines a text-matching pattern”
Input text:
Jill roll number is 1001
Bob roll number is 1002
Rob roll number is 1003
Jack roll number is 1004
Task: extract the roll numbers
Regular expression: \d\d\d\d
Matches:
1001
1002
1003
1004
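The roll-number extraction on this slide can be sketched in Python roughly as follows. The variable names are illustrative, and the pattern is assumed to be \d\d\d\d (four digits in a row).

```python
import re

text = """Jill roll number is 1001
Bob roll number is 1002
Rob roll number is 1003
Jack roll number is 1004"""

# \d matches one decimal digit, so \d\d\d\d matches four digits in a row.
roll_numbers = re.findall(r"\d\d\d\d", text)
print(roll_numbers)  # ['1001', '1002', '1003', '1004']
```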
What is the advantage of using
regular expressions?
- Using regular expressions, you can extract
text that follows a pattern by writing only
a few lines of code
Example: Without Using Regular Expressions
Text file:
A weight is 46kg
B weight is 54kg
C weight is 60kg
D weight is 70kg
Extract:
46
54
60
70
Code:
- Lengthy
- Complex
Example: Using Regular Expressions
Text file:
A weight is 46kg
B weight is 54kg
C weight is 60kg
D weight is 70kg
Extract:
46
54
60
70
Code:
- Less code
- Simple
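A minimal sketch contrasting the two approaches. The slides do not show the code, so the helper weights_without_regex and the pattern (\d+)kg are assumptions made for illustration, and the text file is inlined as a string.

```python
import re

text = """A weight is 46kg
B weight is 54kg
C weight is 60kg
D weight is 70kg"""

def weights_without_regex(text):
    """Hypothetical manual approach: scan each word and strip the 'kg' suffix."""
    weights = []
    for line in text.splitlines():
        for word in line.split():
            if word.endswith("kg") and word[:-2].isdigit():
                weights.append(word[:-2])
    return weights

# With a regular expression, the same extraction is a single call.
# (\d+) captures the digits that come directly before "kg".
weights_with_regex = re.findall(r"(\d+)kg", text)

print(weights_without_regex(text))  # ['46', '54', '60', '70']
print(weights_with_regex)           # ['46', '54', '60', '70']
```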
Python String
Set of characters enclosed in single or
double quotes
Ex: 'Kalyan', "Meghana"
Python Raw Strings
Set of characters enclosed in single or
double quotes preceded by r
Ex: r'Kalyan', r"Meghana"
Strings vs Raw Strings
You can write a regular expression as a
normal string or as a raw string
In a normal string, every backslash in the
regular expression must itself be escaped, for example "\\d"
In a raw string, backslashes are not treated
as escape characters, so the pattern can be
written as is, for example r"\d"
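The difference can be checked with a short sketch. The sample strings are made up for illustration.

```python
import re

# In a normal string the backslash must be doubled; in a raw string it is not.
normal_pattern = "\\d+"
raw_pattern = r"\d+"
same_pattern = normal_pattern == raw_pattern   # both hold the text \d+

numbers = re.findall(raw_pattern, "room 42, floor 7")

# Forgetting the escape matters for sequences Python itself understands:
# "\b" in a normal string is a backspace character, not a word boundary.
wrong = re.search("\bcat\b", "my cat")    # looks for backspace characters
right = re.search(r"\bcat\b", "my cat")   # looks for the whole word cat

print(same_pattern, numbers, wrong, right.group())
```

Raw strings only change how Python reads backslashes. Regex metacharacters such as . or + still need a backslash when they should be matched literally.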
Regular Expressions
- Regular expressions are supported by many programming languages
Ex: Perl, Ruby, Java, Python, JavaScript, ...
- Some languages provide regex capabilities built in
Ex: Perl
- Some languages provide regex capabilities via libraries
Ex: Python
re Module
Python supports regular expressions through the re
module
That is, you have to import the re module to
use regular expressions
import re
The module is part of the standard library, so there is no need to install it separately
Steps
Import re module
Write regular expression
Create regex object
Call the function using regex object
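The four steps can be sketched as below. The pattern \d{4} and the sample sentence are assumptions chosen to match the roll-number example.

```python
import re                                        # Step 1: import the re module

pattern = r"\d{4}"                               # Step 2: write the regular expression
regex = re.compile(pattern)                      # Step 3: create the regex object
mo = regex.search("Jack roll number is 1004")    # Step 4: call a function on the regex object

print(mo.group())  # 1004
```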
re Module Functions
- match(text)
- search(text)
- findall(text)
- finditer(text)
- sub(replString, text)
- split(text)
match()
- Looks for a match only at the beginning of the string
- Returns a match object if there is a match, otherwise
returns None
regex=re.compile(pattern)
mo=regex.match(text)
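A small sketch of match(), assuming the pattern \d+ and two made-up input strings.

```python
import re

regex = re.compile(r"\d+")

# The digits are at the very start, so match() succeeds.
mo_start = regex.match("1001 is the roll number of Jill")

# The digits come later in the string, so match() returns None.
mo_later = regex.match("Jill roll number is 1001")

print(mo_start.group())  # 1001
print(mo_later)          # None
```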
search()
- Looks for a match anywhere in the string
- Returns a match object if there is a match, otherwise
returns None
- If the string has more than one match, returns the match
object for the first match only
regex=re.compile(pattern)
mo=regex.search(text)
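A small sketch of search(), again assuming the pattern \d+ and a made-up sentence containing two numbers.

```python
import re

regex = re.compile(r"\d+")

# Two numbers are present, but search() reports only the first one.
mo = regex.search("Jill is 1001 and Bob is 1002")
missing = regex.search("no digits here")

print(mo.group(), mo.span())  # 1001 (8, 12)
print(missing)                # None
```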
findall()
- Looks for matches anywhere in the string
- Returns all matched substrings as a list if there is a
match, otherwise returns an empty list
regex=re.compile(pattern)
values=regex.findall(text)
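A small sketch of findall() under the same assumptions (pattern \d+, made-up input).

```python
import re

regex = re.compile(r"\d+")

values = regex.findall("Jill is 1001 and Bob is 1002")
empty = regex.findall("no digits here")

print(values)  # ['1001', '1002']
print(empty)   # []
```

When the pattern contains capturing groups, findall() returns the group contents instead of the whole match.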
finditer()
- Looks for matches anywhere in the string
- Returns an iterator that yields a match object for each
matched substring; if there is no match, the iterator yields nothing
regex=re.compile(pattern)
moList=regex.finditer(text)
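A small sketch of finditer(), assuming the pattern \d+. Note that the return value is an iterator of match objects rather than a list, so it is consumed by looping over it.

```python
import re

regex = re.compile(r"\d+")

matches = regex.finditer("Jill is 1001 and Bob is 1002")

# Each item is a match object, so the text and the position are both available.
found = [(mo.group(), mo.start()) for mo in matches]

print(found)  # [('1001', 8), ('1002', 24)]
```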
sub()
- Replaces all the matched substrings with the
given replString and returns the modified
string, if there is a match
- Returns the original string, if there is no match
- Similar to the replace option in text editors
regex=re.compile(pattern)
regex.sub(replString,text)
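A small sketch of sub(), assuming the pattern \d+ and "####" as the replacement string.

```python
import re

regex = re.compile(r"\d+")

masked = regex.sub("####", "Jill is 1001 and Bob is 1002")
unchanged = regex.sub("####", "no digits here")

print(masked)     # Jill is #### and Bob is ####
print(unchanged)  # no digits here
```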
split()
- Looks for matches anywhere in the string
- Splits the string at the matched substrings and
returns the pieces as a list
- Returns a one-element list containing the original string, if there is no match
- Similar to the split() method of strings
regex=re.compile(pattern)
regex.split(text)
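A small sketch of split(). The separator pattern (a comma or semicolon followed by optional whitespace) and the input are assumptions for illustration.

```python
import re

regex = re.compile(r"[,;]\s*")

parts = regex.split("Perl, Ruby;Java,  Python")
no_match = regex.split("Perl")

print(parts)     # ['Perl', 'Ruby', 'Java', 'Python']
print(no_match)  # ['Perl']
```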
Groups
- When you want to match a substring in a string and
extract only a part of the matched substring, use grouping.
Example: match the roll number CS1004 and extract the last four digits
Groups - Types
- Numbered Groups
- Named Groups
- Non-capturing Groups
Numbered Groups
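The slide content for numbered groups did not survive extraction, so the following is only a sketch of the CS1004 example. The pattern (CS)(\d{4}) is an assumption. Each pair of parentheses forms a group, numbered from 1 in the order of its opening parenthesis, and group 0 is the whole match.

```python
import re

regex = re.compile(r"(CS)(\d{4})")
mo = regex.search("Jack roll number is CS1004")

print(mo.group(0))  # CS1004  (the whole match)
print(mo.group(1))  # CS      (first group)
print(mo.group(2))  # 1004    (second group: the last four digits)
print(mo.groups())  # ('CS', '1004')
```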
Named Groups
- When there are many groups, it is difficult to remember the group numbers
- In such a case, we use named groups
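A minimal sketch of named groups using Python's (?P<name>...) syntax. The group names dept and number and the pattern itself are assumptions for illustration.

```python
import re

regex = re.compile(r"(?P<dept>[A-Z]{2})(?P<number>\d{4})")
mo = regex.search("Jack roll number is CS1004")

print(mo.group("dept"))    # CS
print(mo.group("number"))  # 1004
print(mo.groupdict())      # {'dept': 'CS', 'number': '1004'}
```

Named groups are still numbered, so mo.group(2) returns the same text as mo.group("number") here.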
Non-capturing Group (?:...)
Groups part of a pattern without saving the matched text as a numbered group
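A short sketch of the difference between a capturing and a non-capturing group. The department codes CS and EC and the sample text are assumptions.

```python
import re

text = "CS1004 EC2001"

# Both parts are captured, so findall() returns tuples.
capturing = re.findall(r"(CS|EC)(\d{4})", text)

# (?:...) groups the alternatives without capturing them,
# so only the digits are returned.
non_capturing = re.findall(r"(?:CS|EC)(\d{4})", text)

print(capturing)      # [('CS', '1004'), ('EC', '2001')]
print(non_capturing)  # ['1004', '2001']
```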
Meta Characters
| (pipe)
? (question mark)
* (asterisk)
+ (plus symbol)
. (dot symbol)
| (pipe)
Matches any one of several alternative patterns
Input text:
A weight is 42kg
B weight is 100kg
C weight is 30kg
D weight is 111kg
Matches:
42
100
30
111
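The pattern used on this slide was lost in extraction. One pattern that produces the listed matches, shown here as an assumption, is \d\d\d|\d\d (three digits, or else two digits).

```python
import re

text = """A weight is 42kg
B weight is 100kg
C weight is 30kg
D weight is 111kg"""

# Alternatives are tried left to right, so the three-digit form is tried first.
weights = re.findall(r"\d\d\d|\d\d", text)

print(weights)  # ['42', '100', '30', '111']
```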
? (question mark)
Matches zero or one occurrence of the preceding pattern
Input text:
A weight is 42kg
B weight is 100kg
C weight is 30kg
D weight is 111kg
Matches:
42
100
30
111
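The original pattern is not preserved here either. A sketch that yields the same matches, assuming the pattern \d\d\d? (two digits followed by an optional third digit):

```python
import re

text = """A weight is 42kg
B weight is 100kg
C weight is 30kg
D weight is 111kg"""

# The trailing \d? makes the third digit optional.
weights = re.findall(r"\d\d\d?", text)

print(weights)  # ['42', '100', '30', '111']
```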
* (asterisk)
Matches zero or more occurrences of the preceding pattern
Input text:
abbbc
abc
ac
Matches:
abbbc
abc
ac
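A sketch that reproduces these matches, assuming the pattern ab*c (an a, zero or more b characters, then a c):

```python
import re

regex = re.compile(r"ab*c")
candidates = ["abbbc", "abc", "ac"]

# fullmatch() requires the whole string to fit the pattern.
matched = [s for s in candidates if regex.fullmatch(s)]

print(matched)  # ['abbbc', 'abc', 'ac']
```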
+ (plus symbol)
Matches one or more occurrences of the preceding pattern
Input text:
abbbc
abbc
abc
Matches:
abbbc
abbc
abc
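A sketch that reproduces these matches, assuming the pattern ab+c. The extra candidate ac is added here to show the difference from *: it needs at least one b and so does not match.

```python
import re

regex = re.compile(r"ab+c")
candidates = ["abbbc", "abbc", "abc", "ac"]

matched = [s for s in candidates if regex.fullmatch(s)]

print(matched)  # ['abbbc', 'abbc', 'abc']
```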
. (dot symbol)
Matches any character except '\n' (newline)
Matches
Kalyan007
Kalyan\n007
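The newline exception can be checked with a short sketch. The input string is made up, and re.DOTALL is included only to show the contrast.

```python
import re

text = "a\nb"

# By default the dot skips the newline character.
default_dot = re.findall(r".", text)

# With re.DOTALL the dot matches the newline as well.
dotall_dot = re.findall(r".", text, re.DOTALL)

print(default_dot)  # ['a', 'b']
print(dotall_dot)   # ['a', '\n', 'b']
```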
pattern{m}
Matches exactly m repetitions
Example (m = 3, digit pattern): matches exactly 3 digits
pattern{m,n}
Matches a minimum of m repetitions
and a maximum of n repetitions
pattern{m,}
Matches a minimum of m repetitions, with no upper limit
Example (m = 3, digit pattern): matches 3 digits, 4 digits, 5 digits, 6 digits, and so on
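The three repetition forms can be compared side by side. This is a sketch assuming the digit pattern \d with m = 3 and n = 5; the \b anchors are added so that each number is tested as a whole.

```python
import re

text = "7 42 100 2024 12345 1234567"

exactly_three = re.findall(r"\b\d{3}\b", text)    # {m}: exactly 3 digits
three_to_five = re.findall(r"\b\d{3,5}\b", text)  # {m,n}: 3 to 5 digits
three_or_more = re.findall(r"\b\d{3,}\b", text)   # {m,}: 3 or more digits

print(exactly_three)  # ['100']
print(three_to_five)  # ['100', '2024', '12345']
print(three_or_more)  # ['100', '2024', '12345', '1234567']
```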
Greedy Matching
Looks for the maximum possible match
Input: abcabcabcabc
Greedy match: abcabcabcabc
Non-greedy Matching (?)
Looks for the minimum possible match; a quantifier becomes non-greedy when ? is added after it
Input: abcabcabcabc
Non-greedy match: abc
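Greedy and non-greedy matching can be compared on the same input. The patterns (?:abc)+ and (?:abc)+? are assumptions, since the slides do not preserve the originals.

```python
import re

text = "abcabcabcabc"

# + is greedy: it repeats "abc" as many times as possible.
greedy = re.search(r"(?:abc)+", text).group()

# +? is non-greedy: it stops after the fewest repetitions that still match.
non_greedy = re.search(r"(?:abc)+?", text).group()

print(greedy)      # abcabcabcabc
print(non_greedy)  # abc
```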
Character Classes
Matches any one character from a set of characters
Types
- Positive Character Class
- Negative Character Class
- Shorthand Character Class
Positive Character Class
Matches one of the characters specified in []
[abc] Matches a, b, or c
[aeiou] Matches a, e, i, o, or u
[0123456789] Matches any digit from 0 to 9
[a-c0-9] Matches a, b, c, or any digit from 0 to 9
Negative Character Class
Matches any character other than the characters
specified in [^]
[^aeiou] Matches any character other than a, e, i, o, u
Input text:
b1001
c1002
d1003
f1004
h1005
Matches:
b1001
c1002
d1003
f1004
h1005
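Positive and negative character classes can be tried with a short sketch. The pattern [^aeiou]\d{4} and the extra candidate a1006 are assumptions added to show a non-match.

```python
import re

# Positive classes: match one character from the listed set or range.
vowels = re.findall(r"[aeiou]", "regular")
letters_or_digits = re.findall(r"[a-c0-9]", "abcd 19")

# Negative class: the first character must not be a vowel.
regex = re.compile(r"[^aeiou]\d{4}")
candidates = ["b1001", "c1002", "d1003", "f1004", "h1005", "a1006"]
matched = [s for s in candidates if regex.fullmatch(s)]

print(vowels)             # ['e', 'u', 'a']
print(letters_or_digits)  # ['a', 'b', 'c', '1', '9']
print(matched)            # ['b1001', 'c1002', 'd1003', 'f1004', 'h1005']
```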
Shorthand Character Class
Shorthand  Matches                        Equivalent to (for ASCII text)
\d         any decimal digit              [0-9]
\D         any non-digit                  [^0-9]
\w         any word character             [a-zA-Z_0-9]
\W         any non-word character         [^a-zA-Z_0-9]
\s         any whitespace character       [ \t\n\r\f\v]
\S         any non-whitespace character   [^ \t\n\r\f\v]
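The shorthand classes can be tried on a small made-up string. Note that the sample text is a normal string, so \t in it is a real tab character.

```python
import re

text = "Roll_no 1004\t(CS)"

digits = re.findall(r"\d", text)      # single decimal digits
words = re.findall(r"\w+", text)      # runs of word characters
non_word = re.findall(r"\W", text)    # everything that is not a word character
spaces = re.findall(r"\s", text)      # whitespace characters

print(digits)    # ['1', '0', '0', '4']
print(words)     # ['Roll_no', '1004', 'CS']
print(non_word)  # [' ', '\t', '(', ')']
print(spaces)    # [' ', '\t']
```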
Anchoring
Specify the relative location of the match
Anchor  Meaning
^       Start of line or string
$       End of line or string
\A      Start of string
\Z      End of string
\b      Word boundary
\B      Non-word boundary
^ - Start of line or String
Specify the location of the match as “start of
line or string”
r'^Hello'
Matches Hello only at the
beginning of the input string
r'hello'
Matches hello anywhere in
the input string
$ - End of line or String
Specify the location of the match as “end of
line or string”
r'bye$'
Matches bye only at the end of the
input string
r'bye'
Matches bye anywhere in the
input string
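The effect of the ^ and $ anchors can be sketched as follows. The sample strings are made up, and the re.MULTILINE line shows the "start of line" case.

```python
import re

at_start = re.search(r"^Hello", "Hello world")
not_at_start = re.search(r"^Hello", "Say Hello")

at_end = re.search(r"bye$", "good bye")
not_at_end = re.search(r"bye$", "bye for now")

# With re.MULTILINE, ^ also matches at the start of every line.
line_starts = re.findall(r"^Hello", "Hello\nHello", re.MULTILINE)

print(at_start.group(), not_at_start)  # Hello None
print(at_end.group(), not_at_end)      # bye None
print(line_starts)                     # ['Hello', 'Hello']
```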
\b – Word boundary
Specifies a word boundary
Matches the position between a word character, that is, one of [a-zA-Z0-9_],
and a non-word character or the start or end of the string; it does not consume any characters
\bcat\b
Matches cat in
“My cat”
“Your cat”
“(cat) is pet”
But not in
“cat1 is good” (1 is a word character)
“Concatenation of strings”
“catalyst is zinc”
\B – Non-word boundary
Specifies a non-word boundary
Opposite of \b
\Bcat\B
Matches cat in
“Concatenation of strings”
“Acatalyst is zinc”
But not in
“My cat”
“Your cat”
“cat1 is good”
“(cat) is pet”
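Both boundary anchors can be checked against the sample phrases from these slides. Note that \bcat\b does not match in "cat1 is good", because the digit 1 is a word character and so there is no boundary after cat.

```python
import re

samples = [
    "My cat",
    "Your cat",
    "cat1 is good",
    "(cat) is pet",
    "Concatenation of strings",
    "catalyst is zinc",
    "Acatalyst is zinc",
]

# \b on both sides: cat must stand alone as a word.
whole_word = [s for s in samples if re.search(r"\bcat\b", s)]

# \B on both sides: cat must be surrounded by word characters.
inside_word = [s for s in samples if re.search(r"\Bcat\B", s)]

print(whole_word)   # ['My cat', 'Your cat', '(cat) is pet']
print(inside_word)  # ['Concatenation of strings', 'Acatalyst is zinc']
```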
Compilation Flags
- compile() takes two parameters: the first is the
pattern and the second, which is optional, is the
compilation flag.
re.compile(pattern, [flag])
- Compilation flags can be passed to compile() or
embedded in the regex pattern itself
Pattern=r'(?i)\w+'
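Both ways of setting a flag can be compared in a short sketch. The pattern hello is an assumption, chosen instead of \w+ because case-insensitivity has a visible effect on it.

```python
import re

# Flag passed as the second argument to compile().
by_argument = re.compile(r"hello", re.IGNORECASE)

# The same flag embedded at the start of the pattern.
inline = re.compile(r"(?i)hello")

text = "Say HELLO"
print(by_argument.search(text).group())  # HELLO
print(inline.search(text).group())       # HELLO
```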
Compilation Flags
Compilation Flag        Inline  Meaning
re.IGNORECASE or re.I   (?i)    Case-insensitive matching
re.DOTALL or re.S       (?s)    . matches any character, including '\n'
re.VERBOSE or re.X      (?x)    Ignores whitespace and comments in the pattern
re.MULTILINE or re.M    (?m)    Enables multi-line mode (^ and $ also match at each line)
re.UNICODE or re.U      (?u)    Enables Unicode mode (the default for str patterns in Python 3)
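A brief sketch of three of these flags in use. The patterns and sample strings are made up for illustration.

```python
import re

# re.DOTALL: the dot also matches the newline.
dotall = re.findall(r"a.b", "a\nb", re.DOTALL)

# re.MULTILINE: ^ matches at the start of every line.
multiline = re.findall(r"^\d+", "10\n20\n30", re.MULTILINE)

# re.VERBOSE: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    \d+   # one or more digits
    kg    # followed by the unit
""", re.VERBOSE)
weight = verbose.search("A weight is 46kg").group()

print(dotall)     # ['a\nb']
print(multiline)  # ['10', '20', '30']
print(weight)     # 46kg
```

Several flags can be combined with the | operator, for example re.I | re.M.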
Assertions
- Look Ahead Assertions
  - Positive Look Ahead Assertions (?=...)
  - Negative Look Ahead Assertions (?!...)
- Look Behind Assertions
  - Positive Look Behind Assertions (?<=...)
  - Negative Look Behind Assertions (?<!...)
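The slides only name the four assertion types, so the following is a minimal sketch with made-up patterns and inputs. An assertion checks what comes after or before the current position without including that text in the match.

```python
import re

weights = "46kg 150cm"
rolls = "CS1004 EC2001"

# Positive look ahead: digits that are followed by "kg".
before_kg = re.findall(r"\d+(?=kg)", weights)

# Negative look ahead: a whole number that is not followed by "kg".
not_before_kg = re.findall(r"\d+(?!\d|kg)", weights)

# Positive look behind: four digits that are preceded by "CS".
after_cs = re.findall(r"(?<=CS)\d{4}", rolls)

# Negative look behind: four digits that are not preceded by "CS".
not_after_cs = re.findall(r"(?<!CS)\d{4}", rolls)

print(before_kg)      # ['46']
print(not_before_kg)  # ['150']
print(after_cs)       # ['1004']
print(not_after_cs)   # ['2001']
```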