Regular Expressions in Python | Set 2 (Search, Match and Find All)

March 30, 2021, 11:52:46

Regular Expression in Python with Examples | Set 1

The re module provides support for regular expressions in Python. Below are the main methods in this module.

Searching for an occurrence of the pattern

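re.search(): this method returns None if the pattern does not match anywhere in the string, or a match object containing information about the part of the string that matched. It stops after the first match, so it is best suited to testing whether a regular expression matches rather than to extracting every occurrence.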

# A Python program to demonstrate the working of re.search().
import re

# Let's use a regular expression to match a date string
# in the form of a month name followed by a day number.
regex = r"([a-zA-Z]+) (\d+)"

match = re.search(regex, "I was born on June 24")

if match is not None:

    # We reach here when the expression "([a-zA-Z]+) (\d+)"
    # matches the date string.

    # This prints 14 and 21: the match starts at index 14
    # and ends just before index 21, i.e. the span [14, 21).
    print("Match at index %s, %s" % (match.start(), match.end()))

    # We use the group() method to get the full match and the
    # captured groups. The groups contain the matched values.
    # In particular:
    #    match.group(0) always returns the fully matched string
    #    match.group(1), match.group(2), ... return the capture
    #    groups in order from left to right in the pattern
    #    match.group() is equivalent to match.group(0)

    # So this will print "June 24"
    print("Full match: %s" % (match.group(0)))

    # So this will print "June"
    print("Month: %s" % (match.group(1)))

    # So this will print "24"
    print("Day: %s" % (match.group(2)))

else:
    print("The regex pattern does not match.")

Output:

Match at index 14, 21
Full match: June 24
Month: June
Day: 24
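
To make the "stops after the first match" behaviour concrete, here is a minimal sketch that reuses the date pattern above on a string containing two dates. The sample text and the variable names are made up for illustration; span() and groups() are standard match object methods.

```python
import re

# Same date pattern as above: a word followed by a number.
regex = r"([a-zA-Z]+) (\d+)"

# Hypothetical sample text containing two dates.
text = "She moved on March 3 and started work on April 15"

match = re.search(regex, text)

# re.search() stops at the first match, so "April 15" is never reported.
first_date = match.group(0)    # the full matched text
first_span = match.span()      # (start, end) indices of the match
month, day = match.groups()    # all capture groups as a tuple

print(first_date)   # March 3
print(first_span)   # (13, 20)
print(month, day)   # March 3
```

If every date is needed, re.findall() (covered further down) is the better fit.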

Matching a pattern against text

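re.match(): this function tries to match the pattern at the beginning of the string only; it does not scan the rest of the string, and it does not require the pattern to cover the whole string. re.match returns a match object on success and None on failure.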

re.match(pattern, string, flags=0)

pattern : Regular expression to be matched.
string  : String that is checked for a match at its beginning.
flags   : Optional flags that modify matching; several flags
          can be combined using bitwise OR (|).
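
As a small illustration of the flags parameter, the sketch below combines two standard flags, re.IGNORECASE and re.DOTALL, with bitwise OR. The pattern and the sample string are invented for this example.

```python
import re

# Illustrative pattern: the word "june", any one character, then digits.
pattern = r"june.(\d+)"

# Illustrative input: upper-case month, and a newline instead of a space.
sample = "JUNE\n24"

# No flags: fails, because "june" does not match "JUNE".
no_flags = re.match(pattern, sample)

# re.IGNORECASE alone: still fails, because "." does not match a newline.
one_flag = re.match(pattern, sample, re.IGNORECASE)

# Both flags combined with bitwise OR: the pattern now matches.
both_flags = re.match(pattern, sample, re.IGNORECASE | re.DOTALL)

print(no_flags)             # None
print(one_flag)             # None
print(both_flags.group(1))  # 24
```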
# A Python program to demonstrate the working
# of re.match().
import re

# A sample function that uses regular expressions
# to find the month and day of a date.
def findMonthAndDate(string):

    regex = r"([a-zA-Z]+) (\d+)"
    match = re.match(regex, string)

    if match is None:
        print("Not a valid date")
        return

    print("Given Data: %s" % (match.group()))
    print("Month: %s" % (match.group(1)))
    print("Day: %s" % (match.group(2)))


# Driver Code
# Matches, because the date is at the start of the string.
findMonthAndDate("Jun 24")
print("")
# Prints "Not a valid date": re.match() only matches at the
# start of the string, and this string starts with "I was".
findMonthAndDate("I was born on June 24")
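
The difference between matching at the start, searching anywhere, and matching the whole string is easy to mix up, so here is a short side-by-side sketch. It uses re.fullmatch(), which is part of the re module in Python 3 but is not otherwise covered in this article; the sample strings and variable names are made up.

```python
import re

regex = r"([a-zA-Z]+) (\d+)"

anchored = "June 24 was a Monday"    # date at the start, text after it
embedded = "I was born on June 24"   # date at the end

# re.match() succeeds only if the pattern matches starting at index 0.
match_anchored = re.match(regex, anchored) is not None
match_embedded = re.match(regex, embedded) is not None

# re.search() scans the string and finds the date wherever it is.
search_embedded = re.search(regex, embedded) is not None

# re.fullmatch() requires the pattern to cover the entire string.
full_anchored = re.fullmatch(regex, anchored) is not None
full_exact = re.fullmatch(regex, "June 24") is not None

print(match_anchored, match_embedded)   # True False
print(search_embedded)                  # True
print(full_anchored, full_exact)        # False True
```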

Finding all occurrences of a pattern

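re.findall(): returns all non-overlapping matches of the pattern in the string, as a list of strings. The string is scanned from left to right, and matches are returned in the order in which they are found (source: Python documentation: https://docs.python.org/2/library/re.html).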

# A Python program to demonstrate the working of
# findall()
import re

# A sample text string in which the regular expression
# is searched.
string = """Hello my Number is 123456789 and
             my friend's number is 987654321"""

# A sample regular expression to find runs of digits.
# A raw string keeps the backslash in \d intact.
regex = r'\d+'

matches = re.findall(regex, string)
print(matches)

# This example is contributed by Ayush Saluja.

Output:

['123456789', '987654321']
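
Two details of re.findall() are worth a quick sketch: what "non-overlapping" means, and what the result looks like when the pattern contains capture groups (a list of tuples rather than a list of strings). The input strings below are invented for illustration.

```python
import re

# Non-overlapping: once "aa" is matched at index 0, scanning resumes at
# index 2, so the overlapping "aa" at index 1 is not reported.
pairs = re.findall(r"aa", "aaa")

# With two capture groups, each item is a (group 1, group 2) tuple.
dates = re.findall(r"([a-zA-Z]+) (\d+)", "June 24, August 9, Dec 12")

print(pairs)   # ['aa']
print(dates)   # [('June', '24'), ('August', '9'), ('Dec', '12')]
```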

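Regular expressions are a vast topic, and the re module is a complete library in its own right. With regular expressions you can match, search, replace and extract large amounts of data. For example, the short snippet below is enough to extract email addresses from a piece of text, which is the kind of building block used when writing web crawlers and scrapers in Python. Look at the regex below.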

import re

# Extract all email addresses from `text` (the string to scan, defined
# elsewhere) and add them to the resulting set.
new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", text, re.I))
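
For a version that runs on its own, here is a minimal sketch with a made-up sample text. The addresses use reserved example domains, and lowercasing the results before building the set is an added assumption, not part of the original snippet. The pattern is deliberately simple and should not be treated as a full email validator.

```python
import re

# Hypothetical sample text; the addresses use reserved example domains.
text = """Contact Alice@Example.com or bob.smith@mail.example.org.
Replies also reach alice@example.com."""

# Same pattern as above; re.I makes the character classes case-insensitive.
email_regex = r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+"

# Lowercasing is an added assumption so that "Alice@Example.com" and
# "alice@example.com" are stored as a single address.
new_emails = set(address.lower() for address in re.findall(email_regex, text, re.I))

print(sorted(new_emails))
# ['alice@example.com', 'bob.smith@mail.example.org']
```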

We will discuss more regular expression methods soon.

木子山
