The Greatest Regex Trick Ever? A Condensed Python Experiment

Background

The Best Regex Trick
==========

The Greatest Regex Trick Ever

Clipped from The Best Regex Trick at 2022-04-16.


Tinkering

The article is a dense read... fortunately it includes code, so let's go straight to the Python sample:

Base code
import re
subject = 'Jane"" ""Tarzan12"" Tarzan11@Tarzan22 {4 Tarzan34}'
regex = re.compile(r'{[^}]+}|"Tarzan\d+"|(Tarzan\d+)')
# put Group 1 captures in a list (findall returns '' where Group 1 did not take part)
matches = [group for group in regex.findall(subject) if group]
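  • 💡: The idea is to find every "Tarzan" followed by digits in subject (e.g. Tarzan11 and Tarzan22), while excluding occurrences wrapped in double quotes and occurrences inside curly braces {...}.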

The list comprehension keeps only the non-empty Group 1 captures and stores them in the list matches.
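
To see why only the bare occurrences end up in Group 1, it can help to list every match the pattern makes and whether Group 1 took part. The following is a minimal sketch, not part of the original article; the names trace and kind are made up for illustration.

```python
import re

subject = 'Jane"" ""Tarzan12"" Tarzan11@Tarzan22 {4 Tarzan34}'
regex = re.compile(r'{[^}]+}|"Tarzan\d+"|(Tarzan\d+)')

# Record each overall match and whether Group 1 captured anything.
# The first two alternatives consume the unwanted contexts, so those
# matches leave Group 1 unset and are discarded.
trace = []
for m in regex.finditer(subject):
    kind = "keep" if m.group(1) else "discard"
    trace.append((m.group(0), kind))
    print(f"{kind:8} {m.group(0)}")
```

The quoted and braced occurrences are still matched, just by the alternatives outside the parentheses, which is what stops the third alternative from ever reaching them.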

The six tasks

Task I: Is there a match?

######## The six main tasks we're likely to have ########

# Task 1: Is there a match?
print("*** Is there a Match? ***")
if matches:
    print("Yes")
else:
    print("No")
Task II: How many matches are there?

# Task 2: How many matches are there?
print("\n" + "*** Number of Matches ***")
print(len(matches))
Task III: What is the first match?
# Task 3: What is the first match?
print("\n" + "*** First Match ***")
if matches:
    print(matches[0])
Task IV: What are all the matches?
# Task 4: What are all the matches?
print("\n" + "*** Matches ***")
if matches:
    for match in matches:
        print(match)
Task V: Replace
# Task 5: Replace the matches
def myreplacement(m):
    if m.group(1):
        return "Superman"
    else:
        return m.group(0)
replaced = regex.sub(myreplacement, subject)
print("\n" + "*** Replacements ***")
print(replaced)
Task VI: Split
# Task 6: Split
# Start by replacing by something distinctive,
# as in Step 5. Then split.
splits = replaced.split('Superman')
print("\n" + "*** Splits ***")
for split in splits:
    print(split)

Output
*** Is there a Match? ***
Yes

*** Number of Matches ***
2

*** First Match ***
Tarzan11

*** Matches ***
Tarzan11
Tarzan22

*** Replacements ***
Jane"" ""Tarzan12"" Superman@Superman {4 Tarzan34}

*** Splits ***
Jane"" ""Tarzan12"" 
@
 {4 Tarzan34}
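
Task VI splits on the literal string 'Superman', which would cut in the wrong place if the subject already contained that word. One possible alternative, sketched here as an assumption rather than as the original article's method, is to cut at the positions of the Group 1 matches directly. The names pieces and start are illustrative.

```python
import re

subject = 'Jane"" ""Tarzan12"" Tarzan11@Tarzan22 {4 Tarzan34}'
regex = re.compile(r'{[^}]+}|"Tarzan\d+"|(Tarzan\d+)')

pieces = []
start = 0
for m in regex.finditer(subject):
    if m.group(1):
        # A wanted match: keep the text before it, then skip past it
        pieces.append(subject[start:m.start(1)])
        start = m.end(1)
# Whatever follows the last wanted match is the final piece
pieces.append(subject[start:])

for piece in pieces:
    print(piece)
```

For this subject the pieces are the same three strings shown under Splits above, without going through a placeholder word.
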
Takeaway:

Roughly, the pattern has this shape: regex = re.compile(r'unwanted|(wanted)')

  • 💡: re.compile('exclusions without parentheses|(the match you want, in parentheses)')
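  • Note ⚠️: This is not a regex bug. It relies on ordinary alternation and capture-group behavior: the unwanted alternatives still match, and the code has to check Group 1 to discard them. That is why it works in programming languages but not in the search box of text editors such as EditPad Pro or Notepad++.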
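
The same shape can be wrapped in a small helper. This is only a sketch under stated assumptions: find_except and the group name keep are hypothetical names introduced here, not something from the original article or from the re module.

```python
import re

def find_except(target, excludes, text):
    """Return matches of `target` that are not consumed by any pattern in `excludes`."""
    # Each exclusion goes in a non-capturing group so its own alternations stay contained
    parts = [f"(?:{e})" for e in excludes]
    # The wanted pattern goes last, in a named group, so extra groups
    # inside the exclusions cannot shift its number
    parts.append(f"(?P<keep>{target})")
    regex = re.compile("|".join(parts))
    return [m.group("keep") for m in regex.finditer(text)
            if m.group("keep") is not None]

subject = 'Jane"" ""Tarzan12"" Tarzan11@Tarzan22 {4 Tarzan34}'
found = find_except(r'Tarzan\d+', [r'\{[^}]+\}', r'"Tarzan\d+"'], subject)
print(found)
```

The exclusions must come before the wanted pattern: at any given position the regex engine tries alternatives left to right, so the unwanted contexts get consumed first.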

Original URL: http://outofmemory.cn/langs/716250.html
