C# – How do I capture with balancing groups?

Suppose I have this text input:

tes{}tR{R{abc}aD{mnoR{xyz}}}

I want to extract the following output:

R{abc}
R{xyz}
D{mnoR{xyz}}
R{R{abc}aD{mnoR{xyz}}}

Currently, I can only extract the content inside the {} groups, using the balancing group approach found on MSDN. This is the pattern:

^[^{}]*(((?'Open'{)[^{}]*)+((?'Target-Open'})[^{}]*)+)*(?(Open)(?!))$

Does anyone know how to include the R{} and D{} prefixes in the output?
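For reference, here is a minimal sketch of how the pattern in the question behaves when run from C#. The class name BraceContents is made up for illustration, and the input uses the uppercase R spelling (mnoR) that the expected output implies. It lists each capture of the Target group, which should be the text between one opening brace and its matching closing brace, without the leading letter.

```csharp
using System;
using System.Text.RegularExpressions;

class BraceContents // hypothetical name, for illustration only
{
    static void Main()
    {
        // Input and pattern taken from the question.
        string input = "tes{}tR{R{abc}aD{mnoR{xyz}}}";
        string pattern =
            @"^[^{}]*(((?'Open'{)[^{}]*)+((?'Target-Open'})[^{}]*)+)*(?(Open)(?!))$";

        Match m = Regex.Match(input, pattern);
        if (!m.Success)
        {
            Console.WriteLine("Braces are not balanced.");
            return;
        }

        // Each capture of "Target" is the text between one '{' and its matching '}'.
        foreach (Capture c in m.Groups["Target"].Captures)
        {
            Console.WriteLine($"[{c.Value}]");
        }
    }
}
```

The square brackets in the output only make the empty capture from {} visible.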

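Solution

I think a different approach is needed here. Once you match the first, larger group R{R{abc}aD{mnoR{xyz}}} (see my comment about the possible typo), you cannot get at the subgroups inside it, because the match consumes the whole group and later matches cannot start inside text that has already been consumed.

So there has to be some way to capture without consuming, and the obvious way to do that is a positive lookahead. From there, you can reuse the expression you had, with a few changes to fit the new focus. I came up with: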

(?=([A-Z](?:(?:(?'O'{)[^{}]*)+(?:(?'-O'})[^{}]*?)+)+(?(O)(?!))))

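[I also renamed 'Open' to 'O' and removed the named capture for the closing brace, to make the pattern shorter and to avoid noise in the matches.]

On regexhero.net (the only free .NET regex tester I know of so far), I got the following capture groups:

1: R{R{abc}aD{mnoR{xyz}}}
1: R{abc}
1: D{mnoR{xyz}}
1: R{xyz}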
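The same result can be checked from C# directly. The following is a minimal sketch, assuming the input with uppercase R in mnoR; the class name OverlappingGroups is hypothetical. Because the whole pattern sits inside a lookahead, every match is zero-width, so the text of interest has to be read from group 1 rather than from the match itself.

```csharp
using System;
using System.Text.RegularExpressions;

class OverlappingGroups // hypothetical name, for illustration only
{
    static void Main()
    {
        string input = "tes{}tR{R{abc}aD{mnoR{xyz}}}";
        string pattern =
            @"(?=([A-Z](?:(?:(?'O'{)[^{}]*)+(?:(?'-O'})[^{}]*?)+)+(?(O)(?!))))";

        // Each match consumes nothing, so the engine can start a new match
        // inside a group it has already reported. Group 1 holds the text.
        foreach (Match m in Regex.Matches(input, pattern))
        {
            Console.WriteLine($"{m.Index}: {m.Groups[1].Value}");
        }
    }
}
```

The number printed before each value is the start position of the group in the input. Matches are reported in order of that position, which is why the outermost group comes first.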

Breakdown of the regex:

(?=                         # opening positive lookahead
    ([A-Z]                  # opening capture group and any uppercase letter (to match R & D)
        (?:                 # First non-capture group opening
            (?:             # Second non-capture group opening
                (?'O'{)     # Get the named opening brace
                [^{}]*      # Any non-brace
            )+              # Close of second non-capture group and repeat over as many times as necessary
            (?:             # Third non-capture group opening
                (?'-O'})    # Removal of named opening brace when encountered
                [^{}]*?     # Any other non-brace characters in case there are more nested braces
            )+              # Close of third non-capture group and repeat over as many times as necessary
        )+                  # Close of first non-capture group and repeat as many times as necessary for multiple side by side nested braces
        (?(O)(?!))          # Condition to prevent unbalanced braces
    )                       # Close capture group
)                           # Close positive lookahead
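A commented layout like the breakdown above can also be kept in the source code itself. The sketch below assumes RegexOptions.IgnorePatternWhitespace, which makes the .NET engine skip unescaped whitespace and '#' comments in the pattern; the class name CommentedPattern and the shortened comments are my own.

```csharp
using System;
using System.Text.RegularExpressions;

class CommentedPattern // hypothetical name, for illustration only
{
    static void Main()
    {
        // Same pattern as the one-line version, spread over several lines.
        // IgnorePatternWhitespace drops the layout and the '#' comments.
        var regex = new Regex(@"
            (?=                          # lookahead: match without consuming
              ([A-Z]                     # group 1: an uppercase letter
                (?:
                  (?:(?'O'{)[^{}]*)+     # push each opening brace onto O
                  (?:(?'-O'})[^{}]*?)+   # pop O for each closing brace
                )+
                (?(O)(?!))               # fail if an opening brace is unmatched
              )
            )", RegexOptions.IgnorePatternWhitespace);

        foreach (Match m in regex.Matches("tes{}tR{R{abc}aD{mnoR{xyz}}}"))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
    }
}
```

Quantifiers are kept directly after the closing parenthesis they apply to, so the layout does not change what is being repeated.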

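The following doesn't work in C#

I actually wanted to try how this would work on the PCRE engine, since it offers recursive regex. I find that easier because I am more familiar with it, and it yields a shorter regex :)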

(?=([A-Z]{(?:[^{}]|(?1))+}))

regex101 demo

(?=                    # opening positive lookahead
    ([A-Z]             # opening capture group and any uppercase letter (to match R & D)
        {              # opening brace
            (?:        # opening non-capture group
                [^{}]  # Matches non braces
            |          # OR
                (?1)   # Recurse first capture group
            )+         # Close non-capture group and repeat as many times as necessary
        }              # Closing brace
    )                  # Close of capture group
)                      # Close of positive lookahead

Original URL: http://outofmemory.cn/langs/1252971.html
