regex - Find strings between << and >> that contain backslashes
I am working in an RTF file and have to insert tags in a custom markup language that a program replaces with data. For example, in the file, I have:
account number: <<@account.accountnumber>>
I am editing the template in Microsoft Word 2007, and whenever I backspace, Microsoft Word inserts a bunch of RTF garbage into the template, like this:
<<@am\hich\af1\dbch\af31505\loch\f1 ount>>
instead of:
<<@amount>>
How can I find wherever this has happened? I have tried writing regular expressions for this, but I don't know how to write them well. Here's one I tried:
<<.+?\\.+?>>
But when I pass in the phrase:
<<where: phrase =\ @value>>\<<hi>>\hi<<hi>>
the backslash after "=" should be matched, but neither the backslash between the "<<where>>" and "<<hi>>" tags nor the "\hi" between the "<<hi>>" tags should be matched (regex101.com and Notepad++ both match them).
I do not care whether only the backslashes are matched or the entire tags that contain backslashes are. The end goal is to be able to find them in Notepad++ (or another editor if that's necessary) so I can fix them.
You can use the following regex:
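The over-matching described above can be reproduced outside Notepad++. The following is only a minimal sketch using Python's `re` module (an assumption for illustration; the variable names are hypothetical), run against the phrase from the question:

```python
import re

# The attempted pattern from the question. The lazy ".+?" is still allowed
# to run across a closing ">>" when that is the only way to reach a backslash.
attempt = re.compile(r"<<.+?\\.+?>>")

phrase = r"<<where: phrase =\ @value>>\<<hi>>\hi<<hi>>"

matches = attempt.findall(phrase)
for m in matches:
    print(m)
# The second match starts in one "<<hi>>" tag and ends in the next one,
# which is the unwanted behaviour.
```

Lazy quantifiers only prefer the shortest match; they do not forbid crossing a tag boundary, which is why the second, unwanted match appears.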
<<[^\\>]*\\[^>]*>>
Explanation:
<< - the opening tag of the custom markup language
[^\\>]* - any number of characters that are not \ or >
\\ - a literal \
[^>]* - any number of characters that are not >
>> - the closing tag of the custom markup language
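As a quick check of the expression explained above, here is a small sketch in Python's `re` module (chosen only for illustration; the names `tag_with_backslash`, `samples` and `found` are hypothetical). It uses the tags from the question as input:

```python
import re

# Same expression as in the answer: the negated classes cannot cross a ">",
# so a match always stays inside a single <<...>> tag.
tag_with_backslash = re.compile(r"<<[^\\>]*\\[^>]*>>")

samples = [
    r"<<@am\hich\af1\dbch\af31505\loch\f1 ount>>",   # damaged tag
    r"<<@amount>>",                                  # clean tag
    r"<<where: phrase =\ @value>>\<<hi>>\hi<<hi>>",  # test phrase
]

found = [tag_with_backslash.findall(s) for s in samples]
for sample, hits in zip(samples, found):
    print(sample, "->", hits)
```

Only the damaged tag and the `<<where: ...>>` tag are reported; the backslashes that sit between tags are ignored.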
Edit: to match when a > character can appear inside a custom markup tag, you can use the following expression, which relies on atomic groups / possessive quantifiers to prevent catastrophic backtracking and keep matches fast:
<<(?>(?>[^\\>]*)(?>>(?!>))?)*+\\(?>(?>[^>]*)(?>>(?!>))?)++>>
It's similar to the previous expression, but it includes:
(?>...) - atomic groups
(?>>(?!>))? - optionally match a > if it is not followed by another >
*+ - any number of times, with a possessive quantifier
++ - at least once, with a possessive quantifier
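For engines without atomic groups or possessive quantifiers (Python's `re` only gained them in 3.11), the same idea can be approximated with plain alternation. This is a sketch under that assumption, not the expression above; `lone_gt_ok` and the sample strings are made up for illustration:

```python
import re

# Each step consumes either a non-">" character or a single ">" that is not
# followed by another ">". The two alternatives never overlap, so the engine
# has no ambiguous ways to split the input and backtracking stays cheap.
lone_gt_ok = re.compile(r"<<(?:[^\\>]|>(?!>))*\\(?:[^>]|>(?!>))*>>")

samples = [
    r"<<a > b\c>>",       # lone ">" before the backslash
    r"<<x\y > z>>",       # lone ">" after the backslash
    r"<<a > b>>\<<c>>",   # backslash only between tags
]

results = [lone_gt_ok.findall(s) for s in samples]
for sample, hits in zip(samples, results):
    print(sample, "->", hits)
```

The first two samples match as whole tags, while the third yields nothing because its only backslash lies outside any tag.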