regex - Find strings between << and >> that contain backslashes


I am working in an RTF file into which I have to insert tags in a custom markup language; a program then replaces those tags with data. For example, in the file, I have:

account number:  <<@account.accountnumber>> 

I am editing the template in Microsoft Word 2007, and whenever I backspace, Microsoft Word inserts a bunch of RTF garbage into the template, like this:

<<@am\hich\af1\dbch\af31505\loch\f1 ount>> 

instead of:

<<@amount>> 

How can I find wherever this has happened? I tried writing regular expressions for this, but I don't know how to write them very well. Here's one I tried:

<<.+?\\.+?>> 

But when I pass in the phrase:

<<where: phrase =\ @value>>\<<hi>>\hi<<hi>>  

the backslash after "=" should be matched, but neither the backslash between the "<<where>>" and "<<hi>>" tags nor the "\hi" between the "<<hi>>" tags should be matched (regex101.com and Notepad++ both match them).

I do not care whether only the backslashes are matched or the entire tags that contain backslashes are. The end goal is to be able to find them in Notepad++ (or another editor if that's necessary) so I can fix them.

You can use the following regex:

<<[^\\>]*\\[^>]*>> 


Explanation:

  • << opening tag of the custom markup language
  • [^\\>]* any number of characters that are not \ or >
  • \\ a literal \
  • [^>]* any number of characters that are not >
  • >> closing tag of the custom markup language
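
To see the difference outside an editor, here is a minimal Python sketch that runs the pattern from the question and the pattern above against the sample phrase. The variable names are only illustrative, and Python's `re` module stands in for the Notepad++ engine, so treat it as a sanity check rather than a statement about how Notepad++ behaves.

```python
import re

# Pattern tried in the question: the lazy `.+?` is allowed to run across `>>`
# and into the next tag while it searches for a backslash.
NAIVE = re.compile(r"<<.+?\\.+?>>")
# Pattern from the answer: the character classes can never step over a `>`.
FIXED = re.compile(r"<<[^\\>]*\\[^>]*>>")

sample = r"<<where: phrase =\ @value>>\<<hi>>\hi<<hi>>"
garbled = r"<<@am\hich\af1\dbch\af31505\loch\f1 ount>>"

naive_hits = NAIVE.findall(sample)
fixed_hits = FIXED.findall(sample)

print(naive_hits)                    # also reports a span that starts at the first <<hi>>
print(fixed_hits)                    # only the tag with the backslash inside it
print(FIXED.findall(garbled))        # the Word-damaged tag is found
print(FIXED.findall("<<@amount>>"))  # a clean tag is not
```

The naive pattern reports a second span because nothing stops `.+?` from consuming `>>` before it reaches a backslash; the negated character classes remove that possibility.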

Edit: To match when a > character can appear inside a custom markup tag, you can use the following expression, which relies on atomic groups / possessive quantifiers to prevent catastrophic backtracking and keep matches fast:

<<(?>(?>[^\\>]*)(?>>(?!>))?)*+\\(?>(?>[^>]*)(?>>(?!>))?)++>> 

It's similar to the previous expression, but includes:

  • (?>...) atomic groups
  • (?>>(?!>))? optionally match a > if it is not followed by another >
  • *+ any number of times, as a possessive quantifier
  • ++ at least once, as a possessive quantifier
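
For checking a template from a script, the sketch below uses a simplified variant of this idea, not the expression above. Python's `re` module only gained atomic groups and possessive quantifiers in recent releases (3.11, to my knowledge), so this version uses plain non-capturing groups whose two alternatives can never match the same character, which should keep backtracking modest. Both the pattern `PORTABLE` and the helper `find_damaged_tags` are hypothetical names made up for illustration.

```python
import re

# Hypothetical portable variant (NOT the answer's exact expression):
# inside a tag, accept any non-`>` character, or a single `>` that is
# not followed by another `>`. The first half also refuses backslashes.
PORTABLE = re.compile(r"<<(?:[^\\>]|>(?!>))*\\(?:[^>]|>(?!>))*>>")

def find_damaged_tags(text):
    """Return (line_number, tag) pairs for tags that contain a backslash."""
    hits = []
    for m in PORTABLE.finditer(text):
        # Line numbers are 1-based: count the newlines before the match.
        line_no = text.count("\n", 0, m.start()) + 1
        hits.append((line_no, m.group(0)))
    return hits

doc = "ok: <<a > b>>\nbad: <<a > b \\ c>>\nbetween: <<x>>\\<<y>>\n"
print(find_damaged_tags(doc))
```

Only the tag on the second line is reported: the first tag contains a lone > but no backslash, and the backslash on the third line sits between two tags rather than inside one.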


