Regex subbing in Python leads to ASCII characters appearing

- February 15, 2015

I am trying to use a regex to fix spacing issues in some text.

The strings look like this:

a = "here shortString various issuesWith spacing"

My regex currently looks like this: new_string = re.sub("[a-z][A-Z]", "\1 \2", a)

This finds the places where a space is missing (where a capital letter comes directly after a lowercase letter) and adds a space.

Unfortunately, the output looks like this:

here shor\x01 \x02tring various issue\x01 \x02ith spacing

I want this:

b = "here short String various issues With spacing"

It seems the regex is matching the correct places that I want to change, but something is wrong with the substitution. I thought \1 \2 meant: put back the first part of the match, add a space, and then add the second matched item. Is the reason something else?

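>>> import re
>>> a = "here shortString various issuesWith spacing"
>>> re.sub("([a-z])([A-Z])", r"\1 \2", a)
'here short String various issues With spacing'

The capturing groups and the backslash escaping were missing. Without parentheses the pattern defines no groups for \1 and \2 to refer to. And in a normal (non-raw) string literal, "\1" and "\2" are escape sequences for the characters \x01 and \x02, which is exactly what showed up in your output. Wrap each part of the pattern in parentheses and write the replacement as a raw string (r"\1 \2") so that the backslashes reach re.sub intact.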
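To make the two problems concrete, here is a minimal sketch comparing the original call with the corrected one. The sample string uses the capitalization implied by the output in the question, and the names broken and fixed are only illustrative.

```python
import re

a = "here shortString various issuesWith spacing"

# In a normal string literal, "\1" is an octal escape for the character
# with code 1, so the replacement never contains a backslash at all.
assert "\1 \2" == "\x01 \x02"

# A raw string keeps the backslashes, so re.sub sees real group references.
assert r"\1 \2" == "\\1 \\2"

# Original attempt: no groups, non-raw replacement.
# Each match is replaced by the literal control characters.
broken = re.sub("[a-z][A-Z]", "\1 \2", a)

# Corrected: two capturing groups and a raw-string replacement.
fixed = re.sub(r"([a-z])([A-Z])", r"\1 \2", a)

print(repr(broken))
print(repr(fixed))
```

Printing with repr() makes the control characters visible as \x01 and \x02 instead of sending them to the terminal unescaped.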

You can go further and normalize the case as well:

>>> a = "here shortString various issuesWith spacing"
>>> re.sub('([a-z])([A-Z])', r'\1 \2', a).lower().capitalize()
'Here short string various issues with spacing'
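As an alternative to capturing groups (not what the answer above uses), the same split can be sketched with zero-width lookarounds, so the replacement is just a space and needs no backreferences. The helper name split_camel is hypothetical.

```python
import re

def split_camel(text):
    """Insert a space wherever a lowercase ASCII letter is directly
    followed by an uppercase ASCII letter."""
    # (?<=[a-z]) asserts a lowercase letter just before this position,
    # (?=[A-Z]) asserts an uppercase letter just after it.
    # The match itself is empty, so nothing is consumed or lost.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)

a = "here shortString various issuesWith spacing"
spaced = split_camel(a)

# Same optional case normalization as in the answer above.
normalized = spaced.lower().capitalize()

print(spaced)
print(normalized)
```

Because nothing is captured, there is no replacement template to escape, which avoids the raw-string pitfall entirely.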
