Legally Blonde Problem

Here's a quote from Legally Blonde.

quote = (
    "Because not I'm a Vanderbilt, suddenly white I'm trash? "
    "grew I up in Bel Air, Warner. Across the street from Aaron Spelling. "
    "think I most people would agree that's a lot better than some stinky old Vanderbilt."
)

What is this string representation?

If you have a really long single-line string like

longstring = "this is a really really long single-line string"

you can break it into multiple lines inside your code editor with parentheses like this:

longstring = (
    "this is a " 
    "really really long "
    "single-line string"
)

The line breaks are only visible to you in the editor. Python joins adjacent string literals into one, so it still treats the result as a single-line string.

>>> print(longstring)
this is a really really long single-line string
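As a quick check of the behavior described above, the sketch below compares the parenthesized form with the one-line form. The name `single` is only illustrative.

```python
longstring = (
    "this is a "
    "really really long "
    "single-line string"
)

# The same text written on one line (illustrative name)
single = "this is a really really long single-line string"

same_text = longstring == single      # adjacent literals were joined
has_newline = "\n" in longstring      # no newline was inserted between pieces

print(same_text)    # True
print(has_newline)  # False
```

Note that each piece must carry its own trailing space; the pieces are joined exactly as written, with nothing added between them.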

Notice each "I" and "I'm" in the string: the word immediately before it should come after it. For example, "Because not I'm" should be "Because I'm not". Fix this.

Expected result

newquote = (
    "Because I'm not a Vanderbilt, suddenly I'm white trash? "
    "I grew up in Bel Air, Warner. Across the street from Aaron Spelling. "
    "I think most people would agree that's a lot better than some stinky old Vanderbilt."
)
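
One possible way to reach the expected result is sketched below. It assumes `re.sub` with two capture groups and backreferences, which is only one of several workable approaches; the name `pattern` is illustrative.

```python
import re

quote = (
    "Because not I'm a Vanderbilt, suddenly white I'm trash? "
    "grew I up in Bel Air, Warner. Across the street from Aaron Spelling. "
    "think I most people would agree that's a lot better than some stinky old Vanderbilt."
)

# Group 1: the word before; group 2: "I'm" or "I" as a whole word.
# "I'm" is listed first so it is not matched as just "I" followed by "'m".
pattern = r"\b(\w+) (I'm|I)\b"

# Swap the two groups using the backreferences \2 and \1.
newquote = re.sub(pattern, r"\2 \1", quote)
print(newquote)
```

The match is case-sensitive, so lowercase words such as "in" are left alone.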

Regex Functions

Function	Description	Return Value
`re.findall(pattern, string, flags=0)`	Find all non-overlapping occurrences of pattern in string	list of strings, or list of tuples if the pattern has more than one capture group
`re.finditer(pattern, string, flags=0)`	Find all non-overlapping occurrences of pattern in string	iterator yielding match objects
`re.search(pattern, string, flags=0)`	Find first occurrence of pattern in string	match object or `None`
`re.split(pattern, string, maxsplit=0, flags=0)`	Split string by occurrences of pattern	list of strings
`re.sub(pattern, repl, string, count=0, flags=0)`	Replace pattern with repl	new string with the replacement(s)
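
To make the return values in the table concrete, here is a small sketch that runs each function on a made-up string. The sample `text` and the variable names are assumptions chosen for illustration.

```python
import re

text = "cat 12, dog 7, bird 345"  # made-up sample string

# findall: list of strings with zero or one capture group ...
numbers = re.findall(r"\d+", text)
# ... and list of tuples with more than one capture group
pairs = re.findall(r"(\w+) (\d+)", text)

# finditer: iterator of match objects (here reduced to text and start index)
positions = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]

# search: first match object, or None when nothing matches
first_dog = re.search(r"dog (\d+)", text)
no_fish = re.search(r"fish", text)

# split: list of the pieces between matches
parts = re.split(r",\s*", text)

# sub: new string; \1 and \2 refer back to the capture groups
swapped = re.sub(r"(\w+) (\d+)", r"\2 \1", text)

print(numbers)             # ['12', '7', '345']
print(pairs)               # [('cat', '12'), ('dog', '7'), ('bird', '345')]
print(positions)           # [('12', 4), ('7', 12), ('345', 20)]
print(first_dog.group(1))  # 7
print(no_fish)             # None
print(parts)               # ['cat 12', 'dog 7', 'bird 345']
print(swapped)             # 12 cat, 7 dog, 345 bird
```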

Regex Patterns

Pattern	Description
`[abc]`	a or b or c
`[^abc]`	not (a or b or c)
`[a-z]`	a or b ... or y or z
`[1-9]`	1 or 2 ... or 8 or 9
`\d`	digits `[0-9]`
`\D`	non-digits `[^0-9]`
`\s`	whitespace `[ \t\n\r\f\v]`
`\S`	non-whitespace `[^ \t\n\r\f\v]`
`\w`	word characters (alphanumeric or underscore) `[a-zA-Z0-9_]`
`\W`	non-word characters `[^a-zA-Z0-9_]`
`.`	any character except a newline (unless `re.DOTALL` is set)
`x*`	zero or more repetitions of x
`x+`	one or more repetitions of x
`x?`	zero or one repetitions of x
`x{m}`	exactly m repetitions of x
`x{m,n}`	m to n repetitions of x
`\\`, `\.`, `\*`	backslash, period, asterisk
`\b`	word boundary
`^hello`	starts with hello
`bye$`	ends with bye
`(...)`	capture group
`(po|go)`	po or go
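
A few of the patterns above can be tried out with `re.findall` and `re.search`. This is only a sketch; every sample string and variable name here is made up for illustration.

```python
import re

vowels = re.findall(r"[aeiou]", "regex")                # character set
two_digits = re.findall(r"\d{2}", "2024-01-15")         # exactly 2 digits
up_to_three = re.findall(r"\d{1,3}", "7 42 12345")      # 1 to 3 digits, greedy
optional_u = re.findall(r"colou?r", "color colour")     # zero or one "u"
whole_word = re.findall(r"\bcat\b", "cat concat cats")  # word boundaries
dot_any = re.findall(r"a.c", "abc a-c a\nc")            # "." skips the newline
either = re.findall(r"(po|go)", "pogo stick")           # alternation in a group
starts = bool(re.search(r"^hello", "hello world"))      # anchored at the start
ends = bool(re.search(r"bye$", "good bye"))             # anchored at the end

print(vowels)       # ['e', 'e']
print(two_digits)   # ['20', '24', '01', '15']
print(up_to_three)  # ['7', '42', '123', '45']
print(optional_u)   # ['color', 'colour']
print(whole_word)   # ['cat']
print(dot_any)      # ['abc', 'a-c']
print(either)       # ['po', 'go']
print(starts, ends) # True True
```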
