Star Wars Solution
Solution 1
import re
pattern = re.compile(r"[^\d]*")  # raw string, so \d is not treated as a string escape
newquotes = [pattern.search(string=x).group() for x in quotes]
print(newquotes)
# [
# 'A ',
# 'I might as well sleep for ',
# 'Hey Patrick, I thought of something funnier than ',
# 'I will have you know that I stubbed my toe last week and only cried for ',
# 'Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until '
# ]
Explanation
Consider the string "I might as well sleep for 100 years or so."
- re.search(pattern=r"[^\d]", string=...) matches the first non-digit character, searching from the beginning of the string. In the string above, that is the leading "I".
- re.search(pattern=r"[^\d]*", string=...) matches zero or more non-digit characters, starting at the beginning of the string. In the string above, that is "I might as well sleep for ", which stops just before "100".
- re.search() returns a match object (assuming a match is found).
    re.search(pattern=r"[^\d]*", string=quote)
    # <re.Match object; span=(0, 26), match='I might as well sleep for '>
  Match objects have a .group() method that returns the matching substring (or, given an index i, the ith capture group).
    match = re.search(pattern=r"[^\d]*", string=quote)
    match.group()
    # 'I might as well sleep for '
- This technique can be extended with a list comprehension to fetch the matching substring from each string in quotes.
    [re.search(pattern=r"[^\d]*", string=x).group() for x in quotes]
    # [
    # 'A ',
    # 'I might as well sleep for ',
    # 'Hey Patrick, I thought of something funnier than ',
    # 'I will have you know that I stubbed my toe last week and only cried for ',
    # 'Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until '
    # ]
- Since we reuse the same regex for every string, we can compile it once and reuse the compiled pattern. The re module caches recently used patterns, so the speedup is usually small, but compiling skips a cache lookup on every call and keeps the regex in one place.
    pattern = re.compile(r"[^\d]*")
    newquotes = [pattern.search(string=x).group() for x in quotes]
    print(newquotes)
    # [
    # 'A ',
    # 'I might as well sleep for ',
    # 'Hey Patrick, I thought of something funnier than ',
    # 'I will have you know that I stubbed my toe last week and only cried for ',
    # 'Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until '
    # ]
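For reference, the steps above can be put together as one self-contained sketch. The quotes list here is reconstructed from the strings shown on this page and is assumed to match the problem statement. The two extra inputs at the end are hypothetical; they are only there to show how the * quantifier behaves at the edges.

```python
import re

# Reconstructed from the strings shown on this page (assumed to match the
# `quotes` list given in the problem statement).
quotes = [
    "A 5 letter word for happiness... MONEY.",
    "I might as well sleep for 100 years or so.",
    "Hey Patrick, I thought of something funnier than 24... 25!",
    "I will have you know that I stubbed my toe last week and only cried for 20 minutes.",
    "Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until 4",
]

# Raw string so that \d reaches the regex engine unchanged.
pattern = re.compile(r"[^\d]*")

# Step by step on a single quote.
first = pattern.search(quotes[1])
print(first.span())         # (0, 26)
print(repr(first.group()))  # 'I might as well sleep for '

# The same call applied to every quote.
newquotes = [pattern.search(q).group() for q in quotes]
print(newquotes)

# Edge cases (hypothetical inputs): because * allows an empty match,
# search() always finds something with this pattern and never returns None.
starts_with_digit = pattern.search("4 o'clock").group()
no_digits = pattern.search("no digits here").group()
print(repr(starts_with_digit))  # ''
print(repr(no_digits))          # 'no digits here'
```

Because the pattern can match the empty string, calling .group() directly on the result is safe here. With a pattern that can fail to match, re.search() returns None and the result should be checked first.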
Solution 2
import re
pattern = re.compile(r"\d.*")  # raw string, so \d is not treated as a string escape
newquotes = [pattern.sub(string=x, repl="") for x in quotes]
print(newquotes)
# [
# 'A ',
# 'I might as well sleep for ',
# 'Hey Patrick, I thought of something funnier than ',
# 'I will have you know that I stubbed my toe last week and only cried for ',
# 'Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until '
# ]
Explanation
The strategy here is to identify the first digit in each string and everything after it, then replace that matching substring with the empty string "".
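To make the strategy concrete, here is a minimal sketch on a single quote. It uses the module-level re.search() and re.sub() rather than a compiled pattern, purely to keep the example short.

```python
import re

quote = "Hey Patrick, I thought of something funnier than 24... 25!"

# \d.* starts at the first digit and, because .* is greedy,
# runs to the end of the line.
tail = re.search(r"\d.*", quote)
print(repr(tail.group()))  # '24... 25!'

# Replacing that tail with "" leaves only the text before the first digit.
trimmed = re.sub(r"\d.*", "", quote)
print(repr(trimmed))  # 'Hey Patrick, I thought of something funnier than '

# Slicing the quote at the start of the match gives the same result.
print(trimmed == quote[: tail.start()])  # True
```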
- \d matches a digit. The quotes being searched are:
    A 5 letter word for happiness... MONEY.
    I might as well sleep for 100 years or so.
    Hey Patrick, I thought of something funnier than 24... 25!
    I will have you know that I stubbed my toe last week and only cried for 20 minutes.
    Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until 4
  The first digit matched in each is 5, 1, 2, 2 and 4 respectively.
- \d. matches a digit followed by any character (other than a newline). The first match in each of the first four quotes is "5 ", "10", "24" and "20".
  Important
  Notice that the last quote no longer has a match! Its only digit, 4, is the final character, so there is nothing left for . to match.
- \d.* matches a digit followed by zero or more repetitions of any character. Because * is greedy, each match runs from the first digit to the end of the quote: "5 letter word for happiness... MONEY.", "100 years or so.", "24... 25!", "20 minutes." and "4". The last quote matches again because .* is allowed to match zero characters.
- Replace each match from the previous step with the empty string.
    import re
    # Compile the pattern in preparation for repeated use.
    pattern = re.compile(r"\d.*")
    # Use a list comprehension combined with Pattern.sub() to replace each match with the empty string.
    newquotes = [pattern.sub(string=x, repl="") for x in quotes]
    print(newquotes)
    # [
    # 'A ',
    # 'I might as well sleep for ',
    # 'Hey Patrick, I thought of something funnier than ',
    # 'I will have you know that I stubbed my toe last week and only cried for ',
    # 'Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until '
    # ]
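The progression of patterns in the steps above can be checked with a short sketch. The quotes list is reconstructed from the strings shown on this page, first_match() is a hypothetical helper written for this illustration, and the multi-line string at the end is an invented input that shows one place where the two solutions differ.

```python
import re

# Reconstructed from the strings shown on this page (assumed to match the
# `quotes` list given in the problem statement).
quotes = [
    "A 5 letter word for happiness... MONEY.",
    "I might as well sleep for 100 years or so.",
    "Hey Patrick, I thought of something funnier than 24... 25!",
    "I will have you know that I stubbed my toe last week and only cried for 20 minutes.",
    "Sandy: Don’t you have to be stupid somewhere else? Patrick: Not until 4",
]


def first_match(regex, text):
    """Hypothetical helper: first substring matching regex, or None."""
    m = re.search(regex, text)
    return m.group() if m else None


patterns = [r"\d", r"\d.", r"\d.*"]
first_matches = {p: [first_match(p, q) for q in quotes] for p in patterns}
for p, found in first_matches.items():
    print(p, found)
# First matches per pattern:
#   \d   -> ['5', '1', '2', '2', '4']
#   \d.  -> ['5 ', '10', '24', '20', None]   (last quote: nothing follows the 4)
#   \d.* -> ['5 letter word for happiness... MONEY.', '100 years or so.',
#            '24... 25!', '20 minutes.', '4']

# On these quotes, both solutions produce the same list.
solution_1 = [re.search(r"[^\d]*", q).group() for q in quotes]
solution_2 = [re.sub(r"\d.*", "", q) for q in quotes]
print(solution_1 == solution_2)  # True

# Invented multi-line input: . does not match a newline by default, so
# Solution 2 only trims to the end of the first line unless re.DOTALL is set.
multiline = "wait 5 minutes\nthen go"
by_search = re.search(r"[^\d]*", multiline).group()
by_sub = re.sub(r"\d.*", "", multiline)
by_sub_dotall = re.sub(r"\d.*", "", multiline, flags=re.DOTALL)
print(repr(by_search))      # 'wait '
print(repr(by_sub))         # 'wait \nthen go'
print(repr(by_sub_dotall))  # 'wait '
```

For single-line quotes like the ones in this problem the two solutions are interchangeable. The difference only appears when a string contains a newline after its first digit.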