Went through some tutorials for regex today…
I learned:
- \d matches any single digit from 0 to 9
- . is a wildcard that matches any single character
- To match a literal period, use \.
- To specify characters use []
- ex: [abc]an
- This matches any string that starts with a, b, or c and ends in ‘an’
- To exclude characters, use [^abc]
- Or use a range: [a-c]
- ex: [a-c]an
- To match repetitions of characters
- A{1,6} matches between 1 and 6 repetitions of A
- A+ one or more repetitions
- A* zero or more repetitions
- Optional characters
- ? makes the character before it optional
- Whitespace characters
- \s
- \S (any non-whitespace character)
- Starting and ending
- ^…$
- Anchors the pattern to the start and end of the string, which helps prevent false positives (e.g. matching SUCCESSFUL inside unSUCCESSFUL)
- Capturing a group
- Use ()…
- Can be useful for separating a file name from its extension:
- ^(file_name.+)\.png$
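To check the notes above, here is a minimal sketch using Python's built-in re module. The test strings (like "can ban dan fan" and "file_name_01.png") are made up for illustration and are not from any tutorial.

```python
import re

# \d matches one digit; . matches any character; \. matches a literal period.
digits = re.findall(r"\d", "room 42")           # ['4', '2']
wildcard = re.findall(r"a.c", "a.c abc")        # ['a.c', 'abc']
literal_dot = re.findall(r"a\.c", "a.c abc")    # ['a.c']

# Character sets, exclusions, and ranges.
words = "can ban dan fan"
in_set = re.findall(r"[abc]an", words)          # ['can', 'ban']
not_in_set = re.findall(r"[^abc]an", words)     # ['dan', 'fan']
in_range = re.findall(r"[a-c]an", words)        # ['can', 'ban']

# Repetitions: {1,6}, + (one or more), * (zero or more).
three_as = re.fullmatch(r"A{1,6}", "AAA") is not None       # True
seven_as = re.fullmatch(r"A{1,6}", "AAAAAAA") is not None   # False (7 is too many)
plus_empty = re.fullmatch(r"A+", "") is not None            # False
star_empty = re.fullmatch(r"A*", "") is not None            # True

# ? makes the preceding character optional.
optional = re.findall(r"colou?r", "color colour")  # ['color', 'colour']

# \s matches whitespace; \S matches anything that is not whitespace.
spaces = re.findall(r"\s", "a b\tc")             # [' ', '\t']
non_spaces = re.findall(r"\S+", "a b\tc")        # ['a', 'b', 'c']

# Without anchors the pattern matches inside a longer word; ^...$ prevents that.
loose = re.search(r"successful", "unsuccessful") is not None      # True
anchored = re.search(r"^successful$", "unsuccessful") is not None  # False

# A capture group pulls out the file name without the extension.
match = re.match(r"^(file_name.+)\.png$", "file_name_01.png")
stem = match.group(1)                            # 'file_name_01'
```

re.findall returns every non-overlapping match, which makes it easy to see what a pattern picks up, while re.fullmatch only succeeds if the whole string fits the pattern.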
I’m still confused on:
For the dataset I’m practicing on, I’m being asked to replace a letter in a string, but the .str.replace(r”,”) function doesn’t seem to be working. I’ll continue with this one tomorrow.
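A note for tomorrow: I haven't worked out the actual cause yet, so the sketch below only shows two common reasons a pandas .str.replace call can look like it does nothing. The Series and its values are hypothetical, and I'm assuming the data lives in a pandas Series. First, .str.replace returns a new Series rather than changing the original, so the result has to be assigned back. Second, the pattern is only treated as a regex when regex=True is passed (the default became regex=False in pandas 2.0).

```python
import pandas as pd

# Hypothetical data, not the real dataset.
s = pd.Series(["cat", "hat"])

# regex=True: "^c" means "a c at the start of the string".
fixed = s.str.replace(r"^c", "b", regex=True)

# regex=False: "^c" is searched for as the literal two characters, so nothing changes.
literal = s.str.replace(r"^c", "b", regex=False)

# The original Series is untouched either way; assign the result back to keep it.
print(fixed.tolist())    # ['bat', 'hat']
print(literal.tolist())  # ['cat', 'hat']
print(s.tolist())        # ['cat', 'hat']
```

Passing regex explicitly avoids depending on which pandas version is installed.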