Contents
- 2. Intro to Python for Data Science Regular Expression
- 4. WHAT IS A REGULAR EXPRESSION? A Regular Expression (RegEx) is a sequence of characters that defines
- 5. A pattern defined using RegEx can be used to match against a string.
- 6. Python has a module named re to work with RegEx. Here's an example: import re pattern
- 7. THERE ARE SEVERAL OTHER FUNCTIONS DEFINED IN THE RE MODULE TO WORK WITH REGEX. BEFORE WE
- 8. SPECIFY PATTERN USING REGEX To specify regular expressions, metacharacters are used. In the previous example, ^
- 9. METACHARACTERS METACHARACTERS ARE CHARACTERS THAT ARE INTERPRETED IN A SPECIAL WAY BY A REGEX ENGINE. HERE'S
- 10. METACHARACTERS [] - Square brackets Square brackets specify a set of characters you wish to match.
- 11. METACHARACTERS You can also specify a range of characters using - inside square brackets. [a-e] is
- 12. METACHARACTERS . - Period A period matches any single character (except newline '\n').
- 13. METACHARACTERS ^ - Caret The caret symbol ^ is used to check if a string starts
- 14. METACHARACTERS $ - Dollar The dollar symbol $ is used to check if a string ends
- 15. METACHARACTERS * - Star The star symbol * matches zero or more occurrences of the pattern
- 16. METACHARACTERS + - Plus The plus symbol + matches one or more occurrences of the pattern
- 17. METACHARACTERS ? - Question Mark The question mark symbol ? matches zero or one occurrence of
- 18. METACHARACTERS {} - Braces Consider this code: {n,m}. This means at least n, and at most
- 19. METACHARACTERS Let's try one more example. This RegEx [0-9]{2,4} matches at least 2 digits but
- 20. METACHARACTERS | - Alternation Vertical bar | is used for alternation (or operator). Here, a|b matches
- 21. METACHARACTERS () - Group Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any
- 22. METACHARACTERS \ - Backslash Backslash \ is used to escape various characters including all metacharacters. For
- 23. SPECIAL SEQUENCES Special sequences make commonly used patterns easier to write. Here's a list of special
- 24. SPECIAL SEQUENCES \b - Matches if the specified characters are at the beginning or end of
- 25. SPECIAL SEQUENCES \B - Opposite of \b. Matches if the specified characters are not at the
- 26. SPECIAL SEQUENCES \d - Matches any decimal digit. Equivalent to [0-9] \D - Matches any non-decimal
- 27. SPECIAL SEQUENCES \s - Matches where a string contains any whitespace character. Equivalent to [ \t\n\r\f\v].
- 28. SPECIAL SEQUENCES \w - Matches any alphanumeric character (digits and letters). Equivalent to [a-zA-Z0-9_]. By the
- 29. SPECIAL SEQUENCES \Z - Matches if the specified characters are at the end of a string.
- 30. SPECIAL SEQUENCES Tip: To build and test regular expressions, you can use RegEx tester tools such
- 31. PYTHON REGEX Python has a module named re to work with regular expressions. To use it,
- 32. PYTHON REGEX re.findall() The re.findall() method returns a list of strings containing all matches. Example 1:
- 33. PYTHON REGEX re.split() The re.split method splits the string where there is a match and returns
- 34. PYTHON REGEX You can pass maxsplit argument to the re.split() method. It's the maximum number of
- 35. PYTHON REGEX re.sub() The syntax of re.sub() is: re.sub(pattern, replace, string) The method returns a string
- 36. PYTHON REGEX Example 3: re.sub() # Program to remove all whitespace import re # multiline string
- 37. PYTHON REGEX You can pass count as a fourth parameter to the re.sub() method. If omitted,
- 38. PYTHON REGEX re.subn() re.subn() is similar to re.sub() except it returns a tuple of 2
- 39. PYTHON REGEX re.search() The re.search() method takes two arguments: a pattern and a string. The method
- 40. PYTHON REGEX Example 5: re.search() import re string = "Python is fun" # check if 'Python'
- 41. MATCH OBJECT You can get methods and attributes of a match object using dir() function. Some
- 42. MATCH OBJECT match.start(), match.end() and match.span() The start() function returns the index of the start of
- 43. MATCH OBJECT match.re and match.string The re attribute of a match object returns a regular expression
- 44. USING R PREFIX BEFORE REGEX When r or R prefix is used before a regular expression,
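
The slide snippets in this outline are cut off before their examples, so the sketch below gives a rough, self-contained illustration of the `re` functions and match-object members the outline names. The sample strings and variable names are invented for illustration and are not taken from the slides.

```python
import re

# All sample strings here are made up for illustration.
text = "hello 12 hi 89. Howdy 34"

# re.findall() returns a list of all non-overlapping matches.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['12', '89', '34']

# {2,4} (no space after the comma) matches runs of two to four digits.
digit_runs = re.findall(r"[0-9]{2,4}", "ab123cde 1 and 12345")
print(digit_runs)  # ['123', '1234']

# re.split() splits the string at each match; maxsplit caps the number of splits.
split_all = re.split(r"\d+", "Twelve:12 Eighty nine:89.")
split_once = re.split(r"\d+", "Twelve:12 Eighty nine:89.", maxsplit=1)
print(split_all)   # ['Twelve:', ' Eighty nine:', '.']
print(split_once)  # ['Twelve:', ' Eighty nine:89.']

# re.sub() replaces every match; re.subn() also reports how many
# replacements were made, as a (new_string, count) tuple.
messy = "abc 12\nde 23 \n f45 6"
no_space = re.sub(r"\s+", "", messy)
no_space_n = re.subn(r"\s+", "", messy)
print(no_space)    # abc12de23f456
print(no_space_n)  # ('abc12de23f456', 5)

# re.search() returns a match object for the first match, or None.
match = re.search(r"\APython", "Python is fun")
if match:
    # start(), end() and span() give the position of the match;
    # match.re is the compiled pattern, match.string the searched string.
    print(match.start(), match.end(), match.span())  # 0 6 (0, 6)
    print(match.re.pattern, "|", match.string)

# The r prefix keeps the backslash literal: r"\n" is two characters,
# while "\n" is a single newline character.
raw_len = len(r"\n")
plain_len = len("\n")
print(raw_len, plain_len)  # 2 1
```

Raw strings are used for every pattern so that sequences such as `\d` and `\s` reach the regex engine unchanged.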