Predefined character classes to use in regular expressions:
[:lower:] | Lower-case letters |
[:upper:] | Upper-case letters |
[:alpha:] | Alphabetic characters: [:lower:] and [:upper:] |
[:digit:] | Digits: 0 1 2 3 4 5 6 7 8 9 |
[:alnum:] | Alphanumeric characters: [:alpha:] and [:digit:] |
[:print:] | Printable characters: [:alnum:], [:punct:] and space |
[:punct:] | Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~ |
[:blank:] | Blank characters: space and tab |
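A minimal sketch of how these classes behave in R. The sample string `s` and the result names are made up for illustration and are not part of the original material. Note that a class such as [:lower:] only works inside a bracket expression, hence the double brackets.

```r
# Hypothetical sample string (not from the original material)
s <- "Hello World 2024!"

# A class must be wrapped in a bracket expression: [[:class:]]
no_lower  <- gsub(pattern = "[[:lower:]]", replacement = "", x = s)  # drop lower-case letters
no_digit  <- gsub(pattern = "[[:digit:]]", replacement = "", x = s)  # drop digits
no_punct  <- gsub(pattern = "[[:punct:]]", replacement = "", x = s)  # drop punctuation
has_blank <- grepl(pattern = "[[:blank:]]", x = s)                   # any space or tab?

print(no_lower)
print(no_digit)
print(no_punct)
print(has_blank)
```

Several classes can be combined in one bracket expression, for example "[[:lower:][:digit:]]" to match either a lower-case letter or a digit.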
- Take the previous character vector containing email addresses:
- Remove the @ and the email provider from each address
# Delete the @ and the run of lower-case letters / punctuation that follows it
gsub(pattern="@[[:lower:][:punct:]]+",
replacement="",
x=vec_ad)
- Same thing, but also remove any digit(s) immediately BEFORE the @ (if any):
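# Option 1: provider made of lower-case letters and punctuation only
gsub(pattern="[[:digit:]]*@[[:lower:][:punct:]]+",
replacement="",
x=vec_ad)
# Option 2: provider made of any printable characters (also covers upper case and digits)
gsub(pattern="[[:digit:]]*@[[:print:]]+",
replacement="",
x=vec_ad)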
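The two patterns above can give different results. The sketch below illustrates this with made-up addresses. The contents of `vec_ad` here are an assumption, since the real vector is defined earlier in the material, and `user_lower` and `user_print` are hypothetical names.

```r
# Hypothetical addresses; the real vec_ad is defined earlier in the material
vec_ad <- c("anna.smith@gmail.com",
            "bob42@uni-example.org",
            "carla7@Mail.Example.COM")

# Pattern 1: only lower-case letters and punctuation are allowed after the @
user_lower <- gsub(pattern = "[[:digit:]]*@[[:lower:][:punct:]]+",
                   replacement = "",
                   x = vec_ad)

# Pattern 2: any printable character is allowed after the @
user_print <- gsub(pattern = "[[:digit:]]*@[[:print:]]+",
                   replacement = "",
                   x = vec_ad)

print(user_lower)  # third address is left unchanged: "M" is not lower-case
print(user_print)  # all three addresses are reduced to the user name
```

The [:print:] version is more permissive, so it also handles providers written with upper-case letters or digits. Because it matches every printable character up to the end of the string, it is only safe when each element holds a single address.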