Sometimes you just need a list of e-mail addresses from text files on your computer. I have personally needed this while managing an e-mail server.
Here is the scenario: given a text file that has e-mail addresses intermixed with other text, extract a sorted list of e-mail addresses with duplicates removed.
While there are commercial applications to do this, if you have a Unix-based system then you have all of the tools that you need available at the command line.
For an input file called EMAIL_SAMPLES.TXT, this will work:
grep -o '[[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*' EMAIL_SAMPLES.TXT | sort | uniq -i
Let’s break down the call pipeline:
grep -o scans the text file for matches to the requested regular expression and prints each match on its own line
'[[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*' is a regular expression that loosely matches e-mail addresses: a run of letters, digits, and punctuation such as + . _ - on each side of an @
sort takes the list of e-mail addresses produced by grep and sorts them alphabetically
uniq -i filters out repeated e-mail addresses so that each is only listed once. It only compares adjacent lines, which is why the list is sorted first. The -i flag instructs it to use a case-insensitive comparison of lines.
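To see the pipeline in action, here is a minimal sketch that builds a throwaway sample file and runs a slightly stricter variant of the command. The sample addresses, the temporary file, and the function name extract_emails are invented for illustration. The variant is not the command above but an assumption about what you might want: it uses + instead of * so that a lone @ is not reported, drops the backslashes (inside a POSIX bracket expression a backslash is just a literal character), and uses sort -f so that addresses differing only by case are guaranteed to end up next to each other before uniq -i sees them.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the unique e-mail-like strings found in file $1.
extract_emails() {
  # -E enables extended regular expressions, so + means "one or more".
  # sort -f folds case while sorting, keeping case variants adjacent for uniq -i.
  grep -E -o '[[:alnum:]+._-]+@[[:alnum:]+._-]+' "$1" | sort -f | uniq -i
}

# Build a throwaway sample file; the addresses are made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Contact bob@example.com or Alice@Example.org for details.
Reply to BOB@example.com; cc: carol.smith+lists@mail.example.net
A lone @ sign should not be reported.
EOF

extract_emails "$sample"
rm -f "$sample"
```

This should print three addresses: Alice's, one of the two spellings of Bob's, and Carol's. Which spelling of Bob's address survives depends on your locale, because uniq keeps whichever duplicate sort happens to put first. Note that the pattern is still a loose match, so an address at the end of a sentence will pick up the trailing period.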
If you need to run this more than once, it makes sense to create a custom shell script like this:
#!/usr/bin/env bash
if [ -f "$1" ]; then
  grep -o '[[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*' "$1" | sort | uniq -i
else
  echo "Expected a file at $1, but it doesn't exist." >&2
  exit 1
fi
And now the same can be accomplished by running
This goes to show just how flexible the standard Unix tools are. They can be connected together to accomplish really neat tasks without the need for more complicated code. Anyone with a Mac OS X or Linux system has all that it takes at their fingertips.