Regular Expressions
Verify and extract login from an email address
Validates that an email address is formatted correctly, and extracts everything before the @ symbol.
use lazy_static::lazy_static;
use regex::Regex;

fn extract_login(input: &str) -> Option<&str> {
    // Compile the regex once and reuse it on every call.
    lazy_static! {
        static ref RE: Regex = Regex::new(r"(?x)
            ^(?P<login>[^@\s]+)@
            ([[:word:]]+\.)*
            [[:word:]]+$
            ").unwrap();
    }
    // Return the named `login` group, or None if the input does not match.
    RE.captures(input).and_then(|cap| {
        cap.name("login").map(|login| login.as_str())
    })
}

fn main() {
    assert_eq!(extract_login(r"I❤email@example.com"), Some(r"I❤email"));
    assert_eq!(
        extract_login(r"sdf+sdsfsd.as.sdsd@jhkk.d.rl"),
        Some(r"sdf+sdsfsd.as.sdsd")
    );
    // More than one @, or whitespace in the login part, is rejected.
    assert_eq!(extract_login(r"More@Than@One@at.com"), None);
    assert_eq!(extract_login(r"Not an email@email"), None);
}
Extract a list of unique #Hashtags from a text
Extracts a deduplicated set of hashtags from text. The result is a HashSet, so the tags are unique but unordered.
The hashtag regex given here only catches Latin hashtags that start with a letter. The complete Twitter hashtag regex is much more complicated.
use lazy_static::lazy_static;
use regex::Regex;
use std::collections::HashSet;

fn extract_hashtags(text: &str) -> HashSet<&str> {
    lazy_static! {
        static ref HASHTAG_REGEX: Regex = Regex::new(
            r"\#[a-zA-Z][0-9a-zA-Z_]*"
        ).unwrap();
    }
    // Collecting into a HashSet drops duplicate tags.
    HASHTAG_REGEX.find_iter(text).map(|mat| mat.as_str()).collect()
}

fn main() {
    let tweet = "Hey #world, I just got my new #dog, say hello to Till. #dog #forever #2 #_ ";
    let tags = extract_hashtags(tweet);
    // "#2" and "#_" are skipped because a tag must start with a letter.
    assert!(tags.contains("#dog") && tags.contains("#forever") && tags.contains("#world"));
    assert_eq!(tags.len(), 3);
}
Extract phone numbers from text
Processes a string of text using Regex::captures_iter to capture multiple phone numbers. The example here is for phone numbers written in the US convention.
Filter a log file by matching multiple regular expressions
Reads a file named application.log and only outputs the lines containing “version X.X.X”, some IP address followed by port 443 (e.g. “192.168.0.1:443”), or a specific warning.
A regex::RegexSetBuilder composes a regex::RegexSet.
Since backslashes are very common in regular expressions, using
raw string literals makes them more readable.
Replace all occurrences of one text pattern with another pattern
Replaces all occurrences of the standard ISO 8601 YYYY-MM-DD date pattern with the equivalent American-style MM/DD/YYYY date, separated by slashes. For example, 2013-01-15 becomes 01/15/2013.
The method Regex::replace_all replaces every non-overlapping match of the whole regex. &str implements the Replacer trait, which allows variables like $abcde to refer to the corresponding named capture groups (?P<abcde>REGEX) from the search regex. See the replacement string syntax for examples and escaping detail.
use lazy_static::lazy_static;
use std::borrow::Cow;
use regex::Regex;

fn reformat_dates(before: &str) -> Cow<str> {
    lazy_static! {
        static ref ISO8601_DATE_REGEX: Regex = Regex::new(
            r"(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})"
        ).unwrap();
    }
    // $m, $d and $y refer to the named capture groups above.
    ISO8601_DATE_REGEX.replace_all(before, "$m/$d/$y")
}

fn main() {
    let before = "2012-03-14, 2013-01-15 and 2014-07-05";
    let after = reformat_dates(before);
    assert_eq!(after, "03/14/2012, 01/15/2013 and 07/05/2014");
}