Parsing content from string
Parse tagged fields from a log line
nom is a parser-combinator library: you build a parser by composing
small parsers, each of which consumes part of the input and hands the rest to
the next. A parser is any function fn(Input) -> IResult<Input, Output>, where
IResult carries either the parsed value together with the unconsumed tail,
or an error.
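The idea of composing small parsers can be sketched in plain Rust without nom. This is only an illustration of the concept, not nom's implementation: ParseResult, literal and pair are hypothetical names invented for this sketch, and the String error type is a stand-in for nom's richer error types.

```rust
/// Simplified stand-in for nom's `IResult` (hypothetical): on success, the
/// unconsumed tail plus the parsed value; on failure, a message.
type ParseResult<'a, T> = Result<(&'a str, T), String>;

/// Hypothetical `literal` parser: succeeds if the input starts with `expected`
/// and returns the matched prefix together with the rest of the input.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> ParseResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(format!("expected {expected:?} at {input:?}")),
    }
}

/// Hypothetical `pair` combinator: runs `first`, then hands the unconsumed
/// tail to `second`, and returns both values with the final tail.
fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> ParseResult<'a, A>,
    second: impl Fn(&'a str) -> ParseResult<'a, B>,
) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)> {
    move |input: &'a str| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Ok((rest, (a, b)))
    }
}

fn main() {
    let parser = pair(literal("level="), literal("warn"));
    // Both parsers succeed; the tail after "warn" is handed back to the caller.
    assert_eq!(
        parser("level=warn line=42"),
        Ok((" line=42", ("level=", "warn")))
    );
    // The second parser fails, so the combined parser fails too.
    assert!(parser("level=debug").is_err());
}
```

Each parser only looks at the front of its input, which is what lets a combinator such as pair chain them without either one knowing about the other.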
This recipe parses a log line like level=warn line=42 into a struct. Literal
tokens are matched with tag, and alt tries each alternative in order; value
maps each matched keyword to a Level without a closure. A fallible conversion
(string to u32) is wrapped in map_res so a bad number becomes a parse error
rather than a panic. Finish then unwraps nom's Err wrapper, which distinguishes
recoverable errors, fatal failures and incomplete input, and leaves a plain
Result once parsing is complete.
use std::error::Error;
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::character::complete::{digit1, space1};
use nom::combinator::{map, map_res, value};
use nom::sequence::{preceded, separated_pair};
use nom::{Finish, IResult, Parser};
/// A single log line such as `level=warn line=42` parsed into a struct.
#[derive(Debug, PartialEq)]
struct LogEntry {
    level: Level,
    line: u32,
}
#[derive(Clone, Debug, PartialEq)]
enum Level {
    Info,
    Warn,
    Error,
}
/// `nom` builds a parser by combining small parsers. Each one takes the
/// remaining input and returns the parsed value plus the unconsumed tail,
/// so combinators like `separated_pair` and `preceded` thread that tail
/// through for you. `parse` drives one of these combinators over the input.
///
/// `value` matches a token and yields a fixed value, ignoring the matched
/// text — exactly what mapping each keyword to a `Level` needs.
fn level(input: &str) -> IResult<&str, Level> {
    alt((
        value(Level::Info, tag("info")),
        value(Level::Warn, tag("warn")),
        value(Level::Error, tag("error")),
    ))
    .parse(input)
}
/// `map_res` runs a fallible conversion (`str::parse`) and turns an `Err`
/// into a parse failure instead of panicking.
fn number(input: &str) -> IResult<&str, u32> {
    map_res(digit1, str::parse).parse(input)
}
/// `level=<level> line=<number>`, separated by whitespace.
fn log_entry(input: &str) -> IResult<&str, LogEntry> {
    map(
        separated_pair(
            preceded(tag("level="), level),
            space1,
            preceded(tag("line="), number),
        ),
        |(level, line)| LogEntry { level, line },
    )
    .parse(input)
}
fn main() -> Result<(), Box<dyn Error>> {
    // `finish` (from the `Finish` trait) converts the `IResult` into a plain
    // `Result`; `?` then propagates a parse error instead of panicking. The
    // remaining input (empty here) is discarded without being checked.
    let (_, entry) = log_entry("level=warn line=42").finish()?;
    println!("{entry:?}");
    assert_eq!(
        entry,
        LogEntry {
            level: Level::Warn,
            line: 42
        }
    );
    Ok(())
}
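To make the "parse error rather than a panic" behaviour concrete, here is a plain-Rust sketch of the idea behind wrapping str::parse in map_res. It does not use nom and is not how nom implements map_res; digits_to_u32 and NumberError are hypothetical names used only for this illustration.

```rust
use std::num::ParseIntError;

/// Hypothetical error type for this sketch; nom uses its own error types.
#[derive(Debug, PartialEq)]
enum NumberError {
    /// The input did not start with an ASCII digit.
    NoDigits,
    /// Digits were present but could not be converted to the target type.
    Conversion(ParseIntError),
}

/// Splits off the leading ASCII digits and converts them to `u32`,
/// returning the unconsumed tail alongside the value.
fn digits_to_u32(input: &str) -> Result<(&str, u32), NumberError> {
    // Index of the first non-digit, or the whole input if it is all digits.
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(NumberError::NoDigits);
    }
    let (digits, rest) = input.split_at(end);
    // A failed conversion becomes an error value instead of a panic.
    let value = digits.parse::<u32>().map_err(NumberError::Conversion)?;
    Ok((rest, value))
}

fn main() {
    assert_eq!(digits_to_u32("42 rest"), Ok((" rest", 42)));
    assert_eq!(digits_to_u32("x42"), Err(NumberError::NoDigits));
    // 4294967296 is u32::MAX + 1, so the conversion fails.
    assert!(matches!(
        digits_to_u32("4294967296"),
        Err(NumberError::Conversion(_))
    ));
}
```

The overflow case is the one that matters here: matching digits always succeeds on a long run of digits, so the conversion step is the only place an oversized number can be rejected.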
Decode a hex color
A second nom parser, this one producing byte values: it decodes an
HTML-style #1b2a3c color literal into its red, green and blue components.
take_while_m_n consumes between m and n characters matching a predicate;
with both bounds set to 2 it takes exactly two hex digits. map_res converts
each pair into a u8, failing the parse instead of panicking on invalid input.
A tuple of parsers, (hex_byte, hex_byte, hex_byte), is itself a parser that
runs each in sequence, and preceded discards the leading #. Finish turns the
result into a plain Result once parsing is complete.
use std::error::Error;
use nom::bytes::complete::take_while_m_n;
use nom::character::complete::char;
use nom::combinator::map_res;
use nom::sequence::preceded;
use nom::{Finish, IResult, Parser};
/// An HTML-style `#1b2a3c` color literal decoded into its red, green and
/// blue components.
#[derive(Debug, PartialEq)]
struct Color {
    red: u8,
    green: u8,
    blue: u8,
}
/// `take_while_m_n` consumes between `m` and `n` characters matching a
/// predicate — here exactly two hex digits — and `map_res` turns them into
/// a byte, failing the parse instead of panicking on bad input.
fn hex_byte(input: &str) -> IResult<&str, u8> {
    map_res(
        take_while_m_n(2, 2, |c: char| c.is_ascii_hexdigit()),
        |hex| u8::from_str_radix(hex, 16),
    )
    .parse(input)
}
/// A leading `#` followed by three hex bytes.
fn color(input: &str) -> IResult<&str, Color> {
    // A tuple of parsers is itself a parser that runs each in sequence.
    let (input, (red, green, blue)) =
        preceded(char('#'), (hex_byte, hex_byte, hex_byte)).parse(input)?;
    Ok((input, Color { red, green, blue }))
}
fn main() -> Result<(), Box<dyn Error>> {
    // `finish` (from the `Finish` trait) converts the `IResult` into a plain
    // `Result`; `?` then propagates a parse error instead of panicking.
    let (_, parsed) = color("#1b2a3c").finish()?;
    println!("{parsed:?}");
    assert_eq!(
        parsed,
        Color {
            red: 0x1b,
            green: 0x2a,
            blue: 0x3c
        }
    );
    Ok(())
}
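The same decoding steps can be sketched in plain Rust without nom, which also shows why the hex-digit predicate is worth keeping in front of the conversion: u8::from_str_radix on its own accepts a leading + sign. This is an illustrative sketch only; hex_pair and decode are hypothetical helpers, and like the recipe above they ignore any input after the third byte.

```rust
/// Hypothetical helper: decodes the first two characters of `input` as one
/// hex byte and returns the unconsumed tail with it.
fn hex_pair(input: &str) -> Option<(&str, u8)> {
    // `get` returns `None` if the input is shorter than two bytes or the cut
    // would split a multi-byte character.
    let pair = input.get(..2)?;
    // Mirror the predicate: only hex digits may reach the conversion.
    if !pair.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let value = u8::from_str_radix(pair, 16).ok()?;
    Some((&input[2..], value))
}

/// Hypothetical helper: a leading `#` followed by three hex bytes.
fn decode(input: &str) -> Option<(u8, u8, u8)> {
    let rest = input.strip_prefix('#')?;
    let (rest, red) = hex_pair(rest)?;
    let (rest, green) = hex_pair(rest)?;
    let (_rest, blue) = hex_pair(rest)?;
    Some((red, green, blue))
}

fn main() {
    assert_eq!(decode("#1b2a3c"), Some((0x1b, 0x2a, 0x3c)));
    assert_eq!(decode("1b2a3c"), None); // missing '#'
    assert_eq!(decode("#1b2a"), None); // too short
    assert_eq!(decode("#1b2a3g"), None); // 'g' is not a hex digit
    // The conversion alone would accept a sign, so the digit check matters.
    assert_eq!(u8::from_str_radix("+f", 16), Ok(15));
    assert_eq!(decode("#+f2a3c"), None);
}
```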