David Holroyd
@dholroyd
would like to express, for example,
Char       ::=       #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
...and variants on that
I think I could copy'n'paste my own parser based on an example from nom::character::complete::, but maybe I've missed an easier way! :)
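For reference, the `Char` production quoted above can be written as a plain predicate over `char`. This is a minimal sketch using only the standard library; the name `is_xml_char` is made up for illustration. A predicate like this could then presumably be handed to a predicate-taking combinator such as `take_while1`, rather than writing a new parser from scratch.

```rust
/// Returns true if `c` falls in one of the ranges of the `Char` production:
/// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
/// (`is_xml_char` is a hypothetical helper name, not part of nom.)
fn is_xml_char(c: char) -> bool {
    matches!(
        c,
        '\u{9}' | '\u{A}' | '\u{D}'
            | '\u{20}'..='\u{D7FF}'
            | '\u{E000}'..='\u{FFFD}'
            | '\u{10000}'..='\u{10FFFF}'
    )
}
```

Rust's `char` can never hold a surrogate code point, so the surrogate block is excluded by the type itself; the ranges only need to rule out the control characters, FFFE, and FFFF.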
jtenner
@jtenner
Seems like the docs could use some work
@RadicalZephyr showed me a way to do productions yesterday.
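use nom::error::ParseError;
use nom::IResult;

// Try each alternative in turn on the same input; the first one that succeeds wins.
fn production_something<'a, E: ParseError<&'a str>>(input: &'a str) -> IResult<&'a str, ReturnType, E> {
  try_parse_something::<'a, E>(input)
    .or_else(|_| try_parse_something_else::<'a, E>(input))
    .or_else(|_| try_parse_this_thing::<'a, E>(input))
}

// One alternative. `ReturnType` and `value` are placeholders for your own output type and parsed value.
fn try_parse_something<'a, E: ParseError<&'a str>>(input: &'a str) -> IResult<&'a str, ReturnType, E> {
  Ok((input, value))
}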
jtenner
@jtenner
@dholroyd hope that helps
David Holroyd
@dholroyd
not quite the problem I was chasing, but thank you!
I just wrote this,
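use nom::error::{ErrorKind, ParseError};
use nom::{AsChar, IResult, InputIter, Slice};
use std::ops::{RangeBounds, RangeFrom};

pub fn char_in<I, R, Error: ParseError<I>>(range: R) -> impl Fn(I) -> IResult<I, char, Error>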
    where
        I: Slice<RangeFrom<usize>> + InputIter,
        <I as InputIter>::Item: AsChar + Copy,
        R: RangeBounds<char>,
{
    move |i: I| match (i).iter_elements().next().map(|c| (c, range.contains(&c.as_char()))) {
        Some((c, true)) => Ok((i.slice(c.len()..), c.as_char())),
        _ => Err(nom::Err::Error(Error::from_error_kind(i, ErrorKind::OneOf))),
    }
}
type-checks, but not actually tested :)
...I was wondering if this is already built into nom, if you know where to look :)
jtenner
@jtenner
the or_else() chains were the production values.
I come from PEG grammars, so the idea of parsing a production vs. parsing a production instance was new to me
David Holroyd
@dholroyd
So, I think the function I quoted above works (although I'd still be happy to hear if there's an inbuilt thing for this), however it raises another question for me,
I can use it for example like this,
    let res = many1(alt((
        char_in('\u{20}'..='\u{D7FF}'),
        char_in('\u{E000}'..='\u{FFFD}'),
    )))(input);
...and that produces a result that's a Vec<char>
jtenner
@jtenner
I bet if you filed a pull request for a char range function you could definitely get some attention :)
David Holroyd
@dholroyd
is there an idiom in nom that would directly produce a &str / String when making parsers in terms of char?
can certainly convert (String::from_iter(v) was the first thing that came to hand), but again, maybe there's a better way! :)
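One way to avoid the `Vec<char>` round trip is to never split the input into individual chars at all: find where the matching prefix ends and return that prefix as a slice. The sketch below shows the idea with the standard library only; `take_prefix_while1` is a hypothetical helper written for illustration, and it only mirrors what slice-returning combinators such as `take_while1` or `recognize` are understood to do.

```rust
/// Splits `input` into `(remaining, matched)`, where `matched` is the longest
/// prefix whose chars all satisfy `pred`. Returns `None` if that prefix is empty.
/// (Hypothetical helper for illustration; not a nom function.)
fn take_prefix_while1<'a>(
    input: &'a str,
    pred: impl Fn(char) -> bool,
) -> Option<(&'a str, &'a str)> {
    // Byte offset of the first char that fails the predicate, or the end of input.
    let end = input
        .char_indices()
        .find(|&(_, c)| !pred(c))
        .map(|(idx, _)| idx)
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}
```

Because the result borrows from the input, no allocation or `String::from_iter` conversion is needed.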
Denis Lisov
@tanriol
Have you seen is_a?
David Holroyd
@dholroyd
I had, but I thought it would be inconvenient to specify large lists of characters
oh, is there an impl of FindToken for Range, or something?
well, no -- but maybe I could make one
thanks @tanriol that could well be better, I will give it a go!
Alberto
@0X1A
Does nom have any way of doing something like a peek but in reverse?
Christoph Hegemann
@kritzcreek
How would I go about wrapping an existing decoder built using io::Read in a nom parser? In particular I'm looking at https://docs.rs/leb128/0.2.4/leb128/
Would I have to reimplement the decoder to walk over slices instead?
Denis Lisov
@tanriol
You can create a Cursor on the slice and use it as a Read
Christoph Hegemann
@kritzcreek
@tanriol Does that also tell me how many bytes end up being read so I can advance the input?
ahh .position() looks like what I was looking for. Thank you!
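The `Cursor` plus `.position()` approach can be sketched as follows. This uses only the standard library; `read_unsigned_leb128` is a hand-written stand-in for a decoder that only accepts `io::Read` (it is not the leb128 crate's API), and `leb128_from_slice` is a hypothetical wrapper name.

```rust
use std::io::{self, Cursor, Read};

/// Stand-in for an existing decoder that only accepts `io::Read`.
/// (Written for this sketch; not the leb128 crate's implementation.)
fn read_unsigned_leb128<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let mut byte = [0u8; 1];
        reader.read_exact(&mut byte)?;
        if shift >= 64 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "LEB128 value overflows u64",
            ));
        }
        // The low 7 bits carry data; the high bit says whether more bytes follow.
        result |= u64::from(byte[0] & 0x7F) << shift;
        if byte[0] & 0x80 == 0 {
            return Ok(result);
        }
        shift += 7;
    }
}

/// Runs the reader-based decoder over a slice and returns `(remaining, value)`,
/// using the cursor position to work out how many bytes were consumed.
fn leb128_from_slice(input: &[u8]) -> io::Result<(&[u8], u64)> {
    let mut cursor = Cursor::new(input);
    let value = read_unsigned_leb128(&mut cursor)?;
    let consumed = cursor.position() as usize;
    Ok((&input[consumed..], value))
}
```

To use this inside a nom parser, the `io::Error` would still need to be mapped to a nom error by hand; that step is left out here.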
thealchemist17
@thealchemist17
// *****parser*****
// It recognizes *instructions* like "box x y".
// In this case, x and y are variables.
// I need to convert the variables to f32 if they were previously declared.
// The MVC side then only has to call DrawShapeWf32(shape, val1, val2), where val1 and val2 are f32!
// I need a HashMap inside the parser to keep track of all variables in the context.
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::character::complete::{alpha1, char, multispace0, space0};
use nom::combinator::map;
use nom::error::VerboseError;
use nom::multi::{fold_many0, many0};
use nom::number::complete::float;
use nom::sequence::{delimited, pair, preceded, terminated, tuple};
use nom::IResult;
use std::collections::HashMap;

// it recognizes a variable name like "x", "y", "xy", "myVariablE"
fn variable_parser(input: &str) -> IResult<&str, &str, VerboseError<&str>> {
    map(alpha1, |x: &str| x)(input)
}

// it recognizes pattern **x: expr**
fn declare_variable_parser(input: &str) -> IResult<&str, Command, VerboseError<&str>> {
    map(
        tuple((variable_parser, tag(":"), space0, expr)),
        |(name, _, _, value)| Command::DeclareVariable((name.to_string(), value)),
    )(input)
}

// it recognizes pattern **box**
fn declare_box(input: &str) -> IResult<&str, Command, VerboseError<&str>> {
    map(tag("box"), |shape: &str| {
        Command::DrawShape(shape.to_string())
    })(input)
}
// *****box variable*****
// So here, if we find *****box variable*****, we have to look up the value of the variable in the HashMap and set the command
// Command::DrawShapeWf32((shape.to_string(), val1, val2))
fn declare_box_with_var_or_f32_var_or_var_f32_or_var_var_or_f32_f32(
    input: &str,
) -> IResult<&str, Command, VerboseError<&str>> {
    alt((
        map(
            tuple((tag("box"), space0, expr, space0, expr)),
            |(shape, _, val1, _, val2): (&str, _, _, _, _)| {
                Command::DrawShapeWf32((shape.to_string(), val1, val2))
            },
        ),
        map(
            tuple((tag("box"), space0, expr)),
            |(x, _, value): (&str, _, f32)| Command::DrawShapeWf32((x.to_string(), value, value)),
        ),
    ))(input)
}

// We parse any expr surrounded by parens, ignoring all whitespaces around those
fn parens(i: &str) -> IResult<&str, f32, VerboseError<&str>> {
    delimited(space0, delimited(tag("("), expr, tag(")")), space0)(i)
}

fn factor(i: &str) -> IResult<&str, f32, VerboseError<&str>> {
    alt((map(delimited(space0, float, space0), |x| x), parens))(i)
}

fn term(i: &str) -> IResult<&str, f32, VerboseError<&str>> {
    let (i, init) = factor(i)?;
    fold_many0(
        pair(alt((char('*'), char('/'))), factor),
        init,
        |acc, (op, val): (char, f32)| {
            if op == '*' {
                acc * val
            } else {
                acc / val
            }
        },
    )(i)
}

pub fn expr(i: &str) -> IResult<&str, f32, VerboseError<&str>> {
    let (i, init) = term(i)?;

    fold_many0(
        pair(alt((char('+'), char('-'))), term),
        init,
        |acc, (op, val): (char, f32)| {
            if op == '+' {
                acc + val
            } else {
                acc - val
            }
        },
    )(i)
}

pub fn parser(input: &str) -> IResult<&str, Vec<Command>, VerboseError<&str>> {
    many0(terminated(
        alt((
            preceded(
                multispace0,
                declare_box_with_var_or_f32_var_or_var_f32_or_var_var_or_f32_f32,
            ),
            preceded(multispace0, declare_variable_parser),
            preceded(multispace0, declare_box),
        )),
        multispace0,
    ))(input)
}

#[derive(Debug, PartialEq, Clone)]
pub enum Command {
    // x: f32
    DeclareVariable((String, f32)),
    // box | circle
    DrawShape(String),
    // box var var
    DrawShapeWf32((String, f32, f32)),
}
Can you help me? Thanks.
I've described my problem in the comments.
Denis Lisov
@tanriol
@thealchemist17 What is the problem with this code? I'm not seeing any kind of "this is what does not work" in there...
thealchemist17
@thealchemist17
@tanriol The code is working, but I'd also like to know how to return (from the parser function) a HashMap<String, f32> that stores all the previously declared variables
Denis Lisov
@tanriol
Well, you return a Vec<Command>. Why not just take all the DeclareVariable commands and collect the variables into a HashMap?
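A minimal sketch of that suggestion, assuming the `Command` enum from the code above (repeated here so the block stands alone); `collect_variables` is a hypothetical helper name.

```rust
use std::collections::HashMap;

// Same shape as the Command enum posted earlier in the conversation.
#[derive(Debug, PartialEq, Clone)]
pub enum Command {
    DeclareVariable((String, f32)),
    DrawShape(String),
    DrawShapeWf32((String, f32, f32)),
}

/// Builds a name -> value table from the DeclareVariable commands.
/// If a name is declared more than once, the last declaration wins.
pub fn collect_variables(commands: &[Command]) -> HashMap<String, f32> {
    commands
        .iter()
        .filter_map(|command| match command {
            Command::DeclareVariable((name, value)) => Some((name.clone(), *value)),
            _ => None,
        })
        .collect()
}
```

The top-level parser function could call this on its own output and return both the commands and the table together, for example as a `(Vec<Command>, HashMap<String, f32>)` pair.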
thealchemist17
@thealchemist17
Yes, but I need that HashMap to be returned from the parser function.
David Holroyd
@dholroyd
is it necessary / useful / required to left-factor parsers built from nom combinators?
I mean, I can profile when the parser is done to find out if it is necessary; maybe a better question is, does nom otherwise do anything to save repeated work?
Denis Lisov
@tanriol
@thealchemist17 Why not do the same transformation in the parser function?
@dholroyd If you mean "does nom somehow magically cache the results of the parsers that multiple branches start with", no, it does not.
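Since nothing is cached, alternatives that share a prefix re-parse that prefix on every attempt. Left-factoring by hand means parsing the shared part once and branching only on what differs. The sketch below shows the idea with plain functions and a simplified, hypothetical grammar ("box", "box N", "box N M" with unsigned integers); none of these names come from nom.

```rust
/// Shapes for the simplified grammar used in this sketch.
#[derive(Debug, PartialEq)]
enum Shape {
    Plain,
    Sized(u32, u32),
}

/// Strips `prefix` from the front of `input` if it is present.
fn keyword<'a>(prefix: &str, input: &'a str) -> Option<&'a str> {
    input.strip_prefix(prefix)
}

/// Parses optional spaces followed by a decimal number.
fn spaced_number(input: &str) -> Option<(&str, u32)> {
    let trimmed = input.trim_start_matches(' ');
    let end = trimmed
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(trimmed.len());
    let value = trimmed[..end].parse().ok()?;
    Some((&trimmed[end..], value))
}

/// Left-factored parser: "box" and the first number are each parsed once,
/// and the branches differ only in what may follow.
fn shape(input: &str) -> Option<(&str, Shape)> {
    let rest = keyword("box", input)?;
    match spaced_number(rest) {
        None => Some((rest, Shape::Plain)),
        Some((rest, first)) => match spaced_number(rest) {
            Some((rest, second)) => Some((rest, Shape::Sized(first, second))),
            None => Some((rest, Shape::Sized(first, first))),
        },
    }
}
```

Whether this is worth doing in a real parser is a profiling question, as noted above; for short shared prefixes the repeated work may be negligible.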
David Holroyd
@dholroyd
Thanks!
gaowanqiu
@meritozh
Hi all, I have a problem that I posted on Stack Overflow: https://stackoverflow.com/questions/57818920/combinator-return-result-only-if-all-child-parser-success Can anyone help me? Thanks.
Denis Lisov
@tanriol
@meritozh Do you need specifically these errors and not others?
Denis Lisov
@tanriol
Why do you need these to work exactly as specified? Is it for error reporting or just for the following parsers to work correctly?
gaowanqiu
@meritozh
@tanriol The heading is a composed parser, and if it fails, the input will be handled by the raw_text parser. So heading must not consume any input if it fails. I added a demo; you can try it.
Denis Lisov
@tanriol
It does not :-)
gaowanqiu
@meritozh
Sorry, I don't understand. What does not?
Denis Lisov
@tanriol
You have a slightly wrong mental model. nom parsers do not consume any input on their own, and the parsers for multiple alternatives do not take into account what the failed parsers consumed before they failed.
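The point can be illustrated without nom at all. In the sketch below, parsers are plain functions that take an input slice and return the remaining slice on success; all names (`ParseResult`, `literal`, `ab_then_cd`, `raw_text`, `either`) are made up for this example. A parser that fails halfway leaves the caller's `input` untouched, so the fallback simply starts from the same slice.

```rust
/// Result shape modelled on the nom convention: Ok((remaining, output)) or an error.
type ParseResult<'a, T> = Result<(&'a str, T), String>;

/// Matches a literal prefix and returns it together with the rest of the input.
fn literal<'a>(expected: &'static str, input: &'a str) -> ParseResult<'a, &'a str> {
    if input.starts_with(expected) {
        Ok((&input[expected.len()..], &input[..expected.len()]))
    } else {
        Err(format!("expected {:?}", expected))
    }
}

/// Parses "ab" followed by "cd". On "abXY" it fails only after "ab" has matched.
fn ab_then_cd(input: &str) -> ParseResult<'_, &str> {
    let (rest, _) = literal("ab", input)?;
    let (rest, _) = literal("cd", rest)?;
    Ok((rest, &input[..4]))
}

/// Fallback that accepts everything as raw text.
fn raw_text(input: &str) -> ParseResult<'_, &str> {
    Ok(("", input))
}

/// Tries `first`; if it fails, runs `second` on the same original input.
fn either<'a, T>(
    input: &'a str,
    first: impl Fn(&'a str) -> ParseResult<'a, T>,
    second: impl Fn(&'a str) -> ParseResult<'a, T>,
) -> ParseResult<'a, T> {
    first(input).or_else(|_| second(input))
}
```

With input "abXY", `ab_then_cd` gets past "ab" before failing, yet `raw_text` still receives the whole "abXY", because the partially advanced slice only ever existed inside the failed call.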
gaowanqiu
@meritozh
You are right, I misunderstood how the "remains" string slice works. :-(
Denis Lisov
@tanriol
I'd probably write the parser either like this:
fn heading(input: &str) -> IResult<&str, (usize, &str)> {
    map(
        tuple((
            take_while_m_n(1, 6, |c| c == '#'),
            space1,
            verify(not_line_ending, |text: &str| !text.is_empty())
        )),
        |(hashes, _, text): (_, &str, _)| (hashes.len(), text),
    )(input)
}