Ghost
@ghost~56d3703de610378809c41a40

Can someone help me out a little bit? I am trying to create a parser that can read multi-line strings.

somekey = some value with spaces but \
     all white-space after the backslash is ignored until the first non-whitespace character

The parser should return some value with spaces but all white-space after the backslash is ignored until the first non-whitespace character. The key-value thing is not the problem and it works fine with one-line values but I struggle to somehow make the backslash parser.

The current parser I use is this one:

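// Imports assumed from the names used below (paths as in nom 5 through 7).
use nom::{
    bytes::complete::tag,
    character::complete::{alphanumeric0, multispace0, not_line_ending},
    error::ParseError,
    sequence::{delimited, preceded, separated_pair},
    AsChar, IResult, InputTakeAtPosition,
};

fn ws<I, O, E: ParseError<I>, F>(parser: F) -> impl FnMut(I) -> IResult<I, O, E>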
where
    I: InputTakeAtPosition,
    <I as InputTakeAtPosition>::Item: AsChar + Clone,
    F: FnMut(I) -> IResult<I, O, E>,
{
    delimited(multispace0, parser, multispace0)
}

fn parse_entry(i: &str) -> IResult<&str, (&str, &str)> {
    separated_pair(
        preceded(multispace0, alphanumeric0),
        ws(tag("=")),
        preceded(multispace0, not_line_ending),
    )(i)
}

Can anyone help me out with this? The output would have to have the single backslash removed, so I guess zero-copy (&str) isn't generally possible as a return type with this kind of problem, is it?

Denis Lisov
@tanriol
Correct, zero-copy won't work in the general case.
I'd probably go for separated_list alternating a slice of non-newline non-backslash and backslash followed by whitespace.
And then concatenate them (possibly into Cow)
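For illustration, the "concatenate into Cow" step could look roughly like the sketch below. It is plain std Rust rather than nom combinators, and `join_continuations` is a hypothetical helper name. It assumes every backslash in the value is a continuation marker, which may be too aggressive if values can contain literal backslashes.

```rust
use std::borrow::Cow;

/// Joins backslash-continued lines: each backslash and all whitespace
/// after it (including the newline) is dropped.
/// Assumption: every backslash marks a continuation.
fn join_continuations(value: &str) -> Cow<'_, str> {
    // Fast path: nothing to remove, so borrow the input (zero-copy).
    if !value.contains('\\') {
        return Cow::Borrowed(value);
    }
    let mut out = String::with_capacity(value.len());
    let mut rest = value;
    while let Some(pos) = rest.find('\\') {
        // Keep everything before the backslash.
        out.push_str(&rest[..pos]);
        // Skip the backslash, then all whitespace up to the next
        // non-whitespace character.
        rest = rest[pos + 1..].trim_start();
    }
    out.push_str(rest);
    Cow::Owned(out)
}
```

One-line values stay borrowed, so only multi-line values pay for an allocation.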
Ghost
@ghost~56d3703de610378809c41a40
Thanks @tanriol! After reading the doc of separated_list, would that work in the case where I can have both types of values in any order, i.e. values can be either one-line or multi-line? It seems that the separated_list parser alternates, meaning that multiple multi-line values wouldn't work in this case, would they?
key1 = line value
key2 = multi-line \
    line \
   line
key3 = line value again
key4 = again, line value
Denis Lisov
@tanriol
Why does it matter if you use separated_list(...) in place of not_line_ending in parse_entry?
Ghost
@ghost~56d3703de610378809c41a40
ohhh I would have used it somewhere else, that is totally true!
thanks
John Barker
@j16r
is there something like this https://github.com/Geal/nom/blob/master/examples/string.rs more generally available so I don't have to copy and paste a bunch of code?
Tom Alexander
@tomalexander
Hey I've updated my long-dormant PR and I was hoping I could get some eyes on it. No rush, but considering how long it was dormant I wouldn't be surprised if my updates are completely unnoticed: Geal/nom#469
zserik
@zserik
is there any documentation available regarding how streaming parsers differentiate between Incomplete (try to fetch more data) and EOF (end-of-file condition; reentry after Incomplete, but no more data will ever become available for this particular byte stream, which is important in case the EOF would finish some element which would otherwise continue)? and is there documentation available on how to integrate streaming parsers with AsyncRead byte streams?
Denis Lisov
@tanriol
@zserik AFAIK, they do not differentiate, you build the async integration yourself.
zserik
@zserik
ok, but how are cases handled where both partial input (Incomplete) and full input (EOF at reading) are possible alternatives and might yield different results? how should that information be propagated through the parser?
Denis Lisov
@tanriol
Do you mean that the final parser can be something like "take either N bytes or until EOF, whichever happens first"?
zserik
@zserik
yes, or something like "take until EOF or whitespace, whichever happens first"; basically anything of the form "take until EOF or $condition"
Denis Lisov
@tanriol
AFAIK, no special support for that at the moment. You may want to pass the EOF flag to the final parser explicitly.
zserik
@zserik
is there some way to avoid a part of the code duplication for handling such a flag (e.g. at least at some levels it would include a switch from a streaming parser to a complete parser, I suppose)?
Denis Lisov
@tanriol
That's why I'm talking about a flag and not about duplicating every parser :-)
zserik
@zserik
is there some best practice on how to implement such a flag?
Denis Lisov
@tanriol
No idea, I haven't dealt with this yet.
I'm not sure whether it's possible to reproduce nom 4's approach of having an input type wrapper for that.
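As a rough idea of what passing the EOF flag explicitly could mean, here is a minimal std-only sketch of "take until EOF or whitespace". `Step`, `take_word` and `at_eof` are hypothetical names, not nom API; `Step::NeedMore` only stands in for nom's Incomplete.

```rust
/// Outcome of one parse attempt (hypothetical stand-in for the
/// Ok / Incomplete distinction of a streaming parser).
#[derive(Debug, PartialEq)]
enum Step<'a> {
    /// A finished token and the remaining input.
    Done(&'a str, &'a str),
    /// The token might continue in data that has not arrived yet.
    NeedMore,
}

/// Takes characters until the first whitespace, or until the end of
/// input if the caller says no more data will ever arrive.
fn take_word(input: &str, at_eof: bool) -> Step<'_> {
    match input.find(char::is_whitespace) {
        // Whitespace found: the token is complete either way.
        Some(end) => Step::Done(&input[..end], &input[end..]),
        // No whitespace, but the stream is finished: EOF ends the token.
        None if at_eof => Step::Done(input, ""),
        // No whitespace and more data may come: ask for more.
        None => Step::NeedMore,
    }
}
```

Only the final parser needs to look at the flag; the code that reads the byte stream sets it once the read returns zero bytes.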
Elliot Stern
@PipocaQuemada

I'm trying to write a scheme parser.

flat_map(tag("("), |_| {
  flat_map(listContents, |l| {
    map(tag(")"), |_| l)})})(i)

fails with

error[E0507]: cannot move out of `l`, a captured variable in an `Fn` closure
  --> src/parser.rs:80:31
   |
79 |           flat_map(listContents, |l| {
   |                                   - captured outer variable
80 |             map(tag(")"), |_| l)})})(i)
   |                               ^ move occurs because `l` has type `ast::LispVal`, which does not implement the `Copy` trait

I've tried map(tag(")"), move |_| l) and flat_map(listContents, move |l| {, but I can't seem to figure out how to fix that error.

zserik
@zserik
wouldn't delimited be more appropriate there?
this error occurs because flat_map expects an Fn, which isn't allowed to move out of outer variables.
Elliot Stern
@PipocaQuemada
Hmm. Looks like I could use that, there, but that doesn't really help me figure out how to use flat_map to parse e.g. dotted lists (i.e. (foo bar . baz))
gamma-delta
@gamma-delta
hello!
i'm trying to use nom to parse meshes i write in a text file, like really really simple svg sort of
how do I parse an f32?
as in, is there a function that will consume characters until it finishes with the f32?
hm that was easy
zserik
@zserik
@PipocaQuemada |_| l.clone() maybe?
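To show why the clone helps, here is a small std-only sketch. `call_twice` is a hypothetical stand-in for any combinator that takes an `Fn` closure; it is not part of nom.

```rust
/// Hypothetical stand-in for a combinator that requires an `Fn` closure,
/// i.e. one that may be called any number of times.
fn call_twice<T, F: Fn() -> T>(f: F) -> (T, T) {
    (f(), f())
}

fn demo() -> (String, String) {
    let l = String::from("(foo bar)");
    // Writing `move || l` here would move `l` out of the closure on the
    // first call, which an `Fn` closure may not do (error E0507).
    // Returning a clone leaves the captured value in place for later calls.
    call_twice(move || l.clone())
}
```

The cost is one clone per call, which is usually acceptable for small AST nodes.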
gamma-delta
@gamma-delta
OK I have a legitimate issue this time
I would like to parse stuff like this: 1, 3
so, f32, maybe whitespaces, comma, maybe whitespaces, and another f32
/// Parses a Point.
fn point(input: &str) -> IResult<&str, Point> {
    let (input, x) = recognize_float(input)?;
    // allow spaces between number & comma
    let (input, _) = take_while(is_space)(input)?;
    // the comma
    let (input, _) = char(',')(input)?;
    // more maybe whitespace
    let (input, _) = take_while(is_space)(input)?;
    // And the y
    let (input, y) = recognize_float(input)?;

    // Return OK!
    Ok((x, y))
}
but, it looks like take_while doesn't work on characters or something?
error[E0271]: type mismatch resolving `<&str as nom::traits::InputTakeAtPosition>::Item == u8`
  --> trident\src\parse.rs:17:22
   |
17 |     let (input, _) = take_while(is_space)(input)?;
   |                      ^^^^^^^^^^ expected `char`, found `u8`

error[E0271]: type mismatch resolving `<&str as nom::traits::InputTakeAtPosition>::Item == u8`
  --> trident\src\parse.rs:21:22
   |
21 |     let (input, _) = take_while(is_space)(input)?;
   |                      ^^^^^^^^^^ expected `char`, found `u8`

error[E0308]: mismatched types
  --> trident\src\parse.rs:26:12
   |
26 |     Ok((x, y))
   |            ^ expected tuple, found `&str`
   |
   = note:  expected tuple `(f32, f32)`
           found reference `&str`
i don't understand what is wrong
please help?
zserik
@zserik
maybe take_while(|x| x.is_ascii_whitespace()) or something like that
and you probably need to change Ok((x,y)) to Ok(Point(x,y)) or something like that (I don't know your type definition of Point...)...
gamma-delta
@gamma-delta
It's pub type Point = (f32, f32) dw :)
zserik
@zserik
hm
gamma-delta
@gamma-delta
is_a(" \t")(input)?; seems to work
But now, it thinks y is a (f32, f32)?
oh
i messed up the return
How do you convert some other kind of error into an IResult?
specifically the error from .parse()
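A std-only sketch of the general idea: wrap the `.parse()` error in your own error type so that `?` can convert it. `PointError` is a hypothetical type and this does not use nom; inside nom, the `map_res` combinator is the usual way to run a fallible function such as `.parse()` on a parser's output.

```rust
use std::num::ParseFloatError;

/// Hypothetical error type for this sketch; nom has its own error types.
#[derive(Debug)]
enum PointError {
    MissingComma,
    BadNumber(ParseFloatError),
}

// Lets `?` turn a `ParseFloatError` into a `PointError` automatically.
impl From<ParseFloatError> for PointError {
    fn from(e: ParseFloatError) -> Self {
        PointError::BadNumber(e)
    }
}

type Point = (f32, f32);

/// Parses "x, y" with optional whitespace around the comma.
fn point(input: &str) -> Result<Point, PointError> {
    // Split at the first comma; fail if there is none.
    let (x, y) = input.split_once(',').ok_or(PointError::MissingComma)?;
    // `trim` drops the optional whitespace; `?` converts parse errors.
    Ok((x.trim().parse::<f32>()?, y.trim().parse::<f32>()?))
}
```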