awygle
@awygle
still not sure why though
Markus Westerlind
@Marwes
many expects a collection type to collect into, whereas skip_many1 just outputs ()
skip_many is better regardless here since you don't care about the inner parser's output
awygle
@awygle
I do care about the result of the whole function though, which is why I used many1. I assume there's some interaction between recognize and its argument that I'm not properly understanding.
Markus Westerlind
@Marwes
Yes, recognize checks the start to end span of what was parsed and returns a Range of that
So it is great when you need to do some more complicated parsing that still needs to return a Range (&[T]/&str)
In this case you could use take_while instead though
take_while1(|c: char| c.is_alphanumeric())
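
A rough sketch of the two approaches (not from the chat, assuming a &str input, which is a range stream, and combine 4.x naming):

use combine::parser::char::letter;
use combine::parser::range::{recognize, take_while1};
use combine::{skip_many1, Parser};

// Both return the matched slice of the input when the input is a range stream.
let mut via_recognize = recognize(skip_many1(letter()));
let mut via_take_while = take_while1(|c: char| c.is_alphanumeric());

assert_eq!(via_recognize.parse("abc def"), Ok(("abc", " def")));
assert_eq!(via_take_while.parse("abc1 def"), Ok(("abc1", " def")));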
awygle
@awygle
I ended up switching to skip_until(space()) because I needed to accept symbols like -. Thanks for the help!
Michael Jones
@michaeljones
Hi, thanks for this library. It is very impressive. Unfortunately I'm relatively new to Rust and I'm struggling. I'm attempting to write a parser for the Elm language syntax, but it requires tracking indentation, and I can't figure out how to use combine::stream::state::Stream with custom parsers. Probably a beginner issue on my part, but I'd certainly appreciate some documentation around state::Stream.
I also got caught out for a while because I reached for 'take_while' instead of 'many & satisfy' and didn't understand the implications of range vs non-range streams in that choice. I don't know what to do about that to make the onboarding for new users smoother.
Johannes Maas
@Y0hy0h
I am currently using regex, but I noticed that some pieces keep being used and composed into bigger regexes. Notably, I am searching for Docker images (gitlab/gitlab-ce) but I also have a larger pattern part of which is such an image. So I thought that having parsers for each of these and combining them could be useful.
Where I'm unsure is how I can misuse the parser library for such regex-like matching. Essentially, I need a combinator that takes a parser for what I'm looking for and tries it until it matches or reaches the end, without having the failure make it abort or backtrack. What is such a combinator?
Markus Westerlind
@Marwes
Johannes Maas
@Y0hy0h
And how do I continue searching if end fails? attempt will go back to before end started consuming, right? So how can I keep looking until end is successful or I reach the end of file?
Johannes Maas
@Y0hy0h
Oh, I'm realizing in general this is a bad idea, because a legitimate end could begin in the middle of what we read before end failed. E.g., if end matches aab and we parse aaab, then it would consume aa, fail on the next a, then continue at that a, next encounter the b, and still fail, whereas there is an aab in the input...
I think in my case this might not actually happen, but I'll try it with attempt or something.
Out of curiosity, is there a way to continue searching even after a parser has failed?
Johannes Maas
@Y0hy0h
I'm sorry, it looks like what I want is actually the last example given in the skip_until documentation. :see_no_evil:
Thanks a lot for your swift and concise response! :)
Markus Westerlind
@Marwes

Oh, I'm realizing in general this is a bad idea, because a legitimate end could begin in the middle of what we read before end failed. E.g., if end matches aab and we parse aaab, then it would consume aa, fail on the next a, then continue at that a, next encounter the b, and still fail, whereas there is an aab in the input...

Yeah, I was going to mention this problem, though from the angle of skip_until not doing anything clever around that, and therefore it can be quite slow if end can match a long string

It was late though so I didn't have the time
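
A small sketch (not from the chat) of the skip_until-plus-attempt pattern being referenced, assuming combine 4.x; note that skip_until does not consume the end parser's input:

use combine::parser::char::string;
use combine::parser::repeat::skip_until;
use combine::{attempt, Parser};

// Skip tokens until "world" can be parsed; attempt() lets the search resume
// after a partial match instead of failing with a committed error.
let mut skipper = skip_until(attempt(string("world")));
assert_eq!(skipper.parse("wow, world"), Ok(((), "world")));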
JackFly26
@JackFly26
why does combine::parser::char::digit return a Digit instead of an opaque type?
oh no, the last message was July 11
dhgelling
@dhgelling
Hey, I have a simple parser defined, but am wondering what the simplest way to read from a file is. I'm only using parsers from combine::parser::repeat and combine::parser::char
Lloyd
@lloydmeta

I've been using Combine for AoC for a while, and something that still stumps me is how to parse something like a Vec<Vec<A>>, where the individual A in the text to be parsed are separated by newlines, and each Vec<A> is separated by two newlines..

e.g.

abc

a
b
c

ab
ac

a
a
a
a

b

Should be parsed into

vec![
  vec![ "abc"],
  vec![ "a", "b", "c"],
  vec![ "ab", "ac"],
  vec![ "a", "a", "a", "a"],
  vec!["b"]
]

For AoC, since it doesn't really matter, I end up splitting the input by \n\n first and parsing, silently throwing out invalid groups, but this feels ugly and is probably not how Combine is meant to be used, so wondering if someone can help direct me to The Right Way :tm: :)

https://github.com/lloydmeta/aoc2020-rs/blob/227f84d1a412715b6dd67bd84bcb5024bf6d83e1/src/day_06.rs#L75-L89

Markus Westerlind
@Marwes
@lloydmeta Since this involves two tokens ('\n\n') you will want to first look at https://docs.rs/combine/4.4.0/combine/parser/combinator/fn.attempt.html. Then, since you basically want to stop the inner parser, you would look at https://docs.rs/combine/4.4.0/combine/fn.not_followed_by.html (or a plain satisfy() would work here as well)
let person_answers_parser = many::<String, _, _>(letter()).map(PersonAnswers);
let group_people_answers_parser =
    sep_by1(person_answers_parser, (newline(), not_followed_by(newline()))).map(GroupAnswers);
let parser = sep_by1(group_people_answers_parser, (newline(), newline()));
Lloyd
@lloydmeta
aah ok thanks @Marwes will give that a go
dhgelling
@dhgelling
I want to use combine for an advent of code solution, but need mutually recursive parsers. Is there a way to define parsers using functions without everything being cluttered up by where clauses?
I just want to parse from a string anyway
dhgelling
@dhgelling
oh also, I tend to use sep_by1 to parse input separated by newlines, but often there is a final newline at the end of the document. What is the best way to handle that?
ah never mind, I thought sep_end_by would force the separator to be at the end, but turns out it doesn't
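
A small sketch (not from the chat, assuming combine 4.x) of sep_end_by1 tolerating the trailing newline:

use combine::parser::char::{digit, newline};
use combine::{many1, sep_end_by1, Parser};

// sep_end_by1 allows, but does not require, a trailing separator,
// so a final newline at the end of the document is accepted.
let mut lines = sep_end_by1::<Vec<_>, _, _, _>(many1::<String, _, _>(digit()), newline());
assert_eq!(
    lines.parse("1\n22\n333\n"),
    Ok((vec!["1".to_string(), "22".to_string(), "333".to_string()], ""))
);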
Markus Westerlind
@Marwes

I want to use combine for an advent of code solution, but need mutually recursive parsers. Is there a way to define parsers using functions without everything being cluttered up by where clauses?

I suppose you could do it like combine-language and declare a struct parameterized by the input and then declare the required where bound on the impl https://github.com/Marwes/combine-language/blob/873c7f1aa977731a87e29fd8ced8ce48b589dcb1/src/lib.rs#L336-L340

The where clause has to go somewhere though
dhgelling
@dhgelling

Thanks =) I'm trying to use the not_followed_by construct you suggested above, but it's not accepting my input. The parser looks like this:

let rule = some_parser_not_accepting_newline;
let rules = sep_by1(rule, (newline(), not_followed_by(newline())));
let file = (rules.skip((newline(), newline())), string("somestring"));

but it fails on the empty line with the message

Error while parsing input: Parse error at line: 7, column: 1
Unexpected `
`

Some debug prints show that it's failing in the separator of the sep_by1, but I don't know how to fix it

Markus Westerlind
@Marwes
Might need an attempt in there perhaps

let rules = sep_by1(rule, attempt((newline(), not_followed_by(newline()))));
Since the separator ends up committing the first newline
dhgelling
@dhgelling
yeah it works wrapping the separator in attempt(), but I'm not clear on why that's needed. If my separator is string("next") and the next input is new instead, would it consume the first two characters anyway?
hmm yes it seems so, guess in my mind the separator was automatically wrapped in attempt() anyway, since it might fail at the end of the sequence
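
Pulling the thread together, an illustrative end-to-end sketch (not from the chat) of the pattern being discussed, assuming combine 4.x and plain Strings instead of the PersonAnswers/GroupAnswers newtypes from the earlier snippet:

use combine::parser::char::{letter, newline};
use combine::{attempt, many1, not_followed_by, sep_by1, Parser};

// A line is one or more letters. Lines within a group are separated by a
// single newline that is *not* followed by another newline; attempt() makes
// the separator back out cleanly when it actually sees "\n\n".
let line = many1::<String, _, _>(letter());
let group = sep_by1::<Vec<_>, _, _, _>(line, attempt((newline(), not_followed_by(newline()))));
let mut groups = sep_by1::<Vec<_>, _, _, _>(group, (newline(), newline()));

let (parsed, _rest) = groups.parse("abc\n\na\nb\nc\n\nab\nac").unwrap();
assert_eq!(parsed[1], vec!["a".to_string(), "b".to_string(), "c".to_string()]);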
eaglgenes101
@eaglgenes101
Is it okay to reuse combine parsers?
And why do they take &mut self for their methods anyways?
Markus Westerlind
@Marwes

Is it okay to reuse combine parsers?

Yes

And why do they take &mut self for their methods anyways?

It allows them to take FnMut functions so that they can mutate things (say push to a Vec for the many parser)

eaglgenes101
@eaglgenes101
Okay good experience, but trying to make spans work was a pain
I'm just interested in the location of the token where a parser began parsing, and of the token at which it either committed to a success or ran into an error
Seems severely burdensome to have to replace half the types in my function signatures with spanned analogs just to get those
eaglgenes101
@eaglgenes101
(Also, I tried to implement the combinators myself. What's with all the parse_mode stuff in the library combinators, and do I need to concern myself with those?)
marwes
@marwes:matrix.org [m]

Seems severely burdensome to have to replace half the types in my function signatures with spanned analogs just to get those

If you need the default errors to know about spans then I am afraid you need a custom stream that knows about spans (the stream would also tokenize, so I'd expect a custom one would be needed regardless). If you only need the span of a particular parser you can do something like (position(), my_parser, position()) to get the start and end of it
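
A sketch (not from the chat) of the (position(), my_parser, position()) suggestion, assuming combine 4.x and the line/column-tracking stream::position::Stream:

use combine::parser::char::letter;
use combine::stream::position;
use combine::{many1, Parser};

// Wrap a parser with position() on both sides to capture where it started
// and where it stopped consuming input.
let mut word = (combine::position(), many1::<String, _, _>(letter()), combine::position());
let input = position::Stream::new("hello world");
let ((start, text, end), _rest) = word.parse(input).unwrap();
assert_eq!(text, "hello");
// start/end are SourcePosition values (line and column) for this stream type.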