dangrabcad
@dangrabcad
@cakraww_twitter Does separated_list do what you want?
Wisnu
@cakraww_twitter

Sorry, actually what I want to do is to parse this:

-------
heading
-------
content1

---------------
another heading
---------------
content2
----
this is still content.
because it spans on more than one line.
----

---------------
this is heading
---------------
content3

and the expected result is:

[ Block { title: "heading", content: "content1" }
, Block { title: "another heading", content: "content2\n----\nthis is still content.\nbecause it spans on more than one line.\n----" }
, Block { title: "this is heading", content: "content3" }
]

What I have tried is: many1( tuple((parse_heading, parse_content)) ). Ideally parse_content would be something like "take input as long as parse_heading doesn't match, OR until the end of the string is reached", but I don't know how to do that in nom. There is combinator::not, but it returns unit (). Any ideas?

dangrabcad
@dangrabcad
How do you tell the difference between the dashes within a block and the dashes delimiting the block and heading?
Wisnu
@cakraww_twitter
A heading is always exactly one line sandwiched between two dash lines. If there is more than one line inside the sandwich, it's not a heading. In the example above, content2 is not a heading because the dash line before it has already been consumed by another heading.
dangrabcad
@dangrabcad
but the four dashes under "because it spans..." could be the start of the next heading
Wisnu
@cakraww_twitter
yes, if there were a non-empty line right below those four dashes, it would become the heading instead of "this is heading".
anyway I found the solution using regex 😅
something like below but with some modification to capture the heading as well:
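use regex::Regex;
// Split on every "dash line, one heading line, dash line" group; the pieces in between are the contents.
let re = Regex::new(r"-+\n([^\n]+)\n-+").unwrap();
let contents: Vec<_> = re.split(txt).map(|s| s.trim()).collect();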
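For comparison, the heading rule described above (exactly one line sandwiched between two dash lines) can also be sketched with a plain line scanner, using neither regex nor nom. This is only a minimal sketch under a few assumptions of mine: the names is_dash_line, heading_at and parse_blocks are hypothetical, a title may not itself be a dash line, and any text before the first heading is dropped.

```rust
#[derive(Debug, PartialEq)]
struct Block {
    title: String,
    content: String,
}

/// True for a non-empty line made only of '-' characters.
fn is_dash_line(line: &str) -> bool {
    !line.is_empty() && line.chars().all(|c| c == '-')
}

/// True if lines[i], lines[i + 1], lines[i + 2] are: dash line, one title line, dash line.
fn heading_at(lines: &[&str], i: usize) -> bool {
    i + 2 < lines.len()
        && is_dash_line(lines[i])
        && !lines[i + 1].trim().is_empty()
        && !is_dash_line(lines[i + 1])
        && is_dash_line(lines[i + 2])
}

/// Scan line by line; everything between two headings becomes the content
/// of the earlier heading. Text before the first heading is discarded.
fn parse_blocks(text: &str) -> Vec<Block> {
    let lines: Vec<&str> = text.lines().collect();
    let mut blocks: Vec<Block> = Vec::new();
    let mut body: Vec<&str> = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        if heading_at(&lines, i) {
            // Close the previous block (if any) before opening a new one.
            if let Some(last) = blocks.last_mut() {
                last.content = body.join("\n").trim().to_string();
            }
            body.clear();
            blocks.push(Block {
                title: lines[i + 1].to_string(),
                content: String::new(),
            });
            i += 3; // skip dash line, title, dash line
        } else {
            body.push(lines[i]);
            i += 1;
        }
    }
    // Close the final block with whatever content is left.
    if let Some(last) = blocks.last_mut() {
        last.content = body.join("\n").trim().to_string();
    }
    blocks
}
```

Because a dash line only counts as a delimiter when the three-line pattern matches, the "----" lines inside content2 stay part of the content, as in the expected result above.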
Håkon Jordet
@Bunogi
Hi, how would I allow for whitespace using nom 5.0.0? I see that in the past you could use the ws! macro, but it's deprecated
Denis Lisov
@tanriol
By manually inserting the whitespace parser everywhere it's needed :-(
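One way to make the manual approach less repetitive is a small wrapper that skips whitespace around any inner parser. The sketch below is plain Rust, not nom code: the PResult alias and the ws and literal helpers are hypothetical names chosen for illustration. In nom 5 itself, a wrapper of the same shape could presumably be built from sequence::delimited and character::complete::multispace0.

```rust
/// Result of a tiny hand-rolled parser: (remaining input, value) on success.
type PResult<'a, T> = Option<(&'a str, T)>;

/// Wrap `inner` so that whitespace before and after it is skipped.
fn ws<'a, T>(
    inner: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, T> {
    move |input: &'a str| {
        // Skip leading whitespace, run the inner parser, then skip trailing whitespace.
        let (rest, value) = inner(input.trim_start())?;
        Some((rest.trim_start(), value))
    }
}

/// Hypothetical leaf parser: matches the literal `expected` at the start of the input.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| {
        if input.starts_with(expected) {
            Some((&input[expected.len()..], &input[..expected.len()]))
        } else {
            None
        }
    }
}
```

With this, ws(literal(",")) accepts "  ,  x" and leaves "x" as the remaining input, so the whitespace handling is written once rather than at every call site.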
Håkon Jordet
@Bunogi
welp, I'll just suck it up then
or well keep doing it
peckpeck
@peckpeck
@Bunogi take a look at nomplus (https://github.com/peckpeck/nomplus), which I wrote; it has sp! and wsequence! macros that may help you
Håkon Jordet
@Bunogi
oh right, I could have written a macro to do this more easily, but now it's too late I guess. I would also rather not use a crate that's not on crates.io if I can help it
whentze
@whentze
hey, small question
i'm trying to build a parser for a text-based format. one of the basic building blocks of this format is an "operator", which is one of ~50 possible atoms
what's the best way to parse these into a big enum?
the s_expression example uses one_of, but that only works for single chars. what if my operators are things like "add" or "sub"?
the operator names are not all the same length, but none of them are prefixes of other operator names
Jeremy Day
@z2oh
I tend to use alt for this. The unfortunate thing is that it is only implemented up to a fixed number of branches, and I believe this number is less than 50. You can always nest alts though, or try to group your operators by some kind of semantic meaning to make the code less ugly.
whentze
@whentze
but what to use inside the alt?
tag + map?
Jeremy Day
@z2oh
Yes
whentze
@whentze
i see
Jeremy Day
@z2oh
Or, tag with value to avoid having to execute a closure
whentze
@whentze
is the alt! macro also limited in the same way?
oh, tag can take a value directly?
Jeremy Day
@z2oh
I'm referring to the value combinator: https://docs.rs/nom/5.0.0/nom/combinator/fn.value.html
which throws away the parsed result and replaces it with some value (an instance of an Operator enum variant, for example)
whentze
@whentze
oh, i see
that's nice
thank you
i think i can group them somewhat usefully
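To make the "tag plus value" idea concrete, here is a minimal plain-Rust stand-in that does not use nom. Each table row plays the role that value(Operator::Add, tag("add")) would play as one branch of an alt: match a fixed string, discard it, and return a prepared enum value. The Operator variants other than Add and Sub, the OPERATORS table, and parse_operator are hypothetical names for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Add,
    Sub,
    Mul,
    Jump,
}

/// One row per operator: the text to match and the value to return.
/// A table like this has no fixed limit on the number of alternatives.
const OPERATORS: &[(&str, Operator)] = &[
    ("add", Operator::Add),
    ("sub", Operator::Sub),
    ("mul", Operator::Mul),
    ("jump", Operator::Jump),
];

/// Try each operator name in table order; on the first match, return the
/// remaining input together with the corresponding enum value.
fn parse_operator(input: &str) -> Option<(&str, Operator)> {
    OPERATORS
        .iter()
        .find_map(|&(name, op)| input.strip_prefix(name).map(|rest| (rest, op)))
}
```

For example, parse_operator("add r1") yields the remaining input " r1" together with Operator::Add, and an unknown name such as "xor" yields None.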
Jeremy Day
@z2oh
I don't think the alt! macro is limited, since macros can emulate variadic arguments
Here is the relevant issue on the limitations of alt: Geal/nom#994
whentze
@whentze
can i combine macro-based parsers with function-based ones?
Jeremy Day
@z2oh
When grouping operators you need to be careful about ordering though. If you have ">" and ">>" operators for example, you should be careful that the ">>" appears as an earlier alt branch
whentze
@whentze
yeah, but as i said, I'm lucky and there are no operators that are prefixes of others
so i can group freely :)
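The ordering caveat for operators like ">" and ">>" comes from first-match-wins behaviour, which a few lines of plain Rust can illustrate. This is a sketch, not nom code; the Cmp enum, the two tables, and first_match are hypothetical names, and the only difference between the tables is the order of their rows.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cmp {
    Greater,
    ShiftRight,
}

/// ">>" is tried before ">", so the longer operator gets a chance to match.
const LONGEST_FIRST: &[(&str, Cmp)] = &[(">>", Cmp::ShiftRight), (">", Cmp::Greater)];

/// ">" is tried first and also matches the first character of ">>".
const SHORTEST_FIRST: &[(&str, Cmp)] = &[(">", Cmp::Greater), (">>", Cmp::ShiftRight)];

/// Return the first table entry whose text is a prefix of `input`,
/// together with the remaining input.
fn first_match<'a>(table: &[(&str, Cmp)], input: &'a str) -> Option<(&'a str, Cmp)> {
    table
        .iter()
        .find_map(|&(name, op)| input.strip_prefix(name).map(|rest| (rest, op)))
}
```

On the input ">> 1", the longest-first table returns ShiftRight with " 1" remaining, while the shortest-first table returns Greater and leaves "> 1" behind. When no operator is a prefix of another, as in the case discussed here, the order of the alternatives does not change which one matches.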
Jeremy Day
@z2oh
Oh excellent :) I have had some level of success mixing macros and functions but I don't think it is supported
whentze
@whentze
does the limitation about length mentioned in the docs for alt! still apply to the alt function?
i.e. if i have a longer tag first in the list, will it parse too far when matching a later tag?
Denis Lisov
@tanriol
If you're using the complete version of the tag parser, should not be a problem.
whentze
@whentze
ah, right, i am
the macros are always the streaming version, i forgot
whentze
@whentze
are there utilities for parsing decimal numbers in nom?
Denis Lisov
@tanriol
Floating point or specifically decimal?
whentze
@whentze
ascii decimal integers
like "14"
of course std can do this, but then i'd have to wrap that in something to make it work with nom's combinators (right?)
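The wrapping in question can be fairly small. The plain-Rust sketch below, which does not use nom, takes the leading run of ASCII digits, hands it to std's parse, and returns the remaining input next to the number. The function name decimal_u32 and the choice of u32 are assumptions for illustration. In nom 5 the same idea is commonly spelled by feeding the output of character::complete::digit1 through combinator::map_res.

```rust
/// Parse a leading run of ASCII decimal digits as a u32.
/// Returns the remaining input and the parsed number, or None if the input
/// does not start with a digit or the number does not fit in a u32.
fn decimal_u32(input: &str) -> Option<(&str, u32)> {
    // Byte index of the first non-digit character, or the end of the input.
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no digits at all
    }
    let (digits, rest) = input.split_at(end);
    // std does the actual conversion; overflow becomes None.
    digits.parse::<u32>().ok().map(|n| (rest, n))
}
```

For example, decimal_u32("14 apples") returns the number 14 with " apples" as the remaining input.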