Noah
@NoahTheDuke
i opened a PR for allowing a single infix operator (sequence or choice) to be placed at the start of expressions: pest-parser/pest#553
Noah
@NoahTheDuke
i thought about also allowing them at the end, like a dangling comma, but i want to make sure i understand the process of modifying the base parser first
Noah
@NoahTheDuke
@CAD97 thanks for the feedback!
Colin Basnett
@cmbasnett
Does anyone know how to stop EOI from emitting a token? In my parser I apparently now need to explicitly handle EOI for some reason. Why isn't this a quiet token like SOI?
Noah
@NoahTheDuke
i took a look and couldn't find where it's emitted
Colin Basnett
@cmbasnett
I have EOI in my main rule:
program = { SOI ~ ... ~ EOI }
and it emits the EOI token at the end for some reason
Noah
@NoahTheDuke
right, if you leave it out, it won't be emitted
Colin Basnett
@cmbasnett
alright, easy enough, thanks
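A possible middle ground, sketched with a hypothetical statement* body standing in for the elided rules: a silent wrapper around EOI still anchors the match at the end of input but produces no EOI pair.
// the silent wrapper rule still requires end of input, but emits no pair
program = { SOI ~ statement* ~ eoi }
eoi     = _{ EOI }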
ratsclub
@ratsclub:matrix.org
[m]
Hi, guys! I have a little problem here. For some reason I can parse this 😱 value | value but I can't parse this ❄️ value | value. Do you have any idea why one emoji works and the other doesn't?
link = { dest ~ ("|" ~ alias)? }
dest = { note ~ anchor? }
anchor = { (header | blockref) }

note = { ident }
alias = { ident_extended }
header = { "#" ~ !"^" ~ ident }
blockref = { "#^" ~ ident }

ident = _{ (LETTER | NUMBER | DASH_PUNCTUATION | CONNECTOR_PUNCTUATION | OTHER_SYMBOL |"$" | "(" | ")" )+ }
ident_extended = _{ (!("[[" | "]]" | CONTROL | LINE_SEPARATOR | PARAGRAPH_SEPARATOR ) ~ ANY)+ }
WHITESPACE = _{ " " }
Toasterson
@therealtoaster:matrix.org
[m]
Probably because OTHER_SYMBOL does not cover the second emoji
I know ANY covers all of UTF-8, but I'm not sure whether any of the builtins in the ident rule do
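One guess (an assumption, not confirmed in the chat): the snowflake as usually typed is U+2744 followed by U+FE0F, a variation selector in the nonspacing-mark category, so OTHER_SYMBOL matches the snowflake itself but not the trailing selector. If your pest version exposes the MARK builtin, widening ident along these lines would cover it:
// sketch: also accept Unicode marks so trailing variation selectors pass
ident = _{ (LETTER | NUMBER | MARK | DASH_PUNCTUATION | CONNECTOR_PUNCTUATION | OTHER_SYMBOL | "$" | "(" | ")" )+ }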
Colin Basnett
@cmbasnett
Somewhat unrelated to pest exactly, but I've finished translating my language to an abstract syntax tree. Does anyone have any advice for how to emit warnings after parsing? My current strategy is to recursively "visit" all the "nodes" of my syntax tree, inspect them for errors, and pass them up through a visitor object. The problem is that I'd have to copy the span (positional) information to each of the nodes so I can emit meaningful warnings with line numbers etc. It seems like a rather inelegant solution. Has anyone else done something like this before?
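A sketch of one way to carry positions along (not necessarily what was settled on; Spanned and spanned are hypothetical names): take the line/column from each pair's span while building the tree, so later passes can report locations without touching pest again.
use pest::iterators::Pair;
use pest::RuleType;

// wraps any AST node with the position it came from
#[derive(Debug)]
struct Spanned<T> {
    node: T,
    line: usize,
    col: usize,
}

fn spanned<'i, T, R: RuleType>(pair: &Pair<'i, R>, node: T) -> Spanned<T> {
    // line_col() is part of the pest API; call it once per node at build time
    let (line, col) = pair.as_span().start_pos().line_col();
    Spanned { node, line, col }
}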
Colin Basnett
@cmbasnett
Nevermind, figured it out ^
ollien
@ollien:matrix.org
[m]

I've been wanting to experiment with pest, but even with a simple example, I'm having inexplicable compile errors. Would someone be able to take a look? https://gist.github.com/ollien/4e1ba081e332b22423275d7b1564fffe

This looks similar to pest-parser/pest#427 but I don't think I have a name collision like they did in this issue

ollien
@ollien:matrix.org
[m]
Welp, I'm an idiot: my versions of the derive crate and the main pest crate didn't line up
SymmetricChaos
@SymmetricChaos
Every time I try to get the example to run, I get an error that a function or associated item was not found in proc_macro::Literal
also, the example in the book still uses extern crate?
SymmetricChaos
@SymmetricChaos
looks like my version of Rust was out of date, which caused the original error
Abhimanyu Sharma
@abhimanyu003

is there any way to do exact text matching?

first = { "m" }
second = { "mi" }

find = { first | second }

if mi is provided as input it will match second

Noah
@NoahTheDuke
you want it to match second? or it currently is matching second?
when i try this out, it matches m, because there's no repetition marker and first is the first choice in find
Abhimanyu Sharma
@abhimanyu003
image.png
Yes, I want to match second, it's matching first currently.
Noah
@NoahTheDuke
put second before first: find = { second | first }
Abhimanyu Sharma
@abhimanyu003
Yes, I can do that as a quick fix @NoahTheDuke, but I have a very big parser, hundreds of lines long.
It's simple, but sometimes it's hard to move the precedence around.
I was thinking: is there any way I can do exact text matching, the same as we do with regex?
Noah
@NoahTheDuke
that's what's happening here, but the choice operator means you have to handle the precedence, sadly
cuz it's doing "exact text matching" on "m" first and only if that fails does it try "mi"
(which as you've noticed, won't happen lol)
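One workaround to sketch (the lookahead is an assumption about what may follow a unit): make the shorter literal fail when another identifier character comes next, so the order of the choices stops mattering.
// "m" no longer matches the start of "mi", so find works in either order
first  = { "m" ~ !ASCII_ALPHANUMERIC }
second = { "mi" ~ !ASCII_ALPHANUMERIC }

find = { first | second }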
Abhimanyu Sharma
@abhimanyu003
:( I wish there was a way. I will try to tweak the precedence.
Noah
@NoahTheDuke
if you can share it, i could take a look at the grammar file and the point of contention
Abhimanyu Sharma
@abhimanyu003
Sure, I'm planning to make it open-source; I'll create a git repo and share it here as well. :)
Abhimanyu Sharma
@abhimanyu003
it's using pest + yew + tailwind; it's a WebAssembly-based calculator. It supports conversion and basic math.
EricE
@EricE
I am writing my first Rust app to parse a PDF credit card statement. I get all the text using pdf_extract, and then I was planning on going line by line through the text looking for the statement date, payments, total transaction amount, individual transactions, etc., with separate parsers 1). I have different pest files with the individual grammars, but now I see that I can only have one #[derive(Parser)] per file (the compiler complained 2)). I don't see how I can combine all the individual grammars given the randomness of the text strings between the desirable ones. Is my only alternative to have an *.rs file that ingests each pest file and then call them all from main?
1) Example grammar
WHITESPACE = _{ " " }
stmt_str = _{ "New Purchases" }
credit = { "+" | "-" }
dollar = _{ "$" }
amount = { ASCII_DIGIT{1, } ~ "." ~ ASCII_DIGIT{2} }
NewPurchases = { stmt_str ~ credit ~ dollar ~ amount }

2) the name `Rule` is defined multiple times:
`Rule` must be defined only once in the type namespace of this module [rustc E0428]
main.rs(19, 10): previous definition of the type `Rule` here
conflicting implementations of trait `std::clone::Clone` for type `Rule`
conflicting implementation for `Rule` [rustc E0119] ...
Noah
@NoahTheDuke
that's one way, yeah. another way is to combine all of the different versions into a single rule, all_rows = { row_type_1 | row_type_2 | row_type_3 }, and then have the consuming code branch on each type
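The multiple-parser route can also work, sketched here with hypothetical module and grammar file names: each #[derive(Parser)] generates its own Rule enum, so giving each parser its own module avoids the E0428 clash.
// one derive per module keeps each generated `Rule` enum namespaced
mod purchases {
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "purchases.pest"] // hypothetical grammar file
    pub struct PurchasesParser;
}

mod charges {
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "charges.pest"] // hypothetical grammar file
    pub struct ChargesParser;
}

// call sites then refer to purchases::Rule and charges::Rule separately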
EricE
@EricE
I had tried that, but I couldn't figure out how to skip the lines between the ones I wanted. I was looking back through the posts and saw one from you with "example = { (!"/" ~ ANY)+ ~ "/" }", tried the pattern, and it works (though I needed to chain the rules with "~" instead of "|")! For the record (and feel free to comment if it could be done better; I have 9 separate grammars in the actual code):
WHITESPACE = _{ " " }
ws = { " "+ }

stmt_str = _{ "New Purchases" }
credit = { "+" | "-" }
dollar = _{ "$" }
amount = @{ (ASCII_DIGIT{1, } | ",")+ ~ "." ~ ASCII_DIGIT{2} }
new_purchases = { stmt_str ~ credit ~ dollar ~ amount }
NewPurchases = { (!new_purchases ~ ANY)+ ~ new_purchases }

charge_date = { ASCII_DIGIT{2} ~ "/" ~ ASCII_DIGIT{2} }
post_date = { ASCII_DIGIT{2} ~ "/" ~ ASCII_DIGIT{2} }
ref_num = @{ ASCII_ALPHANUMERIC+ }
description = { (ASCII_ALPHANUMERIC | "-" | "." | "," | "*" | "/" | "'" | "#" | "&" | ws)+ }
payment = { "-"{0,1} }
charge = { charge_date ~ post_date ~ ref_num ~ description ~ dollar ~ amount ~ payment ~ NEWLINE{0,} }
charges = { charge+ }
Charges = { (!charges ~ ANY)+ ~ charges }

Total = { NewPurchases ~ Charges }
Noah
@NoahTheDuke
you have 9 different potential lines?
EricE
@EricE
image.png
Actually I miscounted, it is 10 (the above plus the statement date), plus they break the charges down into four separate sections: payments and credits, my transactions, my wife's transactions, and an interest charged section. I had to tweak my rules to pick up the multiple charge sections, and it works perfectly. Now to figure out the best way to pull all the results out of the Pairs (4349 lines when I pretty-print them):
charges_repeat = { (!charges ~ ANY)+ ~ charges }
Charges = { charges_repeat+ }

Total = { StmtDate ~ BalancePrevious ~ Payments ~ CreditsOther ~ NewPurchases ~ NewCashAdvances ~ NewBalanceTransfers ~ FeesCharged ~ InterestCharged ~ BalanceNew ~ Charges }
EricE
@EricE

I've read through the book and looked at some code on GitHub for how to unwind the resulting Pairs to get to the strings I need. Currently I have a match statement to cycle through the Pairs from the Parser:

// These are the results from Total above
for record in enclosed.into_inner() {
    match record.as_rule() {
        Rule::StmtDate => {
            println!(
                "{:#?}",
                &record.into_inner().next().unwrap().into_inner().as_str()
            );
        }
        Rule::BalancePrevious => println!("{:#?}", &record.as_rule()),
        Rule::Payments => println!("{:#?}", &record.as_rule()),
        ...

The examples I found on GitHub have similar code for digging down into the structure:
Some(pair.into_inner().into_iter().next().unwrap().as_str().to_string()) for instance.

Are there any helper functions that would let you pull out the contents of any Rule in the structure of Pairs that you get for complicated grammars? I imagine I could write a function to do that, but I'd rather borrow one :) Also, if I mangled any terminology, I am a hardware guy learning Rust to get back into programming microcontrollers. I'm learning by converting some Python scripts I wrote.
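
There doesn't appear to be a ready-made helper for this, but a small recursive search is easy to sketch (find_rule is a hypothetical name, not part of the pest API):

use pest::iterators::{Pair, Pairs};
use pest::RuleType;

// depth-first search for the first pair produced by a given rule
fn find_rule<'i, R: RuleType>(pairs: Pairs<'i, R>, rule: R) -> Option<Pair<'i, R>> {
    for pair in pairs {
        if pair.as_rule() == rule {
            return Some(pair);
        }
        if let Some(found) = find_rule(pair.into_inner(), rule) {
            return Some(found);
        }
    }
    None
}

// usage (hypothetical): find_rule(enclosed.into_inner(), Rule::StmtDate).map(|p| p.as_str())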

William Tange
@s1gtrap

How would you get the inner contents of a string? I stole this from the JSON grammar,

string  = @{ "\"" ~ inner ~ "\"" }
inner   = @{ (!("\"" | "\\") ~ ANY)* ~ (escape ~ inner)? }
escape  = @{ "\\" ~ ("\"" | "\\" | "/" | "b" | "f" | "n" | "r" | "t" | unicode) }
unicode = @{ "u" ~ ASCII_HEX_DIGIT{4} }

but doing something like

Rule::string => Expr::Str(pair.as_str().to_string()),

would obviously take the whole thing, including the wrapping "s and escaped escape chars, whereas trying to take the inner like

Rule::string => Expr::Str(pair.into_inner().next().unwrap().as_str().to_string()),

fails as there are no inner pairs (I assume because @ rules don't expose inner tokens?)... removing the @ gets me closer, but then whitespace inside the string gets skipped rather than captured by inner, so I'm essentially back to square one... are you supposed to "unpack" the dquote-wrapped string contents yourself or something?
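
A sketch of the usual fix (and, I believe, what the book's JSON example does): mark the outer rule compound-atomic with $, which keeps implicit whitespace out like @ does but still exposes inner as a child pair, so into_inner().next() works.

string  = ${ "\"" ~ inner ~ "\"" }
inner   = @{ (!("\"" | "\\") ~ ANY)* ~ (escape ~ inner)? }
escape  = @{ "\\" ~ ("\"" | "\\" | "/" | "b" | "f" | "n" | "r" | "t" | unicode) }
unicode = @{ "u" ~ ASCII_HEX_DIGIT{4} }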

jess ✨
@jesopo:matrix.org
[m]
is there a way to match an empty string?

the use case is

param1 param2
param1 :param2

should be the same thing, but

param1 :

should be param1 and an empty string
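
A sketch with hypothetical rule names: a repetition that may match zero characters gives you the empty trailing parameter, since * succeeds on nothing.

// trailing may be empty because of the `*`
line     = { param ~ (" " ~ (":" ~ trailing | param))* }
param    = @{ (!(" " | NEWLINE) ~ ANY)+ }
trailing = @{ (!NEWLINE ~ ANY)* }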

SymmetricChaos
@SymmetricChaos
Are there any examples of how to use the prec_climber? I can't seem to find any. Also, how is the Pest3 project going?
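A sketch along the lines of the book's calculator (rule names such as Rule::expr, Rule::number, and the operator rules are assumptions about the grammar): build a PrecClimber from operators listed in increasing precedence, then hand climb a primary closure and an infix closure.
use pest::iterators::{Pair, Pairs};
use pest::prec_climber::{Assoc, Operator, PrecClimber};

fn eval(expression: Pairs<Rule>) -> f64 {
    // operators listed from lowest to highest precedence
    let climber = PrecClimber::new(vec![
        Operator::new(Rule::add, Assoc::Left) | Operator::new(Rule::subtract, Assoc::Left),
        Operator::new(Rule::multiply, Assoc::Left) | Operator::new(Rule::divide, Assoc::Left),
    ]);

    climber.climb(
        expression,
        // primary: turn a terminal (or a parenthesized expr) into a value
        |pair: Pair<Rule>| match pair.as_rule() {
            Rule::number => pair.as_str().parse::<f64>().unwrap(),
            Rule::expr => eval(pair.into_inner()),
            _ => unreachable!(),
        },
        // infix: fold lhs, operator, rhs into a value
        |lhs: f64, op: Pair<Rule>, rhs: f64| match op.as_rule() {
            Rule::add => lhs + rhs,
            Rule::subtract => lhs - rhs,
            Rule::multiply => lhs * rhs,
            Rule::divide => lhs / rhs,
            _ => unreachable!(),
        },
    )
}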
Matthew Dean
@matthew-dean
Question: can grammars be extended? For instance, could I maintain a CSS grammar in one place, and extend that base grammar into Less or Sass?
Thomas Barusseau
@tbarusseau

Hi! I'm trying to write an extremely simple parser for AoC, but I'm struggling to declare even the simplest grammar using grammar_inline. Could anyone help me out?

#[derive(Parser)]
#[grammar_inline = "
instruction = _{ forward | down | up }
    forward = { "forward" }
    down    = { "down" }
    up      = { "up" }
"]
struct MyParser;

Produces the following error: suffixes on a string literal are invalid, pointing to the end of my string in forward = { "forward" }... which I took directly from the book here: "other_rule" https://pest.rs/book/grammars/syntax.html

It seems to come from grammar_inline; I'll switch to file-based grammars for now.
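
A guess at the cause: the inner double quotes end the attribute's string literal early, so the rule names that follow get read as literal suffixes. A raw string literal (assuming your pest_derive version accepts one for grammar_inline) sidesteps that:

#[derive(Parser)]
#[grammar_inline = r#"
instruction = _{ forward | down | up }
    forward = { "forward" }
    down    = { "down" }
    up      = { "up" }
"#]
struct MyParser;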
Bastien Orivel
@Eijebong
Hey, I'm looking at cleaning up a dependency tree right now, and sha-1 ends up duplicated because of pest_meta. Is there any chance of getting a release of that at some point, since the work's already been done?