multiplication = {(multiplication ~ ("*" | "/" | "%"))? ~ ASCII_DIGIT}
also fails.
5 * 3 + 3 + 3 * 5 should be evaluated as 5 * 9 * 5, for example.
calculation = _{ SOI ~ expr ~ EOI }
WHITESPACE = _{ " " | "\t" }
int = { ASCII_DIGIT+ }
add = { "+" }
multiply = { "*" }
operation = _{ multiply | add }
term = _{ int | "(" ~ expr ~ ")" }
expr = { (multiplication ~ operation)* ~ multiplication }
addition = { (term ~ add)* ~ term }
multiplication = { (addition ~ multiply)* ~ addition }
From my understanding, it should work the way you want.
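To make the intended precedence concrete, here is a minimal hand-written sketch in Rust that mirrors the layering of the grammar above (multiplication over addition over term), so that `+` binds tighter than `*`. It does not use pest. The `Parser` type and the `eval` function are hypothetical names invented for this illustration, and it assumes the input contains only digits, `+`, `*`, parentheses, spaces and tabs.

```rust
// Hand-written sketch (not pest): products of sums of terms.
// `Parser` and `eval` are hypothetical names used only for illustration.

struct Parser<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(input: &'a str) -> Self {
        Parser { bytes: input.as_bytes(), pos: 0 }
    }

    // Skip spaces and tabs, like the WHITESPACE rule.
    fn skip_ws(&mut self) {
        while self.pos < self.bytes.len() && matches!(self.bytes[self.pos], b' ' | b'\t') {
            self.pos += 1;
        }
    }

    // Consume `c` if it is the next non-whitespace byte.
    fn eat(&mut self, c: u8) -> bool {
        self.skip_ws();
        if self.pos < self.bytes.len() && self.bytes[self.pos] == c {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    // term = int | "(" expr ")"; here a parenthesised expr is one multiplication.
    fn term(&mut self) -> Option<u64> {
        if self.eat(b'(') {
            let v = self.multiplication()?;
            return if self.eat(b')') { Some(v) } else { None };
        }
        self.skip_ws();
        let start = self.pos;
        let mut v: u64 = 0;
        while self.pos < self.bytes.len() && self.bytes[self.pos].is_ascii_digit() {
            let digit = u64::from(self.bytes[self.pos] - b'0');
            v = v.checked_mul(10)?.checked_add(digit)?;
            self.pos += 1;
        }
        if self.pos > start { Some(v) } else { None }
    }

    // addition = term ("+" term)*, the tighter-binding level.
    fn addition(&mut self) -> Option<u64> {
        let mut sum = self.term()?;
        while self.eat(b'+') {
            sum = sum.checked_add(self.term()?)?;
        }
        Some(sum)
    }

    // multiplication = addition ("*" addition)*, the looser-binding level.
    fn multiplication(&mut self) -> Option<u64> {
        let mut product = self.addition()?;
        while self.eat(b'*') {
            product = product.checked_mul(self.addition()?)?;
        }
        Some(product)
    }
}

// Returns None on a syntax error, on overflow, or on trailing input.
fn eval(input: &str) -> Option<u64> {
    let mut p = Parser::new(input);
    let v = p.multiplication()?;
    p.skip_ws();
    if p.pos == p.bytes.len() { Some(v) } else { None }
}

fn main() {
    // 5 * (3 + 3 + 3) * 5
    println!("{:?}", eval("5 * 3 + 3 + 3 * 5"));
}
```

The nesting order is the whole trick: the rule that is called deepest (addition) groups first, so the three 3s are summed before either multiplication happens.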
For 8 + 9, you get an int with spaces inside (which is weird, as int is only composed of ASCII_DIGIT+).
I am dealing with a terrible grammar that goes like this:
When an io_here token has been recognized by the grammar (see Shell Grammar), one or more of the subsequent lines immediately following the next NEWLINE token form the body of one or more here-documents and shall be parsed according to the rules of Here-Document.
Hey, I am having big trouble with my grammar for a custom language. This is a very simplified version:
program = {SOI ~ item* ~ EOI}
item = {block ~ ";"}
block = {"{"~item*~expr?~"}"}
expr = {block}
As a bit of context, an item could be some statement (for example let a = 1;) or, as in this grammar, a block ({...};). However, a block can have an expression as its last element (implicit return), and a block can also be interpreted as an expression. This leads to exponential time for inputs that look like this:
{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{
I know that this is parsed so slowly because the parser has to backtrack for every {, leading to 2^n time, where n is the number of brackets.
I am quite desperate. Is there any way to fix this?
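The blow-up described above can be reproduced outside pest with a small Rust sketch. `block_naive` is a direct hand translation of the three rules (ignoring whitespace), and `block_factored` is one possible left-factored variant that parses each nested block only once and then decides whether it was an item (followed by ";") or the trailing expression. Both function names and the call counter are hypothetical, added only to make the cost visible; this is not how pest itself is implemented.

```rust
// Naive translation of:
//   block = { "{" ~ item* ~ expr? ~ "}" }
//   item  = { block ~ ";" }
//   expr  = { block }
// Returns the position after the block, or None. `calls` counts invocations.
fn block_naive(s: &[u8], pos: usize, calls: &mut u64) -> Option<usize> {
    *calls += 1;
    if s.get(pos) != Some(&b'{') {
        return None;
    }
    let mut p = pos + 1;
    // item*: keep taking a block followed by ";".
    loop {
        match block_naive(s, p, calls) {
            Some(q) if s.get(q) == Some(&b';') => p = q + 1,
            _ => break, // item failed; the work done for this block is thrown away
        }
    }
    // expr?: parses the very same block again from the same position.
    if let Some(q) = block_naive(s, p, calls) {
        p = q;
    }
    if s.get(p) == Some(&b'}') { Some(p + 1) } else { None }
}

// Left-factored variant: parse the nested block once, then look at what follows.
fn block_factored(s: &[u8], pos: usize, calls: &mut u64) -> Option<usize> {
    *calls += 1;
    if s.get(pos) != Some(&b'{') {
        return None;
    }
    let mut p = pos + 1;
    loop {
        match block_factored(s, p, calls) {
            Some(q) if s.get(q) == Some(&b';') => p = q + 1, // it was an item
            Some(q) => {
                p = q; // it was the trailing expression
                break;
            }
            None => break,
        }
    }
    if s.get(p) == Some(&b'}') { Some(p + 1) } else { None }
}

fn main() {
    let deep = "{".repeat(16);
    let (mut naive, mut factored) = (0u64, 0u64);
    block_naive(deep.as_bytes(), 0, &mut naive);
    block_factored(deep.as_bytes(), 0, &mut factored);
    println!("naive calls: {naive}, factored calls: {factored}");
}
```

In the naive version every "{" triggers two attempts at the inner block, so the number of calls doubles per nesting level; in the factored version it grows by one per level. In pest terms the factored shape corresponds roughly to a rule such as block = { "{" ~ (block ~ (";" | &"}"))* ~ "}" }. That rule is an untested assumption, and it produces a different parse tree (no separate item and expr pairs), so the item/expr distinction would have to be recovered afterwards.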
@Inky-developer, could you please try pest-parser/pest#473 to see if it improves the situation?
Basically, these are warts in the optimiser, which is currently not that well designed. I have changes in mind for pest3 that will render these cases obsolete.
// Pest Grammar for parsing XPATH Duration strings like this: P2Y3M1DT5H2M5.5S
// - A "T" divides the year, month, day fields from the hour, minute, second fields.
// - Some parts may be omitted, but at least one must be specified.
// For example, PT10M is ten minutes.
// - If hour, minute and second are all omitted, you must omit the "T".
// - Never omit the "P".
// - Only the Seconds field may have a fraction, the others are integers.
// - A leading minus sign indicates a negative duration. It must come before the "P".
// For example: -P2Y is negative two years.
// - Months can be more than twelve, hours more than 24, etc.
date_time = @{
    SOI ~ sign? ~ "P" ~ (
        date ~ "T" ~ time
        | date
        | "T" ~ time
    ) ~ EOI
}
date = {
    year ~ month ~ day
    | year ~ month
    | year ~ day
    | year
    | month ~ day
    | month
    | day
}
time = {
    hour ~ minute ~ second
    | hour ~ minute
    | hour ~ second
    | hour
    | minute ~ second
    | minute
    | second
}
sign = !{ "-" }
year = !{ number ~ "Y" }
month = !{ number ~ "M" }
day = !{ number ~ "D" }
hour = !{ number ~ "H" }
minute = !{ number ~ "M" }
second = ${ number ~ "S" }
digits = @{ ASCII_DIGIT+ }
number = @{ digits ~ fraction? }
fraction = @{ "." ~ digits ~ !integral }
integral = { "Y" | "M" | "D" | "H" }
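As a cross-check for the rules listed in the comments at the top of this grammar, here is a hand-written Rust sketch that accepts the same kind of strings without pest. The `Duration` struct and the `fields`, `int`, `seconds` and `parse_duration` helpers are hypothetical names made up for this example, and the sketch is only meant to show the rules (sign before "P", "T" separator, ordered fields, fraction only on seconds), not to replace the grammar.

```rust
#[derive(Debug, PartialEq)]
struct Duration {
    negative: bool,
    years: u64,
    months: u64,
    days: u64,
    hours: u64,
    minutes: u64,
    seconds: f64,
}

// Split a run like "2Y3M1D" into the number text for each unit.
// Units must appear in the given order, but any of them may be omitted.
fn fields<'a>(mut rest: &'a str, units: [char; 3]) -> Option<[Option<&'a str>; 3]> {
    let mut out: [Option<&'a str>; 3] = [None; 3];
    let mut next = 0; // index of the first unit that is still allowed
    while !rest.is_empty() {
        let end = rest.find(|c: char| !(c.is_ascii_digit() || c == '.'))?;
        let unit = rest[end..].chars().next()?;
        let idx = (next..3).find(|&i| units[i] == unit)?;
        out[idx] = Some(&rest[..end]);
        next = idx + 1;
        rest = &rest[end + unit.len_utf8()..];
    }
    Some(out)
}

// Integer field: a missing field counts as 0, a fraction is rejected.
fn int(s: Option<&str>) -> Option<u64> {
    match s {
        None => Some(0),
        Some(t) => t.parse().ok(),
    }
}

// Seconds field: digits, optionally followed by "." and more digits.
fn seconds(s: Option<&str>) -> Option<f64> {
    let t = match s {
        None => return Some(0.0),
        Some(t) => t,
    };
    let (whole, frac) = t.split_once('.').unwrap_or((t, "0"));
    let all_digits = |x: &str| !x.is_empty() && x.bytes().all(|b| b.is_ascii_digit());
    if all_digits(whole) && all_digits(frac) { t.parse().ok() } else { None }
}

fn parse_duration(input: &str) -> Option<Duration> {
    // The minus sign, if any, comes before the mandatory "P".
    let (negative, rest) = match input.strip_prefix('-') {
        Some(r) => (true, r),
        None => (false, input),
    };
    let rest = rest.strip_prefix('P')?;
    let (date, time) = match rest.split_once('T') {
        Some((d, t)) => (d, Some(t)),
        None => (rest, None),
    };
    let d = fields(date, ['Y', 'M', 'D'])?;
    let t = match time {
        Some("") => return None, // "T" given, but no time fields follow
        Some(t) => fields(t, ['H', 'M', 'S'])?,
        None => [None; 3],
    };
    // At least one field must be specified.
    if d.iter().chain(t.iter()).all(|f| f.is_none()) {
        return None;
    }
    Some(Duration {
        negative,
        years: int(d[0])?,
        months: int(d[1])?,
        days: int(d[2])?,
        hours: int(t[0])?,
        minutes: int(t[1])?,
        seconds: seconds(t[2])?,
    })
}

fn main() {
    println!("{:?}", parse_duration("P2Y3M1DT5H2M5.5S"));
    println!("{:?}", parse_duration("-P2Y"));
}
```

Because "M" means months before the "T" and minutes after it, the sketch splits on "T" first and only then matches units, which is the same disambiguation the grammar gets from its separate date and time rules.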