    Nicholas Blumhardt
    @nblumhardt
    Try: from rest in SubPath.Try().AtLeastOnce()
    might need to reorganize things depending on how that impacts error message quality, but should unblock it :-)
    RedlineTriad
    @RedlineTriad
    Since I want other people to learn from answers I get I made a StackOverflow post for my new question:
    https://stackoverflow.com/questions/59770626/how-to-ignore-tokens-until-a-certain-token-pattern-is-found-in-superpower-c
    José Manuel Nieto
    @SuperJMN
    I'm glad to see this channel is still alive 😊
    Nicholas Blumhardt
    @nblumhardt
    :-D
    :wave:
    José Manuel Nieto
    @SuperJMN
    haha, nice! a waving hand!
    RedlineTriad
    @RedlineTriad
    Can someone please answer the question?
    RedlineTriad
    @RedlineTriad
    Or more generally, how can I match any token?
    Something like Token.Except(DMToken.Return)
    RedlineTriad
    @RedlineTriad
    Also, what would be the easiest way to do AtLeastTwiceDelimitedBy()?
    RedlineTriad
    @RedlineTriad
    Found a hack: Union.Where(a => a.Length > 1).Try()
    Identifier.AtLeastOnceDelimitedBy(Token.EqualTo(DMToken.Bar)); is equivalent to:
    Identifier.ManyDelimitedBy(Token.EqualTo(DMToken.Bar)).Where(a => a.Length > 0).Try()
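    The hack above generalizes into an extension method. This is a hedged sketch only (the extension name is invented, and it assumes Superpower's TokenListParser combinators):

    ```csharp
    using Superpower;
    using Superpower.Model;

    // Sketch: an "at least twice, delimited by" combinator built from
    // ManyDelimitedBy plus a length filter. Try() lets the parser
    // backtrack cleanly when fewer than two items are present.
    static class TokenListParserExtensions
    {
        public static TokenListParser<TKind, T[]> AtLeastTwiceDelimitedBy<TKind, T, U>(
            this TokenListParser<TKind, T> parser,
            TokenListParser<TKind, U> delimiter) =>
            parser.ManyDelimitedBy(delimiter)
                  .Where(items => items.Length >= 2)
                  .Try();
    }
    ```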
    Gargaj
    @Gargaj
    Hey guys, very quick question: how do I construct a parser that has an optional token followed by a mandatory token?
    I've tried something like
    from type in Token.EqualTo(MyToken.Literal)
                      .Named("array/object type")
                      .Apply(MyTextParsers.Literal)
                      .Select(s => (string)s)
                      .OptionalOrDefault("")
    from begin in Token.EqualTo(MyToken.LBracket)
    [..etc..]
    basically I'm hoping to get that working with both abc { ... } and { ... }
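    A sketch of how the optional-prefix-plus-mandatory-token shape can be written (MyToken and MyTextParsers are taken from the snippet above; this is illustrative, not a verified fix):

    ```csharp
    // Hedged sketch: parse an optional leading literal, then a mandatory
    // '{'. Try() on the optional part allows backtracking if the prefix
    // parser consumes tokens and then fails, so "{ ... }" alone still works.
    var objectHeader =
        from type in Token.EqualTo(MyToken.Literal)
                          .Apply(MyTextParsers.Literal)
                          .Select(s => (string)s)
                          .Try()
                          .OptionalOrDefault("")
        from begin in Token.EqualTo(MyToken.LBracket)
        select type;
    ```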
    Gargaj
    @Gargaj
    bit of elaboration: my problem seems to be that I have two parsers and both can start with the same token
    Nicholas Blumhardt
    @nblumhardt
    @Gargaj seems like what you have should work - any chance of a more complete sample? might be a good one for the longer-form Stack Overflow format, if you end up posting a q there please drop a link here, will take a look :+1:
    Gargaj
    @Gargaj
    yeah, I figured it out
    I needed some more use of .Try()
    I didn't realize I need .Try() to allow backtracking in the token list
    I even found a hack for parsing a list that has a dangling delimiter (the JavaScript-style [1,2,3,] empty element) by just creating an end token that can be either ] or ,]
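    The end-token trick can be sketched at the tokenizer level (all token names here are hypothetical): by matching ",]" before ",", the trailing comma is folded into the close token and never reaches the list parser as a delimiter.

    ```csharp
    using Superpower;
    using Superpower.Parsers;
    using Superpower.Tokenizers;

    enum ListToken { LBracket, End, Comma, Number }

    static class ListTokenizer
    {
        // Sketch: ",]" is matched first, so "[1,2,3,]" tokenizes its tail
        // as a single End token rather than a dangling Comma.
        public static readonly Tokenizer<ListToken> Instance =
            new TokenizerBuilder<ListToken>()
                .Ignore(Span.WhiteSpace)
                .Match(Span.EqualTo(",]"), ListToken.End)
                .Match(Character.EqualTo(']'), ListToken.End)
                .Match(Character.EqualTo(','), ListToken.Comma)
                .Match(Character.EqualTo('['), ListToken.LBracket)
                .Match(Numerics.Integer, ListToken.Number)
                .Build();
    }
    ```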
    i.sinister
    @i-sinister
    Good day, everybody. I need to pass a "context object" to the TokenListParser so that I can do symbol lookups (variables, actually). The parser delegate signature does not allow it, and while there are several workarounds, I like none of them:
    1) Have the parsers as read-only instance properties of some Parser class - but then they have to be recreated for every parse operation, which is not good for performance, especially considering the LINQ usage when combining parsers.
    2) Put a reference to the "context object" in every token so that it is always available to the parser - this is also a performance killer, because it would require converting tokens from an enum to (at least) a struct with a pointer to the context.
    3) Build an AST and perform lookups/validation at a later stage - this one seems like doing unnecessary work and (maybe) producing an invalid tree, and I also lose token location information.
    4) Do the variable lookup at the tokenization stage (similar to the "lexer hack") - but this approach does not solve my problem 100%, because in some cases I need to know the "context around the token" (i.e. the future AST node), so it is really better done at the parsing stage.
    There is also the "option" (which is not really an option of "using Superpower") of writing a "parallel implementation" of TokenListParser (with combinators etc.) that accepts a context as an argument - I'd like to avoid that, of course, as it means writing a lot of code and fixing lots of bugs in it. So what are the recommended best practices for handling the "accessing context during parsing" problem?
    i.sinister
    @i-sinister
    @nblumhardt, is this chat alive?
    Nicholas Blumhardt
    @nblumhardt
    Hi @i-sinister :-) ... yup! I don't have anything to add to your analysis above, though - using instance-based parsers for context-sensitive grammars works but isn't very efficient. Sometimes the context-sensitivity can be kept to just a few rules, with the majority of syntactic forms still context-free - sounds like that's the best option here.
    Truly context-sensitive grammars are a bit of a special case, though - most of the time, AST post-processing and a forgiving grammar is the way to go.
    No easy answers to all these questions, though.
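    For reference, the instance-based option discussed here looks roughly like this (token and type names are invented for the sketch). The cost is that the parser graph is rebuilt per instance, so you'd cache an instance per context where possible.

    ```csharp
    using System.Collections.Generic;
    using Superpower;
    using Superpower.Parsers;

    enum ScriptToken { Identifier }

    // Sketch: parsers as instance members, so combinator lambdas can
    // close over per-parse context (a symbol table here).
    class ScriptParser
    {
        readonly IReadOnlyDictionary<string, int> _symbols;

        public ScriptParser(IReadOnlyDictionary<string, int> symbols) =>
            _symbols = symbols;

        // Succeeds only for identifiers present in the symbol table,
        // yielding the bound value.
        public TokenListParser<ScriptToken, int> Variable =>
            Token.EqualTo(ScriptToken.Identifier)
                 .Select(t => t.ToStringValue())
                 .Where(name => _symbols.ContainsKey(name))
                 .Select(name => _symbols[name]);
    }
    ```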
    Erik Schierboom
    @ErikSchierboom
    Just wanted to let you know I love superpower. Brilliant job!
    Nicholas Blumhardt
    @nblumhardt
    Thanks @ErikSchierboom :+1: :-)
    Jeivardan
    @jeivardan
    Hi @nblumhardt, my problem statement: I receive a response when I execute a command, and the start of the response string contains one of the following tokens: ":", "?", "Finished", "Error", "Info", "Warning". Depending on the token, the response body (the rest of the string after the token) may vary. For example, if the token is ":" it means a valid prompt and there is no response body; if the token is "Finished", the response body contains two things: first the name of the command I executed, and then the actual response data. I need to convert this whole response into an object with members like { responseType, cmdName, responsedata }. Is this possible with Superpower?
    Jeivardan
    @jeivardan
    Possible types of responses
    1) " : " (: means a valid prompt and no response body).
    2) " ? some response message" (? means an invalid prompt after executing an invalid command)
    3) " Finished : CommandName : responsedata"
    This is the first level of parsing the response; the responsedata can be parsed further.
    Jeivardan
    @jeivardan
    Can Superpower solve my problem, or should I look into ANTLR?
    Jeivardan
    @jeivardan
    @nblumhardt any suggestions please? I am confused

    For a command possible responses are

    ":"

    "?"

    "Finished : commandname : msgbody"

    "Error errorcode commandname(if)"

    "Info : msgbody"

    "Warning errorcode msgbody"
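    The six shapes above suggest a first-level parser along these lines (a sketch only: ResponseType, the marker ordering, and the trimming details are invented, and real messages may need more care):

    ```csharp
    using Superpower;
    using Superpower.Parsers;

    enum ResponseType { Prompt, InvalidPrompt, Finished, Error, Info, Warning }

    static class ResponseParser
    {
        // Sketch: classify the response by its leading marker, then hand
        // the remainder off for further parsing (e.g. "CommandName : responsedata").
        static readonly TextParser<ResponseType> Marker =
            Span.EqualTo("Finished").Value(ResponseType.Finished)
                .Or(Span.EqualTo("Error").Value(ResponseType.Error))
                .Or(Span.EqualTo("Info").Value(ResponseType.Info))
                .Or(Span.EqualTo("Warning").Value(ResponseType.Warning))
                .Or(Character.EqualTo(':').Value(ResponseType.Prompt))
                .Or(Character.EqualTo('?').Value(ResponseType.InvalidPrompt));

        public static readonly TextParser<(ResponseType Type, string Body)> Response =
            from _ in Character.WhiteSpace.Many()
            from type in Marker
            from body in Character.AnyChar.Many().Select(cs => new string(cs).Trim())
            select (type, body);
    }
    ```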

    Andrew Savinykh
    @AndrewSav
    @jeivardan you've got to try it and see if it works out for you. There are a lot of unknowns here: what the response can contain, what the body can contain, whether you are going to tokenize them or not, whether you have a grammar for them, what the other "tags" mean, etc., etc.
    Kenneth Ellested
    @ellested_gitlab

    Is Nicholas to be found here?

    I just wanted to send my deep thanks for his great work on this incredible library. I was actually "forced" in this direction, as I couldn't find an HTML parser for .NET that could load the nodes and retain the exact stream positions, while at the same time allowing for all the quirks found in HTML documents. I've tried HtmlAgilityPack, AngleSharp, and other libraries - but this problem is apparently quite general. Anyway, I started some days ago, and I was lost several times trying to find my way around the code and concepts. But the more I used it, the more sense it suddenly made. The HTML parser is maybe just 300 lines of code, and it reads every quirk correctly so far - and it's the Superpower library that makes this possible. Some of the usual problems are matching tags, occasionally unclosed tags, multiline attributes, markup in script and style tags, and so on. I know that large documents are not suited for this kind of parser design, but the speed is around 1.5 seconds for a 1.2MB document (around 10 times longer than HAP and AngleSharp). That's pretty good without optimizations and probably some mistakes made.

    To make a long story short, every developer should learn Superpower - the investment in time will come back tenfold. A lot of new opportunities will also open up, and you can make a lot of cool stuff with your new Superpowers (great name, too 😀)

    Thanks Nicholas

    Nicholas Blumhardt
    @nblumhardt
    Woot! @ellested_gitlab that's awesome to hear, thanks for dropping by - much appreciated :sunglasses:
    Kenneth Ellested
    @ellested_gitlab
    Hi Nicholas - I've made a custom TextParser<TextSpan> based on the built-in Comment parsers. The idea is to match any character until a certain string/span occurs. I'm almost sure it can be done with the built-in parsers, but I haven't found out how yet. Example: "I need this text, until I encounter a <STOP>". So I need any character until the word <STOP>. My problem seems to be that Character.AnyChar is greedy, and I can't figure out a way to limit it. I'm sure this is ultra simple when you know how :-)
    Kenneth Ellested
    @ellested_gitlab

    This is actually what I'm looking for:

        public static TextParser<TextSpan> MatchUntil(string stopword) =>
            from value in Span.Regex($@".+?(?={Regex.Escape(stopword)})")
            select value;

    But I can't figure out how to make it with the fast parser methods in the library. I've tried combinations with Or, IgnoreThen and Try, but I fail every time. I'm sure I'm missing the point somewhere, so it would be great to see how this is done without the Regex.

    Nicholas Blumhardt
    @nblumhardt
    @ellested_gitlab I think we're missing something like Span.Until("<STOP>") - I have a feeling I've seen an implementation of it in the past, but can't put my finger on where it was, sorry :)
    Kenneth Ellested
    @ellested_gitlab
    OK, I feel better now :) - I was pretty sure I was just overlooking something fundamental. Anyway, seems like a nice challenge, so I will give it a try.
    Kenneth Ellested
    @ellested_gitlab
        public static class SpanEx
        {
            public static TextParser<TextSpan> Until(string stopword)
            {
                // Predicates: enough input remains, and the next characters match the stop word.
                bool isWithinLength(TextSpan ts) => ts.Length >= stopword.Length;
                bool isStopwordMatching(TextSpan ts) => ts.First(stopword.Length).EqualsValue(stopword);
                bool isMatch(TextSpan ts) => isWithinLength(ts) && isStopwordMatching(ts);

                return (TextSpan input) =>
                {
                    TextSpan x = input;

                    // Advance one character at a time until the stop word (or end of input).
                    while (!x.IsAtEnd && !isMatch(x))
                        x = x.ConsumeChar().Remainder;

                    return isMatch(x)
                      ? Result.Value(input.Until(x), x, x)
                      : Result.Empty<TextSpan>(input, $"Until expected {stopword}");
                };
            }
        }
    Came up with this, which is at least 15% faster than the Regex on my integration tests. I'm not so confident about how the error messaging works yet, so I'm not sure this is fully compatible.
    If it's not totally off, I can submit a PR with my tests...
    Nicholas Blumhardt
    @nblumhardt
    Looks about right to me @ellested_gitlab - haven't thought through it in detail but a PR would be welcome, we can dig in further there! :+1:
    Khiem Pham
    @vi3tkhi3m
    Hi, does anyone know how I can extract a string between two brackets that has nested brackets? E.g. (I want to (extract (this)) text). The output should be: I want to (extract (this)) text. I've tried to use .Contained(OpenBracket, ClosingBracket), but this closes as soon as it sees the first closing bracket... Thanks in advance!
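    Contained stops at the first closing bracket, so balanced nesting needs recursion. A hedged sketch using Parse.Ref to tie the recursive knot (character-level, not token-level; names are invented):

    ```csharp
    using Superpower;
    using Superpower.Parsers;

    static class Brackets
    {
        // Sketch: a group is '(' + (plain runs or nested groups)* + ')'.
        // Nested groups are re-wrapped so the output keeps their brackets.
        public static TextParser<string> Group() =>
            from open in Character.EqualTo('(')
            from parts in
                Character.ExceptIn('(', ')').AtLeastOnce().Select(cs => new string(cs))
                    .Or(Parse.Ref(Group).Select(g => "(" + g + ")"))
                    .Many()
            from close in Character.EqualTo(')')
            select string.Concat(parts);
    }

    // e.g. Brackets.Group().Parse("(I want to (extract (this)) text)")
    ```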
    Kristian Hellang
    @khellang

    Hello 👋🏻 Does anyone have any pointers on how to tokenize/parse a template like this:

     Hello {upper(firstName)}!

    Basically, I want to tokenize everything outside the curlies as just text, including whitespace, but parse everything inside the curlies with full fidelity as identifiers etc., ignoring whitespace

    Kristian Hellang
    @khellang
    It feels like I want to nest tokenizers, where the outer would separate template from text, while the inner would dig into the template itself
    Kristian Hellang
    @khellang
    I guess I could always write the tokenizer by hand, but using TokenizerBuilder is just too lovely
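    One way to approximate the nesting without hand-writing a tokenizer (a sketch only; it assumes a hole never contains '}', e.g. inside a string literal): let the outer tokenizer capture each {...} run as a single token, then Apply an inner parser to that token's span.

    ```csharp
    using Superpower;
    using Superpower.Parsers;
    using Superpower.Tokenizers;

    enum TemplateToken { Text, Hole }

    static class TemplateTokenizer
    {
        // Sketch: the outer tokenizer only distinguishes literal text from
        // {...} holes. Everything inside a hole stays one token, to be
        // re-parsed later, e.g. Token.EqualTo(TemplateToken.Hole).Apply(...).
        public static readonly Tokenizer<TemplateToken> Outer =
            new TokenizerBuilder<TemplateToken>()
                .Match(Span.Regex(@"\{[^}]*\}"), TemplateToken.Hole)
                .Match(Span.Regex(@"[^{]+"), TemplateToken.Text)
                .Build();
    }
    ```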
    Nicholas Blumhardt
    @nblumhardt
    Heya @khellang !