    Jeivardan
    @jeivardan
    This is the first level of parsing the response; the response data can then be parsed further.
    Jeivardan
    @jeivardan
    Can Superpower solve my problem, or should I look into ANTLR?
    Jeivardan
    @jeivardan
    @nblumhardt any suggestions, please? I am confused.

    For a command, the possible responses are:

    ":"

    "?"

    "Finished : commandname : msgbody"

    "Error errorcode commandname(if)"

    "Info : msgbody"

    "Warning errorcode msgbody"
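
The "first level" of parsing described above could be sketched in plain C# as a prefix check, before reaching for a parser library. This is only an illustration under stated assumptions: `ResponseKind` and `ResponseClassifier` are hypothetical names, the surrounding quotes are assumed not to be part of the wire format, and the meaning of ":" and "?" is unknown, so they are named after the literal character.

```csharp
using System;

// Hypothetical names; the chat does not say what ":" and "?" mean.
public enum ResponseKind { Colon, QuestionMark, Finished, Error, Info, Warning, Unknown }

public static class ResponseClassifier
{
    // First-level parse only: decide the kind and hand back the unparsed rest.
    public static (ResponseKind Kind, string Rest) Classify(string line)
    {
        var text = line.Trim();
        if (text == ":") return (ResponseKind.Colon, "");
        if (text == "?") return (ResponseKind.QuestionMark, "");

        foreach (var (keyword, kind) in new[]
        {
            ("Finished", ResponseKind.Finished),
            ("Error", ResponseKind.Error),
            ("Info", ResponseKind.Info),
            ("Warning", ResponseKind.Warning),
        })
        {
            // Require a space after the keyword so "Information" is not read as "Info".
            if (text.StartsWith(keyword + " ", StringComparison.Ordinal))
                return (kind, text.Substring(keyword.Length).Trim());
        }

        return (ResponseKind.Unknown, text);
    }
}
```

The `Rest` string is left unparsed on purpose, since the chat does not pin down what an error code, command name, or message body may contain.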

    Andrew Savinykh
    @AndrewSav
    @jeivardan you've got to try it and see if it works out for you. There are a lot of unknowns here: what the response can contain, what the body can contain, whether you are going to tokenize them or not, whether you have a grammar for them, what the other "tags" mean, etc.
    Kenneth Ellested
    @ellested_gitlab

    Is Nicholas to be found here?

    I just wanted to send my deep thanks for his great work on this incredible library. I was actually "forced" in this direction, as I couldn't find an HTML parser for .NET that could load the nodes and retain the exact stream positions, while also allowing for all the quirks found in HTML documents. I've tried HtmlAgilityPack, AngleSharp and other libraries - but this problem is apparently quite general. Anyway, I started some days ago, and I was lost several times trying to find my way around the code and concepts. But the more I used it, the more sense it made. The HTML parser is maybe just 300 lines of code, and it reads every quirk correctly so far - and it's the Superpower library that makes this possible. Some of the usual problems are matching tags, (occasionally) unclosed tags, multiline attributes, markup in script and style tags and so on. I know that large documents are not suited for this kind of parser design, but the speed is around 1.5 seconds for a 1.2MB document (around 10 times longer than HAP and AngleSharp). That's pretty good without optimizations and probably with some mistakes made.

    To make a long story short, every developer should learn Superpower - the investment in time will come back tenfold. A lot of new opportunities will also open up, and you can make a lot of cool stuff with your new Superpowers (great name too 😀)

    Thanks Nicholas

    Nicholas Blumhardt
    @nblumhardt
    Woot! @ellested_gitlab that's awesome to hear, thanks for dropping by - much appreciated 😎
    Kenneth Ellested
    @ellested_gitlab
    Hi Nicholas - I've made a custom TextParser<TextSpan> based on the built-in Comment parsers. The idea is to match any character until a certain string/span occurs. I'm almost sure it can be done with the built-in parsers, but I haven't found out how yet. Example: "I need this text, until I encounter a <STOP>". So I need any character until the word <STOP>. My problem seems to be that Character.AnyChar is greedy, and I can't figure out a way to limit it. I'm sure this is ultra simple when you know how :-)
    Kenneth Ellested
    @ellested_gitlab

    This is actually what I'm looking for:
    public static TextParser<TextSpan> MatchUntil(string stopword) =>
        from value in Span.Regex($@".+?(?={Regex.Escape(stopword)})")
        select value;

    But I can't figure out how to make it with the fast parser methods in the library. I've tried combinations with Or, IgnoreThen and Try, but I fail every time. I'm sure I'm missing the point somewhere, so it would be great to see how this is done without the Regex.

    Nicholas Blumhardt
    @nblumhardt
    @ellested_gitlab I think we're missing something like Span.Until("<STOP>") - I have a feeling I've seen an implementation of it in the past, but can't put my finger on where it was, sorry :)
    Kenneth Ellested
    @ellested_gitlab
    OK, I feel better now :) - I was pretty sure I was just overlooking something fundamental. Anyway, seems like a nice challenge, so I will give it a try.
    Kenneth Ellested
    @ellested_gitlab
        public static class SpanEx
        {
            // Matches everything up to, but not including, the first occurrence of stopword.
            public static TextParser<TextSpan> Until(string stopword)
            {
                bool isWithinLength(TextSpan ts) => ts.Length >= stopword.Length;
                bool isStopwordMatching(TextSpan ts) => ts.First(stopword.Length).EqualsValue(stopword);
                bool isMatch(TextSpan ts) => isWithinLength(ts) && isStopwordMatching(ts);
    
                return (TextSpan input) =>
                {
                    TextSpan x = input;
    
                    // Advance one character at a time until the remainder starts with stopword.
                    while (!x.IsAtEnd && !isMatch(x))
                        x = x.ConsumeChar().Remainder;
    
                    // On success the match is located at input and x is the remainder.
                    return isMatch(x)
                      ? Result.Value(input.Until(x), input, x)
                      : Result.Empty<TextSpan>(input, $"Until expected {stopword}");
                };
            }
        }
    Came up with this, which is at least 15% faster than the Regex in my integration tests. I'm not so confident about how the error messaging works yet, so I'm not sure if this is fully compatible.
    If it's not totally off, I can submit a PR with my tests...
    Nicholas Blumhardt
    @nblumhardt
    Looks about right to me @ellested_gitlab - haven't thought through it in detail but a PR would be welcome, we can dig in further there! 👍
    Khiem Pham
    @vi3tkhi3m
    Hi, does anyone know how I can extract a string between two brackets that contains nested brackets? E.g. (I want to (extract (this)) text). The output should be: I want to (extract (this)) text. I've tried to use .Contained(OpenBracket, ClosingBracket), but this closes as soon as it sees the first closing bracket ... Thanks in advance!
    Kristian Hellang
    @khellang

    Hello 👋🏻 Does anyone have any pointers on how to tokenize/parse a template like this:

     Hello {upper(firstName)}!

    Basically, I want to tokenize everything outside the curlies as just text, including whitespace, but parse everything inside the curlies with full fidelity as identifiers etc., ignoring whitespace

    Kristian Hellang
    @khellang
    It feels like I want to nest tokenizers, where the outer would separate template from text, while the inner would dig into the template itself
    Kristian Hellang
    @khellang
    I guess I could always write the tokenizer by hand, but using TokenizerBuilder is just too lovely
    Nicholas Blumhardt
    @nblumhardt
    Heya @khellang !
    Yes, in fact there's an example of a parser exactly like this one at:
    Nicholas Blumhardt
    @nblumhardt
    Actually, that one might be more complicated than you need, since the expressions in that language include top-level { and }, so the end of a "hole" depends on the expression grammar; e.g. Hello {greeting({name: 'ted'})}!
    Nicholas Blumhardt
    @nblumhardt
    (Or Hello {greeting({name: '}')}! :-) )
    Think you'll need to either adopt something like that, though, or write the tokenizer by hand; TokenizerBuilder is a bit too simplistic for this
    Nicholas Blumhardt
    @nblumhardt
    Digging back into it some more, the Serilog.Expressions one was even nastier because of the need to support , and : as delimiters between the expression and the alignment/width/format specifiers, while also using them in various roles within the expression syntax. Hopefully if you tackle writing the tokenizer by hand, it won't be quite that nasty :-)
    If you need another set of eyes on anything, ping me here or by mail :)
    Nicholas Blumhardt
    @nblumhardt
    @vi3tkhi3m is it just text you're dealing with, or are there other more complex aspects to the grammar? If it's just text, iterating through character by character and tracking a depth variable for parenthesis nesting will be a lot more straightforward than doing this with a parser, I think.
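
A minimal sketch of the depth-tracking approach suggested above, in plain C# without Superpower. `BracketText` and `ExtractOutermost` are hypothetical names, and the sketch assumes the input is plain text where every `(` and `)` counts as a bracket (no quoting or escaping).

```csharp
using System;

public static class BracketText
{
    // Returns the text between the first '(' and its matching ')',
    // or null when there is no '(' or the brackets never balance.
    public static string ExtractOutermost(string input)
    {
        int depth = 0;   // current nesting level
        int start = -1;  // index just after the outermost '('

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '(')
            {
                if (depth == 0) start = i + 1;
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
                // Back at depth 0: this ')' closes the outermost '('.
                if (depth == 0) return input.Substring(start, i - start);
            }
        }

        return null;
    }
}
```

For the example from the question, `BracketText.ExtractOutermost("(I want to (extract (this)) text)")` returns `I want to (extract (this)) text`.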
    Kristian Hellang
    @khellang
    Thanks @nblumhardt! I ended up writing the tokenizer by hand. It was pretty straightforward for what I needed :)
    Thanks for an awesome library 😍
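
The hand-written tokenizer itself is not shown in the chat. As a rough sketch of what the outer pass of such a tokenizer might look like, the following plain C# splits a template into literal text and "hole" segments. `SegmentKind` and `TemplateSplitter` are hypothetical names, and the sketch assumes holes never contain `{` or `}` themselves, which, as noted earlier in the thread, does not hold for templates like Hello {greeting({name: 'ted'})}!.

```csharp
using System;
using System.Collections.Generic;

public enum SegmentKind { Text, Hole }

public static class TemplateSplitter
{
    // Outer pass only: text is kept verbatim (whitespace included),
    // hole contents are returned raw for a separate inner tokenizer.
    public static List<(SegmentKind Kind, string Value)> Split(string template)
    {
        var segments = new List<(SegmentKind Kind, string Value)>();
        int position = 0;

        while (position < template.Length)
        {
            int open = template.IndexOf('{', position);
            if (open < 0)
            {
                // No more holes: the rest is literal text.
                segments.Add((SegmentKind.Text, template.Substring(position)));
                break;
            }

            if (open > position)
                segments.Add((SegmentKind.Text, template.Substring(position, open - position)));

            int close = template.IndexOf('}', open + 1);
            if (close < 0)
                throw new FormatException($"Unclosed '{{' at index {open}.");

            segments.Add((SegmentKind.Hole, template.Substring(open + 1, close - open - 1)));
            position = close + 1;
        }

        return segments;
    }
}
```

Each `Hole` value could then be passed to a whitespace-ignoring expression tokenizer, which is the nesting of tokenizers described in the question.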
    Nicholas Blumhardt
    @nblumhardt
    @khellang 🙇
    José Manuel Nieto
    @SuperJMN
    Hi! Trouble Man here, kicking again!
    I hope somebody can help me with this. I'm a bit ashamed that I cannot handle it by myself: https://stackoverflow.com/questions/66959755/parse-string-between-a-pair-of-delimiters-that-are-strings