    Florian Rappl
    @FlorianRappl
    Hi @JettFlat - on parsing the page you will never see a server response 403. Potentially, you mean that on a parsed page you submit a form?
    Either way - AngleSharp does not come with a JS engine out of the box. For simple scripts using AngleSharp.Js may be sufficient, but the project is still experimental and may not work given a sufficiently complex script.
    Egil Hansen
    @egil
    Hi all
    Just wanted to drop in and say that I've pushed a non-preview version of AngleSharp Diffing
    And the docs are pretty much done for now. All the critical components are documented. Of course I discovered a few small things while writing docs, so I will push a v0.13.2 soon.
    Still wondering about the versioning strategy. Should all the sub projects try to follow AngleSharp's versions, and when are we going for a 1.0.0 release?
    Egil Hansen
    @egil
    As for keeping versions in sync, it might be hard since we are not releasing at the same pace, but I guess that is ok as long as it is patch (x.x.x+) or at most minor level increments (x.x+.x).
    Florian Rappl
    @FlorianRappl

    Yes that is one problem (not only for diffing, but for all the other libs, too).

    The road towards 1.0 is still ongoing. There should be a 0.14, soon - then the only thing left is a docs revamp, which I have planned for a long time. Unfortunately, I never found the time. Hopefully, I can do it quite soon (any contribution welcome!).

    Egil Hansen
    @egil
    Hehe yes, contributions always are. I'll see what I can do.
    As for versioning, one solution is to simply drop the version sync and have each project do its own versions, but explicitly say what versions of AngleSharp they are compatible with. For Diffing it should basically be all versions of AngleSharp that have the simple HTML5 DOM APIs it uses. So maybe I can change the reference to AngleSharp in Diffing's .csproj to be any version above 0.13.0?
    Ok well, I do have a few helper classes that help set up a diffing run. Those include calls to the AngleSharp specific APIs such as create context etc.
    Egil Hansen
    @egil
    OK, the new version of diffing has been pushed with mostly documentation updates. btw. @FlorianRappl it looks like your nuget auth key is about to expire.
    Also, in my Blazor testing library I am using GitHub Actions to do CI/CD. The flow is a little different from what we have, in that the creation of a tag (release) on GitHub triggers the CD flow, which pushes a new NuGet package to nuget.org. I have not integrated it into the pull requests as we have though, but I assume it is possible.
    Here is a twitter thread where I talk about it: https://twitter.com/egilhansen/status/1209244814048403456
    I must admit I like it better than the AppVeyor setup we have right now, because it is so much simpler - here is the workflow file: https://github.com/egil/razor-components-testing-library/blob/master/.github/workflows/nuget-pack-push.yml
    Florian Rappl
    @FlorianRappl
    Sure, the GitHub workflow is nice and should be simpler (after all, the system is much more modern than AppVeyor - which was back in the days one of the rare free CI/CD systems with .NET support). From my POV a migration can be done, but we need to make sure that all of AngleSharp can be migrated, i.e., all flows that currently work need to keep working. After all, the CI/CD pipeline is just one tool to get things done.
    Egil Hansen
    @egil
    Yeah, makes sense. I'm in no hurry. Still very much just learning and playing around.
    wuyu8512
    @wuyu8512
    Hi, guys
    I am trying to get the text content in Html
    When I try to use INode.TextContent, there is no correct line break
    I want to know how can I use GetInnerText correctly, I would be very grateful if there is a simple example
    Florian Rappl
    @FlorianRappl
    If you post your question to StackOverflow together with a MWE (https://en.wikipedia.org/wiki/Minimal_working_example) then I know somebody will be able to help you! (at least I myself will also understand what you are trying to do and how you are doing it)
    Egil Hansen
    @egil
    Hey @FlorianRappl
    Will raising an event on an IElement, e.g. a keypress event on an input element, change the elements value attribute, besides triggering any C# event handlers bound to the event?
    Florian Rappl
    @FlorianRappl
    Hm, the event alone will not. Of course, I don't know what any handler is doing. If a SetAttribute call is performed then it may change. But not on its own.
    Egil Hansen
    @egil
    Hmm, ok. The event handler will not cause any side effect on the DOM element. So in the example where there is a listener for onkeypress on an input element, the event handler will just receive the event, but not change the value attribute. The browser would normally do this, right? So that's the kind of side effects I am wondering if AngleSharp mimics?
    Florian Rappl
    @FlorianRappl

    Well, the handler is obviously attached to the DOM element. But this is only reflected in memory.

    Now regarding the relationship between value and onkeypress. value as in the value prop of the DOM element will obviously reflect its current state. So this will change in AngleSharp as it does in the DOM. The value attribute, however, reflects the defaultValue, which is the initial value. It won't change with any key press.

    AngleSharp mimics the same; we have an initial value (read from the attribute) and an internal value, which is wired together with the DOM prop accordingly.

    Egil Hansen
    @egil
    Interesting, I was not aware of the differences. So if I am writing a unit test, and for some reason want to see/assert that the value attribute/property has changed after an onkeypress, how do I do that? Is it just through the GetAttribute("value") method, or is that the one that points to the defaultValue, set on the element?
    Florian Rappl
    @FlorianRappl
    I would assess that via the Value property, e.g., var input = document.QuerySelector<IHtmlInputElement>("input#myinput"); Assert.AreEqual("whatever", input.Value);. https://github.com/AngleSharp/AngleSharp/blob/master/src/AngleSharp/Html/Dom/IHtmlInputElement.cs#L210
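    A minimal sketch of the property/attribute distinction described above, assuming the AngleSharp NuGet package is referenced. The markup, the id "myinput", and the strings "initial" and "typed" are made up for illustration. Setting Value stands in for what typing would do in a browser; per the explanation above, the attribute should keep the initial value.

```csharp
using System;
using AngleSharp;
using AngleSharp.Dom;
using AngleSharp.Html.Dom;

// Illustrative markup only; "myinput" and "initial" are hypothetical.
var context = BrowsingContext.New(Configuration.Default);
var document = await context.OpenAsync(req =>
    req.Content("<input id=\"myinput\" value=\"initial\">"));

var input = document.QuerySelector<IHtmlInputElement>("input#myinput");

// Stand-in for user input: change the current value through the DOM property.
input.Value = "typed";

// Current state of the control (the DOM property).
Console.WriteLine($"Value property:  {input.Value}");

// Initial value, which is what the value attribute reflects.
Console.WriteLine($"DefaultValue:    {input.DefaultValue}");
Console.WriteLine($"value attribute: {input.GetAttribute("value")}");
```

    In a unit test the assertion would then go against input.Value rather than GetAttribute("value").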
    James
    @jglover15
    Thanks AngleSharp team; I have begun using it for HTML parsing, very easy and smooth... love it. Being new to this, trying to use AngleSharp for XML has proved a bit more challenging. I have looked but have not found very good examples for parsing XML. Am I missing an obvious link? Just curious if there are some good examples using AngleSharp to parse XML. Thanks
    Florian Rappl
    @FlorianRappl

    Hi @jglover15 I guess not.

    There is one example using the parser directly:
    https://github.com/AngleSharp/AngleSharp.Xml

    Otherwise in the test files you find a lot more cases:
    https://github.com/AngleSharp/AngleSharp.Xml/tree/master/src/AngleSharp.Xml.Tests

    If you would tell me what use case you have I may be able to craft a simple snippet.

    James
    @jglover15
    @FlorianRappl Thanks for responding, I'll check the tests.
    With HTML parsing it's very simple to select an element or collection of elements with QuerySelector and QuerySelectorAll, then use the element or collection.
    With the XmlParser I have gotten the document parsed with no problem; what I was looking for is documentation on actually using the resulting XmlDocument. Maybe my experience with XML parsing is what is lacking and I should be looking somewhere else?
    Florian Rappl
    @FlorianRappl

    The XmlDocument is just a special instance of an IDocument, so all the things such as QuerySelector and QuerySelectorAll still work. In general this is the idea.

    In most use cases there will not be a difference to using a document that came from an HtmlParser. There are, of course, edge cases, mostly dealing with fragment parsing, which is not well-defined for XML. But even in such cases we now have "something" and it should just work.
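    As a rough sketch of the idea (not taken from the AngleSharp.Xml docs), assuming the AngleSharp and AngleSharp.Xml packages are referenced and that XmlParser lives in the AngleSharp.Xml.Parser namespace. The XML content and the element names "catalog", "book" and "title" are invented for the example.

```csharp
using System;
using AngleSharp.Dom;
using AngleSharp.Xml.Parser;

// Hypothetical XML input, only for illustration.
var source =
    "<catalog>" +
    "<book id=\"b1\"><title>First</title></book>" +
    "<book id=\"b2\"><title>Second</title></book>" +
    "</catalog>";

var parser = new XmlParser();
var document = parser.ParseDocument(source);

// Same selector API as with a document produced by the HtmlParser.
foreach (var book in document.QuerySelectorAll("book"))
{
    var title = book.QuerySelector("title");
    Console.WriteLine($"{book.GetAttribute("id")}: {title?.TextContent}");
}
```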

    James
    @jglover15
    Ok, thanks. I had been looking at first into the DocumentElement and SelectNodes/SelectSingleNode with XPath and not getting what I was expecting, but that is likely due to my not having used XPath before. I'll try the QuerySelectors. Thanks again.
    Jaja
    @a9261
    Hi all, AngleSharp's JS script behavior is not like the browser's, right? E.g., in the browser, if I declare a JavaScript variable with var, the variable is automatically attached to window. But if I use AngleSharp, it is not attached to window.
    Thanks
    Florian Rappl
    @FlorianRappl
    Depends. Sure it is like the browser, but AngleSharp.Js is experimental and certainly not as capable as the JS engines from browsers. Thus your JS will potentially not run at all. Also if you append a value to window it is only available from the JS Engine - not from the window accessible to C# code.
    Egil Hansen
    @egil
    Hey @FlorianRappl
    I see 0.14 is out on nuget
    Is there an ETA on AngleSharp.Css?
    Florian Rappl
    @FlorianRappl
    Currently on it
    We will need to cut features here, unfortunately.
    Quite quick :)!
    Egil Hansen
    @egil
    cut features? well as long as you do not cut stuff that is in 0.13 I am happy ;)
    Florian Rappl
    @FlorianRappl
    No - nothing is removed. Just items from the backlog (https://github.com/AngleSharp/AngleSharp.Css/milestone/5) moved out of the milestone. I moved out 2 items already, but I am not sure if we get the remaining 2 in. Let's see. Maybe AngleSharp.Css will be released tomorrow in 0.14.0.
    Egil Hansen
    @egil
    OK, fingers crossed. I will be ready to push a new version, assuming all my tests stay green after the upgrade.
    Jenix-Park
    @Jenix-Park
    I installed AngleSharp Core within my Unity project and it worked like a charm.
    But after adding AngleSharp.Js along with Jint to my project, Unity editor always crashes whenever I build.
    I read through FAQ but no answer there.
    I even removed all the AngleSharp.Js related code, but the build still fails.
    Is AngleSharp Core the only one that is supported in Unity?
    2 replies
    Milosz Kukla
    @miloszkukla

    Florian:

    As written - no not directly, but we could add another attribute on the interface (e.g., DomBoolean).

    Egil:

    Could another option be to simply normalize the value of the attribute so that it's always empty or always the name of the attribute if it's truthy, or simply remove the attribute if it has a falsy value?

    Florian:

    The attribute will always reflect the attribute - this is according to the specs
    If we would just change this then we would violate the specs which violates one of the core principles of AngleSharp

    @FlorianRappl what did you mean by "The attribute will always reflect the attribute" ?
    Florian Rappl
    @FlorianRappl
    What I meant is that GetAttribute will always reflect the real / raw value, while properties (such as href) may be normalized / changed somehow.
    Milosz Kukla
    @miloszkukla
    oh so now I think I understand what Egil was asking about, thanks :)
    Rune Jacobsen
    @havremunken

    Hey guys, so I have this simple code

                var config = Configuration.Default.WithDefaultLoader();
                var context = BrowsingContext.New(config);
                var document = await context.OpenAsync("http://some.url/file.html");

    And I am using document.QuerySelectorAll() to parse some anchor tags. These have relative links, like "file2.html". document.BaseUri in this case is http://some.url/file.html - is there a simple way of creating the full URL for accessing file2.html in this case? In this trivial example that would mean removing file.html and substituting file2.html - but is there a way to say "create a full URL based on this BaseURI and this relative link" that will work like a browser does?
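    One way to sketch this without relying on any AngleSharp-specific API is the two-argument System.Uri constructor from the .NET base library, which resolves a relative reference against a base address. ResolveLink is a hypothetical helper name, and the URLs are the placeholder ones from the question.

```csharp
using System;

// Hypothetical helper (not part of AngleSharp): resolves a relative link
// against a base address using System.Uri.
static string ResolveLink(string baseUri, string relativeLink)
{
    var baseAddress = new Uri(baseUri, UriKind.Absolute);
    // The (Uri, string) constructor combines the base address with the relative reference.
    var resolved = new Uri(baseAddress, relativeLink);
    return resolved.AbsoluteUri;
}

// Sibling file: the last path segment of the base is replaced.
Console.WriteLine(ResolveLink("http://some.url/file.html", "file2.html"));
// prints http://some.url/file2.html

// Root-relative link: the whole path is replaced.
Console.WriteLine(ResolveLink("http://some.url/file.html", "/a/b.html"));
// prints http://some.url/a/b.html
```

    With a loaded document, document.BaseUri would be passed as the first argument and each anchor's href attribute as the second.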

    8 replies
    Tom Hazell
    @The-Nutty
    Hi Guys,
    I'm doing some profiling on the HTML parsing, as we have noticed that it is sometimes very slow (we think on documents that are invalid or otherwise not quite right). HtmlDomBuilder#HeisenbergAlgorithm keeps coming up in the profiles, and googling HeisenbergAlgorithm does not bring up anything related. Would someone be able to point me to something that explains what it's doing?
    In trying to understand what it's doing, I was looking through Chrome's Blink source code for the place where HeisenbergAlgorithm would have been called, and the logic they employ seems way simpler/quicker (code). The equivalent of what they are doing seems to be:
    1. Find the A tag in _formattingElements, if there is one
    2. Call ProcessFakeEndTag, which when you are in the body just calls InBodyEndTag on a new token of type end tag with tag name a.
    3. Remove the active A tag from _formattingElements
    4. If the open elements contain the A tag, remove it from there.
      Is there some difference between the two parsers that I'm missing?
    8 replies