AngleSharp is the ultimate angle brackets parser library. It parses HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specification.
Egil:
Could another option be to simply normalize the value of the attribute, so that it's always empty or always the name of the attribute if it's truthy, or simply remove the attribute if it has a falsy value?
Florian:
The attribute will always reflect the attribute - this is according to the specs
If we just changed this, we would violate the specs, which goes against one of the core principles of AngleSharp
Hey guys, so I have this simple code
var config = Configuration.Default.WithDefaultLoader();
var context = BrowsingContext.New(config);
var document = await context.OpenAsync("http://some.url/file.html");
And I am using document.QuerySelectorAll() to parse some anchor tags. These have relative links, like "file2.html". document.BaseUri in this case is http://some.url/file.html - is there a simple way of creating the full URL for accessing file2.html in this case? In this trivial example that would mean removing file.html and substituting file2.html - but is there a way to say "create a full URL based on this BaseUri and this relative link" that will work like a browser does?
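One possible sketch, not specific to AngleSharp: .NET's System.Uri can resolve a relative reference against a base URI following the usual URL resolution rules. The UrlResolution class and Resolve helper below are hypothetical names used only for illustration; the base string would be whatever document.BaseUri returns.

```csharp
using System;

// Hypothetical helper for illustration; not part of AngleSharp.
public static class UrlResolution
{
    // Resolves a relative link (e.g. an anchor's href attribute value)
    // against a base URI such as the value of document.BaseUri.
    public static string Resolve(string baseUri, string relativeLink)
    {
        var baseAddress = new Uri(baseUri);
        // The two-argument Uri constructor combines base and relative parts,
        // replacing the last path segment of the base where appropriate.
        var resolved = new Uri(baseAddress, relativeLink);
        return resolved.AbsoluteUri;
    }
}
```

With the example from the question, UrlResolution.Resolve("http://some.url/file.html", "file2.html") gives http://some.url/file2.html.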
HtmlDomBuilder#HeisenbergAlgorithm keeps coming up; googling HeisenbergAlgorithm does not bring up anything related. Would someone be able to point me to the right place that explains what it's doing?
Hi everyone. Great to see this tool. I'm evaluating it with the purpose of improving our Web UI test automation.
We are considering adopting the ASP.NET Core Test Server (https://docs.microsoft.com/en-us/aspnet/core/test/integration-tests?view=aspnetcore-3.1), and I'm curious what the best way is to integrate AngleSharp into the TestServer pipeline. I can see the MS example first gets the full response from the HttpClient created by WebApplicationFactory, which I imagine only works for the initial page load.
We would like to include JavaScript and subsequent Ajax calls in the testing (it's an ASP.NET Core Angular SPA website), which I assume needs a more sophisticated integration with AngleSharp (via the extensibility points, so that it internally always uses the HttpClient provided by the TestServer).
I had a look at the extensibility points, but based on the docs alone (https://anglesharp.github.io/docs/API.html) I can't tell whether that's the way to go or where to start.
Any ideas appreciated => AngleSharp testing Asp Net Core Angular Spa
Maybe something like this would be a good starting point?
https://developer.mozilla.org/en-US/docs/Web/API/Element/scrollIntoView
Hi. I'm trying to write some code to find an h1 on a page and get its computed style. The code I've got works fine for styles that are inline in the HTML document, but the styles I have in an external style sheet don't seem to be applied.
Are external style sheets supported and is there something I need to do to enable external stylesheets, perhaps in the config? I have tried Configuration.Default.WithDefaultLoader(new LoaderOptions { IsResourceLoadingEnabled = true }).WithCss()
Hi, I'm trying out AngleSharp in F#, and experiencing very strange behavior with QuerySelectorAll: it's basically working only for the top-level body element, but not for any elements inside it. Here's an example:
> doc.QuerySelectorAll("body") |> Seq.tryHead |> Option.map (fun n -> n.TagName);;
val it : string option = Some "BODY"
> doc.QuerySelectorAll("div") |> Seq.tryHead |> Option.map (fun n -> n.TagName);;
val it : string option = None
>
This is for the document loaded from this URL. HtmlBodyElement has the correct BaseUrl, but the value of OuterHtml is a mere "<body></body>". Hmm, that might be it.
But when I just tried with local content, it works:
let getDoc (htmlContent: string) =
    let cfg = Configuration.Default.WithDefaultLoader()
    let ctx = BrowsingContext.New(cfg)
    async { return! ctx.OpenAsync(fun req -> req.Content(htmlContent) |> ignore) |> Async.AwaitTask } |> Async.RunSynchronously

[<EntryPoint>]
let main argv =
    let doc = getDoc "<body><div>Hello</div></body>"
    let cells = doc.QuerySelectorAll("div")
    let titles = query {
        for cell in cells do
        select cell.TextContent
    }
    printfn $"Printing {Seq.length titles} titles"
    for title in titles do
        printfn $"Title: {title}"
    0 // return an integer exit code
^ This works, which is good enough I guess, as I don't plan to load remote content anyway.
Is it possible to get the contents of a <template> tag (so as to query on it)? In JS, you would do it using the .content property.
td inside the <template> tag of https://developer.mozilla.org/en-US/docs/Web/HTML/Element/template#examples
let tmpl = doc.QuerySelectorAll("template"), which returns the template tag, but I can't drill down further than that.
let templates = doc.QuerySelectorAll("template")
let titles =
    query {
        for tmpl in templates do
        for p in tmpl.QuerySelectorAll("p") do
        select p.OuterHtml
    }
titles is an empty seq, so the "p" querying didn't work, which isn't surprising I guess, because I'm supposed to get the fragment inside tmpl and then query on it.
tmpl.GetContentFragment().QuerySelectorAll("p") - but what would GetContentFragment be?
tmpl.GetContentFragment is / should be. I guess you want Content?