AngleSharp is the ultimate angle brackets parser library. It parses HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specification.
Hi. I'm trying to write some code to find an h1 on a page and get its computed style. The code I've got works fine for styles that are inline in the HTML document, but the styles I have in an external style sheet don't seem to be applied.
Are external style sheets supported, and is there something I need to do to enable them, perhaps in the config? I have tried Configuration.Default.WithDefaultLoader(new LoaderOptions { IsResourceLoadingEnabled = true }).WithCss().
Hi, I'm trying out AngleSharp in F# and seeing very strange behavior: QuerySelectorAll basically works only for the top-level body element, but not for any elements inside it. Here's an example:
> doc.QuerySelectorAll("body") |> Seq.tryHead |> Option.map (fun n -> n.TagName);;
val it : string option = Some "BODY"
> doc.QuerySelectorAll("div") |> Seq.tryHead |> Option.map (fun n -> n.TagName);;
val it : string option = None
>
This is for the document loaded from this url.
The HtmlBodyElement has the correct BaseUrl, but the value of OuterHtml is a mere "<body></body>". Hmm, that might be it.
But when I just tried with local content, it works:
let getDoc (htmlContent: string) =
    let cfg = Configuration.Default.WithDefaultLoader()
    let ctx = BrowsingContext.New(cfg)
    async { return! ctx.OpenAsync(fun req -> req.Content(htmlContent) |> ignore) |> Async.AwaitTask } |> Async.RunSynchronously

[<EntryPoint>]
let main argv =
    let doc = getDoc "<body><div>Hello</div></body>"
    let cells = doc.QuerySelectorAll("div")
    let titles = query {
        for cell in cells do
        select cell.TextContent
    }
    printfn $"Printing {Seq.length titles} titles"
    for title in titles do
        printfn $"Title: {title}"
    0 // return an integer exit code
^ This works, which is good enough I guess, as I don't plan to load remote content anyway.
Is it possible to get the contents of a <template> tag (so as to query on it)? In JS, you would do it using the .content property.
For example, the td inside the <template> tag of https://developer.mozilla.org/en-US/docs/Web/HTML/Element/template#examples
I tried let tmpl = doc.QuerySelectorAll("template"), which returns the template tag, but I can't drill down further than that.
let templates = doc.QuerySelectorAll("template")
let titles =
    query {
        for tmpl in templates do
        for p in tmpl.QuerySelectorAll("p") do
        select p.OuterHtml
    }
tmpl.GetContentFragment().QuerySelectorAll("p") - but what would be GetContentFragment?
Not sure what tmpl.GetContentFragment is / should be. I guess you want Content?
QuerySelectorAll returns a collection of IElement. I should cast that to IHtmlTemplateElement.
QuerySelector looks interesting - let me see how I can use it to pull multiple <template> tags.
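For reference, a rough sketch of the cast-then-Content idea from the messages above. It assumes the AngleSharp NuGet package is referenced; the sample HTML and the helper names getDoc and templateParagraphs are made up for illustration, so treat it as one possible shape rather than the recommended way.

```fsharp
open AngleSharp
open AngleSharp.Dom
open AngleSharp.Html.Dom

// Parse an HTML string into a document (same approach as the getDoc helper earlier in the chat).
let getDoc (htmlContent: string) =
    let ctx = BrowsingContext.New(Configuration.Default)
    ctx.OpenAsync(fun req -> req.Content(htmlContent) |> ignore)
    |> Async.AwaitTask
    |> Async.RunSynchronously

// Hypothetical helper: outer HTML of every <p> found inside any <template>.
let templateParagraphs (doc: IDocument) =
    doc.QuerySelectorAll("template")
    // QuerySelectorAll yields IElement, so keep only the elements that really are templates.
    |> Seq.choose (fun el ->
        match el with
        | :? IHtmlTemplateElement as tmpl -> Some tmpl
        | _ -> None)
    // The parsed children of a <template> sit in its Content fragment,
    // which is why querying the element itself finds nothing.
    |> Seq.collect (fun tmpl -> tmpl.Content.QuerySelectorAll("p"))
    |> Seq.map (fun (p: IElement) -> p.OuterHtml)
    |> Seq.toList

// Usage with made-up sample markup.
let sampleDoc = getDoc "<body><template><p>Hello</p></template></body>"
printfn "%A" (templateParagraphs sampleDoc)
```

The type test inside Seq.choose is what lets the rest of the pipeline see Content without a separate cast step.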
I was trying something like the following so as to obviate having to cast things later:
let templates : IHtmlCollection<IHtmlTemplateElement> =
    doc.QuerySelectorAll<IHtmlTemplateElement>("template")
But this just throws an
error FS0001: This expression was expected to have type 'IHtmlCollection<IHtmlTemplateElement>' but here has type 'Collections.Generic.IEnumerable<IHtmlTemplateElement>'
System.Collections.Generic. Polymorphism for the win.
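A sketch of one way to make that binding compile, going only by what the error message says (the typed QuerySelectorAll hands back an IEnumerable, not an IHtmlCollection): annotate it as seq<IHtmlTemplateElement>, which is F#'s alias for IEnumerable<T>, or drop the annotation altogether. The sample HTML is made up, and it assumes the AngleSharp NuGet package is referenced.

```fsharp
open AngleSharp
open AngleSharp.Dom
open AngleSharp.Html.Dom

// Parse a small made-up document from a string.
let doc =
    let ctx = BrowsingContext.New(Configuration.Default)
    ctx.OpenAsync(fun req -> req.Content("<body><template><p>Hi</p></template></body>") |> ignore)
    |> Async.AwaitTask
    |> Async.RunSynchronously

// seq<'T> is F#'s alias for IEnumerable<'T>, matching the type reported in the FS0001 message.
let templates : seq<IHtmlTemplateElement> =
    doc.QuerySelectorAll<IHtmlTemplateElement>("template")

// No cast needed here: each item is already an IHtmlTemplateElement, so Content is available.
for tmpl in templates do
    printfn "%d child node(s) in template content" tmpl.Content.ChildNodes.Length
```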
As an aside, it seems that I'm reinventing XSLT, as it appears to do the 'templating' feature I'm trying to build from scratch: https://developer.mozilla.org/en-US/docs/Web/API/XSLTProcessor/Basic_Example
(My idea uses JSON as data input; but XSLT uses XML as data input, which might be more interesting when used with AngleSharp)