AngleSharp is the ultimate angle brackets parser library. It parses HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specification.
I have let tmpl = doc.QuerySelectorAll("template"), which returns the template tag, but I can't drill down any further than that.
let templates = doc.QuerySelectorAll("template")
let titles =
    query {
        for tmpl in templates do
        for p in tmpl.QuerySelectorAll("p") do
        select p.OuterHtml
    }
titles is an empty seq, so the "p" querying didn't work, which isn't surprising I guess, because I'm supposed to get the fragment inside tmpl and then query on it.
Something like tmpl.GetContentFragment().QuerySelectorAll("p") - but what would GetContentFragment be?
I'm not sure what tmpl.GetContentFragment is / should be. I guess you want Content?
QuerySelectorAll returns a collection of IElement. I should cast that to IHtmlTemplateElement.
QuerySelector looks interesting - let me see how I can use it to pull multiple <template> tags.
I was trying something like the following so as to avoid having to cast things later:
let templates : IHtmlCollection<IHtmlTemplateElement> =
    doc.QuerySelectorAll<IHtmlTemplateElement>("template")
But this just throws an error:
error FS0001: This expression was expected to have type
    'IHtmlCollection<IHtmlTemplateElement>'
but here has type
    'Collections.Generic.IEnumerable<IHtmlTemplateElement>'
System.Collections.Generic. Polymorphism for the win.
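For reference, a minimal F# sketch of the two points above. The HtmlParser setup and the sample markup are my own assumptions, not from the discussion. The generic QuerySelectorAll<T> extension yields a plain sequence rather than an IHtmlCollection, so the annotation is seq<_>, and the paragraphs inside a <template> are reached through its Content fragment rather than by querying the element itself.

```fsharp
open AngleSharp.Dom
open AngleSharp.Html.Dom
open AngleSharp.Html.Parser

// Sample markup (hypothetical), parsed without any browsing context.
let parser = HtmlParser()
let doc = parser.ParseDocument("<body><template><p>One</p><p>Two</p></template></body>")

// The generic overload returns IEnumerable<T>, so annotate with seq<_>.
let templates : seq<IHtmlTemplateElement> =
    doc.QuerySelectorAll<IHtmlTemplateElement>("template")

// Children of a <template> live in its Content fragment, not under the element.
let paragraphs =
    [ for tmpl in templates do
        for p in tmpl.Content.QuerySelectorAll("p") do
            yield p.OuterHtml ]

paragraphs |> List.iter (printfn "%s")
```

Querying tmpl directly finds nothing because the parser places the template's children in the separate fragment.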
As an aside, it seems that I'm reinventing XSLT, as it appears to do the 'templating' feature I'm trying to build from scratch: https://developer.mozilla.org/en-US/docs/Web/API/XSLTProcessor/Basic_Example
(My idea uses JSON as data input; but XSLT uses XML as data input, which might be more interesting when used with AngleSharp)
Does anyone know why the sample
static async Task Main()
{
    var jsService = new JsScriptingService();
    var config = Configuration.Default.With(jsService);
    var context = BrowsingContext.New(config);
    var source = "<!DOCTYPE html><html><head><title>Test</title><script>function f1(el) { document.body.appendChild(el); }</script></head><body><h1>Some example source</h1><p>This is a paragraph element.</p><script>var em1 = document.createElement('em'); em1.textContent = 'This is some emphasized text.'; f1(em1);</script></body></html>";
    var document = await context.OpenAsync(req => req.Content(source));
    Console.WriteLine("Serializing the (original) document:");
    Console.WriteLine(document.DocumentElement.OuterHtml);
    var f1 = jsService.GetOrCreateJint(document).GetValue("f1");
    var bElement = document.CreateElement("b");
    bElement.TextContent = "This is some bold text.";
    var jsB = JsValue.FromObject(jsService.GetOrCreateJint(document), bElement);
    f1.Invoke(jsB);
    Console.WriteLine("Serializing the document again:");
    Console.WriteLine(document.DocumentElement.OuterHtml);
}
gives a JavaScript exception
Jint.Runtime.JavaScriptException
HResult=0x80131500
Message=
Source=jint
StackTrace:
at Jint.Native.Function.ScriptFunctionInstance.Call(JsValue thisArg, JsValue[] arguments)
at Jint.Native.JsValue.Invoke(JsValue thisObj, JsValue[] arguments)
at Jint.Native.JsValue.Invoke(JsValue[] arguments)
on the f1.Invoke(jsB) call? I use release 0.15 of AngleSharp and 0.14 of AngleSharp.Js.
Is the AngleSharp.Url class supposed to be exposed to JavaScript as the URL (https://developer.mozilla.org/en-US/docs/Web/API/URL) constructor? I am trying to load an HTML document using JavaScript with AngleSharp 0.15 and AngleSharp.Js 0.14 and get a script error "ReferenceError: URL is not defined" from jint. Is there anything required, or any way possible, to have URL defined in JavaScript based on the AngleSharp.Url class?
Hello there, regarding AngleSharp.CSS: Is it possible to configure the CssParser to accept invalid values? From what I can see this used to be a configuration option, but not anymore? I know I can do this:
new CssParser(new CssParserOptions
{
    IsIncludingUnknownDeclarations = true,
    IsIncludingUnknownRules = true,
    IsToleratingInvalidSelectors = true
})
but there is nothing related to IsToleratingInvalidValues.
BrowserContextExtensions.OpenAsync: does it only parse the document, or does it also load scripts and stylesheets and set cookies? I don't need anything more than parsing.
I pass WithDefaultLoader a LoaderOptions with IsNavigationDisabled set to true and IsResourceLoadingEnabled set to false. Is that enough to have it only download and parse the HTML document? I don't need or want anything else.
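For anyone following along, the setup described in that question would look roughly like this in F#. This is only a sketch of the configuration as stated, assuming the LoaderOptions type from the AngleSharp.Io namespace. It does not answer whether these settings are sufficient.

```fsharp
open AngleSharp
open AngleSharp.Io

// Loader settings exactly as described in the question above.
let options =
    LoaderOptions(IsNavigationDisabled = true, IsResourceLoadingEnabled = false)

// Register the default loader with those options and create a context from it.
let config = Configuration.Default.WithDefaultLoader(options)
let context = BrowsingContext.New(config)
```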