Michael[tm] Smith
@sideshowbarker
I will try later myself too
Peter Rushforth
@prushforth
cool, pretty neat code
Michael[tm] Smith
@sideshowbarker
@prushforth OK yeah so I tried it and it seems you need to have a clone of the mozilla-central repo
either that or you need these two header files from it:
@prushforth This should work:
curl -s -O https://hg.mozilla.org/mozilla-central/raw-file/tip/parser/htmlparser/nsHTMLTagList.h \
&& curl -s -O https://hg.mozilla.org/mozilla-central/raw-file/tip/dom/svg/SVGTagList.h \
&& java -cp jars/htmlparser.jar nu.validator.htmlparser.impl.ElementName nsHTMLTagList.h SVGTagList.h \
> ElementName.java.OUT
Michael[tm] Smith
@sideshowbarker
Then take the contents of that ElementName.java.OUT file, and in the src/nu/validator/htmlparser/impl/ElementName.java source file, replace everything from the // START GENERATED CODE line to end with the contents of ElementName.java.OUT
Michael[tm] Smith
@sideshowbarker
…and then re-comment the parts you commented out before you did that
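The splice step described above can be scripted. What follows is only a minimal sketch, not part of the htmlparser build: the class name SpliceGenerated and its splice helper are hypothetical. It assumes the marker line itself is kept and that everything after it is replaced by the generator output. The chat does not say whether ElementName.java.OUT already includes the marker or the closing brace, so check that the result compiles.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: replaces the tail of a source file with regenerated code.
public class SpliceGenerated {
    // Marker named in the chat; text after this line is treated as generated.
    static final String MARKER = "// START GENERATED CODE";

    // Returns the source up to and including the marker line, followed by the
    // generated text. Assumption: the marker line is kept, not replaced.
    static String splice(String source, String generated) {
        int markerStart = source.indexOf(MARKER);
        if (markerStart < 0) {
            throw new IllegalArgumentException("marker not found: " + MARKER);
        }
        int lineEnd = source.indexOf('\n', markerStart);
        String head = lineEnd < 0 ? source + "\n" : source.substring(0, lineEnd + 1);
        return head + generated;
    }

    // Usage: java SpliceGenerated <ElementName.java> <ElementName.java.OUT>
    public static void main(String[] args) throws IOException {
        Path sourcePath = Paths.get(args[0]);
        Path generatedPath = Paths.get(args[1]);
        String source = new String(Files.readAllBytes(sourcePath), StandardCharsets.UTF_8);
        String generated = new String(Files.readAllBytes(generatedPath), StandardCharsets.UTF_8);
        // Overwrites the source file in place; keep it under version control.
        Files.write(sourcePath, splice(source, generated).getBytes(StandardCharsets.UTF_8));
    }
}
```

Run it with the path to ElementName.java and the path to ElementName.java.OUT as the two arguments, then review the diff before re-commenting the generator code.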
Peter Rushforth
@prushforth
@sideshowbarker thanks for this good info, it helps a lot! Cheers!
Michael[tm] Smith
@sideshowbarker
cheers @prushforth
Peter Rushforth
@prushforth
Hi @sideshowbarker, it worked, but I had a chicken-and-egg problem. By only using the .h files as input, I kept getting a message that the element I had added (to nsHTMLTagList.h) was not known. I manually added it to ElementName.java with a hash of 0, and then was able to generate the appropriate hash code. So thanks for that boost. My next task will be more about how to convince the parser to create a MapML document instead of an HTML document. I will look at how it does that for an SVG document, as I think it will need a similar approach.
I've been using TreePrinter.java as a harness for the parser.
Michael[tm] Smith
@sideshowbarker
@prushforth Glad to hear you’re making progress. I’m looking forward to seeing what you come up with
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike, so I took a stab at adding the <mapml> tag as an alternative for the parser. That effort is not complete, as I believe there may be more invasive changes needed to integrate it fully. But I got it running and sort of parsing some test files I committed. The big issue currently is how to deal with namespaces: because MapML is theoretically MicroXML, when it is parsed by the XML parser it seems to autogenerate xmlns="", or something like that. It's like the <!DOCTYPE html> is the HTML equivalent of xmlns="http://www.w3.org/1999/xhtml" for XML. Anyway, I made a spreadsheet of results from running some sample files through the testing tools; maybe you could have a look at the results and see if you have any comment? https://github.com/prushforth/htmlparser/blob/validator-nu/Mapml_parsing_testcases.xlsx If you don't have time, no worries. I'll keep plugging away at it.
Cheers, Peter
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike, so I've got the validator validating using my mapml.rnc (haven't got to schematron stuff yet). I think that system is pretty awesome, as I hardly had to change a thing to get that working. A weird thing is happening, though, and I'm curious. Even though I added an optional lang attribute to the root <mapml> element, I am getting a validation error from another schema / checker: "Attribute lang not allowed on element mapml at this point." I am pretty certain that's not a result of my .rnc file, because I tested it in oXygen. I suspect some XMLReader that is embedded in the process. Unfortunately, I can't debug the statement that is generating that error. I speculate it's because it's in code generated by jing, but I am uncertain. It is being generated via an ImpossibleAttributeIgnoredException thrown by com.thaiopensource.relaxng.impl.PatternValidator, but the debugger refuses to go to that line during execution because it says there's no line number information associated with it. Anyway, the last statement I can get the debugger on is in AttributesPermutingXMLReaderWrapper.startElement, before it throws and emits messages. Do you have a notion of why there is a validator in the stack above that?
Michael[tm] Smith
@sideshowbarker
howdy @prushforth
Peter Rushforth
@prushforth
Hi!
Michael[tm] Smith
@sideshowbarker
but I think that should only be affecting elements in the HTML namespace
Peter Rushforth
@prushforth
Yeah, I'm kinda pretending mapml is in the html namespace.
Michael[tm] Smith
@sideshowbarker
ah
so that makes it even more likely that the langattributes code is the cause
that code causes lang attributes to be dropped from the parser stream under certain conditions before the RelaxNG checking gets performed
Peter Rushforth
@prushforth
Yes, that is a pretty good call I think. I also just discovered that I can avoid the suggestion to add the lang attribute if I set the nu.validator.checker.enableLangDetection system property and thus avoid the flipside warning that comes if you omit the attribute (but it's still in the schema as optional)
Michael[tm] Smith
@sideshowbarker
ah yeah
for your use case you probably don’t want any language detection to be performed anyway
Peter Rushforth
@prushforth
No I didn't but I wasn't averse to adding the attribute, but then it got a little hairy
Peter Rushforth
@prushforth
Doesn't seem to be https://github.com/validator/validator/blob/master/src/nu/validator/xml/langattributes/XmlLangAttributeDroppingContentHandlerWrapper.java as a breakpoint in the startElement method for that ContentHandler doesn't get hit when validating a mapml document with <mapml lang="en">, perhaps because the schema isn't one of those designated by VerifierServletTransaction.isXmlLangAllowingSchema() as allowing xml:lang, and indeed it's lang, not xml:lang. But I think I will just avoid the whole issue by not adding the attribute and chalk it up to something that will no doubt become clear in time. Thanks for your advice on this! Have a great Monday December 17th (my 58th birthday, which I am taking off as a national holiday).
Michael[tm] Smith
@sideshowbarker
oh!
happy 58th!
Peter Rushforth
@prushforth
Thanks!!
Michael[tm] Smith
@sideshowbarker
enjoy the day off
Peter Rushforth
@prushforth
Whoa, I changed htmlParser.setMappingLangToXmlLang() from true to false and now it's validating correctly. I see in the HtmlAttributes class hixie has a note "Be careful with this class. QName is the name in from HTML tokenization." Which makes me feel like Young James Herriot (dinna mess wit sumat ye ken nawt aboot).
Michael[tm] Smith
@sideshowbarker
@prushforth heh. I wish I could claim I understand all that handling myself, but actually every time I need to mess around with it, I basically have to re-learn it all again — because I forgot whatever I had known about it from the previous time I touched it. And then after re-learning it I just forget it all again until the next time
Peter Rushforth
@prushforth
Lol
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike Happy New Year! What is the division of labour between a checker and Assertions.java? Is there a category of task which deserves its own checker? Thanks.
Michael[tm] Smith
@sideshowbarker
Happy New Year @prushforth. I’ve not really followed any strict division, but the language-detector code is an example of something added more recently that seemed big enough on its own to merit a separate checker.
Going a little further back in time, another example is the Microdata checker
on the other hand, the RDFaLite checker is one that I made into a separate checker but that in hindsight I could have just as well incorporated into the Assertions.java code, I guess
…because it’s much simpler than the Microdata checker, and does far fewer checks
the XmlPiChecker is another that was a big-enough and complicated-enough chunk of code on its own to merit a separate checker
Peter Rushforth
@prushforth
OK great, I will keep that info in mind. Good evening!
Mike
@skazx_twitter
hey all, is this room for a specific type of validator? I'm using express-validator.js and had a question about validating a string that is alphanumeric with spaces (street address).
Michael[tm] Smith
@sideshowbarker
@jaimeiniesta I finally set up some automation that makes a new jar available for every commit to master
Mavaddat Javid
@mavaddat
hi all I would like to build a child-friendly version of nu validator that uses more plain English and visuals to explain syntax mistakes to children learning to code