Peter Rushforth
@prushforth
@sideshowbarker thanks for this good info, it helps a lot! Cheers!
Michael[tm] Smith
@sideshowbarker
cheers @prushforth
Peter Rushforth
@prushforth
Hi @sideshowbarker, it worked, but I had a chicken-and-egg problem. By only using the .h files as input, I kept getting a message that the element I had added (to nsHTMLTagList.h) was not unknown. I manually added it to ElementName.java, with a hash of 0, and then was able to generate the appropriate hash code. So thanks for that boost. My next task will be more about how to convince the parser to create a MapML document instead of an HTML document. I will look at how it does that for an SVG document, as I think it will need a similar approach.
I've been using TreePrinter.java as a harness for the parser.
Michael[tm] Smith
@sideshowbarker
@prushforth Glad to hear you’re making progress. I’m looking forward to seeing what you come up with
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike, so I took a stab at adding the <mapml> tag as an alternative for the parser. That effort is not complete, as I believe there may be more invasive changes needed to integrate it fully. But I got it running and sort of parsing some test files I committed. The big issue currently is how to deal with namespaces: MapML is theoretically MicroXML, but when parsed by the XML parser it seems to autogenerate xmlns="", or something like that. It's like the <!DOCTYPE html> is the HTML equivalent of xmlns="http://www.w3.org/1999/xhtml" for XML. Anyway, I made a spreadsheet of results when running some sample files through the testing tools. Maybe you could have a look at the results and see if you have any comment? https://github.com/prushforth/htmlparser/blob/validator-nu/Mapml_parsing_testcases.xlsx If you don't have time, no worries. I'll keep plugging away at it.
Cheers, Peter
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike, so I've got the validator validating using my mapml.rnc (haven't got to the Schematron stuff yet). I think that system is pretty awesome, as I hardly had to change a thing to get that working. A weird thing is happening, though, and I'm curious. Even though I added an optional lang attribute to the root <mapml> element, I am getting a validation error from another schema / checker: "Attribute lang not allowed on element mapml at this point." I am pretty certain that's not a result of my .rnc file, because I tested it in oXygen. I suspect some XMLReader that is embedded in the process. Unfortunately, I can't debug the statement that is generating that error. I speculate it's because it's in code generated by Jing, but I am uncertain. The error is generated via an ImpossibleAttributeIgnoredException thrown by com.thaiopensource.relaxng.impl.PatternValidator, but the debugger refuses to go to that line during execution because it says there's no line number information associated with it. Anyway, the last statement I can get the debugger on is in AttributesPermutingXMLReaderWrapper.startElement, before it throws and emits messages. Do you have a notion of why there is a validator in the stack above that?
Michael[tm] Smith
@sideshowbarker
howdy @prushforth
Peter Rushforth
@prushforth
Hi!
Michael[tm] Smith
@sideshowbarker
but I think that should only be affecting elements in the HTML namespace
Peter Rushforth
@prushforth
Yeah, I'm kinda pretending mapml is in the html namespace.
Michael[tm] Smith
@sideshowbarker
ah
so that makes it even more likely that the langattributes code is the cause
that code causes lang attributes to be dropped from the parser stream under certain conditions before the RelaxNG checking gets performed
Peter Rushforth
@prushforth
Yes, that is a pretty good call I think. I also just discovered that I can avoid the suggestion to add the lang attribute if I set the nu.validator.checker.enableLangDetection system property and thus avoid the flipside warning that comes if you omit the attribute (but it's still in the schema as optional)
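A minimal sketch of setting that property from Java code instead of passing `-D` on the command line. Assumptions: the value `0` switches language detection off (the messages here name the property but not its value), the property is set before the checker reads it, and the class name `LangDetectionConfig` is made up for illustration.

```java
// Sketch only: sets the system property named in the message above.
// ASSUMPTION: "0" means "language detection off"; the chat does not state the value.
final class LangDetectionConfig {
    static final String PROPERTY = "nu.validator.checker.enableLangDetection";

    // Sets the property and returns the previous value (null if it was unset).
    static String disable() {
        return System.setProperty(PROPERTY, "0");
    }

    // True when the property currently holds the assumed "off" value.
    static boolean isDisabled() {
        return "0".equals(System.getProperty(PROPERTY));
    }

    public static void main(String[] args) {
        disable();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

Under the same assumption, the command-line form would be `-Dnu.validator.checker.enableLangDetection=0` on the `java` invocation that starts the checker.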
Michael[tm] Smith
@sideshowbarker
ah yeah
for your use case you probably don’t want any language detection to be performed anyway
Peter Rushforth
@prushforth
No, I didn't, but I wasn't averse to adding the attribute. Then it got a little hairy
Peter Rushforth
@prushforth
Doesn't seem to be https://github.com/validator/validator/blob/master/src/nu/validator/xml/langattributes/XmlLangAttributeDroppingContentHandlerWrapper.java, as a breakpoint in the startElement method for that ContentHandler doesn't get hit when validating a mapml document with <mapml lang="en">. Perhaps that's because the schema isn't one of those designated by VerifierServletTransaction.isXmlLangAllowingSchema() as allowing xml:lang, and indeed it's lang, not xml:lang. But I think I will just avoid the whole issue by not adding the attribute and chalk it up to something that will no doubt become clear in time. Thanks for your advice on this! Have a great Monday December 17th (my 58th birthday, which I am taking off as a national holiday).
Michael[tm] Smith
@sideshowbarker
oh!
happy 58th!
Peter Rushforth
@prushforth
Thanks!!
Michael[tm] Smith
@sideshowbarker
enjoy the day off
Peter Rushforth
@prushforth
Whoa, I set htmlParser.setMappingLangToXmlLang() from true to false and now it's validating correctly. I see in the HtmlAttributes class hixie has a note "Be careful with this class. QName is the name in from HTML tokenization." Which makes me feel like Young James Heriot (dinna mess wit sumat ye ken nawt aboot).
Michael[tm] Smith
@sideshowbarker
@prushforth heh. I wish I could claim I understand all that handling myself, but actually every time I need to mess around with it, I basically have to re-learn it all again — because I forgot whatever I had known about it from the previous time I touched it. And then after re-learning it I just forget it all again until the next time
Peter Rushforth
@prushforth
Lol
Peter Rushforth
@prushforth
@sideshowbarker Hi Mike Happy New Year! What is the division of labour between a checker and Assertions.java? Is there a category of task which deserves its own checker? Thanks.
Michael[tm] Smith
@sideshowbarker
Happy New Year @prushforth. I’ve not really followed any strict division, but the language-detector code is an example of something added more recently that seemed big enough on its own to merit a separate checker.
Going a little further back in time, another example is the Microdata checker
on the other hand, the RDFaLite checker is one that I made into a separate checker but that in hindsight I could have just as well incorporated into the Assertions.java code, I guess
…because it’s much simpler than the Microdata checker, and does far fewer checks
the XmlPiChecker is another that was a big-enough and complicated-enough chunk of code on its own to merit a separate checker
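To make the "separate checker" idea concrete, here is a toy sketch of a self-contained check written as a SAX event handler. It is not the validator's own checker base class or anything from Assertions.java: `ToyAltChecker`, its single rule, and its message text are all made up for illustration, and it uses only the JDK's SAX parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical toy checker: one small rule kept in its own handler class.
final class ToyAltChecker extends DefaultHandler {
    final List<String> errors = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default JDK parser is not namespace-aware, so match on qName.
        if ("img".equals(qName) && atts.getValue("alt") == null) {
            errors.add("Element img is missing the alt attribute.");
        }
    }

    // Parses well-formed XML from a string and returns the collected messages.
    static List<String> check(String xml) {
        ToyAltChecker checker = new ToyAltChecker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), checker);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return checker.errors;
    }

    public static void main(String[] args) {
        System.out.println(check("<p><img src='a.png'/><img src='b.png' alt='b'/></p>"));
    }
}
```

A rule this small is the kind that could just as well live alongside other assertions; a separate class starts to pay off once a check carries its own state across many elements.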
Peter Rushforth
@prushforth
OK great, I will keep that info in mind. Good evening!
Mike
@skazx_twitter
hey all, is this room for a specific type of validator? I'm using express-validator.js and had a question about validating a string that is alphanumeric with spaces (street address).
Michael[tm] Smith
@sideshowbarker
@jaimeiniesta I finally set up some automation that makes a new jar available for every commit to master
Mavaddat Javid
@mavaddat
hi all I would like to build a child-friendly version of nu validator that uses more plain English and visuals to explain syntax mistakes to children learning to code
Patrycja Kazala
@QMG-kazala
Hi, is it possible to use nu validator in Java, from the code level? I found this: https://vzurczak.wordpress.com/2015/03/16/validating-a-html-page-with-java/ , which has a workaround, but I was hoping there is a more straightforward and more legitimate way.... Cheers
Michael[tm] Smith
@sideshowbarker
@QMG-kazala https://gist.github.com/vincent-zurczak/23e0f626eaafab96cb32 is the straightforward way to do it, actually — not just a workaround
Patrycja Kazala
@QMG-kazala
thx @sideshowbarker ! I've seen the corresponding blog post (https://vzurczak.wordpress.com/2015/03/16/validating-a-html-page-with-java/) but was hoping there might be a tidier way. I'll give it a try anyway as I'm testing different approaches to html5 validation at the mo :) thx again
Michael[tm] Smith
@sideshowbarker
@QMG-kazala yeah, I guess the https://github.com/validator/validator/blob/master/src/nu/validator/client/EmbeddedValidator.java code in the repo might be a better starting point than the lower-level way used by the person who wrote that blog post
I’ve never actually used that EmbeddedValidator.java code myself (a contributor added it), so I’m not completely certain it works as expected. But at least I can say from looking at the code, it looks like it should work
anyway, if you run into any specific problems, feel free to ping me here again
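A different route from the embedded approaches discussed above, sketched here only as an alternative: run the checker as a web service and POST the document to it from Java. Assumptions: a checker instance is listening at `http://localhost:8888/` (adjust to your setup), it accepts the document as the POST body with `out=json` in the query string, and Java 11+ is available for `java.net.http`. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: talks to a running checker over HTTP instead of embedding it.
final class NuHttpCheck {
    // Builds a POST request carrying the HTML source as the request body.
    static HttpRequest buildRequest(String baseUrl, String html) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "?out=json"))
                .header("Content-Type", "text/html; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(html))
                .build();
    }

    // Sends the request and returns the raw response body (JSON text, assuming out=json is honoured).
    static String send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // ASSUMPTION: a local checker instance; no network call is made here.
        HttpRequest request = buildRequest("http://localhost:8888/", "<!DOCTYPE html><title>t</title><p>hi");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Calling `NuHttpCheck.send(request)` performs the actual check. This keeps the checker out of the application's classpath, at the cost of running a separate service.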
Michael[tm] Smith
@sideshowbarker
PSA: yesterday I made a change to the Dockerfile for the checker which reduces the image size from 130MB down to “just” 60MB
I smoke-tested the resulting image to make sure the change didn’t regress anything, but I otherwise did no testing beyond that — because I personally don’t actually use the Docker image for anything
so any of y’all who actually do use the Docker image, please try running the latest from Docker Hub and if you notice any problems, either lemme know here or else file an issue