These are chat archives for frictionlessdata/chat

24th
Aug 2017
Paul Walsh
@pwalsh
Aug 24 2017 04:32

Hey @georgeslabreche_twitter

1) Just to be sure, do you mean infer or validate?
2) Good question. Let’s see how @danielfireman is doing this in Go.

roll
@roll
Aug 24 2017 09:40
@georgeslabreche_twitter In dynamic languages, for infer we try to cast values to geojson (and to the other types). Yes, it's kind of overkill, but it's consistent across all types.
To improve performance we could, for example, check some dict attributes before the full validation here - https://github.com/frictionlessdata/tableschema-py/blob/master/tableschema/types/geojson.py. But as discussed earlier, infer performance depends more on the high-level algorithm. It's fine to validate geojson 10 times anyway, but it could be a problem if the high-level algorithm requires validating it 1000 times for one infer run.
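A minimal sketch of the "check some dict attributes first" idea, in Python. This is not the tableschema-py code: `looks_like_geojson`, `cast_geojson`, `GEOJSON_TYPES` and the `full_validate` callback are hypothetical names used only for illustration, and the cheap check only looks at the top-level `type` key.

```python
import json

# Top-level "type" values defined by GeoJSON, used here only for the cheap check.
GEOJSON_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
    "Feature", "FeatureCollection",
}


def looks_like_geojson(value):
    """Cheap pre-check: rule out values that cannot be GeoJSON."""
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except ValueError:
            return False
    if not isinstance(value, dict):
        return False
    type_name = value.get("type")
    return isinstance(type_name, str) and type_name in GEOJSON_TYPES


def cast_geojson(value, full_validate):
    """Run the expensive validator only when the cheap check passes.

    `full_validate` is a placeholder for whatever full validation is in use;
    it is assumed to return True or False.
    """
    if not looks_like_geojson(value):
        return False
    return full_validate(value)
```

With this shape, a cell such as `"42"` or `"hello"` never reaches the full validator, which matters most when the infer loop calls the cast many times per column.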
Daniel Fireman
@danielfireman
Aug 24 2017 14:03
@georgeslabreche_twitter @pwalsh 1) Your approach seems fine, even though many times I also found myself wondering whether this or that cast wasn't overkill. 2) In Go, there is a mantra to keep things simple, so my approach does not use inheritance. The cast is done in a switch block, which checks the target type and tries the cast. If there is an error, I ignore the attempt and stick with the previous successful one. I implemented two infer algorithms and benchmarked them. What changed between the two algorithms is the choice of the next cast, and that can make a massive difference in performance.
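A rough sketch of the switch-style cast without inheritance, written in Python rather than Go. It is not the Go implementation discussed here: `try_cast`, `infer_column`, `CAST_ORDER` and the chosen type list are assumptions for illustration. Failed casts are simply ignored, and the order of the candidates decides which type wins when several fit.

```python
from datetime import date

# Hypothetical candidate order, from most specific to most general.
CAST_ORDER = ["integer", "number", "boolean", "date", "string"]


def try_cast(type_name, value):
    """Return True if the string `value` casts to `type_name`.

    One branch per type plays the role of the switch block.
    """
    try:
        if type_name == "integer":
            int(value)
        elif type_name == "number":
            float(value)
        elif type_name == "boolean":
            if value.lower() not in ("true", "false"):
                raise ValueError(value)
        elif type_name == "date":
            date.fromisoformat(value)
        elif type_name == "string":
            pass
        else:
            raise ValueError("unknown type: " + type_name)
    except ValueError:
        return False
    return True


def infer_column(values, order=CAST_ORDER):
    """Return the first type in `order` that every value casts to."""
    candidates = list(order)
    for value in values:
        # A failed cast just drops that candidate; the survivors are kept.
        candidates = [t for t in candidates if try_cast(t, value)]
    return candidates[0] if candidates else "string"
```

For example, `infer_column(["1", "2.5"])` drops `integer` on the second value and settles on `number`. Changing `order` changes both the result for ambiguous columns and how many casts are attempted.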
Let me know if you would like to have a video or audio conversation
@georgeslabreche_twitter let me know if you would like to have a video conference chat. I would be happy to go through the code and discuss design decisions.
Georges L J Labrèche
@georgeslabreche_twitter
Aug 24 2017 15:59
@danielfireman, I think I will keep it simple and do a switch block as you suggested