These are chat archives for NJsonSchema/NJsonSchema

27th
Aug 2016
Rico Suter
@RSuter
Aug 27 2016 02:00
Have you done some perf profiling to see where it's slow? I currently don't use the validation much (mainly the schema and code generator for http://nswag.org), and the validation part should be easy to parallelize... We just have to synchronize the error result data structure... Then you can parallelize the validation methods... But because the validation works recursively, only adding Parallel.ForEach is not enough... And yes, we have to be careful - parallel programming is not easy.
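A minimal sketch of the "synchronize the error result data structure" idea, written in Python only for illustration. The `ErrorCollector` class, the toy schema dictionaries, and the `validate` / `validate_array` functions are all hypothetical and are not NJsonSchema's API. Only the top-level array items are handed to a worker pool; the recursion inside each item stays sequential.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ErrorCollector:
    """Thread-safe container for (path, message) validation errors."""

    def __init__(self):
        self._lock = threading.Lock()
        self._errors = []

    def add(self, path, message):
        # The lock is the only synchronization point shared by the workers.
        with self._lock:
            self._errors.append((path, message))

    def sorted_errors(self):
        with self._lock:
            return sorted(self._errors)


def validate(value, schema, path, errors):
    """Hypothetical recursive validator over a toy schema dictionary."""
    if schema["type"] == "integer":
        if not isinstance(value, int) or isinstance(value, bool):
            errors.add(path, "expected integer")
    elif schema["type"] == "object":
        if not isinstance(value, dict):
            errors.add(path, "expected object")
            return
        for name, sub_schema in schema["properties"].items():
            if name not in value:
                errors.add(f"{path}.{name}", "missing property")
            else:
                # Recursion runs sequentially inside the current worker.
                validate(value[name], sub_schema, f"{path}.{name}", errors)


def validate_array(items, item_schema, max_workers=4):
    """Validate array items on a bounded pool, collecting errors in one place."""
    errors = ErrorCollector()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(validate, item, item_schema, f"[{index}]", errors)
            for index, item in enumerate(items)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return errors.sorted_errors()
```

In standard CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so this shows the synchronization pattern only, not an expected speedup.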
Alexey Shytikov
@shytikov
Aug 27 2016 05:46
I haven't done any serious investigation so far, but my current problem is that NJsonSchema is used for validating incoming web-server requests... and for requests with arrays of objects it gets noticeably slower...
Will try to dig for more information
It seems that the .NET4 project is using the same files as the portable project
and most of parallel optimizations will not work in portable I guess
what would be your approach in optimizing only .NET4 code?
I mean...
should I use conditional compilation symbols in common code
or I need to create .NET4 specific implementation?
Alexey Shytikov
@shytikov
Aug 27 2016 06:21
Stop... LEGACY is for .NET4 code?
so we're moving from .NET4 to Portable?
Rico Suter
@RSuter
Aug 27 2016 08:30
I think it will be the other way: Only the PCL lib supports async etc. and .NET4 is missing the APIs... I'd focus on the PCL and add as little conditional code as possible
Alexey Shytikov
@shytikov
Aug 27 2016 08:31
got it
Alexey Shytikov
@shytikov
Aug 27 2016 09:08
First intermediate results are not pleasing... but predictable: parallel processing overhead will decrease performance for smaller batches...
Rico Suter
@RSuter
Aug 27 2016 09:08
what are you using to parallelize? Parallel.ForEach?
Alexey Shytikov
@shytikov
Aug 27 2016 09:09
yes, I've tried it as well, but it seems to be not available in (portable)
I'm still new to this target... Portable... need to investigate more
then I tried to create Tasks
and almost the same result
Rico Suter
@RSuter
Aug 27 2016 09:11
i think, if you reference the "System.Threading.Tasks.Parallel" assembly in the PCL you can use Parallel.ForEach
what have you parallelized?
Alexey Shytikov
@shytikov
Aug 27 2016 09:11
but is it allowed in PCL?
Rico Suter
@RSuter
Aug 27 2016 09:11
maybe parallelizing only the array item validations might be faster
Alexey Shytikov
@shytikov
Aug 27 2016 09:12
I was working on ValidateType and ValidateArray methods
Rico Suter
@RSuter
Aug 27 2016 09:12
Alexey Shytikov
@shytikov
Aug 27 2016 09:13
OK, so it's supported...
Rico Suter
@RSuter
Aug 27 2016 09:13
it seems..
Alexey Shytikov
@shytikov
Aug 27 2016 09:13
sorry, I'm new to PCL... like first time in my life :)
Rico Suter
@RSuter
Aug 27 2016 09:13
but I'm not sure if it is really faster than tasks and task.whenall
Alexey Shytikov
@shytikov
Aug 27 2016 09:14
yes, but since I was not able to use Parallel.ForEach, I decided to try that as well
but I will try it once again
Rico Suter
@RSuter
Aug 27 2016 09:14
yes, it seems more clean to me
Alexey Shytikov
@shytikov
Aug 27 2016 09:14
I agree
give me couple minutes :)
Rico Suter
@RSuter
Aug 27 2016 09:14
i think we should use foreach and not use async and task here to avoid lots of context switches
no hurry :)
maybe you have to check the size of an array item and parallelize only if it is big or if array size > 10
Alexey Shytikov
@shytikov
Aug 27 2016 09:15
but it's hardcoding...
and it depends on complexity of the rules
I believe it should be more flexible
or intelligent...
Rico Suter
@RSuter
Aug 27 2016 09:16
I'm also not sure what parallelize does: if it creates a thread per item then this might also be a problem, it should use at max e.g. 5 threads in parallel and process everything in these threads sequentially
The answer has a link to a document explaining which libraries are supported where
The Parallel namespace is not available for Silverlight and WP, as far as I understand
Rico Suter
@RSuter
Aug 27 2016 12:19
It depends on the PCL profile; SL and WP8 are currently not supported anyway
Alexey Shytikov
@shytikov
Aug 27 2016 12:19
I've tried to remove them, but VS tells me something stupid...
let me copy
---------------------------
Microsoft Visual Studio
---------------------------
The project's targets cannot be changed. The selected targets require the project to opt-into NuGet 3.0 support, however, Visual Studio cannot automatically do this for you. Please uninstall all NuGet packages and try again.
---------------------------
OK   
---------------------------
Rico Suter
@RSuter
Aug 27 2016 12:20
We h
When you do what?
Alexey Shytikov
@shytikov
Aug 27 2016 12:20
Try to remove SL and WP from targets
Rico Suter
@RSuter
Aug 27 2016 12:21
They are on?
Alexey Shytikov
@shytikov
Aug 27 2016 12:21
yes
Rico Suter
@RSuter
Aug 27 2016 12:21
One moment pls
Rico Suter
@RSuter
Aug 27 2016 12:53
ok, i created the branch "Parallelize", clone this and try with it...
i think we cannot use Parallel.ForEach
Alexey Shytikov
@shytikov
Aug 27 2016 12:58
but we should, if we target .NET 4.0+ and .NET Core only
I mean we should be able to use it, if SL and WP is not priority
Rico Suter
@RSuter
Aug 27 2016 13:01
Hmm i think using tasks is the same... (See my branch, you should fork it and work with that..) do you see some improvements?
Alexey Shytikov
@shytikov
Aug 27 2016 13:01
let me check
Alexey Shytikov
@shytikov
Aug 27 2016 13:15
It became worse, actually. I have an array of 200 objects (in production it's about 1000); with the master version it was processed in around 40 milliseconds, with the new Parallelize branch it's around 60
the problem is that I don't know what could be another source of improvements...
Rico Suter
@RSuter
Aug 27 2016 13:23
I don't think you can make it faster with parallelization here... You should profile the sync code and see where we could for example add caches..
So you want to optimize these 40ms
Maybe you can split these 200 into batches of 50 and run 4 tasks in parallel? Avoid creating and running too many tasks..
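The batching idea can be sketched roughly as follows, in Python only for illustration. `validate_item`, `chunks`, `validate_batch`, and `validate_in_batches` are hypothetical names, and the "id must be an integer" rule is a made-up stand-in for real schema validation. Each batch is validated sequentially and keeps its errors in a local list, so no locking is needed; results are merged after all batches finish.

```python
from concurrent.futures import ThreadPoolExecutor


def validate_item(item):
    """Hypothetical stand-in for validating one object; returns error strings."""
    return [] if isinstance(item.get("id"), int) else ["id must be an integer"]


def chunks(items, size):
    """Yield (start_index, batch) pairs of at most `size` items each."""
    for start in range(0, len(items), size):
        yield start, items[start:start + size]


def validate_batch(start, batch):
    """Validate one batch sequentially, recording errors in a local list."""
    errors = []
    for offset, item in enumerate(batch):
        for message in validate_item(item):
            errors.append((start + offset, message))
    return errors


def validate_in_batches(items, batch_size=50, max_workers=4):
    """Run one task per batch on a bounded pool and merge results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(validate_batch, start, batch)
            for start, batch in chunks(items, batch_size)
        ]
        results = [future.result() for future in futures]
    return [error for batch_errors in results for error in batch_errors]
```

With 200 items and a batch size of 50 this creates 4 tasks instead of 200, which keeps the scheduling overhead small relative to the work per task. Whether that is actually faster than the sequential version still has to be measured.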
Alexey Shytikov
@shytikov
Aug 27 2016 13:27
Yes, probably this would be the only way
Another idea, it's related to performance, but not strictly
what do you think about having an option to validate only until the first error?
Rico Suter
@RSuter
Aug 27 2016 13:28
So your problem is this 40ms duration?
Yes, good idea... But for valid objects the perf will not improve...
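A small sketch of such an option, in Python only for illustration. The `stop_on_first_error` flag, `validate_all`, and `validate_item` are hypothetical and not an existing NJsonSchema setting. The function also returns how many items were checked, to make visible that valid input is still scanned in full.

```python
def validate_item(item):
    """Hypothetical stand-in for validating one object; returns error strings."""
    return [] if isinstance(item.get("id"), int) else ["id must be an integer"]


def validate_all(items, stop_on_first_error=False):
    """Return (errors, number_of_items_checked)."""
    errors = []
    checked = 0
    for index, item in enumerate(items):
        checked += 1
        for message in validate_item(item):
            errors.append((index, message))
        # Leave the loop as soon as one invalid item has been reported.
        if stop_on_first_error and errors:
            break
    return errors, checked
```

For input with an early invalid item the loop stops after that item; for fully valid input every item is still visited, so the option only helps the failing case.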
Alexey Shytikov
@shytikov
Aug 27 2016 13:29
yes, but that could be fixed other way
for example, maybe I need to simplify schema
maybe I'm checking too much :)
I will try to run profiling on the app, maybe I would be able to identify what was taking so long
for some reason, on production with real data (1000 entries per batch) validation takes about a minute
Rico Suter
@RSuter
Aug 27 2016 13:32
Hmm strange..
Alexey Shytikov
@shytikov
Aug 27 2016 13:32
but, this also includes, network delays, overall load
Rico Suter
@RSuter
Aug 27 2016 13:32
Ah ok
Alexey Shytikov
@shytikov
Aug 27 2016 13:32
and the objects are probably huge
will try to find out more
Rico Suter
@RSuter
Aug 27 2016 13:32
Maybe the bottleneck is somewhere else?
But yes, try to reproduce it in a local app and repeat it, then perf profile it and you should see where it takes the most time...
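A rough sketch of that workflow (reproduce locally, repeat, then profile), in Python only for illustration. `validate_item` and `validate_all` are hypothetical stand-ins for the real validation call; `time_runs` and `profile_once` are illustrative helper names built on the standard `time`, `cProfile`, and `pstats` modules.

```python
import cProfile
import io
import pstats
import time


def validate_item(item):
    """Hypothetical stand-in for validating one object."""
    return isinstance(item.get("id"), int)


def validate_all(items):
    """Return the indexes of invalid items."""
    return [index for index, item in enumerate(items) if not validate_item(item)]


def time_runs(func, arg, repeats=5):
    """Call func(arg) several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        started = time.perf_counter()
        func(arg)
        durations.append(time.perf_counter() - started)
    return durations


def profile_once(func, arg, top=5):
    """Profile a single call and return a report of the costliest functions."""
    profiler = cProfile.Profile()
    profiler.runcall(func, arg)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(top)
    return report.getvalue()
```

Repeating the run first shows whether the timing is stable; the profile then shows which functions account for most of the cumulative time.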
Alexey Shytikov
@shytikov
Aug 27 2016 13:33
it's a web-server, acting like fire and forget: you throw 1000 records at it and, if they are valid, it starts async execution
so only synchronous thing is validation