These are chat archives for symengine/symengine

10th Jun 2015
Sumith Kulal
@Sumith1896
Jun 10 2015 10:40
Could anybody fix the formatting of @vermaseren's last comment in this issue: sympy/symengine#396 ?
Sumith Kulal
@Sumith1896
Jun 10 2015 13:07
I want to write that benchmark he mentioned but I'm having trouble figuring it out :smile:
Sumith Kulal
@Sumith1896
Jun 10 2015 14:04
While there is a good speedup when I use std::array, the benchmarks are slower when I use std::valarray, compared to the standard vector we currently use in polys. Any insight?
Francesco Biscani
@bluescarni
Jun 10 2015 14:06
@Sumith1896 std::array does not allocate memory on the heap, std::valarray does.
Sumith Kulal
@Sumith1896
Jun 10 2015 14:07
@bluescarni Yes, so does vector. We are expecting a slight speed up over that, aren't we?
I am mostly following the citation mentioned here: sympy/symengine#111.
Francesco Biscani
@bluescarni
Jun 10 2015 14:11
It depends on what you are doing, really. If your bottleneck is memory allocation time, valarray will probably not bring you much over vector.
Sumith Kulal
@Sumith1896
Jun 10 2015 14:15
Agreed. I didn't find enough areas where I could make use of valarray-specific features.
I think memory is the bottleneck here.
Thanks @bluescarni
Francesco Biscani
@bluescarni
Jun 10 2015 14:16
No problem
I haven't used valarray much and there might be situations in which it's worth using it over vector, just something to keep in mind
even though most linear algebra and related libraries in C++ use expression templates rather than the syntactic sugar provided by valarray
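A minimal sketch of the allocation difference being described here (not code from the project): std::array keeps its elements inline with no heap allocation, while std::vector and std::valarray both allocate their element storage on the heap, which is what tends to dominate when small exponent tuples are created in an inner loop.

```cpp
#include <array>
#include <valarray>
#include <vector>

int main()
{
    // Size fixed at compile time: the four ints live directly inside the
    // object (typically on the stack), no heap allocation at all.
    std::array<int, 4> a{{1, 2, 3, 4}};

    // Both of these allocate their element storage on the heap every time
    // an object is constructed, which can dominate the runtime when the
    // containers are tiny and built inside a hot loop.
    std::vector<int> v{1, 2, 3, 4};
    std::valarray<int> w{1, 2, 3, 4};

    return a[0] + v[0] + w[0]; // use the values so nothing is optimized away
}
```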
Ondřej Čertík
@certik
Jun 10 2015 15:58
@Sumith1896 which comment do you want to fix?
Sumith Kulal
@Sumith1896
Jun 10 2015 15:58
It is fixed now.
Ondřej Čertík
@certik
Jun 10 2015 15:58
Ok!
Sumith Kulal
@Sumith1896
Jun 10 2015 15:59
The benchmark for valarray was done
Ondřej Čertík
@certik
Jun 10 2015 15:59
Yeah, looks like we should use std::vector or std::array. I was hoping valarray would bring some speedup, but I guess it doesn't.
Sumith Kulal
@Sumith1896
Jun 10 2015 16:00
I tried std::array for the 4-variable benchmark
It is faster
but we need dynamic sizing, don't we?
Ondřej Čertík
@certik
Jun 10 2015 16:01
I think Piranha has an optimization where it uses std::array for a small number of elements and std::vector for a larger number.
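The idea being described is a small-buffer optimization; Piranha's actual small_vector is more involved, but a rough sketch of the technique (the names here are illustrative only) keeps up to N elements inline in a std::array and spills into a heap-backed std::vector only when the size grows past N.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Illustrative sketch of a "small vector": elements are stored inline
// (no heap allocation) while size() <= N and spill into a std::vector
// only when the inline buffer is exhausted.
template <typename T, std::size_t N>
class SmallVector {
public:
    void push_back(const T &x)
    {
        if (!spilled_ && size_ < N) {
            inline_[size_++] = x;
            return;
        }
        if (!spilled_) {
            // One-time copy of the inline elements into heap storage.
            heap_.assign(inline_.begin(), inline_.begin() + size_);
            spilled_ = true;
        }
        heap_.push_back(x);
    }

    std::size_t size() const { return spilled_ ? heap_.size() : size_; }

    const T &operator[](std::size_t i) const
    {
        return spilled_ ? heap_[i] : inline_[i];
    }

private:
    std::array<T, N> inline_{};
    std::vector<T> heap_;
    std::size_t size_ = 0;
    bool spilled_ = false;
};
```

For polynomials, N would be chosen around the typical number of variables, so the common small cases never touch the heap.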
Sumith Kulal
@Sumith1896
Jun 10 2015 16:02
I thought that std::array was not flexible enough to be used. Will look into it more.
Ondřej Čertík
@certik
Jun 10 2015
@Sumith1896 I would suggest leaving this for later. If we want speed, we'll use the exponent packing that you started implementing. For everything else, we can use std::vector for now. It's good enough.
Sumith Kulal
@Sumith1896
Jun 10 2015 16:06
Agreed
Ondřej Čertík
@certik
Jun 10 2015 16:07
Try to use the packing in the benchmark in #470.
Report the speedup, and we'll go from there.
We probably should introduce expand2c, which uses packing, and leave expand2b which uses std::vector.
So that we can see the speedup.
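The "packing" referred to throughout this conversation is packing all exponents of a monomial into a single machine word (Kronecker-style packing), so a monomial multiplication becomes one integer addition and the packed word doubles as a hash key. A rough sketch for four variables, assuming each exponent fits in 16 bits (real code has to guard against overflow):

```cpp
#include <array>
#include <cstdint>

// Pack four small non-negative exponents into one 64-bit word,
// 16 bits per exponent.  Assumes each exponent is below 2^16.
inline std::uint64_t pack(const std::array<std::uint16_t, 4> &e)
{
    return static_cast<std::uint64_t>(e[0])
         | static_cast<std::uint64_t>(e[1]) << 16
         | static_cast<std::uint64_t>(e[2]) << 32
         | static_cast<std::uint64_t>(e[3]) << 48;
}

inline std::array<std::uint16_t, 4> unpack(std::uint64_t k)
{
    return {{static_cast<std::uint16_t>(k),
             static_cast<std::uint16_t>(k >> 16),
             static_cast<std::uint16_t>(k >> 32),
             static_cast<std::uint16_t>(k >> 48)}};
}

// With this encoding, (x^a0 y^a1 z^a2 w^a3) * (x^b0 y^b1 z^b2 w^b3)
// corresponds to pack(a) + pack(b), as long as no 16-bit field overflows.
```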
Sumith Kulal
@Sumith1896
Jun 10 2015 16:08
Okay
Shivam Vats
@shivamvats
Jun 10 2015 16:08
http://stackoverflow.com/questions/1602451/c-valarray-vs-vector
The second comment. valarray is probably not worth the effort.
Sumith Kulal
@Sumith1896
Jun 10 2015 16:10
@shivamvats Thanks, I'll have a look
Ondřej Čertík
@certik
Jun 10 2015 16:10
@shivamvats yeah, it looks like it's not worth the pain.
Sumith Kulal
@Sumith1896
Jun 10 2015 16:11
@certik Won't the internals in rings change so as to accommodate packing? Or should I implement poly_mul separately?
Ondřej Čertík
@certik
Jun 10 2015 16:11
@Sumith1896 implement poly_mul separately for now
Sumith Kulal
@Sumith1896
Jun 10 2015 16:11
Okay
Ondřej Čertík
@certik
Jun 10 2015 16:11
essentially we need two types to store the polynomial -- one with packing, one without packing.
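To make the two representations concrete, here is a hedged sketch (the type and function names are made up, not the actual symengine interfaces): the unpacked representation keys terms by an exponent vector, the packed one keys them by a single 64-bit word, and poly_mul over the packed form reduces each monomial product to a key addition. Coefficients are shown as long long to keep the sketch self-contained; the real code would use GMP or piranha::integer coefficients.

```cpp
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

// Hypothetical names for illustration only.
// Unpacked: the exponent vector itself is the key.
using UnpackedPoly = std::map<std::vector<int>, long long>;
// Packed: the whole exponent tuple encoded in one 64-bit key.
using PackedPoly = std::unordered_map<std::uint64_t, long long>;

// Multiplication over the packed representation: the monomial product is a
// single integer addition of the keys (assuming no per-field overflow).
PackedPoly poly_mul_packed(const PackedPoly &a, const PackedPoly &b)
{
    PackedPoly c;
    for (const auto &pa : a) {
        for (const auto &pb : b) {
            c[pa.first + pb.first] += pa.second * pb.second;
        }
    }
    return c;
}
```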
Sumith Kulal
@Sumith1896
Jun 10 2015 16:12
Yes
Ondřej Čertík
@certik
Jun 10 2015 16:12
There are lots of design decisions, for now concentrate on getting the expand2c benchmark working with packing, and see how fast it is.
We need to get as fast as Piranha, or pretty close. Only once we get there can we start thinking about how best to design it.
Sumith Kulal
@Sumith1896
Jun 10 2015 16:13
Is the same benchmark written in Piranha or should I send that in?
Sumith Kulal
@Sumith1896
Jun 10 2015 16:15
I'll have a look
Ondřej Čertík
@certik
Jun 10 2015 16:19
So I suggest the following plan:
1) Add expand2c in #470 that uses the packed exponents, report timing
2) Let's try to use piranha::integer instead of gmp_class for coefficients, we should see a speedup (i.e. merge #464 first)
3) Compare timings against fateman1_perf.cpp and fateman2_perf.cpp
Hopefully 2) is pretty close to 3). There are some further optimizations in Piranha, but I am hoping the above are the main ones. If it is close, then we'll talk about how to best design it.
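For context on step 3: the Fateman benchmark that fateman1_perf.cpp is presumably modelled on computes f = (1 + x + y + z + t)^n and then f * (f + 1); the exact parameters in Piranha's file may differ. A self-contained toy version using the packed-exponent representation sketched above, with machine-integer coefficients standing in for GMP or piranha::integer and a smaller n to avoid 64-bit overflow:

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>
#include <unordered_map>

// Toy Fateman-style benchmark: f = (1 + x + y + z + w)^n, then f * (f + 1),
// over a packed-exponent hash map.  n is kept small so that long long
// coefficients do not overflow; the real benchmark uses arbitrary precision.
using Poly = std::unordered_map<std::uint64_t, long long>;

Poly mul(const Poly &a, const Poly &b)
{
    Poly c;
    for (const auto &pa : a)
        for (const auto &pb : b)
            c[pa.first + pb.first] += pa.second * pb.second;
    return c;
}

int main()
{
    const int n = 12;
    // base = 1 + x + y + z + w, exponents packed 16 bits per variable.
    Poly base{{0, 1}, {1, 1}, {1ULL << 16, 1}, {1ULL << 32, 1}, {1ULL << 48, 1}};

    Poly f{{0, 1}}; // the constant polynomial 1
    for (int i = 0; i < n; ++i)
        f = mul(f, base);

    Poly g = f;
    g[0] += 1; // g = f + 1

    auto t1 = std::chrono::steady_clock::now();
    Poly h = mul(f, g);
    auto t2 = std::chrono::steady_clock::now();

    std::cout << h.size() << " terms in "
              << std::chrono::duration<double>(t2 - t1).count() << " s\n";
}
```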