These are chat archives for symengine/symengine

18th Aug 2015
Isuru Fernando
@isuruf
Aug 18 2015 18:10
Before using Catch, the stacktrace went: the place where the exception was thrown, __cxa_throw, std::terminate, abort, ...
When the exception is caught, the stack is at the point where it was caught, not at the place where it was thrown
Shall I disable exception catching when bfd is turned on?
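(A minimal sketch of the behaviour being described; the function name is made up for illustration:)

#include <iostream>
#include <stdexcept>

void somewhere_deep_in_symengine()
{
    // Uncaught, this throw ends in std::terminate/abort, and a bfd-based
    // backtrace still shows this frame followed by __cxa_throw, etc.
    throw std::runtime_error("something went wrong");
}

int main()
{
    try {
        somewhere_deep_in_symengine();
    } catch (const std::runtime_error &e) {
        // Caught: the stack has already unwound, so a backtrace taken here
        // points at this catch site, not at the throw site above.
        std::cout << "caught: " << e.what() << std::endl;
    }
    return 0;
}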
Ondřej Čertík
@certik
Aug 18 2015 21:00
@isuruf I would disable exception catching using a new command-line argument in Catch, something like --no-catch-exceptions. We would then use this flag on Travis.
Or as a first step, you can do it when bfd is on. We can refactor it as an option later.
Ultimately we should send it upstream as a PR.
Ondřej Čertík
@certik
Aug 18 2015 21:32
@Sumith1896 thanks for putting up #597
Can you post timings on expand2c vs Piranha on your machine, and also how much time is spent in _normalise_polymul?
Is Piranha calling something like _normalise_polymul as well?
Sumith Kulal
@Sumith1896
Aug 18 2015 21:34
No, I have not looked at how Piranha multiplies hash_sets internally for a single thread.
But I think Piranha doesn't have an analogue to _normalize_polymul
expand2c takes 22ms approx
Ondřej Čertík
@certik
Aug 18 2015 21:35
What exactly is _normalise_polymul doing? Is it only needed if you multiply polynomials of different symbols?
Sumith Kulal
@Sumith1896
Aug 18 2015 21:36
Yes, if the symbols differ
For example, _normalize_mul converts {1, 2, 3} of x, y, z and {4, 5, 6} of p, q, r to {1, 2, 3, 0, 0, 0} and {0, 0, 0, 4, 5, 6}, and _mul_hashset multiplies the two hash_sets
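(A rough sketch of that conversion on plain exponent vectors; the real _normalize_polymul works on hash_sets, so the names and types here are only illustrative:)

#include <cstddef>
#include <iostream>
#include <vector>

// Pad two exponent vectors onto the union of their (disjoint) symbol sets:
// {1, 2, 3} over (x, y, z) and {4, 5, 6} over (p, q, r) become
// {1, 2, 3, 0, 0, 0} and {0, 0, 0, 4, 5, 6} over (x, y, z, p, q, r).
void normalize_pair(std::vector<int> &a, std::vector<int> &b)
{
    const std::size_t na = a.size(), nb = b.size();
    a.insert(a.end(), nb, 0);    // zeros for b's symbols
    b.insert(b.begin(), na, 0);  // zeros for a's symbols
}

int main()
{
    std::vector<int> a{1, 2, 3}, b{4, 5, 6};
    normalize_pair(a, b);
    for (int e : a) std::cout << e << " ";  // 1 2 3 0 0 0
    std::cout << std::endl;
    for (int e : b) std::cout << e << " ";  // 0 0 0 4 5 6
    std::cout << std::endl;
    return 0;
}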
Ondřej Čertík
@certik
Aug 18 2015 21:37
Right. So how much time is spent in this routine if the symbols do not differ?
Sumith Kulal
@Sumith1896
Aug 18 2015 21:37
14.5 ms approx
I'll time it exactly for you
Ondřej Čertík
@certik
Aug 18 2015 21:37
14.5ms is spent in _normalize_mul doing nothing?
And time Piranha on the same machine as well.
Sumith Kulal
@Sumith1896
Aug 18 2015 21:38
Oh yes, I can add a conditional so that _normalize_polymul runs only if the symbols differ; that should speed things up
Give me five minutes
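(Something like the guard being described, with assumed names; the actual signatures in #597 differ:)

#include <string>
#include <vector>

// Only the symbol comparison is shown; the idea is that expand2c, where both
// polynomials share the same symbols, never pays for the normalization.
bool needs_normalization(const std::vector<std::string> &syms_a,
                         const std::vector<std::string> &syms_b)
{
    return syms_a != syms_b;
}

// if (needs_normalization(syms_a, syms_b))
//     _normalize_polymul(...);   // only when the symbols differ
// _mul_hashset(...);             // hot path unchanged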
Ondřej Čertík
@certik
Aug 18 2015 21:39
Just comment it out; it shouldn't be needed for expand2c. That way you can tell how much time is spent there.
Sumith Kulal
@Sumith1896
Aug 18 2015 21:42
Yes, I'll comment it out and run.
Sumith Kulal
@Sumith1896
Aug 18 2015 22:08
@certik It takes 18ms for expand2c to run
Ondřej Čertík
@certik
Aug 18 2015 22:09
And Piranha?
and the _normalize_polymul?
Sumith Kulal
@Sumith1896
Aug 18 2015 22:10
I commented out _normalize_polymul
Piranha takes the same as before, average 13.5ms
Ondřej Čertík
@certik
Aug 18 2015 22:11
I know in the past you got the time down to 14.5ms
Sumith Kulal
@Sumith1896
Aug 18 2015 22:12
Yes, I think there are overheads
Maybe parameter passing
I am commenting on the lines I suspect on GitHub; we'll converse there
Ondřej Čertík
@certik
Aug 18 2015 22:14
Just put the timings inside _mul_hashset and only benchmark that.
If it is not 14.5ms, then we need to get back to your old branch and start over and figure out what got slow.
Sumith Kulal
@Sumith1896
Aug 18 2015 22:14
Let me put it inside _mul_hashset
Ondřej Čertík
@certik
Aug 18 2015 22:23

Just put

auto t1 = std::chrono::high_resolution_clock::now(); 
...
auto t2 = std::chrono::high_resolution_clock::now(); 
std::cout
        << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count()
        << "ms" << std::endl;

in _mul_hashset and rerun. That should be a simple change.
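(For reference, a self-contained version of that pattern; time_me() is just a stand-in for the routine being measured:)

#include <chrono>
#include <iostream>

// Stand-in for the body of the routine being timed.
void time_me()
{
    volatile long long s = 0;
    for (long long i = 0; i < 50000000; ++i) s += i;
}

int main()
{
    auto t1 = std::chrono::high_resolution_clock::now();
    time_me();
    auto t2 = std::chrono::high_resolution_clock::now();
    std::cout
        << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()
        << "ms" << std::endl;
    return 0;
}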

Sumith Kulal
@Sumith1896
Aug 18 2015 22:26
Weird, it is 17ms
This routine is the same as the one I used previously, except I had to make one change
I suspect there is some issue there
Ondřej Čertík
@certik
Aug 18 2015 22:27
Cool. So now let's concentrate on _mul_hashset and make it as fast as Piranha. Which change did you make there?
Sumith Kulal
@Sumith1896
Aug 18 2015 22:27
Previously we just had
auto it = C._find(temp, bucket);
It gave me segfaults
so I made it
if (!C.empty())
       it = C._find(temp, bucket);
else
       it = C.end();
Ondřej Čertík
@certik
Aug 18 2015 22:29
When is C empty?
Sumith Kulal
@Sumith1896
Aug 18 2015 22:30
Yes, initially we pass an empty hash_set; after one iteration it isn't
Ondřej Čertík
@certik
Aug 18 2015 22:31
But how is it that previously it ran fast for you and without a segfault?
Is it that find() segfaults for an empty hash set? If so, you can check count and if it is 0, assign C.end() to it.
Sumith Kulal
@Sumith1896
Aug 18 2015 22:39
That's what I am wondering; previously it did not segfault, and technically it shouldn't now either
Ondřej Čertík
@certik
Aug 18 2015 22:39
Can you point me to your branch where you got the 14.5ms timing?
Sumith Kulal
@Sumith1896
Aug 18 2015 22:40
Give me a moment
This is the PR sympy/symengine#470
Ondřej Čertík
@certik
Aug 18 2015 22:47
Cool. Can you confirm that if you run expand2d in #470 you get 14.5ms?
And then we just need to bisect it to figure out what change caused it.
Sumith Kulal
@Sumith1896
Aug 18 2015 22:49
Checking now
Sumith Kulal
@Sumith1896
Aug 18 2015 22:55
Yes, it still averages 14.5 ms
Ondřej Čertík
@certik
Aug 18 2015 23:05
Perfect.
Do you know how to bisect it now?
You can't use git bisect, but rather you need to look at the differences and bisect them in your mind, i.e. keep adding features from #597 into #470 until you see the slowdown, or vice versa.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:06
I'm still worried about the segfault
Ondřej Čertík
@certik
Aug 18 2015 23:07
Do the same with the segfault.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:07
It doesn't happen in expand2d, even though we pass an empty hash_set there too
Ondřej Čertík
@certik
Aug 18 2015 23:07
I bet it's because you do not call n.rehash(10000); anymore.
I thought that was part of the speedup.
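(A toy illustration of that bet, assuming the freshly constructed hash_set starts with zero buckets; this is not Piranha's actual implementation:)

#include <cstddef>
#include <iostream>
#include <vector>

// Toy bucketed set with a low-level _find that, like C._find(temp, bucket),
// takes a precomputed bucket index and indexes the bucket array blindly.
struct toy_hash_set {
    std::vector<std::vector<long>> buckets;  // zero buckets on construction

    void rehash(std::size_t n) { buckets.resize(n); }

    const long *_find(long key, std::size_t bucket) const
    {
        for (const long &k : buckets[bucket])  // out of range if no buckets exist
            if (k == key) return &k;
        return nullptr;
    }
};

int main()
{
    toy_hash_set C;
    // C._find(42, 0);   // zero buckets: indexes past the end (the segfault)
    C.rehash(10000);     // guarantee buckets exist before the first _find,
                         // and avoid rehashing while terms are inserted
    std::cout << (C._find(42, 42 % C.buckets.size()) != nullptr) << std::endl;  // 0
    return 0;
}

(If that is indeed what happens, the empty() check added in #597 and the rehash call guard against the same thing.)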
Sumith Kulal
@Sumith1896
Aug 18 2015 23:08
Oh yes, how could I forget that
Ondřej Čertík
@certik
Aug 18 2015 23:08
Just put it there and see what it does.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:08
Maybe because we now need a good way of guessing that 10000
Ondřej Čertík
@certik
Aug 18 2015 23:08
Don't worry about the guessing for now.
That's a separate issue.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:09
Yes
Ondřej Čertík
@certik
Aug 18 2015 23:09
Just put it there and tell me the timing.
with #597.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:09
Will do
Ondřej Čertík
@certik
Aug 18 2015 23:10
Then, as the next step, revert your "segfault fix" and see if it gets even faster.
Hopefully that's it.
I need to leave soon, would you mind sending me an email with the timings?
Sumith Kulal
@Sumith1896
Aug 18 2015 23:13
No problem, I'll mail it to you
Thanks for the help, that segfault was troubling me
Ondřej Čertík
@certik
Aug 18 2015 23:14
Maybe there is some other problem as well, I don't know. Finally, do you remember why Piranha is still 1ms faster?
Sumith Kulal
@Sumith1896
Aug 18 2015 23:15
Nothing that I recollect
Major optimizations have been incorporated
I'll discuss this with Francesco
Ondřej Čertík
@certik
Aug 18 2015 23:16
Ok. We can worry about the 1ms later. For now, try to get #597 in a state that also runs in 14.5ms, then we'll go from there.
Sumith Kulal
@Sumith1896
Aug 18 2015 23:18
Running in 14.5ms :)