These are chat archives for symengine/symengine

5th
Jan 2016
Abhinav Agarwal
@abhinavagarwal07
Jan 05 2016 14:32
for code quality check we can also use grep commands which are less error prone and easy to write:
Example
if ! (egrep -r "\){" --include \*.h --include \*.cpp --include \*.inc*  $SOURCE_DIR/ ); then echo Good to go; else exit -1; fi
Ondřej Čertík
@certik
Jan 05 2016 15:42
@abhinavagarwal07 what does this egrep command check?
Trailing white space?
I noticed that our Python script doesn't in Python 3, so we would need to fix it.
Sumith Kulal
@Sumith1896
Jan 05 2016 15:46
This message was deleted
Sorry, got sent in from my phone :)
Isuru Fernando
@isuruf
Jan 05 2016 15:55
@certik, about #738, about my suggestion, it's not going to work as it is. It is possible via CRTP, but then if there is any Visitor that we want to let the user extend, then it would have to be in a .h file as CRTP uses templates
Ondřej Čertík
@certik
Jan 05 2016 16:04
@isuruf good job!
I think that the header file can only be included from one cpp file, so it won't slow down compilation, cannot it?
Isuru Fernando
@isuruf
Jan 05 2016 16:06
Yes, it won't slow down compilation
Ondřej Čertík
@certik
Jan 05 2016 16:09
Right, so it is still just a local implementation that the rest of the code doesn't know about.
I think that's the way to go.
@isuruf can you give #738 a final +1 on the PR?
Isuru Fernando
@isuruf
Jan 05 2016 16:10
StrPrinter is the one which the user can extend, but the user cannot extend StrPrinterFinal nor should the user extend methods like virtual void accept(StrPrinterFinal &v). I wanted to add final keyword to it, but that's not possible since there are some classes that are extended
Ondřej Čertík
@certik
Jan 05 2016 16:11
The AppVeyor is stuck, but other PRs will have to get updated once #738 is merged, so I don't want to hold other people any longer. If something broke on MSVC, we can fix it once AppVeyor is up. We are not using any new features of C++ that we didn't test on AppVeyor before.
I think all the accept methods are hidden via our macros.
so the user can't extend them anyway, since then the method would be declared twice.
So the way it works is that StrPrinter works with visitor pattern and the user can extend it, and the StrPrinterFinal hooks it up with just one virtual function call?
Isuru Fernando
@isuruf
Jan 05 2016 16:15
Close. What happens is StrPrinterFinal will use one virtual function call, but if the user extends StrPrinter it will use two virtual function calls
Ondřej Čertík
@certik
Jan 05 2016 16:16
So if the user/developer wanted to extend StrPrinter, I think one should be able to subclass it to MyStrPrinter (which would use two virtual function calls) and then create MyStrPrinterFinal and hook it up as well (which would use one virtual function call), shouldn't it?
Isuru Fernando
@isuruf
Jan 05 2016 16:16
Yes. if the user wants to change symengine
Ondřej Čertík
@certik
Jan 05 2016 16:17
Right. So the design is completely general.
I think we nailed it.
The only other possible issue that I can see is that the visitor pattern forces you to handle input and output arguments via internal variables because the accept methods have a set signature, while the single virtual function call allows any kind of function signature, since you have to extend basic-methods.inc anyway. I.e. the diff methods pass everything using arguments. The issue is that there might be some performance implications of passing everything via arguments vs internal variables. I am not sure which one is faster, but there might be some difference.
Isuru Fernando
@isuruf
Jan 05 2016 16:19
Yes, the best advantage is that if the user doesn't want to change symengine, then he/she can extend StrPrinter like in MyStrPrinter and it will automatically become a two virtual function call visitor
Abhinav Agarwal
@abhinavagarwal07
Jan 05 2016 16:20

@certik

@abhinavagarwal07 what does this egrep command check?
the above code checks if you have any instance of ){ (note no space between ) and {. similar checks can be made for no space after comma.
P.S. symengine/symengine#716 I did that using grep and sed only.
egrep -r ",[[:digit:]|[:alnum:]]" --include \*.h --include \*.cpp --include \*.in* .

Ondřej Čertík
@certik
Jan 05 2016 16:20
Exactly, that's the big advantage of the visitor pattern.
@abhinavagarwal07 I see, thanks.
Isuru Fernando
@isuruf
Jan 05 2016 16:22
@certik, do you see any speed difference in eval_double1 between master and my branch? I didn't see any significant improvement
Ondřej Čertík
@certik
Jan 05 2016 16:23
I didn't check your branch.
Since you overload the virtual method, you have to be very careful to ensure that only one virtual function call is being made, i.e. that the right overload is called.
@isuruf we should create an even simpler benchmark, where you do almost nothing in the visitor methods, and then we should be able to see a difference.
Isuru Fernando
@isuruf
Jan 05 2016 16:25
I'm a bit puzzled by that difference. Since in that commit, only the first call uses accept(EvalRealDoubleVisitor) and all later calls use two virtual function calls
Ondřej Čertík
@certik
Jan 05 2016 16:25
Right. It might be just a statistical fluke.
We should create a benchmark to measure this.
To make sure we are not fooling ourselves.
I know I was seeing difference between the visitor pattern and the single dispatch using an array.
Even yesterday, i.e. if I change eval_double to eval_double_single_dispatch, then the eval_double_single_dispatch version is slower.
So there is a difference there for sure.
We can create a benchmark that each visitor method adds an integer (each method a different integer) towards an internal variable.
That way each method still needs to be executed and we print the final sum.
But the overhead there should be just the virtual function calls.
Add and Mul would iterate over members. Or perhaps only Add. And then we would run this on a large Add of various things.
Or any other similar benchmark, we just need to make the work trivial (but impossible to optimize out) and expose the function calls overhead.
Isuru Fernando
@isuruf
Jan 05 2016 16:31
That's a good idea, but if the change only influence benchmarks with no computation and other benchmarks are not affected, then there is no point in changing the visitor pattern that is there now
Ondřej Čertík
@certik
Jan 05 2016 16:32
That is indeed true.
Should we wait with merging #738 for now?
I think it is safe to merge #738, because we can always port it to full visitor pattern later.
The only question is whether we need this intermediate step with one virtual function call. For now we are not sure, so we keep it. If it turns out that the difference is not measurable, we can use two virtual function calls.
Ondřej Čertík
@certik
Jan 05 2016 16:38
@isuruf let me know what you think.
Isuru Fernando
@isuruf
Jan 05 2016 16:43
I'm okay with merging #738 for now
Ondřej Čertík
@certik
Jan 05 2016 16:44
Ok, I just merged it.
Some of the other PRs need to be updated.
I'll try to see if we can see the speed difference in benchmarks.
Abhinav Agarwal
@abhinavagarwal07
Jan 05 2016 16:48
@certik I will do the merge tomorrow if that's ok with you.
Ondřej Čertík
@certik
Jan 05 2016 18:00
@abhinavagarwal07 no worries!
Sumith Kulal
@Sumith1896
Jan 05 2016 18:07
@certik This decision looks good. I need to look into it further to understand more.
By this it means I shall not proceed with symengine/symengine#734, shall I?
@isuruf Let me know too.
Ondřej Čertík
@certik
Jan 05 2016 18:09
@Sumith1896 your work prompted me to do #738.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:09
@certik I realised :smile: For the good
I like this from the overview
Ondřej Čertík
@certik
Jan 05 2016 18:09
When we created the visitor pattern, I didn't know the #738 approach was possible. I just got the idea few days ago by seeing your PR.
And even then didn't know how to connect it to the visitor pattern --- it was @isuruf who figured that out.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:11
I think if we have nailed it, you can close symengine/symengine#734
Having a look at this now symengine/symengine#634
Ondřej Čertík
@certik
Jan 05 2016 18:11
@Sumith1896 I think the next step now when we can easily test single and double virtual calls is to create benchmarks to test what the overhead is.
So that we can see in what situations it makes sense to use the single call (which does require adding a line to basic-methods.inc, so it is not completely local).
Sumith Kulal
@Sumith1896
Jan 05 2016 18:14
Yes, I just went through your idea above.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:22
One off topic thing, why are my references showing as symengine/symengine#634 instead of #634?
Just wondering
Ondřej Čertík
@certik
Jan 05 2016 18:23
Because you posted the whole url, instead of just a # and a number.
Akash Trehan
@CodeMaxx
Jan 05 2016 18:27
I'm looking for the difference between Ptr<RCP> and RCP. Where websites should I be searching . Can't find anything promising on google.
Ondřej Čertík
@certik
Jan 05 2016 18:28
Akash Trehan
@CodeMaxx
Jan 05 2016 18:29
@certik Thanxx
Ondřej Čertík
@certik
Jan 05 2016 18:33
const Ptr<X> &x in an argument is used to return values to the caller. Honestly, I still prefer to use just X &x. Both is equally safe.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:36
Oh man, I've been doing that a lot :smile:
Ondřej Čertík
@certik
Jan 05 2016 18:37
@Sumith1896 are you talking about the urls?
Sumith Kulal
@Sumith1896
Jan 05 2016 18:38
Yeah
@certik To get reproducible benchmark on Nonius, do we need tcmalloc all the time?
Ondřej Čertík
@certik
Jan 05 2016 18:49
@Sumith1896 it seems to me that to get reproducible benchmarks (with or without Nonius) we require TCMALLOC.
Actually, they are reproducible. It's just that the benchmarks depend on the order which you run them (with or without Nonius) and TCMALLOC seems to reduce this order dependence, since it seems to be managing the heap more efficiently.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:52
@certik We need to add this in the README. We can do it in the PR itself else we can do it later too.
Ondřej Čertík
@certik
Jan 05 2016 18:53
In the long run, we might want to rather use C++ allocators, where we preallocate a chunk of memory (with either malloc) and then manage the small object allocations ourselves. Though this TCMALLOC seems to be performing very well for me at least.
I experimented a bit with allocators already, but it's quite a lot of work.
Sumith Kulal
@Sumith1896
Jan 05 2016 18:56
I'm playing with your PR. I'll let you know.
Akash Trehan
@CodeMaxx
Jan 05 2016 18:59
@Sumith1896 can you help me out a bit with passing to functions in #736. Have got most of it working.Stuck in a couple of places.
Sumith Kulal
@Sumith1896
Jan 05 2016 19:01
Yes. Will do. Give me some time, I'm in between #634. I'll get back to you.
Akash Trehan
@CodeMaxx
Jan 05 2016 19:07
Cool
Ondřej Čertík
@certik
Jan 05 2016 19:21

We'll have to revisit the whole SymEngine memory model, but I would do it once we have more real life experience with symengine and good real life benchmarks, since it's a lot of work and I want to optimize for how people actually want to use SymEngine, not for artificial benchmarks. Things to play with are:

  • removing internal members (e.g. we cache the hash, but that means more memory needs to be allocated for each instance)
  • removing reference counting -- we would always copy things by value and not have to use RCP at all (so you save on reference counting and can remove the refcount member, but copies are more expensive).
  • cheap copying (related to the previous point) if all our classes are trivially copyable, i.e. that you can use memcopy to create a copy of the class, then this might speed up lots of things. Add and Mul use std::map or std::unordered_map, neither of which is trivially copyable (i.e. it uses pointers internally), so we would have to figure out if it is possible to design a hashtable that is trivially copyable.
  • if the size of each instance (of classes like Add, Mul, Sin, Cos, ...) is predictable, then we should manage the memory ourselves by using C++ allocators. For each operation, like expand or diff, you predict (at least roughly) how much memory you need, then you allocate a chunk with one malloc, and then you fill in all the little instances into it sequentially. The idea is simple, but it's quite difficult in practice.
  • another idea is that since we already have our own type codes for each instance and since we are taking stuff out from the core and reimplementing it outside using the approach in #738, instead of using C++ classes, we can switch to just C structs (i.e. not storing pointers to the virtual functions tables in each instance) and implement the virtual functions ourselves. We already have our own implementation of the single dispatch in eval_double_single_dispatch and it is slower than double virtual functions in the visitor pattern (in eval_double), so this might be difficult to do. But the advantage would be an easier memory model that can be manipulated (created, copied, moved) faster.

So there is a lot that can be done in the future. What I would do now is try to use the approach of #738 and get as much stuff out of the core as we can. That will simplify any later refactoring.

Sumith Kulal
@Sumith1896
Jan 05 2016 19:26
Do we have any other methods that can be benefited from #738 kind of refactoring?
Ondřej Čertík
@certik
Jan 05 2016 19:46
@Sumith1896 we have subs.
Then expand_as_exp
Ondřej Čertík
@certik
Jan 05 2016 19:52
And then probably we should also do it for __eq__, __cmp__ and __hash__, I am not 100% sure yet, but it looks like it makes sense to do it.
That's it, there are no more virtual methods in Basic.
Sumith Kulal
@Sumith1896
Jan 05 2016 20:32
Cool, these are really important points you mention above. I have put up a wiki doc here https://github.com/symengine/symengine/wiki/SymEngine-memory-model
I like Nonius, you can merge the PR when ready. tcmalloc does bring in consistency.
@CodeMaxx What problems are you facing?
Akash Trehan
@CodeMaxx
Jan 05 2016 20:35
@Sumith1896 certik pointed out the error so its solved. But you can still review the function.
Sumith Kulal
@Sumith1896
Jan 05 2016 20:36
Cool :smile:
I'll review it
Ondřej Čertík
@certik
Jan 05 2016 20:42
@Sumith1896 thanks for putting my notes to the wiki.
Sumith Kulal
@Sumith1896
Jan 05 2016 20:43
NP. We'll need it.
Akash Trehan
@CodeMaxx
Jan 05 2016 21:21
@Sumith1896 can you take a look at this