These are chat archives for bluescarni/pagmo_reborn

27th
May 2016
Dario Izzo
@darioizzo
May 27 2016 08:27
So, I disappeared for a while as I was hitting against a rather nasty bug and stuff ... I think I got rid of it ... but here is the problem: how do we test algorithms?
In the old PAGMO DE on rosenbrock 10 -> average perf. is 0.98:
``````In [106]: prob
Out[106]:
Problem name: Rosenbrock
Global dimension:            10
Integer dimension:            0
Fitness dimension:            1
Constraints dimension:            0
Inequality constraints dimension:    0
Lower bounds: [-5, -5, -5, -5, -5,  ... ]
Upper bounds: [1, 1, 1, 1, 1,  ... ]
Constraints tolerance: []

In [107]: algo
Out[107]: Algorithm name: Differential Evolution - gen:500 F: 0.8 CR: 0.9 variant:2 ftol:1e-06 xtol:1e-06

In [108]: res = list()

In [109]: for i in range(1000):
   .....:     pop = population(prob, 20)
   .....:     pop = algo.evolve(pop)
   .....:     res.append(pop.champion.f[0])
   .....:

In [110]: mean(res)
Out[110]: 0.98316032875280235``````
Francesco Biscani
@bluescarni
May 27 2016 08:31
what do you mean exactly by testing the algorithm? performance-wise, correctness, number of fevals?
I just finished fixing the slowdown reported yesterday:
``````In [8]: time pygmo.problem(pygmo.rosenbrock(dim = 1000))
CPU times: user 0 ns, sys: 132 µs, total: 132 µs
Wall time: 139 µs
Out[8]:
Problem name: Multidimensional Rosenbrock Function
Global dimension:                       1000
Fitness dimension:                      1
Number of objectives:                   1
Equality constraints dimension:         0
Inequality constraints dimension:       0
Lower bounds: [-5, -5, -5, -5, -5, ... ]
Upper bounds: [1, 1, 1, 1, 1, ... ]

Has hessians: false
User implemented hessians sparsity: false

Function evaluations: 0

In [9]: time pygmo.problem(pygmo.translate(pygmo.rosenbrock(dim = 1000),a))
CPU times: user 0 ns, sys: 447 µs, total: 447 µs
Wall time: 454 µs
Out[9]:
Problem name: Multidimensional Rosenbrock Function [translated]
Global dimension:                       1000
Fitness dimension:                      1
Number of objectives:                   1
Equality constraints dimension:         0
Inequality constraints dimension:       0
Lower bounds: [-4, -4, -4, -4, -4, ... ]
Upper bounds: [2, 2, 2, 2, 2, ... ]

Has hessians: false
User implemented hessians sparsity: false

Function evaluations: 0

Extra info:

Translation Vector: [1, 1, 1, 1, 1, ... ]``````
Dario Izzo
@darioizzo
May 27 2016 08:33
The same computation in the new PaGMO yields:
``````Running 1 test case...
de: 1.08959

*** No errors detected
./sade  25.91s user 0.17s system 99% cpu 26.092 total``````
I mean they are not the same algorithm anymore .... how do we deal with this?
Francesco Biscani
@bluescarni
May 27 2016 08:36
I don't understand the question... do you want to test the performance?
Dario Izzo
@darioizzo
May 27 2016 08:36
I have no idea why, nor how to check or test or debug ... this is going to be a problem for all the algorithms to be implemented. I am actually thinking of blaming the random number generator ... algorithmically, to me they are the same!
Francesco Biscani
@bluescarni
May 27 2016 08:37
so the difference is between 1.08 and 0.98?
Dario Izzo
@darioizzo
May 27 2016 08:37
And it's a problematic one.
Not a CS one though, as the code compiles, runs and has no leaks :)
It's not a question. It's a statement
yes, over 1000 runs. If I repeat it I get the same numbers, so it's statistically very significant
Francesco Biscani
@bluescarni
May 27 2016 08:39
I wouldn't care about a 10% performance difference at this stage
you compiled everything in release I presume?
Dario Izzo
@darioizzo
May 27 2016 08:39
yes, and it's not a performance (timing) issue; that is the obj fun value
Francesco Biscani
@bluescarni
May 27 2016 08:40
I am still not understanding what the problem is
I thought you were worried that it is slower?
Dario Izzo
@darioizzo
May 27 2016 08:40
No, it's a different algo.
For unknown reasons
This message was deleted
Francesco Biscani
@bluescarni
May 27 2016 08:41
sorry but it's kinda hard to deduce what you mean from what you pasted in the chat... so the issue is that the number of objfun evaluations is different?
Dario Izzo
@darioizzo
May 27 2016 08:42
Experiment -> call DE(500) 1000 times on rosenbrock(10), record the results
In PaGMO legacy I get 0.98, In new PaGMO I get 1.1
Same algorithm in theory, different results
Explanations:
1 - a small bug somewhere (how to find it?)
2 - the rng (WTF)
3 - It's all random anyway
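A back-of-the-envelope check supports the "statistically very significant" claim above. This is a sketch with an assumed per-run standard deviation (not measured from the real experiment): if the spread per run is around 1, a 0.98 vs 1.09 gap over 1000 runs is several standard errors wide.

```python
# Hypothetical numbers: the per-run standard deviation is assumed,
# not measured from the actual legacy/new PaGMO runs.
n = 1000                  # independent runs per version
std = 1.0                 # assumed per-run std of the champion fitness
sem = std / n ** 0.5      # standard error of the mean
gap = 1.08959 - 0.98316   # new PaGMO mean minus legacy mean
print(sem)                # ~0.0316
print(gap / sem)          # ~3.4: hard to explain as noise if std is really ~1
```

Under these assumptions the gap is roughly 3.4 standard errors, so it points to a real behavioural difference rather than sampling noise.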
Francesco Biscani
@bluescarni
May 27 2016 08:43
how would you understand if it is a "bug" in the old or new pagmo?
Dario Izzo
@darioizzo
May 27 2016 08:43
Exactly!!!
That's the fucking issue
with this random metaheuristic bullshit
How?
I mean, I did my best in both cases, but is that "scientific"?
Francesco Biscani
@bluescarni
May 27 2016 08:44
and the result in the old pagmo is always the same regardless of the RNG? do you have seed control in the old pagmo?
Dario Izzo
@darioizzo
May 27 2016 08:45
In both cases I do not control the seed, but in the old pagmo a lagged Fibonacci generator is used as m_e; we only have the Mersenne Twister in the new one
Anyway, my point is:
how do we communicate to the user the expected performance of the algorithms in PaGMO?
Francesco Biscani
@bluescarni
May 27 2016 08:46
well if you are comparing different code, and you don't know upfront how to compare the result because you don't know what the result is supposed to be in the first place, the comparison looks rather pointless
but hey, I live in a more deterministic world usually :)
Dario Izzo
@darioizzo
May 27 2016 08:47
exactly; on the other hand, is it not important to know whether what we call DE in PaGMOreborn has any relation to what we called DE in legacy?
Francesco Biscani
@bluescarni
May 27 2016 08:48
I mean, is it really a meaningful measure to give to the user that the old pagmo after N generations of DE gives a specific result on a specific problem? does it have any relation to the general behaviour of the algo?
Francesco Biscani
@bluescarni
May 27 2016 08:49
my instinctive reaction would be that if there are no order of magnitude differences we are good
in any case, if the RNGs are different all bets are off on reproducing the exact behaviour
Dario Izzo
@darioizzo
May 27 2016 08:52
hence my desperation
Francesco Biscani
@bluescarni
May 27 2016 08:52
the mersenne twister (at least in the C++11 implementation) is guaranteed to produce a certain sequence of values, but I don't know if it applies also to the Boost one
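That determinism guarantee is easy to check on the Python side too: CPython's `random` module is also an MT19937 under the hood, so identical seeds reproduce identical sequences (a quick stdlib sketch, not pagmo code):

```python
import random

# CPython's random module uses the Mersenne Twister (MT19937):
# two generators seeded identically produce the same sequence,
# which is the property that makes seed-controlled runs repeatable.
a = random.Random(123)
b = random.Random(123)
seq_a = [a.random() for _ in range(5)]
seq_b = [b.random() for _ in range(5)]
print(seq_a == seq_b)  # True
```

The C++11 `std::mt19937` gives the same guarantee by specification; whether the Boost implementation matches it bit-for-bit is the open question above.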
I would not worry
Dario Izzo
@darioizzo
May 27 2016 08:53
My worry is that I implement all the DE algos now only to find out they perform worse than the old PaGMO
Anyway, I am thinking of having an ERT module earlier than I was planning (ERT is the Expected Run Time, a metric that is quite solid for this type of thing and that is computed from multiple trials of an algo)
Francesco Biscani
@bluescarni
May 27 2016 08:55
the runtime is tricky though as it will depend on many things outside pagmo
Dario Izzo
@darioizzo
May 27 2016 08:56
It's defined in terms of the objective function evaluations needed
They call it ERT
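A minimal sketch of how such a metric could be computed from recorded trials, following the COCO/BBOB-style definition (total evaluations spent across all runs, divided by the number of runs that reached the target); the trial data here is made up:

```python
def ert(trials, target):
    """Expected Run Time: total function evaluations across all trials,
    divided by the number of trials whose best fitness hit the target.

    trials: list of (fevals, best_fitness) pairs, one per run.
    """
    successes = sum(1 for _, best in trials if best <= target)
    total_fevals = sum(fevals for fevals, _ in trials)
    return total_fevals / successes if successes else float("inf")

# made-up data: three runs reach the target, one does not
runs = [(1200, 1e-7), (900, 5e-7), (2000, 0.3), (1100, 9e-7)]
print(ert(runs, 1e-6))  # (1200+900+2000+1100)/3
```

Being based on fevals rather than wall-clock time, it sidesteps the "depends on many things outside pagmo" objection below.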
Francesco Biscani
@bluescarni
May 27 2016 08:56
right ok
Dario Izzo
@darioizzo
May 27 2016 08:56
Francesco Biscani
@bluescarni
May 27 2016 09:01
I hope we don't have to modify the problem any more, it's always a pain to keep things in sync between C++ and Python
anyway now the problem should be fully exposed
Dario Izzo
@darioizzo
May 27 2016 09:02
you mean the has_sparsity thingy?
Francesco Biscani
@bluescarni
May 27 2016 09:02
yes I just pushed that
Francesco Biscani
@bluescarni
May 27 2016 09:37
``````In [4]: p.set_seed(0)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-f72c15c60d88> in <module>()
----> 1 p.set_seed(0)

RuntimeError:
function: set_seed_impl
where: /home/yardbird/repos/PaGMOreborn/pygmo/../include/problem.hpp, 375
what: The set_seed method has been called but not implemented by the user.
A function with prototype 'void set_seed(unsigned int)' was expected in the user defined problem.``````
probably we want to trim down these error messages when they are visible from both C++ and Python
Dario Izzo
@darioizzo
May 27 2016 09:38
why?
Francesco Biscani
@bluescarni
May 27 2016 09:38
the last line specifically
because it does not make sense from python
there's no `void` and no `unsigned` in python
Dario Izzo
@darioizzo
May 27 2016 09:43
right ... pity though, I liked giving the prototype in the message, I felt PRO
Francesco Biscani
@bluescarni
May 27 2016 09:44
I think it's fine as long as we have decent documentation and tutorials
I don't expect one to code a problem by trial and error
Dario Izzo
@darioizzo
May 27 2016 09:44
``````In [4]: p.set_seed(0)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-f72c15c60d88> in <module>()
----> 1 p.set_seed(0)

RuntimeError:
function: set_seed_impl
where: /home/yardbird/repos/PaGMOreborn/pygmo/../include/problem.hpp, 375
what: The set_seed method has been called but not implemented by the user.
C++: A method with prototype 'void set_seed(unsigned int)' was expected in the user defined problem.
Python: A method with prototype 'set_seed(self, seed)' was expected in the user defined problem.``````
Francesco Biscani
@bluescarni
May 27 2016 09:45
I have re-done it anyway, as the current version did not check whether `has_set_seed()` had been overridden or not... this was the current behaviour:
``````In [14]: class prob(object):
   ....:     def fitness(self,v):
   ....:         return [v[0]*v[0]]
   ....:     def get_bounds(self):
   ....:         return ([0],[1])
   ....:     def set_seed(self,n):
   ....:         pass
   ....:     def has_set_seed(self):
   ....:         return False
   ....:

In [15]: p = pygmo.problem(prob())

In [16]: p.set_seed(seed=2)``````
could do that
Dario Izzo
@darioizzo
May 27 2016 09:46
This is soo cool ...
I am sure even Luis will love it!
Francesco Biscani
@bluescarni
May 27 2016 09:47
it's pretty awesome yeah
the kwargs are going to be useful especially for all those params in the evolutionary algos
Dario Izzo
@darioizzo
May 27 2016 09:48
Indeed, in the old PaGMO it was a pain to rewrite all the constructors to have kwargs
thousands of lines of monkey coding
Why is has_set_seed false?
why did you put it in the example above, I mean?
Francesco Biscani
@bluescarni
May 27 2016 09:49
because I wanted to check if the override was working
Dario Izzo
@darioizzo
May 27 2016 09:49
but then you call p.set_seed?
and that is not throwing?
Francesco Biscani
@bluescarni
May 27 2016 09:50
yes that was the problem I was alluding to above
Dario Izzo
@darioizzo
May 27 2016 09:50
ah ok .. also in c++ ?
Francesco Biscani
@bluescarni
May 27 2016 09:50
yes
Dario Izzo
@darioizzo
May 27 2016 09:50
k
If I stay one day away from these classes I am lost :)
Hopefully they are done
Francesco Biscani
@bluescarni
May 27 2016 09:51
``````In [1]: import pygmo

In [2]: class prob(object):
   ...:     def fitness(self,v):
   ...:         return [v[0]*v[0]]
   ...:     def get_bounds(self):
   ...:         return ([0],[1])
   ...:     def set_seed(self,n):
   ...:         pass
   ...:     def has_set_seed(self):
   ...:         return False
   ...:

In [3]: p = pygmo.problem(prob())

In [4]: p.set_seed(seed=2)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-8dac1e1205f4> in <module>()
----> 1 p.set_seed(seed=2)

RuntimeError:
function: set_seed
where: /home/yardbird/repos/PaGMOreborn/pygmo/../include/problem.hpp, 1043
what: the user-defined problem does not support seed setting: either it does not provide the 'set_seed()' method or its 'has_set_seed()' method returns false

The expected prototypes for 'set_seed()' are:
C++: 'void set_seed(unsigned int)'
Python: 'set_seed(self, seed)'

In [5]:``````
better?
Dario Izzo
@darioizzo
May 27 2016 09:52
nice
Francesco Biscani
@bluescarni
May 27 2016 10:36
we should really look into becoming part of this
Francesco Biscani
@bluescarni
May 27 2016 10:56
it's pretty cool, we can reference the breathe docs from the Python side of the docs. For instance, the current docstring of the problem class:
``````The main problem class.

>>> from pygmo import problem, rosenbrock
>>> p = problem(rosenbrock(dim=5))
>>> p.fitness([1,2,3,4,5])
array([ 14814.])``````

this creates the correct link to `pagmo::problem` in the generated sphinx
Dario Izzo
@darioizzo
May 27 2016 11:49
numfocus .. yes
let's
@CoolRunning did you investigate making pagmo a non-profit?
Marcus Märtens
@CoolRunning
May 27 2016 12:00
Hey - not yet. The guy who could know these things unfortunately got his PhD terminated so he is rarely here anymore :D
Francesco Biscani
@bluescarni
May 27 2016 13:08
I started looking a bit at the multiprocessing stuff on Windows, so my understanding has evolved a bit
I believe now that we can actually make multiprocessing work on recent Python 3.x without too many issues across Linux and Windows
talking about Python >= 3.4 here
it looks like they unified the way the processes are spawned
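The unified API being referred to is the selectable start method in the stdlib; a minimal illustration (stdlib only, not pagmo code):

```python
import multiprocessing as mp

# Since Python 3.4 the start method is selectable per context:
# "spawn" is available on every platform (and the only choice on
# Windows), so requesting it explicitly gives uniform behaviour.
print(mp.get_all_start_methods())  # e.g. ['fork', 'spawn', 'forkserver'] on Linux
ctx = mp.get_context("spawn")      # Process/Pool created from ctx use spawn
```

Processes and pools created from `ctx` then start the same way on Linux and Windows, which is what makes a cross-platform multiprocessing backend feasible without ipyparallel.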
for anything earlier, I still believe we need the jupyter/ipyparallel support
which makes it slightly less user friendly
with the jupyter stuff it basically means that pagmo cannot start external processes on its own: you need to fire up the worker processes either from the command line, or from the jupyter notebook interface
Francesco Biscani
@bluescarni
May 27 2016 13:14
so before running any parallel stuff in pagmo, you need to do this extra step (which is a single command from the command line or a click from the notebook interface)
Dario Izzo
@darioizzo
May 27 2016 13:14
I do not get it
What do I need to do in jupyter?
Francesco Biscani
@bluescarni
May 27 2016 13:14
the upshot is that this works also over the network (with, of course, slightly more configuration steps due to the security setup)
do you have a notebook open?
Dario Izzo
@darioizzo
May 27 2016 13:15
In a moment
I do
Francesco Biscani
@bluescarni
May 27 2016 13:15
http://localhost:8888/ you should see a "clusters" tab there
Dario Izzo
@darioizzo
May 27 2016 13:16
I do
Francesco Biscani
@bluescarni
May 27 2016 13:16
does it show anything there for you?
Dario Izzo
@darioizzo
May 27 2016 13:16
no
"Clusters tab is now provided by IPython parallel. See IPython parallel for installation details."
Francesco Biscani
@bluescarni
May 27 2016 13:16
ok I guess we need to install and setup ipyparallel in some way, since now it's an external module
but a few months ago they were together
basically you can start worker kernels with a mouse click from that tab
and then pagmo uses those to do the parallel thingie
Dario Izzo
@darioizzo
May 27 2016 13:17
So if I have an archi with 1024 islands I need to click 1024 times?
Francesco Biscani
@bluescarni
May 27 2016 13:17
no you can put a number of kernels in a textbox
Dario Izzo
@darioizzo
May 27 2016 13:18
ah ok
so I start them, then pagmo uses them ... well I am curious to see how this will work ...
and in ipython?
no need?
or is it the same and we need to issue a command?
Francesco Biscani
@bluescarni
May 27 2016 13:18
same thing.. you can start them from the notebook interface or from the commandline
Dario Izzo
@darioizzo
May 27 2016 13:18
commandline of ipython?
or of jupyter?
or both?
Francesco Biscani
@bluescarni
May 27 2016 13:19
it's a separate project called ipyparallel, so you need to call that with some option in order to start the workers
`ipcluster nbextension enable`
try this and then restart the notebook
Dario Izzo
@darioizzo
May 27 2016 13:20
try where? bash?
python?
Francesco Biscani
@bluescarni
May 27 2016 13:20
bash
Dario Izzo
@darioizzo
May 27 2016 13:20
Francesco Biscani
@bluescarni
May 27 2016 13:20
`pacman -S ipyparallel` maybe?
anyway now I see the interface to start workers from the notebook
`ipcluster start` I think this one starts as many workers as you have processors on the machine
yes
Dario Izzo
@darioizzo
May 27 2016 13:22
``````Traceback (most recent call last):
  File "/usr/bin/ipcluster", line 2, in <module>
    from ipyparallel.apps.ipclusterapp import launch_new_instance
  File "/usr/lib/python3.5/site-packages/ipyparallel/__init__.py", line 10, in <module>
    import zmq
ImportError: No module named 'zmq'``````
Francesco Biscani
@bluescarni
May 27 2016 13:22
so pacman does not install all the deps of ipyparallel?
weird
Dario Izzo
@darioizzo
May 27 2016 13:23
it's not pacman
Francesco Biscani
@bluescarni
May 27 2016 13:23
ah I see
Dario Izzo
@darioizzo
May 27 2016 13:23
this thing is not supported in pacman
it's in the AUR
yaourt
Francesco Biscani
@bluescarni
May 27 2016 13:23
but don't you use pacman as well for stuff in non-official repos?
I believe I used a non official repo with pacman to install the gitlab ci runner
but I don't really know :)
Dario Izzo
@darioizzo
May 27 2016 13:24
``````  File "/usr/bin/ipcluster", line 2, in <module>
    from ipyparallel.apps.ipclusterapp import launch_new_instance
  File "/usr/lib/python3.5/site-packages/ipyparallel/__init__.py", line 16, in <module>
    from .serialize import *
  File "/usr/lib/python3.5/site-packages/ipyparallel/serialize/__init__.py", line 2, in <module>
    from .serialize import (
  File "/usr/lib/python3.5/site-packages/ipyparallel/serialize/serialize.py", line 20, in <module>
    from jupyter_client.session import MAX_ITEMS, MAX_BYTES
ImportError: No module named 'jupyter_client'``````
?? I have jupyter
Francesco Biscani
@bluescarni
May 27 2016 13:25
did you install jupyter via pacman? maybe you need to update? they had a big renaming/splitting of the ipython/jupyter project recently
Dario Izzo
@darioizzo
May 27 2016 13:26
ok updating .... what a PAIN IN THE ASS
Francesco Biscani
@bluescarni
May 27 2016 13:26
:D
luckily once we are in conda it should work okish
Dario Izzo
@darioizzo
May 27 2016 13:26
are you around tonight for some gaming?
or are you working?
Francesco Biscani
@bluescarni
May 27 2016 13:27
I am always there, it's you who are "throwing garbage cans" :)
Dario Izzo
@darioizzo
May 27 2016 13:27
:) no way, yesterday I banged my head against DE and friends ..... then I solved it
tonight I hope to stop writing crap by 9-10 :)
Francesco Biscani
@bluescarni
May 27 2016 13:30
wanna play eve?
Dario Izzo
@darioizzo
May 27 2016 13:30
did you get hooked again?
I can try .... I have the hardware :)
Francesco Biscani
@bluescarni
May 27 2016 13:31
I logged in today; there was an email from a month ago from the old gang giving me tips on where they are at the moment, in case I wanted to come back
Dario Izzo
@darioizzo
May 27 2016 13:32
do I download it from Steam?
or is it separate?
Francesco Biscani
@bluescarni
May 27 2016 13:35
separate
there's a Steam version but it's crap
Dario Izzo
@darioizzo
May 27 2016 13:36
then the first month is free, right?
Francesco Biscani
@bluescarni
May 27 2016 13:36
yes
point is
it will be a very different experience from dota2
Dario Izzo
@darioizzo
May 27 2016 13:37
I imagine; the problem is making it enjoyable on a budget of 1-2 hours per evening
Francesco Biscani
@bluescarni
May 27 2016 13:39
we'll see
Dario Izzo
@darioizzo
May 27 2016 15:25
so, I now have the cluster thingy
Francesco Biscani
@bluescarni
May 27 2016 15:25
cool stuff
Dario Izzo
@darioizzo
May 27 2016 15:26
sarcastic bastard
Francesco Biscani
@bluescarni
May 27 2016 15:26
what do you want me to tell you :)
Dario Izzo
@darioizzo
May 27 2016 15:26
like what do I do with it?
Francesco Biscani
@bluescarni
May 27 2016 15:27
nothing at the moment... if you run `ipcluster start` it starts a pool of N workers, N being the number of cores
then from pagmo we can use those things as processes in which to run the evolution
Dario Izzo
@darioizzo
May 27 2016 15:28
What happens if I start an evolve on 340 islands and I have only 320 workers?
Francesco Biscani
@bluescarni
May 27 2016 15:28
this depends on how we code it... it's the same question as for threads
in the old pagmo we would start 340 threads or processes
Dario Izzo
@darioizzo
May 27 2016 15:28
Francesco Biscani
@bluescarni
May 27 2016 15:29
but I was thinking of going for a thread pool
in the old pagmo yes
Dario Izzo
@darioizzo
May 27 2016 15:29
in this case someone needs to make sure workers are started manually beforehand, right?
Francesco Biscani
@bluescarni
May 27 2016 15:29
yes, if there are no workers open the evolve will fail
Dario Izzo
@darioizzo
May 27 2016 15:30
so we can show a message like "Not enough workers available - spawn a few more"
Francesco Biscani
@bluescarni
May 27 2016 15:30
I would not do that, I'd rather have a queue of evolutions, like I plan to do with threads
Dario Izzo
@darioizzo
May 27 2016 15:30
Why not 1 evolution, 1 thread?
One counts as one!! M5S
Francesco Biscani
@bluescarni
May 27 2016 15:31
it would still be 1 thread per evolution, but instead of opening one you use one from the thread pool
Dario Izzo
@darioizzo
May 27 2016 15:31
Francesco Biscani
@bluescarni
May 27 2016 15:31
if all threads are busy, it ends up in a queue and it gets consumed as the threads finish up their tasks
Dario Izzo
@darioizzo
May 27 2016 15:32
but then someone has to decide how many threads to use right?
Francesco Biscani
@bluescarni
May 27 2016 15:32
rather than spawning a new thread every time, you have a pool of threads that are waiting for data to process
yes, by default you create the pool with the same number of threads as cores on the system
it's what I do in Piranha
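The pool idea can be sketched in a few lines with Python's stdlib (using a stand-in `fake_evolve`, not the real island code): tasks beyond the pool size wait in the executor's internal queue and are consumed as workers free up.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_evolve(island_id):
    # stand-in for a real evolve(): just a cheap computation
    return island_id * 2

# 1000 "islands" simulated on a pool of 16 worker threads: the extra
# tasks queue up and get consumed as threads finish their work.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(fake_evolve, range(1000)))

print(len(results))  # 1000 tasks completed on only 16 threads
```

This also shows the caveat discussed below: submission into the queue is sequential, so the last island's task necessarily starts later than the first's.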
Dario Izzo
@darioizzo
May 27 2016 15:32
k
the word Piranha ends the conversation :)
Francesco Biscani
@bluescarni
May 27 2016 15:33
waking up a sleeping thread costs less than spawning a new one, but it will change the algorithm in some way
Dario Izzo
@darioizzo
May 27 2016 15:33
I thought threads were quite efficient in spawning time
Francesco Biscani
@bluescarni
May 27 2016 15:34
it depends on the definition of efficient I guess, but I am not married to the idea of the pool
Dario Izzo
@darioizzo
May 27 2016 15:34
I actually like it... I am just asking
Francesco Biscani
@bluescarni
May 27 2016 15:34
you can probably start 10^3 threads per second, as an order of magnitude
Dario Izzo
@darioizzo
May 27 2016 15:34
The advantage is that I can control the number of threads in use by PaGMO
Like I can have an archi with 1000 islands simulated on 16 threads
Francesco Biscani
@bluescarni
May 27 2016 15:35
yes that would work
we would probably need to add some type of shuffling on the threadpool queues
Dario Izzo
@darioizzo
May 27 2016 15:35
right now we would flood the OS with threads instead, which may become an overhead for large archis
Francesco Biscani
@bluescarni
May 27 2016 15:36
yep
in the simplest case, you have a queue of tasks shared by all the threads.. the work package (the evolve) goes into the queue and gets consumed
if you do archi.evolve(), you are enqueueing evolves in the order the islands appear in the archipelago
so the last island will start later (unless we use barriers or stuff like that)
but we could randomise the order in which the tasks are selected for being consumed in the queue
Dario Izzo
@darioizzo
May 27 2016 15:38
but if I do have enough threads, they will all start at the same time, right?
Francesco Biscani
@bluescarni
May 27 2016 15:39
well there's always the fact that, no matter what you do, the enqueuing will be sequential
it's a for loop inside the archi evolve
even if you spawn new threads
so there's always some time that passes between the first island's evolve() call and the last island's
Dario Izzo
@darioizzo
May 27 2016 15:40
Francesco Biscani
@bluescarni
May 27 2016 15:40
you remember that we put those barriers in the old pagmo
Dario Izzo
@darioizzo
May 27 2016 15:40
I think here we added a barrier
it was needed during the migration study to track migrations and test
Francesco Biscani
@bluescarni
May 27 2016 15:41
yes because there were much more islands than cores
Dario Izzo
@darioizzo
May 27 2016 18:31