## Where communities thrive

• Join over 1.5M+ people
• Join over 100K+ communities
• Free without limits
##### Activity
• Oct 19 20:48
1ozturkbe commented #2355
• Oct 19 20:41
odow commented #2355
• Oct 19 20:33
1ozturkbe commented #2355
• Oct 19 19:43
odow commented #2355
• Oct 19 19:29
1ozturkbe commented #2355
• Oct 19 19:28
1ozturkbe commented #2355
• Oct 19 19:03
1ozturkbe commented #2355
• Oct 19 19:02
1ozturkbe commented #2355
• Oct 19 19:01
1ozturkbe commented #2355
• Oct 19 19:00
1ozturkbe commented #2355
• Oct 19 08:04
blegat edited #2356
• Oct 19 08:03
blegat opened #2356
• Oct 19 05:44
• Oct 19 05:25
chriscoey opened #1178
• Oct 18 18:32
odow commented #2355
• Oct 18 17:01
blegat commented #1021
• Oct 18 16:47

blegat on alpine

Add Alpine to solver list (compare)

• Oct 18 12:57
chriscoey edited #1021
• Oct 18 12:57
chriscoey edited #1021
• Oct 18 12:25

mlubin on gh-pages

build based on ba2313f (compare)

Henrique Becker
@henriquebecker91
@odow, updating CPLEX.jl would kinda solve all my headaches, but I have experiments to run and a paper to write, how much work do you think it is? I could contribute with one or two days of work, if it meant this would be solved next week. This would save me from testing workarounds, and give me a more solid paper, I just cannot delay my work too much, nor I am very knowlegeable in MOI internals, is something a beginner can help, or would take more work from you to babysit me than to your solve the problem alone?
mtanneau
@mtanneau

If you create a Gurobi model, then add (scalar) variables x, y, z, their respective indices are 1, 2, 3 (or maybe they start at 0, not sure, anyway...).
Delete variable y, and now the indices of x, z are 1, 2, which means to query the value of z you must query the value of the variable whose index is 2.
Gurobi only ever works with variable indices (unless you're using names but that's much slower), which means that you, as a Gurobi user, must keep track of which index corresponds to which variable, otherwise things go wrong.

In JuMP/MOI, if you create variables x, y, z, you get three objects x, y, and z, for which you can query whatever you want.
Delete y from the model, and the JuMP/MOI convention is that x, zstill allow you to access information about the corresponding variables, without having to make any modification to the objects x, z themselves. That means, the shift of index must be accounted for somehow. If you're using Gurobi directly, you're responsible for it. If you're using JuMP/MOI, then MOI is responsible for it.

Henrique Becker
@henriquebecker91
@mtanneau Yes, but something like @odow mentioned is possible, right? When you ask JuMP to delete a variable, JuMP does not really needs to delete a variable, it may just mark the variable as to-be-deleted in an internal structure, and everything that would need to query solver state flush this internal structure before doing what they would do. Basically, making JuMP to also have lazy updates, because being eager if a solver is lazy is problem, but being lazy if a solver is eager often is not a problem.
mtanneau
@mtanneau
@odow, is there such a thing as deleting several variables/constraints simultaneously, i.e., MOI.delete(model, x::Vector{MOI.VariableIndex})? (without using an individual delete fallback)
If so, it may be possible (hopefully without requiring to much work) to wrap GRBdelvars so that all variables are eliminated in bulk, thus resulting in a single update.
Just an idea, I haven't looked at how it would impact the re-building of wrapper-level data structures.
mtanneau
@mtanneau

JuMP does not really needs to delete a variable, it may just mark the variable as to-be-deleted in an internal structure

This is kind-of already implemented in the MOI wrapper of Gurobi: there is a needs_update flag that is updated when modifying the model.
What @odow suggested was to be less conservative as to when updates should be performed, and when not.
If you delete x, you don't need to update before deleting y. But if you add a constraint c, then you need to update before deleting y. However that requires extensive tracking of what's the stack of buffered modifications, and do I need to flush before performing some modification X. Sorting all this out is not impossible, it's just cumbersome (and someone has to do it and then maintain it)...

Oscar Dowson
@odow

how much work do you think it is?

It's a non-trivial amount of work.

It's probably not that much work to implement a "last operation was deletion" flag in the Gurobi wrapper, since this is the most expensive operation.

Miles Lubin
@mlubin
The juliaopt.org transfer is complete. Please let me know if anyone runs into trouble accessing the site.
Henrique Becker
@henriquebecker91
@mtanneau, @odow, so, if I submit a pull request to change MOI.delete of Gurobi.jl (https://github.com/JuliaOpt/Gurobi.jl/blob/master/src/MOI_wrapper.jl#L448) to always add the deletion to a queue, and add a new method to be called below _update_if_needed (called _flush_var_deletes) in all methods except delete, do you think it could be accepted? The del_vars! is already wrapped (https://github.com/JuliaOpt/Gurobi.jl/blob/2ba66a47a19050cca16f719eb1735a5620bb535a/src/grb_vars.jl#L125). I do not want to start at this if you people think it is a bad idea, but seems the best way to conciliate my needs and helping the project.
Oscar Dowson
@odow

It might be better to modify _require_update so that it is extensible to other operations:

@enum(MutatingOperation, VARIABLE_DELETION, OTHER)

function _require_update(
model::Optimizer,
operation::MutatingOperation = OTHER
)
model.needs_update = true
# You will need to add .last_mutating_operation to Optimizer
model.last_mutating_operation = operation
return
end

function _update_if_necessary(
model::Optimizer,
operation::MutatingOperation = OTHER
)
if model.last_mutating_operation == operation == VARIABLE_DELETION
# If the last operation was a variable deletion and this one is too,
# we can safely delay the update.
elseif model.needs_update
update_model!(model.inner)
model.needs_update = false
end
return
end

And then edit the MOI.delete function to pass VARIABLE_DELETION as the optional argument.

Does that make sense?

Henrique Becker
@henriquebecker91
Yes, it makes sense. I need to change update_model! to do all deletions if it is the case, right? Will start working on it now.
Oscar Dowson
@odow
I think you just need to update _require_update. and _update_if_necessary here: https://github.com/JuliaOpt/Gurobi.jl/blob/4bc1f91a398fd149c3c0c0fdb4288a5b08b46957/src/MOI_wrapper.jl#L449-L452
You don't need to change update_model!.
Henrique Becker
@henriquebecker91
Oh, ok, you intend to keep calling del_vars! over a single var, and then executing the loop over all variables each time. I was thinking of keeping the variables in a vector and then deleting them all with a single call to del_vars! and making a single sweep to correct the indexes.
I can do it in this simplified way you suggest, but if you think I am not adding too much complexity, I would prefer to buffer them for maximum performance.
Miles Lubin
@mlubin
What if someone alternates between deleting constraints and deleting variables?
Henrique Becker
@henriquebecker91
@mlubin In this case it will revert to bad performance we have now, independently if it is implemented as @odow suggested or how I have suggested. Ok, the way I have suggested will be slight worse because it will push! and pop! a vector too, but basically, both ways are just for optimizing multiple variable deletions in sequence. My way adds more code complexity but is more efficient when this happens (lots of variables are deleted in sequence).
Henrique Becker
@henriquebecker91
@mlubin As @mtanneau said, the problem is cumbersome. Making deleting constraints and variables alternately efficient is possible, but it would need to someone verify with much care what needs to be buffered and what not. In the other hand, the MutatingOperator framework that @odow suggested makes very simple to improve the performance of the same operation being called over multiple times in sequence.
Miles Lubin
@mlubin
I'd at least recommend describing the performance characteristics of the interface in the Gurobi.jl readme. It shouldn't be necessary to look at the source to know how to efficiently structure your code.
Henrique Becker
@henriquebecker91
It is a good idea. However, I have the sensation that if JuMP allowed for a specialized delete(model, vector of variable references) it would be intuitive that you have better (or the same, but never worse) performance if you call the specialized method for doing a chunk of work in a single go, than multiple repeated calls to the method that deals with one variable at a time.
Henrique Becker
@henriquebecker91
I am starting to suspect that the culprit of the bad performance is in fact CleverDict. There is a reason for it to use OrderedCollections? OrderedCollections code is seven years old and deprecated in favor of DataStructures.jl.
Oscar Dowson
@odow
You're probably looking at a different ordered collections://github.com/JuliaCollections/OrderedCollections.jl
Henrique Becker
@henriquebecker91
Yes, I was thinking of this one: https://github.com/peter1000/OrderedCollections.jl
Benoît Legat
@blegat

JuMP allowed for a specialized delete(model, vector of variable references)

You mean an easy way to do MOI.delete(backend(model), index.(var_refs)) ?

Henrique Becker
@henriquebecker91
@blegat There is a MOI.delete(Optimizer, array of var_refs)? The only one I found (in Gurobi MOI wrapper) is https://github.com/JuliaOpt/Gurobi.jl/blob/master/src/MOI_wrapper.jl#L448. But, yes, something like that, I suppose if you call what you suggested directly the JuMP model will become desynchonized with the state of the underlying solver, so JuMP would need to wrap that too, no?
Benoît Legat
@blegat

I suppose if you call what you suggested directly the JuMP model will become desynchonized with the state of the underlying solver

No, they will stay in sync. As Gurobi does not implement deletion of vector of variables, it will just fallback to deleting them one by one for Gurobi. Adding the method for Gurobi would make sense

Henrique Becker
@henriquebecker91
Good to know, maybe I will give a look at that too then, seems simple to provide such method for Gurobi MOI Wrapper, as the CCall to be used is already wrapped. I just want to finish my hunt for the reason of the bad performance deleting multiple variables in sequence. I have changed the MOI delete to just add the deleted variables in an array, and the deletion only occurs when any other method call update_if_needed, but the method continue to be slow.
Benoît Legat
@blegat
JuMP v0.20.1 will not be automatically be merged because: The following dependencies do not have a compat entry that has an upper bound: DataStructures, Calculus, NaNMath
Maybe we should add upper bounds
Miles Lubin
@mlubin
I don't see any reason not to add upper bounds.
Benoît Legat
@blegat
They might break their API which might break JuMP if we don't have any upper bound
Miles Lubin
@mlubin
Right
Benoît Legat
@blegat
Looking at DataStructures, they make a breaking release every 3-4 months we would force us to make a patch release with the same pace
Henrique Becker
@henriquebecker91
I wrote that code for buffering the deletions (it seemed to work, in the sense, using this version in my code did not crash, and gave the correct result): henriquebecker91/Gurobi.jl@7ed63c3 However, deleting continued to be slow. So I did what I should have done from start and profiled the code. There about four methods called delete in the traceback from JuMP to MathOptInterface to MathOptInterface (again) and finally to Gurobi (I just changed this last one, pertaining to Gurobi.jl), the main reason for the slowdown seems to be this line https://github.com/JuliaOpt/MathOptInterface.jl/blob/master/src/Utilities/functions.jl#L615 called from there https://github.com/JuliaOpt/MathOptInterface.jl/blob/master/src/Utilities/model.jl#L181 . I am not sure the problem is attribution of Gurobi.jl (should it be specializing something it isn't?), and I do not know how much work is to solve this real culprit of the performance degradation. I will probably solve my problem with one of the workarounds discussed. If someone think the change in Gurobi.jl is worth a PR, I can polish my code and make a PR, but I am not so sure it is worth anymore.
Benoît Legat
@blegat
That means that the major part of the deletion time is spent in updating the cache. The performance of the _rmvar might be improve, we have never tried to optimize it but the simple answer is do you need a cache ?
Oscar Dowson
@odow
If you don't have interval constraints, try model = JuMP.direct_model(Gurobi.Optimizer())
Benoît Legat
@blegat
If you use direct mode: https://www.juliaopt.org/JuMP.jl/v0.20.0/solvers/#Direct-mode-1, there will be nothing between you and Gurobi, every JuMP function will directly redirect to an MOI function of the Gurobi MOI wrapper, nothing in between
Oscar Dowson
@odow
Direct model skips the JuMP caching layer for people interested in performance at the cost of less generality
Benoît Legat
@blegat
If you do have interval constraints, you can still do model = JuMP.direct_model(MOI.Bridges.full_bridge_optimizer(Gurobi.Optimizer(), Float64))
Oscar Dowson
@odow
Do you have benchmark times for deleting before/after?
Henrique Becker
@henriquebecker91
Thanks for the support. I do not know if I need a cache (outside of copy, what I cannot do? I think I probably use the cache some way, but maybe just for convenience, is not exactly necessary), and I think I do not use interval constraints (can you point me to their reference, I am not sure if I know what they are). I will try the direct mode. I do not have benchmark times, but I can provide them. Unfortunately I will be busy for the next two hours. But after this I will work 2+ hours. So tell me about what info the benchmark should contain (what exactly do you want me to time?), and I can make one for you.
Henrique Becker
@henriquebecker91
The direct_model worked like a charm, now instead of five minutes it takes 0.01 seconds to delete all variables I wanted to delete (30k from 50k). I thought I used at least something from this cache, but seems none of my code relied in anything it provided. I now can benchmark if the delayed deletion actually is helpful or not, will do this next.
Oscar Dowson
@odow
It's great you got things working. Benchmarking the delayed deletion is useful to see if it's worth the additional code. However, the cache shouldn't take multiple orders of magnitude longer to delete things. So that would be worth improving
Henrique Becker
@henriquebecker91

Ok. I really need to polish the code and check if it really does not bug anything (I fear other methods may depend on the fact the deletions were eager and do not call update_if_needed when they should). The values are for a single (second) run with that instance (where I delete 30k variables from 50k total). You can ask me more info but the basics seem to be:
With my patch and direct_model:

• time to loop through and call all deletions: 0.001
• time to actually do the deletions (_flush_var_deletions): 0.018

Without my patch (but with direct_model too):

• there is only the time to loop though and call all deletions: 106.82

So... if you people can give me directions to make a full pull request and give a good testing if it does not break anything else, I would be happy. If the buffering is too clunky I may implement the delete(model, vector of variables) instead. But it would be good to me to be able to add this to the package, so I do not need to explain in my paper I had to use a patched version, XD.

Henrique Becker
@henriquebecker91
One thing I do not tested is how much better is the simpler way proposed by @odow (do not buffer, just do not call update each time). Seems clear it should be between the two tested alternatives (how it is today and using buffering), but where in this interval I have no idea.
Oscar Dowson
@odow
I'd test without the buffer. It looks like most of the time is in the Gurobi model update. Open a pull request against your branch and we can go from there.
Henrique Becker
@henriquebecker91
Ok. I do not know if I will have time to sort this today, but I think I will have time tomorrow. I was thinking of having two branches, with and without buffer, then benchmark both, and have the two as PRs to chose from. Or do you prefer to decide on the alternative before opening a PR?
Henrique Becker
@henriquebecker91
Edited my last message to remove it. The code needs more testing. It is giving wrong results.
Henrique Becker
@henriquebecker91

@mtanneau You said before "To ensure consistence of variable/constraint indices, the wrapper will perform that update every time a variable/constraint is deleted. Hence, deleting lots of variables/constraints is slow.
There's no simple fix because of the need to ensure that the indices will remain consistent: when deleting a variable, the indices of all subsequent variables are shifted in the Gurobi object, while the MOI indices are not. Thus the MOI<-->GRB index correspondence must be updated."

I have an implementation of @odow's non-buffered suggestion (https://github.com/JuliaOpt/Gurobi.jl/compare/master...henriquebecker91:hb/simple_lazy_update) and my implementation that buffers the deletes (https://github.com/JuliaOpt/Gurobi.jl/compare/master...henriquebecker91:hb/lazy_delete). My implementation of @odow's no-buffer suggestion gives me wrong results, clearly are not the right variables that are being deleted (the warm start is not recognized, and the best estimate is wrong, giving a suboptimal solution at end). Seems to me that updating the column fields in model.variable_info without calling update on the gurobi model will desynchronize them (as you said), so a no-buffer solution is not possible because to keep the model.variable_info.column and Gurobi columns synced between calls to delete we need to update the indexes on Gurobi after each deletion (what is being done right now, with bad performance), or keep track of which column indexes were deleted since last gurobi update to correct the indexes of new deletions (what my buffered solution ends up doing).

So, it is not that @odow's suggestion has not a "good enough" performance (in my tests it had not, it took 70s, but as the code is wrong I cannot be sure the time makes sense), but I do not see how to make it correct without adding the extra complexity the buffered version already adds. I see now too that my buffered version needs some extra work, because calls to is_valid will probably give "true" for already deleted variables, if no other method forced Gurobi to update.

Oscar Dowson
@odow
Your buffered solution sounds like the way to go then. I'll take a look when I get a chance. At a conference this week
Henrique Becker
@henriquebecker91

Unfortunately, it is far from perfect and I have no time to solve all the things it breaks. For it to work in all cases either: (1) we accept that model.variable_info and gurobi may be both outdated, and call _update_if_necessary in the beginning of basically any other method (even the ones that just query model.variable_info length or something like that, as update_if_necessary will update both model.variable_info and gurobi, not just gurobi); or (2) we accept model.variable_info and gurobi may not be synced, with model.variable_info keeping up to date with everything and gurobi may be outdated (kinda like it is currently), but there will be an extra structure to translate the "predicted column indexes" (the values they will have after the next call to update) which are inside model.variable_info to the "current column indexes" (the values used inside the Gurobi model while an update was not yet called). Anything that uses columns would need to apply this translation as last step before calling a wrapped gurobi method. Calling _update_if_necessary would clear this translating structure, and update the gurobi model, but leave model.variable_info alone (as it already has the predicted state for after the update).

It would be much simpler to JuMP to provide a delete(model, vector of variables) that atomically deletes a bunch of variables, and on Gurobi.jl implement this as a single call to del_vars and one to _update_if_necessary, and no need to save any extra state in the model. But this would need to at least create a new method in JuMP interface, providing a fallback implementation, and specializing in Gurobi to not use such fallback. I will use a workaround for now. If you people think my suggestion is worth implementing, then I can try to contribute to it, but I cannot rely on solving this the right way for now.