--cats
? Have you experimented with that at all? For CATS I would try different combinations of the number of discrete actions used by the algorithm (passed to the --cats arg) and the bandwidth (a property of the continuous range), e.g. a grid of num_actions [8, 16, 32, 64, 128, 256, 1024] and bandwidths [1, 2, 4, 6, 8, 10, 14, 20]. Depending on the number of discrete actions, you might need more data for CATS to converge to something sensible. CATS label support in pyvw should be available in the next release (coming soon-ish, we don't want to wait another year for the next vw release). Let me know if you get better results from CATS or not :)
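The grid suggested above could be enumerated with something like the sketch below. It only builds VW argument strings and does not run VW. The `cats_grid` helper is hypothetical, and the `--min_value 0` / `--max_value 100` range is a placeholder assumption that would need to match the actual continuous action range.

```python
from itertools import product

# Grid values taken from the suggestion above.
NUM_ACTIONS = [8, 16, 32, 64, 128, 256, 1024]
BANDWIDTHS = [1, 2, 4, 6, 8, 10, 14, 20]


def cats_grid(min_value=0, max_value=100):
    """Return one VW argument string per (num_actions, bandwidth) pair.

    min_value and max_value are placeholders (assumptions), not values
    from the discussion; set them to the real continuous action range.
    """
    return [
        f"--cats {n} --bandwidth {b} --min_value {min_value} --max_value {max_value}"
        for n, b in product(NUM_ACTIONS, BANDWIDTHS)
    ]


if __name__ == "__main__":
    for args in cats_grid():
        print(args)
```

Each string can then be passed to a separate training run, comparing the reported loss across runs on the same data.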
prob(new_policy)/prob(logging_policy)
, but isn't this only for when we use IPS? I think I'm missing something quite obvious here...

3. In order to use --explore_eval I have to convert my data from cb format to cb_adf format, since the cb format is not supported with --explore_eval. For the example data with two arms below, are the two ways to represent the data equivalent?
2:10.02:0.5 | x0:0.47 x1:0.84 x2:0.29
1:8.90:0.5 | x0:0.51 x1:0.65 x2:0.67

shared | x0:0.47 x1:0.84 x2:0.29
| a1
0:10.02:0.5 | a2

shared | x0:0.51 x1:0.65 x2:0.67
0:8.90:0.5 | a1
| a2
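The conversion shown in the example data can be done mechanically. The sketch below reproduces exactly the layout used in the question (context features moved to a `shared` line, one line per arm named `a1`, `a2`, ..., and the `0:cost:prob` label on the chosen arm). The `cb_to_cb_adf` helper is hypothetical, and the sketch says nothing about whether the two formats lead to equivalent models.

```python
def cb_to_cb_adf(line, num_actions):
    """Rewrite one cb-format line as a multi-line cb_adf example.

    Assumes the line looks like 'action:cost:prob | features' with
    1-based action ids, as in the example data above.
    """
    label, features = line.split("|", 1)
    action, cost, prob = label.strip().split(":")
    chosen = int(action)

    out = ["shared | " + features.strip()]
    for a in range(1, num_actions + 1):
        # Only the logged action carries a label; cost and probability
        # are copied through as strings so their formatting is preserved.
        prefix = f"0:{cost}:{prob} " if a == chosen else ""
        out.append(f"{prefix}| a{a}")
    return "\n".join(out)


if __name__ == "__main__":
    print(cb_to_cb_adf("2:10.02:0.5 | x0:0.47 x1:0.84 x2:0.29", 2))
```

When writing several converted examples to one file, separate them with a blank line, as in the data above.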
cb_type ips/dm/dr
and choosing the one with the best reported loss. Isn't that wrong, especially considering dm is biased? --eval throws an error if you use DM.
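As a rough illustration of how the three estimators named by cb_type differ, and of where the prob(new_policy)/prob(logging_policy) weight enters, here is a minimal sketch of the textbook IPS, DM and DR value estimates. It is not VW's internal implementation. `ope_estimates`, `new_policy_prob` and `cost_model` are hypothetical names introduced only for this example.

```python
def ope_estimates(logged, actions, new_policy_prob, cost_model):
    """Textbook off-policy estimates of a new policy's average cost.

    logged: list of (context, action, cost, logging_prob) tuples.
    new_policy_prob(context, action): probability that the new policy
        picks `action` in `context`.
    cost_model(context, action): predicted cost from some regressor
        (hypothetical; DM is only as good as this model).
    """
    ips_sum = dm_sum = dr_sum = 0.0
    for context, action, cost, logging_prob in logged:
        # Importance weight: prob(new_policy) / prob(logging_policy).
        w = new_policy_prob(context, action) / logging_prob
        # Direct-method term: model-predicted cost under the new policy.
        dm = sum(new_policy_prob(context, a) * cost_model(context, a) for a in actions)

        ips_sum += w * cost
        dm_sum += dm
        # DR: model prediction plus an importance-weighted correction.
        dr_sum += dm + w * (cost - cost_model(context, action))

    n = len(logged)
    return {"ips": ips_sum / n, "dm": dm_sum / n, "dr": dr_sum / n}


if __name__ == "__main__":
    data = [("x", 1, 2.0, 0.5), ("x", 2, 4.0, 0.5)]
    always_first = lambda ctx, a: 1.0 if a == 1 else 0.0
    constant_model = lambda ctx, a: 1.0
    print(ope_estimates(data, [1, 2], always_first, constant_model))
```

In this formulation the importance weight appears in both IPS and DR, while DM relies entirely on the cost model, which is why a poor model biases it.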
OPE PR: VowpalWabbit/vowpalwabbit.github.io#193
@lalo @olgavrou et al., note that I will need expert advice on this. There is a checklist in which each item needs to be either confirmed with absolute certainty or, if untrue, commented on with the correct interpretation.
Hi all, I ran into a problem when using the SquareCB algorithm to train a contextual bandit model, specifically when saving the model and loading it again.
I trained and saved the SquareCB model this way (using the simulation setup from https://vowpalwabbit.org/tutorials/cb_simulation.html):
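from vowpalwabbit import pyvw  # run_simulation, plot_ctr, users, etc. come from the tutorial linked above

vw = pyvw.vw("--cb_explore_adf -q UA -f squarecb.model --save_resume --quiet --squarecb")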
num_iterations = 5000
ctr = run_simulation(vw, num_iterations, users, times_of_day, actions, get_cost)
plot_ctr(num_iterations, ctr)
vw.finish()
and then loaded the model:
vw_loaded=pyvw.vw('--cb_explore_adf -q UA -i squarecb.model')
num_iterations = 5000
ctr = run_simulation(vw_loaded, num_iterations, users, times_of_day, actions, get_cost, do_learn=False)
plot_ctr(num_iterations, ctr)
print(ctr[-1])
and the loaded model seems to be doing random exploration.
Could anyone explain how to save and load this model correctly? Thanks in advance.