Sorry, couldn't attach the img to my prev message thread.
A problem: for some reason I get a pmf of size n+1 for data with n distinct actions.
Details:
When I train with cb_explore_adf on datapoints that have only 3 actions (no features apart from shared|...) and then supply one of these examples for testing (with the action:cost:proba label removed, obviously), I get 4 actions in the output file. Why might that be?
I trained the contextual bandit as vw1 = pyvw.vw("-d data/cb_load.dat --cb_explore_adf -q UA -P 1 --invert_hash mymodel.inverted") on https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/test/train-sets/cb_load.dat with --invert_hash and got mymodel.inverted.
I understand the user and action features. What does 18107:0.137426 mean in "User^time_of_day=afternoonAction^article=politics:18107:0.137426"?
I think 18107 is the hash value for "User^time_of_day=afternoonAction^article=politics" and 0.137426 is the weight. Is that correct?
How can I get the probability corresponding to the user and action features from the weights?
Version 8.11.0
Id
Min label:-1
Max label:0
bits:18
lda:0
0 ngram:
0 skip:
options: --cb_adf --cb_explore_adf --cb_type mtr --csoaa_ldf multiline --csoaa_rank --quadratic UA
Checksum: 2033437909
event_sum 113
action_sum 791
:0
User^time_of_day=afternoonAction^article=politics:18107:0.137426
User^user=Tom:32581:-0.0636371
User^user=TomAction^article=politics:38087:-0.0636749
Action^article=politics:52568:-0.110663
User^time_of_day=morningAction^article=music:58967:0.224528
User^user=AnnaAction^article=politics:62875:0.0165196
User^time_of_day=afternoon:65137:-0.0253498
Action^article=music:67569:-0.0505464
User^time_of_day=afternoonAction^article=food:67793:0.121444
User^time_of_day=morningAction^article=politics:77054:-0.192732
User^user=AnnaAction^article=music:81714:0.297336
Action^article=sports:86811:0.0540273
User^user=TomAction^article=music:89710:-0.101787
User^user=AnnaAction^article=sports:93144:0.0540273
Action^article=food:99122:0.121444
User^time_of_day=afternoonAction^article=music:101394:-0.190187
User^user=AnnaAction^article=food:113649:0.0457554
Constant:116060:-0.0947514
User^time_of_day=afternoonAction^article=sports:121080:0.0540273
User^user=TomAction^article=food:121517:0.109427
Action^article=camping:134640:0.0742112
User^user=Anna:141841:0.0966574
User^time_of_day=afternoonAction^article=health:144204:0.0344906
User^user=TomAction^article=camping:152687:0.0742112
User^user=AnnaAction^article=health:163948:0.0344906
Action^article=health:178351:0.0971796
User^user=TomAction^article=health:188720:0.09161
User^time_of_day=morningAction^article=health:219401:0.09161
User^time_of_day=morning:243586:0.0320462
User^time_of_day=morning*Action^article=camping:257110:0.0742112
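Regarding getting probabilities from the weights above, here is a minimal sketch of my understanding rather than the definitive scoring rule: with a linear model and -q UA, an action's predicted cost is just the sum of the weights of every feature in its (shared + action) example, and the epsilon-greedy layer of --cb_explore_adf then converts per-action costs into a pmf. The epsilon value (0.05) and the action count K are assumptions here.

```python
# Assumed scoring rule: predicted cost of an action = sum of the weights of
# every feature in its (shared + action) example, including the -q UA
# quadratic features and the constant. Weights copied from the dump above
# for user=Tom, time_of_day=afternoon, action article=politics.
weights = [
    -0.0947514,   # Constant
    -0.0636371,   # User^user=Tom
    -0.0253498,   # User^time_of_day=afternoon
    -0.110663,    # Action^article=politics
    -0.0636749,   # User^user=Tom x Action^article=politics
     0.137426,    # User^time_of_day=afternoon x Action^article=politics
]
cost_politics = sum(weights)

# Epsilon-greedy (assumed default exploration for --cb_explore_adf, with the
# assumed default epsilon of 0.05): each of the K actions gets epsilon/K, and
# the lowest-cost action additionally gets the remaining 1 - epsilon mass.
# K = 6 because six distinct articles appear in the weight dump.
epsilon, K = 0.05, 6
p_other = epsilon / K
p_best = 1.0 - epsilon + epsilon / K
print(cost_politics, p_other, p_best)
```

The per-action costs would be compared across all actions in the example, and only the argmin gets p_best.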
Hi all, is there any way to add an importance weight to offline training of contextual bandits, similar to linear regression, where we specify an importance weight of 2 in the training example "1 2 second_house | price:.18 sqft:.15 age:.35 1976"?
This would help reduce the training time of contextual bandits, since our training data runs to billions of points; because many datapoints are repeated, collapsing them under importance weights gives a good reduction.
Is it possible to train with one --lambda or --epsilon value and then load the model with another?
Hello all,
I'm NG Sai, a final-year undergraduate at IIIT Sri City. I got to know about the Microsoft RLOS programme via LinkedIn. My open-source experience includes contributing to C++ ML libraries such as shogun, tensorflow-lite support, and mlpack, where I did GSoC '21 and currently serve as a member. My github.
I came across the Safe Contextual Bandits project. Is this topic taken for this summer, or will it be available for 2022? My forte is implementing algorithms from research papers, so I wanted to inquire about it.
Thanks in advance!
I looked in vw -h but didn't find them, and they're not in the documentation either. I'm interested in knowing the values for gamma_scale (I believe I saw in the presentation that it's set to 1000, but it would be good to confirm) and gamma_exponent. vw.get_arguments() on the model also doesn't show the default values.
ccb shared | s_1 s_2
ccb action | a:1 b:1 c:1
ccb action | a:0.5 b:2 c:1
ccb action | a:0.5
ccb action | c:1
ccb slot | d:4
ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7
[warning] Unlabeled example in train set, was this intentional?
I didn't understand the slot argument. In "ccb slot | d:4", what does it mean? Does it mean this is the fourth slot? And in the last line, "ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7", what's happening? Plus there's that warning about an unlabeled example. Would be really grateful for an explanation, thanks.
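In case it helps to see the pieces, here is how I read that last slot line, as a sketch based on the CCB wiki's chosen_action:cost:prob,... included_actions layout (the field names are my own, not official):

```python
# Hedged sketch: decompose the labeled CCB slot line from the example above.
line = "ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7"
label_part, feature_part = line.split(" | ")
_, _, outcome, included = label_part.split(" ")

# outcome = chosen_action:cost:prob, optionally followed by
# other actions' probabilities (here action 0 with prob 0.2)
chosen, *alternatives = outcome.split(",")
action, cost, prob = chosen.split(":")

# included = explicit list of action indices this slot may pick from,
# which is presumably why "0,1,3" excludes action 2
allowed = [int(i) for i in included.split(",")]
print(action, cost, prob, alternatives, allowed)
```

Under that reading, the unlabeled "ccb slot | d:4" line has no outcome at all, which would explain the "Unlabeled example in train set" warning.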
In CCB, what's the difference between the weight of an example vs. its cost? I thought weight is just the inverse of cost?
I tried feeding in the training data from the CCB page, and the weight seems to be equal to the example counter. Is that intentional?
Union[example, Costs] instead of using kwargs, should I go by that method too then?
from_example(..) would check if the example type is what it claims to be, no?
Version 8.10.1
Id
Min label:-1
Max label:1
bits:18
lda:0
0 ngram:
0 skip:
options: --cb_adf --cb_type dr --csoaa_ldf multiline --csoaa_rank
Checksum: 4264491651
event_sum 0
action_sum 0
:0
s^age:7950:0.108916
:7951:0.242307
s^year:39846:-1.6944
:39847:-1.82654
I had a few questions if you guys don't mind.
a) Are you guys conducting RLOS this year? Not that I'd stop contributing if you aren't, but it certainly gives motivation to contribute :)
b) On the topic of AutoML: I read the wiki and some merged PRs, and I'd like to contribute. How can I help? Since it's quite volatile right now, my PRs may be slow, and I don't want to end up being the bottleneck. Other projects which interest me: Python model introspection, CB in Python.
c) I'm currently in a 6-month internship, but I think RLOS allows for part-time participation? Would that be okay?
Thanks for your answers
Hello everyone, I'm trying a toy example to understand the model weights in the context of the CATS algorithm.
Training data : ca 1.23:-1:0.7 | a:1
Test data : | a:1
The vw command: vw --cats 4 --bandwidth 1 --min_value 0 --max_value 32 -d train.vw --invert_hash m.ih -f model.vw --noconstant
The output is:
Version 8.11.0
Id
Min label:-1
Max label:0
bits:18
lda:0
0 ngram:
0 skip:
options: --bandwidth 1 --binary --cats 4 --cats_pdf 4 --cats_tree 4 --cb_explore_pdf --get_pmf --max_value 32 --min_value 0 --pmf_to_pdf 4 --sample_pdf --tree_bandwidth 0 --random_seed 2147483647
Checksum: 2730770910
:1
initial_t 0
norm normalizer 0.357143
t 1
sum_loss 0
sum_loss_since_last_dump 0
dump_interval 2
min_label -1
max_label 0
weighted_labeled_examples 1
weighted_labels 1
weighted_unlabeled_examples 0
example_number 1
total_features 1
total_weight 0.357143
sd::oec.weighted_labeled_examples 1
current_pass 1
a:108232:-0.190479 0.714286 1
a[1]:108233:-0.190479 0.714286 1
Can anyone please help me use these weights to get the final result? Thank you.
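Here is my rough reading of those two weights, as a sketch rather than a definitive answer: the routing convention and the exact pdf centering below are assumptions on my part. With --cats 4 the tree has two levels of internal regressors, a and a[1] look like the same feature in the two offset models, each node scores the example and the sign routes left or right, and pmf_to_pdf then spreads the chosen discrete action's mass over a window of width 2 * bandwidth inside [min_value, max_value].

```python
# Assumed routing: each internal node of the cats_tree predicts w . x and the
# sign picks the branch (negative -> left here); with both dumped weights
# equal to -0.190479 and the test example "| a:1", every node goes left.
w = {"a": -0.190479, "a[1]": -0.190479}  # weights from the dump above
x_a = 1.0

root_score = w["a"] * x_a      # -0.190479 -> take the left subtree
leaf_score = w["a[1]"] * x_a   # -0.190479 -> take the leftmost leaf
leaf = 0                       # discrete actions assumed numbered 0..3

# pmf_to_pdf (assumed): map the discrete action onto the continuous range and
# smooth it over a window of width 2 * bandwidth around the action's center.
min_value, max_value, k, bandwidth = 0.0, 32.0, 4, 1.0
unit = (max_value - min_value) / k          # 8.0 of range per discrete action
center = min_value + (leaf + 0.5) * unit    # assumed centering: 4.0
window = (center - bandwidth, center + bandwidth)
print(leaf, center, window)
```

If this reading is right, the continuous action would be sampled from the smoothed pdf concentrated around that window; it would be good if someone could confirm the sign convention and the centering.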