    Munu Sairamesh
    @musram:matrix.org
    [m]

    Hi. I am using https://vowpalwabbit.org/tutorials/cb_simulation.html. I need to save the model and use it later, since the reward (feedback) comes to the system a few hours later.
    vw1 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.2 save_resume=True")

    num_iterations = 5000
    ctr = run_simulation(vw1, num_iterations, users, times_of_day, actions, get_cost)

    But then when I save and use the model again with a different epsilon, it is not using the weights of the saved model:
    vw1.save("data/saved_model.model")
    vw2 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.8 i=data/saved_model.model")
    1 reply
    Can anybody help here?
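    A minimal sketch of the save/load pattern being asked about, assuming the goal is to keep the learned weights while only changing the exploration epsilon. Note that save_resume and the initial model are command-line style options (--save_resume, -i), not key=value pairs:

    from vowpalwabbit import pyvw

    # train with --save_resume so the full learner state is stored in the model file
    vw1 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.2 --save_resume")
    # ... run_simulation(vw1, ...) as in the tutorial ...
    vw1.save("data/saved_model.model")

    # reload the saved weights with -i; the new --epsilon only changes exploration
    vw2 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.8 -i data/saved_model.model")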
    Munu Sairamesh
    @musram:matrix.org
    [m]
    Thanks jackgerrits.
    Alexey C
    @ColdTeapot273K

    Hello, can anyone tell me how to make a cb_explore_adf agent respond to requests properly in daemon mode? I send the multiline examples via echo ... | netcat ... as in the documentation and get no response.

    If I launch with the --audit flag I receive a bunch of info with unintuitive formatting (see attachment). I assume the first value in each line is the action probability, and the very last line is some combined gradients or whatnot. Very different from a PMF output, like in the Python example on the website.

    [attachment: image.png]
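    A hedged sketch of talking to the daemon from Python instead of netcat (hypothetical feature names; it assumes the daemon was started with something like vw --cb_explore_adf -q UA --daemon --port 26542). A multiline example must be terminated by an empty line, and the daemon answers with one line containing the PMF over the actions; the exact formatting can vary by version.

    import socket

    example = (
        "shared |User user=Tom time_of_day=morning\n"
        "|Action article=politics\n"
        "|Action article=sports\n"
        "|Action article=music\n"
        "\n"  # the empty line terminates the multiline example
    )

    with socket.create_connection(("localhost", 26542)) as s:
        s.sendall(example.encode())
        print(s.recv(4096).decode())  # PMF over the actions, e.g. "0:0.9,1:0.05,2:0.05"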
    Max Pagels
    @maxpagels_twitter
    Any idea how to get the holdout loss from pyvw?
    foo = pyvw.vw("--ccb_explore_adf -d train.gz --passes 2 -c -k --enable_logging")
    foo.get_sum_loss() / foo.get_weighted_examples() # not the holdout loss (checked against a CLI run)
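    One hedged workaround, assuming get_log() is available in the installed pyvw build (it captures the driver output when --enable_logging is used): the final "average loss" line in that log is reported on the holdout set (marked with a trailing "h") when multiple passes with a cache are used.

    foo.finish()  # flush the driver so the summary lines are written
    log_lines = foo.get_log()
    loss_lines = [line for line in log_lines if "average loss" in line]
    print(loss_lines[-1] if loss_lines else "no average loss line found")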
    2 replies
    Alexey C
    @ColdTeapot273K

    Sorry, couldn't attach the image to my previous message's thread.

    A problem:
    for some reason I get a PMF of size n+1 for data with n distinct actions.

    Details:
    When I train cb_explore_adf on data points with only 3 actions (no features apart from shared |...) and supply one of these examples for testing (with the action:cost:proba label removed, obviously), I get 4 actions in the output file. Why might that be?

    [attachment: image.png]
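    For reference, a hedged illustration (with hypothetical feature names) of the setup described: a 3-action cb_explore_adf training example, and the same example with its action:cost:probability label stripped for prediction.

    # one multiline training example: shared context plus three actions, label on the chosen action
    train_example = [
        "shared | s_1",
        "0:1.0:0.33 | a_1",
        "| a_2",
        "| a_3",
    ]
    # the same example for testing, with the action:cost:probability label removed
    test_example = [
        "shared | s_1",
        "| a_1",
        "| a_2",
        "| a_3",
    ]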

    5 replies
    Munu Sairamesh
    @musram:matrix.org
    [m]
    Hi all. Is there some way of getting the contextual bandit model in readable form, as we get using --invert_hash with a regression model in Vowpal Wabbit? I need this because the only way I can host the model in real time is in CSV format.
    Munu Sairamesh
    @musram:matrix.org
    [m]

    I trained the contextual bandit as vw1 = pyvw.vw("-d data/cb_load.dat --cb_explore_adf -q UA -P 1 --invert_hash mymodel.inverted") on https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/test/train-sets/cb_load.dat with --invert_hash and I got mymodel.inverted.

    I can understand the user and action features, but what does 18107:0.137426 mean in "User^time_of_day=afternoon*Action^article=politics:18107:0.137426"? I think 18107 is the hash value for "User^time_of_day=afternoon*Action^article=politics" and 0.137426 is the weight, but I don't know if this is correct.

    How can I get the probability corresponding to the user and action features from the weights? (A sketch follows the model dump below.)

    Version 8.11.0
    Id
    Min label:-1
    Max label:0
    bits:18
    lda:0
    0 ngram:
    0 skip:
    options: --cb_adf --cb_explore_adf --cb_type mtr --csoaa_ldf multiline --csoaa_rank --quadratic UA
    Checksum: 2033437909
    event_sum 113
    action_sum 791
    :0
    User^time_of_day=afternoon*Action^article=politics:18107:0.137426
    User^user=Tom:32581:-0.0636371
    User^user=Tom*Action^article=politics:38087:-0.0636749
    Action^article=politics:52568:-0.110663
    User^time_of_day=morning*Action^article=music:58967:0.224528
    User^user=Anna*Action^article=politics:62875:0.0165196
    User^time_of_day=afternoon:65137:-0.0253498
    Action^article=music:67569:-0.0505464
    User^time_of_day=afternoon*Action^article=food:67793:0.121444
    User^time_of_day=morning*Action^article=politics:77054:-0.192732
    User^user=Anna*Action^article=music:81714:0.297336
    Action^article=sports:86811:0.0540273
    User^user=Tom*Action^article=music:89710:-0.101787
    User^user=Anna*Action^article=sports:93144:0.0540273
    Action^article=food:99122:0.121444
    User^time_of_day=afternoon*Action^article=music:101394:-0.190187
    User^user=Anna*Action^article=food:113649:0.0457554
    Constant:116060:-0.0947514
    User^time_of_day=afternoon*Action^article=sports:121080:0.0540273
    User^user=Tom*Action^article=food:121517:0.109427
    Action^article=camping:134640:0.0742112
    User^user=Anna:141841:0.0966574
    User^time_of_day=afternoon*Action^article=health:144204:0.0344906
    User^user=Tom*Action^article=camping:152687:0.0742112
    User^user=Anna*Action^article=health:163948:0.0344906
    Action^article=health:178351:0.0971796
    User^user=Tom*Action^article=health:188720:0.09161
    User^time_of_day=morning*Action^article=health:219401:0.09161
    User^time_of_day=morning:243586:0.0320462
    User^time_of_day=morning*Action^article=camping:257110:0.0742112
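
    A hedged sketch of how those weights combine, using values copied from the dump above: the score (predicted cost) for a (user, action) pair is the sum of the weights of all features present, including the -q UA interaction features and the constant. The probabilities reported by --cb_explore_adf then come from the exploration layer (e.g. epsilon-greedy over the best-scoring action), not directly from the weights.

    # weights copied from the inverted-hash dump above
    weights = {
        "Constant": -0.0947514,
        "User^user=Tom": -0.0636371,
        "User^time_of_day=afternoon": -0.0253498,
        "Action^article=politics": -0.110663,
        "User^user=Tom*Action^article=politics": -0.0636749,
        "User^time_of_day=afternoon*Action^article=politics": 0.137426,
    }
    # predicted cost for the pair (user=Tom, time_of_day=afternoon) x (article=politics)
    score = sum(weights.values())
    print(score)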

    2 replies
    musram
    @musram

    Hi all, is there any way to add an importance weight to offline training of contextual bandits, similar to linear regression, where we specify an importance weight of 2 in the training example "1 2 second_house | price:.18 sqft:.15 age:.35 1976"?

    This would help reduce the training time of the contextual bandits, as the training data points number in the billions, and we get a good reduction by using importance weights because data points are repeated.
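    For reference, a hedged breakdown of the plain VW label quoted above ([label] [importance] [tag] | features); whether cb_explore_adf accepts a per-example weight in the same position is exactly the open question here.

    #   1  2  second_house | price:.18 sqft:.15 age:.35 1976
    #   |  |  |              features
    #   |  |  tag
    #   |  importance weight (this line counts as two identical examples)
    #   label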

    6 replies
    Vedant Bhatia
    @vedantbhatia
    Hey, is there any example/documentation about running LOLS through VW? Specifically, how can we create the rollin and rollout policies, and how do they work?
    musram
    @musram
    Is the value of --passes directly correlated with contextual bandit performance? I have 10 billion data points, and if I make --passes big it takes a lot of time to train. Can I keep --passes small, e.g. --passes 5, so that training is faster while my model is still good?
    3 replies
    Reedham20
    @Reedham20
    Hey, I am an emerging web developer and would like to contribute to any of the ongoing projects. I have intermediate skills in JavaScript and its frameworks. Please let me know of anything I could contribute to. Thanks.
    Alexey C
    @ColdTeapot273K
    Can I train an agent in cb_adf mode, dump it, then load it as cb_explore_adf?
    3 replies
    And can I save a model with one --lambda or --epsilon value and then load it with another?
    K Krishna Chaitanya
    @kkchaitu27
    Hi all, can someone help me with resources related to converting Vowpal Wabbit models into ONNX format? I tried googling but it did not provide me with useful resources. I could see there was some work done in the Open Source Fest, but the code is not available.
    1 reply
    musram
    @musram
    Hi all. When I retrain the model I am getting this error: vw (cb_explore_adf.cc:93): cb_adf: badly formatted example, only one cost can be known.
    My command is "./vw.binary -P 4113882 --hash all --data data/part-00000-30a87f0c-0de0-48ce-aade-7f736c900132-c000.txt --cache_file temp.cache --final_regressor data/cb.vwmodel --bit_precision 24 -k --passes 1 --cb_explore_adf --cb_type ips -q EA --save_resume --initial_regressor=data/cb.vwmodel"
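    A hedged illustration of what usually triggers that message: within one cb_explore_adf multiline example, at most one action line may carry an action:cost:probability label. Two labeled action lines in the same example (as in bad below, with made-up values) report more than one known cost.

    ok = [
        "shared | user_feature",
        "0:1.0:0.5 | action_one",
        "| action_two",
    ]
    bad = [
        "shared | user_feature",
        "0:1.0:0.5 | action_one",
        "1:0.0:0.5 | action_two",  # a second known cost in the same multiline example
    ]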
    10 replies
    Is this because of a version problem? The version I am using is 8.5.0.
    It's too big to share, and also it's a company thing.
    musram
    @musram
    Hi all. How do the -t and --invert_hash options work with a dataset? In my case, as the data is in the billions, we need to break it into batches and then train the model. So if I have 4 batches, I train the model on the first batch, then retrain the model on the 2nd batch, and so on. Now, if I want to use --invert_hash, is it fine to use the final (4th) batch with -t and --invert_hash to generate the human-readable model? Or do I have to combine all the batches and then use them with -t and --invert_hash?
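    A hedged sketch of the batched workflow (hypothetical file names), assuming the model is chained between batches with --save_resume/-i and that, as in the earlier pyvw examples in this log, passing -d drives a full pass at construction. Note that --invert_hash can only print names for features it actually sees during the final pass, which is the crux of the "last batch vs. all batches" question.

    from vowpalwabbit import pyvw

    # batch 1: train and write a resumable model
    vw = pyvw.vw("-d batch1.dat --cb_explore_adf -q UA --save_resume -f model.vw --quiet")
    vw.finish()
    # batch 2 (and so on): resume from the previous model
    vw = pyvw.vw("-d batch2.dat --cb_explore_adf -q UA --save_resume -i model.vw -f model.vw --quiet")
    vw.finish()
    # human-readable dump: a test-only (-t) pass over some data with the final model
    vw = pyvw.vw("-d batch2.dat -t -i model.vw --invert_hash model.readable --quiet")
    vw.finish()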
    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]

    Hello all,
    I'm NG Sai, a final-year UG at IIIT Sri City. I got to know about the Microsoft RLOS programme via LinkedIn. My experience with open source includes contributing to C++ ML libraries such as Shogun, TensorFlow Lite support, and mlpack, where I did GSoC '21 and currently serve as a member. My GitHub.

    I came across the Safe Contextual Bandits project. Is this topic taken for this summer, or will it be available for 2022? My forte is implementing algorithms from research papers, so I wanted to inquire about this.

    Thanks in advance!

    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    Oh yes, I'm interested for 2022.
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys!
    Question regarding SquareCB: is there a parameter to control the exploration rate? Similar to epsilon for epsilon-greedy?
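    For what it's worth, a hedged sketch: SquareCB's exploration is driven by its gamma schedule rather than an epsilon, and --gamma_scale / --gamma_exponent (the same parameters asked about further down in this log) are the knobs exposed on the command line. The values below are illustrative, not the defaults.

    vw = pyvw.vw("--cb_explore_adf --squarecb --gamma_scale 1000 --gamma_exponent 0.5 --quiet")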
    2 replies
    jonpsy
    @jonpsy:matrix.org
    [m]
    So I tried attending the open meeting yesterday, but nobody was present. Will it happen next time?
    1 reply
    Kwame Porter Robinson
    @robinsonkwame
    I was looking through the summer projects; did the WebAssembly port (https://vowpalwabbit.org/rlos/2021/projects.html#18--vw-port-to-webassembly-and-javascript-api) ever happen? If so, I would love to see the repo.
    5 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Where can I see the default values for VW's hyperparameters? Specifically, I'm looking for SquareCB's. I tried the command line (vw -h) but didn't find them, and they're not in the documentation either. I'm interested in knowing the values for gamma_scale (I believe I saw in the presentation that it's set to 1000, but it would be good to confirm) and gamma_exponent.
    Using vw.get_arguments() on the model also doesn't show the default values.
    Thanks
    4 replies
    Max Pagels
    @maxpagels_twitter
    Got some time on my hands; is there a need for a tutorial specifically on progressive validation? I feel there might be some people caught out by it, namely the fact that the final regressor in such a setup isn't necessarily good (for lack of a better word). I've seen setups that use incremental learning offline, which is only really advisable if you fully understand the implications (mostly, you are better off using a standard train/test validation procedure).
    3 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys, a question about CCBs... is it expected that using the same policy configuration on the same dataset will yield the exact same results across multiple runs? I've been observing this and wondering if something's wrong with my pipeline.
    I thought it shouldn't, because exploration inherently brings randomization to the process, but since VW handles this under the hood for CCB, I'm not sure.
    To further explain: I'm training a new policy every time (with the same hyperparameters) on the same data, ordered the same way. The results (even the action distribution) are always the same.
    Is this expected?
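    A hedged note in code form: VW's exploration randomness is pseudo-random and deterministically seeded, so the same options, seed, and data ordering reproduce the same run; varying --random_seed is a quick way to confirm the pipeline itself is not at fault.

    vw_a = pyvw.vw("--ccb_explore_adf --random_seed 1 --quiet")
    vw_b = pyvw.vw("--ccb_explore_adf --random_seed 2 --quiet")
    # feeding identical training data to vw_a and vw_b can now yield different action draws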
    2 replies
    Sushmita
    @Sushmita10062002
    Hello everyone, I am Sushmita and I am currently studying at Ramjas College, Delhi University. I know Python, Java, machine learning, deep learning and TensorFlow. I am new to open source but I really want to contribute to this organisation. Could you please point me in the right direction?
    2 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    What is the correct way to evaluate a policy with exploration offline?
    My goal is to estimate the average reward of different models and compare that with the current online policy (which can be uniformly random) to assess whether it's appropriate to deploy a new policy or not.
    I've been using a custom pipeline that trains for multiple passes using rejection sampling and then assesses the model's performance on a test set, but I'm a bit unsure due to the shift in the contexts' distribution induced by the rejection sampling.
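    One hedged alternative to a hand-rolled rejection-sampling pipeline is VW's --explore_eval reduction, which evaluates an exploration algorithm against logged cb_adf-style data (actions with logged probabilities); the file name and option values below are illustrative.

    evaluator = pyvw.vw("--explore_eval --epsilon 0.2 -q UA -d logged_cb_data.dat --quiet")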
    MochizukiShinichi
    @MochizukiShinichi
    Happy holidays everyone! Could anyone please share some knowledge on how to read the logs for contextual bandits in VW? Specifically: 1. does the 'average loss' represent the average reward value estimated counterfactually? 2. Is it calculated on a holdout set? 3. If 1 holds, where can I find the loss of the underlying classifier/oracle? Thanks in advance!
    jonpsy
    @jonpsy:matrix.org
    [m]
    Hello VW team, can someone please explain this format in CCB:
    ccb shared | s_1 s_2
    ccb action | a:1 b:1 c:1
    ccb action | a:0.5 b:2 c:1
    ccb action | a:0.5 
    ccb action | c:1
    ccb slot  | d:4
    ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7
    Feeding this to the algorithm prints this:
    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    [warning] Unlabeled example in train set, was this intentional? I didn't understand that warning.
    What exactly is the slot argument? What does ccb slot | d:4 mean? Does it mean this is the fourth slot? In L7, what's happening? And what about the error term? I would be really grateful for an explanation, thanks.
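    A hedged reading of those slot lines, based on the documented CCB input format ccb slot [<chosen_action>:<cost>:<probability>[,<action>:<probability>,...]] [<included_action_ids>] | <slot features>:

    slot_line = "ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7"
    # 1:0.8:0.8 -> action 1 was chosen for this slot, with observed cost 0.8 and logged probability 0.8
    # ,0:0.2    -> action 0 also appeared in the logged PMF for this slot, with probability 0.2
    # 0,1,3     -> only actions 0, 1 and 3 may be considered for this slot
    # d:7       -> the slot's own features
    # "ccb slot | d:4" is simply an unlabeled slot (features only), which is why training on it
    # produces the "Unlabeled example in train set" warning.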
    jonpsy
    @jonpsy:matrix.org
    [m]
    Thanks a ton, I was also going through your answer on Stack Overflow. I'll be sure to update you on this.
    jonpsy
    @jonpsy:matrix.org
    [m]
    @bassmang: actually, could you review my VowpalWabbit/vowpal_wabbit#3546? We can discuss the details over there; let me know if you'd like that.
    andy-soft
    @andy-soft
    Hello VW team, I am using VW from C# for NLP tasks, and there are some examples I could not reproduce because of the constant incompatibilities across VW versions. Some papers published in 2015 claim that VW is an excellent and fast POS tagger and also an ultra-fast and precise NERC (named entity recognition and classification) system. I implemented the experiments and this is not true: the F1 scores obtained were significantly lower than those published, and only in English; as soon as you switch to other languages like Spanish, the thing becomes unusable.
    Has anyone experienced issues like this? I am very disappointed because of this.
    I am currently using it for NLP intent detection (an OAA classifier on POS-tagged + morphologically analyzed text).
    Although it runs smoothly and fine, the F1 score must always be recalculated afterward, and the loss is tricky: you can get a really low loss and a bad F1, and vice versa! Precision/recall behavior is also an issue.
    12 replies
    jonpsy
    @jonpsy:matrix.org
    [m]

    In CCB, what's the difference between the weight of an example and its cost? I thought weight was just the inverse of cost.

    I tried feeding in the training data from the CCB page, and the weight seems to be equal to the example counter; is that intentional?

    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    Okay, I recall that in the CB example, cost was just the inverse of reward.
    Priyanshu Agarwal
    @priyanshuone6
    Did anyone get a chance to look at my comment?
    3 replies
    andreacimino
    @andreacimino
    A question regarding Conditional Contextual Bandits (CCB).
    As it stands now, looking at the source code, it seems that the selection of the action for a slot does not take the previously selected actions into account as "features". Let me try to explain better:
    suppose there are N slots and M actions (M >= N).
    The algorithm chooses action M_1 in slot N_1; then a decision on slot N_2 must be taken, and other actions are similar to M_1.
    There is a high chance that in slot N_2 an action similar to M_1 will be taken.
    I would like to pass the decision made at the previous step as "context", to promote "diversity".
    I am not an expert, but I would like to know if someone has experience with this.
    jonpsy
    @jonpsy:matrix.org
    [m]
    @jackgerrits: Hey, thanks for the detailed review. Would you mind reviewing the PR sometime soon? I think it's done; we could merge it today/tomorrow :-)
    jeanjean
    @jeanjeaCyberboy_twitter
    Hey everyone, I have a quick question. I just started using the VW library and managed to extract the audit logs for the CB explore model. I was wondering what the scale of the feature weights is. Does the model assign them at random, and why are there negative values? I couldn't find anything meaningful in the documentation. Would appreciate any assistance.
    3 replies
    jonpsy
    @jonpsy:matrix.org
    [m]
    @bassmang: Hey, I just saw you updated the master branch with some major changes for labels.
    I see that you've used Union[example, Costs] instead of using kwargs; should I go by that method too, then?
    1 reply
    Also, we should check in from_example(..) whether the example type is what it claims to be. No?
    3 replies