    Ryan Angi
    @rangi513
    For anyone who is interested, I've conducted a few VW simulations using different exploration algorithms/hyperparameters with an environment where I add a new option to a personalization bandit after 10k decisions. I saw some weird behavior with regcb and regcbopt (I'm not sure if the mellowness parameter actually works?), but online cover seems to perform well overall when the hyperparameters are tuned. Feel free to reach out if you are interested in different simulations or have more questions on my methodology. The report and plots are in the attached html file.
    4 replies
    Moksh Pathak
    @moksh-pathak
    Hey everyone! I'm Moksh Pathak, a 2nd-year CSE UG student at NIT Rourkela. I like to make and break things, play with code, and learn new concepts and technologies. I'm looking forward to using and contributing to this project, and I'd love any guidance and advice on how to contribute.
    I was going through some beginner friendly issues, and came across this documentation issue, which seems to be easy for someone like me.
    Link - VowpalWabbit/vowpal_wabbit#3352
    I would like to work on this issue and it would be great if anyone can point to a direction and some files.
    Thanks :)
    Alexey C
    @ColdTeapot273K

    Hello everyone. I have problems with convergence for --cb_explore_adf on a simple synthetic dataset of 3 classes and 5 features, all informative (the data was simulated via scikit-learn's make_classification).

    Here's a gist and an output plot. Basically, it doesn't learn the difference between classes and performs no better than the baseline (predict one class each time). I tried softmax exploration with different lambdas (since it's not bound by the epsilon-soft convergence bound like an epsilon-greedy policy), with the same results.

    FYI, sklearn's softmax multiclass classifier gets 80% mean accuracy. So I really don't understand what the problem with VW might be (maybe a smaller learning rate is needed?). The documentation and the lack of --audit support in Python certainly don't help.

    https://gist.github.com/ColdTeapot273K/76a91a0416cb9b8ecb114e625f88f4a0
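    A minimal sketch of the data-conversion step, in case it helps reproduce this: one labeled multiclass row becomes a multiline ADF example with a shared line plus one line per action, where only the logged action carries an action:cost:probability label. The 0/1 cost encoding, feature names, and the helper itself are assumptions for illustration, not taken from the gist:

    ```python
    # Sketch: turn one multiclass row (features + logged action) into the
    # multiline text format that --cb_explore_adf consumes. Cost 0 means the
    # chosen class was correct, 1 otherwise; prob is the logged probability.
    def to_cb_adf_example(features, n_classes, chosen, cost, prob):
        shared = "shared | " + " ".join(
            f"f{i}:{v}" for i, v in enumerate(features)
        )
        lines = [shared]
        for a in range(n_classes):
            # Only the chosen action's line carries a label.
            label = f"{a}:{cost}:{prob} " if a == chosen else ""
            lines.append(f"{label}| action={a}")
        return "\n".join(lines)

    example = to_cb_adf_example([0.5, -1.2], n_classes=3, chosen=1, cost=0, prob=0.34)
    print(example)
    ```

    Each such multiline block is one example; blocks are separated by a blank line in a data file.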

    4 replies
    George Fei
    @georgefei
    Hey team, two quick questions:
    6 replies
    Ignacio Amaya
    @xinofekuator:matrix.org
    [m]
    Is there a way to get the learning rate that vw is using when you train a model on daily batches with --save_resume to continue training the old model on newer data? I think vw internally applies some learning-rate decay, and I would like to track the effective learning rate after several weeks of batch training, to detect when it has become so low that further training is basically no longer needed.
    7 replies
    anamika-yadav99
    @anamika-yadav99
    Hi! I'm Anamika. I'm a 3rd year undergraduate student enthusiastic about open source. I'm new here and I want to contribute to this project, could you please guide me on how to start contributing and point out some beginner-friendly issues to get me started on this project? Thanks!
    Jack Gerrits
    @jackgerrits
    Hi @anamika-yadav99, welcome! You can start with the Good first issue label and see if anything interests you
    bef55
    @bef55
    Hello, I am trying to install the command-line version of VW on Windows 10. I followed all the instructions, and everything appeared to work properly. But I cannot find vw.exe anywhere, and I am confident it does not exist because `find . | grep vw.exe` turned up nothing. Any ideas what might have happened?
    19 replies
    Sander
    @sandstep1_twitter

    can you help me understand why I hit a bug on Windows 10 with vowpalwabbit 8.10.2, or whether I'm doing something wrong:

    this code runs correctly and takes a few seconds:

    from vowpalwabbit import pyvw
    ridge_lambda = 0.001
    loss_function = 'squared'

    VWmodel_simple = pyvw.vw(
        learning_rate=0.1,
        l2=ridge_lambda,
        loss_function=loss_function,
    )

    but this code never finishes running:

    VWmodel_simple = pyvw.vw(
        cache_file='my_VW_2.cache',
        passes=3,
        learning_rate=0.1,
        l2=ridge_lambda,
        loss_function=loss_function,
    )

    what is wrong with creating this file 'my_VW_2.cache'?

    command line output is
    using l2 regularization = 0.001
    Num weight bits = 18
    learning rate = 0.1
    initial_t = 0
    power_t = 0.5
    decay_learning_rate = 1
    creating cache_file = my_VW_2.cache
    Reading datafile =
    num sources = 1
    Enabled reductions: gd, scorer
    average since example example current current current
    loss last counter weight label predict features
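    A likely cause, reading the log above: `Reading datafile =` is empty, so with `passes = 3` VW waits on stdin for input that never arrives. A hedged sketch of the same call with an explicit data file (the file name, its contents, and whether your VW version still exposes `pyvw.vw` are assumptions):

    ```python
    import os
    import tempfile

    # Write a tiny regression dataset so VW has a finite data source to cache.
    work_dir = tempfile.mkdtemp()
    train_path = os.path.join(work_dir, "train.txt")
    with open(train_path, "w") as f:
        f.write("1 | price:0.18 sqft:0.15\n")
        f.write("0 | price:0.53 sqft:0.32\n")

    try:
        from vowpalwabbit import pyvw
        vw_ctor = getattr(pyvw, "vw", None)  # older wrapper API; an assumption
    except ImportError:
        vw_ctor = None

    if vw_ctor is not None:
        # Multiple passes need both a cache file and a bounded data source;
        # without a `data` argument, VW blocks reading stdin indefinitely.
        model = vw_ctor(
            data=train_path,
            cache_file=os.path.join(work_dir, "my_VW_2.cache"),
            passes=3,
            learning_rate=0.1,
            l2=0.001,
            loss_function="squared",
            quiet=True,
        )
        model.finish()
    ```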

    2 replies
    Munu Sairamesh
    @musram:matrix.org
    [m]
    Hi all,
    Munu Sairamesh
    @musram:matrix.org
    [m]
    I am using the Vowpal Wabbit contextual bandit. As I have around 10,000 arms, I am using --cb_explore_adf since it supports action features. If I have context x = a, I will use arms 1-300; for context x = b I will use arms 200-400. Each context is mapped to some subset of the possible arms. The documentation says you can add and remove arms with --cb_explore_adf, but I am still not clear on how. Can anybody help me?
    Munu Sairamesh
    @musram:matrix.org
    [m]

    Hi. I am using https://vowpalwabbit.org/tutorials/cb_simulation.html. I need to save the model and use it later, as the reward feedback comes to the system a few hours later.

    vw1 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.2 save_resume=True")
    num_iterations = 5000
    ctr = run_simulation(vw1, num_iterations, users, times_of_day, actions, get_cost)

    But when I save the model and use it again with a different epsilon, it's not using the weights of the saved model:

    vw1.save("data/saved_model.model")
    vw2 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.8 i=data/saved_model.model")

    Can anybody help here?
    1 reply
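    A possible culprit, sketched below: `save_resume=True` and `i=...` appear inside the option string, but in that string VW expects command-line syntax, i.e. `--save_resume` and `-i <file>`. Written the other way they are not parsed as options, which could explain the saved weights not being picked up. The intended flow might look like this (file names are placeholders; whether your VW version still exposes `pyvw.vw` is an assumption):

    ```python
    # Corrected option strings (command-line flag syntax inside the string):
    args_train = "--cb_explore_adf -q UA --quiet --epsilon 0.2 --save_resume"
    args_resume = "--cb_explore_adf -q UA --quiet --epsilon 0.8 -i saved_model.model"

    try:
        from vowpalwabbit import pyvw
        vw_ctor = getattr(pyvw, "vw", None)  # older wrapper API; an assumption
    except ImportError:
        vw_ctor = None

    if vw_ctor is not None:
        vw1 = vw_ctor(args_train)
        # ... run_simulation(vw1, ...) as in the tutorial ...
        vw1.save("saved_model.model")
        vw1.finish()
        vw2 = vw_ctor(args_resume)  # resumes from the saved weights
        vw2.finish()
    ```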
    Munu Sairamesh
    @musram:matrix.org
    [m]
    Thanks jackgerrits.
    Alexey C
    @ColdTeapot273K

    Hello, can anyone tell me how to make a cb_explore_adf agent respond to requests in daemon mode properly? I send the multiline examples via echo ... | netcat ... as in the documentation and get no response.

    If I launch with the --audit flag I receive a bunch of info with unintuitive formatting (see attachment). I assume the first value on each line is the action probability, and the very last line is some combined gradient or the like. Very different from a pmf output, like in the Python example on the website.

    Max Pagels
    @maxpagels_twitter
    Any idea how to get the holdout loss from pyvw?
    foo = pyvw.vw("--ccb_explore_adf -d train.gz --passes 2 -c -k --enable_logging")
    foo.get_sum_loss() / foo.get_weighted_examples() # not the holdout loss (checked against a CLI run)
    2 replies
    Alexey C
    @ColdTeapot273K

    Sorry, couldn't attach the image to my previous message thread.

    A problem:
    for some reason I get a pmf of size n+1 for data with n distinct actions.

    Details:
    When I train cb_explore_adf on datapoints with only 3 actions (no features apart from shared|...) and supply one of these examples for testing (action:cost:proba removed, obviously), I get 4 actions in the output file. Why might that be?


    5 replies
    Munu Sairamesh
    @musram:matrix.org
    [m]
    Hi all. Is there some way of getting the contextual bandit model weights, as we do with --invert_hash for a Vowpal Wabbit regression model? I need this because the only way I can host the model in real time is in CSV format.
    Munu Sairamesh
    @musram:matrix.org
    [m]

    I trained the contextual bandit as vw1 = pyvw.vw("-d data/cb_load.dat --cb_explore_adf -q UA -P 1 --invert_hash mymodel.inverted") on https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/test/train-sets/cb_load.dat with --invert_hash, and I got mymodel.inverted.

    I could understand the user and action features. What does 18107:0.137426 mean in "User^time_of_day=afternoon*Action^article=politics:18107:0.137426"?
    I think 18107 is the hash value for "User^time_of_day=afternoon*Action^article=politics" and 0.137426 is the weight. Is this correct?

    How can I get the probability corresponding to the user and action features from the weights?

    Version 8.11.0
    Id
    Min label:-1
    Max label:0
    bits:18
    lda:0
    0 ngram:
    0 skip:
    options: --cb_adf --cb_explore_adf --cb_type mtr --csoaa_ldf multiline --csoaa_rank --quadratic UA
    Checksum: 2033437909
    event_sum 113
    action_sum 791
    :0
    User^time_of_day=afternoon*Action^article=politics:18107:0.137426
    User^user=Tom:32581:-0.0636371
    User^user=Tom*Action^article=politics:38087:-0.0636749
    Action^article=politics:52568:-0.110663
    User^time_of_day=morning*Action^article=music:58967:0.224528
    User^user=Anna*Action^article=politics:62875:0.0165196
    User^time_of_day=afternoon:65137:-0.0253498
    Action^article=music:67569:-0.0505464
    User^time_of_day=afternoon*Action^article=food:67793:0.121444
    User^time_of_day=morning*Action^article=politics:77054:-0.192732
    User^user=Anna*Action^article=music:81714:0.297336
    Action^article=sports:86811:0.0540273
    User^user=Tom*Action^article=music:89710:-0.101787
    User^user=Anna*Action^article=sports:93144:0.0540273
    Action^article=food:99122:0.121444
    User^time_of_day=afternoon*Action^article=music:101394:-0.190187
    User^user=Anna*Action^article=food:113649:0.0457554
    Constant:116060:-0.0947514
    User^time_of_day=afternoon*Action^article=sports:121080:0.0540273
    User^user=Tom*Action^article=food:121517:0.109427
    Action^article=camping:134640:0.0742112
    User^user=Anna:141841:0.0966574
    User^time_of_day=afternoon*Action^article=health:144204:0.0344906
    User^user=Tom*Action^article=camping:152687:0.0742112
    User^user=Anna*Action^article=health:163948:0.0344906
    Action^article=health:178351:0.0971796
    User^user=Tom*Action^article=health:188720:0.09161
    User^time_of_day=morning*Action^article=health:219401:0.09161
    User^time_of_day=morning:243586:0.0320462
    User^time_of_day=morning*Action^article=camping:257110:0.0742112
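    On the 18107:0.137426 question above: each dump line is feature_name:hash_index:weight, so 18107 is the hash bucket of that interaction feature and 0.137426 its learned weight. A per-action score (a cost estimate) is the sum of the weights of that action's features, interactions included; the exploration layer then maps scores to a pmf. A hedged sketch for plain epsilon-greedy exploration (the scores and epsilon below are illustrative, not derived from this model dump):

    ```python
    # With epsilon-greedy, the pmf puts epsilon/K on every action plus an
    # extra 1 - epsilon on the action with the lowest predicted cost.
    def epsilon_greedy_pmf(action_scores, epsilon=0.2):
        k = len(action_scores)
        best = min(range(k), key=lambda i: action_scores[i])  # scores are costs
        pmf = [epsilon / k] * k
        pmf[best] += 1.0 - epsilon
        return pmf

    # Illustrative per-action cost scores for three actions.
    print(epsilon_greedy_pmf([0.14, -0.06, 0.12]))
    ```

    Other exploration settings (softmax, cover, SquareCB) map scores to probabilities differently, so this only matches the --epsilon case.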

    2 replies
    musram
    @musram

    Hi All, is there any way to add importance weights to offline training of contextual bandits? Similar to linear regression, where we specify an importance weight of 2 in the training example "1 2 second_house | price:.18 sqft:.15 age:.35 1976".

    This would help reduce the training time of the contextual bandits, as the training data points number in the billions, and we would get a good reduction by using importance weights since data points are repeated.

    6 replies
    Vedant Bhatia
    @vedantbhatia
    hey, is there any example/documentation about running LOLS through VW? specifically how we can create the rollin and rollout policies/how they work
    musram
    @musram
    Is the value of --passes directly correlated with contextual bandit performance? I have 10 billion data points, and if I make --passes big it takes a lot of time to train. Can I make --passes small, e.g. --passes 5, so that training is faster while the model is still good?
    3 replies
    Reedham20
    @Reedham20
    Hey, I am an emerging web developer and would like to contribute to any of the ongoing projects. I have intermediate skills in JavaScript and its frameworks. Please let me know about anything I could contribute to. Thanks.
    Alexey C
    @ColdTeapot273K
    Can I train an agent in cb_adf mode, dump it, then load it as cb_explore_adf?
    3 replies
    And can I save a model with one --lambda or --epsilon value and then load it with another?
    K Krishna Chaitanya
    @kkchaitu27
    Hi All, can someone help me with resources related to converting VowpalWabbit models into ONNX format? I tried googling but it did not turn up useful resources. I could see there was some work done in the Open Source Fest, but the code is not available.
    1 reply
    musram
    @musram
    Hi All. When I retrain the model I am getting this error: vw (cb_explore_adf.cc:93): cb_adf: badly formatted example, only one cost can be known.
    My command is "./vw.binary -P 4113882 --hash all --data data/part-00000-30a87f0c-0de0-48ce-aade-7f736c900132-c000.txt --cache_file temp.cache --final_regressor data/cb.vwmodel --bit_precision 24 -k --passes 1 --cb_explore_adf --cb_type ips -q EA --save_resume --initial_regressor=data/cb.vwmodel"
    10 replies
    Is this because of a version problem? The version I am using is 8.5.0.
    It's too big to share. Also, it's company data.
    musram
    @musram
    Hi All. How do the -t and --invert_hash options work on a dataset? In my case the data is in the billions, so we break it into batches and train incrementally: train the model on the first batch, then retrain it on the second batch, and so on. If I have 4 batches, is it fine to run just the final (4th) batch with -t and --invert_hash to generate the human-readable model? Or do I have to combine all the batches and run them with -t and --invert_hash?
    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]

    Hello all,
    I'm NG Sai, a final-year UG at IIIT Sri City. I got to know about the Microsoft RLOS programme via LinkedIn. My open-source experience includes contributing to C++ ML libraries such as Shogun, TensorFlow Lite Support, and mlpack, where I did GSoC '21 and currently serve as a member. My GitHub.

    I came across the Safe Contextual Bandits project. Is this topic taken for this summer, or will it be available for 2022? My forte is implementing algorithms from research papers, so I wanted to inquire about it.

    Thanks in advance!

    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    oh yes, I'm interested for 2022
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys!
    Question regarding SquareCB: is there a parameter to control the exploration rate? Similar to epsilon for epsilon-greedy?
    2 replies
    jonpsy
    @jonpsy:matrix.org
    [m]
    So I tried attending the open meet yesterday, but nobody was present. Will it happen next time?
    1 reply
    Kwame Porter Robinson
    @robinsonkwame
    I was looking through the summer projects, did the webassembly port (https://vowpalwabbit.org/rlos/2021/projects.html#18--vw-port-to-webassembly-and-javascript-api) ever happen? If so, would love to see the repo.
    5 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Where can I see the default values for VW's hyperparameters? Specifically, I'm looking for SquareCB's. I tried the command line (vw -h) but didn't find them, and they're not in the documentation either. I'm interested in knowing the values for gamma_scale (I believe I saw in the presentation that it's set to 1000, but it would be good to confirm) and gamma_exponent.
    Using vw.get_arguments() on the model also doesn't show the default values.
    Thanks
    4 replies
    Max Pagels
    @maxpagels_twitter
    Got some time on my hands; is there a need for a tutorial specifically on progressive validation? I feel there might be some people caught out by it, namely the fact that the final regressor in such a setup isn't necessarily good (for lack of a better word). I've seen setups that use incremental learning offline, which is only really advisable if you fully understand the implications (mostly, you are better off using a standard train/test validation procedure).
    3 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys, question about CCBs... is it expected that using the same policy configuration on the same dataset will yield the exact same results across multiple runs? I've been observing this and wondering if something's wrong with my pipeline.
    I thought it shouldn't, because of exploration, which inherently brings randomization into the process; but since VW does this under the hood for CCB, I'm not sure.
    To further explain: I'm training a new policy every time (with the same hyperparameters) on the same data, ordered the same way. The results (even the action distribution) are always the same.
    Is this expected?
    2 replies
    Sushmita
    @Sushmita10062002
    Hello everyone, I am Sushmita and I am currently studying at Ramjas College, Delhi University. I know Python, Java, machine learning, deep learning, and TensorFlow. I am new to open source, but I really want to contribute to this organisation. Could you please point me in the right direction?
    2 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    What is the correct way to offline-evaluate a policy with exploration?
    My goal is to estimate the average reward of different models and compare it with the current online policy (which can be uniformly random) to assess whether it's appropriate to deploy a new policy.
    I've been using a custom pipeline that trains for multiple passes using rejection sampling and then assesses the model's performance on a test set, but I'm a bit unsure due to the shift in the contexts' distribution induced by the rejection sampling.
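    When the logging policy is uniformly random, one standard baseline for this kind of question is the replay (rejection-sampling) estimator: keep only the logged events where the candidate policy agrees with the logged action, and average their rewards. A self-contained sketch (the toy log and the always-pick-action-0 policy are made up for illustration):

    ```python
    def replay_estimate(logged_events, policy):
        """Estimate a policy's average reward from logs collected under a
        uniformly random logging policy. logged_events is an iterable of
        (context, logged_action, reward); only events where `policy` would
        have chosen the logged action are kept."""
        kept = [r for ctx, a, r in logged_events if policy(ctx) == a]
        return sum(kept) / len(kept) if kept else float("nan")

    # Toy log; the candidate policy always picks action 0, so the middle
    # event (logged action 1) is rejected and the kept rewards are averaged.
    log = [({"x": 1}, 0, 1.0), ({"x": 2}, 1, 0.0), ({"x": 3}, 0, 0.5)]
    print(replay_estimate(log, lambda ctx: 0))  # → 0.75
    ```

    For non-uniform logging the same idea extends to inverse propensity weighting, dividing each kept reward by its logged probability; that avoids the context-distribution shift concern, at the cost of higher variance.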
    MochizukiShinichi
    @MochizukiShinichi
    Happy holidays everyone! Could anyone please share some knowledge on how to read the logs for contextual bandits in VW? Specifically: 1. Does the 'average loss' represent the average reward value estimated counterfactually? 2. Is it calculated on a holdout set? 3. If 1 holds, where can I find the loss of the underlying classifier/oracle? Thanks in advance!
    jonpsy
    @jonpsy:matrix.org
    [m]
    Hello VW team, can someone please explain this format in CCB
    ccb shared | s_1 s_2
    ccb action | a:1 b:1 c:1
    ccb action | a:0.5 b:2 c:1
    ccb action | a:0.5 
    ccb action | c:1
    ccb slot  | d:4
    ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7
    jonpsy
    @jonpsy:matrix.org
    [m]
    feeding this to the algorithm prints this
    1 reply
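    For what it's worth, here is how the labeled slot line above can be read, going by the documented CCB input format: `ccb shared` carries features common to all decisions, each `ccb action` line defines one action, an unlabeled `ccb slot` is a decision to fill, and the label `1:0.8:0.8,0:0.2 0,1,3` records chosen action 1 with cost 0.8 and probability 0.8, action 0 with probability 0.2, and `0,1,3` as the actions allowed in that slot. The parser below is an illustrative sketch under that reading, not VW's own parser:

    ```python
    # Sketch: read a ccb slot label of the shape
    #   <chosen>:<cost>:<prob>[,<action>:<prob>...] [<allowed action ids>]
    def parse_ccb_slot_label(label):
        parts = label.split()
        outcome = parts[0]
        action_set = parts[1] if len(parts) > 1 else None
        entries = outcome.split(",")
        chosen, cost, prob = entries[0].split(":")
        result = {
            "chosen_action": int(chosen),   # action id that was taken
            "cost": float(cost),            # observed cost for this slot
            "pmf": [(int(chosen), float(prob))],
        }
        for entry in entries[1:]:           # probabilities of other actions
            a, p = entry.split(":")
            result["pmf"].append((int(a), float(p)))
        if action_set is not None:          # explicit per-slot action set
            result["allowed_actions"] = [int(a) for a in action_set.split(",")]
        return result

    print(parse_ccb_slot_label("1:0.8:0.8,0:0.2 0,1,3"))
    ```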