    kornilcdima
    @kornilcdima
    Is it somehow possible to change the posterior distribution for a chosen context? Since MABT has selected the optimal arm many times, the variance of the posterior has decreased, and now it is no longer able to choose another arm that has become optimal for that context.
    1 reply
    Jack Gerrits
    @jackgerrits
    Just a quick announcement/FYI: we're using issues as a way to communicate and discuss deprecations and removals for VW. Take a look at the 'Deprecation' tag (there are just two issues), and if anything there is something you have an opinion on, please feel free to comment: https://github.com/VowpalWabbit/vowpal_wabbit/issues?q=is%3Aissue+is%3Aopen+label%3ADeprecation We're hoping this is a reasonable way to communicate changes so we can make progress without adversely affecting anyone.
    MochizukiShinichi
    @MochizukiShinichi
    Hey folks, could anyone please point me to some resources I can read on algorithm details of --cb_adf implementation in VowpalWabbit?
    1 reply
    K Krishna Chaitanya
    @kkchaitu27
    Hi everyone, I have a doubt regarding the action probabilities in vowpalwabbit's input format for contextual bandits. The wiki says the input format must be action:cost:probability | features. What is the probability here: is it the probability of the action getting a reward/cost, or something else? I read somewhere that it is the probability of exploration for that action; what does that mean?
    Adam Stepan
    @AdamStepan

    hello, I am trying to train a vowpal model using the C++ API with this piece of code:

        vw* vw = VW::initialize("-f train1.vw --progress 1");
        {
            ezexample ex(vw, false);
    
            ex.set_label("1");
            ex.addf('a', "a", 0.0);
            ex.addf('a', "b", 1.0);
            ex.addf('a', "c", 2.0);
            ex.train();
            ex.finish();
        }
        {
            ezexample ex1(vw, false);
    
            ex1.set_label("0");
            ex1.addf('a', "a", 2.0);
            ex1.addf('a', "b", 1.0);
            ex1.addf('a', "c", 0.0);
            ex1.train();
            ex1.finish();
        }
    
        VW::finish(*vw);

    This snippet generates the model, but the number of examples and the number of features are 0; am I doing something wrong? I also tried to use example instead of ezexample and the result was the same, and in either case I did not see a progress log...

    6 replies
    Max Pagels
    @maxpagels_twitter
    @kkchaitu27 contextual bandits have exploration, i.e. there should always be a nonzero probability of choosing some action. The reason for this is to try out different actions to learn what works and what doesn't. This probability is the one mentioned in the docs. Its value depends on the exploration algorithm; for epsilon-greedy with two actions and 10 percent exploration, the best action is chosen with probability 0.95 and the other with 0.05. If you use cb_explore when collecting data, vw calculates these probabilities for you.
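    For reference, a minimal sketch (not from the thread) of how those epsilon-greedy numbers arise, assuming K discrete actions and exploration rate epsilon; the function name is purely illustrative:

        # epsilon-greedy action distribution over K actions
        def epsilon_greedy_probs(num_actions, best_action, epsilon=0.1):
            # every action receives epsilon / K of the uniform exploration mass
            probs = [epsilon / num_actions] * num_actions
            # the greedy action additionally gets the remaining (1 - epsilon) mass
            probs[best_action] += 1.0 - epsilon
            return probs

        # two actions, 10% exploration -> [0.95, 0.05], matching the numbers above
        print(epsilon_greedy_probs(num_actions=2, best_action=0, epsilon=0.1))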
    K Krishna Chaitanya
    @kkchaitu27
    @maxpagels_twitter Thanks for your response. How do I compute the probability if I have historic data? Is it equal to the number of times that action has been chosen divided by the total number of times the context has appeared?
    1 reply
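    One rough way to estimate those logging probabilities from historical data is the empirical frequency described in the question above (a sketch under the assumption that contexts repeat often enough to count; for rich contexts a learned propensity model is usually safer, and all names below are illustrative):

        from collections import Counter

        # logged (context, action) pairs; contexts must be hashable for this rough count
        logged = [("ctx_a", 1), ("ctx_a", 1), ("ctx_a", 2), ("ctx_b", 1)]

        context_counts = Counter(ctx for ctx, _ in logged)
        pair_counts = Counter(logged)

        def empirical_propensity(ctx, action):
            # fraction of times `action` was chosen when `ctx` appeared
            return pair_counts[(ctx, action)] / context_counts[ctx]

        print(empirical_propensity("ctx_a", 1))  # 2/3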
    Ryan Angi
    @rangi513

    I'm happy to turn this into a github issue, but want to make sure I'm not attempting some unintended behavior first.

    I am attempting to do multiple passes over a cb_adf dataset to hopefully improve the quality of my q function. I'm thinking of trying an offline bandit using the whole dataset and multiple passes instead of online with iterative updates. However, I get the following error after the first pass:

    libc++abi.dylib: terminating with uncaught exception of type VW::vw_exception: cb_adf: badly formatted example, only one line can have a cost
    [1]    90720 abort      vw --cb_adf --passes 2 -c -d train.dat

    Here is my command and dataset for reproducibility:
    vw --cb_adf --passes 2 -c -d train.dat

    train.dat

    shared | a:1 b:0.5
    0:0.1:0.75 | a:0.5 b:1 c:2
    | a:1 c:3
    
    shared | s_1 s_2
    0:1.0:0.5 | a:1 b:1 c:1
    | a:0.5 b:2 c:1

    I'm using version 8.10.1. I found this SO post and VowpalWabbit/vowpal_wabbit@431c270 by @jackgerrits, which may have been intended to fix this but could also be unrelated.

    Are multiple passes not supported for --cb_adf? If so, maybe some better error messaging might be useful here?

    2 replies
    K Krishna Chaitanya
    @kkchaitu27

    This is a sample dataset I created

    1:1:1.0 2:2 3:3 4:4 | a b c
    1:1 2:2:1.0 3:3 4:4 | a b c
    1:1 2:2 3:3:1.0 4:4 | a b c
    1:1 2:2 3:3 4:4:1.0 | a b c
    1:1 2:2:0.7 3:3 4:4 | d e f

    when I do

    vw -d sampledata.vw --cb 4

    I get

    Num weight bits = 18
    learning rate = 0.5
    initial_t = 0
    power_t = 0.5
    using no cache
    Reading datafile = sampledata.vw
    num sources = 1
    Enabled reductions: gd, scorer, csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf
    average  since         example        example  current  current  current
    loss     last          counter         weight    label  predict features
    [critical] vw (cb_adf.cc:279): cb_adf: badly formatted example, only one cost can be known.

    Why are these reductions (csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf) enabled when I just do --cb <actions>?
    8 replies
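    For readers hitting the same "only one cost can be known" error: as far as I understand the --cb label format, each example should label only the single chosen action as action:cost:probability; a minimal sketch of such a dataset:

        1:2:0.4 | a b c
        3:0.5:0.25 | a b c
        2:1:0.5 | d e f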
    K Krishna Chaitanya
    @kkchaitu27
    Hi All, I want to deploy a vowpalwabbit model for online learning. I am using historical data to warm-start the model, and I want to take that trained model and start doing online serving and learning. I can see that mmlspark provides functionality to serve a vw model, but how do I train a vw model in real time? Are there any resources for deploying a vw contextual bandit model in an online learning and serving fashion?
    2 replies
    MochizukiShinichi
    @MochizukiShinichi
    Hello folks, I'm reading VowpalWabbit/vowpal_wabbit#1306 on removing arms in adf-format data, but I don't seem to have found a solution in the thread. Could anyone please let me know how to format multiline examples to indicate that some arms are no longer eligible?
    3 replies
    Owain Steer
    @0wainSteer_twitter
    Hey everyone, I'm currently following the OPE tutorial with both IPS and DR estimators on my own data. I'm finding that the average loss when using DR changes from positive to negative at times with the default 'squared' loss function, but not with the 'classic' variant of squared loss. I was assuming this is because the importance-weight-aware updates skew the results, but I don't know how to interpret the negative loss or whether this should be happening at all. Would appreciate any insight into this, thanks!
    1 reply
    George Fei
    @georgefei
    Hi all, I have a quick question. For cb_explore usage like the following: vw --cb_explore 2 --cover 10 -d train.txt -l 0.1 --power_t 0 --save_resume vw.model --l1 1e-08; since --power_t is set to 0 and the learning rates don't decay, does having --save_resume make any difference to the model performance?
    10 replies
    Rohit Choudhary
    @tanquerey
    Should I use a Compute Optimized or a Memory Optimized machine to increase training speed? Or does it not matter?
    2 replies
    Rohit Choudhary
    @tanquerey

    I am trying to load a model from a file

    modelfromfile = pyvw.vw(quiet=True).load('some.model')

    But I am getting the following error:
    AttributeError: 'vw' object has no attribute 'load'

    1 reply
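    A sketch of the usual way to load a saved model with pyvw, assuming the model was written with -f and is passed back via -i / --initial_regressor at construction time (there is no separate load method as far as I know):

        from vowpalwabbit import pyvw

        # load a previously saved regressor when constructing the vw object
        model = pyvw.vw("--quiet -i some.model")
        # the keyword form should be equivalent, e.g. pyvw.vw(quiet=True, initial_regressor="some.model")

        print(model.predict("| a:1 b:0.5"))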
    George Fei
    @georgefei
    Hi everyone, I have a batch contextual bandit problem where we make decisions for one cohort and receive the reward afterwards for everyone in the cohort. During backtesting I noticed that the order of the training samples fed into the model matters a lot when using the default update settings (--adaptive, --normalized and --invariant): different orderings have different optimal hyperparameter choices, and the validation loss also differs a lot. To address this, I was thinking of adopting a nested validation scheme: reshuffle the data multiple times before feeding it into the model and take the hyperparameters that perform best on average. Another solution is to use --bfgs to do batch updates. I believe the second method is preferred; is that correct?
    26 replies
    K Krishna Chaitanya
    @kkchaitu27
    Hi everyone, I am experimenting with contextual bandits with continuous actions using the following line of code:
    vw = pyvw.vw("--cats_pdf 300 --bandwidth 1 --min_value 1 --max_value 300")
    I see that I have to give data for continuous actions as follows: action:cost:pdf_value | [namespace] <features>. What should I give for pdf_value? Can I give it as 1.0, since I do not clearly have a probability density function value for the data I have? How does this pdf_value affect the learning process in the algorithm?
    2 replies
    Bernardo Favoreto
    @Favoreto_B_twitter

    Hey guys, I was wondering... what influences the time it takes to load a model? I've tested model files of different sizes, but there doesn't seem to be any correlation.

    Is it the total number of weights?

    The number of non-zero weights? (I don't think so, because that is correlated with file size, which isn't correlated with loading time.) And by the way, is there an easy way to count the number of non-zero weights? I'm currently iterating over all weights and counting those that aren't 0.

    Jack Gerrits
    @jackgerrits
    A model contains a command line which dictates how VW is instantiated, so if the command lines are different then the load times for two different models can differ, because it depends on what the reductions being set up are doing. For two models with the same command line, the next factors are the reduction-specific data (this depends on the reduction, but I would say most of the time it is a constant-time operation) and then the non-zero model weights, as you mentioned.
    4 replies
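    On the earlier side question about counting non-zero weights: I'm not aware of a built-in counter, so here is a minimal sketch of the scan Bernardo describes, assuming the pyvw num_weights()/get_weight() accessors and a hypothetical model file name:

        from vowpalwabbit import pyvw

        model = pyvw.vw("--quiet -i some.model")  # hypothetical model file

        # walk the whole weight table and count entries that are not exactly zero
        nonzero = sum(1 for i in range(model.num_weights()) if model.get_weight(i) != 0.0)
        print(f"non-zero weights: {nonzero} / {model.num_weights()}")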
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hello again, I have a follow-up to my previous question regarding model load time. We're currently building a multi-tenant system to serve VW models, and I was wondering... is there a recommended approach in this scenario? Can we keep models in memory, or do we have to load them for every prediction?
    Any input here is welcome!
    4 replies
    Bernardo Favoreto
    @Favoreto_B_twitter

    Good morning everyone... I have a simple question regarding one of the Personalizer's examples.

    Why does the JSON input in this example define the features as an array of objects? Why not simply an object? Does it have anything to do with namespaces?

    3 replies
    memeplex
    @memeplex

    Hi all, I was wondering how lrq (aka factorization machine) plays with Importance Weight Aware Updates (https://arxiv.org/pdf/1011.1576.pdf) since the development in that paper is for linear models:

    In this paper we focus on linear models i.e. p = <w, x> where w is a vector of weights

    but lrq models are not linear given that they involve products of weights. So what about the property:

    Therefore all gradients of a given example point to the same direction and only differ in magnitude.

    that's assumed in the paper? I wasn't able to find any related discussion.

    Ignacio Amaya
    @xinofekuator:matrix.org
    [m]
    Hi all, I found this answer (https://stackoverflow.com/questions/48687328/how-much-preprocessing-vowpal-wabbit-input-needs) which says that normalization is used for SGD by default. I was wondering if this is also true for numerical variables with contextual bandits (using cb or cb_explore).
    1 reply
    Tahseen Shabab
    @thshabab_gitlab
    Hi all, great to be a part of this community! I have a quick question regarding the use of the Softmax Explorer in cb_explore_adf. The VW documentation states that the Softmax Explorer predicts a score indicating the quality of each action. Is this score the reward, proportional to the reward, or something else? Thank you :)
    6 replies
    Jack Gerrits
    @jackgerrits
    We're running a survey to better understand how VW is used. If you have a moment we'd greatly appreciate your input. More details on the blog: https://vowpalwabbit.org/blog/vowpalwabbit-survey.html
    bef55
    @bef55
    I am trying to perform an active learning task to find the documents that will be most helpful to get labeled. But when I run vw in --active mode, it returns an importance weight of 1.0 for every unlabeled example I feed it. Specifically, I run this command: vw --active --port 6075 --daemon --foreground -i existing_model_filename.ext Then I run python3.9 active_iterator.py localhost 6075 unlabeled_examples_filename.ext. All of the over 800K unlabeled examples return an importance of exactly 1.0, even though the predictions are variable and largely accurate. In the past I have received highly useful and variable importance weights, and I cannot figure out what is wrong now. The only possibility that even occurs to me is that in active_iterator.py I had to change the first line of the recvall function to buf=s.recv(n).decode() from buf=s.recv(n), and I changed the sendall calls from sock.sendall(line) to sock.sendall(line.encode()). Any ideas? Thanks very much.
    16 replies
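    For context on the encode/decode change mentioned above: Python 3 sockets exchange bytes, so text examples have to be encoded on send and decoded on receive. A generic sketch of talking to a VW daemon (plain socket code, not the actual active_iterator.py; host and port are illustrative):

        import socket

        # connect to a running `vw --daemon --port 6075 ...` instance
        with socket.create_connection(("localhost", 6075)) as sock:
            example = "| word_1 word_2 word_3\n"
            sock.sendall(example.encode())      # bytes out
            reply = sock.makefile().readline()  # one text line back per example
            print(reply.strip())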
    Priyanshu Agarwal
    @priyanshuone6
    Hey, I am Priyanshu Agarwal, currently a third-year undergraduate pursuing my bachelor's degree. I want to contribute to this project; could you please guide me on how to start contributing and point out some beginner-friendly issues to get me started? Thanks!
    2 replies
    Jack Gerrits
    @jackgerrits
    Hi @priyanshuone6, you can take a look at issues marked "good first issue" (https://github.com/VowpalWabbit/vowpal_wabbit/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+First+Issue%22), and leave a comment on any issue you choose to work on so someone else doesn't double up. Thanks! Feel free to ask questions here, or even better, in the issue.
    Shloka
    @chicken-biryani
    Hey Everyone! I am Shloka, and I look forward to contributing and learning on the go!
    Ryan Angi
    @rangi513
    Has anyone done any simulations/research on how the different exploration algorithms (epsilon greedy, cover, bag, regcb, regcb-opt, etc.) behave after a policy has been learned and a new action is introduced with the cb_explore_adf setup? I have a feeling that a -1/0 vs. a 0/1 cost encoding could cause some drastically different behavior (0/1 might create extreme optimism with new options, while -1/0 might never explore the new option except with epsilon greedy). If not, I'll work on some of this simulation work myself.
    2 replies
    Themistoklis Mavridis
    @thmavri
    I saw here https://vowpalwabbit.org/rlos/2021/projects.html#16-vw-parallel-parsing-improvements that there is work around parallel parsing. Is the current state usable? Where could we find details about it?
    2 replies
    Sander
    @sandstep1_twitter
    hello everyone, I am new here. Any ideas for beginner material to understand and use sequence prediction from Python?
    1 reply
    Sander
    @sandstep1_twitter
    another question: how do I set interactions only for the categorical features of tabular data that has both continuous and categorical features? For example, how do I change: vw_squared = VWRegressor(loss_function='squared', interactions='abc')
    4 replies
    Sergey Feldman
    @sergeyf
    howdy. i have a multilabel problem, which can be dealt with using csoaa in command-line VW. does vowpalwabbit.sklearn_vw.VWMultiClassifier support this kind of input, i.e. where each row/sample has multiple labels? seems like VWMultiClassifier.fit only supports a 1-d y
    5 replies
    Ryan Angi
    @rangi513
    For anyone who is interested, I've conducted a few VW simulations using different exploration algorithms/hyperparameters with an environment where I add a new option to a personalization bandit after 10k decisions. I saw some weird behavior with regcb and regcbopt (I'm not sure if the mellowness parameter actually works?), but online cover seems to perform well overall when the hyperparameters are tuned. Feel free to reach out if you are interested in different simulations or have more questions on my methodology. The report and plots are in the attached html file.
    4 replies
    Moksh Pathak
    @moksh-pathak
    Hey everyone! I'm Moksh Pathak, a 2nd-year CSE undergraduate at NIT Rourkela. I like to make and break things, play with code, and learn new concepts and technologies. I'm looking forward to using and contributing to this project, and I would love guidance and advice on how to contribute.
    I was going through some beginner friendly issues, and came across this documentation issue, which seems to be easy for someone like me.
    Link - VowpalWabbit/vowpal_wabbit#3352
    I would like to work on this issue and it would be great if anyone can point to a direction and some files.
    Thanks :)
    Alexey C
    @ColdTeapot273K
    [attached image: output plot referenced below]

    Hello everyone. I have problems with convergence for --cb_explore_adf on a simple synthetic dataset of 3 classes and 5 features, all informative (the data was simulated via scikit-learn's make_classification).

    Here's a gist, and the output plot is attached above. Basically, it doesn't learn the difference between the classes and performs no better than the baseline (predict one class each time). I tried softmax exploration with different lambdas (since it's not bound by the epsilon-soft convergence bound like an e-greedy policy), with the same results.

    FYI, sklearn's softmax multiclass classifier gets 80% mean accuracy. So I really don't understand what the problem with VW might be (maybe a smaller learning rate is needed?). The documentation and the lack of --audit support in Python certainly don't help.

    https://gist.github.com/ColdTeapot273K/76a91a0416cb9b8ecb114e625f88f4a0

    3 replies
    George Fei
    @georgefei
    Hey team, two quick questions:
    6 replies
    Ignacio Amaya
    @xinofekuator:matrix.org
    [m]
    Is there a way to get the learning rate that vw is using when you train a model in daily batches with save_resume to continue training the old model on newer data? I think vw internally uses some learning-rate decay, but I would like to track what the learning rate is after training in batches for several weeks, to detect when the learning rate is already too low, which would mean training is basically no longer needed.