    K Krishna Chaitanya
    @kkchaitu27
    Num weight bits = 18
    learning rate = 0.5
    initial_t = 0
    power_t = 0.5
    using no cache
    Reading datafile = sampledata.vw
    num sources = 1
    Enabled reductions: gd, scorer, csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf
    average  since         example        example  current  current  current
    loss     last          counter         weight    label  predict features
    [critical] vw (cb_adf.cc:279): cb_adf: badly formatted example, only one cost can be known.
    Why are these reductions csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf
    enabled when I just do --cb actions?
    K Krishna Chaitanya
    @kkchaitu27
    Hi all, I want to deploy a VowpalWabbit model for online learning. I am using historical data to warm-start the model. I want to use that trained model and start doing online serving and learning. I can see mmlspark provides functionality to serve a VW model, but how do I train a VW model in real time? Are there any resources for deploying a VW contextual bandit model in an online learning and serving fashion?
    2 replies
    MochizukiShinichi
    @MochizukiShinichi
    Hello folks, I'm reading VowpalWabbit/vowpal_wabbit#1306 on removing arms in adf format data, but I don't seem to have found a solution in the thread. Could anyone please let me know how to format multi-line examples to indicate that some arms are no longer eligible?
    3 replies
    Owain Steer
    @0wainSteer_twitter
    Hey everyone, I'm currently following the OPE tutorial with both IPS and DR estimators on my own data. I'm finding that the average loss while using DR changes from positive to negative at times when using the default 'squared' loss function, but not when using the 'classic' variant of squared loss. I assumed this is because of the importance-weight-aware updates skewing results, but I don't know how to interpret the negative loss, or whether this should be happening at all. Would appreciate any insight into this, thanks!
    1 reply
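For context, the sign question can be separated from VW itself: with an inverse-propensity estimate over costs (VW minimizes cost, i.e. negative reward), a negative average is perfectly possible. A minimal pure-Python sketch of the IPS estimator, with illustrative names that are not VW's API:

```python
def ips_estimate(samples):
    """IPS estimate of a target policy's average cost.

    samples: list of (cost, logging_prob, target_prob) triples, where
    logging_prob is the probability the logging policy assigned to the
    chosen action and target_prob is the target policy's probability.
    """
    return sum(c * (p_t / p_l) for c, p_l, p_t in samples) / len(samples)

# Costs in [-1, 0] (reward encoded as negative cost) yield a negative estimate:
est = ips_estimate([(-1.0, 0.5, 0.25), (0.0, 0.5, 0.75)])
print(est)  # -0.25
```

DR adds a learned reward model on top of this reweighting, but the sign behaviour is similar: if your costs can be negative, a negative average loss can simply reflect the cost encoding rather than a bug.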
    George Fei
    @georgefei
    Hi all, I have a quick question about cb_explore usage like the following: vw --cb_explore 2 --cover 10 -d train.txt -l 0.1 --power_t 0 --save_resume vw.model --l1 1e-08. Since --power_t is set to 0 and the learning rate doesn't decay, does having --save_resume make any difference to the model performance?
    10 replies
    Rohit Choudhary
    @tanquerey
    Should I use a Compute Optimized or Memory Optimized machine to increase training speed? Or does it not matter?
    2 replies
    Rohit Choudhary
    @tanquerey

    I am trying to load a model from a file

    modelfromfile = pyvw.vw(quiet=True).load('some.model')

    But I am getting the following error --
    AttributeError: 'vw' object has no attribute 'load'

    1 reply
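Hedged note: as far as I know, pyvw.vw has no load method; a model file is normally supplied at construction time via the -i / initial_regressor option, so the call would look something like this sketch (keyword-to-flag mapping assumed from how pyvw forwards options):

```text
modelfromfile = pyvw.vw(quiet=True, i='some.model')
# or equivalently, as a raw argument string:
modelfromfile = pyvw.vw("--quiet -i some.model")
```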
    George Fei
    @georgefei
    Hi everyone, I have a batch contextual bandit problem where we make decisions for one cohort and receive the reward afterwards for everyone in the cohort. During backtesting I noticed the order of the training samples fed into the model matters a lot when using the default update settings (--adaptive, --normalized and --invariant). Different orderings have different optimal hyperparameter choices, and the validation loss also differs a lot. To address this, I was thinking of adopting a nested validation scheme: reshuffle the data multiple times before feeding it into the model and take the hyperparameters that perform best on average. Another solution is to use --bfgs to do batch updates. I believe the second method is preferred; is that correct?
    26 replies
    K Krishna Chaitanya
    @kkchaitu27
    Hi everyone, I am experimenting with contextual bandits with continuous actions using the following line of code.
    2 replies
    vw = pyvw.vw("--cats_pdf 300 --bandwidth 1 --min_value 1 --max_value 300")
    I see that I have to give data for continuous actions as follows: action:cost:pdf_value |[namespace] <features>. What should I give as pdf_value? Can I give it as 1.0, since I do not clearly have a probability density function value for the data I have? How does this pdf_value affect the learning process in the algorithm?
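On pdf_value: for CATS it is the probability *density* of the behaviour policy at the logged action, not a 0-to-1 probability, so a constant 1.0 is only correct if that really was the density. If the logging policy is unknown, one common fallback (an assumption, not VW guidance) is to treat actions as drawn uniformly over the action range:

```python
# Density of a continuous uniform policy over [min_value, max_value];
# the bounds mirror the --min_value/--max_value flags in the question.
min_value, max_value = 1.0, 300.0
uniform_pdf = 1.0 / (max_value - min_value)
print(round(uniform_pdf, 6))  # 0.003344
```

The pdf_value scales the inverse-propensity update, so claiming an inflated density (1.0 instead of roughly 0.0033 here) shrinks the effective importance weight and can slow or bias learning.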
    Bernardo Favoreto
    @Favoreto_B_twitter

    Hey guys, I was wondering... what influences the time to load a model? I've tested model files of different sizes, but it doesn't seem like there's any correlation.

    Is it the total number of weights?

    The number of non-zero weights? (I don't think so, because this is correlated with file size - which isn't correlated with loading time.) And by the way, is there an easy way to count the number of non-zero weights? I'm currently iterating over all weights and counting those that aren't 0
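For the counting sub-question, iterating is essentially the generic approach; a small sketch over a plain list standing in for the weight vector (no pyvw API assumed here):

```python
# Stand-in weight vector; in practice the values would come from
# iterating the model's weights, as the question already describes.
weights = [0.0, 0.3, 0.0, -1.2, 0.0]
nonzero = sum(1 for w in weights if w != 0.0)
print(nonzero)  # 2
```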

    Jack Gerrits
    @jackgerrits
    A model contains a command line which dictates how VW is instantiated, so if the command line is different, the load times for two models can differ depending on what the reductions being set up are doing. For two models with the same command line, the next factors are reduction-specific data (this depends on the reduction, but I would say most of the time this is a constant-time operation) and then the non-zero model weights, as you mentioned.
    4 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hello again, I have a follow-up to my previous question regarding model load time. We're currently building a multi-tenant system to serve VW models, and I was wondering... is there a recommended approach in this scenario? Can we keep models in memory, or do we have to load them for every prediction?
    Any input here is welcome!
    4 replies
    Bernardo Favoreto
    @Favoreto_B_twitter

    Good morning everyone... I have a simple question regarding one of the Personalizer's examples.

    Why does the JSON input in this example define the features as an array of objects? Why not simply an object? Does it have anything to do with namespaces?

    3 replies
    memeplex
    @memeplex

    Hi all, I was wondering how lrq (aka factorization machines) plays with importance-weight-aware updates (https://arxiv.org/pdf/1011.1576.pdf), since the development in that paper is for linear models:

    In this paper we focus on linear models i.e. p = <w, x> where w is a vector of weights

    but lrq models are not linear, given that they involve products of weights. So what about the property:

    Therefore all gradients of a given example point to the same direction and only differ in magnitude.

    that's assumed in the paper? I wasn't able to find any related discussion.
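The non-linearity can be made concrete (a sketch in my own notation; $u$ and $v$ stand for the lrq factor weights crossing namespaces $a$ and $b$ with rank $K$):

```latex
% lrq adds rank-K quadratic terms between namespaces a and b:
p = \langle w, x \rangle
  + \sum_{i \in a} \sum_{j \in b} \sum_{k=1}^{K} u_{ik} \, v_{jk} \, x_i x_j
% The gradient w.r.t. a factor weight depends on the other factor:
\frac{\partial p}{\partial u_{ik}} = x_i \sum_{j \in b} v_{jk} x_j,
\qquad
\frac{\partial p}{\partial w_i} = x_i
```

So the gradients for a single example are no longer all fixed scalar multiples of $x$: the factor-weight gradients move as $v$ moves, and the "same direction" property the importance-weight-aware derivation relies on holds only approximately.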

    Ignacio Amaya
    @xinofekuator:matrix.org
    Hi all, I found this answer (https://stackoverflow.com/questions/48687328/how-much-preprocessing-vowpal-wabbit-input-needs) which says that normalization is used for SGD by default. I was wondering if this is also true for numerical variables in contextual bandits (using cb or cb_explore).
    1 reply
    Tahseen Shabab
    @thshabab_gitlab
    Hi all, great to be a part of this community! I have a quick question regarding the use of the Softmax Explorer in cb_explore_adf. VW documentation states the Softmax Explorer predicts a score indicating the quality of each action. Is this score = reward, or is this score proportional to reward, or to something else? Thank you :)
    6 replies
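For intuition on the score: as I understand the docs, the softmax explorer takes the underlying policy's per-action score and exponentiates it into a pmf, so the score reflects predicted action quality rather than being a raw reward. A sketch of that conversion (lambda_ is a stand-in for the --lambda temperature, not VW's internals):

```python
import math

def softmax_pmf(scores, lambda_=1.0):
    """Turn per-action scores into exploration probabilities."""
    exps = [math.exp(lambda_ * s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

pmf = softmax_pmf([2.0, 1.0, 0.0])
print(pmf)  # the highest-scoring action gets the largest probability
```

Raising lambda_ concentrates probability on the best-scoring action; lambda_=0 gives uniform exploration.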
    Jack Gerrits
    @jackgerrits
    We're running a survey to better understand how VW is used. If you have a moment we'd greatly appreciate your input. More details on the blog: https://vowpalwabbit.org/blog/vowpalwabbit-survey.html
    bef55
    @bef55
    I am trying to perform an active learning task to find the documents that will be most helpful to get labeled. But when I run vw in --active mode, it returns an importance weight of 1.0 for every unlabeled example I feed it. Specifically, I run this command: vw --active --port 6075 --daemon --foreground -i existing_model_filename.ext Then I run python3.9 active_iterator.py localhost 6075 unlabeled_examples_filename.ext. All of the over 800K unlabeled examples return an importance of exactly 1.0, even though the predictions are variable and largely accurate. In the past I have received highly useful and variable importance weights, and I cannot figure out what is wrong now. The only possibility that even occurs to me is that in active_iterator.py I had to change the first line of the recvall function to buf=s.recv(n).decode() from buf=s.recv(n), and I changed the sendall calls from sock.sendall(line) to sock.sendall(line.encode()). Any ideas? Thanks very much.
    16 replies
    Priyanshu Agarwal
    @priyanshuone6
    Hey, I am Priyanshu Agarwal, currently a third-year undergraduate pursuing my bachelor's degree. I want to contribute to this project. Could you please guide me on how to start contributing and point out some beginner-friendly issues to get me started? Thanks!
    2 replies
    Jack Gerrits
    @jackgerrits
    Hi @priyanshuone6, you can take a look at issues marked "good first issue" (https://github.com/VowpalWabbit/vowpal_wabbit/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+First+Issue%22). Leave a comment on any issue you choose to work on so someone else doesn't double up. Thanks! Feel free to ask questions here, or even better in the issue.
    Shloka
    @chicken-biryani
    Hey Everyone! I am Shloka, and I look forward to contributing and learning on the go!
    Ryan Angi
    @rangi513
    Has anyone done any simulations/research on how the different exploration algorithms (epsilon greedy, cover, bag, regcb, regcb-opt, etc.) behave after a policy has been learned and a new action is introduced in the cb_explore_adf setup? I have a feeling that a -1/0 vs a 0/1 cost encoding could cause drastically different behavior (0/1 might create extreme optimism with new options; -1/0 might never explore the new option except with epsilon greedy). If not, I'll work on some of this simulation work myself.
    2 replies
    Themistoklis Mavridis
    @thmavri
    I saw here https://vowpalwabbit.org/rlos/2021/projects.html#16-vw-parallel-parsing-improvements that there is work around parallel parsing. Is the current state usable? Where could we find details about it?
    2 replies
    Sander
    @sandstep1_twitter
    Hello everyone, I am new here. Any ideas for beginner material to understand and use sequence prediction in Python?
    1 reply
    Sander
    @sandstep1_twitter
    Another question: how do I set interactions only for the categorical features of tabular data with both continuous and categorical features? For example, how do I change: vw_squared = VWRegressor(loss_function='squared', interactions='abc')
    4 replies
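Since VW defines interactions over namespaces, one approach (a sketch; the namespace letters and feature names are my own choices) is to route categorical features into one namespace and continuous ones into another, then interact only the categorical namespace with itself:

```text
1.0 |c color=red shape=round |n price:9.99 weight:1.2
```

with something like VWRegressor(loss_function='squared', interactions='cc'), so only namespace c is crossed with itself and the continuous namespace n is kept out of the quadratic terms.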
    Sergey Feldman
    @sergeyf
    Howdy. I have a multilabel problem, which can be handled with csoaa in command-line VW. Does vowpalwabbit.sklearn_vw.VWMultiClassifier support this kind of input - i.e. where each row/sample has multiple labels? It seems like VWMultiClassifier.fit only supports a 1-d y.
    5 replies
    Ryan Angi
    @rangi513
    For anyone who is interested, I've run a few VW simulations using different exploration algorithms/hyperparameters in an environment where I add a new option to a personalization bandit after 10k decisions. I saw some odd behavior with regcb and regcb-opt (I'm not sure the mellowness parameter actually works?), but online cover seems to perform well overall when the hyperparameters are tuned. Feel free to reach out if you're interested in different simulations or have more questions on my methodology. The report and plots are in the attached HTML file.
    4 replies
    Moksh Pathak
    @moksh-pathak
    Hey everyone! I'm Moksh Pathak, a 2nd-year CSE undergraduate at NIT Rourkela. I like to make and break things, play with code, and learn new concepts and technologies. I'm looking forward to using and contributing to this project, and would love guidance and advice on how to contribute.
    I was going through some beginner-friendly issues and came across this documentation issue, which seems doable for someone like me.
    Link - VowpalWabbit/vowpal_wabbit#3352
    I would like to work on this issue, and it would be great if anyone could point me in the right direction and to the relevant files.
    Thanks :)
    Alexey C
    @ColdTeapot273K
    [attachment: output plot]

    Hello everyone. I'm having problems with convergence for --cb_explore_adf on a simple synthetic dataset of 3 classes and 5 features, all informative (the data was simulated via scikit-learn's make_classification).

    Here's a gist, and the output plot is attached above. Basically, it doesn't learn the difference between classes and performs no better than baseline (predicting one class each time). I tried softmax exploration with different lambdas (since it's not bound by the epsilon-soft convergence bound like an e-greedy policy); same results.

    FYI, sklearn's softmax multiclass classifier gets 80% mean accuracy. So I really don't understand what the problem with VW might be (maybe a smaller learning rate is needed?). The documentation and the lack of --audit support in Python certainly don't help.

    https://gist.github.com/ColdTeapot273K/76a91a0416cb9b8ecb114e625f88f4a0

    4 replies
    George Fei
    @georgefei
    Hey team, two quick questions:
    6 replies
    Ignacio Amaya
    @xinofekuator:matrix.org
    Is there a way to get the learning rate that VW is using when you train a model in daily batches, using save_resume to continue training the old model with newer data? I think VW internally applies some learning-rate decay, but I would like to track what the learning rate is after training in batches for several weeks, to detect when the learning rate is already so low that further training is basically not needed anymore.
    5 replies
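For reference, VW's documented per-example schedule for the gd reduction is (as I understand it; treat the exact formula as an assumption and check it against your VW version) eta_t = eta * decay^epoch * (initial_t / (initial_t + t))^power_t, where t is the sum of importance weights seen so far. A sketch for tracking the decayed rate offline:

```python
def vw_learning_rate(eta, examples_seen, power_t=0.5, initial_t=1.0,
                     decay=1.0, epoch=0):
    """Sketch of VW's gd learning-rate schedule.

    initial_t=1.0 is chosen here to avoid the degenerate 0 default;
    with unit-weight examples, t is just the example count.
    """
    t = initial_t + examples_seen
    return eta * (decay ** epoch) * (initial_t / t) ** power_t

rate = vw_learning_rate(eta=0.5, examples_seen=10000)
print(rate)  # shrinks roughly as 1/sqrt(t) with power_t = 0.5
```

Note this ignores the per-feature adjustments that --adaptive and --normalized make; it only captures the global decay you would monitor across batches.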
    anamika-yadav99
    @anamika-yadav99
    Hi! I'm Anamika. I'm a 3rd year undergraduate student enthusiastic about open source. I'm new here and I want to contribute to this project, could you please guide me on how to start contributing and point out some beginner-friendly issues to get me started on this project? Thanks!
    Jack Gerrits
    @jackgerrits
    Hi @anamika-yadav99, welcome! You can start with the Good first issue label and see if anything interests you.
    bef55
    @bef55
    Hello, I am trying to install the command-line version of VW on Windows 10. I followed all the instructions, and everything appeared to work properly. But I cannot find vw.exe anywhere, and I am confident it does not exist because `find . | grep vw.exe` turned up nothing. Any ideas what might have happened?
    19 replies
    Sander
    @sandstep1_twitter

    Can you help me understand why I hit a bug on Windows 10 with vowpalwabbit 8.10.2, or whether I am doing something wrong?

    This code runs correctly and takes a few seconds:

    from vowpalwabbit import pyvw
    ridge_lambda = 0.001
    loss_function = 'squared'

    VWmodel_simple = pyvw.vw(
        learning_rate = 0.1,
        l2 = ridge_lambda,
        loss_function = loss_function,
    )

    But this code does not run correctly:

    VWmodel_simple = pyvw.vw(
        cache_file = 'my_VW_2.cache',
        passes = 3,
        learning_rate = 0.1,
        l2 = ridge_lambda,
        loss_function = loss_function,
    )

    The run never finishes. What is wrong with creating the file 'my_VW_2.cache'?

    The command-line output is:

    using l2 regularization = 0.001
    Num weight bits = 18
    learning rate = 0.1
    initial_t = 0
    power_t = 0.5
    decay_learning_rate = 1
    creating cache_file = my_VW_2.cache
    Reading datafile =
    num sources = 1
    Enabled reductions: gd, scorer
    average  since         example        example  current  current  current
    loss     last          counter         weight    label  predict features

    2 replies
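One likely culprit, offered as a guess: the log shows "Reading datafile =" with no file, and multiple passes need a data source to fill the cache, so VW may be blocking while waiting on stdin. A sketch of the same call with an explicit data file added ('my_data.txt' is a placeholder; the d keyword is assumed to forward to VW's -d option):

```text
VWmodel_simple = pyvw.vw(
    d = 'my_data.txt',
    cache_file = 'my_VW_2.cache',
    passes = 3,
    learning_rate = 0.1,
    l2 = ridge_lambda,
    loss_function = loss_function,
)
```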
    Munu Sairamesh
    @musram:matrix.org
    Hi all,
    I am using the Vowpal Wabbit contextual bandit. As I have around 10,000 arms, I am using --cb_explore_adf since it supports action features. If I have context x = a, I will use arms 1-300; for context x = b, arms 200-400. Each context maps to some subset of the possible arms. The documentation says you can add and remove arms with --cb_explore_adf, but I am still not clear on how. Can anybody help me?
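In adf format the eligible arms are exactly the action lines included in that multiline example, so "removing" an arm means omitting its line for that context. A hypothetical sketch (namespace and feature names are mine):

```text
shared |Context x=a
0:0.5:0.25 |Action arm=1
|Action arm=2
|Action arm=3

shared |Context x=b
|Action arm=2
|Action arm=3
|Action arm=4
```

A blank line ends each example; the second example simply no longer lists arm=1, so it cannot be chosen there, while arm=4 becomes newly available.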
    Munu Sairamesh
    @musram:matrix.org

    Hi. I am following https://vowpalwabbit.org/tutorials/cb_simulation.html. I need to save the model and use it later, as the reward (feedback) comes to the system a few hours later.
    vw1 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.2 save_resume=True")

    num_iterations = 5000
    ctr = run_simulation(vw1, num_iterations, users, times_of_day, actions, get_cost)

    But when I save the model and use it again with a different epsilon, it's not using the weights of the saved model:
    vw1.save("data/saved_model.model")
    vw2 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.8 i=data/saved_model.model")
    1 reply
    Can anybody help here?
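Hedged observation: --save_resume and -i are command-line flags, not key=value pairs, so inside the argument string they would normally be written like this (same paths as in the question):

```text
vw1 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.2 --save_resume")
vw1.save("data/saved_model.model")
vw2 = pyvw.vw("--cb_explore_adf -q UA --quiet --epsilon 0.8 -i data/saved_model.model")
```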
    Munu Sairamesh
    @musram:matrix.org
    Thanks jackgerrits.
    Alexey C
    @ColdTeapot273K

    Hello, can anyone tell me how to make a cb_explore_adf agent respond to requests in daemon mode properly? I send the multiline examples via echo ... | netcat ... as in the documentation and get no response.

    If I launch with the --audit flag I receive a bunch of info with unintuitive formatting (see attachment). I assume the first value in each line is the action probability, and the very last line is some combined gradients or suchlike. Very different from a pmf output, like in the Python example on the website.

    [attachment: audit output]
    Max Pagels
    @maxpagels_twitter
    Any idea how to get the holdout loss from pyvw?
    foo = pyvw.vw("--ccb_explore_adf -d train.gz --passes 2 -c -k --enable_logging")
    foo.get_sum_loss() / foo.get_weighted_examples() # not the holdout loss (checked against a CLI run)
    2 replies
    Alexey C
    @ColdTeapot273K

    Sorry, I couldn't attach the image to my previous message thread.

    A problem:
    For some reason I get a pmf of size n+1 for data with n distinct actions.

    Details:
    When I train cb_explore_adf on datapoints with only 3 actions (no features apart from shared|...) and supply one of those examples for testing (action:cost:proba removed, obviously), I get 4 actions in the output file. Why might that be?

    [attachment: image.png]

    5 replies