    Paras Chopra
    @paraschopra_twitter
    Thanks for your reply @jackgerrits
    I have a follow-up question. I was reading about the hashing trick and am unable to understand why it should work. Also, my intuition suggests that a hash like SimHash, which hashes similar content to nearby values, should be a better choice, but VW seems to use MurmurHash3. Any idea why something like SimHash is not used? MurmurHash3 would hash very similar features like country=usa and country=uk far apart, while SimHash would hash them closely.
    Jack Gerrits
    @jackgerrits
    @paraschopra_twitter based on this mailing list message it seems that the primary motivator for the hash is speed and avoiding collisions: https://groups.io/g/vowpalwabbit/message/903?p=,,,20,0,0,0::Created,,murmur,20,2,0,50252301
    I don't know if locality will matter too much in practice; even when features are similar enough that a locality-sensitive hash would place them close together, I feel the feature space is still too sparse to get wins here. But I could be wrong.
    As far as I know the hashing trick is about mapping strings to indices, which the hashing achieves
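    To make the "strings to indices" point concrete, here is a minimal sketch of the hashing trick (illustration only: VW itself uses MurmurHash3 and a configurable number of bits via -b; Python's built-in hash stands in for the hash function here).

    num_bits = 18                      # VW defaults to -b 18, i.e. 2^18 feature slots
    num_slots = 1 << num_bits

    def feature_index(feature_name):
        # Map an arbitrary feature string into the fixed-size weight vector.
        return hash(feature_name) % num_slots

    # Similar strings generally land far apart, as discussed above.
    print(feature_index("country=usa"), feature_index("country=uk"))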
    Jack Gerrits
    @jackgerrits
    I haven’t heard of simhash before. I’d love to see a comparison of it used in vw.
    Paras Chopra
    @paraschopra_twitter
    Thanks @jackgerrits - let me read the link
    Hanan
    @hanan-vian
    I have made a Docker image with a Jupyter notebook server and a notebook containing the tutorial examples. Should I PR this?
    Jack Gerrits
    @jackgerrits
    Hi @hanan-vian, how do they compare to the files here? https://github.com/VowpalWabbit/jupyter-notebooks
    This repo uses a Dockerfile that Binder loads
    Jack Gerrits
    @jackgerrits
    @paraschopra_twitter this is also a very good reference on previous investigations into hashing https://github.com/VowpalWabbit/vowpal_wabbit/wiki/murmur2-vs-murmur3
    Anshul Patel
    @anshulp2912
    Hi @all, I am working on a project that requires a dynamic multi-class text classification model, i.e. new class types will be added as requirements evolve. Can anyone help me check whether Vowpal Wabbit is capable of handling these cases?
    Any help would be much appreciated!
    Paul Mineiro
    @pmineiro
    @anshulp2912 the --csoaa_ldf and --wap_ldf modes use label-dependent features, which allow you to specify a dynamic set of labels on each example.
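    As a rough sketch (not a verified recipe) of how that looks from Python, assuming the bindings accept a list of example strings for multiline reductions the same way they do for cb_adf, and with made-up feature names:

    from vowpalwabbit import pyvw

    vw = pyvw.vw("--csoaa_ldf multiline")
    example = [
        "1:0.0 | tokens_of_class_one",    # label_id:cost | features for that candidate label
        "2:1.0 | tokens_of_class_two",
    ]
    vw.learn(example)
    # A later example can simply list an additional candidate, e.g. "3:1.0 | ...",
    # because the label set is declared per example rather than fixed up front.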
    wangtianyu61
    @wangtianyu61
    Hello @all, I have a question about the Vowpal Wabbit Python implementation. Since my command-line build is running into some difficulties, I am wondering if I can perform training in Python without explicitly looping over each example for a CB problem, like in this question (https://stackoverflow.com/questions/62679806/does-vowpal-wabbits-python-interface-support-a-single-call-for-model-training-t).
    Eric Madisson
    @Banjolus_twitter

    Hi there, I have been pointed to this channel regarding my query; it seems this is a perfect question for you.

    For my online advertisement project I'm using Thompson Sampling to try to optimise which of my headlines yields the most clicks. I chose Thompson Sampling because, after reading several papers, it looks like one of the best-suited approaches when there is delayed feedback (I can only update the TS parameters every 30 minutes).

    My question is: is there a general formula for TS that would tell me how many experiences (ad impressions in my scenario) I would need in order for the optimisation to be considered effective?

    or "What is the minimum time and number of impressions needed for the optimization to be effective?"
    Jack Gerrits
    @jackgerrits
    Hey @wangtianyu61, just posted an answer on the SO question
    Paul Mineiro
    @pmineiro
    @Banjolus_twitter : your question is somewhat OT (it's not a perfect question for us, because VW uses algorithms other than TS). Here's a brief answer anyway. TS maintains a distribution over possible arm values, samples from it, and acts greedily. The amount of data you need depends upon the distribution being maintained, the number of arms, and what the true differences in arm value are (large-margin best actions can be detected more quickly). In the multi-armed-bandit case (no contexts), where each arm is assigned a beta(a,b) distribution with "a = number of actual clicks plus prior initial click count" and "b = number of actual impressions plus prior initial impression count", you can plot the beta distribution to see how concentrated your posterior becomes on a single arm, and do simulations to see how often you play a certain arm and how much reward you get over time. Looking at reward rather than "did I play the best arm" is important because you care less about being confused between two arms that have similar values.
    @Banjolus_twitter : p.s., the algorithms used in VW work just fine when there is delayed feedback.
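    As a toy illustration of the simulation approach described above (Bernoulli Thompson Sampling with Beta posteriors; the per-headline click rates below are made up), one could run something like:

    import numpy as np

    rng = np.random.default_rng(0)
    true_ctr = [0.030, 0.025, 0.010]        # hypothetical click-through rates per headline
    clicks = np.ones(3)                     # Beta prior: a = 1 per arm
    misses = np.ones(3)                     # Beta prior: b = 1 per arm

    for impression in range(50_000):
        samples = rng.beta(clicks, misses)  # draw one value per arm from its posterior
        arm = int(np.argmax(samples))       # act greedily on the sampled values
        reward = rng.random() < true_ctr[arm]
        clicks[arm] += reward
        misses[arm] += 1 - reward

    print("plays per arm:", (clicks + misses - 2).astype(int))
    print("posterior mean CTR:", clicks / (clicks + misses))

    Plotting the three Beta posteriors every few thousand impressions gives a visual sense of when the optimisation has effectively settled on one headline.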
    Eric Madisson
    @Banjolus_twitter

    Hi @pmineiro thank you for the clarification. I haven't had the chance yet to fully play with VW because (for now) I just want to do something very basic (with no context), so I went with TS, which was simple to understand and to code.

    Plotting the distribution is what I'm currently doing to see how the variance evolves over time, and the only way I've found so far is, like you mentioned, to simulate with different parameters to see how many impressions are needed to reach, for example, 95% confidence.
    But I was wondering if there are some general formulas that could apply to any MAB algorithm.

    Paul Mineiro
    @pmineiro
    @Banjolus_twitter there are PAC bounds that are insightful but not particularly useful practically.
    Allegra Latimer
    @alatimer
    Hi all, I have a question about the --eval flag. I am trying to evaluate the policy that generated a training data file in cb_explore_adf (multi-line) format, i.e. accumulate the loss from whatever action was taken historically, together with its logged probability. I couldn't find any documentation on how to do this specifically for multi-line examples/cb_adf on the wiki. Naively trying the eval flag seems to train a model---not what I want; I just want to evaluate the loss of the historical data. I could do this manually, but I have a VW pipeline set up and it would be great if this fit into it. Any thoughts?
    e.g. something like the following: "vw --eval --progress 100 --cb_explore_adf --binary --loss_function logistic -d train.dat"
    Paul Mineiro
    @pmineiro
    @alatimer : Just checked, looks like --eval is only supported for --cb and not --cb_adf. So the flag is being silently ignored. The difference is this line in cb_algs.cc versus the lack of anything similar in cb_adf.cc
    Allegra Latimer
    @alatimer
    Thanks @pmineiro! That makes sense, in that case I'll go the manual evaluation route.
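    For reference, a minimal sketch of what the manual route might look like (hypothetical field names; it assumes each logged event records the chosen action, its cost, and the probability it was logged with, and that the policy being evaluated is deterministic):

    def ips_estimate(logged_events, target_policy):
        # Inverse propensity scoring over logged cb_adf-style events.
        total = 0.0
        for event in logged_events:
            chosen = target_policy(event["context"])           # action the evaluated policy would pick
            if chosen == event["logged_action"]:
                total += event["cost"] / event["probability"]  # importance-weight the observed cost
        return total / len(logged_events)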
    Allegra Latimer
    @alatimer
    Hi again all---does anyone know whether I can use the --bootstrap flag with cb_adf (or with the cb functionality generally)? It would be great to have an idea of the trained model's variance. Aside from taking longer to train, my stdout doesn't seem to change at all with or without the flag. E.g., I would like to run a command like vw --cb_explore_adf --bootstrap 100 -d train.dat and get out confidence intervals on the final PVL.
    wangtianyu61
    @wangtianyu61
    Hi all. I am wondering if the command-line version of VW for online contextual bandits can return the loss at each time t (rather than just the final PV loss) with a command like vw --cbify N -d <dataset path> --epsilon 0.05.
    Allegra Latimer
    @alatimer
    Hi @wangtianyu61, do you mean just having stdout print the loss for every example? If so, you can use the --progress flag, e.g. vw --cbify N -d data_path --epsilon 0.05 --progress 1
    The column "since last" prints the average loss since the last printout, so in the case of --progress 1 that is the loss of each training example
    wangtianyu61
    @wangtianyu61
    Thanks @alatimer ! That works well.
    Jeroen Janssens
    @jeroenjanssens

    Hi everybody,

    I'm new here! It's been a few years since I've used VW so I'm really glad I have found this community :) I'm currently writing the second edition of my book Data Science at the Command Line and VW will play a big role in Chapter 9: Modeling Data. I'm also working on the Data Science Toolbox which will include VW and many other command-line tools.

    I was wondering, when installing VW via pip, is the command-line tool vw also installed? The documentation seems to suggest so, but I'm unable to locate it. I'm on Ubuntu.

    Thanks,

    Jeroen

    Jack Gerrits
    @jackgerrits
    Hi @jeroenjanssens, welcome back! pip will not install the command line tool as far as I know. https://vowpalwabbit.org/start.html has info about how to get the C++/command line tool by building from source (or brew on MacOS). Please feel free to reach out to me if you have any more questions!
    Jeroen Janssens
    @jeroenjanssens

    Thanks @jackgerrits, that's good to know. Building from source works great on Ubuntu, so I'll just stick to that.

    The Getting Started tutorial assumes that the command-line tool is installed. Would it be a good idea to add a note that clarifies that?

    Jack Gerrits
    @jackgerrits
    There is a prerequisites note on the tutorial page - do you think it needs to be more prominent?
    Jeroen Janssens
    @jeroenjanssens
    I think it would be helpful to mention that the command-line tool is needed for this tutorial and that installing VW via pip is not sufficient. Related thought: would it be possible and desirable to let pip install the command-line tool as well?
    Jack Gerrits
    @jackgerrits
    Okay, I'll make a note to review the wording there. I am not sure if we want to distribute the CLI as part of the Python package or not, but I agree it would be good to have an easier way to get the CLI executable.
    I created an issue to track here: VowpalWabbit/vowpalwabbit.github.io#153
    Jeroen Janssens
    @jeroenjanssens
    Excellent @jackgerrits!
    pmcvay
    @pmcvay
    On the wiki page, it states that squared loss is the default loss function for VW. Is this true even for binary classification? I've always thought that using squared loss for binary classification is frowned upon.
    Srinath
    @SrinathNair__twitter
    Hi Everyone,
    I have a very basic question. Based on what I have understood, the goal of a contextual bandit algorithm is to find the best policy among a policy class, i.e. the one that provides the maximum average reward over a period of time.
    So, what is the policy class used by Vowpal Wabbit's contextual bandit tool? Is it a neural network, a decision tree, or something else?
    Allegra Latimer
    @alatimer
    Hi @SrinathNair__twitter , try reading the Bake-off paper (https://arxiv.org/abs/1802.04064), it does a good job of explaining VW's CB implementation
    Yiqiang Zhao
    @YiQ-Zhao

    Hi all, I’m a newbie to contextual bandits and learning to use VW.
    Could anyone help me understand if I'm using it correctly?

    Problem: I have a few hundred thousand historical data points and I want to use them to learn a warm-start model. I saw some tutorials on the wiki showing how to use the CLI, but I wonder if I can use the Python version in this way, assuming the data has been formatted:

    from vowpalwabbit import pyvw

    # Warm start from logged data: plain --cb (no exploration) with IPS cost estimation.
    vw = pyvw.vw("--cb 20 -q UA --cb_type ips")
    for example in historical_data:   # each example already in VW cb text format
        vw.learn(example)

    My questions are:
    1) Is this the correct way to warm start the model?
    2) If so, what probability should I use for each training instance? If the logging policy was deterministic, I guess it would be 1.0?
    3) For exploitation/exploration after having this initial model, can I save the policy and then apply --cb_explore 20 -q UA --cb_type ips --epsilon 0.2 -i cb.model to continue learning?

    Thanks for the help in advance!
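    For what it's worth, here is a hypothetical sketch of the two-stage workflow the question proposes (stage 1 uses the standard -f flag to write the model; whether stage 2 continues learning as intended is exactly what question 3 asks and is not confirmed here):

    from vowpalwabbit import pyvw

    # Stage 1: warm start offline on the historical data and write the model out with -f.
    warm = pyvw.vw("--cb 20 -q UA --cb_type ips -f cb.model")
    for example in historical_data:
        warm.learn(example)
    warm.finish()   # flushes the learner and writes cb.model

    # Stage 2 (question 3) would then reload cb.model with the exploration flags,
    # e.g. "--cb_explore 20 -q UA --cb_type ips --epsilon 0.2 -i cb.model".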

    Srinath
    @SrinathNair__twitter

    Hi guys, I am working on a project similar to a news recommendation engine, which predicts the most relevant articles given a user feature vector. I wanted to use VW's contextual bandits for this.
    I have tried using VW, but it seems that VW only outputs a single action per trial. Instead, I wanted some sort of ranking mechanism so that I can get the top k articles per trial.

    Is there any way to use VW for such use case?

    I have asked this question in stackoverflow as well. (https://stackoverflow.com/questions/63635815/how-to-learn-to-rank-using-vowpal-wabbits-contextual-bandit )
    Thanks in Advance.

    Avighan Majumder
    @AvighanMajumder_twitter
    Is there any good technical literature on the package? Can anyone suggest a good place to look for learning about Vowpal Wabbit?
    Max Pagels
    @maxpagels_twitter

    Hi! Thanks to VW authors for the CCB support, finding it very useful!

    Quick question: how is offline policy evaluation handled for CCBs in VW? IPS, DM, something else? I was wondering if there is a paper I can read about this. I was looking into https://arxiv.org/abs/1605.04812 but wasn't sure this estimator is the one VW uses specifically for CCBs.

    Paul Mineiro
    @pmineiro
    @maxpagels_twitter : re OPE in CCB, great question. CCB currently uses a sum-over-IPS estimate on each slot independently, which is biased (it doesn't account for effects of earlier actions on subsequent actions). We're investigating alternate strategies, so this might change in another release. The slates estimator you reference is distinct: in slates there is a single reward (not per slot) and the pseudoinverse does a form of credit assignment. Slates will be released eventually as a distinct feature.
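    A rough sketch of the per-slot idea, with hypothetical field names, just to make "sum-over-IPS on each slot independently" concrete:

    def ccb_sum_over_ips(logged_episodes, target_policy):
        # Each slot is importance-weighted on its own and per-slot estimates are summed,
        # ignoring how earlier slot decisions influence later ones (hence the bias).
        total = 0.0
        for episode in logged_episodes:
            for slot_index, slot in enumerate(episode["slots"]):
                chosen = target_policy(episode["context"], slot_index)
                if chosen == slot["logged_action"]:
                    total += slot["cost"] / slot["probability"]
        return total / len(logged_episodes)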
    Max Pagels
    @maxpagels_twitter

    @pmineiro excellent, thanks for the response.

    A second question: let's say I have collected bandit data from several policies deployed to production one after the other; thought of as a whole, the logged data is nonstationary.

    • Can I use all of the logged data to train a new policy, even though the logged data is generated by X different policies? If so, are ips/dm/dr all acceptable choices or do they break against nonstationary logged data?

    • How about offline evaluation of a policy? This paper https://arxiv.org/pdf/1210.4862.pdf suggests that IPS can't be used; is explore_eval the right option?

    What I'm looking for is the "correct" way for a data scientist to offline test and learn new policies, possibly with different exploration strategies, using as much data as possible from N previous deployments with N different policies. The same question also applies to automatic retraining of policies on new data as part of a production system; I'm unsure of the "proper" way to do it.

    Paul Mineiro
    @pmineiro
    @maxpagels_twitter: first, regarding offline evaluation: IPS (and DR) is a martingale, so the estimator is unbiased even if the behaviour policy is changed on every decision. The only thing prohibited is that the behaviour policy "looks into the future". However, this assumes the world is IID, producing (context, reward vector) pairs, and then the behaviour policy draws a on p(a|x) and reveals r_a. If the world is actually nonstationary, then even if the behaviour policy is constant, IPS can be biased. Furthermore, DM is typically biased. Note that unbiased isn't everything; a biased estimator can have better overall accuracy.
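    A toy check of that first point, with made-up numbers: the logging policy switches halfway through, yet the IPS estimate of a fixed target policy still converges to its true value.

    import numpy as np

    rng = np.random.default_rng(1)
    true_reward = [0.7, 0.3]                 # hypothetical expected rewards for two actions
    ips_terms = []
    for t in range(20_000):
        p0 = 0.9 if t < 10_000 else 0.2      # behaviour policy changes over time
        action = 0 if rng.random() < p0 else 1
        prob = p0 if action == 0 else 1 - p0
        reward = float(rng.random() < true_reward[action])
        # Evaluate the deterministic target policy "always play action 0".
        ips_terms.append(reward / prob if action == 0 else 0.0)
    print(np.mean(ips_terms))                # close to 0.7 despite the policy switch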
    @maxpagels_twitter: second, regarding learning new policies and automatically retraining: Azure Personalizer Service is VW wrapped in a system that does this. It uses the IPS estimator along with counterfactual evaluation to test CB algorithms offline, which supports model selection strategies similar to supervised learning. It's a pain in the butt to get all this right, so just use the product; that's why we made it.
    Max Pagels
    @maxpagels_twitter

    Nice, thanks! I've used the Personalizer service, just curious how it works under the hood. So with IPS & DM it's OK to train a model on logged dataset A -> deploy the model -> collect logged data B -> train on A+B -> repeat with an ever-growing dataset?

    What is the purpose of explore_eval then?