    Max Pagels
    @maxpagels_twitter
    Got some time on my hands; is there a need for a tutorial specifically on progressive validation? I feel some people might be caught out by it, namely the fact that the final regressor in such a setup isn't necessarily good (for lack of a better word). I've seen setups that use incremental learning offline, which is only really advisable if you fully understand the implications (mostly, you are better off using a standard train/test validation procedure).
    3 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys, question about CCBs... is it expected that using the same policy configuration on the same dataset will yield the exact same results in multiple runs? I've been observing this and wondering if something's wrong with my pipeline.
    I thought it shouldn't be, due to exploration, which inherently brings randomization to the process; but since VW handles this under the hood for CCB, I'm not sure.
    To further explain: I'm training a new policy every time (with the same hyperparameters) on the same data, ordered the same way. The results (even the action distribution) are always the same.
    Is this expected?
    2 replies
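    (For reference, VW's exploration pseudo-randomness is seeded, so the same command on the same data in the same order is expected to reproduce exactly; a minimal sketch of varying the seed instead, assuming the command-line interface and hypothetical file names:)

        # identical runs are deterministic; changing --random_seed changes exploration
        vw --ccb_explore_adf -d train.dat --random_seed 1 -p preds1.txt
        vw --ccb_explore_adf -d train.dat --random_seed 2 -p preds2.txt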
    Sushmita
    @Sushmita10062002
    Hello everyone, I am Sushmita and I am currently studying at Ramjas College, Delhi University. I know Python, Java, machine learning, deep learning, and TensorFlow. I am new to open source, but I really want to contribute to this organisation. Could you please point me in the right direction?
    2 replies
    Bernardo Favoreto
    @Favoreto_B_twitter
    What is the correct way to offline evaluate a policy with exploration?
    My goal is to estimate the average reward of different models and compare that with the current online policy (which can be uniformly random) to assess if it's appropriate to deploy a new policy or not.
    I've been using a custom pipeline that trains for multiple passes using rejection sampling and then assesses the model's performance on a test set, but I'm a bit unsure due to the shift in the contexts' distribution induced by the rejection sampling.
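    (A minimal sketch of the rejection-sampling step described above, with hypothetical names; logged events are assumed to be (context, action, logging probability, reward) tuples, and M is an upper bound on the probability ratio:)

        import random

        # Replay a logged bandit event only when a uniform draw falls under the
        # target policy's probability ratio; the kept events then look as if
        # they had been collected under the target policy.
        def rejection_sample(events, target_prob, M=1.0):
            kept = []
            for context, action, p_log, reward in events:
                p_tgt = target_prob(context, action)  # target policy's P(action | context)
                if random.random() < p_tgt / (M * p_log):
                    kept.append((context, action, reward))
            return kept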
    MochizukiShinichi
    @MochizukiShinichi
    Happy holidays everyone! Could anyone please share some knowledge on how to read the logs for contextual bandits in VW? Specifically: 1. does the 'average loss' represent the average reward value, estimated counterfactually? 2. Is it calculated on a holdout set? 3. If 1 holds, where can I find the loss of the underlying classifier/oracle? Thanks in advance!
    jonpsy
    @jonpsy:matrix.org
    [m]
    Hello VW team, can someone please explain this format in CCB:
    ccb shared | s_1 s_2
    ccb action | a:1 b:1 c:1
    ccb action | a:0.5 b:2 c:1
    ccb action | a:0.5 
    ccb action | c:1
    ccb slot  | d:4
    ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7
    Feeding this to the algorithm prints this:
    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    I got [warning] Unlabeled example in train set, was this intentional?, which I didn't understand.
    What exactly is the slot argument? What does ccb slot | d:4 mean? Does it mean this is the fourth slot? And in L7 (the last slot line), what's happening? Plus that warning. Would be really grateful for an explanation, thanks
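    (For reference: in the CCB text format, a slot line carries an optional label of the form chosen_action:cost:probability, optionally followed by further action:probability pairs, then an optional comma-separated list of the action ids eligible for that slot, all before the | and the slot's features. Read that way, the example above means, roughly:)

        ccb shared | s_1 s_2                    # features shared across all slots
        ccb action | a:1 b:1 c:1                # action 0
        ccb action | a:0.5 b:2 c:1              # action 1
        ccb action | a:0.5                      # action 2
        ccb action | c:1                        # action 3
        ccb slot | d:4                          # unlabeled slot: no label, any action
                                                # eligible; this is what triggers the
                                                # "Unlabeled example" warning in a train set
        ccb slot 1:0.8:0.8,0:0.2 0,1,3 | d:7    # action 1 chosen with cost 0.8 at
                                                # probability 0.8; action 0 had probability
                                                # 0.2; only actions 0, 1, 3 are eligible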
    jonpsy
    @jonpsy:matrix.org
    [m]
    Thanks a ton, I was also going through your answer on Stack Overflow. I'll be sure to update you on this.
    jonpsy
    @jonpsy:matrix.org
    [m]
    @bassmang: actually, could you review VowpalWabbit/vowpal_wabbit#3546? We can discuss the details over there; let me know if you'd like that.
    andy-soft
    @andy-soft
    Hello VW team, I am using VW from C# for NLP tasks, and there are some examples I could not reproduce because of the constant incompatibilities across VW versions. Some papers published in 2015 claim that VW is an excellent and fast POS tagger, and also claim it is an ultra-fast and precise NERC (Named Entity Recognition and Classification) system. I've implemented the experiments and it is not true: the F1 scores obtained were significantly lower than those published, and that was only in English; as soon as you switch to other languages like Spanish, the thing renders unusable.
    Has anyone experienced issues like this? I am very disappointed because of this.
    I am currently using it for NLP intent detection (an OAA classifier on POS-tagged and morphologically analyzed text).
    Although it runs smoothly and fine, the F1 score must always be re-calculated afterward, and the loss is tricky: you can get a really low loss and a bad F1, and vice versa! Precision/recall behavior is also an issue.
    12 replies
    jonpsy
    @jonpsy:matrix.org
    [m]

    In CCB, what's the difference between the weight of an example vs. its cost? I thought weight was just the inverse of cost?

    I tried feeding the training data from the CCB page, and the weight seems to be equal to the example counter; is that intentional?

    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    Okay, I recall that in the CB example, cost was just the inverse of reward.
    Priyanshu Agarwal
    @priyanshuone6
    Did anyone get a chance to look at my comment?
    3 replies
    andreacimino
    @andreacimino
    Question regarding Conditional Contextual Bandit (CCB).
    As it stands now, from looking at the source code, it seems that the selection of the action for a slot does not take the previously selected actions into account as "features". Let me explain better:
    Suppose there are N slots and M actions (M >= N).
    The algorithm chooses action M_1 in slot N_1; then a decision on slot N_2 must be taken, and the other actions are similar to M_1.
    There is a high chance that in slot N_2 an action similar to M_1 will be taken.
    I would like to pass the decision made at the previous step as "context", to promote "diversity".
    I am not an expert, but I would like to know if someone has experience with this.
    jonpsy
    @jonpsy:matrix.org
    [m]
    @jackgerrits: Hey, thanks for the detailed review. Would you mind reviewing the PR sometime soon? I think it's done; we could merge it today/tomorrow :-)
    jeanjean
    @jeanjeaCyberboy_twitter
    Hey everyone, I have a quick question. I just started using the VW library and managed to extract the audit logs for the CB explore model. I was wondering what the scale of the feature weights is. Does the model assign them at random, and why are there negative values? I couldn't find anything meaningful in the documentation. Would appreciate any assistance.
    4 replies
    jonpsy
    @jonpsy:matrix.org
    [m]
    @bassmang: Hey, just saw you updated the master branch with some major changes for labels.
    I see that you've used Union[example, Costs] instead of using kwargs; should I go by that method too, then?
    1 reply
    Also, we should check in from_example(..) whether the example labelType is what it claims to be. No?
    4 replies
    MochizukiShinichi
    @MochizukiShinichi
    Hi team, my invert_hash output from cb_adf always has one additional line for each feature, without a feature name. What does this mean? Example below:
    Version 8.10.1
    Id 
    Min label:-1
    Max label:1
    bits:18
    lda:0
    0 ngram:
    0 skip:
    options: --cb_adf --cb_type dr --csoaa_ldf multiline --csoaa_rank
    Checksum: 4264491651
    event_sum 0
    action_sum 0
    :0
    s^age:7950:0.108916
    :7951:0.242307
    s^year:39846:-1.6944
    :39847:-1.82654
    1 reply
    jonpsy
    @jonpsy:matrix.org
    [m]
    Hey, do we have a meet today? I haven't been seeing an agenda for a while.
    2 replies
    jonpsy
    @jonpsy:matrix.org
    [m]

    I had a few questions if you guys don't mind.

    a) Are you guys conducting RLOS this year? Not that I'd stop contributing if you aren't, but it certainly gives motivation to contribute :)

    b) On the topic of AutoML: I read the wiki and some merged PRs. I'd like to contribute; how can I help? Since it's quite volatile right now, my PRs may be slow, and I don't want to end up being the bottleneck. Other projects which interest me: Python model introspection, CB in Python.

    c) I'm currently in a 6-month internship, but I think RLOS allows for this to be part-time? Would that be okay?

    Thanks for your answers

    1 reply
    mustaphabenm
    @mustaphabenm

    Hello everyone, I'm trying a toy example to understand the model weights in the context of the CATS algorithm.
    Training data: ca 1.23:-1:0.7 | a:1
    Test data: | a:1
    The vw command: vw --cats 4 --bandwidth 1 --min_value 0 --max_value 32 -d train.vw --invert_hash m.ih -f model.vw --noconstant
    The output is:

    Version 8.11.0
    Id 
    Min label:-1
    Max label:0
    bits:18
    lda:0
    0 ngram:
    0 skip:
    options: --bandwidth 1 --binary --cats 4 --cats_pdf 4 --cats_tree 4 --cb_explore_pdf --get_pmf --max_value 32 --min_value 0 --pmf_to_pdf 4 --sample_pdf --tree_bandwidth 0 --random_seed 2147483647
    Checksum: 2730770910
    :1
    initial_t 0
    norm normalizer 0.357143
    t 1
    sum_loss 0
    sum_loss_since_last_dump 0
    dump_interval 2
    min_label -1
    max_label 0
    weighted_labeled_examples 1
    weighted_labels 1
    weighted_unlabeled_examples 0
    example_number 1
    total_features 1
    total_weight 0.357143
    sd::oec.weighted_labeled_examples 1
    current_pass 1
    a:108232:-0.190479 0.714286 1
    a[1]:108233:-0.190479 0.714286 1

    Can anyone please help me use the weights to get the final result? Thank you

    1 reply
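    (Rather than combining the hashed weight lines by hand, one option is to let VW apply them; a minimal sketch using the Python bindings seen elsewhere in this channel, with the model file name taken from the command above. The exact shape of the returned prediction may vary by version:)

        from vowpalwabbit import pyvw

        # Load the saved CATS model in test-only mode and predict on the test
        # example; for CATS the prediction is a (continuous action, pdf value) pair.
        model = pyvw.vw("-i model.vw -t --quiet")
        print(model.predict("| a:1"))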
    Bernardo Favoreto
    @Favoreto_B_twitter

    Hey guys, just trying out the new VW release (it looks awesome, by the way)!
    I was wondering what the proper way of using --automl is. I tried using --cb_explore_adf --automl 5 --oracle_type one_diff and it didn't give me any warning; is that correct?
    Moreover, is there a way for me to know which interactions the model found to be the best after training with --automl?

    Thank you!

    12 replies
    Max Pagels
    @maxpagels_twitter
    Fantastic work by the VW team on the 9.0 release, congrats!
    Bernardo Favoreto
    @Favoreto_B_twitter

    I just tried to use another experimental feature, experimental_full_name_interactions, but wasn't able to.
    Scenario: I changed one of the namespaces in my dataset to begin with the same letter (Session to Usession, just for testing purposes; I also have the User namespace).
    Then, I tried running the following command:
    vw --cb_explore_adf -c train.dat --passes 5 experimental_full_name_interactions Usession|User -f regular.vw
    And got the following output:
    User: command not found
    Is this a bug or am I doing something wrong?

    Thanks

    18 replies
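    (The User: command not found line comes from the shell, not from VW: without a leading -- the token isn't parsed as an option, and the unquoted | is treated as a pipe into a command named User. A likely fix, assuming the flag spelling used in the message, is to add the dashes and quote the argument:)

        vw --cb_explore_adf -c train.dat --passes 5 --experimental_full_name_interactions 'Usession|User' -f regular.vw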
    George Fei
    @georgefei
    Hey team, quick question: why do I get 4 lines per feature in the invert_hash output if I use the legacy cb option?
    Version 9.0.0
    Id 
    Min label:-2
    Max label:0.189197
    bits:18
    lda:0
    0 ngram:
    0 skip:
    options: --cb 2 --cb_force_legacy --cb_type dr --csoaa 2 --random_seed 123
    Checksum: 955399906
    :1
    initial_t 0
    norm normalizer 508
    t 32
    sum_loss -32.2039
    sum_loss_since_last_dump 0
    dump_interval 64
    min_label -2
    max_label 0.189197
    weighted_labeled_examples 32
    weighted_labels 0
    weighted_unlabeled_examples 0
    example_number 32
    total_features 128
    total_weight 127
    sd::oec.weighted_labeled_examples 32
    current_pass 1
    l1_state 0
    l2_state 1
    d:20940:0.0795723 0.676124 1
    d[1]:20941:-0.253962 17.0199 1
    d[2]:20942:0.0793786 0.305578 1
    d[3]:20943:-0.253947 8.55648 1
    e:69020:0.0795723 0.676124 1
    e[1]:69021:-0.253962 17.0199 1
    e[2]:69022:0.0793786 0.305578 1
    e[3]:69023:-0.253947 8.55648 1
    a:108232:-0.253785 17.065 1
    a[1]:108233:0.0793965 0.706623 1
    a[2]:108234:-0.253947 8.55648 1
    a[3]:108235:0.0793786 0.305578 1
    b:129036:-0.253785 17.065 1
    b[1]:129037:0.0793965 0.706623 1
    b[2]:129038:-0.253947 8.55648 1
    b[3]:129039:0.0793786 0.305578 1
    f:139500:0.0795723 0.676124 1
    f[1]:139501:-0.253962 17.0199 1
    f[2]:139502:0.0793786 0.305578 1
    f[3]:139503:-0.253947 8.55648 1
    Constant:202096:-0.238701 17.7411 1
    Constant[1]:202097:-0.238162 17.7265 1
    Constant[2]:202098:-0.238136 8.86206 1
    Constant[3]:202099:-0.238136 8.86206 1
    c:219516:-0.253785 17.065 1
    c[1]:219517:0.0793965 0.706623 1
    c[2]:219518:-0.253947 8.55648 1
    c[3]:219519:0.0793786 0.305578 1
    34 replies
    Kwame Porter Robinson
    @robinsonkwame
    If anyone knows of a better space to ask, please let me know, but I'm a PhD student interested in funding continued development of a current VW branch that's been sitting since 2020.
    3 replies
    Ryan Angi
    @rangi513

    I was attempting to use the VW estimators library for some bias correction on a pandas DataFrame before building a Q function (model) outside of VW, but I noticed there are no Doubly Robust or MTR methods in this Python library. Is that intentional, or are they named something else and I am missing them?

    Also, a small usage snippet in the README might be useful. I've been reading through basic-usage.py, but it is somewhat difficult to figure things out with just this script.

    Tobias S
    @Tobias2020_gitlab

    Hello VW community,

    We are evaluating the use of Vowpal Wabbit for our recommender system (multi-armed bandit). We want to show different images (combinatorial) and predict which one the user will interact with.
    Going through the documentation, Vowpal Wabbit supports the combinatorial setup with Slates. For the reward, it is stated that:
    "A single, global, reward is produced that signals the outcome of an entire slate of decisions. There's a linearity assumption on the impact of each action on the observed reward."
    I.e. the semi-feedback we need is not supported by default.

    Our question: Is there a way to work with semi-feedback and Slots in Vowpal Wabbit?

    Thank you :)

    2 replies
    Rajan
    @rajan-chari
    Hi Tobias, what do you mean by semi-feedback?
    Debraj Maji
    @snnipetr

    Hello all, I am trying to install Vowpal Wabbit on my local machine and have successfully built it and also run the tests without any failures. However, whenever I try to run make install it gives an error.

    make: *** No rule to make target 'install'.  Stop.

    I am unable to resolve it. Any help would be appreciated.

    5 replies
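    (VW builds with CMake, so make install only works from the directory in which CMake generated the makefiles; a sketch of the usual out-of-source sequence, with the build directory name as an assumption:)

        # from the repository root; 'build' is the out-of-source build directory
        cmake -S . -B build
        cmake --build build
        sudo cmake --install build   # or: cd build && sudo make install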
    musram
    @musram
    I have a doubt regarding the continuous contextual bandit.
    I train the model with bandwidth = 1000
    num_actions = 20
    vw = pyvw.vw("--cats " + str(num_actions) + " --bandwidth " + str(bandwidth) + " -d data/cats.acpx --min_value 0 --max_value 20000 --json --chain_hash --coin --epsilon 0.2 -f saved_model2.model --save_resume -q :: --quiet")
    I want to generate the human-readable model file using invert_hash, and I use pyvw.vw(" -d data/cats.acpx -t -i saved_model2.model --invert_hash model2.humanreadable"). But I am getting output in model2.humanreadable of ['Version 8.11.0\n', 'Id \n', 'Min label:0\n', 'Max label:0\n', 'bits:18\n', 'lda:0\n', '0 ngram:\n', '0 skip:\n', 'options: --bandwidth 1000 --binary --cats 20 --cats_pdf 20 --cats_tree 20 --cb_explore_pdf --coin --epsilon 0.200000002980232 --get_pmf --max_value 20000 --min_value 0 --pmf_to_pdf 20 --quadratic :: --sample_pdf --tree_bandwidth 1 --random_seed 11542015123243797559\n', 'Checksum: 1406756192\n', ':0\n']. Is something wrong with the way I pass the params?
    1 reply
    David Chanin
    @chanind
    Is there a way to convert the .json format to the standard VW text format? I want to verify that I'm using the JSON format correctly with cb_explore_adf, but I can't find any examples of using JSON with namespaces with cb_explore_adf.
    6 replies
    Also, is --json the same as --dsjson?
    musram
    @musram
    I have taken this example from https://github.com/VowpalWabbit/jupyter-notebooks/blob/master/cats_tutorial.ipynb. With the command vw = pyvw.vw("--cats_pdf " + str(num_actions) + " --bandwidth " + str(bandwidth) + " --min_value 0 --max_value 100 --json --chain_hash --coin --epsilon 0.2 -q :: ")
    I printed the actions and probs, which look like [(0.0, 0.5625, 0.0020000000949949026), (0.5625, 2.5625, 0.4020000100135803), (2.5625, 100.0, 0.0020000000949949026)]. The last value in each tuple corresponds to the pdf. Now, how do you sample using this pdf? If I am not wrong, to sample from a pdf we draw x ~ unif(0,1) and then cdf-inv(x) is the sample. Is a similar thing done here? If yes, there should be parameters for this pdf; e.g. a Gaussian has a mean and variance. How do I get those?
    2 replies
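    (Those tuples describe a piecewise-constant pdf, (left, right, density) per segment, so inverse-CDF sampling reduces to picking a segment in proportion to its mass, density * width, and then drawing uniformly inside it; there are no distribution parameters beyond the segments themselves. A minimal sketch:)

        import random

        # Sample from a piecewise-constant pdf given as [(left, right, density), ...],
        # e.g. [(0.0, 0.5625, 0.002), (0.5625, 2.5625, 0.402), (2.5625, 100.0, 0.002)].
        def sample_piecewise_pdf(segments):
            masses = [(right - left) * d for left, right, d in segments]
            target = random.random() * sum(masses)  # inverse-CDF draw
            acc = 0.0
            for (left, right, _), m in zip(segments, masses):
                if acc + m >= target:
                    # density is flat within a segment, so sample uniformly inside it
                    return left + (target - acc) / m * (right - left)
                acc += m
            return segments[-1][1]  # guard against floating-point round-off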
    David Chanin
    @chanind
    I'm trying to do off-policy evaluation for a contextual bandit, as described here: https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/tutorials/off_policy_evaluation.html. However, when I follow this guide, it says the average loss is negative. What does it mean to have a negative loss? I thought loss could only be positive? Does a larger negative number mean better or worse results if the loss is negative?
    8 replies
    Atharv Sonwane
    @threewisemonkeys-as

    Hello. I had a question about the Compiler Optimisation project for RLOS Fest.

    I wanted to clarify the aims for the project. From the project description: "We will develop RL agents using VowpalWabbit" and "Implement VW agents for CompilerGym".

    Now the tasks in CompilerGym such as Optimisation Phase Ordering are multi-state MDPs which are usually tackled with full RL approaches (as opposed to Contextual Bandits).
    The closest method I could find to full RL in VW is learning to search. However, these are all imitation-style learning algorithms, which require an oracle as a reference policy to learn from, and that is not available in the compiler optimisation case.

    One approach that came to mind is to modify the reward and observation space of CompilerGym so that each observation would contain context from previous states and the reward would be the reward-to-go until termination. This would allow for the use of contextual bandits, but would be a little awkward compared to using a full RL agent.

    The other approach I thought of was using core VW classification algorithms to learn a Q-function within a larger RL agent. If this is the case, is there a particular reason for using VW algorithms (such as scale and performance) as opposed to neural nets or other ML models?

    In general, I am just trying to get an idea of the expected approach, since CompilerGym hosts full-RL problems whereas VW is targeted towards contextual bandits.

    1 reply
    Jack Gerrits
    @jackgerrits
    Applications close on April 4 for this year's RLOS Fest. If you are a student, or know a student who would be interested, consider applying! https://www.microsoft.com/en-us/research/academic-program/rl-open-source-fest/
    Raj Gupta
    @rajuthegr8

    Hello. I want to work on the project "Introduce Feature Engineering Language in VowpalWabbit" for RLOS Fest.

    I have completed parts 1 and 2 of the screening exercise, and I wanted to clarify something about part 3, the DataBase of DataRows where each row is a map<string,float>.

    Can I assume the key values will be disjoint across the different rows?

    What will be the number of keys in a row, and the number of rows, relative to the length of the queries? I am asking because this part asks for some ideas about how to optimize the select() function, but the constraints are a little vague.

    3 replies
    Raj Gupta
    @rajuthegr8
    Hi everyone,
    I have submitted my application for the project "Introduce Feature Engineering Language in VowpalWabbit". I hope I am one of the students who gets selected, and I look forward to working with the team behind Vowpal Wabbit.
    Thank you
    Saahil Ali
    @programmer290399

    Hello everyone!
    I have applied for the "Native CSV parsing" project in RLOSF.
    One thing I noticed about the screening exercise is that the output for the second example seems to have a discrepancy; I am not sure if that is actually the case, and it would be really great if someone could confirm it. Shouldn't the first line of the output have C:1 rather than C:2?

    Anyway, I hope I too get selected this year...
    Best of luck to @threewisemonkeys-as , @rajuthegr8, and all other applicants 👍

    1 reply