    Arunagirinadan Sudharshan
    @SudhanAruna
    Hi everyone, I'm a student applying for RLOS 2021, and I'm planning to work on the "vw daemon to use gRPC" task. I wanted to know whether I can email any of the mentors or maintainers to clarify certain doubts, which would help me finish the proposal. Please let me know if that's possible.
    Raphael Ottoni
    @raphaottoni

    Guys, would you please help me with two questions:
    1) Using cb_explore_adf for a pricing agent, I was trying two types of reward: i) sales and ii) sales × price, where each arm is a price. I have noticed that cb_explore_adf converges well when the reward is sales, but when we multiply the sales by the arm price, it simply doesn't converge at all. Is it possible that it is sensitive to scale? Sales are in units (like 40 at most) and prices are in cents (e.g. 399).

    2) Another quick question: how do I pass multiple namespace interactions with the -q UA parameter? I mean, I want to add more variables from another namespace, something like -q [UA, MA].

    (U)ser and (M)erchant and (A)ction
    Bernardo Favoreto
    @Favoreto_B_twitter
    @raphaottoni I can help with your second question: adding multiple interactions is pretty straightforward in VW. You can do it in a few ways:
    1) Simply pass multiple interactions with the -q flag, e.g. -q UA UM MA (or repeat the flag), depending on how many namespaces you have
    2) You can also pass different orders of interaction (i.e., quadratic or cubic) using the --interactions flag. Here, you could use, for example, --interactions UA UM UAM
    3) Finally, you can specify all possible interactions by using a special wildcard "symbol" (not sure if that's the appropriate name for it). For example, -q :: specifies all possible quadratic interactions. Note that using this is slower, because lots of features are created on the fly. Also, you should see a warning saying that some repeated features were ignored (by default in VW).
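    A minimal pyvw sketch of the three options above (U, M, and A are the namespaces from this thread; everything else here is an illustrative assumption, not taken from the discussion):

from vowpalwabbit import pyvw

# 1) Repeat -q to request several explicit quadratic interactions.
vw_quadratic = pyvw.vw("--cb_explore_adf -q UA -q UM -q MA --quiet")

# 2) --interactions also accepts higher-order terms (UAM is cubic: U x M x A).
vw_cubic = pyvw.vw("--cb_explore_adf --interactions UA --interactions UAM --quiet")

# 3) ':' is the wildcard namespace, so -q :: generates every quadratic
#    interaction on the fly (slower, and duplicates are skipped with a warning).
vw_wildcard = pyvw.vw("--cb_explore_adf -q :: --quiet")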
    Raphael Ottoni
    @raphaottoni
    @Favoreto_B_twitter thank you!
    @Favoreto_B_twitter Could you please point me to the documentation on this --interactions flag? I didn't get what the difference between -q and --interactions is, or how to say whether it is quadratic or cubic
    Bernardo Favoreto
    @Favoreto_B_twitter
    @raphaottoni Unfortunately there isn't any specification of it in the docs (at least I did not find one). I feel your pain; I only discovered this by accident when watching their latest presentation, so I don't really know the details.
    As for how to tell whether it is quadratic or cubic, you can just think of how many namespaces you are interacting: if you use UAM, this is a cubic interaction; if you use UA, it is quadratic. I believe this is valid, though I cannot confirm
    Raphael Ottoni
    @raphaottoni
    I know I use the namespace Action to set the price, like:
    |Action price=399
    Is it expected of me to also use a namespace called Price, so it will help convergence, since I am building a reward that is a Gaussian value times the price the arm represents?
    Am I supposed to build another namespace like:
    |Price value:399
    or
    |Price value=399
    I am asking this because I am not sure whether the |Action namespace for cb_explore_adf treats this as a feature to be quadratically related to the reward
    This is the only thing I could think of that would explain why the model converges when the reward is a plain value, but stops converging when those values are multiplied by a constant (say, the price each arm represents)
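    (A syntax detail worth flagging here, from standard VW text-format parsing rather than from this thread: = and : behave very differently inside a feature token.)

|Action price=399   -> one boolean feature literally named "price=399", with value 1
|Action price:399   -> one numeric feature named "price", with value 399

    So price=399 behaves like a one-hot categorical feature per distinct price, while price:399 feeds the raw magnitude into the linear model, which is exactly where sensitivity to scale would show up.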
    Raphael Ottoni
    @raphaottoni
    shared |Merchant merchant_id=Restaurante_Japidin city=sao_paulo radius=500
    0:-79.8:0.866666661699613 |Price value:399 |Action price=399
    |Price value:499 |Action price=499
    |Price value:599 |Action price=599
    I am trying things like that... but it won't converge either =(
    These are the curves... one would think it would be easy to converge:
    { curve_type: "Gaussian", curve_id: "arm_1", mean: 20.0, std: 0.0},
    { curve_type: "Gaussian", curve_id: "arm_2", mean: 5.0, std: 0.0},
    { curve_type: "Gaussian", curve_id: "arm_3", mean: 4.0, std: 0.0}
    Those are the arm_prices:
    {"399": "arm_1", "499": "arm_2", "599": "arm_3"}
    reward = Arm_value * Gaussian Sample
    vw = pyvw.vw("--cb_explore_adf -q :: --epision 0.2")
    Raphael Ottoni
    @raphaottoni
    [Screenshot: Screen Shot 2021-02-22 at 19.39.41.png]
    Above is a graph of this setup; I really don't know why the agent chose the most expensive arm!
    If I simply divide the rewards by 100 and run the very same experiment:
    [Screenshot: Screen Shot 2021-02-22 at 19.40.32.png]
    This thing bugs me =(
    I forgot to mention: the reward is actually -1 × arm_value × Gaussian sample
    Max Pagels
    @maxpagels_twitter
    @raphaottoni could you provide a github gist of your data?
    Raphael Ottoni
    @raphaottoni
    There is no training data... just a "simulator", which is an object that returns a sample from those curves given the arm_id:
    { curve_type: "Gaussian", curve_id: "arm_1", mean: 20.0, std: 0.0},
    { curve_type: "Gaussian", curve_id: "arm_2", mean: 5.0, std: 0.0},
    { curve_type: "Gaussian", curve_id: "arm_3", mean: 4.0, std: 0.0}
    {"399": "arm_1", "499": "arm_2", "599": "arm_3"}
    Each step, VW chooses an arm; I sample from that arm's curve and multiply the result by the arm's price. Then I flip the sign so it is a reward instead of a cost, and then fit it to the model...
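    A minimal sketch of that loop in pyvw (my reconstruction: the curves, prices, namespaces, and flags are from the messages above, while the helper names and the rest of the scaffolding are assumed):

import random
from vowpalwabbit import pyvw

arm_means = {"arm_1": 20.0, "arm_2": 5.0, "arm_3": 4.0}   # Gaussian means, std = 0
price_to_arm = {399: "arm_1", 499: "arm_2", 599: "arm_3"}
prices = list(price_to_arm)

vw = pyvw.vw("--cb_explore_adf -q :: --epsilon 0.2 --quiet")
shared = "shared |Merchant merchant_id=Restaurante_Japidin city=sao_paulo radius=500"

def adf_lines(label=None, chosen=None):
    # One shared line plus one line per arm; the label goes on the chosen arm's line.
    lines = [shared]
    for i, p in enumerate(prices):
        prefix = "{} ".format(label) if label is not None and i == chosen else ""
        lines.append("{}|Price value:{} |Action price={}".format(prefix, p, p))
    return "\n".join(lines)

for step in range(10000):
    pmf = vw.predict(adf_lines())                      # one probability per arm
    chosen = random.choices(range(len(prices)), weights=pmf)[0]
    sample = random.gauss(arm_means[price_to_arm[prices[chosen]]], 0.0)
    cost = -prices[chosen] * sample                    # VW's CB label is a cost: a good
                                                       # outcome gets a negative cost
    ex = vw.parse(adf_lines("0:{}:{}".format(cost, pmf[chosen]), chosen),
                  pyvw.vw.lContextualBandit)           # label format: action:cost:probability
    vw.learn(ex)
    vw.finish_example(ex)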
    Raphael Ottoni
    @raphaottoni
    The problem appears to be solved if we apply a log function to the reward.
    Max Pagels
    @maxpagels_twitter

    There is no training data... just a "simulator", which is an object that returns a sample from those curves given the arm_id:

    If you have a simulator, you are training on some VW data somewhere. Could you provide that dataset as a gist?

    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys, I was checking the Slates formulation out of curiosity, and it got me thinking. I could swear streaming services like Netflix used slates for recommendations. I know they personalize both the title recommendation and the thumbnail for each title. Thus, it seemed like the perfect use case for Slates (here, the title would be one slot and the image another slot): there is a single global reward (play or not), and the action sets are disjoint.
    However, when trying to visualize how this would work in VW, I noticed that it probably wouldn't. What made me think this is that Slates predicts for all slots at once, so there is no way we could first select the title, then pre-filter the possible thumbnails for that title, and then make a prediction for the thumbnail slot.
    Am I missing something here? What are some use cases of Slates for personalization using VW? The only one that comes to mind is "whole page optimization".
    Thanks!
    Jui Pradhan
    @JuiP
    Hi everyone, I was looking at the estimators repository issue VowpalWabbit/estimators#1: we already have an implementation of the IPS estimator in Python. My question is: why is "convert current IPS estimator to Python" mentioned as a goal for this project? Can someone please clarify?
    pushpendre
    @pushpendre

    Hi, I was wondering if I could get a pointer to the implementation of --cb k --cb_type dr in the source code? Basically, I am trying to understand the parameters that are learned at the end of off-policy CB training in VW. E.g., I did

vw --cb 3 --cb_type ips -f cb.model -d train.txt --invert_hash readable_ips.model
vw --cb 3 --cb_type dm -f cb.model -d train.txt --invert_hash readable_dm.model
vw --cb 3 --cb_type dr -f cb.model -d train.txt --invert_hash readable_dr.model

    and the dr model obviously contains parameters equal to ips + dm, but I want to know exactly what linear regression formula is being implemented in dr.
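    For reference, VW's dr mode follows the doubly robust estimator of Dudík, Langford & Li (2011); in my own notation (a summary, not copied from the source code), the per-example value estimate combines the direct-method regression \hat{r}(x,a) with an IPS correction:

    \hat{V}_{DR}(x,a) = \hat{r}(x,a) + \frac{(r - \hat{r}(x,a)) \cdot \mathbf{1}\{a = a_{\text{logged}}\}}{p(a_{\text{logged}} \mid x)}

    which is consistent with the observation above that the dr model holds roughly the dm regressor's parameters plus the ips part's parameters.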

    CP500
    @CP500
    Hi everyone, just a newbie question on CATS: does it give you a PMF when you call predict?
    vw = pyvw.vw("--cats_pdf 7 --bandwidth 0.1 --min_value 0 --max_value 1")
    ex = vw.parse('ca | c1:0.5 c2:1.3', labelType=8)
    vw.predict(ex)
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hey guys, I would like to know if anyone has found an appropriate way of calculating feature importance after training a model?
    I tried using sklearn/eli5 permutation methods, but neither worked properly.
    Then, I decided to code my own, where I first train a model and then do permutation importance on a held-out set. I'm a bit concerned as to whether the results are significant, mainly because of all the interactions VW creates on the fly. I should mention I am aware of the multicollinearity/correlation problem, and this is not my biggest concern.
    Does it even make sense to calculate feature importance in VW? (I assume so, because this was one of the topics from the VW presentation at https://slideslive.com/38942331/vowpal-wabbit)
    Thanks!
    olgavrou
    @olgavrou
    @CP500 cats_pdf should give you a pdf (probability density function) and not a pmf (probability mass function), since CATS is predicting in a continuous action space. The pdf comes in the form of (left:right:pdf_value) triples, so you can check that the pdf integrates to 1 by summing (right - left) × pdf_value over all the returned triples
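    A quick sanity check of that invariant in Python (the triples below are made-up illustrative values, not real CATS output):

# pdf: list of (left, right, pdf_value) triples, as in the predict() output
pdf = [(0.0, 0.2, 2.0), (0.2, 1.0, 0.75)]   # made-up values
total = sum((right - left) * value for left, right, value in pdf)
assert abs(total - 1.0) < 1e-6              # a proper pdf integrates to 1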
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hello everyone!
    Does anyone know where (and whether) I can find the notebook for the Estimators library? I would like to use this lib, but there isn't much documentation and there aren't many examples on how to do so.
    Also, does it make sense to use this lib for CCBs?
    Thanks!
    Jack Gerrits
    @jackgerrits
    @Favoreto_B_twitter that repo is very much still a work in progress, so docs/examples are not there yet, unfortunately. For CCB, the approach taken so far is to do CFE (counterfactual evaluation) on the first slot only, so in this context I think it does make sense to use it. But it would need some adapting, and I am not positive here.
    Max Pagels
    @maxpagels_twitter
    Is there an estimate on when the pypi version of pyvw will have CATS support? 8.9.0 doesn't support CATS labels
    Max Pagels
    @maxpagels_twitter
    I'm fiddling around with CATS and have a simple setup with a fixed context. Per round, I ask for an action (range 0-100) and calculate a cost that is zero at 50 and otherwise grows quadratically with the absolute distance from 50 in either direction. I've tried grid-searching a whole mess of bandwidth, epsilon, and learning-rate values, but the learning is just all over the place. I would have expected the system to converge to an optimal prediction of 50.0 per round pretty easily, since the context is always fixed. Instead, it either bounces around or gets stuck on some non-optimal values around 40. Any tips?
    olgavrou
    @olgavrou
    Hi @maxpagels_twitter, what is the parameter you pass to --cats? Have you experimented with that at all? For CATS I would try different combinations of the number of discrete actions used by the algorithm (passed to the --cats arg) and bandwidths (bandwidth being a property of the continuous range), e.g. a grid of num_actions [8, 16, 32, 64, 128, 256, 1024] and bandwidths [1, 2, 4, 6, 8, 10, 14, 20]. For different numbers of discrete actions you might need more data for CATS to converge to something sensible. CATS label support in pyvw should be available in the next release (coming soon-ish; we don't want to wait another year for the next VW release). Let me know if you get better results from CATS or not :)
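    A sketch of that grid in Python (hypothetical scaffolding; run_simulation is a stand-in for Max's own fixed-context evaluation loop, not a real function):

import itertools

num_actions_grid = [8, 16, 32, 64, 128, 256, 1024]
bandwidth_grid = [1, 2, 4, 6, 8, 10, 14, 20]

for k, bw in itertools.product(num_actions_grid, bandwidth_grid):
    args = "--cats {} --bandwidth {} --min_value 0 --max_value 100 --quiet".format(k, bw)
    # avg_cost = run_simulation(args)  # would instantiate pyvw.vw(args), play the rounds,
    #                                  # score cost = ((action - 50) / 50) ** 2, and average
    # print(k, bw, avg_cost)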
    Max Pagels
    @maxpagels_twitter
    I tried grid-searching a whole mess of options, including a bunch of action counts, and can get relatively close to an optimum, but the hyperparameters seem to need to be just right or the learning is way off. I'll experiment further and report back
    Bernardo Favoreto
    @Favoreto_B_twitter
    Hello guys!
    Can someone help me understand why propensity scores are important when training a CB?
    I've been thinking about this lately and just couldn't wrap my head around a good explanation...
    Let's take epsilon-greedy, for example. When we train a CB model with epsilon-greedy, the pmf output is always the same (just the indexes change). This makes me assume that propensity scores aren't supposed to teach the CB how to output probabilities. Moreover, I believe they are used for "importance weighting", i.e., prob(new_policy)/prob(logging_policy), but isn't that only for when we use IPS? I think I'm missing something quite obvious here...
    Also, when we train a new model offline using logged CB data, how is the new CB able to achieve better performance than the logging policy? I mean, it's an excellent thing, but I would like to understand how that is possible.
    Thanks!
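    For reference (standard off-policy notation, not taken from this thread): the IPS estimate of a new policy \pi from data logged under a policy \mu is

    \hat{V}_{IPS}(\pi) = \frac{1}{n} \sum_{i=1}^{n} r_i \cdot \frac{\pi(a_i \mid x_i)}{\mu(a_i \mid x_i)}

    which is why the propensities \mu(a_i \mid x_i) have to be recorded at decision time: they are the denominators that let a different policy be evaluated (and trained) without bias from the logging policy's preferences, in IPS-based updates and in the IPS correction inside dr alike.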
    George Fei
    @georgefei
    Hi all, I have a few questions related to contextual bandit evaluation:
    1. How do I compare the performance of different policies' decisions using --eval? Do I look at the average loss in the output? If the costs in the input data are all negative and a lower cost is better, does a lower average loss mean one policy is better? What does average loss represent?
    2. How do I interpret the output of --explore_eval? More specifically, update count, violation count, and final multiplier (what variables do they correspond to in the algorithm on slide 9 of https://pdfs.semanticscholar.org/presentation/f2c3/d41ef70df24b68884a5c826f0a4b48f17095.pdf)? Do I also look at the average loss to compare different exploration algo + hyperparameter combinations?

    3. In order to use --explore_eval I have to convert my data from cb format to cb_adf format, since the cb format is not supported when using --explore_eval. For the example data with two arms below, are the two ways of representing the data equivalent?

    2:10.02:0.5 | x0:0.47 x1:0.84 x2:0.29
    1:8.90:0.5 | x0:0.51 x1:0.65 x2:0.67

    shared | x0:0.47 x1:0.84 x2:0.29
    | a1
    0:10.02:0.5 | a2

    shared | x0:0.51 x1:0.65 x2:0.67
    0:8.90:0.5 | a1
    | a2
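    A hypothetical helper for that conversion (the function name and the a1/a2 arm labels are assumed), which reproduces the two ADF blocks above from the two cb-format lines:

def cb_to_adf(line, num_arms=2):
    # "2:10.02:0.5 | x0:0.47 ..." -> one shared line plus one line per arm,
    # with the cost:probability label attached to the chosen arm's line.
    label, features = line.split(" | ")
    action, cost, prob = label.split(":")
    out = ["shared | " + features.strip()]
    for arm in range(1, num_arms + 1):
        prefix = "0:{}:{} ".format(cost, prob) if arm == int(action) else ""
        out.append("{}| a{}".format(prefix, arm))
    return "\n".join(out)

print(cb_to_adf("2:10.02:0.5 | x0:0.47 x1:0.84 x2:0.29"))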

    Wes
    @wmelton
    Hello all - I've been evaluating Microsoft Personalizer for our company, which I have largely assumed is VW under the hood with MS-specific tech/services written on top of it.
    My question is this: within a given namespace, does the order of features or their names matter? I'm assuming yes, but the VW documentation out there doesn't make it super clear how to handle a situation where two documents contain the same keywords but, after tokenization, the keywords are not in the same order due to variance in the number of keywords found in each document. I'd appreciate guidance there.
    Finally, I referenced Personalizer only because it sparked this train of thought, largely because its documentation uses only the JSON input format and seems to neglect any instruction regarding variation in keyword order when your features are keywords extracted from a document. Thanks!