    Robin Niel
    Hello everyone !
    I'm facing an issue with popularity bias. I have different classes of documents, and one of them is over-represented (roughly 2.5M vs 500,000). The problem is that users who are mostly interested in the under-represented class get recommended the over-represented class. What do you think is the best way to prevent this issue? Mess with the item biases?
    Maciej Kula
    @Datavoore have a look at this issue: lyst/lightfm#475
    My recommendation would be to use sample weights to implement some sort of inverse propensity weighting.
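    The inverse-propensity idea could be sketched roughly like this (toy data; the user/item indices and class labels are made up, and the resulting COO matrix would be passed to LightFM.fit() as sample_weight):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical interactions as (user, item) pairs, plus the document
# class each interacted item belongs to.
rows = np.array([0, 0, 1, 2])        # user indices
cols = np.array([0, 1, 1, 2])        # item indices
item_class = np.array([0, 0, 0, 1])  # class of each interaction's item

# Inverse-propensity-style weights: interactions with the
# over-represented class get proportionally smaller weight.
class_counts = np.bincount(item_class)
weights = 1.0 / class_counts[item_class]

# Same shape and ordering as the interactions matrix; this is what
# would go into fit(interactions, sample_weight=...).
sample_weight = coo_matrix((weights, (rows, cols)), shape=(3, 3))
```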
    Robin Niel
    Thanks for the quick answer and your awesome work :)
    Andrea Dodet
    I am currently experimenting with the https://www.kaggle.com/gspmoreira/articles-sharing-reading-from-cit-deskdrop dataset. Items are news articles and interactions (which will have different weights) are defined as:
    • View
    • Like
    • Bookmark
    • Follow
    • Comment
      When computing recommendations for users, known positives get recommended. Just to confirm: is that completely normal? I believe it should be interpreted as a good sign, and I'll just need to cut those predictions out when serving to users.
    Simon Weiß
    Hi @andodet yes, that's normal. If you use LightFM.predict(), known positives will get recommended; with LightFM.predict_rank() you can provide your training matrix and filter known positives out of your recommendations. I think removing known positives is a good idea for your use case. There are other use cases (think re-purchase of batteries on Amazon) where you would want to keep them.
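    A minimal sketch of the filtering step, assuming `scores` came from a predict call for one user and `known_positive_ids` from that user's nonzero training-matrix indices (toy numbers):

```python
import numpy as np

scores = np.array([0.9, 0.1, 0.7, 0.3])  # hypothetical per-item scores
known_positive_ids = np.array([0, 2])    # items the user already interacted with

# Mask known positives so they can never appear in the top ranks.
masked = scores.copy()
masked[known_positive_ids] = -np.inf
top = np.argsort(-masked)  # best unseen items first
```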
    Andrea Dodet

    Thanks @SimonCW, I've actually overlooked potential re-purchases and it makes perfect sense, will probably use a mix of both depending on the context.

    Another "problem" I am facing is that I can't seem to invert the ID mappings correctly to retrieve user_id from indexes. I am probably missing something easy here:

    def get_user_id(uid):
        return [k for k, v in dataset.mapping()[0].items() if v == uid]

    None of the user interactions from a `user_id` computed with that function relate to the known positives from the following way of sampling recommendations:

    import numpy as np

    def sample_recommendations(model, labels, interactions, user_ids):
        n_users, n_items = interactions.shape
        for user_id in user_ids:
            known_positives = labels[interactions.tocsr()[user_id].indices]
            scores = model.predict(user_id, np.arange(n_items))
            top_items = labels[np.argsort(-scores)]
            print("User {}".format(user_id))
            print("Known positives:")
            for x in known_positives[:15]:
                print("    {}".format(x))
            print("Recommended:")
            for x in top_items[:15]:
                print("    {}".format(x))

    Might it be that a list comprehension like that messes up the dict ordering?

    Andrea Dodet

    I let a GCP instance run through some parameter tuning. Everything except for no_components looks unstable from run to run (I already performed a tuning session a few days ago, with very different results in terms of learning rate, loss, and metrics).
    Now, I believe this might be caused by a dumb mistake: I didn't fix the seed (np.random.RandomState()) before fitting the models, which causes the stochastic gradient descent process to swing wildly.

    Is that the case or is there anything else I should pay attention to?

    Robin Niel

    Hi @andodet,
    If I understood your question correctly, you should indeed be fixing the RandomState when doing tuning.

    For your mapping problem, I don't see what could be wrong in the code, but dict ordering is not guaranteed (prior to Python 3.7), so it might differ from one execution to another. Concerning your get_user_id function, I don't get why you would want to return a list of indices.
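    Since the mapping is one-to-one, it can also be inverted once up front instead of scanning with a list comprehension on every lookup; a small sketch with a made-up mapping standing in for dataset.mapping()[0]:

```python
# Hypothetical stand-in for dataset.mapping()[0]: external user ID -> internal index.
user_id_map = {"user_a": 0, "user_b": 1, "user_c": 2}

# Invert once; lookups are then O(1) and return a single ID, not a list.
index_to_user_id = {v: k for k, v in user_id_map.items()}

def get_user_id(internal_idx):
    return index_to_user_id[internal_idx]
```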

    Andrea Dodet

    @Datavoore Let me triple check that and get back to you (I think it was due to how the mapping is structured).
    Anyway, I've retrained the model and it seems way more stable now. There's just one thing I am struggling to interpret:
    Sans features:

    As I've mentioned before, I am working on the transactional dataset from Instacart (https://www.kaggle.com/c/instacart-market-basket-analysis), and item features consist of aisle and department. I expected them not to be very informative, but I'm wondering how they lead to a slightly less accurate model. I guess at that scale the performance delta is next to negligible...

    Robin Niel
    I am running into the same kind of issue when I try to add metadata for the musical works I am working on. I haven't figured out why it would decrease performance, given that I have a lot of them...
    Robin Niel
    I have been investigating whether or not it could have a "qualitative" impact on the recommendations that is not measured in terms of metrics.
    Andrea Dodet
    No idea how large the features' negative impact on your performance is (in my case it's totally bearable). Also, eyeballing some recommendations, everything seems to be fine (cosine similarity between similar items too).
    Robin Niel
    It's also bearable for me, the same magnitude of impact as yours, but I thought it would improve accuracy@k by better identifying taste for certain genres of music... I'm using implicit feedback data, though.
    Andrea Dodet
    I am currently comparing train/validation model performance (model without item/user features) and it looks like it is degrading across epochs. Is that a sign of model overfitting?
    Robin Niel
    It really does look like it. I found that a number of epochs around the default of 30 is often best.
    Hi @maciejkula, I am working on recommendations for an e-commerce website. I need guidance on training and serving a production-level model, particularly regarding input strategy.
    Simon Weiß

    Hi all, hi @maciejkula I was hoping to source your crowd-intelligence about the best way to persist & serve recommendations to users. Currently, we save pre-computed recommendations for all users that we serve to be displayed as a recommendation stripe (Netflix-style).

    In the future, we would like to support multiple stripes based on item-metadata filtering which means we would need to pre-compute multiple stripes per user.
    Our current idea is to store latent embeddings for users and items, then upon a request filter the right items and only then score (do the dot products). AFAIK Elasticsearch supports something like that. Does anyone have industry experience with something like this? Is it fast enough w.r.t. serving recommendations upon request? Do you have recommendations w.r.t. technology?

    Simon Weiß
    An additional question: what do you do if you have a NN that's deeper than one embedding layer? Live-score or pre-compute?
    Maciej Kula
    @SimonCW what you suggest sounds sensible if your corpus isn't too large. I would probably maintain a separate service for storing your embeddings: get the IDs from ES and pass them on to your embedding service for computation.
    Once you have a million items or so it becomes impractical to do brute-force computations, and you'll want to start using something like Annoy.
    For deeper NNs, I have seen mixtures of precomputation and live computation. For example, your item embeddings are static, so you can precompute them; user features change from request to request and so you may want to compute them in real time.
    Simon Weiß
    Hi @maciejkula thanks so much for sharing your advice, this kind of information is really hard to come by! As you suggested, we'll probably move away from pre-computing all recommendations. I'll try to share our solution here in the future.
    Hi @maciejkula, I would like to ask you a question.
    1. When predicting for a new user who is not present in the dataset, I will have to create a new user_feature matrix of shape (1, length of user_features).
       So while predicting for a new user, I cannot use the concatenated user feature matrix (identity matrix + features matrix) from model.fit().
       Please correct me if I am wrong here.
    2. I have 2 users with the same features in my dataset, and my predictions for these two users come out exactly the same, although they have interactions with different items and have given those items different ratings.
       Can you please tell me what can be done to fix this issue?
    Felipe Acuña
    Hi everyone, thanks so much for this library, it's solving a lot of problems for me. I have a question regarding something I'm trying to do: let's say I have n items in group A and m items in group B, and I have features for users (demographics). If what I'm interested in is recommending a group B item to a user who has not interacted with any group B items, should I keep that user's interactions in the interaction matrix?
    Maheswaran Parameswaran
    Hello everyone, there is an issue I am facing: ValueError: The item feature matrix specifies more features than there are estimated feature embeddings: 50 vs 51.
    How do I rectify this? Please help.

    I have the same issue

    @andodet: If you used the sample_recommendations function from here: https://github.com/jamesdhope/recommender/blob/master/recommender.py, it has an error in labels[np.argsort(-scores)]: labels has to have the same length as scores, which was not the case in recommender.py. This will match the wrong items to users. I therefore used dataset.mapping() to get the item mappings and replaced labels with that:
    mapp = dataset.mapping()
    dict_item_id = mapp[2]
    items_list = list(dict_item_id.keys())
    labels = np.array(items_list)
    In implicit feedback, what could the 'weight' be in the (user_id, item_id, weight) iterable when building the interaction matrix? Thanks
    Simon Weiß

    Hi, I'm hoping to crowd-source some opinions/knowledge concerning offline evaluation.

    TL;DR: Do you tune your hyper-parameters on the plain model or including business logic filters that might be in the candidate selection phase?

    Let's say you have some business logic for your movie recommender built into the candidate selection, e.g.:

    1. Since it is December, you remove movies with the tag summer from the candidates
    2. For each user, you remove the already watched movies from their candidates.

    Would you do hyper-parameter tuning on the plain model or on the system including the business logic?

    I'm asking because I wanted to speed up my hyper-parameter tuning by predicting scores via batched matrix multiplies (instead of the model.predict() method). However, this doesn't work with user-specific candidate selection like removing already-watched movies.
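    For what it's worth, per-user filtering can still be combined with one batched multiply by masking the score matrix afterwards; a numpy sketch with made-up embedding matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
user_emb = rng.normal(size=(4, 8))   # 4 users, 8 latent dims (made up)
item_emb = rng.normal(size=(5, 8))   # 5 items

# One batched matrix multiply instead of per-user predict() calls.
scores = user_emb @ item_emb.T       # shape (4, 5)

# Per-user candidate selection, e.g. masking already-watched movies.
watched = np.zeros((4, 5), dtype=bool)
watched[0, [1, 3]] = True
scores[watched] = -np.inf

# Ranked candidates per user, watched items pushed to the bottom.
top_per_user = np.argsort(-scores, axis=1)
```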

    Dan Ofer

    Hi, n00b question: I'm trying to build my first lightFM model, with side data (user and item features - I'm keeping only categorical/string/discrete features).

    Adding item level features improves the model, adding user level features makes it much worse than the baseline (without any side features).
    ~160k users, 8k items, 75% of users have 4 interactions with items. Items and users can repeat.
    I'm trying to figure out why the performance is so much worse. (The user side-features are definitely relevant - I built a multiclass model beforehand with the 200 top items to validate the user features).

    Is it (still) necessary to add an "eye" sparse matrix for users when adding side features, as in https://www.ethanrosenthal.com/2016/11/07/implicit-mf-part-2/ ?

    I wanted to know how one could use a LightFM model to make predictions for new users that were not present in the training data.
    Hi @maciejkula, we want to move from ALS to LightFM in production. Our current production interaction matrix is quite large; is there any limit that LightFM currently operates at?
    Matthew E. Porter

    Disclosure: I'm a lightfm n00b... I have built a model using WARP. The training data includes a user that rated an item 4.0. When I call model.predict() passing in that user ID and item ID, it comes back with -0.55.

    Am I doing something wrong here? Any help is appreciated.

    What could explain a performance decrease when we decrease the number of items?

    I was looking for documentation or an article that could help build an intuitive understanding of how each of the following parameters impacts the LightFM model (prediction or ranking accuracy):

    'k': 5
    'n': 10

    The reason I'm asking is that I did a lot of parameter optimization, but observed no important changes in model performance except for epochs (a high value causes overfitting).


    Hi @maciejkula, when predicting for a cold-start scenario I am using model.predict(0, item_ids, item_features=shows_metadata_csr, user_features=user_features_t). What should the dimensions of shows_metadata_csr be? Should the identity matrix be included in shows_metadata_csr?
    @SimonCW, we are currently deploying LightFM in production; could you please respond to the above query?
    Armin Haghi
    Hi everyone,
    I'm also working on the user cold start problem and have some questions on the user_id in predict().
    1. Do I understand lyst/lightfm#210 correctly that user_id 0 is a user similar to the new user? So in practice I would first need a function that finds the closest user based on their features, but then predict would also take the similar user's interactions and assume these are the same as the new user's?
    2. Would it make sense to set up an empty user for this purpose? I'm not sure how that would work though, since interactions do seem to require a value for the item and rating. Alternatively, is it correct to add a new experience and have every user interact with it with the lowest possible weight, or would that affect the model too strongly?
    3. In general: I noticed user_ids can take as many values as are in user_feature_map, not user_id_map. I'm not sure how to interpret this, e.g. if I pass the mapping of a user_feature instead of the user_id. Would I want to do this in the cold-start case? How would I add multiple features?

    Hi everyone,
    I'm pretty new to the LightFM library and I am using it to produce some quality hybrid recommendation output for my thesis on movie recommendation systems. Trying to use LightFM as normal collaborative filtering with custom embeddings (with the weight matrices set as identity matrices, of course), I found in the library's source code that both item and user feature embeddings have to be of the same dimensionality. I haven't found any point in the source code where this dimensionality restriction is enforced, except for the embedding matrices' initialization in the fitting function. That said, most of the computationally heavy code is in extensive OpenMP C files, which I didn't dig into. So my question is: can I set the user/item feature embedding matrices with user/item feature vectors of different dimensionality?

    Thanks everyone in advance

    I'm wondering if the lightfm library is compatible with Python 3.7.
    Mansur Uralov
    Hello, I am also pretty new to LightFM. Can anyone help me with this: https://github.com/muralov/lightfm-playaround/blob/main/small-test.ipynb? I am trying to recommend movies based on user ratings. If you look at my Jupyter notebook: I have created two features for movies, 'romance' and 'action'. I am testing the prediction for users 2 and 3, who like action movies more, but LightFM is recommending romance movies for them, which is wrong. Can you please help me figure out what I am doing wrong here?
    I really appreciate your help.
    Thanks everyone in advance
    Johnathan Nguyen
    Hi everyone,
    Do you have any suggestions for getting an explanation of why LightFM recommends a particular product to a user?
    Is there an equivalent of SHAP values applied to LightFM?
    Thanks everyone in advance !
    Kunal Bhadra
    Hey all,
    I wanted to use LightFM for my recommendation project, but I have not seen anywhere how I'd incorporate a user x user matrix into the model. Can a user-user matrix be used instead of a user-item matrix with LightFM?
    Amr Mashlah
    Is there a way to identify the items or interactions that have the highest impact on the fitted model / learnt user embeddings? I want to select informative items and ask new users to rate them.
    I am currently using LightFM to build item embeddings. For input, I use interactions and item features (containing both numerical and categorical variables). I understand that feature scale is very important. I am currently using sklearn's StandardScaler (mean and variance) for this, and also using the build-interactions function with normalize=True so that rows for each user sum to 1. Is this the best way to proceed? Thanks!
    Aditya Vats
    Hello, I have a problem where there are very few items (around 25) and many users (around 40k). I don't have item metadata, but I do have user metadata, and I have interactions. How should I approach this, and is LightFM a good option for it? I need this urgently. Thanks :tada:

    Hi! I have a question about quick recommendations for a full cold start.
    I have a trained model for which I learned how to get top recommendations from the vectors.
    It works quickly and gives satisfactory results.
    However, I still have many users whom I discarded when training the model as not representative. I also need recommendations for them, and I am trying to get them by means of a cold start.

    This works for one user, but it is too slow to iterate over all users in a for loop:

    model.predict(0, np.arange(len(games_list)), user_features=new_user_features_0)

    I am looking for a vectorized cold-start solution using
    user_biases, user_embeddings, item_biases, item_embeddings
    to feed in a sparse matrix of all new users, new_user_features, and get recommendations.
    Any suggestions would be helpful, thanks.

    import numpy as np
    import pandas as pd

    def get_recommendations(model, invert_item_id_map, players_list, top_size):
        userid = 0
        batchsz = len(players_list)
        itemsz = len(invert_item_id_map)
        # Latent representations learned by the model.
        user_biases, user_embeddings = model.get_user_representations()
        item_biases, item_embeddings = model.get_item_representations()
        userbatch = user_embeddings[userid:userid + batchsz, :]
        userbatch_bias = user_biases[userid:userid + batchsz]
        # Broadcast the bias terms across the user x item score matrix.
        fullbias = np.tile(userbatch_bias, (itemsz, 1)).T + np.tile(item_biases.T, (batchsz, 1))
        scores = fullbias + np.matmul(userbatch, item_embeddings.T)
        # Indices of the top_size highest-scoring items per user.
        top_ind = scores.argsort()[:, ::-1][:, :top_size]
        top_v = np.vectorize(invert_item_id_map.get)(top_ind)
        top_df = pd.DataFrame(data=top_v, columns=list(range(1, top_size + 1)))
        result_df = top_df.stack().reset_index()
        # Flat column labels (a nested list would create MultiIndex columns).
        result_df.columns = ['player_id', 'top_position', 'internal_game_id']
        result_df['player_uuid'] = np.repeat(players_list, top_size)
        print(f"Recommendation df: {result_df.shape}")
        return result_df
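    One way to vectorize the cold-start case along those lines: since a LightFM user representation is the sum of that user's feature embeddings, new-user embeddings can be composed with one matrix product and then scored in batch. A sketch with made-up matrices (feature_emb and feature_bias stand in for the model's user-feature representations):

```python
import numpy as np

# Made-up shapes: 3 new users described by 4 features; 8-dim embeddings, 5 items.
new_user_features = np.array([[1., 0., 1., 0.],
                              [0., 1., 0., 0.],
                              [1., 1., 0., 1.]])
rng = np.random.default_rng(1)
feature_emb = rng.normal(size=(4, 8))  # per-feature embeddings (hypothetical)
feature_bias = rng.normal(size=4)      # per-feature biases (hypothetical)
item_emb = rng.normal(size=(5, 8))
item_bias = rng.normal(size=5)

# A user's representation is the (weighted) sum of its feature embeddings.
user_emb = new_user_features @ feature_emb    # (3, 8)
user_bias = new_user_features @ feature_bias  # (3,)

# Score every new user against every item in one shot.
scores = user_bias[:, None] + item_bias[None, :] + user_emb @ item_emb.T
top_items = np.argsort(-scores, axis=1)
```

    If I recall correctly, model.get_user_representations() also accepts a features matrix and performs this composition for you, which avoids handling the feature embeddings manually.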