    Max Kanter
    @kmax12
    @geoHeil ya, i understand the error now.
    Max Kanter
    @kmax12
    @geoHeil answered on SO. let me know if that helps
    geoHeil
    @geoHeil
    @kmax12 Initially I just used plain strings for the join - but forcing same categories is probably a more efficient idea
    Max Kanter
    @kmax12
    ya, i think that is the more memory efficient approach. how big is the dataset?
    might not matter that much
    geoHeil
    @geoHeil
    no not really. Decompressed about 3G CSV files
    Max Kanter
    @kmax12
    @Tullsokk @geoHeil I just answered this question about using multiple training windows on Stack Overflow. Hopefully, it's helpful for you two: https://stackoverflow.com/questions/52472930/featuretools-multiple-cutoff-times-aggregation
    geoHeil
    @geoHeil
    How can I invoke trans_primitives in addition to agg_primitives?
    Max Kanter
    @kmax12
    @geoHeil sorry, can you try rephrasing the question? I don't understand
    geoHeil
    @geoHeil
    It looks like only the SUM and MEAN columns (agg columns) are generated by feature synthesis, but none of the trans primitives.
    Max Kanter
    @kmax12
    what are you passing for the trans_primitives argument?
    geoHeil
    @geoHeil
    trs_primitives = ['percentile', 'year', 'days', 'diff', 'negate', 'month', 'cum_max',
    'divide', 'days_since', 'week', 'time_since_previous',
    'cum_mean', 'minute', 'weekday', 'or', 'isin', 'weeks', 'weekend']
    Max Kanter
    @kmax12
    can you share the repr of your entityset?
    you may need to increase max depth
    geoHeil
    @geoHeil
    I will try this and report tomorrow
    geoHeil
    @geoHeil
    With max_depth of 2 the calculation does not seem to finish, even if only 20 / 50 records are passed to the relationships
    Max Kanter
    @kmax12
    @geoHeil can you provide some details about your entityset? You can do print(your_entityset_object) and copy the results of that here or in a direct message to me
    geoHeil
    @geoHeil
    @kmax12 unfortunately I fear this will not be possible due to NDA ...
    Fabio Votta
    @favstats

    Hi everyone! I love featuretools and the idea of automatically engineering features. Unfortunately I can't seem to add interesting variables and I would be happy if someone could help out :)

    I suspect that it has something to do with my data because I can reproduce the example in the docs just fine..

    https://stackoverflow.com/questions/52673694/specifying-interesting-variables-with-featuretools-does-not-work

    Maybe this is an easy question for Pythonistas.. I am an ardent R user so maybe there is something I am just not seeing.
    Max Kanter
    @kmax12
    @favstats thanks for posting. we're taking a look and will put up an answer shortly.
    Fabio Votta
    @favstats
    Thanks a lot, really! :) Saw your comment on the initial post and I accidentally deleted the whole post when I wanted to edit it, sorry for that.
    It's not high priority though, just something that I couldn't figure out. Enjoy your sunday everyone :)
    Max Kanter
    @kmax12
    Happy to help! Will ping you here once I have an answer.
    Fabio Votta
    @favstats
    @kmax12 works well for me :)
    Max Kanter
    @kmax12

    @favstats This looks like incorrect behavior, thanks for sharing with us. I just made a fix for it on a branch. Can you try to install that branch of featuretools and run your code again? You can install that branch using pip with this command:

    pip install -e git://github.com/featuretools/featuretools.git@interesting-values-direct-features#egg=featuretools

    Let us know if it helps!

    there's also a github pull request here if you'd like to comment there: Featuretools/featuretools#279
    Fabio Votta
    @favstats
    Oh wow, I was certain that the issue would be on my part. I'll try this out immediately. Thanks!
    Fabio Votta
    @favstats
    This worked perfectly! Thank you so much!
    Fabio Votta
    @favstats

    @kmax12 one thing I just noticed.. I mistyped the value name at first and it gave me back only NaN values, which makes sense since it can't match the arguments. However, I wonder if this is intended behaviour or if it should say something like "value not found" and throw an error. Just thinking out loud :)

    Anyway, thank you again for answering this so fast! :)

    Max Kanter
    @kmax12
    ya, that is correct behavior. its still a valid feature even if the value is nan for your particular data
    Fabio Votta
    @favstats
    alright :) great
    Max Kanter
    @kmax12
    happy to help! let us know if you have any other questions
    Fabio Votta
    @favstats
    Will do! :) So far I'm good
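
To illustrate the NaN behavior discussed above, here is a minimal plain-Python sketch (not Featuretools code). The helper `mean_where` and the sample rows are hypothetical; the point is only that a conditional aggregation over a value that never occurs has nothing to aggregate, so NaN is a natural result rather than an error.

```python
import math

# Hypothetical sample data; column names are made up for illustration.
rows = [
    {"device": "mobile", "amount": 10.0},
    {"device": "desktop", "amount": 30.0},
    {"device": "mobile", "amount": 20.0},
]

def mean_where(rows, column, value, target):
    """Mean of `target` over rows where `column` equals `value`; NaN if no row matches."""
    matched = [r[target] for r in rows if r[column] == value]
    return sum(matched) / len(matched) if matched else math.nan

ok = mean_where(rows, "device", "mobile", "amount")    # value exists in the data
typo = mean_where(rows, "device", "mobil", "amount")   # mistyped value, no rows match
```

Here `ok` is a real number, while `typo` is NaN because no row matches the mistyped value.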
    Fabio Votta
    @favstats

    Hello everyone :) I have a question and hope someone can help out.

    Say I would want to calculate features for specific timeframes since the cut-off value, so for example:

    My cut-off value is 1 January 2005. I want to count the number of products a customer bought in the last month, the last three months, and the last year before that, and have them all in the same feature matrix.

    I know I can do something like this (from reading this):

    feature_matrix, features = ft.dfs(
        target_entity="customers",
        agg_primitives=["count"],
        cutoff_time=pd.Timestamp('January 1, 2005'),
        training_window=ft.Timedelta("30 days"),
        entityset=es,
        verbose=True,
    )

    But this would of course only give me features for the 30 days before January 1, 2005. I would like them for different time ranges as well (i.e. three months, a year, or really any other time range that would interest me).

    So I am not sure how to get to my goal right now. Would I need to create a new primitive for this task or can this be done with already existing functions?

    Max Kanter
    @kmax12
    @favstats take a look at this answer on SO. let me know if it helps. https://stackoverflow.com/a/52593818/8964531
    Fabio Votta
    @favstats
    @kmax12 This does the trick! Thanks! :)
    Max Kanter
    @kmax12
    :thumbsup:
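
For readers following the multiple-window question above, here is a minimal plain-Python sketch of the underlying idea: compute the same count once per time window before the cutoff and place the results side by side in one row per customer. It is not Featuretools code and is not taken from the linked answer. The purchase data, the window names, and the helper `count_in_windows` are hypothetical, and treating the cutoff itself as excluded is an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical purchase log: (customer_id, purchase_time).
purchases = [
    ("a", datetime(2004, 12, 20)),
    ("a", datetime(2004, 11, 2)),
    ("a", datetime(2004, 3, 15)),
    ("b", datetime(2004, 12, 31)),
    ("b", datetime(2003, 6, 1)),
]

cutoff = datetime(2005, 1, 1)

# One entry per window; the names become column names in the result.
windows = {
    "30d": timedelta(days=30),
    "90d": timedelta(days=90),
    "365d": timedelta(days=365),
}

def count_in_windows(purchases, cutoff, windows):
    """Count purchases per customer inside each window ending at the cutoff."""
    counts = {}
    for customer, when in purchases:
        row = counts.setdefault(customer, {name: 0 for name in windows})
        for name, width in windows.items():
            # Assumption: the window is [cutoff - width, cutoff).
            if cutoff - width <= when < cutoff:
                row[name] += 1
    return counts

feature_rows = count_in_windows(purchases, cutoff, windows)
```

Each customer ends up with one count per window, which mirrors having the 30-day, 90-day, and one-year features in the same feature matrix.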
    Fabio Votta
    @favstats

    Hello everyone!

    I encountered a problem when I tried to create relationships between entities in my entityset (using my own data). There is no error, but it just doesn't create features for one of my entities (the "prods" entity), although everything should be connected just fine. In some ways, this is similar to the first issue I encountered, only that it occurs just with this specific entity set-up. Unfortunately, I can't share my data this time, but attached you will find a minimal example with some mock data where this problem also occurs.

    Hope somebody can help and thank you for your awesome support!

    Best, Fabio

    Max Kanter
    @kmax12
    @favstats can you post the first part of your question about getting product features on stackoverflow?
    rather than include a notebook, you can just put your code / comments in the question
    Fabio Votta
    @favstats
    @kmax12 Will do!
    Fabio Votta
    @favstats

    Done:

    https://stackoverflow.com/questions/53067099/features-are-not-being-generated-for-my-entityset-set-up-in-featuretools

    It's just a lot of code so I thought a python notebook would be a bit more compact :)

    Max Kanter
    @kmax12
    thanks. will answer shortly
    Fabio Votta
    @favstats
    Thank you! :)
    Max Kanter
    @kmax12
    answer posted. let us know if you have any other questions
    Fabio Votta
    @favstats
    Ooooooh, I see. Well this is definitely something I should have known. Thank you for the quick help! Everything works as expected.
    Max Kanter
    @kmax12
    happy to help!