    Max Kanter
    @kmax12
    @favstats can you post the first part of your question about getting product features on Stack Overflow?
    rather than include a notebook, you can just put your code / comments in the question
    Fabio Votta
    @favstats
    @kmax12 Will do!
    Fabio Votta
    @favstats

    Done:

    https://stackoverflow.com/questions/53067099/features-are-not-being-generated-for-my-entityset-set-up-in-featuretools

    It's just a lot of code so I thought a python notebook would be a bit more compact :)

    Max Kanter
    @kmax12
    thanks. will answer shortly
    Fabio Votta
    @favstats
    Thank you! :)
    Max Kanter
    @kmax12
    answer posted. let us know if you have any other questions
    Fabio Votta
    @favstats
    Ooooooh, I see. Well, this is definitely something I should have known. Thank you for the quick help! Everything works as expected.
    Max Kanter
    @kmax12
    happy to help!
    Dan Houghton
    @dah33
    target = read_feather("../input/gstore-2-prep/target.feather")
    
    # Bug: featuretools doesn't like datetime64[ns, UTC]
    target.cut_off_time = target.cut_off_time.astype("datetime64[ns]")
    Note, this is pyarrow.read_feather
    It assumes a UTC timezone
    Max Kanter
    @kmax12
    @dah33 what is the error you end up getting?
    Dan Houghton
    @dah33
    @kmax12 Cannot convert column last_sessions_time to <class 'featuretools.variable_types.variable.DatetimeTimeIndex'>
    The workaround is the astype conversion above.
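A minimal sketch of the same workaround using the pandas `.dt` accessor, which states the intent (keep the UTC instants, drop the timezone) explicitly. The small frame here is a hypothetical stand-in for the data returned by `read_feather`, and `drop_utc` is an illustrative helper name, not part of Featuretools. Newer pandas releases may reject `astype` from a timezone-aware to a timezone-naive dtype, in which case this form is the safer one.

```python
import pandas as pd

# Hypothetical stand-in for the frame loaded from the feather file:
# a timezone-aware (UTC) datetime column.
target = pd.DataFrame(
    {"cut_off_time": pd.to_datetime(["2018-10-01 12:00", "2018-10-02 08:30"], utc=True)}
)


def drop_utc(series):
    """Return the same instants as timezone-naive timestamps expressed in UTC."""
    return series.dt.tz_convert("UTC").dt.tz_localize(None)


# After this line the column no longer carries timezone information.
target["cut_off_time"] = drop_utc(target["cut_off_time"])
```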
    Max Kanter
    @kmax12
    got it. would you mind posting this on github as an issue?
    Dan Houghton
    @dah33
    @kmax12 Done!
    Max Kanter
    @kmax12
    thanks! we'll keep the issue up to date as we fix
    Dan Houghton
    @dah33
    In EntitySet.normalize_entity I've been using the time_index_reduce parameter. In my example, I can request the last instance of a user's details as I normalise the sessions table. However, this does not appear to be time-aware: the last instance of a user's details can appear AFTER the cut_off_time.
    Am I correct? It's unlikely to make any difference to my business question (the GStore competition on Kaggle), but it seems inconsistent with the way time is handled elsewhere. I think the alternative is to leave all the instances of the user details in the sessions table, and let the DFS (which is time-aware) extract the correct feature.
    Max Kanter
    @kmax12
    @dah33 you're right. we haven't actually found a good use case for time_index_reduce being anything other than first and will likely remove it from Featuretools soon
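The concern raised above can be illustrated without Featuretools. This is a plain-Python sketch with made-up session rows; `last_detail` is a hypothetical helper, and it only shows why taking the last value over all rows can leak information from after the cutoff, whereas filtering by the cutoff first does not.

```python
from datetime import datetime

# Hypothetical session rows: (user_id, session_time, country)
sessions = [
    ("u1", datetime(2018, 5, 1), "US"),
    ("u1", datetime(2018, 6, 1), "CA"),
    ("u1", datetime(2018, 7, 1), "DE"),
]
cut_off_time = datetime(2018, 6, 15)


def last_detail(rows, cutoff=None):
    """Return the detail from the latest row, optionally using only rows at or before cutoff."""
    eligible = [r for r in rows if cutoff is None or r[1] <= cutoff]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r[1])[2]


# Reducing over every row picks a value recorded after the cutoff (leakage).
leaky = last_detail(sessions)
# Restricting to rows at or before the cutoff gives the time-aware value.
time_aware = last_detail(sessions, cut_off_time)
```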
    Junghyun Kim
    @Dpnia
    Hello, I want to ask about an algorithm in the paper (http://www.jmaxkanter.com/static/papers/DSAA_DSM_2015.pdf). In Algorithm 1, is line 7 correct? <Fj = Fj∪RFEAT(Ei, Ej)> I think it should be <Fi = Fi∪RFEAT(Ei, Ej)>, that is, Fi rather than Fj. Please tell me if I am missing something.
    Max Kanter
    @kmax12
    @Dpnia you are correct. It should be Fi, as you point out
    Salman-Jawad91
    @Salman-Jawad91
    Hi
    I am unable to install featuretools on an Azure Jupyter notebook, or even using Azure ML Studio, and I am getting the errors below:
    Installing collected packages: dask, pandas, future, msgpack, psutil, distributed, jmespath, urllib3, botocore, s3transfer, boto3, s3fs, tqdm, featuretools
    Found existing installation: dask 0.15.3
    Uninstalling dask-0.15.3:
    Successfully uninstalled dask-0.15.3
    Found existing installation: pandas 0.20.3
    Uninstalling pandas-0.20.3:
    Successfully uninstalled pandas-0.20.3
    Found existing installation: future 0.15.2
    DEPRECATION: Uninstalling a distutils installed project (future) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling future-0.15.2:
    Successfully uninstalled future-0.15.2
    Running setup.py install for future ... done
    Found existing installation: psutil 2.1.1
    DEPRECATION: Uninstalling a distutils installed project (psutil) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling psutil-2.1.1:
    Successfully uninstalled psutil-2.1.1
    Running setup.py install for psutil ... error
    Complete output from command /home/nbuser/anaconda2_20/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-T67_lN/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-gEaonv-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-2.7
    creating build/lib.linux-x86_64-2.7/psutil
    copying psutil/_psosx.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_exceptions.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_pswindows.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_common.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_compat.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_psbsd.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_pslinux.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/__init__.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_pssunos.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_psposix.py -> build/lib.linux-x86_64-2.7/psutil
    copying psutil/_psaix.py -> build/lib.linux-x86_64-2.7/psutil
    creating build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/__main__.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_unicode.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_aix.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/__init__.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_connections.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_process.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_contracts.py -> build/lib.linux-x86_64-2.7/psutil/tests
    copying psutil/tests/test_system.py -> build/lib.linux-x86_64-2.7/psutil/tests
    running build_ext
    building 'psutil._psutil_linux' extension
    creating build/temp.linux-x86_64-2.7/psutil
    gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=548 -DPSUTIL_LINUX=1 -I/home/nbuser/anaconda2_20/include/python2.7 -c psutil/_psutil_common.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_common.o
    In file included from /home/nbuser/anaconda2_20/include/math.h:71:0,
    from /home/nbuser/anaconda2_20/include/python2.7/pyport.h:325,
    from /home/nbuser/anaconda2_20/include/python2.7/Python.h:58,
    from psutil/_psutil_common.c:9:
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:63:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (cos,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:65:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (sin,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:22: error: unknown type name ‘sincos’
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:29: error: expected declaration specifiers or ‘...’ before ‘,’ token
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:82:3: error: expected declaration specifiers or ‘...’ before ‘(’ token
    (Mdouble x, Mdouble *sinx, Mdouble *cosx));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:100:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (exp,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:109:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (log,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:153:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (pow,, (Mdouble x, Mdouble y));
    ^
    In file included from /home/nbuser/anaconda2_20/include/math.h:94:0,
    from /home/nbuser/anaconda2_20/include/python2.7/pyport.h:325,
    from /home/nbuser/anaconda2_20/include/python2.7/Python.h:58,
    from psutil/_psutil_common.c:9:
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:63:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (cos,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:65:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (sin,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:22: error: unknown type name ‘sincos’
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:29: error: expected declaration specifiers or ‘...’ before ‘,’ token
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:82:3: error: expected declaration specifiers or ‘...’ before ‘(’ token
    (Mdouble x, Mdouble *sinx, Mdouble *cosx));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:100:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (exp,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:109:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (log,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:153:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (pow,, (Mdouble x, Mdouble __y));
    ^
    In file included from /home/nbuser/anaconda2_20/include/math.h:141:0,
    from /home/nbuser/anaconda2_20/include/python2.7/pyport.h:325,
    from /home/nbuser/anaconda2_20/include/python2.7/Python.h:58,
    from psutil/_psutil_common.c:9:
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:63:21: error: expected ‘)’ before ‘,’ token

    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:65:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (sin,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:22: error: unknown type name ‘sincos’
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:81:29: error: expected declaration specifiers or ‘...’ before ‘,’ token
    MATHDECL_VEC (void,sincos,,
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:82:3: error: expected declaration specifiers or ‘...’ before ‘(’ token
    (Mdouble x, Mdouble *sinx, Mdouble *cosx));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:100:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (exp,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:109:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (log,, (Mdouble x));
    ^
    /usr/include/x86_64-linux-gnu/bits/mathcalls.h:153:21: error: expected ‘)’ before ‘,’ token
    MATHCALL_VEC (pow,, (Mdouble x, Mdouble y));
    ^
    error: command 'gcc' failed with exit status 1

    ----------------------------------------

    Rolling back uninstall of psutil
    Command "/home/nbuser/anaconda2_20/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-T67_lN/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-gEaonv-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-T67_lN/psutil/
    You are using pip version 9.0.3, however version 18.1 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command

    However, I was able to install it on-premises using the command
    pip install featuretools, but I am unable to do so on the Azure Jupyter notebook.
    I even tried !pip install --ignore-installed featuretools, but it is not working.

    Any help would be appreciated!

    Markus Löning
    @mloning
    Hi there, I came across featuretools today. I usually use tsfresh for this type of task; are you aware of tsfresh? I couldn't find any reference to it in the paper or the GitHub repo. Do you know of any other notable packages that do automatic feature engineering? Thanks for your help and great work on relational feature engineering!
    Max Kanter
    @kmax12
    @mloning we are aware of tsfresh and like the library. if you scroll up in the chatroom, you'll see @MaxBenChrist (developer of tsfresh) was here recently
    i'm not sure of any other notable packages for automated feature engineering
    Markus Löning
    @mloning
    Alright, thanks for the quick reply!
    Darío López Padial
    @bukosabino

    Hi guys, I have some problems with circleci (python2.7) validation. I am working on this pull request: Featuretools/featuretools#323

    I tried to reproduce the errors locally, but I cannot. I do something like this:

    virtualenv -p python2.7 env
    source env/bin/activate
    pip install -r test-requirements.txt
    make installdeps lint

    But I get no errors. What can I do? Do you have any developer documentation? This is my first time using circleci...

    Max Kanter
    @kmax12
    @bukosabino have you run make test? this will run the tests
    Darío López Padial
    @bukosabino
    thanks. solved :=)
    Albert Carter
    @RogerTangos

    Hi there - I usually post on SO for FT questions, but thought that this discussion might need some more interaction.

    Today, I was looking at advanced custom primitives and came across a stackoverflow question: https://stackoverflow.com/questions/53579465/how-to-use-featuretools-to-create-features-from-multiple-columns-in-single-dataf

    The user is trying to create a primitive which sums columns conditionally, based on whether the row is within a timedelta. So, sum only cells where the timestamp is within the last 3 days.

    I think that this is possible if the user creates a transform primitive, which just outputs the value if the cell is within a time range, and 0 if otherwise. Then, they can use the sum aggregation primitive.

    However, I'm curious to know if this is possible in a single aggregation primitive, or whether there is another mechanism for achieving this. It seems very wasteful to store a column of mostly zeros just to take its sum later on.
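The two approaches described above can be compared in plain Python. This is only a sketch on made-up rows, not the answer posted on Stack Overflow and not Featuretools primitive code; `in_window` and `sum_within` are hypothetical names. It shows that the single-pass version filters while it sums and never needs the intermediate column of mostly zeros.

```python
from datetime import datetime, timedelta

# Hypothetical rows for one parent instance: (timestamp, amount)
rows = [
    (datetime(2018, 12, 1), 10.0),
    (datetime(2018, 12, 4), 5.0),
    (datetime(2018, 12, 5), 2.5),
]
now = datetime(2018, 12, 6)
window = timedelta(days=3)


def in_window(ts, now, window):
    """True when ts falls within the last `window` up to and including `now`."""
    return now - window <= ts <= now


# Approach 1: a transform step that zeroes out-of-window values, then a plain sum.
masked = [amount if in_window(ts, now, window) else 0.0 for ts, amount in rows]
two_step_total = sum(masked)


# Approach 2: one aggregation over two inputs (timestamps and values).
def sum_within(rows, now, window):
    """Sum only the amounts whose timestamp is inside the window."""
    return sum(amount for ts, amount in rows if in_window(ts, now, window))


single_pass_total = sum_within(rows, now, window)
```

Both totals agree on this data; the difference is only whether the masked intermediate values are materialised.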

    Max Kanter
    @kmax12
    @RogerTangos You can create primitives that take in more than one column. here's an example in the docs: https://docs.featuretools.com/automated_feature_engineering/primitives.html#multiple-input-types
    the case in that specific question is a little tricky, but it should be possible, working on posting an answer with an example primitive soon
    Max Kanter
    @kmax12
    @RogerTangos just put the answer up!
    Albert Carter
    @RogerTangos
    Thanks @kmax12 , that's very interesting to see, and I'm glad that it's possible. I really appreciate you taking the time to answer these. It's very kind of you.
    pabloazurduy
    @pabloazurduy
    Hi, I was reading the documentation but I couldn't find an automatic way to make "row_window features". I understand that it is possible to use the training_window in ft.dfs, but that only gives you a lower bound (as I understand it). What I mean is, for example, creating the following features:
    COUNT(orders) in 0to1 day
    COUNT(orders) in 1to2 day
    COUNT(orders) in 2to3 day
    etc...
    Is there an easy way to create these kinds of features?
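To make the requested features concrete, here is a plain-Python sketch of per-day window counts measured back from a cutoff time. It does not use Featuretools; the order timestamps, the `count_by_day_window` name, and the choice to place an order exactly at the cutoff in the first window are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical order timestamps for a single customer.
order_times = [
    datetime(2018, 12, 9, 18, 0),  # 6 hours before the cutoff
    datetime(2018, 12, 9, 1, 0),   # 23 hours before the cutoff
    datetime(2018, 12, 8, 12, 0),  # 36 hours before the cutoff
    datetime(2018, 12, 7, 6, 0),   # 66 hours before the cutoff
    datetime(2018, 12, 11, 0, 0),  # after the cutoff, must be ignored
]
cutoff = datetime(2018, 12, 10)


def count_by_day_window(times, cutoff, n_windows):
    """Count events in [0,1), [1,2), ... day windows measured back from cutoff."""
    one_day = timedelta(days=1)
    counts = Counter()
    for ts in times:
        age = cutoff - ts
        if age < timedelta(0):
            continue  # event happened after the cutoff
        bucket = age // one_day  # whole days elapsed before the cutoff
        if bucket < n_windows:
            counts[bucket] += 1
    return [counts[i] for i in range(n_windows)]


# Index 0 is "0 to 1 day", index 1 is "1 to 2 days", index 2 is "2 to 3 days".
window_counts = count_by_day_window(order_times, cutoff, 3)
```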
    Max Kanter
    @kmax12
    @pabloazurduy just put up a quick answer on how to approach it. let me know if that helps or if a specific code example is needed
    we would consider supporting this functionality more natively in the future. would you mind making an issue on our github to document your use case / request?
    thanks for trying out featuretools!
    Gray
    @grayskripko
    hi guys. Is there a simple way to use "mean" as an aggregation primitive, skipping missing values? Or the only way is to write a custom primitive?
    Max Kanter
    @kmax12
    @grayskripko for the time being you'd have to do a custom primitive. in a future update we'll allow you to configure how missing values are handled
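As a rough sketch of the function such a custom primitive could wrap, here is a mean that skips missing values, written in plain Python. The name `mean_skipping_missing` and the choice to treat both `None` and NaN as missing are assumptions; how the function is registered as a primitive depends on the Featuretools version and is not shown.

```python
import math


def mean_skipping_missing(values):
    """Mean of the values that are neither None nor NaN; NaN when nothing is present."""
    present = [v for v in values if v is not None and not math.isnan(v)]
    if not present:
        return math.nan
    return sum(present) / len(present)


# Missing entries are ignored rather than propagated into the result.
example = mean_skipping_missing([1.0, None, float("nan"), 3.0])
```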
    Marco Spoel
    @marcospoel