    Chris Chow
    @ckchow
    At least in Java there is a float[][] constructor, and I think there's a NumPy constructor in Python as well. You might be out of luck if you're using the command-line version.
    lesshaste
    @lesshaste
    hi... does anyone understand why xgboost is so slow when you have lots of classes? This code shows the problem: https://bpaste.net/show/f7573b5a2fb9 RandomForestClassifier takes about 15 seconds,
    but xgboost never terminates at all for me
    Lyndon White
    @oxinabox
    I am training a binary classifier.
    In the problem I am working on,
    I can generate more training data at will,
    in that by running a simulation I can (deterministically) determine the correct label for any feature set.
    Each training case takes a while to generate (say 0.5 seconds).
    The main motivation for training a classifier is that evaluating via simulation takes too long.
    Is there a specific way to take advantage of my capacity to generate more data that I can use with XGBoost,
    but couldn't with, say, an SVM?
    It's almost an Active Learning problem.
    Lyndon White
    @oxinabox
    I'm not sure if there is anything beyond: "Generate more data, both for training and validation, until the validation error hits 0"
    KOLANICH
    @KOLANICH
    Hi everyone! Could anyone explain what the arguments of a custom loss (objective) function are?
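In the Python package, a custom objective handed to `xgb.train` via `obj=` is a callback that receives the raw (untransformed) predictions and the training `DMatrix`, and returns per-row gradient and hessian arrays of the loss. A minimal sketch for plain squared error (signature details can vary across xgboost versions):

```python
import numpy as np

def squared_error_obj(preds, dtrain):
    """Custom objective sketch: gradient and hessian of 0.5*(pred - label)^2."""
    labels = dtrain.get_label()     # true targets from the DMatrix
    grad = preds - labels           # first derivative w.r.t. the prediction
    hess = np.ones_like(preds)      # second derivative is constant 1
    return grad, hess

# Would be passed as: xgb.train(params, dtrain, num_boost_round, obj=squared_error_obj)
```

Note that for objectives like binary logistic, the predictions arrive as raw margins, so the callback must apply the sigmoid itself before computing the gradient.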
    Jay Kim (Data Scientist)
    @bravekjh
    Hi everyone. I joined this room for the first time today. Nice to meet you all!
    Asbjørn Nilsen Riseth
    @anriseth
    Is there a built-in way to run XGBoost with a weighted mean square loss function?
    Something like $\sum_{i=1}^D w_i(y_i-\hat{y}_i)^2$
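For this particular loss there may be no need for a custom objective at all: per-row weights can be supplied as `DMatrix(..., weight=w)` and the built-in squared-error objective accounts for them. If a custom objective is preferred anyway, a hedged sketch of the weighted gradient and hessian (dropping the constant factor of 2, as the built-in squared-error objective does):

```python
import numpy as np

def weighted_squared_error_obj(weights):
    """Build an obj callback for the loss sum_i w_i * (y_i - yhat_i)^2."""
    def obj(preds, dtrain):
        labels = dtrain.get_label()
        grad = weights * (preds - labels)  # w_i * (pred - y), constant factor dropped
        hess = weights.copy()              # second derivative is just w_i
        return grad, hess
    return obj

# Would be passed as: xgb.train(params, dtrain, num_boost_round,
#                               obj=weighted_squared_error_obj(w))
```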
    binyata
    @binyata
    Is there a general reason why xgboost predict returns only NaN?
    This is for Python.
    xiaolangyuxin
    @xiaolangyuxin
    xgboost predict works badly with multiple threads.
    On Windows XP I found a lot of issues with xgboost, especially
    when using the library.
    Peter M. Landwehr
    @pmlandwehr
    Anybody have a changelog for 0.7.post4?
    Rydez
    @Rydez
    For XGBoost, when considering time series data, is it worth creating features which represent a change in other features? For example, say I have the feature "total_active_users". Would it make sense to have a feature "change_in_total_active_users"? Or, would that just be redundant?
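A first-difference ("change") feature is usually not redundant for tree models: XGBoost splits on individual feature values row by row and cannot compare a row to the previous one, so a per-row delta exposes trend information that the raw level hides. A small sketch of such a feature (the NaN for the first row is a choice; XGBoost handles missing values natively):

```python
import numpy as np

def change_feature(values):
    """First difference of a time-ordered column, e.g. total_active_users."""
    values = np.asarray(values, dtype=float)
    delta = np.diff(values)                   # v[t] - v[t-1]
    return np.concatenate(([np.nan], delta))  # no previous row at t=0
```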
    Harshal
    @geekyharshal
    Hello people
    Can someone suggest how to begin with xgboost?
    Tommy Yang
    @joyang1
    I use xgboost4j-0.80.jar; predictLeaf always returns 3 leaf indices for one label. Is that an error?
    Tommy Yang
    @joyang1
    Can anyone answer me?
    Tommy Yang
    @joyang1
    I used xgboost4j-0.80.jar; the number of training rounds is 800 and the training set has 2,000,000 rows. When I use predictLeaf to get the leaf indices, the JVM crashes.

    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x00007f42160bf902, pid=880, tid=0x00007f42175f2700
    #
    # JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # V  [libjvm.so+0x6d6902]  jni_SetFloatArrayRegion+0xc2
    #
    # Core dump written. Default location: /data/suzhe/suzhe-1.0-SNAPSHOT/core or core.880
    #
    # If you would like to submit a bug report, please visit:
    #   http://bugreport.java.com/bugreport/crash.jsp
    #
    Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
    j  ml.dmlc.xgboost4j.java.XGBoostJNI.XGBoosterPredict(JJII[[F)I+0
    j  ml.dmlc.xgboost4j.java.Booster.predict(Lml/dmlc/xgboost4j/java/DMatrix;ZIZZ)[[F+45
    j  ml.dmlc.xgboost4j.java.Booster.predictLeaf(Lml/dmlc/xgboost4j/java/DMatrix;I)[[F+6
    j  com.jianshu.suzhe.LRTrainer.train()V+23
    j  com.jianshu.suzhe.LRTrainer.main([Ljava/lang/String;)V+30
    v  ~StubRoutines::call_stub
    Stack: [0x00007f42174f2000,0x00007f42175f3000], sp=0x00007f42175f1590, free space=1021k
    Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
    V  [libjvm.so+0x6d6902]  jni_SetFloatArrayRegion+0xc2
    C  [libxgboost4j8098523902211486429.so+0x9001c]  Java_ml_dmlc_xgboost4j_java_XGBoostJNI_XGBoosterPredict+0x5c
    Can someone suggest how to solve this problem?
    thx~~~
    Adrian Nembach
    @AtR1an
    Hi, I am having a problem with xgboost4j on Linux. Specifically, if I set eta smaller than 1, e.g. 0.3 (which is the default), then the model doesn't seem to learn anything. It almost seems like eta is set to zero somewhere along the line. Does anyone else experience this problem, or maybe know a solution? Thanks in advance.
    Lijo Varghese
    @lijoev

    Hi, I am stuck submitting a Spark job to the Hadoop YARN master in cluster mode.
    Please find my environment setup below.

    I have a Linux machine with 128 GB of RAM, a 2 TB hard disk, and 2x16 cores.
    I have set up Cloudera Hadoop containers in a Docker mount point of 50 GB (this mount point is almost full). I have one datanode, one namenode, and one YARN master container running.
    I am submitting the Spark job from my host machine to run an Rscript in cluster mode. The R server and libraries are set up on the datanode.
    When I submit the Spark job it remains in the ACCEPTED state for a long time. Please find the spark-submit command I am using below:
    spark-submit --master yarn --name RechargeModel --deploy-mode cluster --executor-memory 3G --num-executors 4 rechargemodel.R

    camspilly
    @camspilly
    Hi, currently seeing a segmentation fault when I try to import xgboost.
    Keith
    @DrEhrfurchtgebietend
    Are there developers here? Is there interest in adding conformal prediction to the library? The error on the predicted value is an often-desired quantity. Here is a link to the paper about how it was done with Random Forests: https://link.springer.com/article/10.1007/s10994-014-5453-0 This method should work with anything that has out-of-bag samples.
    Lukas Heumos
    @Zethson

    Hi everyone,

    odd request: I need a non-reproducible dataset & xgboost model.
    Does anybody have any pointers or in the best case both?

    Apoorv Shrivastava
    @apoorv22
    Has anybody worked with dask-xgboost? I need some help.
    Harshit Gupta
    @harshit-2115
    @apoorv22 Yeah, What’s the issue ?
    Apoorv Shrivastava
    @apoorv22
    @harshit-2115 I am not able to convert a model created with dask-xgboost to PMML using sklearn2pmml.
    Peng Yu
    @yupbank
    Has anyone encountered problems with a negative hessian? For some reason I feel it's being swallowed.