FREE value mean? And why am I OOM if I've got 10x as much FREE as K/V and POJO?
Hi,
I'm training a multiclass GBM classifier and would like to know what could cause the following error. Thanks.
raw_df = h2o.import_file()
df = h2o.deep_copy(raw_df[raw_df['x'] > 10, : ], 'df')
df['split'] = df['y'].stratified_split(test_frac=0.2, seed=1)
train_valid = df[df['split'] == 'train', :].drop('split')
test = df[df['split'] == 'test', :].drop('split')
train_valid['col_split'] = train_valid['y'].stratified_split(test_frac=0.2, seed=1)
train = train_valid[train_valid['col_split'] == 'train', :].drop('col_split')
valid = train_valid[train_valid['col_split'] == 'test', :].drop('col_split')
raw_df['y'].unique().nrow => 95
df['y'].unique().nrow => 93
train['y'].unique().nrow => 93
Training the GBM algo with class_sampling_factors = [w1, ..., w93] fails with:
OSError: Job with key $03017f00000132d4ffffffff$_af9c11386cb765249816853dfc3d47fe failed with an exception: java.lang.IllegalArgumentException: class_sampling_factors must have 95 elements
stacktrace:
java.lang.IllegalArgumentException: class_sampling_factors must have 95 elements
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:244)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:238)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1563)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Or the following one, when using "balance_classes": True in the GBM model:
OSError: Job with key $03017f00000132d4ffffffff$_acb90549c4fb00eefd9be1d55ab5448b failed with an exception: java.lang.IllegalArgumentException: Error during sampling - too few points?
stacktrace:
java.lang.IllegalArgumentException: Error during sampling - too few points?
at water.util.MRUtils.sampleFrameStratified(MRUtils.java:309)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:252)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:238)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1563)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
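The first error asks for 95 factors, which matches the number of levels in raw_df['y'] rather than the 93 classes that remain after filtering. One plausible reading is that the filtered column still carries the full 95-level domain, so the factor list has to be aligned to that domain. A minimal sketch of that alignment in plain Python follows. The helper name `align_sampling_factors`, the example levels, and the default of 1.0 for classes with no rows are all assumptions made for illustration, not part of H2O. In the H2O Python client the domain can probably be read with something like `train['y'].levels()[0]`.

```python
def align_sampling_factors(domain, factors_by_class, default=1.0):
    """Return one sampling factor per level in `domain`, in domain order.

    Levels missing from `factors_by_class` (classes with no rows after
    filtering) get `default`, which is an assumed placeholder value.
    """
    return [factors_by_class.get(level, default) for level in domain]


# Hypothetical data: the column keeps 5 levels, only 3 occur after filtering.
domain = ["a", "b", "c", "d", "e"]
observed = {"a": 2.0, "c": 0.5, "e": 1.5}

factors = align_sampling_factors(domain, observed)
print(len(factors) == len(domain))
```

The resulting list has one entry per domain level, which is the length the exception message asks for.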
if(inputDataFrameIsPandas):
    dataToML = inputDataFrame.iloc[0:,0:]
    DataH2OFrameToML = h2o.H2OFrame(dataToML)
else:
    DataH2OFrameToML = inputDataFrame
predictionsDataFrame = MLModel.predict(DataH2OFrameToML)
What I'm saying is that you were testing whether the object inputDataFrameIsPandas is not None, but you selected the data to predict on from another object, inputDataFrame. inputDataFrameIsPandas is not the same as inputDataFrame.
This may be better:
if not inputDataFrame.empty:
    DataH2OFrameToML = h2o.H2OFrame(inputDataFrame.iloc[0:,0:])
else:
    DataH2OFrameToML = inputDataFrame
predictionsDataFrame = MLModel.predict(DataH2OFrameToML)
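If inputDataFrame can arrive either as a pandas DataFrame or as an H2OFrame, the .empty check above only works for the pandas case, assuming H2OFrame does not expose a pandas-style .empty attribute. An alternative is to branch on the type of the object itself, so the test and the data come from the same variable. The sketch below shows that idea. The helper name `prepare_for_predict` is hypothetical, and the type and converter are passed in as arguments so the snippet runs without pandas or h2o installed.

```python
def prepare_for_predict(data, pandas_type, convert):
    """Convert `data` with `convert` only when it is a `pandas_type`.

    Anything else (for example, a frame that is already an H2OFrame)
    is returned unchanged.
    """
    if isinstance(data, pandas_type):
        return convert(data)
    return data


# Intended usage with the real libraries (an assumption, not run here):
#   import pandas as pd
#   import h2o
#   DataH2OFrameToML = prepare_for_predict(inputDataFrame, pd.DataFrame, h2o.H2OFrame)
#   predictionsDataFrame = MLModel.predict(DataH2OFrameToML)
```

Passing the type in also makes the branch easy to exercise in a unit test with stand-in classes.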