pass? In any case, it sounds a bit tricky or unusual, so let us know if you have further questions.
def infer_relative_search_space(self, study, trial):
    return

def sample_relative(self, study, trial, search_space):
    return

def sample_independent(self, study, trial, param_name, param_distribution):
    return

parameters = self_defined_sampling()
y = model(x, parameters)

if __name__ == "__main__":
    optuna.create_study(objective, sampler=Sampler())
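To make my question concrete, here is roughly what I am trying to do, as a minimal sketch (MySampler's fallback to RandomSampler and the quadratic objective are just placeholders, not my real code):

import optuna
from optuna.samplers import BaseSampler, RandomSampler

class MySampler(BaseSampler):
    def infer_relative_search_space(self, study, trial):
        return {}

    def sample_relative(self, study, trial, search_space):
        return {}

    def sample_independent(self, study, trial, param_name, param_distribution):
        # Placeholder: delegate every parameter to random sampling.
        return RandomSampler().sample_independent(
            study, trial, param_name, param_distribution
        )

def objective(trial):
    x = trial.suggest_uniform("x", -10, 10)
    return (x - 2) ** 2

if __name__ == "__main__":
    study = optuna.create_study(sampler=MySampler())
    study.optimize(objective, n_trials=20)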
optuna.structs was deprecated and new classes (e.g., optuna.study.StudyDirection) were introduced in the last release (v1.4.0). Sorry for the inconvenience, but you need to use the latest Optuna to run the example.
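For example, if your code imports StudyDirection, the change is just the import path (a small sketch assuming that is the class you need):

# Before (deprecated optuna.structs):
from optuna.structs import StudyDirection

# After (v1.4.0 and later):
from optuna.study import StudyDirection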
Hi everyone! My name is Zishi and I discovered Optuna this week after watching your talk at SciPy 2019. I was searching for open-source hyperparameter optimization libraries for a project requiring regression. Thank you for making this project open-source!
I noticed all the pruning examples are performing classification, usually on the MNIST dataset.
The only regression example I could find was Keras + MLflow, on the wine quality regression dataset.
I combined code from the keras_mlflow.py example with the keras_integration.py example in the pruning folder to get a Keras + Regression + Pruning script
that evaluates Mean Squared Error on the wine quality regression dataset. I wanted to ask if you would like to add my regression script to your examples/pruning folder? I know it's just a copy of two other examples you already have, but I think the pruning examples could use at least one regression example. Please let me know what you think.
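In case it helps the discussion, the core of the combined script looks roughly like this (a sketch only: I substituted sklearn's diabetes data as a stand-in for the wine quality loading code, and the layer sizes, epoch count, and search ranges are placeholders):

import optuna
from optuna.integration import KerasPruningCallback
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

def objective(trial):
    # Stand-in regression data; the real script loads the wine quality CSV instead.
    x, y = load_diabetes(return_X_y=True)
    x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.2)

    units = trial.suggest_int("units", 16, 128)
    lr = trial.suggest_loguniform("lr", 1e-4, 1e-1)

    model = Sequential()
    model.add(Dense(units, activation="relu", input_dim=x_train.shape[1]))
    model.add(Dense(1))  # single continuous output for regression
    model.compile(loss="mse", optimizer=Adam(lr=lr))

    # Report val_loss after every epoch so unpromising trials can be pruned early.
    model.fit(
        x_train, y_train,
        validation_data=(x_valid, y_valid),
        epochs=20,
        verbose=0,
        callbacks=[KerasPruningCallback(trial, "val_loss")],
    )
    return model.evaluate(x_valid, y_valid, verbose=0)

if __name__ == "__main__":
    study = optuna.create_study(direction="minimize", pruner=optuna.pruners.MedianPruner())
    study.optimize(objective, n_trials=100)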
keras_integration.py for those who are looking for a regression example, and vice versa. I'm not sure this would've helped you address your problem, but please let me know what you think.
We've just updated our Neptune integration with Optuna, and now the interactive optuna.visualization charts can be logged and rendered in the UI:
You can also log and update those visualizations after every iteration to get a better picture while the optimization is running:
neptune_callback = opt_utils.NeptuneCallback(log_charts=True)
study.optimize(objective, n_trials=100, callbacks=[neptune_callback])
It looks like this:
I hope this is useful.
Here is a link to the docs.
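For anyone who wants to try it end to end, the wiring looks roughly like this (a sketch on my side: the import path assumes the neptunecontrib package, the project name is a placeholder, and the toy objective is just for illustration):

import neptune
import neptunecontrib.monitoring.optuna as opt_utils
import optuna

def objective(trial):
    x = trial.suggest_uniform("x", -10, 10)
    return (x - 2) ** 2

# Assumes NEPTUNE_API_TOKEN is set and "my_workspace/my_project" exists.
neptune.init("my_workspace/my_project")
neptune.create_experiment(name="optuna-sweep")

neptune_callback = opt_utils.NeptuneCallback(log_charts=True)
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100, callbacks=[neptune_callback])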
@hvy I think it would be better to include a link to keras_mlflow.py within the header documentation for keras_integration.py, and say that link contains a regression example using Keras.
Thank you for the pointers to the different sampler algorithms. I think I will first try the default TPE algorithm with Keras and LightGBM regression models. I will split my data into 80% for training/validation and 20% for testing, run 10 studies of 100 trials each on the 80% split, and get 10 optimal hyperparameter configs. Then I will initialize 10 models with those configs, train each on the 80% training/validation data, and predict on the 20% holdout test data. Finally, I will measure the average and variance of the mean squared error of those 10 models on the test set. It will probably take a week for me to get the results for the Keras model. I will keep you posted.
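Roughly, the protocol I have in mind looks like this (a sketch to check my own understanding; x, y, objective, and build_model are placeholders, not code from the examples):

import numpy as np
import optuna
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# x, y: full dataset (placeholder names); hold out 20% for final testing.
x_dev, x_test, y_dev, y_test = train_test_split(x, y, test_size=0.2)

test_mses = []
for seed in range(10):  # 10 independent studies
    study = optuna.create_study(
        direction="minimize", sampler=optuna.samplers.TPESampler(seed=seed)
    )
    study.optimize(objective, n_trials=100)  # objective cross-validates on x_dev / y_dev

    # Retrain on all dev data with the best hyperparameters, then score the holdout.
    model = build_model(**study.best_params)  # hypothetical helper
    model.fit(x_dev, y_dev)
    test_mses.append(mean_squared_error(y_test, model.predict(x_test)))

print("mean MSE:", np.mean(test_mses), "variance:", np.var(test_mses))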
keras_mlflow.py to e.g. Boston.
@hvy I would be honored to make a PR to include the link to keras_integration.py. As for the Boston dataset, I think it would make sense to use it as the standard regression example. The problem with the wine quality dataset is that the target variable is not continuous and only has 3 different values. In comparison, the Boston dataset uses housing price as the target variable, which is continuous.
I can start with a PR to include a link to keras_integration.py. Afterwards, I can work on recreating keras_mlflow.py with the Boston dataset instead of the wine dataset. How does that sound?
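To make the swap concrete, loading Boston instead of the wine CSV could look roughly like this (a sketch; load_boston was still available in scikit-learn at the time, and the variable names are mine):

from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

# Continuous target: median house value in $1000s.
x, y = load_boston(return_X_y=True)
x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.25)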
The LightGBMTunerCV class is amazing. It saved me a lot of time today. Using it with default settings, I consistently got LightGBM models with 30% less prediction error than my baseline predictions on my regression dataset of 10k samples. And it only takes a minute to run the study, whereas each cross-validated Keras trial took 1 minute and a study of 100 trials took 1-2 hours.
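For anyone curious, my usage was essentially the default pattern (a sketch; x_train / y_train stand in for my own data, and the fixed params are just the ones I happened to use):

import optuna.integration.lightgbm as lgb

# Fixed params; LightGBMTunerCV tunes the remaining key hyperparameters stepwise.
params = {
    "objective": "regression",
    "metric": "l2",
    "verbosity": -1,
}
dtrain = lgb.Dataset(x_train, label=y_train)

tuner = lgb.LightGBMTunerCV(params, dtrain, num_boost_round=1000, nfold=5, early_stopping_rounds=50)
tuner.run()

print("Best CV score (l2):", tuner.best_score)
print("Best params:", tuner.best_params)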
Not AFAIK. If others are fine with it, then there’s no need to change it actually. Open for discussion.
Thanks for sharing your use case, that's great. It seems like the CV component was one of the most appreciated additions in this release. There was even a blog post written about it within hours of the release.
So this works in my simple example (foo.py is the example from the CLI doc). I launch the studies asynchronously and sync the DB.
study_name=first_study
STUDY_NAME=`optuna create-study --storage sqlite:///example.db --study-name $study_name --skip-if-exists`
for i in $(seq 1 100); do
    # Replace by job submission here
    optuna study optimize foo.py objective --n-trials=1 --storage sqlite:///example.db --study-name $STUDY_NAME &
done
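In case the Python API is clearer, I believe each backgrounded CLI call in the loop is roughly equivalent to this (a sketch; it assumes the same SQLite URL and that foo.py exposes the objective):

import optuna

from foo import objective  # foo.py from the CLI doc

# Attach to the shared study in the SQLite storage and run a single trial,
# just like each backgrounded CLI call above.
study = optuna.load_study(study_name="first_study", storage="sqlite:///example.db")
study.optimize(objective, n_trials=1)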
Are there reasons not to do that? Conflicts in DB or anything I should be worried about?
PS: I have no knowledge of DBs and SQL yet.