
Hyperparameter tuning

  • Hyperparameter tuning in general
    • General pipeline
    • Manual and automatic tuning
    • What should we understand about hyperparameters?
  • Models, libraries and hyperparameter optimization
    • Tree-based models
    • Neural Networks
    • Linear models

What framework to use?

The frameworks all implement essentially the same functionality (except sklearn)

Don't spend too much time tuning hyperparameters

  • Tune extensively only when you have run out of other ideas or have spare computational resources

Be patient

Average everything

  • Over random seed
  • Or over small deviations from optimal parameters
    • e.g. average max_depth=4,5,6 for an optimal 5
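The averaging advice above can be sketched with scikit-learn; the dataset, the model, and the assumed-optimal max_depth of 5 are illustrative assumptions, not part of the original notes:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Toy regression problem standing in for a real competition dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Average predictions over several random seeds and over small
# deviations (4, 5, 6) around the assumed-optimal max_depth of 5.
preds = []
for seed in [0, 1, 2]:
    for depth in [4, 5, 6]:
        model = GradientBoostingRegressor(max_depth=depth, random_state=seed)
        model.fit(X_tr, y_tr)
        preds.append(model.predict(X_te))

avg_pred = np.mean(preds, axis=0)
```

The averaged prediction is usually more stable than any single run, at the cost of training the model several times.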
General pipeline

  1. Select the most influential parameters
    - There are tons of parameters and we can't tune all of them
  2. Understand how exactly they influence the training
  3. Tune them!
    - Manually (change and examine)
    - Automatically (hyperopt, etc.)

A lot of libraries to try; the example below uses hyperopt:

from hyperopt import fmin, tpe, hp

def xgb_score(param):
  # Run XGBoost with hyperparameters 'param' (e.g. via cross-validation)
  # and return the validation loss for hyperopt to minimize.
  ...

def xgb_hyperopt():
  space = {
    'eta': 0.01,
    'max_depth': hp.quniform('max_depth', 10, 30, 1),
    'min_child_weight': hp.quniform('min_child_weight', 0, 100, 1),
    'subsample': hp.quniform('subsample', 0.1, 1.0, 0.1),
    'gamma': hp.quniform('gamma', 0.0, 30, 0.5),
    'colsample_bytree': hp.quniform('colsample_bytree', 0.1, 1.0, 0.1),
    'objective': 'reg:linear',
    'nthread': 28,
    'silent': 1,
    'num_round': 2500,
    'seed': 2441,
    'early_stopping_rounds': 100
  }

  best = fmin(xgb_score, space, algo=tpe.suggest, max_evals=1000)
  return best
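Per hyperopt's documentation, hp.quniform(label, low, high, q) samples round(uniform(low, high) / q) * q, i.e. uniform draws snapped to a grid of step q; that is why the integer-valued parameters above (max_depth, min_child_weight) use q=1. A plain-NumPy sketch of that distribution:

```python
import numpy as np

def quniform(low, high, q, size, rng):
    # Uniform draws snapped to a grid of step q, mirroring hp.quniform.
    return np.round(rng.uniform(low, high, size) / q) * q

rng = np.random.default_rng(0)
depths = quniform(10, 30, 1, 1000, rng)  # grid-valued candidates for max_depth
```

Note that quniform yields floats (e.g. 12.0), so integer parameters such as max_depth should be cast with int() inside the scoring function.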
On the training curves, a model's fit falls into one of three regimes:

  1. Underfitting (bad, red curve)
  2. Good fit and generalization (good)
  3. Overfitting (bad, green curve)

A parameter in red

  1. Increasing it impedes fitting
  2. Increase it to reduce overfitting
  3. Decrease it to allow the model to fit more easily

A parameter in green

  1. Increasing it leads to a better (over)fit on the train set
  2. Increase it if the model underfits
  3. Decrease it if the model overfits
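The red/green rules above can be captured in a small helper. This is a sketch: the assignment of specific XGBoost parameters to the "green" (capacity-increasing) and "red" (constraining) groups, and the step sizes, are illustrative assumptions:

```python
# "Green" parameters: increasing them gives the model more capacity.
# "Red" parameters: increasing them constrains the model.
GREEN_STEPS = {"max_depth": 1, "num_round": 100}
RED_STEPS = {"min_child_weight": 5, "gamma": 0.5}

def adjust(params, overfitting):
    # Nudge each known parameter one step: reduce capacity when
    # overfitting, add capacity when underfitting.
    new = dict(params)
    for name, step in GREEN_STEPS.items():
        if name in new:
            new[name] += -step if overfitting else step
    for name, step in RED_STEPS.items():
        if name in new:
            new[name] += step if overfitting else -step
    return new

print(adjust({"max_depth": 6, "min_child_weight": 1}, overfitting=True))
# {'max_depth': 5, 'min_child_weight': 6}
```

In practice you would re-train after each adjustment and watch the train/validation gap rather than stepping blindly.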


  • Last modified: 2020/07/15 08:43
  • Author: 127.0.0.1