Neural nets
Number of neurons per layer
Number of layers
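The two items above (width and depth) are the basic architecture hyperparameters. A minimal sketch, assuming a plain NumPy MLP (all function names here are illustrative, not from these notes), where both are captured by a single list of layer sizes:

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Initialize weights for an MLP; layer_sizes like [in, h1, h2, out].
    len(layer_sizes) - 1 is the number of layers; each entry is a width."""
    rng = np.random.default_rng(seed)
    return [rng.normal(0.0, 0.1, size=(m, n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(weights, x):
    """ReLU MLP forward pass (biases omitted for brevity)."""
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)
    return x @ weights[-1]

# Two hidden layers of 16 neurons each, mapping 4 inputs to 2 outputs.
weights = init_mlp([4, 16, 16, 2])
out = forward(weights, np.ones((3, 4)))
```

Tuning then amounts to searching over `layer_sizes`.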
Optimizers
  SGD + momentum
  Adam / Adadelta / Adagrad / …
    In practice, the adaptive optimizers tend to lead to more overfitting
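A minimal NumPy sketch of the two optimizer families above, SGD with classical momentum versus Adam, run on a toy quadratic f(w) = w². The update rules are the standard ones; the function names and toy problem are mine:

```python
import numpy as np

def sgd_momentum_step(w, grad, vel, lr=0.1, mu=0.9):
    # Classical momentum: velocity accumulates an exponential average of steps.
    vel = mu * vel - lr * grad
    return w + vel, vel

def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-parameter step sizes from bias-corrected moment estimates.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(w) = w^2 (gradient 2w) with both optimizers.
w_sgd, vel = 5.0, 0.0
w_adam, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    w_sgd, vel = sgd_momentum_step(w_sgd, 2 * w_sgd, vel)
    w_adam, m, v = adam_step(w_adam, 2 * w_adam, m, v, t)
```

Both reach the minimum here; the overfitting claim in the notes is about behavior on real training sets, not on toy convex problems.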
Batch size
Learning rate
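Batch size and learning rate are usually tuned together. One common heuristic (the linear scaling rule; an assumption added here, not something these notes state) grows the learning rate proportionally with the batch size:

```python
def scaled_lr(base_lr, base_batch, batch):
    """Linear scaling heuristic: when the batch grows k-fold,
    grow the learning rate k-fold as well."""
    return base_lr * batch / base_batch

lr = scaled_lr(0.1, 256, 1024)  # 4x the batch -> 4x the learning rate
```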
Regularization
  L2 / L1 for weights
  Dropout / Dropconnect
  Static dropconnect
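A small NumPy sketch of two of the regularizers above: the gradient contribution of an L2 weight penalty, and inverted dropout (Dropconnect would instead drop individual weights rather than activations). Function names are illustrative:

```python
import numpy as np

def l2_penalty_grad(w, lam=1e-2):
    """Gradient contribution of the L2 term (lam / 2) * ||w||^2."""
    return lam * w

def dropout(x, p=0.5, rng=None, train=True):
    """Inverted dropout: zero each activation with probability p,
    rescale survivors by 1/(1-p) so the expected value is unchanged."""
    if not train:
        return x  # at inference time, dropout is a no-op
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

h = dropout(np.ones(8), p=0.5)  # surviving units are scaled to 2.0
```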
Related documents
Hyperparameter tuning