Problems occurring during validation

Causes of differing scores and optimal parameters across validation splits:

  1. Too little data
  2. Too diverse and inconsistent data

In these cases we should do more extensive validation:

  1. Average the scores from different KFold splits
  2. Tune the model on one split, evaluate the score on the other
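
The two steps above can be sketched as follows. This is a minimal, standard-library-only illustration rather than a reference implementation: the helper names (`kfold_indices`, `cv_score`, `averaged_cv_score`, `tune_then_evaluate`) and the toy "model" (predict `alpha` times the training mean) are assumptions made up for this example. In practice a library splitter such as scikit-learn's `KFold` with different `random_state` values would play the role of `kfold_indices`.

```python
import random
import statistics


def kfold_indices(n_samples, n_splits, seed):
    """Yield (train, valid) index lists for one shuffled KFold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds of near-equal size.
    folds = [indices[k::n_splits] for k in range(n_splits)]
    for k in range(n_splits):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid


def mse(prediction, targets):
    """Mean squared error of a constant prediction."""
    return statistics.fmean((t - prediction) ** 2 for t in targets)


def cv_score(y, alpha, n_splits, seed):
    """Mean validation MSE of the toy model over one KFold split."""
    scores = []
    for train, valid in kfold_indices(len(y), n_splits, seed):
        # Toy model (hypothetical): predict alpha * mean of the training targets.
        prediction = alpha * statistics.fmean(y[i] for i in train)
        scores.append(mse(prediction, [y[i] for i in valid]))
    return statistics.fmean(scores)


def averaged_cv_score(y, alpha, n_splits=5, seeds=(0, 1, 2)):
    """Step 1: average the score over several differently seeded KFold splits."""
    return statistics.fmean(cv_score(y, alpha, n_splits, s) for s in seeds)


def tune_then_evaluate(y, alphas, n_splits=5, tune_seed=0, eval_seed=1):
    """Step 2: pick alpha on one split, then report its score on another split."""
    best_alpha = min(alphas, key=lambda a: cv_score(y, a, n_splits, tune_seed))
    return best_alpha, cv_score(y, best_alpha, n_splits, eval_seed)


if __name__ == "__main__":
    rng = random.Random(42)
    y = [rng.gauss(10.0, 2.0) for _ in range(50)]  # synthetic targets
    print("averaged score:", averaged_cv_score(y, alpha=1.0))
    print("tuned alpha, held-out split score:",
          tune_then_evaluate(y, alphas=[0.5, 0.9, 1.0, 1.1]))
```

Evaluating on a split other than the one used for tuning keeps the reported score from being optimistically biased by the parameter search itself.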

We can observe that:

  • The leaderboard (LB) score is consistently higher or lower than the validation score
  • The LB score is not correlated with the validation score at all

Possible causes include:

  • Randomness
  • Too little data
  • Different public/private distributions
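
One way to tell a consistent offset apart from a missing correlation is to record the validation and LB score for each submission and compare the two series. The sketch below assumes such paired scores are available; `compare_scores` is a hypothetical helper written for this note, not part of any library.

```python
import math
import statistics


def compare_scores(valid_scores, lb_scores):
    """Return (mean LB-minus-validation gap, Pearson correlation) for paired scores."""
    if len(valid_scores) != len(lb_scores) or len(valid_scores) < 2:
        raise ValueError("need at least two paired scores")
    mean_gap = statistics.fmean(lb - v for v, lb in zip(valid_scores, lb_scores))
    mean_v = statistics.fmean(valid_scores)
    mean_lb = statistics.fmean(lb_scores)
    cov = sum((v - mean_v) * (lb - mean_lb)
              for v, lb in zip(valid_scores, lb_scores))
    var_v = sum((v - mean_v) ** 2 for v in valid_scores)
    var_lb = sum((lb - mean_lb) ** 2 for lb in lb_scores)
    if var_v == 0 or var_lb == 0:
        # Correlation is undefined when one series is constant.
        return mean_gap, math.nan
    return mean_gap, cov / math.sqrt(var_v * var_lb)
```

A gap far from zero together with a correlation near 1 matches the first observation (a consistent offset), while a correlation near 0 matches the second. With only a handful of submissions both numbers are noisy, so they are a rough diagnostic at best.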

  • Last modified: 2020/06/02 09:25
  • Author: 127.0.0.1