Problems occurring during validation
Validation stage
Causes of different scores and optimal parameters
- Too little data
- Too diverse and inconsistent data
We should do extensive validation
- Average scores from different KFold splits
- Tune model on one split, evaluate score on the other
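The two extensive-validation ideas above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names (`kfold_indices`, `cv_score`, `averaged_cv_score`, `tune_then_evaluate`), the toy "shrunken mean" model with its hypothetical hyperparameter `lam`, and the synthetic targets are all assumptions made for the example. In practice a library splitter such as scikit-learn's `KFold` would normally take the place of the hand-written one.

```python
import random
from statistics import mean


def kfold_indices(n, k, seed):
    """Shuffle indices 0..n-1 with `seed` and cut them into k (train, valid) pairs."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    pairs = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        pairs.append((train, fold))
    return pairs


def cv_score(y, lam, k, seed):
    """Mean validation MSE of a toy shrunken-mean predictor over one K-fold split."""
    scores = []
    for train, valid in kfold_indices(len(y), k, seed):
        # Toy model (assumption): predict sum(train targets) / (n_train + lam).
        pred = sum(y[i] for i in train) / (len(train) + lam)
        scores.append(mean([(y[i] - pred) ** 2 for i in valid]))
    return mean(scores)


def averaged_cv_score(y, lam, k, seeds):
    """Average the CV score over several differently seeded K-fold splits."""
    return mean([cv_score(y, lam, k, s) for s in seeds])


def tune_then_evaluate(y, lams, k, tune_seed, eval_seed):
    """Pick `lam` on one split, then report its score on a different split."""
    best = min(lams, key=lambda lam: cv_score(y, lam, k, tune_seed))
    return best, cv_score(y, best, k, eval_seed)


# Synthetic targets, for illustration only.
rng = random.Random(0)
y = [rng.gauss(5.0, 2.0) for _ in range(40)]

avg = averaged_cv_score(y, lam=0.0, k=5, seeds=[1, 2, 3])
best_lam, eval_score = tune_then_evaluate(
    y, lams=[0.0, 1.0, 10.0], k=5, tune_seed=1, eval_seed=2
)
print(avg, best_lam, eval_score)
```

Averaging over seeds reduces the noise that any single split introduces, and scoring the tuned parameter on a split that was not used for tuning gives a less optimistic estimate.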
Submission stage
We can observe that:
- LB score is consistently higher/lower than validation score
- LB score does not correlate with validation score at all
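Both observations can be checked numerically once a few submissions have been made. The sketch below is an assumption-laden illustration: `pearson` and `compare_scores` are hypothetical helper names, and the score lists are made-up numbers, not results from any competition. A mean gap far from zero alongside a high correlation points to a consistent offset, while a correlation near zero points to the second case.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def compare_scores(valid, lb):
    """Summarise how LB scores relate to validation scores across submissions."""
    gaps = [l - v for v, l in zip(valid, lb)]
    return {
        "mean_gap": sum(gaps) / len(gaps),      # consistent offset, LB minus validation
        "correlation": pearson(valid, lb),      # do the two scores move together?
    }


# Made-up scores for four hypothetical submissions.
valid_scores = [0.70, 0.72, 0.75, 0.78]
lb_scores = [0.65, 0.67, 0.70, 0.73]

report = compare_scores(valid_scores, lb_scores)
print(report)
```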
Expect LB shuffle because of
- Randomness
- Small amount of data
- Different public/private distributions