Supervised ensemble classifier for Kaggle's Spaceship Titanic competition

polarr/titan

Titan

An ensemble classifier for the Spaceship Titanic Kaggle competition.

Team: cranberry128, bobbbbbbyli, yangtom0516

Model

We rely on extensive feature engineering and stack a variety of models. The primary base model is a Random Forest.

The current ensemble uses a HistGradientBoostingClassifier as the meta-model on top of the following base models:

  • RandomForestClassifier (anchor, fine-tuned)
  • ExtraTreesClassifier
  • GradientBoostingClassifier
  • XGBClassifier
  • LogisticRegression
  • CatBoostClassifier
  • SGDClassifier
  • LinearDiscriminantAnalysis
  • BaggingClassifier
  • MLPClassifier

Results

Current Best Accuracy (Stacking): 0.81318, rank 73/~2700 teams

Current Best Individual Model Accuracy (Random Forest): 0.80009

Usage

base_models.ipynb is a notebook for fine-tuning individual base models. stacking_ensemble.ipynb is self-contained and creates the ensemble model.

The notebooks automatically create the files/, model/ and output/ directories. You must copy train.csv, test.csv, and sample_submission.csv from the competition data into files/.
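
The directory setup can be pictured with a short sketch like the one below. It is an assumption about how such a step might look, not the notebooks' actual code; the helper name prepare_workspace and the constant REQUIRED_FILES are hypothetical.

```python
from pathlib import Path

# Competition files that must be placed in files/ by hand.
REQUIRED_FILES = ("train.csv", "test.csv", "sample_submission.csv")


def prepare_workspace(root="."):
    """Create files/, model/ and output/ under root; return missing CSV names."""
    root = Path(root)
    for name in ("files", "model", "output"):
        (root / name).mkdir(parents=True, exist_ok=True)
    # Report which competition files still need to be copied in.
    return [f for f in REQUIRED_FILES if not (root / "files" / f).is_file()]
```

Calling prepare_workspace() before loading data gives an early, readable list of missing files instead of a file-not-found error deeper in the notebook.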

Trained models are saved to disk in model/, and predictions are saved to output/. For convenience, the stacking notebook caches models in stackcache/ under a key of modelname_seed. To retrain a model (e.g. with different hyperparameters), simply delete the corresponding cache file.
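
The caching behaviour described above could be sketched roughly as follows. This is only an illustration: the .pkl extension, the use of pickle, and the helper names cache_path and load_or_train are assumptions, and the notebook may serialize models differently.

```python
import pickle
from pathlib import Path

CACHE_DIR = Path("stackcache")


def cache_path(model_name, seed, cache_dir=CACHE_DIR):
    """Build the cache file path from the modelname_seed key."""
    return Path(cache_dir) / f"{model_name}_{seed}.pkl"


def load_or_train(model_name, seed, train_fn, cache_dir=CACHE_DIR):
    """Return the cached model if present; otherwise train, cache, and return it."""
    path = cache_path(model_name, seed, cache_dir)
    if path.is_file():
        with path.open("rb") as fh:
            return pickle.load(fh)
    model = train_fn(seed)  # train_fn is a caller-supplied function of the seed
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)
    return model
```

Because the key contains only the model name and seed, a change in hyperparameters does not invalidate the cache on its own, which is why the cache file has to be deleted by hand before retraining.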
