Run the following commands to install other packages:
pip3 install -r requirements.txt
pip3 install -U spacy
Download the data from the Toxic Comment Classification Challenge webpage.
Navigate to the src folder.
Machine Learning models: Run the following commands (back to back):
python3 preprocessing.py
python3 models.py
fastText models: Run the fasttext notebook.
Deep Learning models: Run the deeplearning and deeplearning2 notebooks.
All vectorized n-grams, AUC-ROC summary dataframes, predictions and probabilities will be dumped in the pickle_objects/ folder.
Models and ROC curve plots will be dumped in the folders pickle_objects/models/ and plots/ (or pickle_objects/models_features/ and plots_features/ if you choose to use extra features -- see models.py).
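To inspect the dumped objects afterwards, something like the following minimal sketch may help. It assumes the dumps are ordinary Python pickle files with a `.pkl` extension sitting directly in pickle_objects/; the extension and the helper name `load_pickled_objects` are assumptions for illustration, not something defined in this repository, so adjust `pattern` to match the files you actually see.

```python
import pickle
from pathlib import Path


def load_pickled_objects(folder="pickle_objects", pattern="*.pkl"):
    """Load every pickle file directly inside `folder` that matches `pattern`.

    Hypothetical helper: the ".pkl" extension is an assumption about how
    the dumps are named. Returns a dict mapping file stem -> object.
    """
    loaded = {}
    for path in sorted(Path(folder).glob(pattern)):
        # Skip sub-folders such as models/ that happen to match the pattern.
        if path.is_file():
            with path.open("rb") as handle:
                loaded[path.stem] = pickle.load(handle)
    return loaded
```

Calling `load_pickled_objects()` from the repository root returns a dictionary keyed by file name (without the extension). Unpickling needs the libraries that created the objects to be importable in the same environment, and pickle files should only be loaded from sources you trust.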