Natural Language Processing with Quora

Tackling Kaggle's Quora question pairs competition

Blog post describing my exploration of the dataset: https://medium.com/@gabrieltseng/natural-language-processing-with-quora-9737b40700c8

Exploration

Exploration.ipynb

In this notebook, I test several ways of preprocessing the data before feeding it to my neural network, including cleaning the text and adding leaky features.
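As a rough illustration of these two kinds of preprocessing, here is a minimal sketch. It is not the code from the notebook: the function names `clean_question` and `question_frequencies`, the cleaning rules, and the toy data are all assumptions. The leaky feature shown (how often a question appears across the dataset) is one that was widely discussed for this competition; whether the notebook uses exactly this feature is also an assumption.

```python
import re
from collections import Counter


def clean_question(text):
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def question_frequencies(pairs):
    """Count how often each cleaned question appears across all pairs."""
    counts = Counter()
    for q1, q2 in pairs:
        counts[clean_question(q1)] += 1
        counts[clean_question(q2)] += 1
    return counts


# Hypothetical toy data standing in for the question pairs.
pairs = [
    ("How do I learn Python?", "What is the best way to learn Python?"),
    ("How do I learn Python?", "How can I learn Java?"),
]

freq = question_frequencies(pairs)

# One (q1 frequency, q2 frequency) tuple per pair, usable as extra model inputs.
features = [(freq[clean_question(q1)], freq[clean_question(q2)]) for q1, q2 in pairs]
```

A frequency like this is "leaky" because it reflects how the dataset was assembled rather than what the questions mean, which is why it is kept separate from the text cleaning.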

Training

Training_Cleaned_NeuralNetwork.ipynb

I then train a neural network (without leaky features) to convergence.

SpaCy

SpaCy_NLP.ipynb

I briefly explore SpaCy's capabilities using this dataset, for potential future use.

Further steps I could take

  1. Ensembling, particularly if I train a neural network with word2vec instead of GloVe embeddings.
  2. Identifying 'legal' leaky features (such as question lengths).
  3. Further optimization of hyperparameters.
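
For the second item, question-length features could be computed along these lines. This is only a sketch: the function name `length_features` and the particular set of features are hypothetical, not something already implemented in the notebooks.

```python
def length_features(q1, q2):
    """Return character- and word-level length features for one question pair."""
    chars1, chars2 = len(q1), len(q2)
    words1, words2 = len(q1.split()), len(q2.split())
    return {
        "q1_chars": chars1,
        "q2_chars": chars2,
        "char_diff": abs(chars1 - chars2),
        "q1_words": words1,
        "q2_words": words2,
        "word_diff": abs(words1 - words2),
    }


# Hypothetical example pair.
example = length_features(
    "How do I learn Python?",
    "What is the best way to learn Python?",
)
```

These features depend only on the text of the two questions, so they could be concatenated with the network's learned representation without relying on how the dataset was constructed.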