This folder contains code to do the following:
- Preprocess the wikitext datasets
- Train a universal language model on these datasets. The model can be recurrent or convolutional.
- Fine-tune the language model on a downstream task, specifically the Toxic Comment Classification Challenge.
Universal models are motivated by [1]. Two different "universal" models are explored: a recurrent model (RecLM), based on [2], and a convolutional model (ConvLM), motivated by [3].
spaCy is used to tokenize the wikitext dataset, with parallel processing for efficiency.
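A minimal sketch of what this tokenization step could look like, assuming spaCy's `nlp.pipe` with multiple worker processes; the file path, batch size, and process count are placeholders rather than the repo's actual settings:

```python
# Hedged sketch: tokenize wikitext lines in parallel with spaCy.
# The path, batch size, and process count below are illustrative only.
import spacy

def tokenize_lines(lines, n_process=4, batch_size=1000):
    """Yield each raw line as a list of token strings."""
    nlp = spacy.blank("en")  # tokenizer only; no tagger/parser/NER needed
    for doc in nlp.pipe(lines, n_process=n_process, batch_size=batch_size):
        yield [tok.text for tok in doc]

if __name__ == "__main__":
    with open("data/wikitext-103/wiki.train.tokens", encoding="utf-8") as f:
        for tokens in tokenize_lines(f):
            print(tokens[:10])
            break
```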
The convolutional language model consists of temporal convolutional blocks, which are themselves composed of variationally weight-dropped convolutional layers, and each block has a residual connection.
The convolutional layers are variationally weight-dropped to mimic the variational weight drop used in the recurrent language model: the same weights are dropped across all timesteps, so every timestep in a convolution's output sequence is processed in the same way.
Variational dropout is also used for the embedding layer.
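As a rough illustration of these ideas (class names such as `WeightDropConv1d` and `TemporalBlock` are hypothetical, not the repo's actual modules), a causal, residual temporal block with DropConnect applied to the convolution weights might look like this:

```python
# Hedged sketch of a residual temporal block with variational weight drop.
import torch.nn as nn
import torch.nn.functional as F

class WeightDropConv1d(nn.Conv1d):
    """Conv1d whose weights are DropConnect-ed once per forward pass, so the
    same dropped weights are applied at every timestep of the sequence."""
    def __init__(self, *args, weight_dropout=0.5, **kwargs):
        super().__init__(*args, **kwargs)
        self.weight_dropout = weight_dropout

    def forward(self, x):
        w = F.dropout(self.weight, p=self.weight_dropout, training=self.training)
        return F.conv1d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

class TemporalBlock(nn.Module):
    """Residual block of causal, weight-dropped temporal convolutions."""
    def __init__(self, channels, kernel_size=3, dilation=1, weight_dropout=0.5):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad only, so convs are causal
        self.conv1 = WeightDropConv1d(channels, channels, kernel_size,
                                      dilation=dilation, weight_dropout=weight_dropout)
        self.conv2 = WeightDropConv1d(channels, channels, kernel_size,
                                      dilation=dilation, weight_dropout=weight_dropout)

    def forward(self, x):  # x: (batch, channels, time)
        out = F.relu(self.conv1(F.pad(x, (self.pad, 0))))
        out = F.relu(self.conv2(F.pad(out, (self.pad, 0))))
        return x + out  # residual connection
```

Because the dropped weight tensor is sampled once and shared across the whole sequence, every timestep is filtered identically, mirroring the variational weight drop of the recurrent model.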
Training results:
The recurrent language model consists of weight-dropped RNNs stacked on top of each other, as in [2].
Variational dropout is used for the embedding layer.
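A hedged sketch of what such a model might look like (names and hyperparameters such as `WeightDropLSTMLayer`, `RecLM`, and the layer sizes are illustrative, not taken from the repo): embedding dropout zeroes whole word vectors, and each recurrent layer samples one DropConnect mask for its hidden-to-hidden weights per sequence.

```python
# Hedged sketch of a stacked weight-dropped recurrent LM; all names and
# hyperparameters here are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def embedding_dropout(embed, tokens, p=0.1, training=True):
    """Variational dropout on the embedding layer: drop whole word vectors,
    using a single mask per word type for the entire batch."""
    if not training or p == 0:
        return embed(tokens)
    keep = torch.bernoulli(
        torch.full((embed.num_embeddings, 1), 1 - p, device=embed.weight.device))
    return F.embedding(tokens, embed.weight * keep / (1 - p), embed.padding_idx)

class WeightDropLSTMLayer(nn.Module):
    """One LSTM layer unrolled by hand so a single DropConnect mask on the
    hidden-to-hidden weights is reused at every timestep (variational weight drop)."""
    def __init__(self, input_size, hidden_size, weight_dropout=0.5):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.weight_dropout = weight_dropout

    def forward(self, x, state=None):  # x: (batch, time, input_size)
        batch, time, _ = x.shape
        h, c = state if state is not None else (
            x.new_zeros(batch, self.hidden_size), x.new_zeros(batch, self.hidden_size))
        # Sample the recurrent-weight mask once per sequence, not per timestep.
        w_hh = F.dropout(self.cell.weight_hh, p=self.weight_dropout, training=self.training)
        outputs = []
        for t in range(time):
            gates = (F.linear(x[:, t], self.cell.weight_ih, self.cell.bias_ih)
                     + F.linear(h, w_hh, self.cell.bias_hh))
            i, f, g, o = gates.chunk(4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs, dim=1), (h, c)

class RecLM(nn.Module):
    """Embedding -> stacked weight-dropped LSTM layers -> vocabulary logits."""
    def __init__(self, vocab_size, emb_size=400, hidden_size=1150, n_layers=3,
                 weight_dropout=0.5, emb_dropout=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_size)
        sizes = [emb_size] + [hidden_size] * n_layers
        self.rnns = nn.ModuleList(
            WeightDropLSTMLayer(sizes[i], sizes[i + 1], weight_dropout)
            for i in range(n_layers))
        self.emb_dropout = emb_dropout
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):  # tokens: (batch, time)
        out = embedding_dropout(self.embed, tokens, self.emb_dropout, self.training)
        for rnn in self.rnns:
            out, _ = rnn(out)
        return self.decoder(out)  # (batch, time, vocab_size) logits
```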
Training results: