Text classification algorithms such as logistic regression; vector space models for natural language semantics; structured prediction, hidden Markov models; N-gram language modelling, including statistical estimation; alignment of parallel corpora; term indexing, term weighting for information retrieval; query expansion and relevance feedback.

See GitHub for the code.

Detailed documentation is provided here: