Text Data Documentation Guide

text_data is a tool designed to help you explore and analyze text data.

With a parallelized Rust code base, an efficient data structure, and a suite of visual and statistical tools, it gives you a strong starting point for analyzing text data.

It is not a machine learning framework, nor does it provide all of the preprocessing tools you’ll need to perform an analysis. But it’s designed to work well with ML frameworks and with libraries like spaCy or NLTK.

If that sounds like the kind of thing that interests you, follow along with the tutorial as I use text_data to analyze the Kaggle State of the Union Corpus.
