document similarity Archives - Text Analytics Techniques

Document Similarity in Machine Learning Text Analysis with TF-IDF

May 4, 2019May 1, 2019 by owygs156

Despite of the appearance of new word embedding techniques for converting textual data into numbers, TF-IDF still often can be found in many articles or blog posts for information retrieval, user modeling, text classification algorithms, text analytics (extracting top terms for example) and other text mining techniques. In this text we will look what is … Read more

Document Similarity, Tokenization and Word Vectors in Python with spaCY

July 19, 2018April 21, 2018 by owygs156

Calculating document similarity is very frequent task in Information Retrieval or Text Mining. Years ago we would need to build a document-term matrix or term-document matrix that describes the frequency of terms that occur in a collection of documents and then do word vectors math to find similarity. Now by using spaCY it can be … Read more