Document Similarity in Machine Learning Text Analysis with ELMo

In this post we will look at using ELMo to compute the similarity between text documents. ELMo is one of the word embedding techniques in wide use today. In the previous post we used TF-IDF to calculate the similarity of text documents. TF-IDF is based on counting word frequencies. Both techniques can be used for converting text …

FastText Word Embeddings for Text Classification with MLP and Python

Word embeddings are now widely used in many text applications and natural language processing models. In previous posts I showed examples of how to use word embeddings from Google's word2vec and from GloVe models for different tasks, including machine learning clustering: GloVe – How to Convert Word to Vector with GloVe and Python; word2vec – Vector Representation …

Vector Representation of Text – Word Embeddings with word2vec

Computers cannot understand text directly. We need to convert text into numerical vectors before any kind of text analysis, such as text clustering or classification. The classic, well-known model is bag of words (BOW). With this model we have one dimension for each unique word in the vocabulary. We represent the document as a vector with …
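
As a rough illustration of the bag-of-words idea described in this excerpt, here is a minimal Python sketch. It assumes whitespace tokenization, lowercasing, and raw word counts as the vector values; the excerpt is cut off before it says how each dimension is weighted, so the counting scheme is an assumption. The function names `build_vocabulary` and `bow_vector` and the sample sentences are made up for this example.

```python
from collections import Counter


def build_vocabulary(documents):
    """Assign one vector index to each unique lowercase token."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(document, vocabulary):
    """Return a count vector with one dimension per vocabulary word."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # words outside the vocabulary are ignored
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical sample documents, used only for illustration.
docs = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocabulary(docs)
vectors = [bow_vector(d, vocab) for d in docs]

for doc, vec in zip(docs, vectors):
    print(doc, "->", vec)
```

Each document becomes a vector whose length equals the vocabulary size, which is why the representation grows quickly with large vocabularies and ignores word order.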