python Archives - Text Analytics Techniques

How to Extract Text from Website

June 9, 2019May 27, 2019 by owygs156

Extracting data from the Web using scripts (web scraping) is widely used today for numerous purposes. One of the parts of this process is downloading actual text from urls. This will be the topic of this post. We will consider how it can be done using the following case examples: Extracting information from visited links … Read more

Document Similarity, Tokenization and Word Vectors in Python with spaCY

July 19, 2018April 21, 2018 by owygs156

Calculating document similarity is very frequent task in Information Retrieval or Text Mining. Years ago we would need to build a document-term matrix or term-document matrix that describes the frequency of terms that occur in a collection of documents and then do word vectors math to find similarity. Now by using spaCY it can be … Read more

FastText Word Embeddings for Text Classification with MLP and Python

November 15, 2018January 30, 2018 by owygs156

Word embeddings are widely used now in many text applications or natural language processing moddels. In the previous posts I showed examples how to use word embeddings from word2vec Google, glove models for different tasks including machine learning clustering: GloVe – How to Convert Word to Vector with GloVe and Python word2vec – Vector Representation … Read more

How to Convert Word to Vector with GloVe and Python

November 15, 2018January 14, 2018 by owygs156

In the previous post we looked at Vector Representation of Text with word embeddings using word2vec. Another approach that can be used to convert word to vector is to use GloVe – Global Vectors for Word Representation. Per documentation from home page of GloVe [1] “GloVe is an unsupervised learning algorithm for obtaining vector representations … Read more