{"id":449,"date":"2018-09-21T00:11:31","date_gmt":"2018-09-21T00:11:31","guid":{"rendered":"http:\/\/ai.intelligentonlinetools.com\/ml\/?p=449"},"modified":"2018-11-17T00:50:38","modified_gmt":"2018-11-17T00:50:38","slug":"text-clustering-word-embedding-machine-learning","status":"publish","type":"post","link":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/","title":{"rendered":"Text Clustering with Word Embedding in Machine Learning"},"content":{"rendered":"<div class=\"mkffk69e49d486376f\" ><script async src=\"\/\/pagead2.googlesyndication.com\/pagead\/js\/adsbygoogle.js\"><\/script>\n<!-- Text analytics techniques 728_90 horizontal top -->\n<ins class=\"adsbygoogle\"\n     style=\"display:inline-block;width:728px;height:90px\"\n     data-ad-client=\"ca-pub-3416618249440971\"\n     data-ad-slot=\"2926649501\"><\/ins>\n<script>\n(adsbygoogle = window.adsbygoogle || []).push({});\n<\/script><\/div><style type=\"text\/css\">\r\n.mkffk69e49d486376f {\r\nmargin: 5px; padding: 0px;\r\n}\r\n@media screen and (min-width: 1201px) {\r\n.mkffk69e49d486376f {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 993px) and (max-width: 1200px) {\r\n.mkffk69e49d486376f {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 769px) and (max-width: 992px) {\r\n.mkffk69e49d486376f {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 768px) and (max-width: 768px) {\r\n.mkffk69e49d486376f {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (max-width: 767px) {\r\n.mkffk69e49d486376f {\r\ndisplay: block;\r\n}\r\n}\r\n<\/style>\r\n<p><img decoding=\"async\" loading=\"lazy\" src=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/word2vec-1-e1537725898810.png\" alt=\"\" width=\"600\" height=\"144\" class=\"aligncenter size-full wp-image-477\" \/><br \/>\nText clustering is widely used in many applications such as recommender systems, sentiment analysis, topic selection, user 
segmentation. Word embeddings (for example word2vec) allow us to exploit the ordering<br \/>\nof words and the semantic information in the text corpus. In this blog you can find several posts dedicated to different word embedding models:<\/p>\n<p> GloVe &#8211;<br \/>\n    <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/convert-word-to-vector-glove-python\/\" target=\"_blank\">How to Convert Word to Vector with GloVe and Python<\/a><br \/>\n  fastText &#8211;<br \/>\n    <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/fasttext-word-embeddings-text-classification-python-mlp\/\" target=\"_blank\">FastText Word Embeddings<\/a><br \/>\n  word2vec &#8211;<br \/>\n    <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-vectors-word-embeddings-word2vec\/\"  target=\"_blank\">Vector Representation of Text \u2013 Word Embeddings with word2vec<\/a><br \/>\n  word2vec application &#8211;<br \/>\n      <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\"  target=\"_blank\" >Text Analytics Techniques with Embeddings<\/a><br \/>\n      <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/word-embeddinigs-machine-learning\/\"  target=\"_blank\" >Using Pretrained Word Embeddinigs in Machine Learning<\/a><br \/>\n      <a href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/k-means-clustering-example-word2vec\/\"  target=\"_blank\">K Means Clustering Example with Word2Vec in Data Mining or Machine Learning<\/a><\/p>\n<p>In contrast to the last post in the list above, in this post we will see how to do text clustering with word embeddings at the <b>sentence (phrase) level<\/b>.  A sentence here could be a few words, a phrase, or a paragraph such as a tweet. For example, suppose we have 1000 tweets and want to group them into several clusters, so that each cluster contains one or more tweets. 
<\/p>\n<h2>Data<\/h2>\n<p>Our data will be a set of sentences (phrases) covering 2 topics, as below:<br \/>\nNote: the 3 sentences on the weather topic are highlighted in bold; all other sentences are on a completely different topic.<br \/>\nsentences = [[&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;good&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;book&#8217;],<br \/>\n            [&#8216;this&#8217;, &#8216;is&#8217;,  &#8216;another&#8217;, &#8216;book&#8217;],<br \/>\n            [&#8216;one&#8217;, &#8216;more&#8217;, &#8216;book&#8217;],<br \/>\n            <b>[&#8216;weather&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;],<br \/>\n            [&#8216;yesterday&#8217;, &#8216;weather&#8217;, &#8216;snow&#8217;],<br \/>\n            [&#8216;forecast&#8217;, &#8216;tomorrow&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;],<\/b><br \/>\n            [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;new&#8217;, &#8216;post&#8217;],<br \/>\n            [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;about&#8217;, &#8216;more&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;post&#8217;],<br \/>\n            [&#8216;and&#8217;, &#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;last&#8217;, &#8216;post&#8217;, &#8216;book&#8217;]]<\/p>\n<h2>Word Embedding Method<\/h2>\n<p>For embeddings we will use the gensim word2vec model.  There is also a doc2vec model &#8211; we will use it in the next post.<br \/>\nSince we are clustering at the sentence level, there is one extra step for moving from the word level to the sentence level.  For each sentence in the set, the word embeddings of its words are summed and then divided by the number of words in the sentence. 
So we get the average of all word embeddings for each sentence and use these averages just as we would use embeddings at the word level &#8211; feeding them to a machine learning clustering algorithm such as k-means.<\/p>\n<p>Here is an example of a function that does this:<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef sent_vectorizer(sent, model):\r\n    # average the word vectors of all in-vocabulary words in the sentence\r\n    sent_vec =[]\r\n    numw = 0\r\n    for w in sent:\r\n        try:\r\n            if numw == 0:\r\n                sent_vec = model[w]\r\n            else:\r\n                sent_vec = np.add(sent_vec, model[w])\r\n            numw+=1\r\n        except:\r\n            # skip words that are not in the model vocabulary\r\n            pass\r\n\r\n    return np.asarray(sent_vec) \/ numw\r\n<\/pre>\n<p>Now we will run the k-means text clustering algorithm on the word2vec embeddings. For k-means we will use 2 separate implementations from different libraries: NLTK (KMeansClusterer) and sklearn (cluster). This was described in previous posts (see the list above).<\/p>\n<p>The full code for this article can be found at the end of this post. 
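The averaging done by sent_vectorizer can be illustrated with a small self-contained numpy sketch; the embedding lookup toy_vectors below is made up for illustration and stands in for a trained word2vec model:

```python
import numpy as np

# Toy embedding lookup standing in for a trained word2vec model
# (these 2-dimensional vectors are made up for illustration).
toy_vectors = {
    "weather": np.array([1.0, 0.0]),
    "rain":    np.array([0.8, 0.2]),
    "snow":    np.array([0.6, 0.4]),
}

def average_sentence_vector(sent, vectors):
    # Sum the embeddings of in-vocabulary words, then divide by their count,
    # skipping words that are not in the lookup.
    known = [vectors[w] for w in sent if w in vectors]
    return np.sum(known, axis=0) / len(known)

vec = average_sentence_vector(["weather", "rain", "snow"], toy_vectors)
print(vec)  # [0.8 0.2]
```

Out-of-vocabulary words are simply ignored, just as the try/except in sent_vectorizer ignores them.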
We set the number of clusters to 2 in both k-means text clustering implementations.<br \/>\nAdditionally we will plot the data using t-SNE.<\/p>\n<h2>Output<\/h2>\n<p>Below are the results.<\/p>\n<p>[1, 1, 1, 0, 0, 0, 1, 1, 1]<\/p>\n<p>Cluster id and sentence:<br \/>\n1:[&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;good&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;book&#8217;]<br \/>\n1:[&#8216;this&#8217;, &#8216;is&#8217;, &#8216;another&#8217;, &#8216;book&#8217;]<br \/>\n1:[&#8216;one&#8217;, &#8216;more&#8217;, &#8216;book&#8217;]<br \/>\n<b>0:[&#8216;weather&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;]<br \/>\n0:[&#8216;yesterday&#8217;, &#8216;weather&#8217;, &#8216;snow&#8217;]<br \/>\n0:[&#8216;forecast&#8217;, &#8216;tomorrow&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;]<\/b><br \/>\n1:[&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;new&#8217;, &#8216;post&#8217;]<br \/>\n1:[&#8216;this&#8217;, &#8216;is&#8217;, &#8216;about&#8217;, &#8216;more&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;post&#8217;]<br \/>\n1:[&#8216;and&#8217;, &#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;last&#8217;, &#8216;post&#8217;, &#8216;book&#8217;]<\/p>\n<p>Score (the opposite of the value of X on the k-means objective, i.e. the sum of distances of samples to their closest cluster center):<br \/>\n-0.0008175040203510163<br \/>\nSilhouette_score:<br \/>\n0.3498247<\/p>\n<p>Cluster id and sentence:<br \/>\n1 [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;good&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;book&#8217;]<br \/>\n1 [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;another&#8217;, &#8216;book&#8217;]<br \/>\n1 [&#8216;one&#8217;, &#8216;more&#8217;, &#8216;book&#8217;]<br \/>\n<b>0 [&#8216;weather&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;]<br \/>\n0 [&#8216;yesterday&#8217;, &#8216;weather&#8217;, 
&#8216;snow&#8217;]<br \/>\n0 [&#8216;forecast&#8217;, &#8216;tomorrow&#8217;, &#8216;rain&#8217;, &#8216;snow&#8217;]<\/b><br \/>\n1 [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;new&#8217;, &#8216;post&#8217;]<br \/>\n1 [&#8216;this&#8217;, &#8216;is&#8217;, &#8216;about&#8217;, &#8216;more&#8217;, &#8216;machine&#8217;, &#8216;learning&#8217;, &#8216;post&#8217;]<br \/>\n1 [&#8216;and&#8217;, &#8216;this&#8217;, &#8216;is&#8217;, &#8216;the&#8217;, &#8216;one&#8217;, &#8216;last&#8217;, &#8216;post&#8217;, &#8216;book&#8217;]<\/p>\n<figure id=\"attachment_455\" aria-describedby=\"caption-attachment-455\" style=\"width: 570px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" loading=\"lazy\" src=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/tsne_for_clusters_after_kmeans.png\" alt=\"Results of text clustering\" width=\"580\" height=\"439\" class=\"size-full wp-image-455\" srcset=\"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/tsne_for_clusters_after_kmeans.png 580w, https:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/tsne_for_clusters_after_kmeans-300x227.png 300w\" sizes=\"(max-width: 580px) 100vw, 580px\" \/><figcaption id=\"caption-attachment-455\" class=\"wp-caption-text\">Results of text clustering<\/figcaption><\/figure>\n<p>We see that the data were clustered as expected &#8211; sentences on different topics ended up in different clusters. Thus we learned how to run clustering algorithms in data mining or machine learning with word embeddings at the sentence level. Here we used k-means clustering with the word2vec embedding model, and we created an additional function to go from word-level embeddings to sentence-level embeddings. 
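The NLTK KMeansClusterer used in this post groups sentence vectors by cosine distance rather than Euclidean distance; a minimal sketch of that metric in plain numpy (with made-up vectors for illustration) is:

```python
import numpy as np

def cosine_distance(u, v):
    # 1 - cos(angle between u and v): 0 for vectors pointing the same way,
    # up to 2 for vectors pointing in opposite directions.
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(cosine_distance(a, a))  # 0.0
print(cosine_distance(a, b))  # 1.0
```

Because cosine distance ignores vector length, sentences with similar word content but different lengths can still land in the same cluster.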
In the next post we will use doc2vec and will not need this function.<\/p>\n<p>Below is the full Python source code.<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nfrom gensim.models import Word2Vec\r\n\r\nfrom nltk.cluster import KMeansClusterer\r\nimport nltk\r\nimport numpy as np\r\n\r\nfrom sklearn import cluster\r\nfrom sklearn import metrics\r\n\r\n# training data\r\nsentences = [['this', 'is', 'the', 'one', 'good', 'machine', 'learning', 'book'],\r\n            ['this', 'is', 'another', 'book'],\r\n            ['one', 'more', 'book'],\r\n            ['weather', 'rain', 'snow'],\r\n            ['yesterday', 'weather', 'snow'],\r\n            ['forecast', 'tomorrow', 'rain', 'snow'],\r\n            ['this', 'is', 'the', 'new', 'post'],\r\n            ['this', 'is', 'about', 'more', 'machine', 'learning', 'post'],\r\n            ['and', 'this', 'is', 'the', 'one', 'last', 'post', 'book']]\r\n\r\n# train a word2vec model on the toy corpus\r\nmodel = Word2Vec(sentences, min_count=1)\r\n\r\ndef sent_vectorizer(sent, model):\r\n    # average the word vectors of all in-vocabulary words in the sentence\r\n    sent_vec =[]\r\n    numw = 0\r\n    for w in sent:\r\n        try:\r\n            if numw == 0:\r\n                sent_vec = model[w]\r\n            else:\r\n                sent_vec = np.add(sent_vec, model[w])\r\n            numw+=1\r\n        except:\r\n            # skip words that are not in the model vocabulary\r\n            pass\r\n\r\n    return np.asarray(sent_vec) \/ numw\r\n\r\n# sentence-level vectors: one averaged embedding per sentence\r\nX=[]\r\nfor sentence in sentences:\r\n    X.append(sent_vectorizer(sentence, model))\r\n\r\nprint (&quot;========================&quot;)\r\nprint (X)\r\n\r\n# note: with some gensim versions you would need model[model.vocab] (without wv)\r\nprint (model[model.wv.vocab])\r\n\r\nprint (model.similarity('post', 'book'))\r\nprint (model.most_similar(positive=['machine'], negative=[], topn=2))\r\n\r\n# k-means clustering with NLTK, using cosine distance\r\nNUM_CLUSTERS=2\r\nkclusterer = KMeansClusterer(NUM_CLUSTERS, distance=nltk.cluster.util.cosine_distance, repeats=25)\r\nassigned_clusters = 
kclusterer.cluster(X, assign_clusters=True)\r\nprint (assigned_clusters)\r\n\r\nfor index, sentence in enumerate(sentences):\r\n    print (str(assigned_clusters[index]) + &quot;:&quot; + str(sentence))\r\n\r\n# k-means clustering with sklearn\r\nkmeans = cluster.KMeans(n_clusters=NUM_CLUSTERS)\r\nkmeans.fit(X)\r\n\r\nlabels = kmeans.labels_\r\ncentroids = kmeans.cluster_centers_\r\n\r\nprint (&quot;Cluster id labels for input data&quot;)\r\nprint (labels)\r\nprint (&quot;Centroids data&quot;)\r\nprint (centroids)\r\n\r\nprint (&quot;Score (the opposite of the value of X on the k-means objective, i.e. the sum of distances of samples to their closest cluster center):&quot;)\r\nprint (kmeans.score(X))\r\n\r\nsilhouette_score = metrics.silhouette_score(X, labels, metric='euclidean')\r\n\r\nprint (&quot;Silhouette_score: &quot;)\r\nprint (silhouette_score)\r\n\r\n# plot the sentence vectors in 2D with t-SNE\r\nimport matplotlib.pyplot as plt\r\nfrom sklearn.manifold import TSNE\r\n\r\n# use a separate variable name so the word2vec model is not overwritten\r\ntsne = TSNE(n_components=2, random_state=0)\r\nnp.set_printoptions(suppress=True)\r\n\r\nY = tsne.fit_transform(X)\r\n\r\nplt.scatter(Y[:, 0], Y[:, 1], c=assigned_clusters, s=290, alpha=.5)\r\n\r\nfor j in range(len(sentences)):\r\n    plt.annotate(assigned_clusters[j], xy=(Y[j][0], Y[j][1]), xytext=(0, 0), textcoords='offset points')\r\n    print (&quot;%s %s&quot; % (assigned_clusters[j], sentences[j]))\r\n\r\nplt.show()\r\n<\/pre>\n<div class=\"xnher69e49d48637b3\" ><center>\n<script async src=\"\/\/pagead2.googlesyndication.com\/pagead\/js\/adsbygoogle.js\"><\/script>\n<!-- Text analytics techniques link ads horizontal Medium after content -->\n<ins class=\"adsbygoogle\"\n     style=\"display:inline-block;width:468px;height:15px\"\n     data-ad-client=\"ca-pub-3416618249440971\"\n     data-ad-slot=\"5765984772\"><\/ins>\n<script>\n(adsbygoogle = window.adsbygoogle || []).push({});\n<\/script>\n\n<script async src=\"\/\/pagead2.googlesyndication.com\/pagead\/js\/adsbygoogle.js\"><\/script>\n<ins 
class=\"adsbygoogle\"\n     style=\"display:block\"\n     data-ad-format=\"autorelaxed\"\n     data-ad-client=\"ca-pub-3416618249440971\"\n     data-ad-slot=\"3903486841\"><\/ins>\n<script>\n     (adsbygoogle = window.adsbygoogle || []).push({});\n<\/script>\n<\/center><\/div><style type=\"text\/css\">\r\n.xnher69e49d48637b3 {\r\nmargin: 5px; padding: 0px;\r\n}\r\n@media screen and (min-width: 1201px) {\r\n.xnher69e49d48637b3 {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 993px) and (max-width: 1200px) {\r\n.xnher69e49d48637b3 {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 769px) and (max-width: 992px) {\r\n.xnher69e49d48637b3 {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (min-width: 768px) and (max-width: 768px) {\r\n.xnher69e49d48637b3 {\r\ndisplay: block;\r\n}\r\n}\r\n@media screen and (max-width: 767px) {\r\n.xnher69e49d48637b3 {\r\ndisplay: block;\r\n}\r\n}\r\n<\/style>\r\n","protected":false},"excerpt":{"rendered":"<p>Text clustering is widely used in many applications such as recommender systems, sentiment analysis, topic selection, user segmentation. Word embeddings (for example word2vec) allow to exploit ordering of the words and semantics information from the text corpus. 
In this blog you can find several posts dedicated different word embedding models: GloVe &#8211; How to Convert &#8230; <a title=\"Text Clustering with Word Embedding in Machine Learning\" class=\"read-more\" href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\" aria-label=\"More on Text Clustering with Word Embedding in Machine Learning\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"categories":[46,5],"tags":[7,45,19,11],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Text Clustering with Word Embedding in Machine Learning - Text Analytics Techniques<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Text Clustering with Word Embedding in Machine Learning - Text Analytics Techniques\" \/>\n<meta property=\"og:description\" content=\"Text clustering is widely used in many applications such as recommender systems, sentiment analysis, topic selection, user segmentation. Word embeddings (for example word2vec) allow to exploit ordering of the words and semantics information from the text corpus. In this blog you can find several posts dedicated different word embedding models: GloVe &#8211; How to Convert ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"Text Analytics Techniques\" \/>\n<meta property=\"article:published_time\" content=\"2018-09-21T00:11:31+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-17T00:50:38+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/word2vec-1-e1537725898810.png\" \/>\n<meta name=\"author\" content=\"owygs156\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"owygs156\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\",\"url\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\",\"name\":\"Text Clustering with Word Embedding in Machine Learning - Text Analytics 
Techniques\",\"isPartOf\":{\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/#website\"},\"datePublished\":\"2018-09-21T00:11:31+00:00\",\"dateModified\":\"2018-11-17T00:50:38+00:00\",\"author\":{\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/832f10562faaa1c7ed668c1ab4388857\"},\"breadcrumb\":{\"@id\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Text Clustering with Word Embedding in Machine Learning\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/#website\",\"url\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/\",\"name\":\"Text Analytics Techniques\",\"description\":\"Text Analytics Techniques\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/832f10562faaa1c7ed668c1ab4388857\",\"name\":\"owygs156\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"caption\":\"owygs156\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Text Clustering with Word Embedding in Machine Learning - Text Analytics Techniques","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Text Clustering with Word Embedding in Machine Learning - Text Analytics Techniques","og_description":"Text clustering is widely used in many applications such as recommender systems, sentiment analysis, topic selection, user segmentation. Word embeddings (for example word2vec) allow to exploit ordering of the words and semantics information from the text corpus. In this blog you can find several posts dedicated different word embedding models: GloVe &#8211; How to Convert ... 
Read more","og_url":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/","og_site_name":"Text Analytics Techniques","article_published_time":"2018-09-21T00:11:31+00:00","article_modified_time":"2018-11-17T00:50:38+00:00","og_image":[{"url":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2018\/09\/word2vec-1-e1537725898810.png"}],"author":"owygs156","twitter_card":"summary_large_image","twitter_misc":{"Written by":"owygs156","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/","url":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/","name":"Text Clustering with Word Embedding in Machine Learning - Text Analytics Techniques","isPartOf":{"@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/#website"},"datePublished":"2018-09-21T00:11:31+00:00","dateModified":"2018-11-17T00:50:38+00:00","author":{"@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/832f10562faaa1c7ed668c1ab4388857"},"breadcrumb":{"@id":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ai.intelligentonlinetools.com\/ml\/text-clustering-word-embedding-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/ai.intelligentonlinetools.com\/ml\/"},{"@type":"ListItem","position":2,"name":"Text Clustering with Word Embedding in Machine Learning"}]},{"@type":"WebSite","@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/#website","url":"http:\/\/ai.intelligentonlinetools.com\/ml\/","name":"Text Analytics 
Techniques","description":"Text Analytics Techniques","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/ai.intelligentonlinetools.com\/ml\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/832f10562faaa1c7ed668c1ab4388857","name":"owygs156","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","caption":"owygs156"}}]}},"_links":{"self":[{"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/posts\/449"}],"collection":[{"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/comments?post=449"}],"version-history":[{"count":19,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/posts\/449\/revisions"}],"predecessor-version":[{"id":600,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/posts\/449\/revisions\/600"}],"wp:attachment":[{"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/media?parent=449"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/categories?post=449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/tags?post=449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}",
"templated":true}]}}