{"id":24,"date":"2017-12-05T17:05:00","date_gmt":"2017-12-05T17:05:00","guid":{"rendered":"http:\/\/ai.intelligentonlinetools.com\/ml\/?page_id=24"},"modified":"2017-12-07T21:13:42","modified_gmt":"2017-12-07T21:13:42","slug":"text-analytics-techniques-embeddings","status":"publish","type":"page","link":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/","title":{"rendered":"Text Analytics Techniques with Embeddings"},"content":{"rendered":"<p><strong>Text analytics techniques<\/strong> involve the application of natural language processing (NLP), text mining, and machine learning methods such as text classification, clustering, summarization, information extraction, and sentiment analysis.<\/p>\n<p>We can view text analytics as the process of getting <strong>meaningful information<\/strong> from unstructured text. For example, from online 
discussions we want to extract user opinions about a product.<\/p>\n<h2>Bag of Words<\/h2>\n<p>Computers do not understand text directly, so text mining programs map text data into <strong>vectors<\/strong> of real numbers. The traditional approach is to <strong>count the words<\/strong> in each document and use those counts as the vector. The best-known and most widely used model of this kind is <strong>bag of words<\/strong>, which has one dimension per unique word.<\/p>\n<p>\n<img decoding=\"async\" loading=\"lazy\" src=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640-300x300.png\" alt=\"text analytics techniques\" width=\"300\" height=\"300\" class=\"alignnone size-medium wp-image-47\" style=\"float:left;margin:0 20px 20px 0;\" srcset=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640-300x300.png 300w, http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640-150x150.png 150w, http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640.png 640w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<h2>Word Embeddings<\/h2>\n<p>Many research papers report that the bag of words model is very effective despite its simplicity. However, it ignores the position of a word in the text relative to other words. This information can help extract the semantic meaning of a word, because words that appear in similar positions tend to have similar meanings. [4]\n<\/p>\n<p>The famous quotation (1957) &#8220;<strong>You shall know a word by the company it keeps<\/strong>&#8221; underlines the importance of word context. It comes from the English linguist J. R. Firth, a leading figure in British linguistics during the 1950s. [1]<\/p>\n<p>To capture the context-dependent nature of meaning, <strong>word embedding<\/strong> techniques were created. 
Word embedding is the collective name for a set of language modeling and feature learning techniques. These techniques map words or phrases from the vocabulary to vectors of real numbers. <\/p>\n<p>Conceptually, this is a mathematical embedding from a space with one dimension per word to a continuous vector space of <strong>much lower dimension<\/strong>. The mapping can be generated with neural networks, dimensionality reduction on the word co-occurrence matrix, or probabilistic models. [2]<\/p>\n<p>Such distributed representations of words in a vector space help learning algorithms achieve better performance in natural language processing tasks by grouping similar words. [3]<\/p>\n<h2>How Text Analytics Techniques Can Use Word Embeddings<\/h2>\n<p>Once we have word embeddings, we feed the vector representations of words into the algorithms used by text analytics techniques.  <\/p>\n<p>For example, <a href=\"http:\/\/ai.intelligentonlinetools.com\/ml\/k-means-clustering-example-word2vec\/\"  target=_blank>K Means Clustering Example with Word2Vec<\/a> is a very basic example in which a sequence of words is embedded with <strong>gensim word2vec<\/strong> and the resulting vectors are passed to a machine learning clustering algorithm.<\/p>\n<p><strong>Word embeddings<\/strong> can be saved after they are learned. We can also use pretrained word embeddings that were learned on a <strong>different vocabulary<\/strong>. <a href=\"http:\/\/ai.intelligentonlinetools.com\/ml\/word-embeddinigs-machine-learning\/\" target=_blank>Using Pretrained Word Embeddings in Machine Learning<\/a> shows how to load the word embeddings provided by Google.<\/p>\n<p><strong>References<\/strong><br \/>\n1. <a href=\"https:\/\/en.wikipedia.org\/wiki\/John_Rupert_Firth\" target=\"_blank\">John Rupert Firth<\/a><br \/>\n2. <a href=\"https:\/\/en.wikipedia.org\/wiki\/Word_embedding\" target=\"_blank\">Word Embedding<\/a><br \/>\n3. 
<a href=\"http:\/\/papers.nips.cc\/paper\/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf\" target=\"_blank\">Distributed Representations of Words and Phrases and their Compositionality<\/a><br \/>\n4. <a href=\"https:\/\/cs.stanford.edu\/~quocle\/paragraph_vector.pdf\" target=\"_blank\">Distributed Representations of Sentences and Documents<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Text analytics techniques involve application of 
natural language processing (NLP), text mining, and machine learning methods such as text classification, clustering, summarization, information extraction, and sentiment analysis. We can view text analytics as the process of getting meaningful information from unstructured text. For example, from online discussions we want to extract user opinions about a product. Bag &#8230; <a title=\"Text Analytics Techniques with Embeddings\" class=\"read-more\" href=\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\" aria-label=\"More on Text Analytics Techniques with Embeddings\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Text Analytics Techniques with Embeddings - Text Analytics Techniques<\/title>\n<meta name=\"description\" content=\"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Text Analytics Techniques with Embeddings - Text Analytics Techniques\" \/>\n<meta property=\"og:description\" content=\"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques\" \/>\n<meta property=\"og:url\" 
content=\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\" \/>\n<meta property=\"og:site_name\" content=\"Text Analytics Techniques\" \/>\n<meta property=\"article:modified_time\" content=\"2017-12-07T21:13:42+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640-300x300.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\",\"url\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\",\"name\":\"Text Analytics Techniques with Embeddings - Text Analytics Techniques\",\"isPartOf\":{\"@id\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/#website\"},\"datePublished\":\"2017-12-05T17:05:00+00:00\",\"dateModified\":\"2017-12-07T21:13:42+00:00\",\"description\":\"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques\",\"breadcrumb\":{\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Text Analytics Techniques with 
Embeddings\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/#website\",\"url\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/\",\"name\":\"Text Analytics Techniques\",\"description\":\"Text Analytics Techniques\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ai.intelligentonlinetools.com\/ml\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Text Analytics Techniques with Embeddings - Text Analytics Techniques","description":"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/","og_locale":"en_US","og_type":"article","og_title":"Text Analytics Techniques with Embeddings - Text Analytics Techniques","og_description":"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques","og_url":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/","og_site_name":"Text Analytics Techniques","article_modified_time":"2017-12-07T21:13:42+00:00","og_image":[{"url":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-content\/uploads\/2017\/12\/text-mining-1476780_640-300x300.png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/","url":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/","name":"Text Analytics Techniques with Embeddings - Text Analytics Techniques","isPartOf":{"@id":"https:\/\/ai.intelligentonlinetools.com\/ml\/#website"},"datePublished":"2017-12-05T17:05:00+00:00","dateModified":"2017-12-07T21:13:42+00:00","description":"application of natural language processing (NLP) such as word embeddings and text mining methods for text analytics techniques","breadcrumb":{"@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/"]}]},{"@type":"BreadcrumbList","@id":"http:\/\/ai.intelligentonlinetools.com\/ml\/text-analytics-techniques-embeddings\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ai.intelligentonlinetools.com\/ml\/"},{"@type":"ListItem","position":2,"name":"Text Analytics Techniques with Embeddings"}]},{"@type":"WebSite","@id":"https:\/\/ai.intelligentonlinetools.com\/ml\/#website","url":"https:\/\/ai.intelligentonlinetools.com\/ml\/","name":"Text Analytics Techniques","description":"Text Analytics Techniques","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ai.intelligentonlinetools.com\/ml\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/pages\/24"}],"collection":[{"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/comments?post=24"}],"version-history":[{"count":19,"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/pages\/24\/revisions"}],"predecessor-version":[{"id":82,"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/pages\/24\/revisions\/82"}],"wp:attachment":[{"href":"http:\/\/ai.intelligentonlinetools.com\/ml\/wp-json\/wp\/v2\/media?parent=24"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}