Understanding What Latent Semantics Is

Have you ever come across the term latent semantics? It may sound overly technical, but understanding the technique can help a great deal, especially if you work with documentation and information retrieval.

Latent semantics, more formally called Latent Semantic Analysis (LSA), is a technique used in natural language processing, particularly in vectorial semantics. It analyzes the relationships between a set of documents and the terms they contain by producing a set of concepts related to those documents and terms. It was patented back in 1988 by Lynn Streeter, Karen Lochbaum, Thomas Landauer, Richard Harshman, George Furnas, Susan Dumais, and Scott Deerwester, and its main context of application is information retrieval.

Latent semantics can be used in a variety of applications. The concept space it produces can be used to compare documents, for tasks such as data clustering and document classification. It can also find similar documents across languages once it has analyzed a base set of translated documents; this is called cross-language retrieval. It finds relations between terms, such as synonymy and polysemy. And given a query of terms, it translates those terms into the concept space and then finds matching documents; this is called information retrieval. Synonymy and polysemy are fundamental problems in natural language processing.
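The translation of documents and queries into a concept space can be written down compactly. The article itself does not give the math, so what follows is the standard textbook formulation of LSA, offered as a sketch. The symbols are assumptions introduced here: X is a matrix of term counts with one row per term and one column per document, k is the chosen number of concepts, q is a query written as a vector of term counts, and d-hat_j is the concept-space vector of document j.

```latex
% Rank-k truncated singular value decomposition of the term count matrix
X \approx X_k = U_k \Sigma_k V_k^{T}

% Folding a query vector q into the k-dimensional concept space
\hat{q} = \Sigma_k^{-1} U_k^{T} q

% Ranking document j by cosine similarity in the concept space
\mathrm{sim}(\hat{q}, \hat{d}_j) = \frac{\hat{q} \cdot \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
```

Because the comparison happens between concept vectors rather than between raw words, a query and a document can score as similar even when they share no words at all.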

Synonymy is the phenomenon where different words describe the same idea. As a result, a search engine query can fail to retrieve a relevant document simply because the document does not contain the words that appear in the query. For example, a search for "doctors" may not return documents that contain "physicians" or "medical practitioners", even though the words are synonymous. Polysemy is the phenomenon where one word has more than one meaning. For example, when a computer scientist and a botanist search for "tree", they are probably looking for different sets of information, even though they type the same query.
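Both problems are easy to reproduce with a plain keyword search. The sketch below is not LSA; it is a deliberately naive exact-word matcher, and the corpus, the document ids and the function name keyword_search are all made up for illustration.

```python
# Toy corpus; the ids and texts are invented for this illustration.
DOCS = {
    "d1": "doctors recommend regular exercise",
    "d2": "physicians recommend regular checkups",
    "d3": "medical practitioners treat patients",
    "d4": "a binary tree stores sorted keys",
    "d5": "an oak tree grows slowly",
}


def keyword_search(query, docs):
    """Return the ids of documents sharing at least one exact word with the query."""
    query_words = set(query.lower().split())
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if query_words & set(text.lower().split())
    )


# Synonymy: d2 and d3 are relevant but are missed, since they never say "doctors".
print(keyword_search("doctors", DOCS))

# Polysemy: the data-structure document and the botany document both match "tree".
print(keyword_search("tree", DOCS))
```

The first query finds only d1, and the second returns d4 and d5 together with no way to tell the two senses apart. These are the gaps a concept space is meant to narrow.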

Latent semantics also uses a term-document matrix, which describes the occurrences of terms in documents. It is a sparse matrix whose rows correspond to terms and whose columns correspond to documents. A term is typically a stemmed word that appears in a document.
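A small example may make the matrix easier to picture. The sketch below builds one from raw word counts using only the Python standard library. The function name build_term_document_matrix and the sample sentences are made up for illustration, stemming is skipped to keep it short, and a real system would store the result in a sparse format rather than nested lists.

```python
from collections import Counter


def build_term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts terms[i] in docs[j]."""
    # One word-count table per document (no stemming in this sketch).
    counts = [Counter(text.lower().split()) for text in docs]
    # The vocabulary is every distinct word, sorted so that row order is stable.
    terms = sorted(set().union(*counts))
    # Rows are terms, columns are documents; a missing word counts as 0.
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


docs = [
    "the doctor saw the patient",
    "the physician saw the patient",
    "the tree has a root",
]
terms, matrix = build_term_document_matrix(docs)
for term, row in zip(terms, matrix):
    print(f"{term:10s} {row}")
```

Most entries are zero even in this tiny corpus, which is why the matrix is described as sparse.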

Latent semantic analysis can be tricky if you have not fully grasped the idea behind it. Take time to read more about the concept so you will have a better idea of where you can apply it.

This article is free for republishing
Source: http://www.a1articles.com/article_592688_50.html