Latent semantic indexing (LSI) is an algorithm used by search engines to determine what a page is about, beyond simply matching the exact search query. The LSI algorithm doesn't actually understand the meanings of the words on the page, but it can spot patterns of related words. As a result, LSI may return relevant results that don't contain the keyword at all, because those pages contain related words.
In response to a query, an LSI-indexed database returns the pages it judges to best fit the search terms. Because it works from word patterns rather than meaning, it does not require an exact match to return useful results.
Latent semantic indexing was initially used in AdSense so that adverts targeted to the theme of a webpage could appear on that page. The algorithm checks the wording on the page and determines its theme. Only later did Google apply the algorithm to search engine placement, and it is also used by search engines other than Google. It involves analysis of the words used in natural language: the synonyms and closely related words that appear when discussing the general theme of a page. It complements, rather than replaces, keyword analysis.
How LSI Works
The Search for Content in a Document
Natural language is full of redundancies, and not every word that appears in a document carries semantic meaning. The first step in LSI is culling all those extraneous words from a document, leaving only content words likely to have semantic meaning. There are many ways to define a content word. Here is one recipe for generating a list of content words from a document collection:
- Make a complete list of all the words that appear anywhere in the collection
- Discard articles, prepositions, and conjunctions
- Discard common verbs (know, see, do, be)
- Discard pronouns
- Discard common adjectives (big, late, high)
- Discard frilly words (therefore, thus, however, albeit, etc.)
- Discard any words that appear in every document
- Discard any words that appear in only one document
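The recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `STOP_WORDS` set is a tiny hand-picked sample (a real stop list would be far longer), and the `tokenize` and `content_words` helpers and the three sample documents are hypothetical names and data invented for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {
    # articles, prepositions, conjunctions
    "a", "an", "the", "of", "in", "on", "to", "for", "with", "and", "or", "but",
    # common verbs
    "know", "see", "do", "be", "is", "are", "was",
    # pronouns
    "i", "you", "he", "she", "it", "we", "they",
    # common adjectives
    "big", "late", "high",
    # "frilly" words
    "therefore", "thus", "however", "albeit",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def content_words(documents):
    """Return one set of content words per document."""
    token_sets = [set(tokenize(doc)) for doc in documents]
    # Document frequency: the number of documents each word appears in.
    doc_freq = Counter(word for tokens in token_sets for word in tokens)
    n_docs = len(documents)
    # Keep words that are not stop words and that appear in more than
    # one document but not in every document.
    keep = {
        word
        for word, df in doc_freq.items()
        if word not in STOP_WORDS and 1 < df < n_docs
    }
    return [tokens & keep for tokens in token_sets]


# Made-up sample collection for illustration.
docs = [
    "The search engine ranks pages in the index",
    "A search query returns pages about cooking",
    "The engine in the car needs a new index of parts",
]
result = content_words(docs)
for words in result:
    print(sorted(words))
```

Note that the last two rules only make sense for collections of three or more documents: with two documents, every word appears either in one document or in both, so nothing would survive.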
This process condenses our documents into sets of content words that we can then use to index our collection. For more detail, visit:
1. http://www.seobook.com/archives/000657.shtml
2. http://www.seo-blog.com/latent-semantic-indexing-lsi-explained.php