SEO Marketing Research


Natural Language


Search engines for the free Web use sophisticated algorithms to determine what words in a search statement are important, what possible synonyms exist for those words, and which of the thousands of potential documents on the Web contain information related to the search statement.

The result of a Web search is a list of sites ranked by relevance.  The sites with the best information, as calculated by the search engine, are listed at the top.

If the search engine is operating properly, you should need to review only a page or two of sites, not the thousands that are actually retrieved by a search.

Many search engines expect searchers to use natural language search statements.  In practice, though, “natural language” is not exactly the way people speak.

Most search engines ignore prepositions, conjunctions, and other small words such as “how” and “why” that may be key elements in your description of the question.

You may be able to force the search engine to see an important word by using a plus sign (+) before it, but this sometimes makes the search engine place too much importance on these connector words and can skew the results.
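To make the effect of ignored words concrete, here is a minimal sketch of a stop-word filter that honors a leading plus sign. The stop-word list and the function name are assumptions made for illustration; real search engines use their own lists and rules, which are not described here.

```python
# Hypothetical stop-word list; actual engines keep their own, unpublished lists.
STOP_WORDS = {"a", "an", "and", "are", "how", "in", "of", "the", "to", "what", "why"}


def significant_terms(query):
    """Return the words a simple stop-word filter would keep.

    A word prefixed with '+' is always kept, mimicking the plus-sign
    operator described above.
    """
    kept = []
    for raw in query.lower().split():
        word = raw.strip("?.,!\"'")  # drop surrounding punctuation
        if word.startswith("+") and len(word) > 1:
            kept.append(word[1:])  # forced term: keep it, minus the '+'
        elif word and word not in STOP_WORDS:
            kept.append(word)
    return kept


print(significant_terms("What are the trends in the wearable microdisplay industry?"))
# ['trends', 'wearable', 'microdisplay', 'industry']
print(significant_terms("+how to choose a microdisplay"))
# ['how', 'choose', 'microdisplay']
```

In the second call, “how” survives only because of the plus sign. Without it, the filter would reduce the question to “choose microdisplay.”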

As a general rule, natural language search statements should contain all the significant words from your description of the research question.

Word order does have an effect, so place the most important concepts at the beginning of your search statement.

One of the weaknesses of natural language search engines is that they do not accommodate synonyms.

Even if you are not sure whether an industry’s direction would be described as “trends,” “forecasts,” or “future,” you must still select only one.

If you use all three, the search engine will look for documents containing all three words.

The search engine may be programmed to know that “trends” and “forecasts” are closely related terms, but it may give priority to the term you actually use.

To make sure you get the greatest number of possible hits, you must repeat the search using another search term.
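Repeating a search once per synonym is easy to do systematically. The sketch below builds one search string for each combination of terms, choosing a single term per concept and keeping the most important concept first. The function name is hypothetical, and “microdisplay” is used only as a sample topic.

```python
from itertools import product


def search_variants(concept_groups):
    """Yield one search string per combination, using one term from each group.

    Groups are listed in order of importance, so the most important
    concept always appears first in the search string.
    """
    for combo in product(*concept_groups):
        yield " ".join(combo)


concepts = [
    ["microdisplay"],                   # sample topic, most important concept
    ["trends", "forecasts", "future"],  # synonyms for the industry's direction
]

for search_string in search_variants(concepts):
    print(search_string)
# microdisplay trends
# microdisplay forecasts
# microdisplay future
```

Each string is then run as its own search, so no single query asks for all three synonyms at once.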

Let’s look at an example showing how different terms and word order affect a search on Google, Teoma, and AlltheWeb.

The client wants to know about trends in the wearable microdisplay industry.  At this point, the client simply needs an overview to see what technologies and companies are involved.

Phrasing the request as a question, we ask, “What are the trends in the wearable microdisplay industry?”

The best communication with the Web search engine will be unambiguous keywords.  “Trends,” “wearable,” and “microdisplay” all have synonyms, however, so let’s see how different word choices affect the results.

The example below illustrates synonyms from which to choose in creating a search string.
Screen shots from Google show that each search returned entirely different results within the first page of hits.

One hit, from the United States Display Consortium (USDC), addresses “bring to eye designs” and “near eye display.”  These could serve as additional synonyms, if we need them.

The USDC’s mission is to be a “neutral forum,” so we conclude that it is a reliable, unbiased source.

From Search String B, the third hit looks pretty good.  It is a directory page assembled by a professor at the University of Ghent in Belgium.

Search String C produced only three hits.  Two are from commercial Web sites and one is from an academic institution.  One of the commercial sites, an industry overview from a venture capital site, offers some useful information.

The document on the academic site also offers some objective information about the industry, and the academic connection gives it a measure of reliability.

Searching on Teoma with the same search strings yielded different results.  This time the University of Ghent site came up second on the list using Search String A.

On AlltheWeb, Search String B yielded much better results than Search String A, with two hits from a manufacturer’s Web site at the top, and the Ghent professor’s page coming in third.

The first two were useful, keeping in mind the natural bias of a product manufacturer.  With Search String A, the first two hits were for a commercial search service that would like to help you monitor the microdisplay market, and for this search those were not particularly helpful.

When you fail to find useful material through a Web search engine, there are several possible explanations:

1. You did not use the right combination of search terms.
2. You did not use the search terms in the correct order.
3. You did not use the right search engine.
4. None of the search engines crawled the site with the information you need.  (The process search engines use to find and index Web sites is called crawling.)
5. The information is not available on the free or open Web.
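
The first three explanations suggest a simple retry strategy: vary the terms, vary their order, and vary the engine. As a rough sketch, the plan can be enumerated before any searching is done. The function name is hypothetical, and the code only lists the searches to try. It does not call any search engine.

```python
from itertools import permutations, product

ENGINES = ["Google", "Teoma", "AlltheWeb"]  # the engines compared in this article


def retry_plan(terms, engines=ENGINES):
    """Return (engine, search string) pairs covering every word order on every engine."""
    orderings = [" ".join(p) for p in permutations(terms)]
    return list(product(engines, orderings))


plan = retry_plan(["wearable", "microdisplay", "trends"])
print(len(plan))  # 18 searches: 3 engines x 6 word orders
for engine, search_string in plan[:3]:
    print(engine, "->", search_string)
```

If every search in the plan comes up empty, the remaining explanations become more likely: the site was never crawled, or the information is not on the free Web at all.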


