Looking For Answers

Getting Results From Web Search Engines Takes A Little Know-How

At one time a story about search engines might have begun with a cliché about haystacks and needles. The immensity of the Web today, however -- some 800 million pages and counting as we went to print -- races beyond that puny analogy. These days, writers must break out the heavy-duty metaphors: grains of sand perhaps, or the ol' stars in the sky number. However the human mind tries to embrace it, the Web is big. Really big. The fact that we find anything of use at all on the Internet is somewhat amazing. Increase your chances of hitting the jackpot by choosing the right search engine for the job and then using it with gusto.

Yahoo!

Probably the best known of all searching tools, Yahoo! is not technically a search engine itself. Yahoo! represents a great example of the directory genre of Web tools, sites that use human editors to attempt to categorize as many Web pages as possible under headings that make sense. The Web is replete with specialized directories covering specific topic areas, but Yahoo! is the best place to go for a general overview of what the Internet has to offer.

Deciding whether Yahoo! or a more comprehensive search engine is the right place for you is the first step. Try imagining you're looking for the information in a print library. Would you expect to see your search topic as its own entry in a large encyclopedia, or would it more likely be found in an index of all the words within encyclopedia articles? Yahoo! is more like the first scenario. For instance, Yahoo! can quickly link you to general-interest sites about Denver, Colorado. A search engine, however, would be the tool to use for finding out about ongoing construction projects in downtown Denver. The more specific you want to be, the better luck you might have with a search engine rather than a directory.

Yahoo!'s mission means it combs through a smaller database than most engines, so narrowing down results is not as much of a problem. However, it always helps to use more than one search word and to put quotation marks around exact phrases. Another useful search term modifier is the minus sign (-). Put it in front of words that you do not want to show up in your results.
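To make the quotation-mark and minus-sign modifiers concrete, here is a minimal Python sketch of how a query like the ones described above could be interpreted. It is an illustration only, not Yahoo!'s actual implementation; the function names parse_query and matches are hypothetical, and the simple substring matching is an assumption made for brevity.

```python
import shlex

def parse_query(query):
    """Split a query into required terms/phrases and excluded words."""
    required, excluded = [], []
    # shlex.split keeps text inside double quotes together as one token
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())   # word the user does not want
        else:
            required.append(token.lower())       # word or exact phrase to find
    return required, excluded

def matches(text, query):
    """Return True if text has every required term and no excluded word."""
    required, excluded = parse_query(query)
    text = text.lower()
    return all(term in text for term in required) and not any(
        word in text for word in excluded
    )

query = '"downtown Denver" construction -hotel'
print(matches("Construction begins in downtown Denver this spring", query))
print(matches("New hotel construction in downtown Denver", query))
```

The first page passes because it contains the exact phrase and the extra word; the second is dropped because it contains the word marked with a minus sign.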

If Yahoo! finds no matches in its own directory, it automatically runs the search through the database at Inktomi, a commercial search engine that provides results for a number of customers. The listings provided are fairly basic, so if you don't immediately see a useful result, try running the same search in one of the search engines listed below.

Smaller than Yahoo! but hoping to challenge the leader are two other human-produced, category-based directories. LookSmart presents an attractive Web site and offers searching help from real people. Snap is a venture of c|net and NBC.

AltaVista.

The heavy-hitting search engine field is more crowded now than when AltaVista lumbered onto the scene, but many users still turn to this gigantic index when they want to find those sites too specific for Yahoo!

The simple AltaVista search on the site's main page proposes that users ask a natural language question, but we have the most luck with boring old queries using Boolean operators. Click Advanced Text Search to bring up a more sophisticated form.

AltaVista's advanced search uses two main fill-in boxes. The large box on the bottom is actually the most important; this is where to type in the Boolean expression that defines your query. The smaller box on the top is a place to type in a few especially important words to help the search engine rank results. You can also set a date range or click the check box if you only need to know how many matches AltaVista finds.

Although some of the up-and-coming engines, such as FAST, have taken the lead in terms of index size, AltaVista still holds its own. On a couple of the more obscure test searches we ran, AltaVista outperformed FAST, Google, and Northern Light. Of course, it is impossible to make a comprehensive comparison, given the infinite number of topics available. If AltaVista isn't working, try something else.

Excite.

An oft-used gateway to the Web's wonders, Excite offers numerous tools to complement its searching abilities. You can run a quick search for a word or phrase from the main page and see results from both Excite's Directory, which is edited by people, and the Web, which is more typical of output by search engines.

For a more advanced search, click the More Search link and then the Advanced Web Search link. Rather than asking people to remember Boolean operators, Excite offers three constraint boxes. Choose the correct commands from the drop-down menus and construct a query from several lines. If you need more constraints, hit the button at the bottom of the Search For area. Under Language and Type of Site, you can narrow down your search by geographic or top-level domain information.

Excite recognizes the names of many companies and institutions and returns a special results page with additional information about the organization. For instance, a search for "Ford" brings up stock information, links to a company profile, news articles about Ford, and other sites related to the auto manufacturer. If you didn't mean Ford Motor, click the link near the top of the page for other interpretations of "Ford."

Like a few other search engines, Excite includes a "Search for more documents like this one" link under each search result listing that runs a new search based on the content of the site you say looks good. This can be a great way to narrow down a search if you happen across one site that fits your subject very well.

HotBot/Inktomi.

While it is difficult to test search engines in any definitive way, HotBot continually ranks high on pundits' scales. According to its keepers, the HotBot engine crawls through 110 million documents every three or four weeks. HotBot results also include "recommended" sites arranged in a Yahoo!-like directory format, although these usually are not terribly useful.

Owned by Lycos but operated as a separate service, HotBot combines Inktomi search results with another search product, DirectHit. DirectHit ranks sites according to how often users click them in result lists. The theory is that people actually looking through a big pile of links will more often than not pick out the most useful. Sites that attract clicks rise in the "popularity" scale and are shown closer to the top of subsequent lists.
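The click-popularity idea can be sketched in a few lines of Python. This is a simplified illustration of the theory described above, not DirectHit's actual method; the function name rank_by_clicks, the click log format, and the example addresses are all assumptions.

```python
from collections import Counter

def rank_by_clicks(results, click_log):
    """Order results by how often users clicked them in earlier result lists."""
    clicks = Counter(click_log)  # sites never clicked count as zero
    # sorted() is stable, so equally popular sites keep the engine's own order
    return sorted(results, key=lambda url: -clicks[url])

# Hypothetical result list and a log of which links earlier users chose
results = ["a.example", "b.example", "c.example"]
click_log = ["c.example", "b.example", "c.example"]

print(rank_by_clicks(results, click_log))
```

Here c.example was picked twice and b.example once, so both rise above a.example in the next list shown.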

HotBot offers two ways of searching: the standard screen and an advanced screen. Along with date and language choices, the default standard search lets you specify whether you want to find pages with all of the search terms, any of the terms, a phrase, a name, or a Boolean expression. Check boxes let users narrow down results to only those sites with certain multimedia elements such as pictures or sounds.

The advanced search presents a much longer list of options. Many of the choices are expanded versions of the standard search, but a few allow some interesting fine-tuning. For instance, Page Depth controls whether the search engine looks only at "top" pages (the home page for each site) or any of a site's pages. A Location/Domain section lets you pick certain domains anywhere on the Web or top-level domains associated with a certain country.

Another option to try out is Word Filter, which limits results to pages that either contain or do not contain a certain word or phrase. HotBot includes both "must contain" and "should contain" choices, which allows a little flexibility if you're not entirely sure what your target looks like. Click the circle with the plus sign to pull up a screen with more blanks to fill. We've seen word filtering devices along with the rest of these advanced tools at sites like AltaVista, but HotBot's forms make it easier for those uninitiated with search engine language to put together a fairly detailed query.

If you still find yourself staring at hundreds or thousands of hits, look for the little box in the search bar at the top of the screen that tells HotBot to look within the current results. Type in a few more words and click Search again.

Northern Light.

With more than million pages indexed, Northern Light is second only to FAST in the size category, and it throws in a few extras that you don't often see. For instance, NL searches a "special collection" of periodical material along with the Web. Although it usually costs money to research articles from this archive, the additional resources sometimes come in handy.

NL supports all the major types of query-building blocks such as Boolean operators, wildcards, and fielded searches. Although it doesn't sport the type of query-building screen we saw in HotBot, all of the commands are available at any time by simply typing them into the main search box. For easy access to a few search tools, click the Power Search tab at the top of the page. A new search screen appears with spaces for fielded searches and date ranges. You can also select to search only certain sources.

One way of narrowing a search unique to NL is the folder system. After every search NL will attempt to organize your results in hierarchical folders that appear to the left side of the main results list. To focus in on the type of pages you're seeking, look for a folder with the correct subject. The folders approach is not always particularly helpful. NL ranks the main list in an attempt to put relevant sites near the top anyway; focusing in on the most relevant folder might simply show you the beginning of your list.

Lycos.

A venerable institution to old-timers of the Web, Lycos was one of the first large-scale search engines that crawled the Internet on its own quest to index Everything. Today Lycos' database of around 50 million pages falls short compared to giants like AltaVista, Northern Light, and FAST. A smaller index is not necessarily bad though. Depending on the generality of your search, a smaller set of good pages is better than a huge set of good and bad pages. As with anything else, there is a tradeoff.

Like its kid brother HotBot, Lycos uses DirectHit information to refine search results. Lycos has also started its own human-edited directory a la Yahoo! Search results display many hits sorted into different categories as picked by people. Additional sites not yet categorized appear toward the bottom of the list.

Lycos includes a helpful advanced mode that makes complex queries as easy as filling out a form. The first drop-down menu after "Search for" includes multiple choices such as finding the search terms in order (but not necessarily next to each other) or an exact phrase. You can choose to search titles only, URLs, and other fields. At the bottom of the form is a way to rank results based on several characteristics.

GoTo.

GoTo openly sells the right to be listed near the top of search results to the highest bidder. The site also discloses how much the advertiser paid right on the results. A search for "Smart Computing" returns NetMagazines.com, an electronic newsstand that paid for the honor, in the #1 spot. GoTo theorizes that this system of selling rank cuts down on the clutter of searches, although we found when conducting searches that this does not always seem to be the case.

Type in a more contested search query such as "buy a car" to see real capitalism in action; the first 16 listings all paid money. Are these the best sites for potential car shoppers? Maybe, maybe not. Interestingly, phrases like "new car," "I want to buy a car," and "car sales" all pull up different lists.

In the real world, shoppers can be drawn to stores that spend money to be the biggest or brightest. In many cases, such characteristics mean a successful establishment that might in fact be the best place to shop for some purposes. Whether the same model provides value on the Web is questionable.

Google.

The beta search site at Google churns through an index nearing 800 million pages. That's not as large as a few we've seen, but Google makes up for the smaller range with an attempt at accuracy.

Google uses an interesting scoring system called PageRank in its quest to find sites that are actually useful. Once the engine finds all of the pages with your search terms, it examines those pages for interconnections. Maybe 50 of the pages all include links to a certain page. Logically, the page receiving all of the attention from other sites might be the most important of the group, and it is ranked near the top of the list.
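The link-counting step described above can be sketched in Python as follows. This is only the simple tally the paragraph describes, under the assumption that each matching page comes with a list of the pages it links to; the real PageRank goes further by also weighing how important each linking page is. The function name rank_by_inlinks and the page names are hypothetical.

```python
def rank_by_inlinks(links):
    """Rank matching pages by how many other matching pages link to them.

    links maps each matching page to the pages it links to.
    """
    matching = set(links)
    inlinks = {page: 0 for page in matching}
    for source, targets in links.items():
        for target in set(targets):  # count each linking page only once
            # Only links between matching pages count; self-links are ignored
            if target in matching and target != source:
                inlinks[target] += 1
    # Most linked-to pages first; ties are broken alphabetically
    return sorted(matching, key=lambda page: (-inlinks[page], page))

# Hypothetical set of pages that all contain the search terms
links = {"a": ["c"], "b": ["c", "a"], "c": ["x"]}
print(rank_by_inlinks(links))
```

Page c is linked to by both of the other matching pages, so it lands at the top of the list, followed by a and then b.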

The Google system works especially well if you are looking for the Web site of a company or institution. The "I'm Feeling Lucky" button takes users directly to the correct site instead of a page with search results. Try it with searches for "the University of Minnesota" or "the White House."

Google has some limitations. It does not support the OR operator in any way. Many search engines have an option to search for any of the words you type in; on Google the only way to search for two similar words is to run two searches. Google is also very exact: it does not support wildcards and does not automatically stem as some sites do. A search for "road" will not pick up "roads."

On the other hand, Google includes a couple of non-standard extras. Click the red bar next to any site in the results list to pull up a page of sites that link to that result. You can also click the "Cached" link next to most result listings to see what the page looked like when Google crawled through it. Fast-changing sites that no longer include the terms you needed might still be available this way through the search engine.

Go/Infoseek.

Once known as Infoseek (and not to be confused with GoTo), Go recognizes the plus and minus signs, certain field searches, and quotation marks.

Go resembles Excite in that, along with a basic search engine, the site offers a directory with special tools under different categories. Click Cars, and a page pops up with special links for researching cars, finding used cars, and chats for car enthusiasts. The tabs near the top of these pages will take you to Topic pages with reviewed Web sites and other links.

FAST.

A relative newcomer, FAST Search claims to search the Web's largest index of some 200 million pages. FAST's sponsors have set the lofty goal of indexing the entire Web -- estimated at 800 million URLs as of summer 1999 -- by the year 2000 and keeping up with the rapid growth from that point on. FAST also says that, as its name implies, it is quicker than other engines, both in searching the index and crawling through pages, resulting in fresher information.

We can't count every site in the FAST index, but we can report that the search engine was indeed quick. As of yet, however, FAST offers few tools to cut down on all the results zipped to your screen. The standard quotation marks, plus signs, and minus signs are available, but the site provides little beyond these simple tools.

Given the huge database FAST searches, you may need every tool available. A simple test search for "Smart Computing" pulled up several sites that mention the magazine, but the first page actually contained on the Smart Computing site was pegged at no. 12 on the list. FAST in its current incarnation is definitely a site meant for searching out the obscure rather than the well-established.

Meta-engines such as SavvySearch combine results from other search engines so you don't have to schlep your query around the Web on your own.
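One plausible way a meta-engine might combine several ranked lists is to take them in turns and skip anything already shown. The Python sketch below illustrates that idea only; it is an assumption, not a description of how SavvySearch or any other meta-engine actually merges results, and the function name merge_results is hypothetical.

```python
from itertools import zip_longest

def merge_results(*result_lists):
    """Interleave ranked result lists round-robin, dropping duplicates."""
    merged, seen = [], set()
    # zip_longest pads shorter lists with None so no results are lost
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical result lists from two different engines
engine_one = ["a.example", "b.example", "c.example"]
engine_two = ["b.example", "d.example"]
print(merge_results(engine_one, engine_two))
```

Each engine's top pick appears first, and a site returned by both engines is listed only once.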

Find It Yet?

With this many search engines to pick from, most with their own strategies and strengths, even the Web's most deeply hidden information can be found eventually. Pick through the links at enough search engines, and the amalgam of hay, sand, or whatever you want to call it just might yield what you need.
