Before beginning, it's always good to acquaint yourself with the mouse. There are a variety of good skill-building sites for this, including Mousercise, coloring books, or, for advanced users, the tricky mouse manipulations at sodaplay.com.
Basic Search Strategy
Search sites are like the giant card catalogs in libraries of the last century. They build their catalogs of web site addresses in different ways, some by hand and some through automation, which leads to differences in the number of sites in each catalog.
Directories use humans to collect and organize their catalogs. The catalogs are arranged in a very user-friendly way, but they are smaller, with perhaps fewer than 100 million entries.
Crawlers use robot programs called spiders that record the addresses of the web pages they find and scan down those pages, recording the frequency of words and other items of data. These search sites, also known as "Comprehensive Search Engines", can be harder to use but have much bigger catalogs, from 250 to 500 million entries.
Meta-Search engines may have their own catalogs, but are mainly useful because they scan several search sites at once, saving you the time of entering the same search terms over and over again. Because of this capability, many more web pages can be scanned. Although results may be redundant, they are your best bet for finding an obscure item.
Directories ----> Crawlers -----> Meta-Search Engines
Generally, the strategy is to go from the smallest but most user-friendly type of search site to the larger but more technically complicated sites.
The following pages will review examples of each of these.
Directories Defined
Yahoo falls in the Directory category of search sites. Its designers, two grad students who are now mega-millionaires, borrowed a bit from Carl Linnaeus's classification technique and devised a simple-to-use, hierarchical branch structure under which to classify web sites. You can search by keywords, or simply click on categories that interest you to see where the branch leads.
What makes it handy is that you could key in, say, an author's name like "Stephen King". You'll not only see dozens of sites that have Stephen King in their titles, but also see which branches they are located under. You might see several branches mentioned.
One branch may be for:
Literature/Genre/Horror/Authors/
and another might be:
Business/Collectibles/Books/First_Editions/Horror
Depending on the reason for your search, you may want to "click up on the branch". After gathering your data on King as an author, you might click on the word Authors to see a listing of other Horror authors who have web sites. You may want to look at Bram Stoker or Clive Barker to gather more data on your main subject's peers.
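The idea of "clicking up on the branch" can be sketched in a few lines of Python. This is only an illustration, not how Yahoo itself works: the function name branches_up is made up for this example, and it simply treats a category as text separated by slashes.

```python
def branches_up(category_path):
    """List the branch itself and every branch above it, nearest first."""
    # Split on "/" and drop the empty piece left by a trailing slash.
    parts = [p for p in category_path.split("/") if p]
    # Remove one level at a time from the end of the path.
    return ["/".join(parts[:i]) + "/" for i in range(len(parts), 0, -1)]


for branch in branches_up("Literature/Genre/Horror/Authors/"):
    print(branch)
```

Each line printed is one click further up the branch, ending at the top-level Literature category.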
Almost three years ago, Yahoo began spinning off mini-Yahoos for countries and major American markets. Lately, they've helped define the "Portal" concept (being an all-in-one Internet launch pad) by offering free e-mail through their site, many games and chat areas, and a personal portfolio builder to track stocks.
Unfortunately, on days when the market is racing, so are Yahoo's engines, and the entire service can slow down. They become a victim of their own popularity. These slowdowns usually occur in the mid-afternoon, or at times when there is a significant news story. Yahoo has partnered with Reuters and AP to provide a top-notch, state-of-the-art news service through their site.
TIP:
The lesson is that Yahoo, like all the other Portal services, provides a ton of services, but 100% dependability? Not yet. Don't depend on them for your only e-mail account. The day you have to get back to a critical interviewer will be the day they crawl. The afternoon you might need to see how the market has treated your portfolio, well, good luck! Be familiar with alternative sites for news and stock prices.
Exercise: Research John Glenn; Review the two categories that come up for him, and the number of individual sites. Practice moving up the branch for individual astronauts.
Next to the Search button in Yahoo is a link labeled "advanced search". This allows a user to be more specific in telling Yahoo what to search for. You can ask for exact phrases, or for web sites relating to two or more separate terms, like "internet" and "business". Using the advanced search features usually cuts down on the number of web sites you'll receive back from Yahoo.
Other Directories -
Librarians' Index to the Internet -
http://lii.org/
From their web site description:
The Librarians' Index to the Internet is a searchable, annotated
subject directory of more than 7,900 Internet resources selected and
evaluated by librarians for their usefulness to users of public
libraries. It's meant to be used by both librarians and non-librarians
as a reliable and efficient guide to described and evaluated Internet
resources.
About.com - http://www.about.com/
From their web site description:
Each site in our unique network is run by a professional Guide who
is carefully screened and trained by About. Guides build a
comprehensive environment around each of their specific topics,
including the best new content, relevant links, How-To's, Forums, and
answers to just about any question.
Internet Subject Directories from the University of Albany Libraries -
http://library.albany.edu/internet/subject.html
A comprehensive listing of Academic, Professional and Commercial Directories and Portals from one of the nation's leading Library Schools.
Alta Vista
http://www.altavista.com
Crawlers use automated software robots called spiders that prowl the web, looking for fresh spider food... new URLs. When they find these, they quickly digest as much information as they can, scanning sometimes one page, sometimes dozens. They take that info back to their main nest, where it is deposited into the batch of URLs already there. When a user makes an inquiry of the crawler, it reviews all these URLs and sends back a listing of the pages that contain the words in the search string (sometimes several times, which increases a page's relevancy and rank in the link listings).
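For the curious, here is a toy version of the catalog a crawler builds, written in Python. It is a sketch under simple assumptions, not the method any real search site uses: the three pages and their example.com addresses are made up, and a page's rank is nothing more than a count of how often the search words appear on it.

```python
import re
from collections import Counter, defaultdict

# Hypothetical pages standing in for what a spider might bring back.
PAGES = {
    "http://example.com/glenn-bio": "John Glenn biography: John Glenn was an astronaut and senator",
    "http://example.com/nasa-history": "NASA history with a short note on astronaut John Glenn",
    "http://example.com/pizza": "The best pizza in San Francisco",
}


def build_index(pages):
    """Map each word to a count of how often it appears on each page."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word][url] += 1
    return index


def search(index, query):
    """Return pages containing every query word, most frequent matches first."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # Keep only the pages that contain all of the query words.
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))

    # Score a page by how many times the query words appear on it.
    def score(url):
        return sum(index[word][url] for word in words)

    return sorted(matches, key=lambda url: (-score(url), url))


index = build_index(PAGES)
print(search(index, "John Glenn"))
```

The biography page is listed first because the search words appear on it more often than on the NASA history page; the pizza page is left out entirely.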
Exercise: Research John Glenn again; Review the top box "Suggested searches"; 44,000+ web sites found! Look at the extra descriptive information accompanying the URL results.
Alta Vista Translation capability:
They are one of the first web sites to offer a language translation service!
You can take a paragraph in French, German, Spanish, or several other languages, put it into their special translation dialogue box, and it will automatically translate it into English, or translate English into one of those languages. This is an extremely useful service for reading foreign web pages.
Exercise: Practice translating "hello"
Other Crawlers:
Google - http://www.google.com
From their web site description:
Google's index, comprised of more than 1 billion URLs, is the first
of its kind and represents the most comprehensive collection of the
most useful web pages on the Internet. While index size alone is not
the key determinant of quality results, it has an obvious effect on the
likelihood of a relevant result being returned.
Fast -
http://www.alltheweb.com/
From their web site description:
Fast is... the leading information retrieval and delivery solution,
based on advanced indexing, aggregation, matching and presentation
technologies...indexing more than 600 million full-text Web
pages...completely updated every two weeks.
Internet Search Engines from the University of Albany Libraries -
http://library.albany.edu/internet/engines.html
A comprehensive listing of general and specialty search engines from one of the nation's leading Library Schools.
Meta-Search engines
These will submit your keyword query across multiple search
engines, casting a wide "net over the Net".
They save you time and keystrokes by making the same request to
several databases. Excellent for researching obscure topics.
Exercise: Research John Glenn - The "fetch" comes up with a box of specific John Glenn topics, and asks: "Are you looking for: John Glenn In Space, John Glenn Sound, Godspeed John Glenn, John Glenn biography" and several other topics. This is followed by listings from Looksmart, Dogpile's own catalog, and GoTo.com, after which it asks whether you want to search other databases.
Meta-Crawler / Very fast. Conducts parallel searches of 7 major search engines. Removes duplicate and invalid URLs.
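The way a meta-search engine combines several result lists and removes duplicate and invalid URLs can be sketched as follows. This is a simplified illustration, not Meta-Crawler's actual method: the engine lists and example addresses are invented, "invalid" here just means the text does not start with http:// or https://, and two addresses count as duplicates if they differ only by a trailing slash.

```python
def merge_results(result_lists):
    """Combine several engines' result lists, dropping duplicate and invalid URLs."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            # Treat anything that is not an http(s) address as invalid.
            if not url.startswith(("http://", "https://")):
                continue
            # Ignore a trailing slash when checking for duplicates.
            key = url.rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


# Hypothetical results from two different search engines.
engine_a = ["http://example.com/glenn", "http://example.org/nasa/"]
engine_b = ["http://example.org/nasa", "not a url", "http://example.net/space"]

print(merge_results([engine_a, engine_b]))
```

The merged list keeps the order in which pages were first found, so results from the first engine appear ahead of new results from the second.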
On our "Staff Tool Kit" page we have all three areas described above listed out as
One of the common complaints about searching on the internet is that you can get too much irrelevant information returned after keying in a search keyword. As we've seen in our examples, there are thousands of web pages for John Glenn.
EVERY search engine has a web page marked "help", "advanced search", "tips", or some similar name. These pages are designed to help people narrow down what they are trying to find. They are not just for people with "advanced" skills...they're for everybody!
Examples of "advanced search"-type pages:
Here are some other ways to narrow down your search. These are tips from the former search engine Infoseek.
Check spelling
Use multiple words
Example: best pizza in San Francisco
Use similar words
Example: restaurant cafe bistro
Use appropriate capitalization
Example: Rock Hudson
Use quotation marks to indicate a phrase
Example: "yellow brick road"
Use plus (+) or minus (-) to give a term more priority or to exclude it from a search
Example: +"small dogs" -Chihuahua
Use a comma to separate names
Example: White House, Bill Clinton
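To show how a search site might read the quotation marks and the plus and minus signs in the tips above, here is a small Python sketch. It is an assumption-laden illustration rather than Infoseek's actual rules, and the function name parse_query is made up for this example.

```python
import re


def parse_query(query):
    """Split a query into required, excluded, and ordinary terms."""
    required, excluded, ordinary = [], [], []
    # Each match is an optional + or - sign followed by a quoted phrase or a single word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]*)"|(\S+))', query):
        term = phrase or word
        if not term:
            continue  # skip an empty pair of quotation marks
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            ordinary.append(term)
    return required, excluded, ordinary


print(parse_query('+"small dogs" -Chihuahua'))
```

Here the phrase small dogs is treated as required, Chihuahua is excluded, and nothing is left in the ordinary list.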
Another way to phrase search queries is to use Boolean logic. This involves using the words "and", "or" and "not" in conjunction with search keywords.
The pictures below are from the University of Albany Libraries tutorial on Searching on the Internet, and illustrate the various ways you can use boolean operators.
The more terms or concepts we combine in a search with AND logic, the fewer records we will retrieve.
For instance, if we search for web sites using the words poverty and crime separately, we could get 1000 returns for each! If we search for poverty AND crime, only pages containing both of those words will appear.
AND
If we were looking for information on colleges, we could search for college OR university and get all the returns that each term would bring. OR logic is most commonly used to search for synonymous terms or concepts. It brings back the largest number of records.
OR
NOT logic excludes records from your search results. Be careful when you use NOT: the term you do want may be present in an important way in documents that also contain the word you wish to avoid.
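The three Boolean operators behave like operations on sets of pages, which Python can demonstrate directly. The page names below are invented for illustration; each set stands for the pages a one-word search might return.

```python
# Hypothetical result sets: the pages each single-word search might return.
poverty = {"page1", "page2", "page3"}
crime = {"page3", "page4"}

both = poverty & crime          # AND: pages that mention both words
either = poverty | crime        # OR: pages that mention at least one of the words
only_poverty = poverty - crime  # NOT: pages on poverty that never mention crime

print(sorted(both))
print(sorted(either))
print(sorted(only_poverty))
```

AND gives the smallest list, OR gives the largest, and NOT drops page3 even though it also discusses poverty, which is why NOT should be used with care.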
Here is a handy chart that describes various search engines' use of search terms and Boolean operators:
InfoPeople's Search Tools Chart - http://www.infopeople.org/search/chart.html
The site above is updated on a regular basis and contains a variety of engines with their special features!
Switchboard - Name Search Exercise: In their "People Search" section (Top Left) use the first link and enter the information needed to find a person.
Mapquest - Address Search Exercise: Click directly on the word
"Maps" and enter any address. Observe how the scale under the map works
to zoom in and out of a location.
Medline Plus - Medical Search Exercise: Use their search box to find
information about a medical topic (ex: measles).
1. "Trim the Branch" - In the URL, take out the page name, then the sub-directory preceding it, then the sub-directory preceding THAT until you get to the main domain name. Usually, in going up the branch, you'll find clues to where the former page currently resides.
2. Use the "Wayback Machine" - http://www.archive.org
Also known as the Internet Archive, it's a huge collection of pages going back to 1996. An interesting article about the technology it uses and the noble goals of its creators can be found here:
http://www.oreillynet.com/pub/a/webservices/2002/01/18/brewster.html
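The "Trim the Branch" steps can also be written out as a short Python sketch that lists each shorter address to try. The function name trim_the_branch is made up for this example, and it uses Python's standard urllib.parse module to take the address apart.

```python
from urllib.parse import urlsplit, urlunsplit


def trim_the_branch(url):
    """Return the shorter URLs to try, from the nearest directory up to the domain."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Remove the page name first, then one sub-directory at a time.
    for i in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:i])
        if not path.endswith("/"):
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for address in trim_the_branch("http://www.pandia.com/goalgetter/recommendations.html"):
    print(address)
```

For this address the sketch prints the goalgetter directory first and then the main domain, which is the order in which you would try them by hand.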
Hybrid Search Engines
1. Teoma - http://www.teoma.com/
Teoma combines both Directory and Crawler techniques to offer a simple-to-use choice of offerings.
It returns information to you in three ways:
Web Pages: Authoritative sites relevant to the search term.
Web Pages by Topic: Top result pages are grouped based on their topics.
Experts' Links: Pages that contain directories of links for related general subjects.
The core Vivísimo technology is called Document clustering: the automatic organization of documents into meaningful groups.
Another unique feature is that it gives you the capability to "preview" a page before visiting it. It creates a small window within its search list that you can use to view the page.
Searchenginewatch.com is a great site for both the inexperienced web novice and the experienced web master. It is a constantly updated, industry-focused site that compiles statistics on which search engine database is bigger, which gets more traffic, which is owned by whom, etc. It has a monthly newsletter delivered through email, tutorials for learning more about using search sites (including one on Search Engine Math), and history and trivia about this new industry.
It also has tips for those who build web pages on how to help get their sites listed "near the top" of a list of search results. Anyone involved with Internet Marketing should review this site!
We have our web site divided into 3 distinct "zones", for the library, schools and community.
The Library zone has a dialogue box you can click into and ask for things concerning the Library or Lakewood. We have our book and A/V catalogs online, and there are web pages for all the major services the library provides. There's a link to OPLIN databases, where (if you have a library card) you can search through over 1000 magazines for full text, and in some cases full image articles. We also have pages set up for those interested in using the Internet to search for jobs, travel information, scholarships and housing.
The School zone contains information about the school district, web pages for many individual class assignments called "Pathfinders", information about many of the educational programs going on in the city, special pages for teachers, substitute teaching, and hyperlinked summer reading lists for all the area high schools.
The Community zone has web pages for all the major city departments, many local area businesses, civic and community organizations, plus historical information which includes "Lakewood Lore", a database of almost 200 articles by local journalist Dan Chabek!
Use the sites below to sharpen your skills and understanding of searching on the Internet!
Bare Bones 101: A Basic Tutorial On Searching The Web - Another Library training tutorial.
Check out Chapter 10, Troubleshooting Tips! http://www.sc.edu/beaufort/library/pages/bones/lesson10.shtml
The Internet Detective - Good for learning how to search for and evaluate web sites.
http://sosig.ac.uk/desire/internet-detective.html
The Pandia Goalgetter - Very good review of query syntax, troubleshooting and other tutorials.
http://www.pandia.com/goalgetter/index.html
The Pandia site also has this great recommendations guide for searching on the web -
http://www.pandia.com/goalgetter/recommendations.html
AskScott - You pick a topic and Scott, the virtual reference librarian, points you in the right direction.
http://www.askscott.com/
Take the Searching The Web Quiz
last updated 20050201