Cleveland-Marshall College of Law



Web Searching

In research, the WorldWideWeb may be viewed as one electronic information medium, to be considered alongside print, audiovisual, and other electronic information media.  As with all information media, one must understand how to find valid and reliable information on the Web effectively.  This "Web Searching" guide will help you become a better Web researcher and better evaluate the Web sites you find in your research.

Web Site Evaluation
Web Searching - General Issues
Web Searching Principles and Guidelines
Web Search Directories
Web Search Engines
Web Metasearch Engines
Invisible Web
Additional Resources


Web Site Evaluation

A WorldWideWeb "site" is composed of many "pages."  Current size estimates of the indexable Web put it at 11.5 billion pages, and the Invisible Web may be as large as 500 billion pages.  [For statistics, news, and research information on the Web, consult ClickZ.]  With all these Web pages, how can you find the information you want?  In addition, even if you find a Web site that appears to have your desired information, how can you know if the Web site is reliable?  You can probably trust sites from established organizations, but what about Web sites of organizations you know nothing about?
 

Web Site Home Page and Site Index

When evaluating a Web site, go to its "home page," or opening page of the site, and look for these key pieces of information:

  • Purpose
  • Scope of Services and Information
  • Navigation Methods
  • Help Information 

If the home page doesn't address this information, look for a link to the "site index" or "site map."  As its name implies, the site index is like the index of a book.  Examining the site index should help you decide if the Web site has the information you want.
 

Web Site Evaluation Criteria

Beyond the home page and site index, consider the following criteria when evaluating a Web site:

  • Credibility / History - who created the site; are they a reliable authority; who are their affiliations, partners, or sponsors?
  • Purpose / Philosophy - why was the site created?
  • Audience / Relevance - who was the site designed for?
  • Content - what are the site's topics or subjects?
  • Scope / Context / Coverage - how much information on a topic does the site cover; does the site report its size and growth rate?
  • Selection Criteria / Critical Thinking / Objectivity / Censorship - how is information selected for the site?
  • Accuracy / Documentation - does the site document its sources?
  • Currency / Updating - how often is the site's information updated and what is the scope of that updating?
  • Writing Quality - can you cognitively understand the site's information?
  • Design / Presentation Format - can you visually or aurally understand the site's information?
  • Stability / Continuity / Maintenance - can you consistently connect to the site?
  • Accessibility - is the site accessible via different browsers; does the site have "text only" or "no frames" versions?
  • Interface - can you understand the site's graphical and textual controls; how quickly do these controls operate?
  • Navigation / Site Index / Searchability - can you understand how to find information on the site?
  • Connectivity - does the site link to other helpful sites?
  • Help Information / Frequently Asked Questions / Customer Support
  • Usefulness / Value-to-Cost Ratio - how does the site compare to other analogous sites; if it's a fee-based site, is its information worth the cost?


Web Site Reviews

Internet Scout Project
Produces The Scout Report, which announces and reviews Web sites and mailing lists.  The searchable Scout Report Archives contain approximately 23,000 reports, and include the ability to browse for reviews by Library of Congress Subject Headings (eg, "Law - United States - cases").

 

 

Web Searching - General Issues

The Web offers a variety of good search services.  Many provide "value-added" services, such as customized display of search results.  However, it is important to remember that economic forces are quite active on the Web.  In addition, search services do not reach all Web-based information.  [See the Invisible Web section of this guide.]
 

Web Search Service Trends

Over the last decade, several trends have emerged in Web search service operations:

  • Partnerships - Partnering with organizations, such as "amazon.com," to offer easy material purchase and other customer services.
  • Advertisements - Using pop-up ads to cover maintenance expenses or make additional income.
  • Fee Services - Offering fee-based companion services, such as a full-text database.
  • Charges for preferential listings - Selling high positions in search results lists.  [On 27 June 2002, responding to a complaint filed by Commercial Alert, the US Federal Trade Commission sent a letter to the leading search engine companies warning them to adequately disclose and distinguish paid placements from unpaid ones.  See  Commercial Alert Complaint Letter.]
  • Customized search form and results display - Offering capability of personalizing how you search and see search results; "cookies" (ie, information stored on your computer by a Web site) make this possible.


Web Search Service Problems

Despite their wonderful capability of finding information, Web search services can return problematic information.  Web Search Directories cover less of the Web than Engines, but tend to have fewer problems because human beings compile verified information about Web sites.  Web Search Engines cover more of the Web than Directories, but they electronically compile information that may or may not be verified.

All Web search services can be subject to the following problems:

  • Inadequate updating and relocating leads to retrieval of many invalid and duplicate URLs.  [The URL - uniform resource locator - is the "address" of a Web site or page.]
  • Because each search service covers different parts of the Web, as well as indexes overlapping parts in different ways, searchers need to use multiple search services to be really thorough.
  • "Popular," rather than the best, Web sites tend to get indexed more.  [A "popular" site is one that other sites link to; the more sites that link to a site, the more "popular" that site is.]
  • It can take months for a new site to get indexed.
  • Information in frames or image maps is often not indexed.
  • Text in "alt" attributes describing graphics is indexed, but not the words in the graphics themselves.
  • Information in the "Invisible Web" (eg, password-protected sites) is inaccessible.  [More on this in the Invisible Web section of this guide.]
  • Partnership constraints may affect information provided.
  • Annoying, distracting advertisements.
  • Slow, congested traffic!  [WorldWideWait]
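
The "popularity" measure described in the list above (the more sites that link to a site, the more "popular" it is) can be sketched in a few lines of Python.  This is only an illustration: the link graph and the example.* site names are hypothetical, and real search services combine link counts with many other signals.

```python
from collections import Counter

# Hypothetical link graph: each key is a site, each value lists the sites it links to.
links = {
    "a.example": ["c.example", "d.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
    "d.example": ["c.example"],
}

def inbound_link_counts(link_graph):
    """Count how many distinct sites link to each target site."""
    counts = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):   # count each linking site only once
            if target != source:      # ignore a site's links to itself
                counts[target] += 1
    return counts

popularity = inbound_link_counts(links)
ranked = [site for site, _ in popularity.most_common()]  # most linked-to site first
```

In this sketch, a site that no other site links to (such as a.example) gets a count of zero, which mirrors why a good but little-known site may be indexed late or ranked low.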


Web Search Service Evaluation

Search Engine Watch
Search service review Web site created by Danny Sullivan, who continues to edit the site for Incisive Media.  Provides statistics, reviews, and comparative testing.  Includes free daily SearchDay and monthly Search Engine Report newsletters, as well as the Search Engine Blog and podcasts.


In addition to consulting the above review site, consider the following criteria when deciding whether a Web search service meets your needs:

  • Selection criteria or human involvement in indexing, reviewing, and screening information.
  • "Refresh" frequency - how often is the directory information updated; how often does the engine robot "recrawl" to refresh information?
  • Interface - can you understand the search service's graphical and textual controls; how quickly do these controls operate; is there an advanced search capability?
  • Search Tips / Help Information.
  • Boolean Logic Connectors (eg, "and," "or," "not") - can you use these to construct a search statement?
  • Proximity Searching (eg, "near," "adj") - this connector is especially desired for full-text searching.
  • Phrase Searching.
  • Case Sensitivity - does the service understand upper and lower case?
  • Punctuation Sensitivity - does the service understand punctuation?
  • Truncation Capability - can the service retrieve varying forms of a word (eg, where "child*" retrieves "child" "children" etc.)?
  • Field Searching - can you limit a search to a particular portion of a Web site?
  • "Stop Words" (eg, to, be) - does the service not search certain common words?
  • Results Display - does the service limit search results in any way; how does the service rank search results?
  • Cost / Fee.
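
Several of the criteria above (Boolean connectors, case sensitivity, truncation, and stop words) can be illustrated with a small Python sketch.  The page texts, the stop word list, and the function names are assumptions made for this example; actual search services differ in how they implement each feature.

```python
import re

# Hypothetical mini-collection: page identifier -> page text.
pages = {
    "p1": "Child custody law in Ohio",
    "p2": "Children and the courts",
    "p3": "Ohio tax law",
    "p4": "To be or not to be",
}

STOP_WORDS = {"to", "be", "or", "not", "the", "and", "in"}  # assumed list

def build_index(collection):
    """Map each lower-cased word (minus stop words) to the set of pages containing it."""
    index = {}
    for page_id, text in collection.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(page_id)
    return index

def lookup(index, term):
    """Return pages matching a term; a trailing * truncates (child* -> child, children)."""
    term = term.lower()               # not case sensitive
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, page_ids in index.items():
            if word.startswith(stem):
                hits |= page_ids
        return hits
    return set(index.get(term, set()))

index = build_index(pages)

ohio_and_law = lookup(index, "ohio") & lookup(index, "law")      # "and" -> intersection
ohio_or_child = lookup(index, "ohio") | lookup(index, "child*")  # "or"  -> union
law_not_tax = lookup(index, "law") - lookup(index, "tax")        # "not" -> difference
```

Note that page p4 consists entirely of stop words, so in this sketch it can never be retrieved; this is the practical cost of a service that does not search common words.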

 


Web Searching Principles and Guidelines


 

Key Principles

  • If your first thirty hits are not on point, change your search statement and/or use a different search service.
  • Evaluate retrieved sites based on criteria outlined in the Web Site Evaluation section of this guide.
  • Bookmark appropriate and valuable Web sites.


General Guidelines

  • Write down what you are seeking - combine keywords into search statement - before going online.
  • Consider browsing a Search Directory before using a Search Engine.
  • Check search service's tips and help information.
  • Keep search simple - complex searches often retrieve much irrelevant information when the keywords are not combined effectively.
  • Use several synonyms of keywords.
  • Start specific; only use general terms if necessary.

    Narrow search - will have fewer items retrieved, but they will likely have high relevance.
    Broad search - will have many items retrieved, but most will likely have low relevance.

  • Enter most important concept first.
  • Use phrases - phrases are usually enclosed in quotation marks.
  • Boolean Connectors -

    Use UPPER CASE letters.

    Check help information for default connector (often defaults to "and").

    Often use + in front of word for "and" (ie, includes term).

    Often use - in front of word for "not" (ie, excludes term).

  • Proximity Connector (eg, "near," "adj") - this connector especially desired for full-text searching.
  • Truncation symbol usually an asterisk ( * ).
  • Use parentheses to combine keywords, similarly to data combinations in algebra or deductive logic.
  • If available, use "field" searching - Web site title and URL particularly helpful.
  • If available, use "limit/refine" capability.
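
The guidelines above on phrases, connectors, and parentheses can be combined into a single search statement.  The following Python sketch shows one way to assemble such a statement before going online.  The function names are hypothetical, and the + / - / OR syntax follows the conventions described in this guide; always check a search service's help information, because the syntax it accepts may differ.

```python
def quote_if_phrase(term):
    """Enclose multi-word terms in quotation marks so they are searched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term

def build_search_statement(required, synonyms=(), excluded=()):
    """Compose a search statement, most important concept first.

    required -- terms that must appear (prefixed with +)
    synonyms -- alternative terms combined with OR inside parentheses
    excluded -- terms that must not appear (prefixed with -)
    """
    parts = ["+" + quote_if_phrase(t) for t in required]
    if synonyms:
        parts.append("(" + " OR ".join(quote_if_phrase(t) for t in synonyms) + ")")
    parts.extend("-" + quote_if_phrase(t) for t in excluded)
    return " ".join(parts)

statement = build_search_statement(
    required=["legal research", "ohio"],
    synonyms=["statute", "code"],
    excluded=["tax"],
)
# statement is: +"legal research" +ohio (statute OR code) -tax
```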
     

 

Web Search Directories

Web Search Directories are created by human beings who identify Web sites and list them according to a subject classification.  You can browse or search a Web Search Directory.  When searching within a Directory, usually the main page of a Web site, rather than multiple pages of that site, is listed in your search results.  This feature helps to reduce duplicates in search results.  Because of their human indexing, Web Search Directory search results often include annotations and evaluations of Web sites, which is particularly helpful.

Open Directory Project
Developed and maintained by 74,000 volunteer editors.
Netscape owns the copyright to ODP's compilations, but freely grants license to them.  Thus, ODP is used by many search engines as their subject directories.

Note the Law sub-category under the Society category.

Yahoo!Directory
Note the Law sub-category under the Government category.

The original Yahoo! was founded by David Filo and Jerry Yang in 1994; now a corporation based in Sunnyvale, CA.

Yahoo! charges "commercial" sites annual fees to be listed.
Yahoo! started Yahoo!Search in 2004 after acquiring the "All the Web" and "AltaVista" search engines.  [See the Web Search Engines section of this guide.]

WWW Virtual Library
The first search directory, founded by Tim Berners-Lee, creator of HTML and the Web.
Maintained by volunteers.

Note the Law category.
 
 

Web Search Engines

To establish its information bank, a Search Engine's search software sends out "robots" (or "spiders" or "crawlers") that use HTTP to request data from Gopher, FTP, and HTTP servers.  [HTTP - HyperText Transfer Protocol; FTP - File Transfer Protocol.]  Data is indexed and stored; the main page and additional pages of a site are indexed.

Note:  When you use a Search Engine, you are only searching information indexed and stored by that Engine.
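
The crawl-then-index process described above, and the note that you only search what an Engine has stored, can be sketched in Python.  To keep the example self-contained, the "Web" here is a hypothetical in-memory dictionary and the fetch step is a simple lookup; a real robot would request each page over the network.

```python
from collections import deque

# Hypothetical in-memory "Web": address -> (page text, addresses the page links to).
web = {
    "site/home":   ("Ohio law library", ["site/guides", "site/hours"]),
    "site/guides": ("Legal research guides", ["site/home"]),
    "site/hours":  ("Library hours", []),
    "site/orphan": ("Ohio legal research tips", []),  # no page links here
}

def crawl_and_index(start, fetch):
    """Follow links breadth-first from `start`, indexing the words of each page visited."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        address = queue.popleft()
        text, links = fetch(address)
        for word in text.lower().split():
            index.setdefault(word, set()).add(address)
        for link in links:
            if link not in seen:      # visit each page only once
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index("site/home", lambda address: web[address])

def search(word):
    """Search only the stored index, never the live Web."""
    return sorted(index.get(word.lower(), set()))
```

In this sketch, search("tips") returns nothing even though a page about tips exists, because no crawled page links to "site/orphan" and the robot therefore never stored it.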

Google and Yahoo!Search are two leading search engines.  Both have the following features:

  • Regular and "advanced" searching
  • Searches Web pages as well as image, PDF, and other file types (eg, Rich Text Format, PowerPoint).
  • When using Boolean operators, defaults to "and"; also supports Boolean operators  "or" and "not"
  • "+" for "must include" and "-" for "exclude"
  • Phrase searching
  • Not case sensitive
  • Field searching
  • File format (eg, pdf) searching
  • Supports 3-month, 6-month, and year date restrictions
  • Can search in at least 35 languages
  • Translates text or Web page from at least English to French, German, Italian, Portuguese, and Spanish; these five languages into English; as well as German to French and French to German
  • "Mature" or "adult" filter option
See additional descriptive information on these two search engines below.  Also included are results for sample searches seeking information on "Ohio legal research."

Google
Also searches Usenet newsgroups.

Supports over 100 interface languages.
Does not support truncation, but automatically searches for variant forms of a word.
Can use * (ie, asterisk) as a "wildcard" within a phrase search.
Supports ~ (ie, tilde) in front of word to search for its synonyms.
Can search within search results.
Divides search results into "Web," "Images," "Groups," "News," and "Froogle" (ie, products) categories.

Includes "Google Scholar" feature that searches articles, books, theses, and other scholarly materials.

"ohio legal research"
About 13,200 Web pages will be retrieved.

ohio "legal research"

About 1,070,000 Web pages will be retrieved.  [Google assumes the "and" Boolean connector between words unless otherwise specified.]

"legal research" ohio
About 1,080,000 Web pages will be retrieved.  [Google assumes the "and" Boolean connector between words unless otherwise specified.]


Yahoo!Search
Also searches Yahoo! directory, as well as other Yahoo! portal databases (eg, Yahoo!News).
Supports search nesting within parentheses.
Can use stop word (eg, "a") as a "wildcard" within a phrase search.
Also translates text or Web page from English to Chinese, Dutch, Greek, Japanese, Korean, and Russian; Chinese, Dutch, Greek, Japanese, Korean, and Russian to English; Dutch to French; as well as French to Dutch, Greek, Italian, Portuguese, and Spanish.
Divides search results into "Web," "Images," "Video," "Audio," "Directory," "Local [ie, Businesses]," "News," and "Shopping" categories.

"ohio legal research"
About 22,500 Web pages will be retrieved.
ohio "legal research"
About 991,000 Web pages will be retrieved.
"legal research" ohio
About 998,000 Web pages will be retrieved.
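
The sample searches above retrieve far fewer pages for the exact phrase "ohio legal research" than for ohio combined with the phrase "legal research."  The Python sketch below illustrates why, using four hypothetical page texts and a simplified matching rule (quoted phrases must appear word for word; all other words are joined by an implied "and").  It is not how either search engine is actually implemented, and unlike the real engines it ignores the order of the terms in the statement.

```python
import re

# Hypothetical page texts, used only to illustrate the two kinds of matching.
pages = [
    "Ohio legal research guide",
    "Legal research resources for Ohio attorneys",
    "Ohio State University research on legal history",
    "Legal research in Michigan",
]

def words(text):
    return re.findall(r"[a-z]+", text.lower())

def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively, in order, in `text`."""
    haystack, needle = words(text), words(phrase)
    return any(haystack[i:i + len(needle)] == needle
               for i in range(len(haystack) - len(needle) + 1))

def matches(text, statement):
    """Every quoted phrase and every remaining word must appear (implied "and")."""
    phrases = re.findall(r'"([^"]+)"', statement)
    singles = words(re.sub(r'"[^"]+"', " ", statement))
    return (all(contains_phrase(text, p) for p in phrases)
            and all(w in words(text) for w in singles))

def count(statement):
    """Number of pages in the collection that match the statement."""
    return sum(matches(page, statement) for page in pages)

phrase_hits = count('"ohio legal research"')
combined_hits = count('ohio "legal research"')
```

Here the exact phrase matches one page, while ohio plus "legal research" matches two, because the second page contains both elements but not as one continuous phrase.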

 

 

Web Metasearch Engines

A Web Metasearch Engine sends your search statement to several search services, receives results, deletes duplicates, and displays results in a single list.  This type of Web search service can save aggravation and time, because you only need to know one interface to search and don't need to search multiple search services.  In addition, you can often select which services your search will go to.  However, since a Metasearch Engine sends a search statement to several search services, and search services have different search methods, a complex search statement may not run effectively.
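
The send, receive, de-duplicate, and merge steps described above can be sketched in Python.  The two "engines" below are hypothetical stand-ins that return fixed lists of example addresses, and the address normalization is a deliberately crude assumption; a real metasearch engine queries live services and uses more careful duplicate detection.

```python
def normalize(url):
    """Crude normalization so trivially different addresses compare equal (an assumption)."""
    url = url.strip().lower()
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")

def metasearch(statement, services):
    """Send one statement to every service and merge the results into one list without duplicates."""
    merged, seen = [], set()
    for name, search in services.items():
        for url in search(statement):
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append((url, name))  # remember which service found it first
    return merged

# Hypothetical stand-ins for real search services.
services = {
    "engine_a": lambda s: ["http://www.example.org/ohio", "http://example.com/law"],
    "engine_b": lambda s: ["http://example.org/ohio/", "http://example.net/research"],
}

results = metasearch('ohio AND "legal research"', services)
```

Note that the sketch passes the same statement to every service unchanged.  This is exactly where a complex statement can fail in practice, since each service may interpret connectors such as AND or NEAR differently.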

Dogpile and Vivisimo are two leading metasearch engines.  Both have the following features:

  • Boolean operators "and," "or," and "not" (Vivisimo uses "-" for "not")
  • Proximity searching - "near"  [Note:  No current search engines support "near" and this type of searching may no longer work in the metasearch engines.]
  • If a search engine doesn't support an operator (eg, "and," "near"), the metasearch engine uses the next general operator
  • Phrase searching
  • Not case sensitive
  • Search nesting within parentheses
See additional descriptive information on these two metasearch engines below.  Also included are results for sample searches seeking information on "Ohio legal research."
 

Dogpile
Allows one to search at least 6 Web search engines/directories (ie, About, AskJeeves, Google, LookSmart, MIVA, MSN, and Yahoo!Search), as well as several audio, video, image, and news services.

Divides search results into "Web," "Images," "Audio," "Video," "News," "Yellow Pages," and "White Pages" categories.

Can view search results highlighted by the search service that retrieved them.

Also provides some links to categories related to your search; uses Vivisimo technology (see Vivisimo section of this guide).
Note that Dogpile's default filter setting is "moderate."

"ohio legal research"
About 90 Web pages will be retrieved.
ohio  NEAR  "legal research"
About 110 Web pages will be retrieved.
ohio  AND  "legal research"
About 60 Web pages will be retrieved.
 

Vivisimo
Allows one to search 8 Web search engines/directories (ie, BBC, GigaBlast, LII, LookSmart, Lycos, MSN, Open Directory, and WiseNut) as well as numerous news, business, and government sites.

Presents highly ranked search results first, with an option to continue browsing additional search results.
Vivisimo's hallmark feature is that search results are arranged in ranked "Clusters" (ie, subject areas). This cluster classification is not predetermined (as in Query Server), allowing for maximum classification flexibility.
Note that Vivisimo supports field searching.

"ohio legal research"
About 170 Web pages, in 24 clusters (some with sub-clusters), will be retrieved.  An additional 33,950 Web pages will also be retrieved.
ohio  NEAR  "legal research"
About 130 Web pages, in 31 clusters (some with sub-clusters), will be retrieved.  An additional 28,260 Web pages will also be retrieved.
ohio  AND  "legal research"

About 130 Web pages, in 22 clusters (some with sub-clusters), will be retrieved.  An additional 216,860 Web pages will also be retrieved.


 

Invisible Web

The Web includes a lot of "Dynamic Content."  This "Invisible Web" or "Deep Web" is information transmitted via the Web, rather than stored on the Web in "static" form, and thus is not available for indexing by search engines.  [In addition, a Web site can use the Robots Exclusion Protocol to ask a search engine's robot crawler not to access or index portions of the site.]  The Invisible Web includes information in databases, password-restricted information, text within graphics, etc., and is estimated to be 400,000 Web sites.  Web search services only index an estimated 20% - 50% of the Web.
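
The Robots Exclusion Protocol mentioned above is usually expressed as a "robots.txt" file at the top level of a Web site, which well-behaved robots consult before crawling.  The Python sketch below uses the standard library's urllib.robotparser module to show how a robot would interpret such a file.  The file contents, the www.example.org addresses, and the robot name "ExampleBot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site.
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse the rules directly; no network access needed

def may_crawl(url, robot_name="ExampleBot"):
    """Return True if the rules allow the named robot to fetch the address."""
    return parser.can_fetch(robot_name, url)

home_allowed = may_crawl("http://www.example.org/index.html")
private_allowed = may_crawl("http://www.example.org/private/report.html")
search_allowed = may_crawl("http://www.example.org/search?q=ohio")
```

Under these rules the home page may be crawled, but the private directory and the site's own search results pages may not, so their contents would stay out of a search engine's index.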

If you cannot find information on a topic via a Web Search Directory or Search Engine, try an Invisible Web search service.  You may be led to a fee-based Web site, but at least you'll find out if any information is available on your topic.

CompletePlanet
Covers over 70,000 searchable databases and search services.
Browse resources via 34 top-level subject "browse tree" (eg, Government).  The "Browse tree" extends to five sub-levels.
Can also search within "browse tree" subject directory, and "Advanced" searching is available.


Librarians' Internet Index
Searchable annotated subject directory of over 14,000 Internet resources; librarians select the resources based on their "usefulness to users of public libraries."
Browse resources within subject directory (eg, Government, with sub-category Law).
Can also search within directory, and "Advanced Search" is available.


 

Additional Resources

 

Web-Based Tutorials

Guide to Effective Searching of the Internet

Available on the BrightPlanet Web site.

The Pandia Goalgetter:  a Short and Easy Internet Search Tutorial

Reviews Web directories, search engines, metasearch engines, as well as search principles and guidelines.


Web-Based Articles

Deep Web Research 2006 / Marcus P. Zillman.  Law Library Resource Xchange, 1/15/06.

Evaluating the Quality of Information on the Internet / The Virtual Chase, created 9/14/01, revised 9/16/04.

Deep Web Research / Marcus P. Zillman.  Law Library Resource Xchange, 2/23/03.

 

Books

Ambient Findability / Peter Morville.  O'Reilly, c2005.

[Electronic resource available from Cleveland State University - QA76.9 .D26 M673 2005eb.]

The Extreme Searcher's Guide To Web Search Engines: A Handbook For the Serious Searcher / Randolph Hock, foreword by Reva Basch.  CyberAge Books, c2001.
[Available from Cleveland-Marshall College of Law Library - Reference ZA4226 .H63 2001]

Google and Other Search Engines / Diane Poremsky.  Peachpit Press, c2004.

[Available from Cleveland State University Library - TK5105.884 .P67 2004]

Internet Blue Pages: The Guide To Federal Government Web Sites / Laurie Andriot, comp.  Information Today, Inc., c2000, 2001-2002.
[Available from Cleveland-Marshall College of Law Library - Reference ZA5075 .A53]

Internet Power Searching: the Advanced Manual / Phil Bradley.  Neal-Schuman Publishers, c2002.

[Available from Cleveland State University Library - ZA4201 .B69 2002]

The Invisible Web: Uncovering Information Sources Search Engines Can't See / Chris Sherman and Gary Price.  CyberAge Books, c2001.
[Available from Cleveland-Marshall College of Law Library - ZA4450 .S54 2001]

IssueWeb:  a Guide and Sourcebook for Researching Controversial Issues on the Web / Karen R. Diaz and Nancy O'Hanlon.  Libraries Unlimited, c2004.

[Available from Cleveland State University Library - Curr Mats ZA4228 .D53 2004]

The Librarian's Internet Survival Guide: Strategies For the High-Tech Reference Desk / Irene E. McDermott; edited by Barbara Quint.  Information Today, Inc., c2002.
[Available from Cleveland-Marshall College of Law Library - Reference ZA4201 .M36 2002]

The Professional's Guide To Mining the Internet: Information Gathering and Research on the Net / Brian Clegg.  Kogan Page; Stylus Pub., c2001.

[Available from Cleveland State University Library - ZA4230 .C56 2001]

Search Engine Visibility / Shari Thurow.  New Riders, c2003.

[Electronic resource available from Cleveland State University - ZA4201 .T48 2003eb.]

Search Engines For The World Wide Web / Alfred and Emily Glossbrenner.  Peachpit Press, c2001.

[Electronic resource available from Cleveland State University - ZA4230 .G57 2001eb.]

Sorting Out the Web: Approaches To Subject Access / Candy Schwartz.  Ablex Pub., c2001.

[Available from Cleveland-Marshall College of Law Library - ZA4232 .S39 2001]

Toward a Cyberlegal Culture / Mirela Roznovschi.  Transnational Publishers, c2002.

[Available from Cleveland-Marshall College of Law Library - K87 .R69 2002]

Using the Internet as a Reference Tool: a How-To-Do-It Manual for Librarians / Michael P. Sauers, with contributions by Denice Adkins.  Neal-Schuman Publishers, c2001.

[Available from Cleveland State University Library - Z711.45 .S28 2001]

Web of Deception:  Misinformation on the Internet / Anne P. Mintz, ed.  CyberAge Books, c2002.
[Available from Cleveland-Marshall College of Law Library - Reference ZA4201 .W43 2002]

Web Research: Selecting, Evaluating, and Citing / Marie L. Radford, Susan B. Barnes, and Linda R. Barr. Allyn and Bacon, c2002.

[Available from Cleveland State University Library - ZA4201 .R33 2002]

The World At Your Fingertips:  Learning Research and Internet Skills / Heidi Kay and Karen DelVecchio.  UpstartBooks, c2002.

[Electronic resource available from Cleveland State University.]

 

Annuals and Periodicals

The Internet Lawyer

Baltimore, MD: Daily Record Company. Monthly.
[Available from Cleveland-Marshall College of Law Library; current issue on Reserve - KF242 .A1 I57 & Electronic]

Internet Law & Strategy

Philadelphia, PA: Law Journal Newsletters, a division of American Lawyer Media. Monthly.
[Available from Cleveland-Marshall College of Law Library; current issue on Reserve - KF390.5 .C6 I559]

The Legal List: Research On The Internet
St. Paul, MN: West Group, Inc. Annual.
[Available from Cleveland-Marshall College of Law Library - KF242 .A1 L375; current edition in Reference]

The United States Government Internet Manual / Peggy Garvin, ed.  Bernan Press, c2004.

[Available from Cleveland-Marshall College of Law Library - ZA5075 .G68; current edition in Reference]


 

Laura E. Ray, MA, MLS
Educational Programming Librarian
March 2007