Wide Web Is Available Around Term Paper
- Length: 52 pages
- Subject: Education - Computers
- Type: Term Paper
- Paper: #14951486
Excerpt from Term Paper:
The reward for the effort of learning is access to a vocabulary that is shared by a very large population across all industries globally" (p. 214). Moreover, according to Bell, because UML is a language rather than a methodology, practitioners who are familiar with UML can join a project at any point from anywhere in the world and become productive right away. Therefore, Web applications that are built using UML provide a useful approach to helping professionals gain access to the information they need when they need it.
Overview of the Study
This paper uses a five-chapter format to achieve the above-stated research purpose. Chapter one introduces the topic under consideration and provides a statement of the problem, the purpose of the study, and its importance. Chapter two reviews the related peer-reviewed and scholarly literature concerning search optimization on the World Wide Web, and chapter three describes the study's methodology more fully, including the study approach, the data-gathering method, and the database of studies consulted. Chapter four analyzes the data developed during the research process, and chapter five presents the study's conclusions, a summary of the research, and recommendations.
Review of the Related Literature
The World Wide Web
The World Wide Web (hereinafter the "WWW" or, alternatively, "the Web") is a unique information environment because it is (a) very large and growing larger daily, (b) highly searchable, (c) virtually ubiquitous, and (d) potentially very useful (Ratner, 2003). By any measure, the Web is enormous and continues to grow at exponential rates. For example, in 2003, one new server was added to the WWW every 2 seconds and seven-and-a-half Web pages were added every second, and there were already 27.5 million Web sites and 413.7 million users (Ratner, 2003). Today, there are more than one-and-a-half billion Web users (Turner, 2009), and the WWW represents a highly accessible medium that features a wide range of search engines used to locate relevant and desired information (Ratner, 2003). Google, for example, handles hundreds of millions of searches each day (Ratner, 2003). According to Wade (2009), "Over the past decade, Google has revolutionized the internet. By devising complex search algorithms and amassing vast storehouses of computational power, the Mountain View, California-based company has democratized knowledge distribution to the point where every individual can now access the volumes of information that historically required the backing of an organization" (p. 37).
Moreover, the WWW has become increasingly available in other countries, and access has been simplified in a number of ways; for example, the Web can now be reached through various handheld devices and television sets (Moyer, 2009). According to Ratner (2003), "Last but not least, the Web contains information that users want. A common phrase among Net-savvy users is 'You can find the answer on the Web'" (p. 267). Indeed, a common response to a question today is "Google it." The WWW has fundamentally changed the way people search for information compared to years past, but some constraints to its effective use remain firmly in place. For instance, Ratner advises that, "The Web is larger, more searchable, more ubiquitous, and more useful than previous digital libraries. However, even though the Web has made a wide variety of information available, this increase in the amount of accessible information actually exacerbates the problem of information access, because as humans we have limited human capacity for absorbing information" (2003, p. 268).
As noted in the introductory chapter, there are also Web sites that are more difficult to find during searches, resulting in the reference to these resources as the "dark" or "invisible" Web (Pedley, 2001). On the one hand, Pedley notes that, "The visible web is the 'publicly indexable' or 'surface web' -- those Web sites that have been picked up and indexed by the search engines" (p. 4). On the other hand, there is the so-called "invisible" or "dark Web." In this regard, Pedley advises, "The phrase 'the invisible Web' refers to information that search engines cannot or do not index. The content that resides in searchable databases, for example, cannot be indexed or queried by traditional search engines because the results are generated dynamically in response to a direct query" (p. 4). The term, "invisible Web," refers to the hidden nature of the Web pages that are not readily accessed using standard search engines. For instance, according to Pedley, "Whilst the search engines might be able to index the home page of a database, they are unable to index each individual record within that database. So, in effect, an enormous amount of valuable content on the web is 'invisible' because it is locked up within databases" (p. 5). There are other constraints to providing efficient search results using many popular search engines. Beyond the "visible" and "invisible" Web exists yet another component that contains an enormous amount of information that is easier to access than the invisible Web but more difficult to access than the visible Web. In this regard, Pedley notes that, "The World Wide Web is so big that to index every single page available would put a great strain on the available computer power, and consequently the search engines may impose a limit on the number of pages that they retrieve from a Web site" (p. 6).
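The per-site page limit that Pedley describes can be illustrated with a short sketch. The example below is not drawn from any actual search engine; it assumes a hypothetical six-page site held in memory (site_links), a hypothetical function name (crawl_with_page_limit), and an arbitrary cap of three pages. It simply shows how a breadth-first crawl that stops at a fixed page count leaves the remaining pages unindexed even though they are reachable by ordinary hyperlinks.

```python
from collections import deque


def crawl_with_page_limit(links, start, max_pages):
    """Breadth-first crawl of one site that stops after max_pages pages.

    links maps each page to the pages it links to. The function returns
    the pages that were "indexed" before the cap was reached.
    """
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue and len(indexed) < max_pages:
        page = queue.popleft()
        indexed.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed


# Hypothetical site: each key is a page, each value lists its outgoing links.
site_links = {
    "/": ["/about", "/products"],
    "/about": ["/team"],
    "/products": ["/products/a", "/products/b"],
    "/team": [],
    "/products/a": [],
    "/products/b": [],
}

indexed_pages = crawl_with_page_limit(site_links, "/", max_pages=3)
# Pages that exist on the site but fell outside the cap.
unindexed_pages = sorted(set(site_links) - set(indexed_pages))

print(indexed_pages)
print(unindexed_pages)
```

In this sketch the three unindexed pages are still linked from pages that were indexed, so a user who reaches the site through a search engine could navigate to them directly.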
This constraint results from a management decision by the search engine companies to limit the number of pages that their services will index from a specific site. Once a Web site is located using a search engine, however, it may be possible to reach these "hidden" pages through the hyperlinks on the pages that have been indexed or through the site map of the Web site (Pedley, 2001). This part of the WWW has been termed the "barely visible Web" or the "opaque Web," in contrast to the "dark" or "hidden" Web, and there are a number of reasons why this segment of the WWW exists today, including the following:
1. Depth of crawl -- the search engines may have a fixed limit on how many pages they will index within a site.
2. Frequency of updating -- while some sites are updated many times a day, the search engines might only revisit the site every few weeks or months and so there will always be a time lag between new data being loaded onto a site and the search engines indexing that new information. The search engines are not geared up for sites with real-time or frequently updated content.
3. Robots.txt or the NOINDEX metatag -- search engines use "robots" in order to scan and index a website. It is possible to tell them which pages and directories they can index by using the robots.txt file; however, some ISPs might not let users have access to the robots.txt file, in which case they can use the NOINDEX and the NOFOLLOW metatags. A value of "NOINDEX" allows the subsidiary links to be explored, even though the page is not indexed. A value of "NOFOLLOW" allows the page to be indexed, but no links from the page are explored (Pedley, 2001, p. 7).
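The third mechanism in the list above can be sketched with Python's standard library. The example is illustrative only and does not represent how any particular search engine is implemented: the robots.txt content, the crawler name "ExampleBot," the example.com addresses, and the helper names RobotsMetaParser and page_policy are all assumptions made for this sketch. It checks a URL against robots.txt rules using urllib.robotparser and then reads the NOINDEX and NOFOLLOW values from a page's robots metatag using html.parser.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Parse the rules from text instead of fetching them over the network.
robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


class RobotsMetaParser(HTMLParser):
    """Collects the values of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for value in content.split(","):
                self.directives.add(value.strip().lower())


def page_policy(html):
    """Return whether a page may be indexed and whether its links may be followed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "index": "noindex" not in parser.directives,
        "follow": "nofollow" not in parser.directives,
    }


noindex_page = '<html><head><meta name="robots" content="noindex"></head></html>'
nofollow_page = '<html><head><meta name="robots" content="nofollow"></head></html>'

print(robots.can_fetch("ExampleBot", "https://example.com/private/report.html"))
print(robots.can_fetch("ExampleBot", "https://example.com/index.html"))
print(page_policy(noindex_page))
print(page_policy(nofollow_page))
```

Consistent with Pedley's description, the NOINDEX page in this sketch is excluded from the index while its links remain eligible for exploration, and the NOFOLLOW page is indexed while its links are ignored.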
Therefore, identifying the most relevant Web sites based on user profiles can help facilitate the search process by eliminating extraneous sites and targeting those that specifically match both the search terms and the profile of the individual conducting the search. According to Colborn (2006), the search engine industry has become increasingly aggressive in its marketing efforts in an attempt to remain competitive with industry leaders such as Google, Yahoo!, MSN Search and Ask.com. A review of the top-ranked search engines compiled by Wall (2006) describes the attributes and weaknesses of these search engine services, which are set forth in Table ____ below.
Comparison of Respective Search Attributes and Weaknesses of Google, Yahoo!, MSN Search and Ask.com
Search Engine Service
1. Has a great deal of experience in the search industry.
2. Is much better than the other engines at determining if a link is a true editorial citation or an artificial link.
3. Looks for natural link growth over time.
4. Heavily biases search results toward informational resources.
5. Trusts old sites far too much.
6. A page on a site or subdomain of a site with significant age or link-related trust can rank much better than it should, even with no external citations.
7. They have aggressive duplicate content filters that filter out many pages with similar content.
8. If a page is obviously focused on a term they may filter the document out for that term. Page variation and link anchor text variation are important. A page with a single reference or a few references of…