Computer Clustering Involves the Use of Multiple Computers


Computer clustering involves the use of multiple computers, typically personal computers (PCs) or UNIX workstations, multiple storage devices, and redundant interconnections, to form what appears to users as a single integrated system (Cluster computing). Clustering has been available since the 1980s when it was used in Digital Equipment Corp's VMS systems. Today, virtually all leading hardware and software companies including Microsoft, Sun Microsystems, Hewlett Packard and IBM offer clustering technology. This paper describes why and how clustering is commonly used for parallel processing, batch processing, load balancing and high availability.

Despite challenges such as achieving transparency, mitigating network latency and resolving the split-brain problem, clustering has proven to be a huge success in bringing scale and availability to computing applications. Hungry for even more efficient resource use, IT departments are now turning their attention to the next evolution of clustering, called grid computing.

Parallel processing is the processing of program instructions by dividing them among multiple processors, with the objective of running a program in less time. It is normally applied to rendering and other computationally intensive applications. Rather than using expensive specialized supercomputers for parallel processing, implementers have begun using large clusters of small, commodity servers. Each server runs its own operating system, takes a number of jobs, processes them, and sends the output to the primary system (Shah, 1999). Clusters can handle one large task broken into small pieces, or many small tasks spread across all the nodes, making the overall system more affordable and more scalable.
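
As a rough illustration of this divide-and-conquer idea, the minimal Python sketch below splits one large job into chunks and hands the pieces to a pool of local worker processes; on a real cluster each worker would be a separate commodity server rather than a process, and the function and chunk counts are invented for the example.

    # Minimal sketch (illustrative only): split one large task into small bits
    # and farm the pieces out to workers, then combine the partial results.
    from multiprocessing import Pool

    def render_chunk(chunk):
        # Stand-in for a compute-heavy step such as rendering one frame range.
        return sum(x * x for x in chunk)

    def split(data, parts):
        size = (len(data) + parts - 1) // parts
        return [data[i:i + size] for i in range(0, len(data), size)]

    if __name__ == "__main__":
        job = list(range(1_000_000))           # the "large task"
        chunks = split(job, parts=8)           # small bits of the task
        with Pool(processes=8) as pool:        # one worker per "node"
            partial_results = pool.map(render_chunk, chunks)
        print(sum(partial_results))            # primary system combines output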

The first PC cluster to be described in the scientific literature was named Beowulf and was developed in 1994 at the NASA Goddard Space Flight Center (Beowulf clusters compared to Base One's batch job servers). Beowulf initially consisted of sixteen PCs, standard Ethernet, and a modified version of Linux, and achieved seventy million floating point operations per second. For only $40,000 in hardware, Beowulf delivered the processing power of a small supercomputer costing about $400,000 at the time. By 1996, researchers had achieved one billion floating point operations per second at a cost of less than $50,000. Later, in 1999, the University of New Mexico clustered 512 Intel Pentium III processors into what was then the 80th-fastest supercomputing system in the world, with a performance of 237 gigaflops.

Just as clustering has reduced the importance of supercomputers for parallel processing, clusters are making the mainframe less relevant for batch applications. A batch job is a program that is assigned to the computer to run without further user interaction. Common batch-oriented applications include data mining, 3-D rendering and engineering simulations. Before clustering, batch applications were typically the domain of mainframes, which carried a high cost of ownership. Now, with a cluster and a scheduler, large batch jobs can easily be crunched on far less expensive hardware.
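
The short Python sketch below is a loose, hypothetical model of that scheduler idea: submitted jobs wait in a queue and are picked up by whichever node is free, with no further user interaction. The job names and thread-based "nodes" are stand-ins and do not reflect any particular batch scheduler's API.

    # Hypothetical batch scheduler sketch: a FIFO queue of jobs drained by
    # whichever cluster "node" (here, a thread) becomes free first.
    from queue import Queue
    from threading import Thread

    def node_worker(name, jobs):
        while True:
            job = jobs.get()
            if job is None:                    # sentinel: no more work
                break
            print(f"{name} running batch job: {job}")

    jobs = Queue()
    for j in ["data-mining-run", "3d-render-scene", "crash-simulation"]:
        jobs.put(j)                            # submitted with no user interaction

    nodes = [Thread(target=node_worker, args=(f"node-{i}", jobs)) for i in range(3)]
    for n in nodes:
        n.start()
    for _ in nodes:
        jobs.put(None)                         # one shutdown sentinel per node
    for n in nodes:
        n.join()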

Load balancing is dividing the amount of work that a computer has to do between two or more computers so that more work gets done in the same amount of time and, in general, all users get served faster. For load balancing purposes, computers are used together in such a way that the traffic and load on the network are distributed over the computers in the cluster (D'Souza, 2001). Load balancing is commonly used in applications where the load on the system cannot be predicted and varies over time.

One example where load balancing is often used is web servers: two or more servers are configured so that when one server becomes overburdened with requests, the excess requests are passed on to the other servers in the cluster, evening out the work. In a business network for Internet applications, a cluster, often called a Web farm, might perform such services as centralized access control, file access, printer sharing, and backup for workstation users (Server farm). The servers may have individual operating systems or a shared operating system and can offer load balancing when there are many server requests. A Web page request is sent to a "manager" server, which determines which of several identical or very similar Web servers should handle it, allowing traffic to be processed more quickly.
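
As a minimal sketch of that "manager" dispatch decision, the fragment below forwards each incoming request to the least-loaded back-end server; a plain round-robin policy would illustrate the point equally well. The Manager class and server names are invented for the example and do not describe any specific Web farm product.

    # Illustrative load-balancing manager: each request goes to whichever
    # identical back-end Web server currently has the fewest active requests.
    class Manager:
        def __init__(self, servers):
            self.active = {s: 0 for s in servers}   # outstanding requests per server

        def dispatch(self, request_id):
            target = min(self.active, key=self.active.get)
            self.active[target] += 1
            return target

        def finished(self, server):
            self.active[server] -= 1                # request completed

    farm = Manager(["web-01", "web-02", "web-03"])  # hypothetical server names
    for i in range(6):
        print(f"request {i} -> {farm.dispatch(i)}")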

High availability refers to a system or component that is continuously operational for a desirably long period of time. To provide fault tolerance for high availability, the cluster is configured so that the entire system responds to an unexpected error or a hardware failure in a graceful manner (D'Souza, 2001). Ideally, the resources of the failed hardware are taken over by the remaining nodes in the system, so there is no system downtime. This is implemented by mirroring critical components such as the storage subsystems and the power supplies. Experts suggest that clustering can help an enterprise achieve 99.999% availability in some cases (Cluster computing).
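
The fragment below sketches the failover idea in its simplest form: when a node is declared failed, its resources are reassigned to the surviving nodes so service continues. Node and resource names are illustrative only; real cluster managers also apply quorum checks and fencing before any takeover.

    # Toy failover sketch: a failed node's resources are taken over by the
    # remaining nodes so there is no system downtime.
    cluster = {
        "node-a": ["web-vip", "db-volume"],   # hypothetical resources
        "node-b": ["print-queue"],
    }

    def fail_over(failed, cluster):
        orphaned = cluster.pop(failed, [])
        survivors = list(cluster)
        for i, resource in enumerate(orphaned):
            # Spread the failed node's resources across the surviving nodes.
            cluster[survivors[i % len(survivors)]].append(resource)
        return cluster

    print(fail_over("node-a", cluster))
    # {'node-b': ['print-queue', 'web-vip', 'db-volume']}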

D'Souza (2001) states that one of the largest challenges in implementing a cluster is bonding the nodes together. For the systems to appear as a single entity to users of the clustered system, the internal architecture and the operation of the system have to be transparent to them. In addition, network latencies must be reduced so that the response time of the network is acceptable. According to D'Souza, one of the major reasons most switch-based networks bog down is the considerable overhead incurred before the actual transfer of data over the network begins.

Transparency and latency reduction are accomplished using a combination of specialized networking methodologies and the controlling software that is used in the cluster (D'Souza, 2001).

When it comes to networking, one of the most common methods is Channel Bonding. This process bonds together two or more Ethernet channels and thereby provides greater bandwidth between the nodes in the cluster. Channel Bonding is less costly than using a single Gigabit network or any other high-speed network. It is fairly easy to use channel bonding for higher performance when the number of machines is low and they can all connect through the same switch. However, when the number of processors grows into the dozens or hundreds, designing the network topology requires considerable calculation. In this situation, the Flat Neighborhood method is more appropriate. A Flat Neighborhood Network allows a node to talk to any other node in the cluster through a single switch rather than a single switch fabric. Latencies are therefore greatly minimized and the maximum bandwidth of each switch can be tapped. In this system, a pair of PCs communicates through a shared network neighborhood.
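
As an illustration of the Flat Neighborhood idea, the sketch below checks that every pair of PCs shares at least one switch, which is what lets any two nodes reach each other through a single switch hop; the wiring table is invented for the example.

    # Sketch of the flat-neighborhood property: every pair of nodes must share
    # at least one switch (a common "network neighborhood").
    wiring = {
        "pc1": {"switch-a", "switch-b"},
        "pc2": {"switch-a", "switch-c"},
        "pc3": {"switch-b", "switch-c"},
    }

    def is_flat_neighborhood(wiring):
        nodes = list(wiring)
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                if not (wiring[a] & wiring[b]):   # no shared switch
                    return False
        return True

    print(is_flat_neighborhood(wiring))   # True for this wiring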

Yet another cluster challenge is the split-brain problem. This occurs when a disruption in communication paths between nodes prevents normal internode communication within the cluster (Leng, Stanton and Zaibak, 2001). When this happens, nodes may become isolated from one another, raising the possibility that each group of nodes will operate as an independent cluster. Under this condition, multiple nodes may access the same volume of shared storage, causing data corruption from simultaneous data access.

According to Leng, Stanton and Zaibak (2001), there are several detection mechanisms and remedies for the split-brain challenge. One is to configure multiple heartbeats for the cluster to eliminate any single point of failure in the communication path. A heartbeat is an "I am alive" message constantly sent out by a server. The heartbeats should never all be plugged into the same network interface card or share other network devices such as a hub or switch, and the separate devices should not share a common power source. Public networks must be designed robustly with highly fault-tolerant hardware. Finally, Internet Protocol (IP) over Fibre Channel can provide a heartbeat channel that allows nodes to exchange health information even when all the network connections fail entirely.
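
The following sketch illustrates that detection rule under one simple assumption: a peer is treated as failed only when every independent heartbeat channel has been silent past its timeout, since a single quiet link may just be a broken cable or switch. The channel names and timeout value are made up for the example.

    # Split-brain detection sketch with redundant heartbeats: declare the peer
    # down only if ALL channels (separate NICs, switches, IP over Fibre Channel)
    # have missed their "I am alive" messages.
    import time

    HEARTBEAT_TIMEOUT = 5.0   # seconds; illustrative value

    last_seen = {              # last heartbeat arrival time per channel
        "ethernet-1": time.time(),
        "ethernet-2": time.time() - 12.0,   # this channel has gone quiet
        "ip-over-fc": time.time(),
    }

    def peer_is_down(last_seen, now=None):
        if now is None:
            now = time.time()
        return all(now - t > HEARTBEAT_TIMEOUT for t in last_seen.values())

    print(peer_is_down(last_seen))   # False: two channels are still healthy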

Once a server recognizes that there has been a failure, it needs to perform a cluster reconfiguration that includes taking input/output (I/O) control of the disksets (Sun product documentation). This is often referred to as resource-based fencing, and it involves a hardware mechanism that immediately disables or disallows access to shared resources. Resource-based fencing is necessary because a server that has failed may not be completely down and may still attempt to do I/O, since I/O is typically processed by interrupt routines. Thus, the server that takes over must be able to reserve the diskset to prevent the sick server from continuing to read or write the disks. The disks or disk arrays block access to all nodes other than those to which they have issued a reservation. This reservation information is stored locally on the disk and remains there until the reservation is released by the corresponding server, expires, or is broken by a reservation reset request.
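
The toy example below sketches the reservation logic described here: the takeover node places a reservation on the shared diskset, and the array then refuses I/O from any node that does not hold it. The class and node names are invented; real arrays enforce this in firmware (for instance with SCSI reservations) rather than in application code.

    # Resource-based fencing sketch: only the node holding the reservation may
    # read or write the shared diskset; the sick server's I/O is refused.
    class SharedDiskset:
        def __init__(self):
            self.reserved_by = None            # reservation stored on the disk

        def reserve(self, node):
            self.reserved_by = node

        def write(self, node, block, data):
            if self.reserved_by not in (None, node):
                raise PermissionError(f"{node} is fenced from this diskset")
            print(f"{node} wrote {data!r} to block {block}")

    disks = SharedDiskset()
    disks.reserve("node-b")                    # takeover node reserves the disks
    disks.write("node-b", 7, "payroll batch")  # allowed
    try:
        disks.write("node-a", 7, "stale data") # failed node is blocked
    except PermissionError as err:
        print(err)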

In addition to resource-based fencing, there is also an approach called Shoot the Other Machine in the Head (STOMITH) fencing to cope with the split-brain issue (Burleson). In STOMITH systems, the errant cluster node is simply reset and forced to…[continue]
