High Performance Computing (HPC) is a term that has emerged to replace the custom supercomputer of earlier years. In the past, the word supercomputer evoked extremely complicated machines solving problems that most people could not really understand. Because supercomputers could work with very large figures, of seven digits and more, they were commonly used by engineers and scientists who needed to process such figures rapidly. This model has since given way to commodity-based supercomputing, now commonly known as High Performance Computing. The main focus of HPC is the ability to process huge amounts of data in short periods of time. HPC draws on various technologies, each with its own software and hardware requirements for the administrative and operational tasks needed to process data securely.
The Concept of High Performance Computing
Eadline (2009) states that high performance computing is a term that emerged as a substitute for the custom supercomputer of earlier years, which was commonly used by engineers and scientists who needed to process huge figures as fast as possible (p. 3). High performance computing is commodity-based supercomputing, now open to nearly everyone given the all-time low cost of entry. Many organizations currently regard it as a fundamental part of their business and prefer it because it offers a competitive advantage over rivals: it acts as a secret weapon that enables businesses to become profitable, competitive, and green. It enables users to quickly simulate and then manipulate a process or product to see the effect of several decisions before they are made.
Technology Involved in HPC
As previously mentioned, high performance computing is associated with various technologies given its focus on processing huge amounts of data within short periods of time. In addition, HPC has several software and hardware requirements for the administrative and operational functions that are crucial for secure data processing. Generally, HPC includes the computers, algorithms, networks, and platforms that make such processing possible. The technologies involved range from small groups of personal computers to the fastest supercomputers. The hardware and software architecture of these computers largely determines how far a system can be sped up beyond the performance of a single Central Processing Unit (CPU) core (Dongarra & van der Steen, 2012, p. 2). One important factor in the hardware structure is the ability of compilers to generate efficient code for the specific hardware structure or platform.
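The limit on speeding a system up beyond a single CPU core can be illustrated with Amdahl's law. The source does not name this law; it is brought in here only as a standard way to reason about the point. The sketch below assumes a workload in which a fixed fraction can be spread evenly across cores while the rest stays serial. The function name and the 95% figure are illustrative assumptions, not values from the cited works.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the work scales across `cores`.

    Assumption (not from the source): the parallel part divides perfectly
    and there is no communication overhead.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time can be parallelized.
    for cores in (1, 8, 64, 512):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
```

Under these assumptions the speedup can never exceed 1 divided by the serial fraction, however many cores are added, which is one reason the architecture of the software matters as much as the hardware.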
Software architecture is such an important aspect of high performance computing that it has developed into a vital discipline for software engineers (Zhang et al., 2005, p. 409). As software systems grow in size and complexity, software architecture becomes more important than the selection of data structures and algorithms. However, programming applications for HPC is extremely complex and difficult given the lack of effective software tools. Because HPC involves parallel computers, the software structures of these computers differ widely. Software architectures in high performance computing range from distributed and parallel structures to networks of workstations (Appelbe & Bergmark, 1996, p. 2).
Since the beginning of the 21st century, operating systems in high performance computing have changed considerably because of changes in the architecture of supercomputers. As a result, HPC now adopts generic software such as Linux rather than operating systems custom-tailored to each computer. This shift away from in-house operating systems has occurred because high performance computing separates computation from other services by using several kinds of nodes. The supercomputers used in HPC typically run different operating systems on different nodes. For instance, a small, lightweight kernel runs on compute nodes, whereas a larger system runs on input/output nodes. Linux is the most preferred operating system for HPC, though each manufacturer has its own derivative of the operating system because there is no industry standard (Padua, 2011, p. 429).
Queue management in this type of computing differs from that of a conventional multi-user computer system. In HPC, it involves controlling the distribution of communication and computational resources, and dealing with the hardware failures that are to be expected when numerous processors are in use. To exploit speed, software applications in HPC rely on special programming techniques, ranging from distributed processing to open-source-based solutions. As a result, these applications also require special techniques for testing and debugging. With regard to clusters, high performance computing systems range from loosely connected to tightly connected clusters, depending on the type of machine being used.
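The queue-management role described above, distributing computational resources and coping with expected hardware failures, can be sketched as a toy scheduler. This is a minimal illustration under stated assumptions, not how any particular batch system works: the `Job` and `SimpleScheduler` names are hypothetical, jobs are served strictly first-in first-out, and a job hit by a node failure is simply put back at the front of the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes_needed: int


class SimpleScheduler:
    """Toy FIFO scheduler: hands out free nodes and requeues jobs hit by a node failure."""

    def __init__(self, node_ids):
        self.free_nodes = set(node_ids)
        self.pending = deque()
        self.running = {}  # job name -> (Job, set of allocated node ids)

    def submit(self, job):
        self.pending.append(job)

    def dispatch(self):
        """Start queued jobs in order while the job at the head of the queue fits."""
        started = []
        while self.pending and self.pending[0].nodes_needed <= len(self.free_nodes):
            job = self.pending.popleft()
            allocated = {self.free_nodes.pop() for _ in range(job.nodes_needed)}
            self.running[job.name] = (job, allocated)
            started.append(job.name)
        return started

    def finish(self, job_name):
        """Release the nodes of a completed job."""
        _job, allocated = self.running.pop(job_name)
        self.free_nodes |= allocated

    def node_failed(self, node_id):
        """Retire a failed node; requeue any job that was running on it."""
        self.free_nodes.discard(node_id)
        for name, (job, allocated) in list(self.running.items()):
            if node_id in allocated:
                del self.running[name]
                self.free_nodes |= allocated - {node_id}  # surviving nodes become free
                self.pending.appendleft(job)  # retry before newer jobs
```

Strict FIFO order keeps the sketch short but lets a large job at the head of the queue block smaller ones behind it; production schedulers add priorities and backfilling to avoid this.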
Given the trend of marketing enormous parallel computers for high performance computing, it is increasingly important to develop and use a series of new and improved software tools. The future of software in this kind of computing will therefore involve enhanced tools and applications. These will have to cope with changes in the number of processors, which in turn produce considerable changes in performance. Their development will be shaped by a new parallel programming model that is gradually emerging to address the software issues in current high performance computing.
Apart from software requirements, there are hardware requirements for the administrative and operational tasks necessary to process data securely in high performance computing. The hardware architecture differs considerably from that of earlier HPC systems and incorporates innovative designs that aim at superior peak performance. The hardware architecture in HPC has three major subsystems: communication, storage, and system. With regard to storage, modern HPC requires high bandwidth and flexible data handling in order to support bulk data movement between the storage system and clients. This involves the use of a storage area network (SAN) switch and 10 Gigabit Ethernet (10 GbE) (Ajith, 2012).
Hardware requirements in these systems include compute nodes or servers, which work together with workstations and use local disk for caching data files. The compute nodes, which are part of HPC clusters, include head nodes, processing nodes, and authentication nodes. After the necessary plans have been identified and a preferred vendor selected, the head nodes for secure data processing should be at least 2x Dell PowerEdge R610. The head nodes should be set up with a routing queue that passes jobs to at least one execution queue. User authentication should be synchronized across all nodes in order to ensure confidentiality and data security.
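The relationship between a routing queue and its execution queues can be sketched as follows. This is an illustrative model only: the class names, the `max_nodes` limit, and the queue names are assumptions made for the example and do not reproduce the configuration of any specific batch system.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionQueue:
    """A queue from which jobs are actually run (hypothetical model)."""
    name: str
    max_nodes: int  # largest node request this queue accepts
    jobs: list = field(default_factory=list)


@dataclass
class RoutingQueue:
    """Holds no running jobs itself; it only forwards each job to a destination."""
    destinations: list  # execution queues, tried in order

    def route(self, job_name, nodes_requested):
        for queue in self.destinations:
            if nodes_requested <= queue.max_nodes:
                queue.jobs.append(job_name)
                return queue.name
        raise ValueError(f"no execution queue accepts {nodes_requested} nodes")


if __name__ == "__main__":
    small = ExecutionQueue("small", max_nodes=4)
    large = ExecutionQueue("large", max_nodes=64)
    router = RoutingQueue([small, large])
    print(router.route("job-1", 2))
    print(router.route("job-2", 16))
```

Users submit to the single routing queue, and the head node decides where each job lands, so limits can be changed on the execution queues without users having to change how they submit.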
Processing nodes in high performance computing should include contemporary multi-core SMP processors that can scale up to more than 512 GB. The SMP processor nodes need to be adequately provisioned because applications in parallel computing on these systems are highly distributed and scalable. To ensure secure data processing, blade servers or rack-mounted servers should be used as processing nodes. These servers should have disk storage and local memory to enable high-density architectures that promote fast communication between the nodes (Bookman, 2003). Robotic (RTL) or automated (ATL) tape libraries are necessary because they provide an enormous, long-term second-tier storage alternative for data archival and backups. RTLs and ATLs are vital because they support interim storage, long-term vaulting of final migrated segments, and routine backups.
An upcoming hardware trend that will have considerable impact on high performance computing is the need for data management software or an external software framework to operate tape libraries, because automated tape libraries are becoming smarter. Moreover, hardware is continually getting leaner, meaner, and greener as a result of attempts to increase energy efficiency. This is likely to result in new hardware standards and certifications for high performance computing.
Processing types in high performance computing include parallel processing and serial processing, both of which help process data securely with emphasis on confidentiality, availability, and integrity. Parallel processing underpins parallel computing, which is vital in accelerating high performance computing. Its requirements include creating a parallel computing architecture and slicing the process into various independent, executable segments that can run in parallel. In addition, there is a need for intensive code structure and batch sequential data processing (Mishol, 2012). The benefits of parallel processing in HPC include synchronization of processes (which is necessary for accuracy), enhanced response time, and acceleration of high performance computing.
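The requirement to slice a process into independent segments that run in parallel can be shown with a short sketch using Python's standard `concurrent.futures` module. The task (a sum of squares), the function names, and the worker count are assumptions chosen for illustration; real HPC codes typically use message passing across many nodes rather than a process pool on one machine.

```python
import concurrent.futures


def split_into_chunks(values, n_chunks):
    """Slice `values` into at most `n_chunks` contiguous, independent segments."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sum_of_squares(chunk):
    """Work done on one segment; it shares no state with the other segments."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4,
                            executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Run each segment in its own worker, then combine the partial results."""
    chunks = split_into_chunks(values, workers)
    with executor_cls(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)  # combining step runs serially after all segments finish


if __name__ == "__main__":
    data = list(range(100_000))
    print(parallel_sum_of_squares(data, workers=4))
```

The final `sum(partials)` is the synchronization point: the result is only correct once every segment has finished, which is why coordinating the segments matters for accuracy as well as speed.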
Unlike parallel processing, in which multiple instructions are performed simultaneously, serial processing performs a single set of instructions at a time; it can also be used in HPC, though minimally. This is primarily because high performance computing has shifted significantly from serial processing to parallel processing (Clark, 2013). However, some applications in these systems are intrinsically serial and not suitable for parallel processing. Some of the requirements of serial processing include batch sequential data…