Expert Systems and Neural Networks
The Development and Limitations of Expert Systems and Neural Networks
The human experience demands a constant series of decisions to survive in a hostile environment. The question of "fight or flight" and similar decisions has been translated into computer-based models by using the now-famous "if-then" programming command that has evolved into the promising field of artificial intelligence. In fact, in their groundbreaking work, Newell and Simon (1972) showed that much human problem solving could be expressed in terms of such "if-then" types of production rules. This discovery helped to launch the field of intelligent computer systems (Coovert & Doorsey 2003). Since that time, a number of expert and other intelligent systems have been used to model, capture, and support human decision making in an increasingly diverse range of disciplines; however, traditional rule-based systems are limited by several fundamental constraints, including the fact that human experts are needed to articulate propositional rules, that the symbolic processing normally used prevents direct application of mathematics, and that traditional rule-based systems require a large number of rules that are not receptive to unique data inputs. This paper provides an examination of the concepts and technologies needed to develop, implement and integrate expert systems and neural networks. The limitations of expert systems and their alternatives are discussed, followed by an analysis of the relevant and scholarly literature covering neural networks. A summary of the research is provided in the conclusion.
Review and Discussion
Background and Overview. Artificial Intelligence (AI) as a formal discipline is certainly not new, having been around for more than 50 years (Gozzi 1997). Nevertheless, AI remains a term that frequently "conjures images of HAL's refusal to open the pod bay doors or Deep Blue winning the world chess championship. But artificial intelligence (or AI) is not a phenomenon restricted to science fiction movies and chess tournaments; it has rapidly, if silently, become a fixture of daily life" (Gibson 2003:83). In fact, Kapoor (2003) emphasizes that there can be no dispute that machines with greater-than-human intelligence will be built in the next 50 years, and that the creation of such AI-empowered machines will have far-reaching implications for all aspects of society, science, technology, and the environment.
According to Kapoor, "The likelihood of creating AI within the next 50 years, and when it happens, its deep impacts on science and society, are both assertions that will be accepted by most futurists" (788). Bostrom (2003) covers the phenomenal increases in the number-crunching capacity of supercomputers that have followed Moore's law, including IBM's biggest and best, Blue Gene, a machine capable of 1 quadrillion operations per second that is scheduled to become operational by the end of 2005. This author notes that he is in agreement with Kapoor concerning "the tragedy of the vast unfair inequalities that exist in today's world, and also in regard to the fact that there would be considerable risks involved in creating machine intelligence"; however, this author suggests that AI assistive technologies might also serve to reduce certain other kinds of risk.
For instance, Bostrom says:
An assessment of whether machine intelligence would produce a net increase or a net decrease in overall risk is beyond the scope of my original paper or this reply. (Even if it were to be found to increase overall risk, which is very far from obvious, we would still have to weigh that fact against its potential benefits. And if we determined that the risks outweighed the benefits, we would then have to question whether attempting to slow the development of machine intelligence would actually decrease its risks, a hypothesis that is also very far from obvious.) (902).
While the goals of individual practitioners using AI applications have varied and changed over time, a reasonable characterization of the general field of AI is that it is intended to make computers do things that when done by people are described as having indicated intelligence (Steels 1995); this author characterized the primary goals of AI as both the construction of useful intelligent systems and the understanding of human intelligence. According to Gozzi (1997), "In the 1950s, a group of scientists decided to try to provide the computer with intelligence. Their goal seemed attainable due to a common metaphorical identification of the computer with a brain. From their efforts emerged the field of artificial intelligence, or AI" (219).
This author suggests that the basic, or root, metaphors of AI resembled a classical syllogism:
Major Premise: The computer is a brain.
Minor Premise: Thinking is computing.
Conclusion: If we provide the computer with sophisticated programs, it will develop a mind similar to human minds (220).
In recent years, this has, in fact, been the focus of AI programs. According to Komninou (2003), "The more we progress, the more possessed we become with technology, the more obsessed we become with the very idea of 'intelligence', the more we take the images of our desires to be the real thing" (793). According to Boodoo, Bouchard, Boykin et al. (1996):
Individuals differ from one another in their ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought. Although these individual differences can be substantial, they are never entirely consistent: A given person's intellectual performance will vary on different occasions, in different domains, as judged by different criteria. Concepts of "intelligence" are attempts to clarify and organize this complex set of phenomena (77).
As a result, a variety of applications of AI have emerged as an increasingly promising technology that can help users from a variety of fields to structure, guide, and improve information processing for decision-making purposes. For example, today, AI programs provide consultative advice to physicians concerning infectious diseases and their etiologies; such programs help physicists investigate unknown molecules and make predictions about their molecular structures with spectroscopic analysis; they also assist mathematicians in solving complex problems, process credit requests for American Express, hunt submarines for the U.S. Navy, help develop timely advertisements for retailers and evaluate a client's ability to repay a loan (Jones, Martin, Mcwilliams et al. 1991).
According to Dillon (1993), artificial intelligence is "the branch of computer science devoted to the study of how computers can be used to simulate or duplicate functions of the human brain... [making] it appear as though a computer is thinking, reasoning, making decisions, storing or retrieving knowledge, solving problems, and learning" (74). There are, though, three fundamental differences between AI and conventional programming:
First, AI does not rely on algorithms, or step-by-step procedures, to solve problems; rather, it employs symbolic representation (letters, words, or numbers, in the form of statements and procedures) to represent objects, processes, and their relationships.
The second major difference between AI and conventional programming is the manner in which uncertainty is handled. Dillon uses the sentence, "Erin is taller than Esther" as an example of the uncertainty involved in a definition of "tall." According to the author, "Are you tall at five feet five inches? What about short? Are you short at four feet eleven inches or at five feet? Artificial Intelligence is able to deal with such imprecision through the use of confidence factors and probability" (emphasis added) (Dillon 75).
The final difference between AI and conventional programming concerns the realm of decision-making. According to Dillon, "Conventional software uses precise data and step-by-step instructions for solving a problem, thereby limiting the computer to predetermined solutions. Whereas in AI, the computer is given information (sometimes imprecise) and the ability to make inferences. The computer and the software determine the solution" (76).
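Dillon's "tall" example can be illustrated with a short sketch. The function below is not drawn from Dillon; it assumes a simple linear ramp between two hypothetical cut-offs (60 and 72 inches) purely to show how a confidence factor can replace a yes-or-no answer.

```python
def tall_confidence(height_in, not_tall_at=60.0, tall_at=72.0):
    """Confidence in [0, 1] that a person of this height counts as 'tall'.

    Both cut-offs are hypothetical; a real system would obtain them
    from domain experts or from data.
    """
    if height_in <= not_tall_at:
        return 0.0
    if height_in >= tall_at:
        return 1.0
    # Linear ramp between the two cut-offs.
    return (height_in - not_tall_at) / (tall_at - not_tall_at)


def short_confidence(height_in):
    """Treat 'short' as the complement of 'tall' (a simplifying assumption)."""
    return 1.0 - tall_confidence(height_in)


# Heights taken from Dillon's questions, converted to inches.
for label, inches in [("4 ft 11 in", 59), ("5 ft", 60), ("5 ft 5 in", 65)]:
    print(label, round(tall_confidence(inches), 2), round(short_confidence(inches), 2))
```

Under these assumed cut-offs, a person of five feet five inches is neither fully "tall" nor fully "short"; the program carries a graded confidence forward instead of forcing a binary choice.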
A good example, because it is likely known to many people today, of how these imprecise or "fuzzy" conditions play out in an actual setting can be found in the popular computer game, "The Sims" and its many permutations. The characters in these games are governed by a set of "fuzzy" metrics to which they respond (or not, depending on the user preferences). For example, when they become sufficiently hungry, Sims characters will seek out food; when they become sufficiently tired, they will sleep.
In fact, the metrics by which modern people measure intelligence are closer to human experience than might be commonly thought; according to Stevens (1996), "We are already used to dealing with digital, intelligent life in the form of digital representations of other humans" (414). This is echoed in her essay, "Artificial Intelligence and the Real World," where Jenkins (2003) suggests that the scope and significance of artificial intelligence (AI) make it an important concern today and in the future, perhaps more so than other emerging technologies, particularly "because AI is concerned with replicating and enhancing intelligence, and this concept, related as it is to consciousness, is at the heart of human identity" (779).
This connection with "human identity" is at the core of AI assisting technologies. In the past, computer scientists working on AI have largely ignored the social roots of human intelligence; however, in more recent years, there has been an increased interest among these researchers concerning the social aspects of intelligence. According to Bainbridge, Brent, Carley et al. (1994), "Areas such as distributed artificial intelligence, coordination theory, and collaboration technology (all with strong roots in engineering or computer science) have begun to look at social issues" (407). Many early AI programs provided the opportunity for humans and computers to interact through natural-language conversations; unfortunately, the programming challenge has always been to simulate the behavior of a single human actor (Bainbridge et al. 1994). While early AI researchers focused on a variety of schools of thought within psychology, they tended to overlook the sociological considerations that were required to make such assistive technologies more robust.
A number of sociologists, though, maintain that true AI cannot be achieved without the active participation of sociologists; Allen Newell (1990) makes this point in his seminal book, Unified Theories of Cognition (in Bainbridge et al.). In some sense, all types of AI applications are able to "think"; in other words, the programs are able to "solve problems in a way that would be considered intelligent if done by a human." As a result, AI applications are being increasingly used in a variety of computer-based applications today, such as speech recognition, robotics (machine vision systems), natural language processing, expert systems, and neural networks; the latter two extensions of AI technology are discussed further below.
Expert Systems. Expert systems are one of the most popular applications of artificial intelligence today; these are computerized decision-making applications that structure expertise in a specific area and emulate human decision-making (Berry, Berry & Foster 1998). According to Grabinger, Jonessen and Wilson (1990), "Expert systems are practical tools that can serve as intelligent job aids to facilitate on-the-job decision making in tasks such as judging student projects, diagnosing learning problems, identifying and classifying performance problems, or helping consumers to decide among a large number of alternatives" (1). Expert systems are intended to improve human performance; however, like any tool, the effective use of an expert system requires conceptual understanding, practice, and specific development skills and processes (Grabinger et al. 1990). Furthermore, as with most instructional development tools, the most crucial design phase occurs during the early stages of development, when the problem is analyzed and the resulting knowledge is structured into a form that is appropriate for entry into an expert system building tool (Grabinger et al. 1990).
It is easy to become overwhelmed with terminology and the vagaries of scientific whim, though. To keep it simple, then, expert systems are a type of computer program that employs artificial intelligence to solve problems within a specialized domain that has traditionally required the use of human expertise alone (Zwass 2004). The first such expert system was developed by Edward Feigenbaum and Joshua Lederberg of Stanford University in California in 1965. This early expert system, which later became known as Dendral, was designed to analyze chemical compounds; today, expert systems have a wide range of commercial applications in fields as diverse as medical diagnosis, petroleum engineering, and financial investing (Zwass 2004).
While expert systems may be as diverse as the needs of their users, all expert systems depend on two basic components to accomplish their analyses: 1) a knowledge base and 2) an inference engine. According to Zwass, "A knowledge base is an organized collection of facts about the system's domain. An inference engine interprets and evaluates the facts in the knowledge base in order to provide an answer" (5). Such rule-based expert systems tend to be deductive, compared to traditional decision tree algorithms that use inductive learning (Gahegan, Harrower, Rhyne, & Wachowicz 2001). Some typical tasks expert systems are being used for today include classification, diagnosis, monitoring, design, scheduling, and planning for specialized tasks. Because expert systems need to be "expert" in some specific area, the knowledge bases for such systems are drawn from laws, regulations, and, increasingly, the unique components of human expertise itself (Berry et al. 1998).
The knowledge to be incorporated into a knowledge base can be acquired from human experts through interviews and observations; it is then represented in the form of "if-then" rules (production rules): "If some condition is true, then the following inference can be made (or some action taken)" (Zwass 2004). Knowledge bases of major expert systems will likely include thousands of such rules. A probability factor is often attached to the conclusion of each production rule, because the conclusion is not a certainty. For example, a system for the diagnosis of eye diseases might indicate, based on information supplied to it, a 90% probability that a person has glaucoma, and it might also list conclusions with lower probabilities. An expert system may be capable of displaying the sequence of rules through which it arrived at its conclusion; tracing this flow helps the user to appraise the credibility of its recommendation and is useful as a learning tool for students (Zwass 2004).
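The interplay of knowledge base, inference engine, probability factors, and rule tracing can be sketched in a few lines of Python. Everything in the sketch is an illustrative assumption: the rule names, the eye-related facts, and the certainty values are hypothetical and carry no clinical meaning, and multiplying a rule's certainty by its weakest supporting fact is only one of several possible ways to combine confidences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset   # facts that must all be established
    conclusion: str
    certainty: float        # confidence attached to the conclusion (0 to 1)


def infer(facts, rules):
    """Forward-chain over the rules until nothing new can be concluded.

    Returns the derived conclusions with their confidences, plus the
    ordered list of rule names that fired (the explanation trace).
    """
    known = {fact: 1.0 for fact in facts}
    trace = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion in known:
                continue
            if all(c in known for c in rule.conditions):
                # Assumed combination scheme: rule certainty times the
                # weakest supporting fact.
                support = min((known[c] for c in rule.conditions), default=1.0)
                known[rule.conclusion] = rule.certainty * support
                trace.append(rule.name)
                changed = True
    conclusions = {k: v for k, v in known.items() if k not in facts}
    return conclusions, trace


# Hypothetical knowledge base of production rules.
RULES = [
    Rule("R1", frozenset({"elevated_eye_pressure", "optic_nerve_damage"}), "glaucoma", 0.9),
    Rule("R2", frozenset({"elevated_eye_pressure"}), "ocular_hypertension", 0.4),
    Rule("R3", frozenset({"glaucoma"}), "refer_to_specialist", 1.0),
]

conclusions, trace = infer({"elevated_eye_pressure", "optic_nerve_damage"}, RULES)
for name, confidence in sorted(conclusions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {confidence:.0%}")
print("rules fired:", " -> ".join(trace))
```

The list of fired rules plays the role of the rule sequence that Zwass describes: a user can inspect it to judge how the system reached its recommendation.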
A number of public agencies have used expert systems for well over a decade now, and these systems are becoming increasingly common for a wide variety of other applications; for instance, many social service departments across the country are using expert systems to determine eligibility for food stamps or refugee assistance; law enforcement officials analyze solved and unsolved burglary cases through a computerized network to develop a profile of possible perpetrators of new crimes; and water testing laboratories can apply for state licensure through expert systems (Berry et al. 1998). "Although expert systems began as stand-alone computer programs," Berry et al. report, "large public agencies are integrating their expert systems into their information data bases, and thus users may not even recognize their expert system as such" (1998:294).
The primary benefits of expert systems relate to the size of the organization or the discipline involved. Previous studies have shown how small business owners use computers to delegate many routine decisions to their employees, thereby allowing owners to focus on managerial activities; by contrast, in larger companies, a number of tasks such as training, enforcing procedures, or monitoring and controlling business activities are handled by adding staff or by hiring extra staff with specific abilities.
According to Bradley and Hebert (1993), "The impact of expert systems in small business, therefore, may be greater than in large business since small firms may not have the luxury of alternative solutions" (23). While expert systems are an increasingly common feature in many businesses, another extension of artificial intelligence, the neural network, may represent an even more important innovation for healthcare and statistical analysis applications; it is discussed further below.
Neural Networks. While an expert system is a software program that resembles a database, neural networks are designed to learn less from predigested data and more through experience (Zarowin 1995). As an emerging AI-based technology, compared with traditional statistical approaches, neural network analyses have been shown to be "of great use in diverse real-life applications"; likewise, researchers have noted that neural network analysis improved Chase Manhattan's credit card fraud detection rates over the regression model they had been using (Calori, Lubatkin, Tung et al. 2000:223). In their study, "Artificial Neural Networks as a Method of Spatial Interpolation for Digital Elevation Models," Civco, Cromley and Merwin (2002) report that "Artificial neural networks (ANNs) are highly connected computational models inspired by the neurological structure of the human brain. These networks, which are considered a subset of artificial intelligence, are designed to solve complex computational problems by means of 'self-learning.' Learning does not occur in a manner similar to the learning process of the human brain, but rather through a process of training and recall" (100). A study by Cheng, McClain and Kelly (1997) found that such ANNs are quickly being recognized as a powerful tool for investment forecasting and are attracting much attention from potential users based on the impressive results to date. For instance, Cheng et al. report:
An ANN-based system for investing in the U.S. Treasury Bond market was recently created and used to direct investments in that market for the years 1989-93. Over the five-year period, the ANN system generated a return on investment of 17% versus 14% for the prestigious Lehman Brothers Treasury Bond index over that same period. In another example, an ANN-based system was used to direct the investment of $10,000 in the S&P 500 index over a 25-month period. Results were spectacular, as the ANN was able to increase the fund to $76,034 over the 25 months (1997:5).
According to Calori et al., in 1987, Science Applications International Corporation developed a neural network model, one that remains in use in all major airports around the world, that was able to outperform the linear discriminant analysis used previously in predicting the likelihood of a bomb in passenger luggage. According to Feldman (2001), "New systems using artificial intelligence (AI), already in place at a number of money center banks, create significant new marketing opportunities. These systems help reduce the cost of compliance while presenting an opportunity for banks to position themselves as adopting the 'high road'" (56). Other researchers have reported that neural network analysis has been able to predict the ratings of bonds more effectively than multiple regression. Beyond this technology's powerful predictive capability, Calori et al. report a number of other uses for neural networks as well, particularly regarding pattern recognition (such as identification of cancerous cells) as well as speech recognition and generation.
A neural network is modeled on the human brain, in which extensively interconnected units (neurons) make up a vast network capable of complex pattern recognition. As such, it is comprised of a number of computational elements that operate in parallel and are arranged in patterns reminiscent of biological neural nets. The purpose of such emulation is to provide artificial systems that are capable of sophisticated, perhaps intelligent, computation and pattern recognition similar to those that the human brain routinely performs (Calori et al. 2000). According to Calori et al. (2000), a typical network is comprised of several layers of interconnected neurons that include input neurons (these receive stimuli in the form of inputs, usually the independent variables), output neurons (the dependent variable), as well as a layer of "hidden neurons" that can only interact with input and output neurons but can never actually be observed.
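The layered arrangement described above can be made concrete with a minimal forward pass. The sketch assumes a sigmoid activation function and uses arbitrary, hypothetical weights for a network with three input neurons, two hidden neurons, and one output neuron; none of these choices comes from the cited sources.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute the activations of one layer of neurons.

    Each row of `weights` holds the connection strengths from every
    input to a single neuron in this layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Pass the inputs through the hidden layer and then the output layer."""
    hidden = layer(inputs, hidden_weights, hidden_biases)  # never returned to the caller
    return layer(hidden, output_weights, output_biases)


# Hypothetical connection strengths: 3 inputs -> 2 hidden -> 1 output.
HIDDEN_WEIGHTS = [[0.5, -0.4, 0.1], [0.3, 0.8, -0.6]]
HIDDEN_BIASES = [0.0, -0.1]
OUTPUT_WEIGHTS = [[1.2, -0.7]]
OUTPUT_BIASES = [0.05]

prediction = forward([1.0, 0.5, -1.5], HIDDEN_WEIGHTS, HIDDEN_BIASES,
                     OUTPUT_WEIGHTS, OUTPUT_BIASES)
print(prediction)
```

The caller supplies the independent variables and receives only the output neuron's value; the hidden activations stay inside `forward`, which mirrors the observation that hidden neurons are never directly observed.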
According to "Computers as Assistants: A New Generation of Support Systems" by Hoschka (1996), the first substantive result toward achieving a seamless integration of inputs and outputs in this framework was the Associative Memory Model (ASM), described as "a flexible experimental system that tackles different problems at different levels" (55). The overall organization of the ASM approach is shown in Figure 1 below. Hoschka writes: "The basic level consists of a neural network package that realizes the minimal functionalities needed to build general neural networks. The focus lies here on the requirement for minimality and on taking care not to be completely unrealistic from a biological point-of-view" (55). The next level of the ASM provides a variety of forms of associative memory structures such as object-attribute-value triples or chains of predicates. Functionally, the user has the potential, for example, of inputting triples and retrieving them by using attributes as context-selecting devices (Hoschka 1996).
The important distinction among these structures is that they are all based on the same primitive structures (nodes) and operated on by the same basic "inference" method; in other words, a relaxing, value-passing, spreading-activation mechanism. The results yielded from the system are modeled as a network of mutually reinforcing nodes. According to Hoschka, "At the third level, these structures can be used for retrieval and associative completion. Previously stored (memorized) examples of objects and situations are retrieved on the basis of partial description" (56). The means for computing best matches (such as the intersection of attribute values) are also realized using the same spreading-activation mechanism.
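Retrieval from a partial description can be approximated with a toy example. The sketch below is not Hoschka's ASM; it is a deliberately simplified, single-pass stand-in for spreading activation in which each attribute-value cue passes one unit of activation to every object linked to it, and the most activated object is taken as the best match. The triples and function names are hypothetical.

```python
from collections import defaultdict


def build_memory(triples):
    """Link each (attribute, value) node to the objects that possess it."""
    links = defaultdict(set)
    for obj, attribute, value in triples:
        links[(attribute, value)].add(obj)
    return links


def retrieve(links, partial_description):
    """Spread one unit of activation from each cue to its linked objects.

    Returns the most activated object (ties broken alphabetically) and
    the activation level of every object that was reached.
    """
    activation = defaultdict(float)
    for cue in partial_description.items():
        for obj in links.get(cue, ()):
            activation[obj] += 1.0
    if not activation:
        return None, {}
    best = max(sorted(activation), key=activation.get)
    return best, dict(activation)


# Hypothetical object-attribute-value triples.
TRIPLES = [
    ("canary", "color", "yellow"),
    ("canary", "can", "fly"),
    ("penguin", "color", "black"),
    ("penguin", "can", "swim"),
    ("banana", "color", "yellow"),
]

memory = build_memory(TRIPLES)
best, levels = retrieve(memory, {"color": "yellow", "can": "fly"})
print(best, levels)
```

Here the intersection of attribute values falls out of the activation totals: the canary receives activation from both cues, the banana from only one, so the canary is retrieved as the best match.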
Figure 1. Structure of ASM [Source: Hoschka 1996].
A number of models of neural networks have subsequently been developed based on such different input-output relationships and different learning models (Ye 1997). For instance, Ye reports that a heteroassociative network provides output that is different from the input, but an autoassociative network yields output that is equal to the input. According to Ye: "There are two general categories of learning: supervised and unsupervised (self-organizing). In supervised learning, desired outputs to given inputs are shown to neural networks. Unsupervised learning occurs without the indication of desired outputs to given inputs" (7). The importance of this relationship was noted by Collins and Clark (1993), who reported: "The task of the network is to learn an optimal pattern of interconnections that best captures all of the input/output relationships" (507). The capability of neural networks in learning from examples is useful in recognizing and generalizing user patterns from instances of repeated user actions, and in adjusting the acquired knowledge to dynamic environments (Ye 1997).
The capabilities of neural networks in implicit knowledge representation and parallel processing also support the processing efficiency required by online dynamic user modeling. Furthermore, the robustness of neural networks to noise represents a valuable addition to data analysis in user modeling; consequently, neural networks provide a better alternative for supporting the intelligence required in user modeling and intelligent interfaces (Ye 1997).
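Both the autoassociative behavior and the robustness to noise noted by Ye can be demonstrated with a small network. The sketch uses a Hopfield-style network trained with the Hebbian outer-product rule, a technique chosen here for illustration and not named in the sources; the two stored patterns are hypothetical.

```python
def train_hebbian(patterns):
    """Build connection weights from +1/-1 patterns (Hebbian outer-product rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no neuron connects to itself
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, cue, max_steps=5):
    """Update all neurons repeatedly until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = [
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights
        ]
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical stored patterns (chosen to be orthogonal).
STORED = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
WEIGHTS = train_hebbian(STORED)

noisy = list(STORED[0])
noisy[0] = -noisy[0]  # corrupt one element of the first pattern

print(recall(WEIGHTS, noisy) == STORED[0])
```

A stored pattern presented as input is returned unchanged, which is the autoassociative case, and a copy with one corrupted element settles back to the original. No unsupervised "desired output" is shown to the network; the weights are derived from the patterns alone.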
According to Zarowin, "Even in its early development stage, the technology is reasonably successful at making 'decisions' about data that are incomplete, imprecise and only partly correct - jobs particularly unsuited to conventional software" (56). Neural networks also appear to be more useful in economic forecasting, risk management, financial modeling and establishing credit ratings than previous predictive models; for instance, neural networks can determine whether a lease should be classified as either operating or capital (Zarowin 1995).
In his study, "Thinking Computers," Zarowin writes: "One task in which they [neural networks] have shown unusual strength is in determining whether a lease should be classified as operating or capital. To build its experience in this area, an accountant would feed a program a number of sample leases that have been classified by the human instructor. Over time, the program distinguishes the patterns that make it either an operating or a capital lease" (57). Finally, Sharda (1994) compared the performance of neural nets to classical statistical techniques in 42 reported cases and found that the neural network model performed better in 71% of the cases, the statistical techniques performed better in 17% of the cases, and no winner emerged in the remaining 12%.
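The training process Zarowin describes, in which a program is fed leases already classified by a human instructor, is an instance of supervised learning. A single-neuron perceptron is enough to sketch it. The two features (lease term as a fraction of the asset's useful life, and the present value of payments as a fraction of fair value), the sample values, and the labels are all hypothetical and are not taken from Zarowin.

```python
def predict(weights, bias, features):
    """Return 1 (capital lease) or 0 (operating lease)."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=0.1, max_epochs=1000):
    """Adjust the weights after every misclassified sample.

    Stops early once a full pass over the samples produces no errors.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            if error != 0:
                errors += 1
                bias += learning_rate * error
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
        if errors == 0:
            break
    return weights, bias


# Hypothetical leases classified by a human instructor:
# (term / useful life, present value of payments / fair value).
SAMPLES = [
    (0.20, 0.30), (0.30, 0.40), (0.40, 0.50), (0.50, 0.40),   # operating
    (0.80, 0.95), (0.90, 0.90), (0.75, 0.92), (0.85, 1.00),   # capital
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_perceptron(SAMPLES, LABELS)
names = {0: "operating", 1: "capital"}
for lease, label in zip(SAMPLES, LABELS):
    print(lease, "instructor:", names[label],
          "network:", names[predict(weights, bias, lease)])
```

Because the sample leases are linearly separable, the perceptron is guaranteed to settle on weights that reproduce every instructor label; real lease data would likely call for more features and a multi-layer network.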
A noted authority in the field today, Kohonen (1988) defined such a network as "...a parallel interconnected network of simple (usually adaptive) elements and their hierarchical organizations which are intended to interact with the objects of the real world in the same way as biological nervous systems do" (4). More to the point as to human experiential knowledge, Haykin (1994) described a neural network as being ". . . A parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. Knowledge is acquired by the network through a learning process, and interneuron connection strengths, known as synaptic weights, are used to store the knowledge" (2).