
Extant Literature Has Been Dedicated

Last reviewed: February 21, 2011

Only limited research in the machine learning literature has been dedicated to the Markov Blanket classifier (Koller and Sahami, 1996; Tsamardinos et al., 2003). Very few of these studies generate the Markov Blanket from data, and those that do are normally restricted to a small number of variables. These studies treat the Markov Blanket purely as a set of variables. None of them has generated the DAG of the Markov Blanket, nor used the results for Bayesian inference in classification problems. Bayesian algorithms for finding DAGs that are theoretically correct in the limit of large samples (Chickering, 2002) are now known. However, they have not been applied to finding Markov Blankets in data sets with a large number of variables.

Koller and Sahami (1996) were the first to use a heuristic procedure to find the Markov Blanket variables for a given target variable in data sets with a large number of variables. The heuristic rests on two assumptions that rarely hold in real-world data: that the target influences the predictors, and that the variables most strongly associated with the target are included in its Markov Blanket. No classifier is studied. In the Koller and Sahami (1996) experiment with a large variable set, a hundred or more predictor variables remain.

In 1999, Margaritis and Thrun proposed the GS algorithm (Margaritis and Thrun, 1999). GS uses a measure of association with the target variable, together with conditional independence tests, to discover a reduced set of variables estimated to form the Markov Blanket. This is coupled with the use of the Markov Blanket nodes in studying the structure of the Bayesian network.
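To make the notion of a Markov Blanket as a set of variables concrete, the sketch below reads the blanket of a target off a known DAG, using the standard definition (the target's parents, its children, and the other parents of those children). It is a minimal illustration and not the GS or IAMBnPC procedure, which estimate the blanket from data; the function name `markov_blanket` and the toy graph are assumptions introduced here.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the parents, children and co-parents of `target` in a DAG.

    `parents` maps every node to the set of its parents.
    """
    blanket = set(parents.get(target, set()))  # parents of the target
    for node, node_parents in parents.items():
        if target in node_parents:             # `node` is a child of the target
            blanket.add(node)
            blanket |= node_parents            # other parents of that child
    blanket.discard(target)                    # the target is not in its own blanket
    return blanket


# Hypothetical toy DAG: A -> T -> C <- B, and C -> D.
toy_dag = {
    "A": set(),
    "B": set(),
    "T": {"A"},
    "C": {"T", "B"},
    "D": {"C"},
}

print(sorted(markov_blanket(toy_dag, "T")))  # ['A', 'B', 'C']
```

In this toy graph D is excluded from the blanket of T because it is reached only through the child C.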

In 2003, IAMBnPC was proposed (Tsamardinos et al., 2003). IAMBnPC employs a dynamic variant of the selection filter, followed by the PC algorithm described by Spirtes et al. (2000). The modified output of the PC algorithm is not a graphical Markov Blanket DAG. A variant of IAMBnPC, interIAMBnPC, interleaves the PC algorithm with the filter. HITON, another feature selection procedure built on the Markov Blanket notion, supplements the dynamic variable filter of IAMBnPC with a "wrapper" that employs any of several non-Bayesian classifiers (Aliferis et al., 2003). It then classifies the target using that non-Bayesian classifier. No graphical Markov Blanket DAG is generated in this process. The results are compared on five empirical data sets from the biomedical domain, each with an extremely large ratio of variables to cases.

Tabu search has been applied successfully to a large number of continuous as well as combinatorial optimization problems (Johnson and McGeoch, 1997; Toth and Vigo, 2003). It is capable of reducing the complexity of the search process and accelerating the rate of convergence. In its simplest form, tabu search starts with a feasible solution and chooses the best possible move according to an evaluation function, while taking steps to avoid revisiting a solution that was generated previously. This is done by imposing tabu restrictions on the possible moves so as to discourage the reversal and repetition of selected moves. A tabu list containing the forbidden moves is maintained. This list is referred to as the short-term memory function.
It operates by modifying the search trajectory in order to exclude moves that lead to new solutions carrying attributes of solutions already visited within a time horizon governed by the short-term memory. Intermediate and long-term memory functions may also be incorporated in order to intensify and diversify the search (Rego and Alidaee, 2004). Further investigation of metaheuristic procedures, including the advanced forms of tabu search, is therefore worthwhile. Metaheuristic search methods such as genetic algorithms (Holland, 1975), artificial neural networks (Freeman and Skapura, 1991), simulated annealing (Johnson et al., 1989; Metropolis et al., 1953) and tabu search (Glover, 1997) have been effectively applied to machine learning and data mining using Bayesian networks and decision trees (Sreerama et al., 1994).
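The simplest form of tabu search described above (start from a feasible solution, take the best admissible move, keep recently used moves in a short-term tabu list) can be sketched as follows. This is an illustrative sketch only: the bit-flip neighborhood, the aspiration rule that admits a tabu move when it beats the best score so far, the names `tabu_search` and `knapsack_score`, and the toy knapsack data are all assumptions made for the example, not details taken from the cited work.

```python
from collections import deque
from typing import Callable, List, Tuple


def tabu_search(score: Callable[[List[int]], float],
                start: List[int],
                tenure: int = 2,
                iterations: int = 20) -> Tuple[List[int], float]:
    """Maximize `score` over bit vectors using single-bit-flip moves."""
    current = list(start)
    best, best_score = list(current), score(current)
    tabu = deque(maxlen=tenure)  # short-term memory: recently flipped positions
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbor = list(current)
            neighbor[i] = 1 - neighbor[i]
            value = score(neighbor)
            # A tabu move is admitted only if it improves on the best so far.
            if i not in tabu or value > best_score:
                candidates.append((value, i, neighbor))
        if not candidates:
            break
        # Take the best admissible move even if it worsens the current score.
        value, i, current = max(candidates, key=lambda c: c[0])
        tabu.append(i)  # forbid reversing this move for `tenure` iterations
        if value > best_score:
            best, best_score = list(current), value
    return best, best_score


# Hypothetical toy knapsack instance used as the evaluation function.
VALUES = [6, 5, 4, 3]
WEIGHTS = [4, 3, 2, 1]
CAPACITY = 6


def knapsack_score(bits: List[int]) -> float:
    weight = sum(w for w, b in zip(WEIGHTS, bits) if b)
    value = sum(v for v, b in zip(VALUES, bits) if b)
    # Overweight selections are penalized by the amount of excess weight.
    return value if weight <= CAPACITY else CAPACITY - weight


best, best_score = tabu_search(knapsack_score, [0, 0, 0, 0])
print(best, best_score)  # [0, 1, 1, 1] 12
```

On this instance a plain greedy climb stops at the selection [1, 0, 1, 0] with value 10; the tabu list forces the search through worse solutions and on to the value-12 optimum.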

Recent applications include a genetic algorithm approach to building decision trees for use in the marketing domain (Fu et al., 2004). Stock market analysis has also been carried out using neural networks within hybrid intelligent systems (Abraham et al., 2001). Heuristic methods are also used in modeling reinforcement learning algorithms that learn to control partially observable decision processes based on Markov Blanket classifier theory (Moll et al., 2000).

Extant literature has also been dedicated to the concept of Maximum Entropy. Maximum Entropy, a machine learning method based on empirical data, has been shown to outperform Naive Bayes classification in many cases (Nigam et al., 1999; Berger et al., 1996). Raychaudhari et al. (2002) also showed that Maximum Entropy performs better than both Naive Bayes and nearest-neighbor classification. Unlike Naive Bayes, Maximum Entropy makes no independence assumptions about the occurrence of words.

Rather than relying on such assumptions, Maximum Entropy modeling yields the probability distribution that is as close to uniform as possible subject to the given constraints. A more elaborate description of the technique is available in the literature (Manning and Schutze, 1999; Ratnaparkhi, 1997); Ratnaparkhi (1997) in particular describes Maximum Entropy classification.

There has been a considerable amount of work on classification by means of different techniques and optimizations. Much of this work has concentrated on identification (Karlgren and Cutting, 1994). Ontology-based classification is another area (Raychaudhari and Altman, 2002), and subjective genres have also been a focus (Kessler et al., 1997). Some of the procedures used to determine opinions and sentiments from a given text are the same as those employed by Raychaudhari and Altman (2002) in classifying biomedical abstracts. Because ontology-based methods of classification are distinct and well defined, the categories used to rank the contents are usually ordered but never equidistant (Mehra et al., 2002). The consequence is a reduction in the accuracy of the classifier. Raychaudhari and Altman (2002) employed a statistical feature selection technique known as chi-square. The chi-square test evaluates statistical significance and selects the words with the highest skew across the categories in a given training set (Manning and Schutze, 1999). The most recent work on sentiment analysis has been based on language as opposed to statistics (Hatzivassiloglou and McKeown, 1997).
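As a rough illustration of chi-square feature selection, the sketch below scores each word with the standard chi-square statistic for a 2x2 word-by-category contingency table and ranks words by that score. It is an assumed, simplified two-category version and not the procedure of the cited authors; the function names and the word counts are hypothetical.

```python
from typing import Dict, List, Tuple


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 table.

    a: documents in the category that contain the word
    b: documents outside the category that contain the word
    c: documents in the category that lack the word
    d: documents outside the category that lack the word
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a degenerate table carries no evidence of skew
    return n * (a * d - b * c) ** 2 / denominator


def rank_words(counts: Dict[str, Tuple[int, int, int, int]]) -> List[Tuple[str, float]]:
    """Sort words from most to least skewed across the two categories."""
    scored = [(word, chi_square(*cells)) for word, cells in counts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts over 100 training documents, 50 per category.
toy_counts = {
    "the": (10, 10, 20, 20),     # evenly spread, so the statistic is 0
    "kinase": (30, 10, 20, 40),  # concentrated in the category
}

for word, value in rank_words(toy_counts):
    print(f"{word}: {value:.2f}")
```

Words at the top of the ranking are the ones a feature selector of this kind would keep.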

Support Vector Machine (SVM) technology has been applied to recognize articles that pertain to biomolecular interactions and to confirm sentences that mention specific protein-protein interactions. The method can be used to rapidly train a machine learning algorithm to distinguish interaction-related articles, and it bypasses the arduous procedure of constructing the domain-specific semantic grammar needed in Natural Language Processing (NLP). Marcotte et al. (2001) recently applied a related (Bayesian) method to categorize articles describing protein-protein interaction information.

Protein names and gene symbols are obtained from non-redundant sequence databases (Pruit and Maglot, 2001) and from the Saccharomyces Genome Database (SGD) (Dwight et al., 2002); these names are applied only in literature searching. This provides a direct mapping of names to sequences and also helps in preparing BIND interaction records for submission. The remaining names are then identified by the Taxonomy module. These, however, are used solely to mark up text for review by other users. The approach of using names to build a co-occurrence network of biomolecules that can be identified in the literature resembles the approach used by PubGene (Jenssen et al., 2001).

Support vector machine learning has been widely advocated for detecting microcalcification clusters in digital mammograms. It is a learning tool that originated in the modern statistical theory of learning (Vapnik, 1998). In recent years, SVM learning has found a wide range of real-life applications, including handwritten digit recognition (Scholkopf et al., 1997), object recognition (Pontil and Verri, 1998), speaker identification (Wan and Campbell, 2000), face detection in images (Osuna et al., 1997) and text categorization (Joachims, 1999). The SVM learning formulation is based on the principle of structural risk minimization. Rather than minimizing an objective function based on the training examples, SVM aims to minimize a bound on the generalization error, that is, the error made by the learning machine on test data not used during training.

Consequently, SVM tends to perform well when applied to data outside the training set. Indeed, SVM-based approaches have been reported to perform considerably better than competing methods in numerous applications (Burges, 1998; Muller et al., 2001; Wernick, 1991). SVM attains this advantage by focusing on the training examples that are hardest to classify.

The idea of Support Vector Machines is to map the input data into a high-dimensional feature space via a non-linear mapping chosen a priori (Boser et al., 1992). Handwritten digit recognition has often been used as a benchmark for assessing classifiers (LeCun et al., 1995). For this reason, SVMs were first tested on the United States Postal Service (USPS) database (LeCun et al., 1989) and the MNIST database (LeCun et al., 1995). The main benefit of the latter is that it has 60,000 training examples and 10,000 test examples, which yields very accurate comparisons among classifiers. By contrast, the USPS database contains 9,298 handwritten digits, of which 7,291 are used for training and the remaining 2,007 for testing (LeCun et al., 1995).
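The idea of a non-linear mapping into a higher-dimensional feature space can be illustrated with a degree-2 polynomial kernel on two-dimensional inputs. The source does not name a particular mapping, so the kernel chosen here and the function names are assumptions for the example. The sketch checks that the kernel value computed in the input space equals the inner product of the explicitly mapped vectors.

```python
import math
from typing import Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def feature_map(x: Tuple[float, float]) -> Tuple[float, float, float]:
    """Explicit degree-2 map: (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x: Tuple[float, float], z: Tuple[float, float]) -> float:
    """Homogeneous polynomial kernel of degree 2, computed in the input space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
in_feature_space = dot(feature_map(x), feature_map(z))
in_input_space = poly_kernel(x, z)
print(math.isclose(in_feature_space, in_input_space))  # True
```

Because the kernel gives the same inner product without building the mapped vectors, a classifier that depends only on inner products can work in the feature space at the cost of computing in the input space.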

In Naive Bayes classification, the classifier assumes that the attributes are conditionally independent of one another given the class; it then uses Bayes' theorem to estimate the probability of each class. The class with the maximum probability is selected as the class of the instance in question. Naive Bayes classifiers are simple, robust, effective and efficient, and they readily support incremental training. These merits have led to their use in many classification tasks.
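The procedure just described can be sketched for categorical attributes as follows. This is a minimal sketch under stated assumptions: the class name `NaiveBayes`, the add-one (Laplace) smoothing and the toy weather data are introduced here for illustration and do not come from the cited literature. Incremental training falls out naturally, since each new example only increments counts.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence


class NaiveBayes:
    """Categorical Naive Bayes with additive smoothing and incremental updates."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha                          # smoothing strength (assumed)
        self.class_counts = Counter()               # label -> number of examples
        self.feature_counts = defaultdict(Counter)  # (label, position) -> value counts
        self.values = defaultdict(set)              # position -> distinct values seen

    def update(self, features: Sequence[Hashable], label: Hashable) -> None:
        """Incremental training: fold one labeled example into the counts."""
        self.class_counts[label] += 1
        for pos, value in enumerate(features):
            self.feature_counts[(label, pos)][value] += 1
            self.values[pos].add(value)

    def log_scores(self, features: Sequence[Hashable]) -> Dict[Hashable, float]:
        """Unnormalized log posterior of each class via Bayes' theorem."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)             # log prior
            for pos, value in enumerate(features):
                count = self.feature_counts[(label, pos)][value]
                k = len(self.values[pos])
                # Conditional independence: per-attribute terms simply add up.
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, features: Sequence[Hashable]) -> Hashable:
        """Return the class with the maximum posterior probability."""
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical toy data: (outlook, temperature) -> play?
model = NaiveBayes()
for features, label in [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("overcast", "hot"), "yes"),
    (("overcast", "mild"), "yes"),
]:
    model.update(features, label)

print(model.predict(("sunny", "hot")))      # no
print(model.predict(("overcast", "mild")))  # yes
```

Working in log space avoids numerical underflow when many attribute probabilities are multiplied together.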

