Support vector machine (SVM) learning has been widely proposed for detecting microcalcification clusters in digital mammograms. It is a learning tool that originated in modern statistical learning theory (Vapnik, 1998). In recent years, SVM learning has found a wide range of real-life applications, including handwritten digit recognition (Scholkopf et al., 1997), object recognition (Pontil & Verri, 1998), speaker identification (Wan & Campbell, 2000), face detection in images (Osuna et al., 1997), and text categorization (Joachims, 1999). The SVM learning formulation is based on the structural risk minimization principle. Rather than minimizing an objective function defined on the training examples, an SVM tries to minimize a bound on the generalization error, that is, the error the learning machine makes on test data that was not used during training.
Consequently, an SVM tends to perform well when it is applied to data outside the training set. Indeed, SVM-based approaches have been reported to perform considerably better than competing methods in numerous applications (Burges, 1998; Muller et al., 2001; Wernick, 1991). An SVM attains this advantage by focusing on the training examples that are hardest to classify.
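The focus on hard training examples can be illustrated with a minimal sketch. It assumes the common soft-margin formulation with a hinge loss, which the text above does not spell out; the separator `w`, `b` and the four toy examples are hypothetical values chosen only for illustration.

```python
def decision_value(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hinge_loss(w, b, x, y):
    """Hinge loss max(0, 1 - y * f(x)) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))


# Hypothetical separator and toy examples (assumptions, not from the source).
w, b = [1.0, 0.0], 0.0
examples = [
    ([3.0, 1.0], +1),   # far on the correct side of the margin
    ([0.5, 2.0], +1),   # correct side, but inside the margin
    ([-2.0, 0.0], -1),  # far on the correct side of the margin
    ([0.2, -1.0], -1),  # misclassified
]

losses = [hinge_loss(w, b, x, y) for x, y in examples]
# Only examples with non-zero loss influence this objective.
hard = [i for i, loss in enumerate(losses) if loss > 0.0]

print(losses)
print(hard)
```

Under this assumed loss, examples that are classified correctly with a comfortable margin contribute nothing, so only the examples that lie inside the margin or on the wrong side shape the solution.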
The idea behind support vector machines is to map the input data into a high-dimensional feature space through a non-linear mapping that is chosen a priori (Boser et al., 1992). Handwritten digit recognition has often been used as a benchmark for assessing classifiers (LeCun et al., 1995). For this reason, SVMs were first tested on the United States Postal Service (USPS) database (LeCun et al., 1989) and the MNIST database (LeCun et al., 1995). The main benefit of the latter is that it contains 60,000 training examples and 10,000 test examples, which allows a very accurate comparison among classifiers. The USPS database, by contrast, contains 9,298 handwritten digits, of which 7,291 are used for training and the remaining 2,007 for testing (LeCun et al., 1995).
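The non-linear mapping into a feature space mentioned at the start of this passage can be sketched in a few lines. The example assumes a degree-2 homogeneous polynomial kernel on two-dimensional inputs, which is one possible choice and is not named in the text; the function names `phi` and `poly_kernel` are illustrative.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map for a 2-D input (assumed example)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product computed in feature space
implicit = poly_kernel(x, z)     # same quantity computed in input space

print(explicit, implicit)
```

The two values agree up to rounding, which shows that, for this kernel, the inner product in the higher-dimensional feature space can be obtained without constructing the mapped vectors.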
In Naive Bayes classification, the classifier assumes that the attributes are conditionally independent of one another given the class; it then uses Bayes' theorem to estimate the probability of each class. The class with the highest probability is selected as the class of the example. Naive Bayes classifiers are simple, robust, effective, and efficient, and they readily support incremental training. These merits have led to their use in many classification tasks.
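The procedure described above, including incremental training, can be sketched as follows. This is a minimal illustration for categorical attributes, not a reference implementation: the class name `NaiveBayes`, the Laplace smoothing parameter `alpha`, and the toy weather-style data are all assumptions introduced here.

```python
import math
from collections import defaultdict


class NaiveBayes:
    """Count-based Naive Bayes for categorical attributes (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumption)
        self.class_counts = defaultdict(int)    # class -> number of examples
        self.value_counts = defaultdict(int)    # (class, index, value) -> count
        self.values_seen = defaultdict(set)     # index -> distinct values

    def update(self, attributes, label):
        """Incremental training: fold one labeled example into the counts."""
        self.class_counts[label] += 1
        for i, v in enumerate(attributes):
            self.value_counts[(label, i, v)] += 1
            self.values_seen[i].add(v)

    def log_scores(self, attributes):
        """Log of P(class) times the product of P(attribute | class)."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for i, v in enumerate(attributes):
                # Number of possible values, counting v even if unseen.
                k = len(self.values_seen[i] | {v})
                count = self.value_counts.get((label, i, v), 0)
                score += math.log((count + self.alpha) / (n_c + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, attributes):
        """Return the class with the highest (log) probability score."""
        scores = self.log_scores(attributes)
        return max(scores, key=scores.get)


# Hypothetical toy data (not from the source).
model = NaiveBayes()
for attrs, label in [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("overcast", "hot"), "yes"),
]:
    model.update(attrs, label)

first = model.predict(("sunny", "hot"))

# A new example arrives later; only the counts change, no retraining pass.
model.update(("overcast", "mild"), "yes")
second = model.predict(("overcast", "mild"))

print(first, second)
```

Because the model is just a set of counts, adding an example is a constant-time update per attribute, which is what makes incremental training straightforward in this sketch.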
Naive Bayes classifiers have long been a core technique in information retrieval (Maron and Kuhns, 1960; Frasconi, Soda, and Vullo, 2001; Maron, 1961; Lewis, 1992; Kalt, 1996; Larkey and Croft, 1996; McCallum and Nigam, 1998; Pazzani, Muramatsu, and Billsus, 1996; Starr, Ackerman, and Pazzani, 1996; Joachims, 1997; Koller and Sahami, 1997; Li and Yamanishi, 1997; Mitchell, 1997; Pazzani and Billsus, 1997; Lewis and Gale, 1994; Lewis, 1998; McCallum, Rosenfeld, Mitchell, and Ng, 1998; Guthrie and Walker, 1994; Nigam, McCallum, Thrun, and Mitchell, 1998). They were first introduced into machine learning as straw men against which new algorithms were evaluated and compared (Clark and Niblett, 1989; Cestnik, Kononenko, and Bratko, 1987; Cestnik, 1990). It was later found, however, that their classification accuracy was surprisingly high compared with that of many more sophisticated classification algorithms (Domingos and Pazzani, 1996; Zhang, Ling, and Zhao, 2000; Domingos and Pazzani, 1997; Kononenko, 1990; Langley, Iba, and Thompson, 1992). They have therefore often been chosen as the base algorithm for bagging, boosting, wrapper, voting, and hybrid methodologies (Kohavi, 1996; Ting and Zheng, 1999; Gama, 2000; Zheng, 1998; Kim, Hahn, and Zhang, 2000; Bauer and Kohavi, 1999; Tsymbal, Puuronen, and Patterson, 2002).
Similarly, Naive Bayes classifiers are widely used in medical diagnosis (Kononenko, 1993; Kohavi, Sommerfield, and Dougherty, 1997; Kukar, Groselj, Kononenko, and Fettich, 1997; McSherry, 1997; McSherry, 1997; Zelic, Kononenko, Lavrac, and Vuga, 1997; Montani, Bellazzi, Portinale, Fiocchi, and Stefanelli, 1998; Lavrac, 1998; Lavrac, Keravnou, and Zupan, 2000; Kononenko, 2001; Zupan, Demsar, Kattan, Ohori, Graefen, Bohanec, and Beck, 2001), email filtering (Pantel and Lin, 1998; Provost, 1999; Androutsopoulos, Koutsias, Chandrinos, and Spyropoulos, 2000; Rennie, 2000; Crawford, Kay, and Eric, 2002), and recommender systems (Starr, Ackerman, and Pazzani, 1996; Miyahara and Pazzani, 2000; Mooney and Roy, 2000).
Naive Bayes learning algorithm
The Naive Bayes learning algorithm is among the most practical approaches for many learning problems. It is based on explicitly calculating probabilities for hypotheses. It is competitive with other learning algorithms and in a number of cases outperforms them. Naive Bayes learning algorithms are also important to machine learning because they offer a useful perspective for understanding many learning algorithms that do not explicitly manipulate probabilities (Alpaydin, 2004). The Naive Bayes classifier is based on the simplifying assumption that the attribute values are conditionally independent given the target value (Mitchell, n.d.).
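The independence assumption and the classification rule it leads to can be written compactly. The notation below is an assumption introduced for illustration: a_1, ..., a_n denote the attribute values of an example, c ranges over the set of classes C, and \hat{c} is the predicted class.

```latex
% Simplifying assumption: attribute values are conditionally
% independent given the target value c
P(a_1, \ldots, a_n \mid c) = \prod_{i=1}^{n} P(a_i \mid c)

% Decision rule obtained from Bayes' theorem; the evidence term
% P(a_1, \ldots, a_n) is identical for every class and drops out
\hat{c} = \arg\max_{c \in C} \; P(c) \prod_{i=1}^{n} P(a_i \mid c)
```

Under this assumption only the class priors P(c) and the per-attribute conditionals P(a_i | c) need to be estimated, rather than the full joint distribution over all attributes.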
Naive Bayes classification is the optimal method of supervised learning when the attribute values of an example are independent given the example's class. Although this assumption is often violated in practice, previous work has shown that naive Bayesian learning is remarkably effective in practice and difficult to improve upon systematically (Domingos and Pazzani, 1996). On many real-world datasets, naive Bayesian learning gives better test-set accuracy than other well-known methods such as backpropagation. In addition, these classifiers can be learned very efficiently (Minsky & Papert, 1969).
References
Abraham, A., Nath, B., and Mahanti, P.K. (2001). Hybrid intelligent systems for stock market analysis. Computational Science, pages 337-345.
Aliferis, C., Tsamardinos, I., and Statnikov, A. (2003). HITON, a novel Markov blanket algorithm for optimal variable selection.
Berger, A. A Brief Maximum Entropy Tutorial.
Chickering, D.M. (2002). Learning equivalence classes of Bayesian-network structures. Journal of Machine Learning Research, 3:507-554.
Schroeder M, Sherlock G, Sethuraman A, Weng S, Botstein D, Cherry JM (2002) Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res.
Freeman, J. and Skapura, D. (1991). Neural Networks. Addison-Wesley.
Fu, Z., Golden, B.L., Lele, S., Raghavan, S., and Wasil, E.A. (2004). A genetic algorithm-based approach for building accurate decision trees. INFORMS Journal on Computing, 15:3-22.
Glover, F. (1997). Tabu Search. Kluwer Academic Publishers.
Hatzivassiloglou, V. and McKeown, K.R. (1997). Predicting the semantic orientation of adjectives. Proceedings of ACL-97, 35th Annual Meeting of the Association for Computational Linguistics, Madrid, ES, Association for Computational Linguistics.
Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan
Jenssen TK, Laegreid A, Komorowski J, Hovig E (2002) A literature network of human genes for high-throughput analysis of gene expression. Nat Genet.
Johnson, D.S., Aragon, C.R., McGeoch, L.A., and Schevon, C. (1989). Optimization by simulated annealing: an experimental evaluation. Part I, graph partitioning. Operations Research, 37(6):865-892.
Johnson, D.S. and McGeoch, L.A. (1997). The traveling salesman problem: A case study in local optimization. In E.H.L. Aarts and J.K. Lenstra, editors, Local Search in Combinatorial Optimization, pages 215-310. John Wiley and Sons.
Joachims, T. (1999). "Transductive inference for text classification using support vector machines," presented at the Int. Conf. Machine Learning, Slovenia.
Koller, D. and Sahami, M. (1996). Towards optimal feature selection. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 284-292. Morgan Kaufmann.
Marcotte EM, Xenarios I, Eisenberg D (2001): Mining literature for protein-protein interactions. Bioinformatics.
Manning, C.D. and Schutze, H. (1997). Foundations of Statistical Natural Language Processing.
Margaritis, D. and Thrun, S. (1999). Bayesian network induction via local neighborhoods. In Advances in Neural Information Processing Systems.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21(6):1087-1092.
Moll, R., Perkins, T.J., and Barto, A.G. (2000). Machine learning for subproblem selection. In Proceedings 17th International Conf. on Machine Learning, pages 615-622. Morgan Kaufmann, San Francisco, CA.
Nigam, K., Lafferty, J., and McCallum, A. (1999). Using maximum entropy for text classification. In Proc. of the IJCAI-99 Workshop on Machine Learning for Information Filtering.
Osuna, E., Freund, R., and Girosi, F. (1997). "Training support vector machines: Application to face detection," in Proc. Computer Vision and Pattern Recognition, Puerto Rico.
Pruitt KD, Maglott DR (2001): RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res.
Pontil, M. and Verri, A. (1998). "Support vector machines for 3-D object recognition,"