Data Mining Evaluating Data Mining Thesis

Excerpt from Thesis:

Using a database as the system of record is a step common to all definitions of data mining, and it is critically important for creating a standardized set of query commands and data models. The more stable and scalable the system of record, the better a data mining application can deliver critical relationship data and predictive analytics and accurately reflect the associations that matter most to companies (Kuhn, Ducasse, Girba, 2007). Multidimensional database systems are essential for creating the system of record on which data mining applications are based. Data warehouses are the systems of record these applications rely on for more extensive analysis of the available data sets. The third process is the development of user-based applications that make it possible to query the data sets, including role-based access to the data over time (Cressionnie, 2008). Role-based access to data mining application data is critically important in developing CRM-based strategies, where reports are often used to plan marketing campaigns and to predict customer purchase patterns and response rates to specific promotions (Sun, 2006). Google uses the reporting layer of its data mining applications to give its managers, directors and senior executives insight into how its search engine, related products and services, and specific language sites are performing over time. This data is invaluable to Google in creating new online products and services, which stand a higher probability of success because they are based on customer needs discovered through data mining. The fourth process is the set of more advanced applications used to analyze the data and present it in a form that line-of-business managers, directors and senior management can use.
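As a minimal sketch of the role-based access to reports described above, the Python below restricts a CRM-style report so that each role sees only the fields it is permitted to read. The role names, field names and sample rows are all hypothetical and chosen only for illustration; the cited sources do not describe how any particular product implements this.

```python
# Hypothetical mapping of roles to the report fields each may read.
ROLE_FIELDS = {
    "campaign_planner": {"segment", "promotion", "response_rate"},
    "senior_executive": {"segment", "promotion", "response_rate", "revenue"},
}


def run_report(role, rows):
    """Return the report rows restricted to the fields the role may read."""
    allowed = ROLE_FIELDS.get(role)
    if allowed is None:
        raise PermissionError(f"role {role!r} has no report access")
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


# Illustrative data only; not taken from any real system.
SAMPLE_ROWS = [
    {"segment": "small business", "promotion": "spring",
     "response_rate": 0.04, "revenue": 12000},
    {"segment": "enterprise", "promotion": "spring",
     "response_rate": 0.09, "revenue": 88000},
]

if __name__ == "__main__":
    for row in run_report("campaign_planner", SAMPLE_ROWS):
        print(row)
```

Keeping the permissions in a single mapping means a new role can be added without touching the query code, which is one plausible way to let the same data set serve planners and executives differently.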
Advanced applications are critically important because they let data mining systems create and continually monitor the four types of relationships in data (da Cunha, Agard, Kusiak, 2010). Combined, these four associative models provide a rich set of insights and intelligence for creating predictive marketing, selling and service strategies (Sun, 2006). Application software for data analysis is also going through a revolution of its own, as AJAX (Asynchronous JavaScript and XML) and XML-based networks streamline the Web-based applications used for intensive data mining tasks. The streamlined design of AJAX applications is leading to Web Services that can scale to support more of the front-end analysis at the client level of networks (Nayak, 2008). The next generation of data mining applications, which will be discussed at the end of this analysis, is already being built on AJAX-based technology that integrates with XML networks optimized for performance. The last process area is presenting data in useful and readable formats, another area heavily influenced by the adoption of AJAX development techniques and tools for Web-based data mining applications (Nayak, 2008).
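Associations between items that customers buy together are one kind of relationship such applications can monitor. The sketch below assumes a hypothetical list of purchase baskets and computes the standard support and confidence measures for a rule of the form "customers who buy X also buy Y". It illustrates the general technique and is not a method taken from the cited sources.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    if not baskets:
        return 0.0
    return sum(1 for b in baskets if items <= set(b)) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(baskets, antecedent)
    if base == 0.0:
        return 0.0
    return support(baskets, set(antecedent) | set(consequent)) / base


# Hypothetical purchase baskets, for illustration only.
BASKETS = [
    {"printer", "ink"},
    {"printer", "ink", "paper"},
    {"printer", "paper"},
    {"ink"},
]

if __name__ == "__main__":
    print("support(printer, ink):", support(BASKETS, {"printer", "ink"}))
    print("confidence(printer -> ink):", confidence(BASKETS, {"printer"}, {"ink"}))
```

A rule with high support and high confidence is a candidate for a promotion or a cross-selling strategy; in practice the thresholds for "high" would have to be chosen for the business in question.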

Assessing Data Mining as a Technology Trend

The catalyst of data mining's growth continues to be the unmet information needs of organizations seeking a competitive advantage from the vast data they have accumulated. Hardware advances in server virtualization and their use in accelerating complex processing tasks (Luo, Lu, Huang, He, Shi, 2006), together with the development of text mining, clustering and relational analytics engines (Berry, 2004), are drastically re-ordering the data mining landscape. In addition, the acceptance of AJAX as a development approach of choice for data-intensive applications has accelerated the adoption of data mining throughout geographically dispersed organizations (Nayak, 2008). Software-as-a-Service (SaaS) platforms are also being created as a result of these trends, including virtualization and AJAX or thin-client computing (Nayak, 2008). These technologies make it possible to define the associations in data more quickly and thoroughly, and to progress through the five process areas mentioned in the previous section of this analysis.

The more fundamental catalysts of this technology trend, however, are the unmet needs of organizations, both for-profit and non-profit, for greater insight and intelligence into their customers, operations and processes. The role of data mining has been to create more capable analytical tools through the use of AJAX programming, .NET, Java (J2EE) and the development of Web Services (Nayak, 2008). A cycle of continuous innovation is occurring today as a result. The technologies continually fuel greater flexibility and depth of analysis while also creating more efficient approaches to producing reports and online scorecards. The net result of these usability improvements is a continual improvement in how reports and analysis can be tailored to the needs of information users. For the first time, this convergence of technologies and needs is leading to role-based access to vast amounts of data analyzed through data mining engines and constraint-based modeling techniques (Sun, 2006). It is also fueling the use of data mining for predictive analytics models in small and medium businesses, as the applications are delivered over the Internet (Nayak, 2008). Organizations are also using data mining to drive their strategies for Business Intelligence (BI) and advanced data warehousing (DW) platforms and programs, which make strategies more achievable through greater intelligence and more real-time feedback. In conclusion, the needs of users are growing more complex and demanding in terms of analytics while data mining, business intelligence, and data warehouses are also evolving, further raising expectations. This cycle of innovation will continue to accelerate as technology gains are made and users of these systems devise creative new ways to use the data and capitalize on the insights it delivers.

Use of Data Mining at Google

Google uses data mining both for the search services it delivers and for the extensive CRM platforms and systems it uses to target new corporate accounts, define customer and audience segments, and devise new approaches to serving advertisers. Of all these customer groups, advertisers are the company's largest single source of revenue because of its AdWords program. Google uses data mining to determine how effective its advertisers are with specific programs, to track trends in specific queries, to determine how to improve the performance of its servers and virtualization routines, and to decide which potential new products are best to launch. Google's latent semantic indexing technology is used for pattern matching (Buddhakulsomsiri, Zakarian, 2009) as well as for linguistic modeling and analysis. Google uses these technologies to create predictive linguistic models that help the company manage the search process more effectively. Latent semantic indexing makes more effective use of the computing time available on the company's servers and makes the search models themselves more effective and streamlined in the linguistic associations they draw (Berry, 2004). Google's goal is a data mining technology that is intelligent and learns patterns in data by itself over time, so that queries of its search engine and associated products become more efficient.
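To make the idea of latent semantic indexing concrete, the sketch below ranks a few toy documents against a query in a reduced "concept" space. It is a textbook-style illustration only, not Google's implementation: the documents and query are hypothetical, and a simple power iteration on the document Gram matrix stands in for the singular value decomposition routine a production system would take from a numerical library.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def top_eigenpairs(sym, k, iters=500):
    """Top-k eigenpairs of a symmetric positive semidefinite matrix,
    found by power iteration with deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # fixed start keeps runs repeatable
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = matvec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(work, v)))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def lsi_scores(docs, query, k=2):
    """Cosine similarity of the query to each document in a k-concept space."""
    vocab = sorted({t for d in docs for t in d.split()})
    # Term-document count matrix A (rows: terms, columns: documents).
    a = [[float(d.split().count(t)) for d in docs] for t in vocab]
    n_docs = len(docs)
    # Gram matrix A^T A; its eigenvectors are the right singular vectors of A.
    gram = [[sum(a[t][i] * a[t][j] for t in range(len(vocab)))
             for j in range(n_docs)] for i in range(n_docs)]
    pairs = [(lam, v) for lam, v in top_eigenpairs(gram, k) if lam > 1e-9]
    q = [float(query.split().count(t)) for t in vocab]
    # Fold the query in: its coordinate on a concept is q . (A v) / lambda.
    q_hat = [sum(qt * av for qt, av in zip(q, matvec(a, v))) / lam
             for lam, v in pairs]
    q_norm = math.sqrt(sum(x * x for x in q_hat))
    scores = []
    for j in range(n_docs):
        d_hat = [v[j] for _, v in pairs]  # document j in concept space
        d_norm = math.sqrt(sum(y * y for y in d_hat))
        dot = sum(x * y for x, y in zip(q_hat, d_hat))
        scores.append(dot / (q_norm * d_norm) if q_norm * d_norm > 0 else 0.0)
    return scores


# Hypothetical toy corpus, for illustration only.
DOCS = ["car engine", "automobile engine", "flower petal", "flower petal garden"]

if __name__ == "__main__":
    for doc, score in zip(DOCS, lsi_scores(DOCS, "car")):
        print(f"{score:6.3f}  {doc}")
```

In this toy corpus the query "car" shares no word with "automobile engine", yet that document scores close to the top because both appear in the same "engine" context. This is the kind of linguistic association the paragraph above attributes to latent semantic indexing.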

In addition to using data mining to improve the performance of its core search engine and to determine how best to serve its advertising customers, Google also uses data mining to enable greater levels of process improvement (Osei-Bryson, Rayward-Smith, 2009). Google uses data mining to analyze the performance of its applications, hosting centers, and business processes and to determine how best to improve them over time. In this way its managers gain more effective insight into how best to streamline, revamp and improve processes. Business process re-engineering is a main focus of the company's use of data mining, along with creating programs for continuous process improvement throughout its divisions and subsidiaries worldwide.

Future of Data Mining

The future of data mining will be defined by the technologies that make streamlined interfaces based on AJAX and J2EE possible. Alongside this development will come continual improvement in XML networking technologies and speeds, making it possible for data mining eventually to become a true Web Service (Nayak, 2008). Continual improvements in usability and user interfaces will also lead to significant advances in how data mining is aligned with business roles and responsibilities. No longer constrained by the taxonomies used to create the databases, data mining applications will be able to create personalized, highly flexible taxonomies on the fly to fit a given user's requirements. In short, future data mining applications will be transparent to the business processes and goals companies pursue over time. There will also no longer be a demarcation line between the data mining application and its supporting systems, databases and systems of record. Data mining will be integrated directly into knowledge flow as a result (Lai, Liu, 2009).



Sources Used in Document:


Berry, M. W. (Ed.). (2004). Survey of Text Mining: Clustering, Classification, and Retrieval. XVII, 244 p., 57 illus. Hardcover. ISBN 0-387-95563-1.

Buddhakulsomsiri, J., & Zakarian, A. (2009). Sequential pattern mining algorithm for automotive warranty data. Computers & Industrial Engineering, 57(1), 137.

Cressionnie, L. (2008). Ready for Takeoff. Quality Progress, 41(7), 59-61.

da Cunha, C., Agard, B., & Kusiak, A. (2010). Selection of modules for mass customisation. International Journal of Production Research, 48(5), 1439.

Cite This Thesis:

"Data Mining Evaluating Data Mining" (2010, February 08). Retrieved January 17, 2019.