Data Mining
Today, companies of all sizes and types are eager to learn as much as possible about their customers and competitors in order to gain and sustain a competitive advantage in an increasingly globalized marketplace. One strategy that has demonstrated utility for this purpose is data mining. The purpose of this paper is to provide a review of the relevant literature to determine the pros and cons of using data mining for a medium-sized business operating in the U.S., a discussion of the tools that are available for this purpose and their respective costs, and an assessment of the amount of time it would take to be up and running with data mining. Finally, an analysis of whether a third-party vendor makes it easier to data mine and two examples of businesses that successfully use the data mining process are followed by a summary of the research and key findings concerning these issues in the conclusion.
What are the pros and cons of using data mining?
One of the major “pros” of using data mining is that this research strategy draws on existing data to develop new findings and insights that might not otherwise be possible. For instance, according to the definition provided by Chang (2022), data mining “refers to the nontrivial process of extracting implicit, unknown, and potentially useful information from a database or data warehouse” (p. 1). In layman’s terms, data mining uses a computer-based application to transform raw data into meaningful information (Data mining in business analytics, 2022). By analyzing patterns and anomalies in big data, data mining can help business practitioners make informed decisions and develop timely strategies in response to newly identified opportunities or potential threats in ways that promote organizational efficiency and profitability (Data mining in business analytics, 2022).
Although data mining applications can achieve these benefits, the process is not without challenges. One of the major “cons” of using data mining is the paucity of relevant research on how successful firms have translated the findings that emerge from data mining analytics into real-world initiatives. In this regard, Zhan et al. (2019) point out that, “While providing high-level evidence of these benefits, studies have failed to systematically investigate the specific mechanics behind how firms can realize these benefits” (p. 6335). In addition, the extent to which organizations realize the benefits of data mining also depends on the tools they use, and this issue is discussed further below.
What tools are available to help with data mining?
Given its proven ability to provide business practitioners with the information they need to make informed decisions in a complex environment, it is not surprising that numerous data mining applications are available, both commercial and open source. The following are among the most commonly used at present:
· Rapid Miner: This proprietary application provides conventional data mining tools as well as the capability to learn, in an integrated fashion, from the findings the process generates over time. In addition, Rapid Miner features an intuitive user interface that facilitates its use from the outset and assists with 1) data modeling and preparation, 2) data cleansing, 3) exploratory data analysis and 4) visualizations.
· Oracle Data Mining: This is a leading proprietary data mining product that includes a wide array of functions, including multiple data mining algorithms that are used to classify, regress, predict and identify anomalous outliers. An important benefit of this product is the included customer support, which is provided by expert Oracle technical staff.
· Apache Spark: This open-source data mining application offers many of the same features as the above-described platforms as well as the ability to build parallel apps to promote greater integration. Some indication of this application’s popularity is the fact that nearly 13,500 companies currently use Apache Spark for all of their data mining needs (Sarangam, 2020).
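To make the idea of identifying anomalous outliers more concrete, the following is a minimal sketch in Python that uses only the standard library. It does not represent how any of the tools listed above work internally; it swaps in one simple, well-known technique (the modified z-score based on the median absolute deviation) purely for illustration. The function name find_outliers, the threshold of 3.5 and the daily_sales figures are all hypothetical.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score measures distance from the median in units of
    the median absolute deviation (MAD), so a single extreme value does
    not distort the yardstick the way it would with a mean-based score.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; no usable scale.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical daily sales totals with one unusual day.
daily_sales = [1020, 980, 1005, 995, 1010, 990, 2500, 1000]
print(find_outliers(daily_sales))  # prints [2500]
```

In a business setting, a flagged value such as the 2,500 figure would be a prompt for further investigation (for example, a promotion, a data entry error or fraud) rather than a conclusion in itself.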
What tools are high cost, and what tools are low cost?
The price of the proprietary data mining applications described above (as well as dozens of others) can run into the thousands of dollars, while the open-source versions are readily available at no cost.
How long would it take to be up and running with data mining?
As with the question “How long is a piece of string?”, the amount of time required to configure a commercially available data mining application or to design a novel solution depends on a number of factors. Although these factors vary depending on the size of the organization and the purposes for which it intends to mine data, they generally include the level of in-house IT and statistical analysis expertise.
Does a third-party vendor make it easier to data mine?
Even highly trained and skilled software engineers require a significant amount of time and other organizational resources to write the custom code that is required for even straightforward data mining applications, so it generally makes good business sense to select a commercially available application from one of the third-party vendors described above. Likewise, outsourcing this function to a third-party vendor may also make good business sense, depending on the results of a cost-benefit analysis.
Two examples of businesses that successfully use the data mining process
Two prominent examples of businesses that currently use data mining to good effect are Amazon, which has routinely collected customer data as well as its competitors’ pricing data, and McDonald’s, which collects sales data from its tens of thousands of restaurants to assess the quality of the customer experience and identify opportunities for improvement (Peterson, 2016).