Faris (2013) speculates as to whether the NSA leaks will compromise big data's future. The article, published on the website Dataversity, notes that there is public concern about the NSA leaks, as consumers become more aware of just how much of their information is available to the government. The author calls into question the dichotomy of private and public data, in particular where corporate entities gather data that then finds its way into the hands of government. The author concludes by noting that big data is a growing field, but that some customers will remain wary.
The point Faris makes about customers being willing participants is valid, in that most people are perfectly willing to divulge information. Only when they overtly know that this information is being used to market to them might they have a problem with it, and even then only maybe. The Nordstrom anecdote provides valuable evidence that consumers tend to separate online life from real life: invasions that seem reasonable online are not acceptable in real life, probably because their physical body is present. This is an interesting concept that could be explored in greater detail in research studies about big data.
Harris (2012), writing on the website Gigaom, discusses the use of big data in the political domain. He describes the 2008 Obama campaign in particular as a landmark for the use of big data, which was used to raise money and build a strong grassroots element for the campaign. Politicians are major ad buyers, so it is only natural, Harris argues, that big data becomes a political tool. Ads can be created specifically for the recipient, even online, highlighting one of the uses of big data. The author notes that while voters say they are turned off by targeted ads, marketers can easily mask the fact that a person is being targeted.
Given the importance of advertisements to politics, it is not surprising that big data has become involved in a big way. Harris implies that big data will only become more important. We know that presidential elections are often seen as coming down to a handful of swing voters in key swing states; big data can help parties to refine the messages that they present to these voters, which has significant implications for democracy that Harris could discuss in his next examination of the subject.
Herodotou et al. (2011) introduce Starfish, a self-tuning system for big data analytics. The authors note that big data analytics is becoming increasingly complex over time. Such analytics systems are difficult to tune for best performance, so Starfish allows the system to "self-tune" to deliver the best results. The authors discuss some of the challenges and benefits of this approach.
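The general intuition behind self-tuning can be sketched simply. The sketch below is an illustrative assumption, not Starfish's actual cost-based optimizer: it just tries candidate configurations, measures each run, and keeps the fastest. The `run_job` and `candidate_configs` names are hypothetical.

```python
import time

def self_tune(run_job, candidate_configs):
    """Illustrative self-tuning loop (NOT Starfish's real algorithm):
    run the job under each candidate configuration, time it, and
    return the configuration with the lowest measured runtime."""
    best_config, best_time = None, float("inf")
    for config in candidate_configs:
        start = time.perf_counter()
        run_job(config)                      # execute the job with this configuration
        elapsed = time.perf_counter() - start
        if elapsed < best_time:              # keep the fastest configuration seen so far
            best_config, best_time = config, elapsed
    return best_config
```

Starfish itself avoids brute-force trial runs by modeling job costs, but the goal is the same: relieve the user of hand-tuning dozens of interacting parameters.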
This article is important for two reasons. The first is that it highlights where the technology is with respect to big data. The second is that it identifies some of the key issues that constrain the performance of big data analytics systems. Computer scientists are working to resolve these problems and hopefully unlock the power of big data.
Kantorwitz (2012), writing on a Forbes blog, discusses the role of big data in Presidential elections. Noting that the Obama re-election team advertised publicly for big data analysts, he attempts to sort out how big data is used in election campaigns. Kantorwitz identifies something akin to a credit score for politics, which aggregates a variety of demographic information that is correlated with voting patterns. This helps parties to form a profile of an individual voter, and in that way ensure that the right voters are being targeted. The campaigns are specifically looking for turnout targets and persuasion targets. The former are not habitual voters, but already support the party; the key is to ensure that these people vote. The latter are people who are undecided, but likely to vote. This microtargeting uses big data to sort people into these different categories, so that campaign efforts can be as focused and efficient as possible.
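The sorting rule Kantorwitz describes can be sketched in a few lines. This is a minimal illustration of the two categories he names, assuming simple boolean inputs; real campaign models use aggregated demographic scores, not flags like these.

```python
def classify_voter(supports_party: bool, votes_habitually: bool, undecided: bool) -> str:
    """Sort a voter into a microtargeting category (illustrative sketch only)."""
    if supports_party and not votes_habitually:
        return "turnout target"      # already supportive; the goal is getting them to the polls
    if undecided and votes_habitually:
        return "persuasion target"   # likely to vote; the goal is winning them over
    return "low priority"            # e.g. reliable supporters or decided opponents
```

Even this toy version shows why the approach is efficient: outreach dollars go only to voters whose behavior the campaign can plausibly change.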
The key point in the article is that when journalists look at poll numbers, those poll numbers are not as sophisticated as the numbers that the campaigns have; therefore the campaigns are much more prepared with their strategies, and have a better sense of how campaigns are going to play out, than outside observers relying on public polls.
White (2014), writing on a blog, explains the basics of big data using the analogy of the breadcrumb: a little tidbit of data that you leave whenever you make a transaction. The challenge is to create a loaf of bread from millions of breadcrumbs, and that is the challenge that most interests big data practitioners. White also outlines how big data has been enabled by technological innovation. The article is a good introduction to the basic concept of big data and some of its real-world uses.
Chen (2012) discusses the impact of big data. Business intelligence and analytics have become a growing field, and Chen outlines the different stages of big data. Big data 1.0 is basically databases and other fairly simple, old-school techniques. Big data 2.0 came about when online data collection began, which allowed for far more data to be gathered and put to new uses. Chen argues that the 3.0 stage is now emerging with the so-called "Internet of Things," in which many devices become connected and contribute to the data-gathering process. Chen's work is important because it outlines the evolution of big data; read alongside White's article, it provides a comprehensive understanding of the concept and its history. Furthermore, Chen's 3.0 discussion sheds some light on the future direction of big data.
Cohen (2009) offers an outline of some newer analytical techniques -- at least as of five years ago. While dated today, the article covers MAD (Magnetic, Agile, Deep) analytics and provides an in-depth look at database practices, including agile design, parallel algorithms and density methods. It is useful for understanding where big data was five years ago; while the field has evolved, some of these concepts remain valid and valuable. The article is also a good source of technical information, as that is its primary thrust.
Zaslavsky (no date) discusses the Internet of Things concept. The author notes that the IoT "will comprise billions of devices that can sense, communicate, compute and potentially actuate." The architecture of the IoT is discussed in the article, along with some technical details and issues with respect to cloud-based management. The article is valuable because it identifies the role that the cloud plays in big data, and how big data is going to be transformed by the Internet of Things. The IoT is a major concept in big data 3.0, so it is important to have this technical understanding of how it works; insight into how big data works is just as important as knowing how it is used.
Tankard (2012) writes about security, one of the emerging issues with big data and especially with the IoT. If data is so important and valuable, then it needs to be secured. Big data includes not only personally sensitive information but also proprietary data -- IP, trade secrets, even national security knowledge. The author looks at different approaches to securing big data. This is an important article because of how important big data will be going forward; for it to reach its promise, this data must be secured.
Boyd and Crawford (2012) ask critical questions about big data. They look at both pragmatic issues -- will big data…