This paper examines the tension between the benefits of large-scale data mining and the privacy rights of individual users. It traces the shift from traditional data warehousing to relational databases and OLAP technology, explaining how this evolution has made it easier to track and profile user behavior. The paper considers the implications of legislation such as the 2002 Homeland Security Act, discusses how law enforcement agencies, advertisers, and hackers can exploit data trails left by users, and weighs whether legal rather than technical solutions offer the most viable path toward preserving privacy in an era of expansive data collection and analysis.
One of the key requirements of data mining is timely access to relevant data. However, privacy concerns arise in the collection, storage, mining, and analysis of large-scale databases containing personal or sensitive information. Particularly following the passage and subsequent renewal of the 2002 Homeland Security Act, individual Americans have grown increasingly concerned that their retrieval and use of information are being tracked, not simply through physical libraries, but also via the World Wide Web.
The concern is legitimate: users can leave an information imprint and compromise their privacy when accessing relevant data, and no one can guarantee that no data fingerprints remain on personal or public hard drives or in search engine records when users access information through large-scale data warehouses. For instance, a student researching hate groups could be unfairly targeted as a potential sympathizer with such organizations if he or she leaves a record on a university's hard drive.
Until recently, data warehousing focused primarily on the storing of large amounts of data, which made privacy less of a concern than it has become with the current shift toward relational databases. Relational databases are typically better equipped to handle ad hoc, speed-of-thought analytical querying for large user communities. This transition has significant implications for how personal information is handled and protected.
With the use of online analytical processing, otherwise known as OLAP technology, the privacy of individual users may be more easily compromised. Patterns of use are more readily formatted, tracked, and profiled. This is a favorable development for law enforcement agencies, but not necessarily for innocent computer users, even though users do benefit from OLAP technology in the short run because they can more easily focus and refine their queries.
However, the more widely such databases have come into use, the more effectively trackers have been able to create patterns and directed queries through masses of data. As a result, individual hackers, law enforcement agencies, and others, such as advertisers, are increasingly able to trace a user's virtual footprints across the web (Hyperion, 2000).
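To make the profiling concern concrete, the following is a minimal sketch of how OLAP-style roll-up and slice operations over an access log can turn scattered records into per-user profiles. It is an illustration only, not a description of any particular product: the log entries, the user labels, and the function names `roll_up`, `profile_users`, and `users_matching` are all hypothetical and do not come from the paper or from Hyperion's materials.

```python
from collections import Counter, defaultdict

# Hypothetical access log of (user, topic, month) rows.
# All values are invented for illustration.
ACCESS_LOG = [
    ("user_a", "hate groups", "2003-01"),
    ("user_a", "hate groups", "2003-02"),
    ("user_a", "civil rights law", "2003-02"),
    ("user_b", "gardening", "2003-01"),
    ("user_b", "hate groups", "2003-02"),
    ("user_c", "gardening", "2003-02"),
]

# Position of each dimension inside a log row.
DIMENSIONS = {"user": 0, "topic": 1, "month": 2}


def roll_up(log, *dims):
    """Count log rows grouped by the chosen dimensions (an OLAP-style roll-up)."""
    return Counter(tuple(row[DIMENSIONS[d]] for d in dims) for row in log)


def profile_users(log):
    """Pivot the log into one topic-count profile per user."""
    profiles = defaultdict(Counter)
    for user, topic, _month in log:
        profiles[user][topic] += 1
    return dict(profiles)


def users_matching(log, topic, min_hits=1):
    """Slice on one topic and list users with at least min_hits accesses."""
    counts = roll_up(log, "user", "topic")
    return sorted(
        user for (user, t), n in counts.items() if t == topic and n >= min_hits
    )


if __name__ == "__main__":
    print(roll_up(ACCESS_LOG, "topic"))
    print(profile_users(ACCESS_LOG)["user_a"])
    print(users_matching(ACCESS_LOG, "hate groups", min_hits=2))
```

In this sketch, a single directed query such as `users_matching(ACCESS_LOG, "hate groups", min_hits=2)` singles out one user, even though the log records nothing about why the material was accessed. That is the situation of the student researcher described earlier: the aggregation is easy, while the inference drawn from it may be unfair.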
"Law versus technology as privacy safeguards"
Users may be able to block advertising and spam mail directed toward their email addresses as a result of their search patterns, but more serious legal ramifications may result, and these would require additional technical precautions that do not yet exist, or are not yet feasible on a wide-scale basis. Truly creating a safe and private search environment for law-abiding Internet users remains an unresolved challenge at the intersection of technology and policy.