A data dictionary is a detailed description of all the tables that make up a database. It records the names and characteristics of every attribute in every table in the database system. A data dictionary is also called a metadata repository because it holds information about the data itself, such as its meaning, usage, format, origin, and relationships with other data. According to Coronel, Morris, and Rob (2011), the data dictionary is a vital component of any database management system because it is used to determine the database structure.
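As a rough illustration of what a data dictionary records, the sketch below reads table and column metadata from SQLite's built-in catalog. The tables `department` and `employee` and the helper `build_data_dictionary` are hypothetical names chosen for this example; a production data dictionary would normally also hold descriptions, usage notes, and origins.

```python
import sqlite3

def build_data_dictionary(conn):
    """Return {table: [(column, declared_type, is_primary_key), ...]}.

    Hypothetical helper: it only reads what SQLite's catalog exposes.
    """
    dictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = [
            (name, col_type, bool(pk)) for _, name, col_type, _, _, pk in columns
        ]
    return dictionary

# Example schema (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))"
)

data_dictionary = build_data_dictionary(conn)
for table, columns in data_dictionary.items():
    print(table, columns)
```

Reading the structure from the catalog rather than maintaining it by hand keeps the dictionary in step with the actual tables.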
Data fragmentation is a characteristic of database management systems that allows a single object to be broken into two or more fragments or segments. The object being broken down might be a system or user database. Each fragment can then be stored at any location on a computer network, close to the users who need it. Fragmentation also gives users direct access to the relevant data instead of requiring them to filter through different tables.
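One common form of fragmentation is horizontal: the rows of a table are split by the value of one attribute, and the original table can be rebuilt as the union of the fragments. The sketch below shows the idea in plain Python; the `customers` data, the `site` attribute, and the function `fragment_horizontally` are assumptions made for the example, not part of any particular database product.

```python
from collections import defaultdict

def fragment_horizontally(rows, key):
    """Split rows into fragments, one per distinct value of `key`."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)

# Hypothetical customer table, kept in memory for illustration.
customers = [
    {"id": 1, "name": "Ana", "site": "north"},
    {"id": 2, "name": "Ben", "site": "south"},
    {"id": 3, "name": "Cy", "site": "north"},
]

# Each fragment could be stored at the network location of its site.
fragments = fragment_horizontally(customers, "site")

# The union of all fragments reconstructs the original table.
reconstructed = sorted(
    (row for fragment in fragments.values() for row in fragment),
    key=lambda row: row["id"],
)
```

A user at the north site then reads only `fragments["north"]` rather than scanning the whole table.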
In a data warehouse environment, the access layer is referred to as a data mart. It is used for delivering data to the users. A data mart is a subset of the data warehouse and is oriented towards a specific group of people, to whom it also provides decision support. In some deployments, each department within a business owns its own data mart, including the software, hardware, and data. This allows each department to develop and manipulate its data as it sees fit. The modifications a department makes do not alter the information held in the data warehouse or in other data marts.
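The isolation described above can be sketched with SQLite by copying one department's rows into a separate table. This is a minimal illustration that assumes the mart is a physical copy; the table names `warehouse_sales` and `marketing_mart` and the figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table shared by all departments.
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "id INTEGER PRIMARY KEY, department TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales (department, amount) VALUES (?, ?)",
    [("marketing", 120.0), ("finance", 300.0), ("marketing", 80.0)],
)

# The data mart: a copy of the subset relevant to one department.
conn.execute(
    "CREATE TABLE marketing_mart AS "
    "SELECT id, amount FROM warehouse_sales WHERE department = 'marketing'"
)

# The department adjusts its own copy; the warehouse rows are untouched.
conn.execute("UPDATE marketing_mart SET amount = amount * 1.1")

mart_total = conn.execute("SELECT SUM(amount) FROM marketing_mart").fetchone()[0]
warehouse_total = conn.execute(
    "SELECT SUM(amount) FROM warehouse_sales WHERE department = 'marketing'"
).fetchone()[0]
```

After the update, `mart_total` reflects the department's adjustment while `warehouse_total` still reports the original figures.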
The process of discovering patterns in large data sets is referred to as data mining. Its main goal is to transform data into an understandable structure that can be used for decision making. It employs automated tools to analyze the raw data in a data warehouse. Using data mining tools, one is able to identify possible anomalies and relationships in the data.
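As a very small example of spotting anomalies in raw data, the sketch below flags values that lie more than a chosen number of standard deviations from the mean. Real data mining tools use far richer techniques; the function `find_anomalies`, the threshold of 2.0, and the `daily_sales` figures are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily sales figures with one obvious outlier.
daily_sales = [100, 102, 98, 101, 99, 100, 103, 97, 500]
anomalies = find_anomalies(daily_sales)
print(anomalies)
```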
Hebda and Czar (2009) posit that data redundancy occurs when the same data is stored in different locations within the same data warehouse. Such replication is unnecessary. Data redundancy occurs mostly in database systems that repeat a field in two or more tables. It can lead to data corruption and anomalies, so it should be avoided during database design. Storing each fact once and linking tables with foreign keys reduces data redundancy.
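The effect of foreign keys on redundancy can be sketched with SQLite. In this example the department name is stored once and employees refer to it by key, so a rename needs a single update. The schema and names are hypothetical, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Hypothetical schema: the department name lives in exactly one place.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Radiology')")
conn.executemany(
    "INSERT INTO employee (name, dept_id) VALUES (?, ?)",
    [("Ana", 1), ("Ben", 1)],
)

# One update renames the department for every employee that references it.
conn.execute("UPDATE department SET name = 'Imaging' WHERE dept_id = 1")
rows = conn.execute(
    "SELECT e.name, d.name FROM employee e "
    "JOIN department d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()

# A reference to a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee (name, dept_id) VALUES ('Cy', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Had the department name been repeated in every employee row, the rename would have required updating each row, and a missed row would leave the data inconsistent.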
Similarities of the data models
These models all act on data held within a data warehouse. They are similar in that each helps to keep the data stored in the warehouse consistent. Eliminating inconsistencies ensures that the data is not corrupted and has no anomalies, which results in a clean database. The data dictionary and the data mart are related in that the dictionary contains information about all the tables and structures found within a database, while the data mart is geared towards a specific group or department. Without the data dictionary, one would not know which tables are geared towards which department.
The data dictionary helps to eliminate anomalies by providing a structure and description of all the tables and fields within the database. Data fragmentation lets different users view only the data relevant to them, which reduces the risk of anomalies and corruption. Reducing the number of fields a person or group can see makes it less likely that they will edit fields or tables by mistake. With a reduced view of the database, there is little chance that a user will corrupt any data. The data mart and data fragmentation are similar in function, as both reduce the number of fields or tables that a user can access. The data mart allows a specific group to access only the data relevant to it, and the group can make alterations without affecting the overall data in the warehouse. The group can then manipulate the data to fit its specific needs.
Controlling data redundancy shares similarities with maintaining a data dictionary. Both help to ensure that there is no unnecessary replication of data. To prevent redundancy, the designer needs to make good use of foreign keys. Foreign keys allow data to be picked from other tables without replicating it. This way the data remains consistent, and the risk of data corruption is reduced.
Data mining is used to discover and make sense of the data found within the data warehouse. Without data mining tools, users would struggle to understand the meaning of the raw data. To make full use of the data mart and data fragmentation, users therefore need data mining tools, which allow them to make sense of the data stored. The tools also help users to understand the data so that, if they need to make any changes, they know which tables and fields to change. The data mart and data mining are both used for decision support.
Differences of the data models
The data models are all used in a data warehouse, but they serve different purposes. While the data mart provides an access layer for different groups, data fragmentation breaks a single object into two or more fragments in order to provide ease of access to those groups. A data mart is used to deliver specific data to users. Data fragmentation breaks objects up and stores them in different locations within the network, which improves the performance of the database. Breaking up objects gives users access to only the data they request, and this speeds up data fetching because only the relevant tables are fetched and not the whole database. The data mart, on the other hand, delivers specific tables to the users and allows them to modify the data without affecting the whole database (Abdelhak, Grostick, & Hanken, 2012).
In order to get relevant information from the database one would need to use data mining tools. Data fragmentation and data mart deliver data to the user, but in order to understand the raw data one will need to use data mining tools. Filtering out unnecessary data will allow a user to make sense of the data delivered to them by the other models. Data mining is different from data redundancy as it is only concerned with data processing and does not check for data replication.
Data redundancy should be reduced so that data is not replicated, since replication can cause errors and corruption. The data dictionary defines the structure of the database and the relationships between the various tables. Without a data dictionary, the database design would be faulty and would likely lead to data redundancy. The two differ in function, though both are used for eliminating anomalies: the data dictionary provides a schema for the whole database, while controlling data redundancy ensures that there is no data replication within the database.
Functions of each model
In order to have a good database management system, one needs to employ these models. The data dictionary allows the designer to establish whether there will be any redundancies within the database, in which case the designer will need to employ foreign keys within the database tables. The database structure is developed using the data dictionary. Having descriptions of the various tables and fields within the database allows the designer to establish the relationships between the tables early on, which helps to eliminate data redundancy. The data dictionary also allows other people to understand what data each table holds, which assists the developers during database design. If the designer is not available, other people can refer to the dictionary to identify the various tables and their uses.
Data fragmentation is used to break an object into two or more objects. This allows data to be shared between different groups while ensuring that each group has access only to its relevant data. Fragmentation enables data to be stored in different locations within the network, which improves database performance. Breaking objects into smaller objects allows users to fetch only the required tables and not the whole database, which protects data integrity because erroneous modifications become less likely.
The data mart delivers data to users. The data delivered can be manipulated and modified without affecting the data warehouse (Hinchcliff et al., 2012). This way different groups can access the same data without corrupting it, since changes made by one group are not applied to the others. The data mart is also used for decision support. Users can use the data they have access to in order to understand their situation and make decisions.