
Integrating Heterogeneous Data Using Web Services

Last reviewed: May 9, 2011

… solution of the heterogeneous data integration problem is presented, with an explanation of the criteria to be employed in assessing its validity. The tools to be used are also indicated.

The proposed solution is to use semantic web technologies, specifically the Semantic Data Integration Middleware (SIM) architecture, for the initial integration process (Cardoso, 2007) and then to couple it with a broker architecture to improve integration and interoperability while solving the problem of multi-level impedance (Kashyap and Sheth, 2002).

For a detailed diagram, see the figure below.

Integration via the semantic web technologies

According to Barnett and Standing (2001), rapid developments in the business environment following the adoption of internet-based technologies have created the need for improved business models, improved network systems and alliances, and creative marketing strategies. The strategy developed for integrating heterogeneous data must take into account both organization-specific data and general information available on the internet. The overall idea is to arrive at a semantic web that benefits individuals and organizations alike. In pursuit of competitive advantage, organizations employ business-mediated channels to create internal and external value. They do so by formulating technology-convergent strategies (through heterogeneous data integration) and by organizing resources around knowledge and the relationships that exist between knowledge bases, as pointed out by Rayport and Jaworski (2001). Internal and external value is thus created from the information available and from the organization of knowledge-related resources and their relationships. This requires organizations to identify their various data assets. The data assets could take the form of relational databases, plain text files, web pages, XML files, Electronic Data Interchange (EDI) documents and web services.

The proposed solution for this project should be able to integrate information from heterogeneous, autonomous and distributed (HAD) data sources. As pointed out by Ouksel and Sheth (1999), three forms of heterogeneity can arise. The first is syntactic heterogeneity, in which the technologies supporting the data sources differ (such as databases and web pages). To provide transactional data, it is important to make use of the Extensible Markup Language (XML), since it provides consistent and reliable XML streams and web services (XML, 2005). The second is schematic heterogeneity, in which the data source schemas have different structures. Semantic heterogeneity is the third form that the proposed solution must address. XML is to be used to provide syntactic interoperability (Bussler, 2003). Its downside is that it lacks the semantics required by the current web environment (Shabo et al., 2006). The proposed solution should solve the semantic heterogeneity problem by enabling autonomous, heterogeneous and distributed systems to share and exchange information in a semantically meaningful manner, as pointed out by Sheth (1998). The solution is to employ the capabilities of the semantic web through the concept of a shared ontology. One of the main benefits of semantic web services is their ability to meet the organizational need for integrating data from semantically dissimilar sources. The fact that semantic web services have been deployed successfully in bioinformatics, digital libraries and other fields is a strong motivator for this project.

The solution to data integration in this project entails the use of the Semantic Data Integration Middleware (SIM) and its subsequent integration with the broker architecture to improve integration and interoperability. This serves as a means of resolving multi-level impedance and achieving unified data integration.

Semantic data Integration Middleware (SIM)

This is a data integration technique based on a single query. It integrates information that resides in different data sources with dissimilar structures, formats, schemas and semantics. A data wrapper, or extractor, is used to transform the data into semantic knowledge. The middleware extractor is ontology-based and multi-sourced, as pointed out by Silva and Cardoso (2006). The SIM is made up of two main modules: 1) the Schematic Transformation module and 2) the Syntactic-to-Semantic Transformation module (Cardoso, 2007).

3.2 Schematic Transformation module

The Schematic Transformation module is responsible for integrating data that resides in different data sources with dissimilar formats, schemas and structures.

Syntactic-to-Semantic Transformation module

This module maps XML Schema documents to an existing OWL ontology. It is also responsible for automatically transforming XML instance documents into individual instances of the mapped ontology, as pointed out by Rodrigues et al. (2006). The module is critical for transforming XML-based syntactic data into semantic data by means of OWL.
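As a rough illustration of what such a transformation involves, the Python sketch below turns an XML instance document into subject-predicate-object triples under a hand-written mapping. It does not use JXML2OWL. The mapping tables, the `ex:` prefix and the function `xml_to_triples` are hypothetical names chosen for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping (an assumption of this sketch):
# XML element name -> ontology class or datatype property.
CLASS_MAP = {"car": "ex:Vehicle"}
PROPERTY_MAP = {"brand": "ex:hasBrand", "price": "ex:hasPrice"}


def xml_to_triples(xml_text):
    """Create one individual per mapped element and one triple per mapped child."""
    root = ET.fromstring(xml_text)
    triples = []
    counter = 0
    for elem in root.iter():
        owl_class = CLASS_MAP.get(elem.tag)
        if owl_class is None:
            continue  # element has no class mapping, so it yields no individual
        counter += 1
        subject = f"ex:{elem.tag}{counter}"
        triples.append((subject, "rdf:type", owl_class))
        for child in elem:
            prop = PROPERTY_MAP.get(child.tag)
            if prop is not None and child.text:
                triples.append((subject, prop, child.text.strip()))
    return triples


SAMPLE = "<cars><car><brand>Volvo</brand><price>20000</price></car></cars>"
```

Calling `xml_to_triples(SAMPLE)` yields one `rdf:type` triple for the car and one triple for each mapped child element. A real implementation would emit OWL individuals rather than plain tuples.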

3.2.1 The Semantic data Integration Middleware (SIM) architecture

The Semantic Data Integration Middleware (SIM) architecture is important for integrating heterogeneous information because it addresses the lack of semantics inherent in XML data schemas and representation. Our choice of semantic data representation stems from the fact that it marks the most current and most efficient state of data representation (Cardoso, 2007, p. 2). The SIM architecture is illustrated in the figure below.

Figure 1: The SIM architecture (Source: Cardoso, 2007).

The SIM architecture has four main layers: the data sources, the Schematic Transformation layer, the Syntactic-to-Semantic Transformation layer and finally the ontology layer. The relationship between these layers is shown in the diagram above.

Sources of data (D)

The data sources dictate the scope of the information integration system. The diversity of the data sources provides an enhanced level of data visibility. The SIM architecture connects unstructured sources (such as plain text and web pages), semi-structured sources (XML) and structured sources (such as relational databases). The data sources can also include other formats not mentioned here.

The schematic transformation

The schematic transformation of the data sources (D) to XML is executed by a module that integrates data from different sources with different structures, formats, database schemas and semantics. The module employs a multi-source data extractor to transform the available data to XML.
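A minimal sketch of this idea, assuming two toy sources (an in-memory SQLite table and a comma-separated text fragment), is shown below. The function names and sample data are illustrative assumptions and are not part of the SIM implementation described by Cardoso (2007).

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="records", item_tag="record"):
    """Serialize a list of dicts into a single XML document."""
    root = ET.Element(root_tag)
    for rec in records:
        item = ET.SubElement(root, item_tag)
        for key, value in rec.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def extract_csv(text):
    """Text source: comma-separated lines with a header row."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_sqlite(conn, query):
    """Structured source: rows from a relational database."""
    cur = conn.execute(query)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


# Toy data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (brand TEXT, price INTEGER)")
conn.execute("INSERT INTO car VALUES ('Volvo', 20000)")

records = extract_sqlite(conn, "SELECT brand, price FROM car")
records += extract_csv("brand,price\nSaab,18000\n")
xml_doc = records_to_xml(records, "cars", "car")
```

Both sources end up in the same XML structure, which is the point of the schematic transformation: downstream modules only need to understand one format.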

The transformation from Syntactic to Semantic

This process is carried out by a module that employs the JXML2OWL framework to map the XML Schema to existing OWL ontologies. The module transforms XML instance documents into individual instances of the mapped ontology.

The Ontologies (OWL)

The SIM architecture makes it possible to extract data from various sources of different types (structured, semi-structured or unstructured) and then wrap the outcome in the Web Ontology Language (OWL) format (OWL, 2004). The importance of this is that it provides homogeneous access to otherwise heterogeneous data sources. OWL was adopted because it is endorsed by the World Wide Web Consortium (W3C).

The semantic model

NIST (1993) described a semantic data model as a conceptual data model in which semantic information is included. This implies that the model describes the meaning of its instances. The semantic model is therefore an abstraction that defines how the instance data (stored symbols) relate to the real world. To conceptualize a given area in a machine-readable format, an ontology language such as OWL is employed. The function of the ontology is to promote and facilitate system interoperability, to enable intelligent processing and to reuse available knowledge. The ontology therefore provides a shared understanding of a given domain.

The ontology schema defines both the data structure and the semantics. The extraction process can proceed without a schema. The ontology is important for creating the mapping between the schema and the data sources. The ontology also provides the specification of the query. As Rodrigues et al. (2006) pointed out, the framework employed is JXML2OWL, which has two subsystems: the JXML2OWL Mapper and the JXML2OWL API. The JXML2OWL API is a generic, reusable, open-source library used to map XML schemas to OWL ontologies. The Mapper, on the other hand, is a Java-based application with a graphical user interface (GUI).

The documents that JXML2OWL can map to an OWL ontology are DTD, XML and XSD. The mapping proceeds in a series of steps. The initial step is to create a new mapping project and load the XML schema and the OWL ontology. Should the XML schema be missing, JXML2OWL generates an appropriate schema. The user then creates the class mappings between elements of the loaded XML schema and the ontology classes. Once class mappings exist, they can be related to one another to create object property mappings. Property mappings are obtained by relating the XML schema elements of the mapped objects. In the final step, the transformation rules generated from the mapping can be exported.

The semantic integration of information is a very costly and difficult task. The Semantic Web can assist in integrating multiple heterogeneous data schemas by mapping them to one or more ontologies. The main aim of the SIM architecture is to present a common understanding of a given subject and to integrate the heterogeneous systems by means of Semantic Web technology. The whole process is supported by an ontology schema that offers a semantic representation of data and gives users access to shared data processed by highly automated tools.
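The JXML2OWL mapping workflow described above (load the schema and ontology, create class mappings, relate them through property mappings, export the rules) can be pictured with the following Python sketch. The class `MappingProject` and its methods are hypothetical stand-ins written for this paper. They are not the JXML2OWL API, and the exported rule strings are an invented notation.

```python
from dataclasses import dataclass, field


@dataclass
class MappingProject:
    """Hypothetical stand-in for a mapping project; NOT the JXML2OWL API."""
    xml_schema: set            # XPath-like element paths found in the schema
    ontology_classes: set
    ontology_properties: set
    class_mappings: dict = field(default_factory=dict)
    property_mappings: list = field(default_factory=list)

    def map_class(self, xml_path, owl_class):
        """Step 2: relate a schema element to an ontology class."""
        if xml_path not in self.xml_schema or owl_class not in self.ontology_classes:
            raise ValueError("unknown XML path or ontology class")
        self.class_mappings[xml_path] = owl_class

    def map_property(self, domain_path, prop, range_path):
        """Step 3: properties can only relate elements that are already class-mapped."""
        if domain_path not in self.class_mappings:
            raise ValueError("map the domain class first")
        if prop not in self.ontology_properties:
            raise ValueError("unknown ontology property")
        self.property_mappings.append((domain_path, prop, range_path))

    def export_rules(self):
        """Step 4: list the transformation rules implied by the mappings."""
        rules = [f"{p} -> instance of {c}" for p, c in self.class_mappings.items()]
        rules += [f"{d} -[{p}]-> {r}" for d, p, r in self.property_mappings]
        return rules


# Step 1: create the project with a (toy) schema and ontology loaded.
project = MappingProject({"/cars/car", "/cars/car/brand"}, {"Vehicle"}, {"hasBrand"})
project.map_class("/cars/car", "Vehicle")
project.map_property("/cars/car", "hasBrand", "/cars/car/brand")
rules = project.export_rules()
```

The ordering constraint in `map_property` mirrors the text: property mappings are only possible once the class mappings they relate have been created.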

3.2.1.1 The components of the SIM architecture

The Schematic transformation

The middleware is responsible for the integration of data. It should let users concentrate on what information is needed, while the details of how the information is obtained and integrated remain concealed from them, as pointed out by Silva and Cardoso (2006). In a nutshell, the data integration system should provide a mechanism for communicating seamlessly with autonomous data sources, performing queries on heterogeneous data sources and aggregating the results into an interoperable data format. The main challenge therefore lies in bridging the schematic, syntactic and semantic differences that exist between the data sources, in other words in tackling data source heterogeneity. As mentioned earlier, three forms of heterogeneity are possible when integrating data from autonomous, heterogeneous and distributed data sources: schematic heterogeneity (the schemas of the data sources differ), syntactic heterogeneity (the technologies supporting the data sources differ) and semantic heterogeneity (the data sources have different nomenclatures, concepts, meanings and vocabularies). The function of the Schematic Transformation module is to integrate the data coming from different data sources and to resolve syntactic and schematic heterogeneity. Semantic heterogeneity is resolved by the Syntactic-to-Semantic Transformation module.

The architecture

Cardoso (2007) presented the architecture of the Schematic Transformation module (see Figure 1). It comprises two main components. The first is the Extractor Manager, which connects to the different data sources recognized by the system and performs data extraction operations on them. The extracted data fragments are then compiled to generate ontology instances. The second is the mapping module, which maps the data sources to the ontology schema. This mapping information results from crossing the ontology classes and attributes with the various data sources. It is used to form the data extraction schema that the extractor module uses to retrieve data from the various sources.

The architecture also has a module called the Query Handler, which receives and handles the multiple queries issued against the data sources. Another module is the Instance Generator, which reports any error that occurs in the process of data query and extraction. A further important module is the Ontology Schema, which maps the data appropriately.

The module for mapping data

The mapping module makes it possible to map a remotely located data source to the ontology that exists on the local machine. The mapping is performed by crossing information between the data sources and the XML schema. Depending on the characteristics of the data source, the process may give rise to two extraction scenarios. The data source may contain a single instance (such as a document describing one vehicle model) or multiple data records (such as a document describing several vehicle models). The scenario determines the nature of the mapping and of the data extraction.

The Extractor manager module

The main function of this module is to handle the data sources and retrieve the raw data specified by the query parameters. Extraction techniques vary according to the data source, which means the extractor must support different methods of extraction. The mapping and extractor architectures are open, allowing seamless extension of the supported data types, extraction methods and languages.
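One common way to keep such an extractor open to new source types is a registry of extraction methods keyed by source type. The sketch below assumes this design; it is not taken from the SIM implementation. The registry `EXTRACTORS`, the decorator `extractor` and the two sample methods are hypothetical.

```python
import json
import re

# Registry of extraction methods, keyed by source type (an assumed design).
EXTRACTORS = {}


def extractor(source_type):
    """Register an extraction method for one kind of data source."""
    def register(func):
        EXTRACTORS[source_type] = func
        return func
    return register


@extractor("json")
def extract_json(raw, attribute):
    return json.loads(raw).get(attribute)


@extractor("text")
def extract_text(raw, attribute):
    # Assumes plain text of the form "attribute: value", one pair per line.
    match = re.search(rf"^{re.escape(attribute)}:\s*(.+)$", raw, re.MULTILINE)
    return match.group(1).strip() if match else None


def extract(source_type, raw, attribute):
    """Dispatch to whichever method is registered for the source type."""
    try:
        method = EXTRACTORS[source_type]
    except KeyError:
        raise ValueError(f"no extractor registered for {source_type!r}")
    return method(raw, attribute)
```

Supporting a further source type then only requires registering one more function; `extract` itself does not change.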

The Schematic Transformation module accomplishes its role in three steps: it obtains the schema of the data to be extracted, then obtains the definition of the data source, and finally extracts the data. After the system processes a query, it performs data extraction to fulfill the query. Extraction is carried out on the basis of attributes. The extractor retrieves the extraction schemas of the desired attributes, which tell it how to carry out the extraction. The attributes are associated with data sources, and each data source has its own connection characteristics, so the extractor must determine how to connect to each data source. After the extraction schema is retrieved, the extractor obtains the definition of the associated data source in order to access it. Extraction can then proceed. The extraction process is mediated and involves the use of wrappers and extractors.

The generation of instances

Instances are created by a module called the Instance Generator. The module serializes the output data format and handles errors. The Schematic Transformation module converts unstructured, semi-structured and structured formats to the eXtensible Markup Language (XML). The generation of XML instances is automatic, because the extracted information conforms to the XML schema.

The handling of queries

Queries are handled by a special module known as the Query Handler. Suciu (2003) defines a query as a generic transformation of databases; in other words, it is a function that maps one relation to another. Queries are the events that set the Schematic Transformation module in motion. The input is based on a higher-level semantic query language. The query is then converted into a set of requests expressed in terms of XML elements. The extraction module and the query handler communicate via the Syntactic-to-Semantic Query Language (S2SQL), as pointed out by Cardoso (2007). S2SQL is a simplified SQL in which the location of data is transparent from the point of view of the query.
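The exact grammar of S2SQL is not reproduced in this paper, so the sketch below assumes a toy, location-free syntax of the form `SELECT attributes WHERE condition` (with no FROM clause) purely to show how a query might be converted into requests expressed as XML elements. The regular expression, the function `query_to_request` and the `request`, `attribute` and `condition` element names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed toy syntax: SELECT a, b [WHERE condition]. There is no FROM clause,
# because the location of the data is meant to be hidden from the query.
QUERY_RE = re.compile(
    r"^\s*SELECT\s+(?P<attrs>[\w\s,]+?)(?:\s+WHERE\s+(?P<cond>.+))?\s*$",
    re.IGNORECASE,
)


def query_to_request(query):
    """Turn a location-free, SQL-like query into an XML request document."""
    match = QUERY_RE.match(query)
    if match is None:
        raise ValueError("unsupported query")
    request = ET.Element("request")
    for name in match.group("attrs").split(","):
        ET.SubElement(request, "attribute", name=name.strip())
    if match.group("cond"):
        ET.SubElement(request, "condition").text = match.group("cond").strip()
    return ET.tostring(request, encoding="unicode")
```

For a query such as `SELECT brand, price WHERE model = 'V70'`, the result is a `request` element with one `attribute` child per requested attribute and a single `condition` child. It would then be up to the mapping information to decide which data source serves each attribute.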

The transformation of syntactic data to a semantic one

The weaknesses of XML make it necessary to devise better techniques of data integration. This problem can be handled effectively by adopting Semantic Web technologies such as RDF, RDFS and OWL ontologies. The function of the ontologies is to provide a semantic definition to be used in the integration of data. A module transforms the syntactic information infrastructure defined in XML into a semantic data infrastructure by means of an OWL ontology. This module has mapping support and is fully automated.

3.3 The broker architecture

What is a context broker?

A broker is a central medium that facilitates the transfer of information. It is a common address, or gateway, used by various clients to access various services. The role of the broker is to interact with one or more server applications. The main roles of a context broker are to receive SOAP requests from client applications in XML format and to initiate calls to the various server applications. The composed calls include the list of arguments for data input as well as the sequencing instructions for calling the server applications.
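To make this concrete, the sketch below parses a SOAP 1.1 envelope with Python's standard library and calls the named server applications in document order, passing each call's arguments. The two services, their names and the convention that earlier results feed later calls are assumptions made for illustration; a real broker would invoke remote applications.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical server applications reachable through the broker.
SERVICES = {
    "lookupPatient": lambda args: {"patient_id": args["name"].lower()},
    "fetchRecords": lambda args: {"records": [f"record for {args['patient_id']}"]},
}


def handle_soap(envelope_xml):
    """Parse the SOAP body and run the listed calls in sequence."""
    body = ET.fromstring(envelope_xml).find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("SOAP Body missing")
    context = {}
    for call in body:  # document order is taken as the calling sequence
        args = {arg.tag: arg.text for arg in call}  # argument list for this call
        args.update(context)  # assumption: earlier results feed later calls
        context.update(SERVICES[call.tag](args))
    return context


REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <lookupPatient><name>Smith</name></lookupPatient>
    <fetchRecords/>
  </soap:Body>
</soap:Envelope>"""
```

Here `handle_soap(REQUEST)` first resolves the patient and then fetches the records for that patient, which is one simple reading of "sequencing instructions".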

The aim of this section is to support decision-making processes in which multimodal information is obtained from a group of diverse, independent agencies. The field of health and social care has been selected because it offers practical examples of every problem for which this research seeks technical solutions. The prevailing IT approach is largely mandatory, top-down and based on large IT systems that span every contributing organization. These organizations include general medical practitioners and acute hospitals, as well as specialist services such as pathology. In the United Kingdom, community services are also included, because the aim is to provide "seamless" services to a hospital patient who is later discharged and then cared for by local social services.

Data warehousing is one of the approaches applied (Widom, 1995). In this approach, the operational data sources are gathered into a very large, centralized data store. This 'filing cabinet' approach has the drawback of data duplication. Additional complications arise when the underlying data is 'owned' by different organizations. Large-scale, fully integrated IT systems have failed on numerous occasions to deliver solutions because the systems were difficult to construct, manage and evolve. For that reason, in a rapidly evolving environment, alternative solutions have been proposed that adapt to continuous change and preserve the responsibilities associated with owning confidential data, while still permitting a global view of the distributed data sources.

Hospitals, primary care practices and community health services are autonomous organizations. Each has its own information systems and strict rules about who may access the information. However, treating a patient requires access to every record relevant to the case. The main aim of this research is to explore the integration of data that comes from numerous sources and is inherently heterogeneous, for instance in terms of format, semantics, meaning, importance, quality, ownership, cost and ethical control. If data is created, modified and stored separately, its integration requires binding on demand. In human-centered data management, this task is frequently performed by a human broker (such as a travel agent), who is capable of integrating data on demand. This research uses a similar model. The function of IBHIS is to provide an information brokering service that supports the seamless integration of data held and managed by various autonomous agencies. The approach may be applicable in numerous fields besides healthcare (travel, entertainment and others). The project is evaluated through the development of several prototype brokers.

IBHIS architecture (Bennett et al., 1993).

In brokerage, the distributed data sources act as a global resource that can be used to query the relationships between or within the sources. This allows the existing data sources to continue their operational activity on their own information and to retain ownership of it. Brokerage builds on interoperability and information integration to offer mediation that resolves impedance at multiple levels (Kashyap and Sheth, 2000). The broker approach offers several advantages. First, it supports multiple autonomous data sources. Second, it deals with semantic, syntactic and system heterogeneity. Third, it deals with globally distributed information. Finally, it provides a path towards the discovery of, and access to, information resources.

The broker attempts to provide its users with a customized, need-to-know view of the most relevant information, at the time and place where it is needed, using information obtained with authorization from the autonomous systems. There are several approaches to the problem of integrating diverse information sources, and the choice depends on constraints such as autonomy, security and performance. The mediated approach has diverse implementations, including agent-based systems, knowledge-based data brokers, web data retrieval, brokering systems and federated systems. A comprehensive survey of such systems was published by Paton et al. (2000). In the health domain, numerous information brokers and mediated systems have been used. Grimson et al. (2001) presented integrated views of patient data from distributed, heterogeneous information systems by means of the federated approach. The health service is a very complex domain. Therefore, this research focuses on three main areas:

The first is the degree to which broker architectures can support the integration of heterogeneous data into a single view of information about a patient.

The second is the manner in which privacy and security ought to be handled in a strong ethical setting. Such properties must be backed by a very robust audit function.

The third is the degree to which the evolution of the health service's information technology can be supported.

The research follows a three-stage experimental approach. The first stage, which has been implemented and is reported here in detail, is a federated schema approach that uses a passive broker with data access services. The second stage is a service-based approach using a passive broker. The third stage is a service-based approach using an active broker. A "passive" broker requests information on demand and offers a customized view of the data. An "active" broker additionally notifies its user of important changes. The essential concept is that the user puts questions to IBHIS; IBHIS then interrogates the 'local' information access services and combines the results.
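The difference between the two kinds of broker can be sketched in a few lines of Python. The classes below are illustrative only and are not the IBHIS implementation; the service callables and the notification callback signature are assumptions.

```python
class PassiveBroker:
    """Answers only when asked: queries each local service and combines the results."""

    def __init__(self, services):
        self.services = services  # name -> callable(patient_id) -> data

    def query(self, patient_id):
        view = {}
        for name, service in self.services.items():
            view[name] = service(patient_id)
        return view


class ActiveBroker(PassiveBroker):
    """Additionally pushes a fresh view to subscribers when a source reports a change."""

    def __init__(self, services):
        super().__init__(services)
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def data_changed(self, source, patient_id):
        # Hypothetical hook that a data source would call after an update.
        for notify in self.subscribers:
            notify(source, self.query(patient_id))
```

With the passive broker the user must call `query` to see anything new. With the active broker, a call to `data_changed` by a data source is enough for every subscriber to receive the combined view.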

3.3.1 Analysis of the Context Broker Architecture

The context broker is the core of the Context Broker Architecture (CoBrA). Its roles are as follows. It provides a centralized model of context that can be shared by all devices and services. It acquires contextual data from sources that are unreachable by devices with limited resources. It reasons about contextual data that cannot easily be acquired directly from the sensors. It detects and resolves inconsistencies in the knowledge held in the shared context model. Finally, it protects the privacy of the system user by enforcing user-defined policies that control the sharing and use of contextual data.

Context Broker

The design of the Context Broker (Source: Chen, Finin & Joshi, 2004)

The context broker is to run on stationary, resource-rich computers in an embedded environment. To enhance system flexibility, we employ UPnP as the service discovery architecture. The centralized context broker design is motivated by the need to support smaller devices that possess limited resources for acquiring context and reasoning about it. A context broker allows small devices such as mobile phones to offload the burden of context knowledge management onto larger, resource-rich computers (servers) that can reason about the context data and detect and resolve inconsistencies in the context knowledge. The same kind of offloading is used by some applications running on Android-based phones that support cloud computing. In environments that are both dynamic and open, centralized context knowledge management can be advantageous because it eases the implementation of privacy and the security of information flowing through the system. It is worth pointing out that a centralized broker can be a serious problem in a distributed system, because it creates a single point of failure that is difficult to diagnose and correct. To solve this problem, the proposed solution should make use of the Persistent Broker Team approach (Kumar, Cohen & Levesque, 2000). The idea is to use a team of brokers so that the brokers are never all down at once during service delivery. Should any broker become unreachable as a result of network issues, the remaining active ones will recruit or install new brokers. Within the broker team, a special protocol called the Joint Intention protocol is employed to secure the mutual benefits of teamwork.
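The recovery behaviour of a broker team can be pictured with a deliberately simple model. The sketch below only tracks which brokers are alive and recruits replacements when one becomes unreachable. It does not implement the Joint Intention protocol, and the class `BrokerTeam`, its methods and the `recruit-N` naming are hypothetical.

```python
class BrokerTeam:
    """Toy model of a broker team that keeps at least `min_size` brokers alive.

    This is an illustration only; it is NOT the Joint Intention protocol.
    """

    def __init__(self, names, min_size):
        self.alive = set(names)
        self.min_size = min_size
        self._spawned = 0

    def report_unreachable(self, name):
        self.alive.discard(name)
        # The remaining members recruit replacements until the team is back
        # to strength. If no member is left, nobody can recruit.
        while self.alive and len(self.alive) < self.min_size:
            self._spawned += 1
            self.alive.add(f"recruit-{self._spawned}")

    def route(self, request):
        """Hand the request to any live broker (here: the first by name)."""
        if not self.alive:
            raise RuntimeError("no broker available")
        return sorted(self.alive)[0], request
```

As long as one broker survives, the team restores itself to its minimum size, so a single failure no longer takes the whole brokering service down.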

3.3.2 Elements present in the context broker

The Context Broker Base

This is a persistent store of context knowledge that provides an API through which the other parts of the context broker can access the stored knowledge. The Context Broker Base contains the ontologies of a given smart space (for example, the ontology of an intelligent dashboard). It also contains heuristic knowledge associated with that space.

The engine for context reasoning

This engine is a reactive inference engine that reasons over the stored context knowledge. Two kinds of inference take place in the engine: inferences that use ontologies to deduce contextual knowledge, and inferences that use heuristic knowledge to detect and resolve inconsistent knowledge.

The module for acquiring data

This module contains a library of procedures that form a middleware abstraction for context acquisition. This is the role played by the SIM architecture described in the first section. The role of the data acquisition module is much like that of the Context Widgets in the Context Toolkit (Dey, 2000): it shields high-level applications from the low-level sensing mechanisms.

The module for managing policy

This module contains a set of rules used in the inference process. The rules are the ones that deduce the knowledge to be used in policy enforcement.

The design requirements for the brokering architecture

The following are the design requirements for the brokering architecture:

A centralized database

This is used for the storage of data. It stores metadata from various applications in an integrated and unified format.

A flexible model for data storage

The data repository must be capable of storing the available content while being able to make the data extensible for future applications.

Scalability

The repository should support large volumes of information.

Performance of query

The data repository must be able to support query functions.

Distributable

The context broker should have a distribution agreement with the various requesters. The agreement should be uniform. The context broker should conform to the current industry standards.

Comprehensibility

The broker architecture should be fitted with an efficient reasoner for locating the appropriate learning objects. An OWL/RDF model is employed at this stage because it is flexible and contains standard vocabularies, e.g. CanCore. These are included in an effort to fulfill the requirements of a distributable system. Centralization is achieved through appropriate modeling based on a relevant ontology. The key goals in the selection of the ontology are scalability, excellent query performance and integration.

The proposed broker architecture is made up of three major components:

Broker

Requestor and Provider

The requestor that is chosen should be able to meet the specific requirements of a given application.

3.3.3 Broker Architecture for Integrating Data Using a Web Services Environment

Even where change may be important, IBHIS at present applies a conventional federated schema solution to integration. In the next version, this will be replaced by a broker service capable of dynamically locating and binding to data sources that are not fixed in advance. The architecture of the first IBHIS broker is a combination of FDBMS techniques, the use of ontologies and service-based software. Service-based models are realized in the first-stage architecture through web services and open standards and technologies such as SOAP, Java, UDDI and WSDL. At the same time, the global view of the distributed data is achieved by establishing a number of federated schemas in line with user requirements. Registration of data sources and schema integration are essentially assumed to take place only once, at set-up time. In the health and social care domain this is sufficient as a demonstration, and the possibility that data sources may not be available at all times is taken into account. IBHIS presently operates both between and within a comparatively small number of data sources, all of which hold 'real' data. The concept of 'meta-IBHIS' lends itself to a scalable architecture. However, the effectiveness of this model across numerous data sources remains an area of investigation.

Once the set-up stage is complete, users of the IBHIS broker are given transparent access to the heterogeneous, distributed data access services. During set-up, the users and the underlying data services are registered. The system administrator creates the federated schemas and resolves every semantic difference.

Operational System

The main objectives of the operational system are to receive a query from the user, identify access rights, locate the appropriate sources of information and return the results to the user. To provide this functionality, the operational system consists of five communicating web services and a user interface. The web services interact according to the following model (Vinoski, 2002).

First, a data access service publishes its WSDL definition to a UDDI registry. In our case, UDDI is incorporated in the registry service. Second, the client looks up the service definition in the registry. Third, the client uses the information from the WSDL definition to send messages or requests directly to the service via SOAP.
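The publish, find and bind steps described above can be illustrated with a minimal in-memory sketch. It does not use real WSDL, UDDI or SOAP: the `ServiceDescription`, `Registry`, `ENDPOINTS` and `invoke` names, the `local://` endpoint and the sample record are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceDescription:
    """Hypothetical stand-in for a WSDL definition."""
    name: str
    endpoint: str
    operation: str


class Registry:
    """Hypothetical stand-in for the UDDI-style registry service."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Step 1: the data access service publishes its description.
        self._entries[description.name] = description

    def find(self, name: str) -> ServiceDescription:
        # Step 2: the client looks up the description by service name.
        return self._entries[name]


# Hypothetical transport table (endpoint -> handler). In a real deployment
# this role is played by SOAP messages sent over the network.
ENDPOINTS: Dict[str, Callable[[str, dict], dict]] = {}


def invoke(description: ServiceDescription, payload: dict) -> dict:
    # Step 3: the client binds to the endpoint named in the description
    # and sends its request directly to the service.
    handler = ENDPOINTS[description.endpoint]
    return handler(description.operation, payload)


def basic_info_service(operation: str, payload: dict) -> dict:
    """Toy data access service that answers a single operation."""
    if operation != "get_record":
        raise ValueError(f"unsupported operation: {operation}")
    return {"id": payload["id"], "source": "basic-info"}


ENDPOINTS["local://das/basic-info"] = basic_info_service

registry = Registry()
registry.publish(
    ServiceDescription("basic-info", "local://das/basic-info", "get_record")
)
found = registry.find("basic-info")
result = invoke(found, {"id": 7})
```

The client never hard-codes the endpoint: it learns it from the registry entry, which is what allows services to be located and bound at run time.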

Access Rule Service (ARS).

The initial authentication of the user and the subsequent authorization are both handled by the ARS. Within the architecture, the ARS deals mainly with authorizing access to the available data services. The services themselves are responsible for authorizing access to any additional system resources. Initial authentication is based on usernames and encrypted passwords. Users sign in to the front end, which in turn passes their credentials to the ARS. The current solution uses Role Based Access Control (RBAC), in which every role has its own distinct access rights.
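A minimal sketch of how such a service might combine credential checking with role-based authorization is given below. The role names, the data service names and the `AccessRuleService` class are hypothetical, and salted PBKDF2 hashing is used here as one plausible way of protecting stored passwords; the source does not specify the actual scheme.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional, Set, Tuple

# Hypothetical role table: each role has its own set of permitted data services.
ROLE_RIGHTS: Dict[str, Set[str]] = {
    "warden": {"basic-info", "incarceration-history", "additional-crimes"},
    "clerk": {"basic-info"},
}


def hash_password(password: str, salt: bytes) -> bytes:
    # Assumption: salted PBKDF2-HMAC-SHA256 stands in for "encrypted passwords".
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccessRuleService:
    """Illustrative ARS: authenticates a user, then authorizes by role."""

    def __init__(self) -> None:
        # username -> (salt, password hash, role)
        self._users: Dict[str, Tuple[bytes, bytes, str]] = {}

    def register(self, username: str, password: str, role: str) -> None:
        if role not in ROLE_RIGHTS:
            raise ValueError(f"unknown role: {role}")
        salt = os.urandom(16)
        self._users[username] = (salt, hash_password(password, salt), role)

    def authenticate(self, username: str, password: str) -> Optional[str]:
        """Return the user's role if the credentials match, else None."""
        record = self._users.get(username)
        if record is None:
            return None
        salt, stored, role = record
        candidate = hash_password(password, salt)
        # Constant-time comparison avoids leaking how many bytes matched.
        return role if hmac.compare_digest(candidate, stored) else None

    def authorize(self, role: str, data_service: str) -> bool:
        """Check whether the role may access the named data service."""
        return data_service in ROLE_RIGHTS.get(role, set())


ars = AccessRuleService()
ars.register("alice", "s3cret", "clerk")
role = ars.authenticate("alice", "s3cret")
```

Keeping the rights on the role rather than on the individual user means that a change of policy only touches the role table.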

Several studies of Role Based Access Control have been conducted, including some that show how RBAC can be used in the health domain.

Access Rules

A broker architecture that integrates data in a web services setting is developed. These elements, together with the access rights of the corresponding roles, are used to form a user profile. Combined with the Federated Schema Service, the user profile identifies the parts of the federated schema that the user is permitted to view.

Federated Schema Service (FSS) and Federated Query Service (FQS).

The Federated Schema Service (FSS) maintains the federated schema and every mapping between the federated schema and the export schemas. The Federated Query Service (FQS) consults the FSS during query decomposition and integration. The federated schema and the corresponding mappings to the export schemas are created when the IBHIS broker is set up.

The FQS is composed of two sub-modules:

The first is the Query Decomposer. It decomposes the federated query into a number of local queries, working in collaboration with the FSS.

The second is the Query Integrator. It receives the local results from the data access services and integrates them into a federated record.
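The decomposition and integration steps described above can be sketched as follows. This is only an illustration under stated assumptions: the `MAPPINGS` table stands in for the mappings held by the FSS, `SOURCES` stands in for the data access services, and all attribute names and values are invented.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical FSS mappings: federated attribute -> (source, local attribute).
MAPPINGS: Dict[str, Tuple[str, str]] = {
    "name": ("basic_info", "full_name"),
    "sentence_years": ("incarceration", "years"),
}

# Hypothetical local data held behind two data access services.
SOURCES: Dict[str, Dict[int, Dict[str, object]]] = {
    "basic_info": {7: {"full_name": "J. Doe"}},
    "incarceration": {7: {"years": 3}},
}


def decompose(attributes: List[str]) -> Dict[str, List[str]]:
    """Query Decomposer: split a federated query into one local query per source."""
    local_queries: Dict[str, List[str]] = {}
    for attribute in attributes:
        source, local_attribute = MAPPINGS[attribute]
        local_queries.setdefault(source, []).append(local_attribute)
    return local_queries


def run_local(source: str, record_id: int, local_attributes: List[str]) -> Dict[str, Optional[object]]:
    """Stand-in for a call to a data access service."""
    row = SOURCES[source].get(record_id, {})
    return {name: row.get(name) for name in local_attributes}


def integrate(attributes: List[str], local_results: Dict[str, Dict[str, Optional[object]]]) -> Dict[str, Optional[object]]:
    """Query Integrator: merge local results into a single federated record."""
    record: Dict[str, Optional[object]] = {}
    for attribute in attributes:
        source, local_attribute = MAPPINGS[attribute]
        record[attribute] = local_results[source].get(local_attribute)
    return record


def federated_query(record_id: int, attributes: List[str]) -> Dict[str, Optional[object]]:
    local_queries = decompose(attributes)
    local_results = {
        source: run_local(source, record_id, names)
        for source, names in local_queries.items()
    }
    return integrate(attributes, local_results)


federated_record = federated_query(7, ["name", "sentence_years"])
```

Both sub-modules rely on the same mapping table, which reflects the way the FQS consults the FSS in both directions: once to translate federated attributes into local ones, and once to translate the local results back.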

The FQS sends the federated query and the federated record to the audit service. The audit service comprises two sub-modules, which track every IBHIS action that may need to be audited in the future.

The User Audit holds information such as the log-in date and time of each user, among other things. The System Audit holds information about the data sources, such as the date and time of registration, among other things. It also includes the user set-up.

Data Access Service (DAS).

Data Access Services (DASs) are web services. Unlike typical web services, however, they are data-intensive and are responsible for providing data from the relevant sources. The administrator of the broker implements each service and describes it using WSDL, together with the Web Services Policy Framework. In addition, the description gives consumers of the service the following information:

1. The amount of data provided by the DAS and the format it will take.

2. The functionality and the domain associated with the data.

3. The security procedures required before the service can be used.

4. Other non-functional characteristics, such as the cost and quality of the service.

The administrator then publishes the description file to the registry service. DASs may be implemented in different programming languages, may access data sources supplied by different vendors, and may run on different operating systems. Nevertheless, the DASs provide a uniform way of accessing data. This is vital in the health services of the United Kingdom, where data sources are drawn from numerous autonomous organizations that use a wide range of technologies.
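The idea of a uniform access interface over heterogeneous sources can be sketched with a small adapter example. The `DataAccessService` base class, the two adapters and the sample data are assumptions for illustration: one adapter wraps an in-memory SQLite database and the other a CSV text, yet callers use the same `fetch` method for both.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DataAccessService(ABC):
    """Hypothetical uniform interface that every DAS exposes to the broker."""

    @abstractmethod
    def fetch(self, record_id: int) -> Optional[Dict[str, str]]:
        """Return the record with the given id, or None if it is absent."""


class SqliteDAS(DataAccessService):
    """Adapter over a relational source (an in-memory SQLite database)."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE inmate (id INTEGER PRIMARY KEY, name TEXT)")
        self._conn.execute("INSERT INTO inmate (id, name) VALUES (?, ?)", (7, "J. Doe"))

    def fetch(self, record_id: int) -> Optional[Dict[str, str]]:
        row = self._conn.execute(
            "SELECT id, name FROM inmate WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            return None
        return {"id": str(row[0]), "name": row[1]}


class CsvDAS(DataAccessService):
    """Adapter over a plain-text source (CSV content held in a string)."""

    def __init__(self, text: str) -> None:
        self._rows = list(csv.DictReader(io.StringIO(text)))

    def fetch(self, record_id: int) -> Optional[Dict[str, str]]:
        for row in self._rows:
            if row["id"] == str(record_id):
                return dict(row)
        return None


services = [SqliteDAS(), CsvDAS("id,offence\n7,fraud\n")]
# The caller treats every source identically, whatever technology sits behind it.
records = [service.fetch(7) for service in services]
```

Adding a further source type then only requires another adapter that implements `fetch`; the calling code stays unchanged.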

An extensive investigation of suitable environments and tools for implementation was carried out.

The experimental system comprises three databases holding data about fictitious inmates:

1. Basic information about the inmates

2. The history of incarceration

3. Additional crimes

There are three users, each with a different level of authorization according to their role. This first prototype has achieved a substantial implementation using web services. Experience has been gained with the web services environment and toolsets, which has provided confidence in the technology. A broad understanding of a very complex domain has been gained, and the basic broker design has been implemented. The prototype has also proved valuable in reconfirming the classification of the main problems.

3.4 Discussions

Certain issues surround the implementation of the IBHIS framework. Most of them affect service-based software applications in general and have been studied extensively in previous work (Kagal, Finin and Koshi, 2003). They include the formulation and management of the supply chain.

Supply chain issues in component-based software engineering (CBSE) environments have been discussed widely in the literature. The suggested solutions concentrate on the use of repositories and documents that describe component attributes (Lassila and Adler, 2003). These solutions suit scenarios with stable user requirements, which differ from the rapidly changing, dynamic environment of the IBHIS architecture. Existing web services address only the technological bottlenecks created by the heterogeneity of the various applications, components and middleware. They do not address the problems caused by a rapidly changing environment. Optimizing the supply chain is essential because, in a service environment, the required software services are procured and then assembled from the components and subsystems in the supply chain. The solution to this dilemma is to employ a negotiation description language (NDL).

The broker environment for the integration of data in a web service scenario

Negotiations along the supply chain should be automated, so that participants' activities can be adjusted automatically to optimize supply chain performance. The negotiation description language (NDL) supports many-to-many negotiations. In this environment, the various participants can simultaneously obtain information about suppliers and react accordingly.

3.4.1 Advantages, benefits and strengths of the proposed solution

The main advantage of the proposed hybrid architecture is that it combines the beneficial features of the SIM and broker architectures to form a dependable system for data integration with a high level of interoperability, while improving the resource utilization of the data access mechanism and balancing the resulting combined processing load. The outcome is a faster and more responsive heterogeneous data integration platform. The heterogeneous infrastructure has improved stability and increased fault tolerance. The improved resource utilization and better load balancing are attributable to the broker-based part of the architecture.

Improved security

Security improves as a result of employing a hybrid architecture that combines the SIM and broker architectures. Brokers act as mediators between the software components that provide data processing services and the software components that make use of those services. This makes the processing task independent of the logical data organization and the algorithmic aspects of the data processing. As a first level of security, the flow of information is secured because each flow is entirely self-identifying. The system is able to detect non-compliant flows within the heterogeneous architecture and raise an alarm accordingly. This capability is provided by the SIM architecture.

Cite This Paper

PaperDue. (2011). Integrating Heterogeneous Data Using Web Services. PaperDue. https://www.paperdue.com/essay/integrating-heterogeneous-data-using-web-44446
