Designing XML Databases
What exactly is a 'web-enabled database'? The World Wide Web, as everyone knows, provides users with a host of tools for browsing and gaining access to information on the Internet. Many people also use the Web to deliver marketing messages, advertising, and promotional material for products. What is most surprising, however, is that the World Wide Web is rarely used for what it is best suited to: the provision of interactive business information on the Internet, where an organization can submit any number of queries and gain access to the information those queries generate. An organization can also place orders on the World Wide Web and have its statements and records updated continuously through a browser. In a nutshell, this is what a web-enabled database must be able to do, and this is what it does if it is implemented well.
It is not very difficult to design and develop a web-enabled database, even though different circumstances may make it more or less efficient. Many people believe quite strongly that web-enabled database technology is here to stay, and that this is how the World Wide Web will be used in the future, when web-based business technologies catch up with the times and make a greater impact on the day-to-day activities of any organization connected to the Internet, an Intranet, or other web-based technologies. The situation has clearly been changing over the years. Where a web page was once a mere reproduction of a corporate brochure, with nothing different from the printed material, there is now plenty of evidence of innovation. Generally, it is the United States of America that leads the way in these innovations, but there are some positive signs in the United Kingdom too. (Web-enabled Database Development)
Some better websites have been developed recently. Examples are the sites run by 'Railtrack' (http://www.networkrail.co.uk/), where one can look up the railway timetable with ease, and 'National Express' (http://www.nationalexpress.com/home/hp.cfm), where one can book coach tickets online. These two sites provide increased customer value by linking existing web technologies with the available business systems. When the technical systems within the business are well specified and easy to work with, it is easy to add a web-based front end to a traditionally functioning business system. Generally, most industrial-strength database systems offer so-called 'client-server' modules, and these can be used to build applications that can be delivered through the World Wide Web. However, most companies face a problem when it comes to creating, developing, and implementing an XML database system for use on the Web. (Web-enabled Database Development)
The problem is not usually technical, because technical problems can be overcome with relative ease; it is one of costs and buy-in. Costs vary by country, and they are considerably lower in the U.S.A. than in the UK. In addition, management in the U.S.A. is not overly awed by technology as such, which makes it easier to bring the latest technologies into the working systems of any organization that wishes to do so. Costs are lower in the U.S.A. primarily because Internet connection charges there are about one third to a quarter less than in the UK. Most organizations in the United States will generally have a connection that they have been using for quite some time, and this makes it easier to implement newer technologies on their sites, whereas in the United Kingdom even companies that need faster connections still tend to use dial-up connections instead of better, permanent ones. In addition, people in the UK are generally unwilling to take on new risks for their companies, probably out of fear for security or for some other reason, while people and companies in the U.S.A. are more than willing to take risks if they feel that doing so would increase profits.
It is even more common in the U.S.A. for a company to take a large risk by innovating, exploit the situation to the maximum for its public relations value, and then sit back and enjoy the resulting profits while competitors try their best to catch up with innovations of their own. This phenomenon is rare elsewhere in the world, because not everyone has the courage to leap into a situation as people in the U.S.A. do; most people would rather avoid risk and earn lower profits for their companies than take an inordinate risk and end up with severe losses. For organizations that do not want to take risks and innovate, there will typically be an existing database connected to the company's internal network. This internal network may already be using the TCP/IP protocol, though in some cases it may not. If it does, there will be an inner security layer as well as an outer security layer, such as a firewall. (Web-enabled Database Development)
In a typical configuration, the organization has a server that runs the Internet web server software and delivers web pages through an outer firewall to the publicly accessible Internet. If it is an Intranet, however, the delivery system is much simpler, and there is no need for the protection of a firewall. Web pages can be delivered in the traditional manner, which some people consider rather dull, because the pages are static: they are all pre-written and unchanging. Such pages are found on most web sites. Most of the time, this type of page is sufficient for people who want textual information and nothing else from the pages they access. However, a server can also generate pages dynamically, based either on data stored with the pages themselves or on data extracted from a large database. This is a very interesting concept that deserves further exploration.
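The idea of a server generating a page from data held in a database can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the table name `timetable`, its columns, the sample rows, and the helper names `render_timetable` and `demo_connection` are all hypothetical, and an in-memory SQLite database stands in for whatever database an organization actually runs.

```python
import html
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Create a throwaway in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE timetable (origin TEXT, destination TEXT, departs TEXT)"
    )
    conn.executemany(
        "INSERT INTO timetable VALUES (?, ?, ?)",
        [("London", "Leeds", "09:15"), ("London", "York", "08:30")],
    )
    return conn


def render_timetable(conn: sqlite3.Connection) -> str:
    """Build an HTML page from the rows currently stored in the database."""
    rows = conn.execute(
        "SELECT origin, destination, departs FROM timetable ORDER BY departs"
    ).fetchall()
    body = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            # Escape every value so stored data cannot inject markup.
            *(html.escape(str(value)) for value in row)
        )
        for row in rows
    )
    return "<html><body><table>\n" + body + "\n</table></body></html>"


if __name__ == "__main__":
    print(render_timetable(demo_connection()))
```

Unlike a pre-written static page, the output here changes whenever the rows in the table change, which is the essential difference between the two delivery styles.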
The basic components of a web-enabled database are as follows: the organization that uses it must have a permanent link to the Internet, a web server, a firewall for protection and security, and software that can deliver the active application through web pages. As for the Internet link, the company may already have a good connection that it has been using for some time for sending e-mail and browsing the World Wide Web; it must be ready to use that connection further for a web-enabled database system. If the company does not have an Internet connection, it must be willing to install one, whatever the cost. A 'web server' must also be present within the infrastructure of the organization that wishes to install and implement a web-enabled database system. There are in fact two kinds of web servers: those that support secure transactions, and those that do not. (Web-enabled Database Development)
However, most organizations do not need 'secure' transactions as a rule, unless they plan to ask for credit card details or other information that can be classified as 'secure'. A 'firewall', on the other hand, is a must for an organization that wishes to install a web-enabled database system, because a firewall provides security against the outside world and its prying eyes. As far as web pages and other software are concerned, there are plenty of issues involved. The first step is to decide what the company actually wants to achieve; there are as many ideas to consider as there are people, and the management must be clear about which idea it wants to use in its web pages. What is being created is in fact a standard software application. The only difference is that this application may be delivered with additions such as graphics and animation, which make the web page more interesting, if only because web users now expect more sophisticated and interactive web pages.
The basic idea must be to stay as close to reality as possible, while taking the utmost care that the delivery of the web page is attractive and fast, because the charm of a beautifully designed web page is lost on a user who has become impatient. After all, unless the end user is satisfied, there is no point in trying to make the page better and spending a lot of money in the bargain. Building a web-enabled database system usually involves a 'Web Form' that captures information from users and submits requests wherever necessary. It might be expected that sophisticated technologies like Java and JavaScript would make the entire process easier, but unfortunately that is not so. JavaScript is often assumed to be flexible and user friendly, but it suffers from serious incompatibilities between different browsers, which means that it can play only a cosmetic role in this type of application. If the browsers were more tightly controlled, as when the application is used in an Intranet setting, then JavaScript would be more useful. (Web-enabled Database Development)
Java is seen as a very good tool for building database applications such as the web-enabled database system, but it would be all but useless on the public Internet; it is more useful on an Intranet than on the Internet. Both Java and JavaScript have therefore proven to be less than useful for building a web-enabled database system, and plain forms for input and output are used more often than not, though it cannot be said that making interactive and more interesting web sites is an impossibility. Most organizations hesitate at this juncture because specialist skill is needed. This is because visitors need to be tracked once the web-enabled database system has been created and implemented and the site has begun to receive them.
These visitors have to be tracked, especially on return visits, and this means that a sophisticated tracking mechanism has to be implemented by a specialist. Even small interactive details such as a shopping cart, or any simple personalization added to the site, would make the site better, because the customer would be able to view information in the way he wants it, and not in the way the company wants him to see it. Though all this would take up a lot of time and resources, it would be worth it in the end, when more visitors come to the site, especially return visitors, and profits are generated. (Web-enabled Database Development)
What exactly is XML? How can XML be used for the development of a web-enabled database? XML stands for 'Extensible Markup Language'. It is a subset of SGML, a text markup language that can be used in web applications for the interchange of structured data. XML is also an extremely flexible way to create standard information formats, such as those used by a web-enabled database system, and to share that information over the Internet through pages on the World Wide Web. XML is a specification developed by the W3C, and it is a pared-down version of the earlier 'Standard Generalized Markup Language', designed especially for Web documents. The language allows designers to create their own customized tags, which enable the transmission, transfer, validation, and interpretation of data between applications and between organizations. (Definition of XML)
Another definition of the Extensible Markup Language is that it is a cross-platform, extensible, text-based standard, and the best language available today for the designers and developers of such web sites. (The Source for Developers) Mark Graves, a software developer working on database development at Berlex Biosciences, a bioinformatics, biotech, and pharmaceutical research laboratory, has written the book 'Designing XML Databases'. He says that it has long been widely acknowledged that the Extensible Markup Language would revolutionize the World Wide Web, and would not only invigorate the existing economy but also change the way in which the world looks at data. (XML Databases for Bioinformatics)
Today, XML has spread like wildfire throughout the information technology industry, and this has created more and more opportunities for developers, who should take advantage of this versatile and adaptable language to improve their skills. XML can be used not only to develop databases but also for any number of other applications. XML certification is today one of the most sought-after qualifications, and many large companies insist that the developers they hire be certified in the language. A developer must therefore have a strong knowledge of the fundamentals of XML and must understand the technology on which XML is based. The developer must also have a thorough knowledge of the W3C recommendations relating to XML, and be familiar with the various uses and practices of XML. Once he is clear about all this, he can go about developing databases and other related applications that use XML as a base. (A Certification Primer for XML and related technologies)
As websites become larger, more informative, more interactive, and more dynamic, the challenge is not only to develop such sophisticated web pages but also to maintain them so that they function at their best. XML is one of the latest and better methods for this purpose. XML is used not only for better maintaining a web site, but also for creating specialized languages, such as the 'Web Site Description Language' (WSDL) and the 'Page Layout Language' (PLL). These languages herald the changes that are coming about in the design of web pages compared with the designs of a few years ago. When the Internet initially came into use, the content of the World Wide Web was small, so small that a small group of people could create, build, and maintain it with no problem at all. However, since the Internet and the World Wide Web have been growing at an amazing rate, and older web sites are constantly being revised, it has become glaringly obvious that change is a must, and the older, more traditional designs are no longer acceptable. (WSDL, a new XML-based site description language)
This may be because of an incorrect focus on the issue: the most common mistake is the assumption that web sites are merely collections of web pages, and this is where the primary focus is drastically misplaced. In this context, a web page is a single HTML ('Hyper Text Markup Language') document. Such an HTML document contains a few images and a lot of textual content, along with style sheet information. However, this is not the same as a web site, since a web site is more than a mere collection of pages; it is a site with plenty of inter-connections and inter-relations with other sites, connected through the network of the World Wide Web. How easily the user can navigate generally determines the success of the web pages and their design.
The Web Site Description Language uses XML to address the all-important issues of page presentation and the basic structure of the web site. The language is also capable of describing external sources of content for the web pages, which can thereby be made into dynamic and robust sources of information, even if the source is only a database. There are plenty of other systems designed for creating the dynamic web pages that today's customers seem to prefer more and more over the older traditional models. Some of these solutions are based on CGI scripts and offer a fully dynamic page design. However, WSDL is the only language that exists today that is innately capable of focusing on the entire web site as a whole. (WSDL, a new XML-based site description language)
Numerous design techniques available today make the most of XML-enabled database system designs. When XML first came into use, it was considered primarily a data interchange standard and nothing more. However, as designers and developers began to understand the intricacies and innate capabilities of the language, XML began to be used for a wide variety of purposes, including serving as the core of development and deployment platforms such as Microsoft's '.NET'. XML is now used as the means of modeling the various components of information systems, and these components can automatically construct themselves around the facts that have been expressed in XML. (Information Modeling with XML)
This is the real potential of XML: the ability to model the entire application in XML just once, instead of modeling it separately and repeatedly for each component. All the tedium of repetition can be avoided when this versatile language is used. When XML was used as a mere container for data managed by legacy systems, there was virtually no need to consider anything other than syntax when building a document. Today, however, since XML is used to represent more than mere facts and data, grammar and style must also be considered when building a database system using XML. It is imperative that the grammar as well as the syntax of the document meet the expected standards, that is, the standards expected by parsers.
Good grammar and syntax ensure that application programs need no specific, and often redundant, domain knowledge in order to interpret the XML information held in the database. In fact, good style and good grammar also help the application to perform well, which inevitably means that the basic functions of storing, retrieving, and managing information are carried out well. Therefore, when good grammar and style are used in modeling and designing information in XML, flexible applications can be built with a minimum of effort on the part of the developer. (Information Modeling with XML)
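What it means for a document to meet the standards expected by a parser can be shown with a small sketch. It uses Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are illustrative assumptions. Note that this checks only syntax (well-formedness), not the broader questions of grammar and style discussed above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser rejects mismatched, unclosed, or badly nested tags.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<Listing><Name>XZX</Name></Listing>"))  # properly nested
    print(is_well_formed("<Listing><Name>XZX</Listing></Name>"))  # tags cross over
```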
XML allows the developer to model information systems and design web-enabled databases in an intuitive and natural way, because XML lets the designer express information in a manner that closely resembles the way in which people actually conduct business transactions. It becomes easy for the individual to know what has to be done, rather than having to work it out first, and this is better than a system that tells him how to do something without stating exactly what is to be done. Therefore, it can be said that XML does a better job of reflecting the way the real world works than the data modeling mechanisms created in preceding years.
These are the capabilities that XML brings to the design of information modeling systems such as web-enabled database systems. Web-enabled XML systems have other properties as well. The first is heterogeneity, the ability of the system to hold records with differing data fields; this matters because in the real world nothing is ever expressed in neat rows and tables of data that one can assimilate at a glance. With XML, data and information can be expressed realistically, in the same way they exist, without too many restrictions. (Information Modeling with XML)
Another advantage of XML databases is that they are innately extensible. New data can be added whenever there is a need to do so, without prior planning and without too many changes, so there is no need to fear change. XML is also flexible: where data fields vary in size and configuration from one application to the next, XML imposes no restrictions on the data, which can be as short or as long as necessary. Another important advantage is that XML is self-describing and informationally complete, so applications can be built around the data largely automatically, with little intervention from the developer. What this means is that only a minimum of programming is required. Some examples are the XML applications used by companies such as BEA, TIBCO and Microsoft.
In such cases, XML becomes a virtually universal 'information-structuring tool', in which the components of the system need not be configured individually and separately as discrete silos of functionality. When the information that needs to be stored, retrieved, and managed is expressed in XML, it can be queried with XPath or XQuery patterns. Application development would not only be more rapid, but changes could also be accommodated without modifying the underlying XML application base. The information systems would be able to adjust themselves without constant and excessive reprogramming, and this could be of great advantage to designers and developers of web-based XML databases. (Information Modeling with XML)
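Querying XML-held information with XPath-style patterns can be sketched as follows. This is an illustration under stated assumptions: the element names (`Directory`, `Listing`, `Name`, `City`, `Email`), the sample values, and the helper `names_in_city` are hypothetical, and Python's standard `xml.etree.ElementTree` supports only a limited subset of XPath, not the full language and not XQuery. The second listing carries an extra `Email` field, which also shows records with differing fields living side by side.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the second record has a field the first lacks.
DIRECTORY = """
<Directory>
  <Listing><Name>A</Name><City>Leeds</City></Listing>
  <Listing><Name>B</Name><City>York</City><Email>b@example.org</Email></Listing>
</Directory>
"""

root = ET.fromstring(DIRECTORY)


def names_in_city(directory: ET.Element, city: str) -> list:
    """Return the names of listings whose City child has the given text."""
    # [City='...'] is a predicate from the XPath subset ElementTree supports.
    # The city value is interpolated directly, so it must come from a trusted source.
    matches = directory.findall(f"./Listing[City='{city}']")
    return [listing.findtext("Name") for listing in matches]


def names_with_email(directory: ET.Element) -> list:
    """Return the names of listings that have an Email child at all."""
    return [l.findtext("Name") for l in directory.findall("./Listing[Email]")]


if __name__ == "__main__":
    print(names_in_city(root, "York"))
    print(names_with_email(root))
```

Adding the `Email` field to one record required no change to the query for `City`, which is the kind of accommodation of change described above.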
How exactly does XML express information? There are four basic components of the system: tags, attributes, data elements, and hierarchy. Each component has a simple and essential purpose, and each represents a different dimension of information. Data is expressed within tags, and attributes are attached to the tags to describe that data, so that the developer knows what to do with it. One example would illustrate this: , GB resolution= 8>, and so on. The data is what is expressed within the tags, and the attributes are the points that describe the data within the tags. The problem now is how to string the data together, and this is where hierarchy becomes important. Hierarchy is expressed spatially: the information in the document has to be arranged in a manner that gives meaning to the entire document, so that the hierarchy shows the relationship of the various elements within the document to each other.
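The four components can be made visible by walking a small document with Python's standard `xml.etree.ElementTree` module. The sample markup below is a hypothetical stand-in chosen for illustration (it is not a reconstruction of the example above), and the helper name `describe` is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one parent element, one child with an attribute and data.
SAMPLE = '<disk><capacity units="GB">80</capacity></disk>'


def describe(element: ET.Element, depth: int = 0) -> list:
    """List each element's tag, attributes, and data; indentation shows hierarchy."""
    data = (element.text or "").strip()
    lines = [
        f"{'  ' * depth}tag={element.tag} attributes={element.attrib} data={data!r}"
    ]
    for child in element:
        # Children are nested one level deeper than their parent.
        lines.extend(describe(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(describe(ET.fromstring(SAMPLE))))
```

Here `capacity` is the tag, `units="GB"` is the attribute that says how to read the data, `80` is the data element, and the nesting of `capacity` inside `disk` is the hierarchy.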
Therefore, a designer who wants to express information in XML must first learn to identify the natural patterns inherent in that information. Each component, that is, the tags, the attributes, the data elements, and the hierarchy, must be properly aligned. Each data element has to be examined to answer the following question: is it actual data, or is it data about another data element, in other words, metadata? Each attribute has to be examined as well: does the attribute tell the designer anything about how to describe, analyze, or present the data elements? Is the attribute metadata, or is it a simple data element? Does it apply to all the data elements within its scope? The tags have to be examined too: does this particular tag help to describe all the data elements within its scope? Finally, the groupings or inter-relationships that have been created must be analyzed: do all the elements within a group belong there as the parent node describes them, and is the relationship between the elements unambiguous? (Information Modeling with XML)
All of the above questions must be answered with a 'yes'; if not, the elements or components have to be recast. Some of the most common mistakes made when developing a web-enabled XML document are as follows. Inadequate use of tags means that the data element is not well described within its scope, and this results in confusion. In the same way, when attributes are used poorly or incompletely, the elements within the document are inadequately described and difficult to interpret, and the result is again confusion. When data elements are used as metadata in place of tags, the names and pairings become wrong or misleading, and there is no way to tell the truth from the mistakes that have been made.
When tags are not used properly, and some of them are redundant, unrelated, or unnecessary, the hierarchy is poorly constructed and the mistakes are difficult to rectify. The ultimate result is that the XML document is robbed of both its power and its practicality. It goes without saying, therefore, that quality time spent on good information modeling is a great boon to the users of the document, because there will be fewer mistakes and the document will be of greater benefit to them. XML is so adaptable and flexible that it is extremely easy to abuse, and when it is abused, its very advantage is lost. (Information Modeling with XML)
A very simple way to design an XML document follows from the great advantage that XML has over other traditional information modeling languages: it is a much better analog of reality. Therefore, even a simple method of designing an XML document is able to produce excellent results. Take, for example, this record created for a telephone directory: Title: Telephone Directory Listing:
Name: XX, Address: XXX, City: XXY, State: XYZ, Zip Code: ZZZ, Telephone: ZXY. When all of the information listed above is converted into an XML document, this is the result:
<Listing>
<Name>XZX</Name>
<Address>XXX</Address>
<City>XXY</City>
<State>XYZ</State>
<ZipCode>ZZZ</ZipCode>
<Telephone>ZXY</Telephone>
</Listing>
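The conversion of the directory record into XML can be sketched in Python with the standard `xml.etree.ElementTree` module. The element names used here (`Listing`, `Name`, `Address`, `City`, `State`, `ZipCode`, `Telephone`) are assumptions derived from the field labels of the record, and `build_listing` is a hypothetical helper, not part of any cited method.

```python
import xml.etree.ElementTree as ET


def build_listing(fields: dict) -> ET.Element:
    """Turn a flat record into a Listing element with one child per field."""
    listing = ET.Element("Listing")
    for tag, value in fields.items():
        # Each field label becomes a tag; each value becomes its data.
        ET.SubElement(listing, tag).text = value
    return listing


# Placeholder values taken from the directory record in the text.
record = {
    "Name": "XZX",
    "Address": "XXX",
    "City": "XXY",
    "State": "XYZ",
    "ZipCode": "ZZZ",
    "Telephone": "ZXY",
}

xml_text = ET.tostring(build_listing(record), encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

The mapping is one to one: nothing about the record had to be reshaped to fit the XML, which is the sense in which the language is a close analog of the original form.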
Now, suppose the client asks for some small changes; these can be included in the document without any major difficulty. For example, if the name of the person needs to be separated into first, middle, and last names, it can be done like this:
<Name>
<First>X</First>
<Middle>Z</Middle>
<Last>X</Last>
</Name>
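The same change can be sketched programmatically. The element names `First`, `Middle`, and `Last`, the shortened sample listing, and the helper `split_name` are illustrative assumptions. The point of the sketch is that the rest of the record, here the `Telephone` element, is untouched by the change.

```python
import xml.etree.ElementTree as ET


def split_name(listing: ET.Element, first: str, middle: str, last: str) -> ET.Element:
    """Replace the text of the Name element with First, Middle and Last children."""
    name = listing.find("Name")
    name.text = None  # drop the single-string name
    for tag, value in (("First", first), ("Middle", middle), ("Last", last)):
        ET.SubElement(name, tag).text = value
    return listing


# Hypothetical shortened listing using the placeholder values from the text.
listing = ET.fromstring(
    "<Listing><Name>XZX</Name><Telephone>ZXY</Telephone></Listing>"
)
split_name(listing, "X", "Z", "X")

if __name__ == "__main__":
    print(ET.tostring(listing, encoding="unicode"))
```

Code that reads only the telephone number keeps working after the name has been restructured, since the change is local to the `Name` element.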
The above example shows how easy it is to change information within an XML document, and how adaptable and extensible XML is. Additional information can be added, and changes made, as and when they are needed. Since most businesses today are global, XML makes it possible to create documents that are driven primarily by the goal of conducting business and generating better profits for everyone involved. The very best advantage is that this can be done through intuition, rather than through abstract computing paradigms. One example comes from the pharmaceutical industry, where clinical trials start off with forms that have to be approved even before the computing systems that will manage the information generated by the trials can be designed or developed. (A very simple way to design XML)
Because XML is so adaptable, and because such forms can be converted into XML documents with ease, XML can be used to build information systems that are both useful and practical. According to some developers and designers, the best way to create an XML document is to start from the assumption that it is a simple, manually created document and not something meant for a computer at all. This lets the designer strip away complexity and keep the document innately simple, yet still useful enough to serve its purpose well. (A very simple way to design XML)
How can XML be integrated into broader enterprise systems, especially in the field of scientific research? Is XML useful in this field, and if so, how? The January 2003 report 'Revolutionizing Science and Engineering through Cyberinfrastructure', discussed by Eric Miller of the W3C Semantic Web Activity, opens with the statement that several different trends have been 'crossing and converging' in multiple ways, and that this convergence promises to bring about and sustain extraordinary developments in creating, disseminating, and preserving scientific and engineering knowledge. The report therefore recommended that the National Science Foundation establish a large-scale, interagency, and internationally coordinated 'Advanced Cyberinfrastructure Program'.
Such a program would empower scientific and engineering research and the education allied to it. It would lead to a more ubiquitous and pervasive 'digital environment', and in turn to a fully interactive and functional research community in which people, data, information, tools, and instruments all operate at high levels of computational, storage, and data transfer capability. The World Wide Web is one of the best-known instruments available today for enhancing communication among large numbers of people. (Enabling the Semantic Web for scientific research and Collaboration)
Today's designs adapt very well to the rapidly expanding Internet, and the Web has in turn promoted improved designs for the various modes of communication it supports, the most important being scientific, business, and commercial communication. The World Wide Web has not yet reached its full potential, however, even though it has been one of the most successful phenomena in the history of human communication. Its success so far rests on a simple fact: human beings use it to publish documents that are intended for other humans to read and comprehend. In that sense, the World Wide Web provides an excellent infrastructure for social communication between people all over the world. The 'Semantic Web', as it is known, is an enhancement of the World Wide Web that allows machine-processable data to cross application boundaries in the same manner that human-readable documents do today.