Tribune

Monday, April 16, 2001

XML — speak your own language

They took the best of SGML, guided by the experience with HTML, and produced something that was no less powerful, but much simpler to use, says Amit Puri

XML (Extensible Markup Language) is an exciting development in Web technology. It is the youngest and most comprehensive of the markup languages. (Markup refers to anything in a document that adds special meaning to a particular piece of text; for example, bold text is a form of markup.)

The language gets the name Extensible Markup Language because it is not restricted to a fixed set of tags, as HTML (Hypertext Markup Language) is. An XML user can create new tags according to need. A tag is a sequence of characters in a markup language that provides information about a document, such as formatting specifications.
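To make the idea of user-defined tags concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The tag names (recipe, title, ingredient) and the quantity attribute are invented for this illustration; XML itself predefines none of them.

```python
import xml.etree.ElementTree as ET

# A small document whose tag names were made up for this example.
RECIPE_XML = """
<recipe>
  <title>Masala Chai</title>
  <ingredient quantity="2 cups">water</ingredient>
  <ingredient quantity="1 tsp">tea leaves</ingredient>
</recipe>
"""

# Parse the text into a tree of elements.
root = ET.fromstring(RECIPE_XML)

# Read the data back out by asking for the tags we invented.
title = root.find("title").text
ingredients = [(item.text, item.get("quantity"))
               for item in root.findall("ingredient")]

print(title)
print(ingredients)
```

The parser needs no prior knowledge of these tags; it only requires that each opening tag is matched by a closing tag.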

Markup languages are roughly classified into three types:
1) Stylistic: defines character presentation, for example bold, italics, underline and font.
2) Structural: defines the structure of the document, such as headings and paragraphs.
3) Semantic: informs us about the content of the data, such as marking a piece of text as a title.

SGML (Standard Generalized Markup Language) is the mother of all markup languages, with roots going back to the late 1960s. In 1986 it became an international standard for defining markup languages. It is used to create other languages, including HTML, which is very popular for its use on the Web. HTML was created by Tim Berners-Lee in 1991.

SGML is very effective but complex, while HTML is very easy but limited to a fixed set of tags. This created the need for a language that was as effective as SGML and at the same time as simple as HTML. XML has now filled this gap.

The development of XML started in 1996, when a team led by Jon Bosak of Sun Microsystems began work on a project to reshape SGML and cut away its inessential parts. They took the best of SGML, guided by the experience with HTML, and produced something that was no less powerful, but much simpler to use. The team worked under the auspices of the World Wide Web Consortium (W3C), which published the standard for XML. The specifications for XML were laid down in just 26 pages, compared to the 500+ page specifications that define SGML.

Although XML looks like HTML, there is a world of difference. While HTML specifies what each tag and attribute means and how the text defined by it will look in a browser, XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. For example, if we see "<b>" in an XML file, it may or may not mean bold (as in HTML); it may mean ‘book,’ ‘bank’ or anything else specified by the programmer. HTML is primarily a presentation technology that carries little description of the content held within its tags, whereas in XML a programmer can describe the text with a tag of its own. Moreover, XML elements can be nested inside one another, so that a hierarchy of data can be represented under names chosen for that data, which HTML's fixed set of tags does not allow.
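As an illustration of the point that the application, not XML, decides what a tag means, the following Python sketch treats every "<b>" element as a book record. The document layout (library, b, title, author) and the helper name list_books are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Here <b> stands for "book", purely by this application's convention.
# The nesting of <title> and <author> inside <b> expresses a hierarchy.
LIBRARY_XML = """
<library>
  <b>
    <title>The Guide</title>
    <author>R. K. Narayan</author>
  </b>
  <b>
    <title>Malgudi Days</title>
    <author>R. K. Narayan</author>
  </b>
</library>
"""

def list_books(xml_text):
    """Return (title, author) pairs, reading each <b> element as a book."""
    root = ET.fromstring(xml_text)
    return [(b.findtext("title"), b.findtext("author"))
            for b in root.iter("b")]

books = list_books(LIBRARY_XML)
for title, author in books:
    print(title, "by", author)
```

A different program could read the same file and treat "<b>" as something else entirely; nothing in the XML text forces either reading.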

As with HTML, both Netscape and Microsoft browsers support XML. Because XML files are text files, it is easier for a programmer to debug applications. But at the same time, being in text format, XML files are usually larger than comparable binary formats.
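The trade-off between readability and size can be seen in a small Python sketch. The sample readings, the tag names and the binary layout (a 4-byte integer followed by a 4-byte float per record) are all assumptions chosen for this illustration.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical sample data: (sensor id, temperature) pairs.
readings = [(1, 20.5), (2, 21.0), (3, 19.75)]

# Binary form: each record packed as a 4-byte unsigned int plus a 4-byte float.
binary = b"".join(struct.pack("<If", sensor_id, temp)
                  for sensor_id, temp in readings)

# XML form: the same records as readable, tagged text.
root = ET.Element("readings")
for sensor_id, temp in readings:
    reading = ET.SubElement(root, "reading", id=str(sensor_id))
    reading.text = str(temp)
xml_bytes = ET.tostring(root)

print(len(binary), "bytes as binary")
print(len(xml_bytes), "bytes as XML")
print(xml_bytes.decode("ascii"))
```

The XML version takes more bytes, but it can be opened and checked in any text editor, which is what makes debugging easier.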

XML is a family of technologies. XLink is one of them; it describes a standard way to add hyperlinks to an XML file. XPointer and XFragments are syntaxes for pointing to parts of an XML document. XSL is an advanced language for expressing style sheets. We can also use cascading style sheets (CSS) as we do in HTML. XML Namespaces is a specification that describes how to associate a URL with every single tag. XML Schemas help developers to define their own XML-based formats.
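To show what associating a URL with a tag looks like in practice, here is a minimal Python sketch. The two example.com URLs and the prefixes shop and post are placeholders invented for this example; they let two different "title" tags live in one document without being confused.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both use a <title> tag; the namespace URL tells them apart.
ORDER_XML = """
<order xmlns:shop="http://example.com/shop"
       xmlns:post="http://example.com/post">
  <shop:title>XML Handbook</shop:title>
  <post:title>Mr</post:title>
</order>
"""

root = ET.fromstring(ORDER_XML)

# ElementTree expands each prefixed tag to the form {URL}localname.
tags = [child.tag for child in root]
print(tags)

# Look up one of the two titles by naming its namespace explicitly.
prefixes = {"shop": "http://example.com/shop"}
shop_title = root.find("shop:title", prefixes).text
print(shop_title)
```

The prefix itself is only shorthand inside the file; it is the URL that identifies which vocabulary a tag belongs to.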

DOM (Document Object Model) is a standard set of function calls for manipulating XML files from a programming language. MathML is a specification for describing mathematics as a basis for machine communication. With adequate style-sheet support it should ultimately be possible for browsers to render mathematical expressions natively, which is not possible in HTML.
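A brief sketch of DOM-style function calls, using Python's standard xml.dom.minidom module, may help here. The catalog and item tag names are made up for the example.

```python
from xml.dom.minidom import parseString

# Parse a tiny document into a DOM tree.
doc = parseString("<catalog><item>pen</item></catalog>")
catalog = doc.documentElement

# Build a new <item> element with DOM calls and attach it to the tree.
new_item = doc.createElement("item")
new_item.appendChild(doc.createTextNode("ink"))
catalog.appendChild(new_item)

# Walk the tree again and collect the text of every <item>.
names = [node.firstChild.data for node in doc.getElementsByTagName("item")]
result = catalog.toxml()

print(names)
print(result)
```

Calls such as createElement, appendChild and getElementsByTagName are part of the DOM standard, so very similar code can be written in other programming languages.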

XML Encryption specifies a process for encrypting and decrypting digital content. XML Signatures provide integrity and message authentication for data of any type. Both are intended for providing security in applications.

XML Protocol is used to develop technologies that allow two or more peers to communicate in a distributed environment, using XML as their encapsulation language.

Nowadays we can find a number of quality sites built with XML, such as www.brainbench.com. XML can also be seen in B2B portals, and it is used in WAP development: WML (Wireless Markup Language), which plays the primary role in the development of WAP applications, is derived from XML.

The most novel feature of XML is that it is able to express complex data structures, and even distributed actions, in terms of a simple, punctuated stream of text. Any network component newer than the abacus can send and receive XML; almost any processor has sufficient power to parse it.
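The following Python sketch shows one way a nested data structure can be flattened into a stream of XML text and read back. The helper to_element, the sample order and the choice of "item" as the tag for list entries are assumptions made for this illustration, not a standard mapping.

```python
import xml.etree.ElementTree as ET

def to_element(tag, value):
    """Recursively convert dicts, lists and plain values into XML elements."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_element(key, child))
    elif isinstance(value, list):
        for child in value:
            elem.append(to_element("item", child))
    else:
        elem.text = str(value)
    return elem

# Hypothetical nested data: an order containing a list of order lines.
order = {
    "customer": "A. Kumar",
    "lines": [{"sku": "P-100", "qty": 2}, {"sku": "P-200", "qty": 1}],
}

# The whole structure becomes one string of punctuated text...
text = ET.tostring(to_element("order", order), encoding="unicode")
print(text)

# ...which any receiver can parse back into a tree.
parsed = ET.fromstring(text)
skus = [line.findtext("sku") for line in parsed.find("lines")]
print(skus)
```

Note that everything comes back as text; deciding that "qty" is a number is again left to the receiving application.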

XML is licence-free, platform-independent and well supported. The latest products from Microsoft, including Visual Studio.NET, Hailstorm and the .NET platform, are fully compatible with XML.

Many companies are using this language according to their needs. Because it can serve various objectives, it can be seen on various platforms, in combination with different languages. XML is a key to the next-generation Internet, offering a way to unlock information so that it can be organised, programmed and edited, a way to distribute data in more useful forms to a variety of digital devices, and a way for Web sites to collaborate. Today XML is a young child developing various aspects of its personality; which of these will become the dominant trait that gives it its final adult character is anybody’s guess.