XHTML is a stricter, XML-based version of HTML. Both are languages in which web pages are written: HTML is based on SGML, while XHTML is based on XML. They are like two sides of the same coin. XHTML was derived from HTML so that web pages could conform to XML standards.
What is XHTML?
- XHTML stands for EXtensible HyperText Markup Language
- XHTML is a stricter, more XML-based version of HTML
- XHTML is HTML defined as an XML application
- XHTML is supported by all major browsers
XML is a markup language in which all documents must be marked up correctly (be “well-formed”). XHTML was developed to make HTML more extensible and more flexible in working with other data formats (such as XML). In addition, browsers ignore errors in HTML pages and try to display a page even when its markup is faulty, so XHTML comes with much stricter error handling.
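The difference in error handling can be illustrated by giving the same faulty markup to a strict XML parser and to a lenient HTML parser. The sketch below uses Python's standard library (`xml.etree.ElementTree` and `html.parser`) purely as stand-ins for a browser's two parsing modes; the helper names `is_well_formed` and `html_start_tags` are illustrative and not part of any XHTML tooling.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient HTML parser that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Feed markup to the lenient parser and return the start tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Two paragraphs whose closing tags were left out.
broken = "<p>First paragraph<p>Second paragraph"
# The same content with every element closed and a single root element.
fixed = "<div><p>First paragraph</p><p>Second paragraph</p></div>"

print(is_well_formed(broken))   # the XML parser refuses the document
print(html_start_tags(broken))  # the HTML parser still reports both paragraphs
print(is_well_formed(fixed))
```

The XML parser rejects the whole document on the first error, while the HTML parser carries on and recovers what it can. That is the behaviour the paragraph above describes.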
Overview of HTML and XHTML
HTML is the predominant markup language for web pages. It creates structured documents by denoting structural semantics for text such as headings, lists, links, and quotes. It allows images and objects to be embedded and can be used to create interactive forms. It is written as tags surrounded by angle brackets. For example, where <br> is valid in HTML, XHTML requires <br />.
Features of HTML vs XHTML documents
HTML documents are composed of elements that have three components: a pair of element tags (a start tag and an end tag), attributes given within the start tag, and the actual textual and graphic content. An HTML element is everything between and including the tags. (A tag is a keyword enclosed in angle brackets.) An XHTML document has exactly one root element. All element and attribute names must be in lowercase, attribute values must be enclosed in quotation marks, and every element must be closed and properly nested to be recognized. These rules are mandatory in XHTML, whereas in HTML they are optional. The DOCTYPE declaration determines which rules a document must follow.
- Aside from the different opening declarations for a document, the differences between an HTML 4.01 and an XHTML 1.0 document (in each of the corresponding DTDs) are largely syntactic.
- The underlying syntax of HTML allows many shortcuts that XHTML does not, such as elements with optional opening or closing tags, and even EMPTY elements which must not have an end tag.
- By contrast, XHTML requires all elements to have both an opening tag and a closing tag. XHTML, however, also introduces a new shortcut: an XHTML tag may be opened and closed within the same tag, by including a slash before the end of the tag, like this: <br/>
- The introduction of this shorthand, which is not used in the SGML declaration for HTML 4.01, may confuse earlier software unfamiliar with this new convention. A fix for this is to include a space before closing the tag, as such: <br />
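To an XML parser, the shorthand with or without the extra space means the same thing as an explicit opening and closing tag pair, whereas the HTML-style empty element is rejected. A minimal sketch, using Python's `xml.etree.ElementTree` as a stand-in XML parser (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

# Three XML spellings of an empty line-break element.
forms = ["<br/>", "<br />", "<br></br>"]

# Summarise each parsed element as (name, text content, number of children).
summaries = []
for form in forms:
    element = ET.fromstring(form)
    summaries.append((element.tag, element.text, len(element)))

# The HTML shortcut (no end tag, no slash) is not well-formed XML.
try:
    ET.fromstring("<p>Line one<br>Line two</p>")
    html_style_rejected = False
except ET.ParseError:
    html_style_rejected = True

print(summaries)
print(html_style_rejected)
```

All three spellings produce the same empty `br` element. The space before the slash matters only to older HTML software, not to XML.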
XHTML vs HTML Specification
HTML and XHTML are closely related and can therefore be documented together. Both HTML 4.01 and XHTML 1.0 have three sub-specifications: Strict, Transitional (loose), and Frameset. The different opening declarations of a document distinguish HTML from XHTML; the other differences are syntactic. HTML allows shortcuts such as elements with optional tags and empty elements without end tags, whereas XHTML is very strict about opening and closing tags. XHTML uses XML’s built-in language-defining attribute, xml:lang. A well-formed XHTML document meets all the syntax requirements of XML.
Note, though, that these differences apply only when an XHTML document is served as an application of XML; that is, with a MIME type of application/xhtml+xml, application/xml, or text/xml. An XHTML document served with a MIME type of text/html must be parsed and interpreted as HTML, so the HTML rules apply in this case.
This is especially important when you’re serving XHTML documents as text/html. Unless you’re aware of the differences, you may create style sheets that won’t work as intended if the document is later served as real XHTML, with a MIME type of application/xhtml+xml.
Where the terms “XHTML” and “XHTML document” appear in the remainder of this section, they refer to XHTML markup served with an XML MIME type. XHTML markup served as text/html is an HTML document as far as browsers are concerned.
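The MIME type rule above can be sketched as a small lookup. This is a simplification that covers only the media types named in this section; real browsers apply further rules, and the function name `parsing_mode` is hypothetical.

```python
# Media types that make a user agent treat the document as XML,
# as listed in the text above.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_mode(content_type):
    """Map an HTTP Content-Type header value to 'xml', 'html', or 'other'."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MIME_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "other"


print(parsing_mode("application/xhtml+xml; charset=utf-8"))
print(parsing_mode("text/html"))
```

The same XHTML markup therefore ends up in either mode, depending only on the header the server sends.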
How to migrate from HTML to XHTML
The W3C recommends the following steps for migrating HTML to XHTML (XHTML 1.0 documents):
- Include both xml:lang and lang attributes on any element that assigns a language.
- Use the empty-element syntax only for elements specified as empty in HTML.
- Include an extra space in empty-element tags: <br /> rather than <br/>.
- Include explicit close tags for elements that can have content but are empty: <div></div>, not <div />.
- Do not include the XML declaration.
If W3C’s compatibility guidelines are followed carefully, a user agent (web browser) should be able to interpret the document equally well as HTML or XHTML.
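The first guideline, pairing lang with xml:lang, is easy to check mechanically. The sketch below is one possible check, not a W3C tool: it uses Python's `xml.etree.ElementTree`, which exposes `xml:lang` under the XML namespace URI, and the function name `mismatched_lang_elements` is illustrative.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the predefined xml: prefix by its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def mismatched_lang_elements(markup):
    """Return the names of elements whose lang and xml:lang differ.

    An element is reported when only one of the two attributes is set,
    or when both are set to different values.
    """
    root = ET.fromstring(markup)
    problems = []
    for element in root.iter():
        if element.get("lang") != element.get(XML_LANG):
            problems.append(element.tag)
    return problems


sample = (
    "<div>"
    '<p lang="fr" xml:lang="fr">Bonjour</p>'  # both attributes: fine
    '<p lang="de">Hallo</p>'                  # xml:lang missing: reported
    "<p>Hello</p>"                            # no language assigned: fine
    "</div>"
)

print(mismatched_lang_elements(sample))
```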
How to migrate from XHTML to HTML
To understand the subtle differences between HTML and XHTML, consider the transformation of a valid and well-formed XHTML 1.0 document into a valid HTML 4.01 document. This translation requires the following steps:
- The language for an element should be specified with a lang attribute rather than the XHTML xml:lang attribute. XHTML uses XML’s built-in language-defining attribute.
- Remove the XML namespace (xmlns=URI). HTML has no facilities for namespaces.
- Change the document type declaration from XHTML 1.0 to HTML 4.01.
- If present, remove the XML declaration. (Typically this is: <?xml version="1.0" encoding="utf-8"?>).
- Ensure that the document’s MIME type is set to text/html. For both HTML and XHTML, this comes from the HTTP Content-Type header sent by the server.
- Change the XML empty-element syntax to an HTML-style empty element (<br /> to <br>).
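The textual steps above can be sketched with a few regular-expression substitutions. This is a rough illustration rather than a robust converter. It assumes elements carry xml:lang but not already a lang attribute, and that no attribute value or text contains the sequence "/>". It does not cover the MIME type step, which is a server setting. The function name `xhtml_to_html` and the choice of the HTML 4.01 Strict DOCTYPE are assumptions made for the example.

```python
import re

# Assumed target: the HTML 4.01 Strict document type declaration.
HTML401_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
)


def xhtml_to_html(source):
    """Apply the textual XHTML 1.0 -> HTML 4.01 steps to a document string."""
    # Remove the XML declaration, if present.
    text = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", source)
    # Swap the XHTML document type declaration for the HTML 4.01 one.
    text = re.sub(r"<!DOCTYPE[^>]*>", HTML401_DOCTYPE, text, count=1)
    # Remove the XML namespace; HTML has no facilities for namespaces.
    text = re.sub(r'\s+xmlns="[^"]*"', "", text)
    # Specify the language with lang instead of xml:lang.
    text = re.sub(r"\bxml:lang=", "lang=", text)
    # Turn XML empty-element syntax into HTML-style empty elements.
    text = re.sub(r"\s*/>", ">", text)
    return text


xhtml = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head><title>Example</title></head>
<body><p>Line one<br />Line two</p></body>
</html>"""

html = xhtml_to_html(xhtml)
print(html)
```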
The Most Important Differences from HTML
- <!DOCTYPE> is mandatory
- The xmlns attribute in <html> is mandatory
- <html>, <head>, <title>, and <body> are mandatory
- Elements must always be properly nested
- Elements must always be closed
- Elements must always be in lowercase
- Attribute names must always be in lowercase
- Attribute values must always be quoted
- Attribute minimization is forbidden
XHTML – <!DOCTYPE ….> Is Mandatory
An XHTML document must have an XHTML <!DOCTYPE> declaration. The <html>, <head>, <title>, and <body> elements must also be present, and the xmlns attribute in <html> must specify the xml namespace for the document.
Here is an XHTML document with a minimum of required tags:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title of document</title>
</head>
<body>
some content here...
</body>
</html>
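The mandatory parts listed above can be checked with a short script. The sketch below is a rough structural check and not a DTD validator. It looks for the DOCTYPE with a plain substring test, because `xml.etree.ElementTree` does not expose the declaration, and it assumes the standard XHTML namespace URI `http://www.w3.org/1999/xhtml`. The function name `missing_required_parts` is illustrative.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def missing_required_parts(document):
    """Return labels for the mandatory XHTML parts the document lacks."""
    root = ET.fromstring(document)
    missing = []
    # Crude DOCTYPE check: the parser discards the declaration itself.
    if "<!DOCTYPE" not in document:
        missing.append("doctype")
    # The root must be <html> in the XHTML namespace (set through xmlns).
    if root.tag != "{%s}html" % XHTML_NS:
        missing.append("html element in the XHTML namespace")
        return missing
    namespaces = {"x": XHTML_NS}
    checks = [("x:head", "head"), ("x:head/x:title", "title"), ("x:body", "body")]
    for path, label in checks:
        if root.find(path, namespaces) is None:
            missing.append(label)
    return missing


minimal = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title of document</title>
</head>
<body>
some content here...
</body>
</html>"""

without_title = minimal.replace("<title>Title of document</title>", "")

print(missing_required_parts(minimal))
print(missing_required_parts(without_title))
```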
XHTML Elements Must be Properly Nested
In XHTML, elements must always be properly nested within each other, like this:
<b><i>Some text</i></b>
This is wrong:
<b><i>Some text</b></i>
XHTML Elements Must Always be Closed
In XHTML, elements must always be closed, like this:
<p>This is a paragraph</p> <p>This is another paragraph</p>
This is wrong:
<p>This is a paragraph <p>This is another paragraph
XHTML Empty Elements Must Always be Closed
In XHTML, empty elements must always be closed, like this:
A break: <br /> A horizontal rule: <hr /> An image: <img src="happy.gif" alt="Happy face" />
This is wrong:
A break: <br> A horizontal rule: <hr> An image: <img src="happy.gif" alt="Happy face">
XHTML Elements Must be in Lowercase
In XHTML, element names must always be in lowercase, like this:
<body> <p>This is a paragraph</p> </body>
This is wrong:
<BODY> <P>This is a paragraph</P> </BODY>
XHTML Attribute Names Must be in Lowercase
In XHTML, attribute names must always be in lowercase, like this:
<a href="https://www.futurefundamentals.com/html/">Visit our HTML tutorial</a>
This is wrong:
<a HREF="https://www.futurefundamentals.com/html/">Visit our HTML tutorial</a>
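The lowercase rules follow from XML being case-sensitive: to an XML parser, BODY and body are different names. The sketch below lists names that are not lowercase and shows that mixed-case tag pairs do not match. It uses Python's `xml.etree.ElementTree` as a stand-in parser, and the function name `non_lowercase_names` is illustrative.

```python
import xml.etree.ElementTree as ET


def local_name(name):
    """Strip a leading '{namespace}' part, if any, from a parsed name."""
    return name.rsplit("}", 1)[-1]


def non_lowercase_names(markup):
    """Return element and attribute names that are not all lowercase."""
    root = ET.fromstring(markup)
    offenders = []
    for element in root.iter():
        tag = local_name(element.tag)
        if tag != tag.lower():
            offenders.append(tag)
        for attribute in element.attrib:
            name = local_name(attribute)
            if name != name.lower():
                offenders.append(name)
    return offenders


sample = (
    "<BODY><P>This is a paragraph</P>"
    '<a HREF="https://www.futurefundamentals.com/html/">Visit our HTML tutorial</a>'
    "</BODY>"
)

# Because names are case-sensitive, <BODY> cannot be closed by </body>.
try:
    ET.fromstring("<BODY></body>")
    mixed_case_rejected = False
except ET.ParseError:
    mixed_case_rejected = True

print(non_lowercase_names(sample))
print(mixed_case_rejected)
```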
XHTML Attribute Minimization is Forbidden
In XHTML, attribute minimization is forbidden:
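<input type="checkbox" name="vehicle" value="car" checked="checked" /> <input type="text" name="lastname" disabled="disabled" />
This is wrong:
<input type="checkbox" name="vehicle" value="car" checked /> <input type="text" name="lastname" disabled />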
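An XML parser enforces both attribute rules, no minimization and quoted values, because an attribute without a quoted value is not well-formed XML. A minimal sketch, again using Python's `xml.etree.ElementTree` as a stand-in parser, with a hypothetical helper `parse_or_none`:

```python
import xml.etree.ElementTree as ET


def parse_or_none(markup):
    """Return the parsed element, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


# Minimized attribute: a name with no value at all.
minimized = '<input type="checkbox" name="vehicle" value="car" checked />'
# Expanded form: the attribute name repeated as its value.
expanded = '<input type="checkbox" name="vehicle" value="car" checked="checked" />'
# Unquoted attribute value, also allowed in HTML but not in XML.
unquoted = "<input type=text />"

print(parse_or_none(minimized))                # None: rejected by the parser
print(parse_or_none(expanded).get("checked"))  # the attribute value is available
print(parse_or_none(unquoted))                 # None: rejected by the parser
```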