html2xhtml

A free-software converter from HTML to XHTML

Latest news

[Jul 26, 2011] Html2xhtml uploaded to GitHub
Html2xhtml has been uploaded to GitHub, so the git repository of the project is now publicly accessible. Please feel free to collaborate by forking it and sending pull requests. Bug reports and feature requests may be filed at the issue tracker of html2xhtml on GitHub.

[Apr 7, 2010] Released html2xhtml 1.1.2-2
A new version of html2xhtml (1.1.2-2) has been released. It includes some minor changes and a bug fix. The bug only affected the program when compiled with glibc up to version 2.0.6, which was the case for the Windows binary distributed on this site.

[Mar 25, 2010] Library for calling html2xhtml in the .NET 4.0 framework
A developer in Germany who uses html2xhtml in one of his current projects has just released a .NET 4.0 library that binds to html2xhtml. It makes it easier to integrate html2xhtml into a C# program. The library is called Html2Xhtml, is also free software, and is available at its website on SourceForge. I'd like to express my gratitude to him for sharing the library with us.

Introduction

html2xhtml converts HTML files into XHTML. It can fix many common errors in HTML files, such as missing end tags, elements that violate their content model, and non-standard elements or attributes. It can also take XHTML input that is invalid or not well-formed and clean it up to produce well-formed, valid XHTML output. The output document type can be chosen from several XHTML DTDs (1.0, 1.1, Basic, etc.).

You can convert HTML files from a Web browser using the online conversion form, or download the program and run it on your own computer as a command-line tool, which is considerably more convenient for batch conversion and off-line use. The program is free software, licensed under the terms of the GNU General Public License (GPL) version 2.
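For batch conversion, the command-line tool can be driven from a small script. The following Python sketch is only an illustration and makes two assumptions that are not stated in this text: that the program is installed as `html2xhtml` on the search path, and that it reads the HTML document from standard input and writes the XHTML result to standard output when given no arguments. The function name `convert_tree` and its parameters are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def convert_tree(src_dir, dst_dir, command=("html2xhtml",)):
    """Convert every *.html file under src_dir into a *.xhtml file under dst_dir.

    Assumption: `command` reads HTML on standard input and writes XHTML to
    standard output when it is given no file arguments.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    converted = []
    for src in sorted(src_dir.rglob("*.html")):
        # Mirror the source layout, changing only the file extension.
        dst = (dst_dir / src.relative_to(src_dir)).with_suffix(".xhtml")
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Pipe the file through the converter; check=True raises
        # CalledProcessError if the converter exits with a non-zero status.
        result = subprocess.run(
            list(command),
            input=src.read_bytes(),
            stdout=subprocess.PIPE,
            check=True,
        )
        dst.write_bytes(result.stdout)
        converted.append(dst)
    return converted


if __name__ == "__main__":
    # Usage: python convert_tree.py SOURCE_DIR TARGET_DIR
    for path in convert_tree(sys.argv[1], sys.argv[2]):
        print(path)
```

Passing the command as a parameter keeps the sketch independent of any particular option set; options for the output document type or line length can be appended to `command` once checked against the program's own documentation.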

If you need to call html2xhtml from a program in the .NET framework, there is a separate project, written by another developer, that provides a .NET 4.0 library (also called Html2Xhtml). The library uses html2xhtml internally and has also been released under the GPL, version 2 or later.

The program is written in C and does not depend on any libraries other than the GNU libc and GNU libiconv. It has been tested on both GNU/Linux and Windows, and I hope it can also be compiled for other environments. Please let me know if you succeed in building and running it on other platforms.

A Web API for developers has recently been released (it is still in beta). It allows other programs to invoke html2xhtml remotely through HTTP.
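As a rough idea of what a client of such an API could look like, the Python sketch below builds an HTTP POST request that carries an HTML document in its body. Everything specific here is an assumption made for illustration: the endpoint URL is a placeholder (the real one is given in the Web API documentation, not in this text), and the request and response shapes (HTML in the request body, XHTML in the response body) are guesses that must be checked against that documentation. The names `build_request` and `convert_remote` are hypothetical.

```python
import urllib.request

# Hypothetical placeholder: the real endpoint is documented with the Web API
# and is not given in this text.
EXAMPLE_ENDPOINT = "http://example.org/html2xhtml"


def build_request(html, endpoint=EXAMPLE_ENDPOINT, charset="utf-8"):
    """Build (but do not send) an HTTP POST request carrying an HTML document.

    Assumption: the service accepts the document as the request body and
    returns the converted XHTML as the response body.
    """
    return urllib.request.Request(
        endpoint,
        data=html.encode(charset),
        headers={"Content-Type": f"text/html; charset={charset}"},
        method="POST",
    )


def convert_remote(html, endpoint=EXAMPLE_ENDPOINT, timeout=30):
    """Send the request and return the response body decoded as text."""
    request = build_request(html, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Keeping request construction separate from sending makes it possible to inspect or log the request without touching the network.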

Contribute!

Want to contribute? You can contribute or contact me through the GitHub page of html2xhtml.

Bug reports may be filed at the issue tracker of html2xhtml on GitHub.

Other resources

The xhtmlpedia is a browsable list of XHTML elements and attributes. It lists the elements available for each XHTML DTD, their content rules, attributes, etc. I find it much easier to read and browse than the actual DTDs. The xhtmlpedia has been automatically created from the DTDs with the help of the module of html2xhtml that encodes the definitions of the XHTML DTDs. It is updated frequently to keep it in sync with the official DTDs.

References