html2xhtml

A free-software converter from HTML to XHTML

Latest news

[Sep 23, 2015] Released html2xhtml 1.3

A new version of html2xhtml (1.3) has been released. It contains some small changes and bug fixes.

[Jul 26, 2011] html2xhtml uploaded to GitHub

html2xhtml has been uploaded to GitHub, so the project's git repository is now publicly accessible. Feel free to collaborate by forking the repository and sending pull requests. Bug reports and feature requests may be filed at the html2xhtml issue tracker on GitHub.

[Apr 7, 2010] Released html2xhtml 1.1.2-2

A new version of html2xhtml (1.1.2-2) has been released. It has some minor changes and a bug fix. The bug only affected the program when compiled with glibc up to version 2.0.6, as was the case for the Windows binary distributed on this site.

Introduction

html2xhtml converts HTML files into XHTML. It can fix many common errors in HTML files (e.g. missing end tags, elements that violate their content model, non-standard elements or attributes). It can also take XHTML input that is invalid or not well-formed and clean it up to produce well-formed, valid XHTML output. The output document type can be selected from several XHTML DTDs (1.0, 1.1, Basic, etc.).
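To make "well-formed" concrete: XHTML is XML, so every element must be closed and properly nested. The sketch below is only an illustration in Python using the standard library's XML parser. It is not part of html2xhtml, and the helper name `is_well_formed` is invented for this example. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical HTML: <br> is never closed and the </p> end tag is missing.
html_fragment = "<p>first line<br>second line"

# The same content written as well-formed XHTML.
xhtml_fragment = "<p>first line<br />second line</p>"

print(is_well_formed(html_fragment))   # False
print(is_well_formed(xhtml_fragment))  # True
```

Turning the first fragment into the second is the kind of repair a converter such as html2xhtml performs.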

You can convert HTML files from a Web browser using the online conversion form, or download the program and run it on your own computer as a command-line tool, which is considerably more convenient for batch conversion and offline use. The program is free software, licensed under the terms of the GNU General Public License (GPL) version 2.
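As a rough sketch of batch conversion with the command-line tool, the Python script below walks a directory and runs html2xhtml once per file. It assumes that the `html2xhtml` binary is on the PATH, that it accepts the input file name followed by `-t` (output document type) and `-o` (output file), and that `transitional` is an accepted type name. Please confirm these against `html2xhtml --help` or the man page of your installed version. The function names are made up for this example.

```python
import subprocess
from pathlib import Path


def build_command(src, dst, doctype="transitional"):
    """Build the argument list for one conversion.

    The -t and -o options are assumptions; check your version's help text.
    """
    return ["html2xhtml", str(src), "-t", doctype, "-o", str(dst)]


def convert_tree(root, doctype="transitional"):
    """Convert every *.html file under root, writing a *.xhtml file beside it."""
    written = []
    for src in sorted(Path(root).rglob("*.html")):
        dst = src.with_suffix(".xhtml")
        # check=True raises CalledProcessError if the converter exits non-zero.
        subprocess.run(build_command(src, dst, doctype), check=True)
        written.append(dst)
    return written
```

Calling `convert_tree("site/")` would then convert a whole directory tree in one go; the originals are left untouched because the output goes to a new file name.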

If you need to call html2xhtml from a program in the .NET framework, there is a separate project, authored by another developer, that provides a .NET 4.0 library (also called Html2Xhtml). The library uses html2xhtml internally and has also been released under the GPL, version 2 or later.

The program is written in C and does not depend on any libraries other than the GNU libc and GNU libiconv. It has been tested on both GNU/Linux and Windows, but I hope it can also be compiled for other environments. Please let me know if you succeed in building and running it on other platforms.

A Web API for developers has recently been released (still in beta). It allows other programs to invoke html2xhtml remotely over HTTP.
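For illustration only, a client of such an HTTP service could look like the Python sketch below. The endpoint URL is a placeholder, and the request shape (a POST with the HTML document as the body and a `text/html` content type) is an assumption rather than the documented interface. Take the real URL and parameters from the Web API documentation.

```python
import urllib.request

# Placeholder: replace with the endpoint given in the Web API documentation.
API_URL = "https://example.org/html2xhtml-api"


def build_request(html, url=API_URL):
    """Prepare a POST request carrying the HTML document as its body.

    The body format and Content-Type header are assumptions.
    """
    return urllib.request.Request(
        url,
        data=html.encode("utf-8"),
        headers={"Content-Type": "text/html; charset=utf-8"},
        method="POST",
    )


def convert_remote(html, url=API_URL, timeout=30.0):
    """Send the document and return the response body as text."""
    with urllib.request.urlopen(build_request(html, url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

Keeping request construction separate from the network call makes it easy to inspect or adjust the request once the actual API details are known.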

Contribute!

Want to contribute? You can contribute or contact me through the html2xhtml page on GitHub.

Bug reports may be filed at the html2xhtml issue tracker on GitHub.

Other resources

The xhtmlpedia is a browsable list of XHTML elements and attributes. It lists the elements available in each XHTML DTD, along with their content rules, attributes, etc. I find it much easier to read and browse than the actual DTDs. The xhtmlpedia is generated automatically from the DTDs, using the html2xhtml module that encodes the XHTML DTD definitions. It is updated frequently to keep it in sync with the official DTDs.

References