There is a wide variety of JS and CSS minification tools, but almost none for HTML. Kangax has now developed a new HTML minification tool: a very small “wrapper” on top of a parser, only about 250 LOC. It takes an input string and a configuration object, passes the input string to the parser, and builds the final output according to the specified options. “Minifier relies on HTML parser by John Resig. John’s parser was capable of handling quite complex documents, but would sometimes trip on some more obscure structures. For example, doctype declarations weren’t understood at all. Whenever attribute name contained characters like “-” (e.g. as in “http-equiv”), parser would fail. There were also some deficiencies in regular expressions for matching comments and CDATA sections: newlines inside them weren’t accounted for, so multiline comments simply weren’t matched. CDATA sections and comments inside elements with CDATA content model (e.g. SCRIPT and STYLE) were getting stripped for no apparent reason,” noted Kangax.
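To make the multiline comment problem concrete, here is a minimal sketch in Python rather than the minifier's own JavaScript. Both patterns and the `strip_comments` helper are hypothetical, written only for illustration; they are not the expressions used in John Resig's parser or in Kangax's fix. The point is simply that `.` does not match a newline by default, so a comment pattern built on it skips any comment that spans more than one line.

```python
import re

# Hypothetical patterns for illustration only; not the parser's actual expressions.
NAIVE_COMMENT = re.compile(r"<!--(.*?)-->")           # "." stops at a newline
MULTILINE_COMMENT = re.compile(r"<!--([\s\S]*?)-->")  # [\s\S] also matches newlines


def strip_comments(html: str, pattern: re.Pattern) -> str:
    """Remove every comment that the given pattern is able to match."""
    return pattern.sub("", html)


single = "<p>a</p><!-- one line --><p>b</p>"
multi = "<p>a</p><!-- spans\ntwo lines --><p>b</p>"

# The naive pattern handles a one-line comment.
naive_single = strip_comments(single, NAIVE_COMMENT)

# The naive pattern never matches a comment containing a newline,
# so the input comes back unchanged.
naive_multi = strip_comments(multi, NAIVE_COMMENT)

# The newline-aware pattern removes the multiline comment as well.
fixed_multi = strip_comments(multi, MULTILINE_COMMENT)
```

The `[\s\S]` character class is one common way to match "any character including newlines" and works the same way in JavaScript regular expressions; in Python, passing `re.DOTALL` to the first pattern would have the same effect.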
More info: HTML Minifier