Jason Harwig has posted about AntiSamy, the Java 1.5 compatible library that sanitizes away:
Philosophically, AntiSamy is a departure from all contemporary security mechanisms. Generally, the security mechanism and user have a communication that is virtually one way, for good reason. Letting the potential attacker know details about the validation is considered unwise as it allows the attacker to "learn" and "recon" the mechanism for weaknesses. These types of information leaks can also hurt in ways you don't expect. A login mechanism that tells the user, "Username invalid" leaks the fact that a user by that name does not exist. A user could use a dictionary or phone book or both to remotely come up with a list of valid usernames. Using this information, an attacker could launch a brute force attack or massive account lock denial-of-service. So, we get that.[Full wiki]
HTML filtering was never recommended because it was so difficult to get right, and with no proven libraries, trying to build a solution would almost certainly contain security holes. Thanks to Arshan Dabirsiaghi we finally have something to use. He has created the OWASP AntiSamy project to easily sanitize HTML input. AntiSamy is currently implemented as a Java 1.5 compatible library, but there are plans to support other platforms.
Here's a sample usage: