Add CanBeEmpty instance for ScriptContent?

Replace Only with a reference to Control.Monad.Identity.

Add block tables to the TableExtras module.

Consider removing the wrapping monad in the _allowed_in classes. This
would require the use of list syntax for lists, but would make the error
messages for certain mistakes far clearer. It'd also remove the need for
'sole', I think.

Consider making string produce values acceptable to the following
elements: option, textarea, title, style, script. This would be more
regular, but would require either that the combinators for those
elements take Only String, or that String be a 0-order class, which
would be awkward because String is declared with type, not data.

Add XML comments.
  Necessary for the sake of horrid IE conditional comments. Since stuff
  is meant to be html here, perhaps better to implement a conditional
  comment feature.
  Pitfall: List1 things -- although, since it's a comment, there's an
  argument that it always counts for nothing, and perhaps the content
  doesn't need to be valid either...

Add instances of CanBeEmpty for the empty_element_classes?

XHtml parser (or read instances!). Probably the best way of doing that
would be to provide a pair of functions: one from the Html type to the
XML type of HXT or HaXML (which would be total) and - what we'd want to
get a parser - one back (which would be partial, since DTD valid XML
isn't necessarily valid XHTML). But which one to choose?

Read instance for Charset.

Decide what to do about REQUIRED attributes. We certainly don't want to
keep the current arrangement (where elements just pick an arbitrary
value for such attributes!).
One method would be to change the attribute setters from a list to a
function (and build the function by composition from individual
attribute setters (class "foo" .
name "bar")), then for attribute lists with REQUIRED elements, use a
parameterised type with a hole in it, so that e.g.

  map :: (Map_attrs Id_required -> Map_attrs ID) -> contents -> result

Difficulties include having to have multiple arguments for multiple
required attributes, and what to do about attributes that appear both
REQUIRED and IMPLIED.
Another option would be to make elements with required attributes take
them as parameters, but that's rather ugly since they are passed in a
different way and you'd have to remember the order when there was more
than one.

Consider using composition rather than append for +++ so that it can
apply to an invalid initial object and get out a valid final one --
useful for getting rid of (empty, tr...) etc, since you can start with
(empty, []) and still get a (Maybe t, List1 t') out at the end. It's not
clear how to do this without MPTCs (and using MPTCs might make another
method available). Perhaps I should reconsider the initial decision not
to use MPTCs in these libraries?

Some attribute types still need attention:
  width in col and colgroup should be %MultiLength, not %Length (but
  width for object, img and table is %Length).
  HTML and xhtml seem to disagree on whether the relative widths in
  MultiLengths are integers or fractions.
  Some name attributes are CDATA, some NMTOKEN. This is a pain in the
  neck, but name and ID should probably be interdependent to cope with
  the difference between HTML and xhtml.

See if there's anything to be done about DOCTYPE wrt IE; anything before
DOCTYPE throws IE into quirks mode, I read, so the xml version
processing instruction will cause problems. Furthermore, IE and other
old browsers misbehave if served xhtml.
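
One stopgap for the quirks-mode problem might be to make the XML
declaration optional when rendering, so that the DOCTYPE can be the
first thing in the output. A minimal sketch follows. Prologue,
renderPrologue, xmlDecl and strictDoctype are hypothetical names, not
part of the library, and the UTF-8 encoding and the choice of the
XHTML 1.0 Strict DOCTYPE are assumptions. Leaving the declaration out
should only be reasonable when the document is UTF-8 or UTF-16, or when
the encoding is supplied some other way (e.g. by the HTTP headers).

```haskell
-- Hypothetical sketch: none of these names exist in the library.

-- | Whether to emit the XML declaration before the DOCTYPE.
data Prologue
  = WithXmlDecl     -- ^ strictly conventional XML; IE falls into quirks mode
  | WithoutXmlDecl  -- ^ DOCTYPE comes first; assumes the encoding is known
  deriving (Eq, Show)

-- | The XML declaration (the UTF-8 encoding is an assumption).
xmlDecl :: String
xmlDecl = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"

-- | The XHTML 1.0 Strict document type declaration.
strictDoctype :: String
strictDoctype =
  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
    ++ "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"

-- | Render everything that precedes the root element, one item per line.
renderPrologue :: Prologue -> String
renderPrologue WithXmlDecl    = unlines [xmlDecl, strictDoctype]
renderPrologue WithoutXmlDecl = unlines [strictDoctype]
```

In GHCi, putStr (renderPrologue WithoutXmlDecl) prints the DOCTYPE line
alone, with nothing in front of it. This does nothing for browsers that
misbehave when served xhtml at all.
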
Since the underlying semantics of xhtml 1.0 and html 4.01 are the same,
I think the proper solution is to provide renderHtml that outputs
Html 4.01 from the same datastructure, taking note of the issues
mentioned at (drop id from elements that don't have either it or name,
put it in name if it has that instead, drop all xml: attributes).

xhtml elements having name: [meta, a, object, param, map, label, input,
select, textarea, button]; all 77 xhtml elements have an id.
In html, three more elements [img, form, head] have name, while [base,
head, html, meta, script, style, title] do not have id attributes.
Note also that in xhtml "name" is deprecated for [a, applet, form,
frame, iframe, img, map] (not that applet, frame or iframe appear in the
strict dtd).

Unfortunately the abstract syntax trees for the two languages are
different: for example, xhtml body takes (...)* while html4.01 body
takes (...)+
This seems to be a pointless difference, but there it is.
List of * v + differences:

  element      html   xhtml
  body         +      *
  blockquote   +      *
  form         +      *
  noscript     +      *

The above could be handled by munging in gratuitous empty div elements,
but what about this:

  element      html            xhtml
  map          (block|area)+   (block+|area+)

The xhtml content for map is a subset of the html, so making a valid
html tree out of xhtml is trivial, but good grief! In what sense is this
a translation?

Pitfall: CDATA v PCDATA -- see ; which says

  Note that attribute value literals are always parsed as replaceable
  character data, regardless of the attribute's declared value. This
  means that references (&xxx;, &#yyy;) are recognized and replaced in
  attribute specifications, even for CDATA attributes.

so there's no problem with attribute values, but for