7.5. HTMLTags - generate HTML in Python

7.5.1 Overview

The HTMLTags module defines one class for each valid HTML tag, named in uppercase letters. To create a piece of HTML, the general syntax is :

t = TAG(content, key1=val1,key2=val2,...)

so that print t results in :

<TAG key1="val1" key2="val2" ...>content</TAG>

For instance :

print A('bar', href="foo")   ==> <A href="foo">bar</A>

Attributes whose names clash with Python keywords or built-in names (class, type) must be capitalized :

print DIV('bar', Class="title")   ==> <DIV Class="title">bar</DIV>

To generate HTML attributes without value, give them the value True :

print OPTION('foo',SELECTED=True,value=5)   ==> <OPTION value="5" SELECTED>foo

For non-closing tags such as <IMG> or <BR>, the print statement does not generate the closing tag
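
The rendering rules above can be illustrated with a small stand-alone sketch. This is not the HTMLTags source : the class Tag, its name and closing attributes, and the sorted attribute order are assumptions chosen only to make the example deterministic. It shows how keyword arguments could become attributes, how a True value could produce an attribute without value, and how a non-closing tag could skip its end tag :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
class Tag(object):
    """Hypothetical stand-in for an HTMLTags class."""
    name = "TAG"
    closing = True  # set to False for tags such as <BR> or <IMG>

    def __init__(self, content="", **attrs):
        self.content = content
        self.attrs = attrs

    def __str__(self):
        parts = [self.name]
        # sorted() is an assumption, used here to get a stable attribute order
        for key, value in sorted(self.attrs.items()):
            if value is True:
                parts.append(key)  # attribute without a value
            else:
                parts.append('%s="%s"' % (key, value))
        html = "<%s>%s" % (" ".join(parts), self.content)
        if self.closing:
            html += "</%s>" % self.name
        return html


class A(Tag):
    name = "A"


class BR(Tag):
    name = "BR"
    closing = False  # no </BR> is generated


link = str(A("bar", href="foo"))
line_break = str(BR())
print(link)
print(line_break)
```

Here link holds the same string as the A('bar', href="foo") example above, and line_break has no closing tag.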

7.5.2 Tags concatenation

To add a sibling to a tag (an element at the same level in the tree) use the addition operator :

print B('bar')+INPUT(name="bar")   ==> <B>bar</B><INPUT name="bar">

You can also repeat an instance using the multiplication operator :

print TD('&nbsp;')*3   ==> <TD>&nbsp;</TD><TD>&nbsp;</TD><TD>&nbsp;</TD>

If you have a list of instances, you can concatenate the items with the function Sum() :

Sum([ (I(i)+':'+B(i*i)+BR()) for i in range(100) ])

generates 100 lines showing the integers from 0 to 99 and their squares, each line ending with a line break
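
As a rough idea of how these operators might be implemented, here is a minimal sketch based on Python's special methods. It is not the HTMLTags source : the class Node and the function concat (a stand-in for Sum()) are hypothetical names, and elements are reduced to their rendered string :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
class Node(object):
    """Hypothetical element that renders to a fixed piece of HTML."""

    def __init__(self, html):
        self.html = html

    def __str__(self):
        return self.html

    def __add__(self, other):
        # a + b: sibling elements rendered one after the other
        return Node(str(self) + str(other))

    def __radd__(self, other):
        # lets a plain string appear on the left: 'text' + node
        return Node(str(other) + str(self))

    def __mul__(self, count):
        # a * n: the same element repeated n times
        return Node(str(self) * count)


def concat(nodes):
    """Hypothetical equivalent of Sum(): join a list of elements."""
    result = Node("")
    for node in nodes:
        result = result + node
    return result


cells = Node("<TD>&nbsp;</TD>") * 3
rows = concat([Node("<I>%d</I>" % i) + ":" + Node("<B>%d</B>" % (i * i))
               for i in range(3)])
print(cells)
print(rows)
```

Because + accepts plain strings on either side, text such as ':' can be mixed freely with elements, as in the Sum() example above.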

7.5.3 Building an HTML document

An HTML document is a tree of elements ; HTMLTags provides a simple way of building this tree

The content argument can be an instance of an HTMLTags class, so that you can nest tags, like this :

print B(I('foo'))   ==> <B><I>foo</I></B>

If you think of the document as a tree, this means that the instance I('foo') is a child of the instance of class B

If you have to build a more complex tree, this approach forces you to keep careful track of the opening and closing brackets, and the code rapidly becomes difficult to read and maintain. It also means that you build the tree "bottom-up"

An alternative is to build the tree "top-down" : build the nesting element first, then add the children. HTMLTags uses the operator <= as a synonym of "add child"

You can compare the 2 approaches with this example :

  • "bottom-up"

    # build lines first
    lines = INPUT(name="zone1",value=kw.get("zone1",""))
    lines += BR()+INPUT(name="zone2",value=kw.get("zone2",""))
    lines += BR()+INPUT(Type="submit",value="Ok")
    # build and print form
    print FORM(lines,action="validate",method="post")

  • "top-down"

    # build form first
    form = FORM(action="validate",method="post")
    # add child elements
    form <= INPUT(name="zone1",value=kw.get("zone1",""))
    form <= BR()+INPUT(name="zone2",value=kw.get("zone2",""))
    form <= BR()+INPUT(Type="submit",value="Ok")
    print form
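
The <= operator is ordinarily a comparison in Python ; a class can give it another meaning by defining the special method __le__. The sketch below shows one way this could be done. It is not the HTMLTags source, and the class Element with its children list is a hypothetical name used for illustration only :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
class Element(object):
    """Hypothetical tree node whose children are added with <=."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text
        self.children = []

    def __le__(self, child):
        # parent <= child appends the child; the usual comparison
        # meaning of <= is deliberately dropped
        self.children.append(child)
        return self

    def __str__(self):
        inner = self.text + "".join(str(c) for c in self.children)
        return "<%s>%s</%s>" % (self.name, inner, self.name)


form = Element("FORM")
form <= Element("B", "zone1")
form <= Element("I", "zone2")
print(form)
```

Each form <= ... line is an expression statement used only for its side effect, which is why the tree can be filled step by step after the parent has been created.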

To build a complex document, the top-down approach is probably more readable

head = HEAD()
head <= LINK(rel="Stylesheet",href="../doc.css")
head <= TITLE('Record collection')
 
body = BODY()
body <= H1('My record collection')
 
table = TABLE(Class="content")
table <= TR(TH('Title')+TH('Artist'))
for rec in records:
    table <= TR(TD(rec.title,Class="title")+TD(rec.artist,Class="artist"))
 
body <= table
 
print HTML(head+body)

7.5.4 Inspecting the document tree

Tags have 2 methods to find the elements that match certain conditions :

  • get_by_tag(tag_name) : returns the list of the elements with the specified tag name
  • get_by_attr(arg1=val1,arg2=val2...) : returns the list of the elements whose attributes match the specified condition

For instance, if you have built a table and want to present odd and even rows in different styles, you can use get_by_tag() and change the attribute "Class" of the TD tags this way :

classes = ['row_even','row_odd']
lines = table.get_by_tag('TR')
for i,line in enumerate(lines):
    cells = line.get_by_tag('TD')
    for cell in cells:
        cell.attrs['Class'] = classes[i%2]
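
Both methods amount to walking the tree and filtering the elements found. The following sketch shows the idea on a hypothetical Element class ; it is not the HTMLTags source, and it assumes that only descendants are searched (not the element itself), which may differ from the real module :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
class Element(object):
    """Hypothetical tree node with a name, children and attributes."""

    def __init__(self, name, children=(), **attrs):
        self.name = name
        self.children = list(children)
        self.attrs = attrs

    def walk(self):
        # yield every descendant, depth first, in document order
        for child in self.children:
            yield child
            for descendant in child.walk():
                yield descendant

    def get_by_tag(self, tag_name):
        return [e for e in self.walk() if e.name == tag_name]

    def get_by_attr(self, **conditions):
        # an element matches when every given attribute has the given value
        return [e for e in self.walk()
                if all(e.attrs.get(k) == v for k, v in conditions.items())]


table = Element("TABLE", [
    Element("TR", [Element("TD", Class="title"), Element("TD", Class="artist")]),
    Element("TR", [Element("TD", Class="title"), Element("TD", Class="artist")]),
])
classes = ["row_even", "row_odd"]
for i, row in enumerate(table.get_by_tag("TR")):
    for cell in row.get_by_tag("TD"):
        cell.attrs["Class"] = classes[i % 2]

odd_cells = table.get_by_attr(Class="row_odd")
print(len(odd_cells))
```

After the loop, the two cells of the second row carry the class row_odd and are the only ones returned by get_by_attr(Class="row_odd").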

7.5.5 SELECT tags, checkboxes and radiobuttons

When building an HTML document, there is often a set of data (the result of a database query, for instance) that should be presented to the end-user as a list of options in a SELECT tag, or as a list of radiobuttons or checkboxes. Generally, one or more of the options are selected or checked because they match a certain condition

HTMLTags provides special methods for the SELECT tag to initialize it from the set of data, and to mark one or several options as selected :

  • from_list(data) : returns the SELECT tag with OPTION tags taken from the list data. Each OPTION tag has the item value as content and the item rank in the list as value :

    s = SELECT().from_list(["foo","bar"])   ==>

    <SELECT>
    <OPTION value="0">foo
    <OPTION value="1">bar
    </SELECT>
  • select(content=item) or select(value=item) : mark the options with the specified content or value as selected, and the other options as not selected. item can be a list of contents or values, for SELECT tags with the MULTIPLE option set

    s.select(content="bar")   ==>

    <SELECT>
    <OPTION value="0">foo
    <OPTION value="1" SELECTED>bar
    </SELECT>
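
The behaviour described for from_list() and select() can be sketched with plain data structures. The functions build_options, select and render below are hypothetical helpers, not part of HTMLTags ; they only mirror the rules stated above (rank as value, item as content, non-matching options reset to not selected) :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
def build_options(data):
    """Pair each item with its rank, as from_list() is described to do."""
    return [{"value": rank, "content": item, "selected": False}
            for rank, item in enumerate(data)]


def as_list(item):
    # accept a single item or a list of items
    return list(item) if isinstance(item, (list, tuple)) else [item]


def select(options, content=None, value=None):
    """Mark matching options as selected and all others as not selected."""
    wanted_content = as_list(content) if content is not None else []
    wanted_value = as_list(value) if value is not None else []
    for option in options:
        option["selected"] = (option["content"] in wanted_content
                              or option["value"] in wanted_value)


def render(options):
    lines = ["<SELECT>"]
    for option in options:
        flag = " SELECTED" if option["selected"] else ""
        lines.append('<OPTION value="%s"%s>%s'
                     % (option["value"], flag, option["content"]))
    lines.append("</SELECT>")
    return "\n".join(lines)


options = build_options(["foo", "bar"])
select(options, content="bar")
html = render(options)
print(html)
```

Calling select() again with another item clears the previous selection, which matches the rule that the other options are marked as not selected.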

For checkboxes and radiobuttons, HTMLTags provides 2 classes, CHECKBOX and RADIO. Instances of both classes are initialized with a list as the first argument, and attributes of the INPUT tags as other keyword arguments :

radio = RADIO(["foo","bar"],Class="menu")

Iterating on the RADIO instance yields tuples (content,tag)  where content is the item in the original list :

for (content,tag) in radio:
    print content,tag
  ==>

foo <INPUT Type="radio" Class="menu" value="0">
bar <INPUT Type="radio" Class="menu" value="1">

When the instance is created, all the INPUT tags are unchecked. The method check(content=item) or check(value=item) is used to check the INPUT tags with the specified content or value

radio.check(content="foo")
table = TABLE()
for (content,tag) in radio:
    table <= TR(TD(content)+TD(tag))
print table
  ==>

<TABLE>
<TR>
<TD>foo</TD>
<TD><INPUT Type="radio" Class="menu" value="0" CHECKED></TD>
</TR>
<TR>
<TD>bar</TD>
<TD><INPUT Type="radio" Class="menu" value="1"></TD>
</TR>
</TABLE>

As for SELECT, item can be a list of contents or values, in case several checkboxes must be checked
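
To make the iteration and checking behaviour concrete, here is a minimal sketch of a class that yields (content,tag) tuples. It is not the HTMLTags source : the class RadioGroup is a hypothetical name, tags are reduced to strings, only the content form of check() is covered, and the CHECKED spelling of the attribute is an assumption :

```python
# Illustrative sketch only: NOT the HTMLTags source code.
class RadioGroup(object):
    """Hypothetical stand-in for RADIO: one INPUT per item of a list."""

    def __init__(self, items):
        self.items = list(items)
        self.checked = set()  # ranks of the checked inputs

    def check(self, content):
        # accept a single item or a list of items
        wanted = content if isinstance(content, (list, tuple)) else [content]
        self.checked = set(rank for rank, item in enumerate(self.items)
                           if item in wanted)

    def __iter__(self):
        # yield (content, tag) tuples in the order of the original list
        for rank, item in enumerate(self.items):
            flag = " CHECKED" if rank in self.checked else ""
            yield item, '<INPUT Type="radio" value="%d"%s>' % (rank, flag)


radio = RadioGroup(["foo", "bar"])
radio.check(content="foo")
pairs = list(radio)
for content, tag in pairs:
    print(content + " " + tag)
```

Keeping the content separate from the tag is what lets you place them in different table cells, as in the TABLE example above.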

7.5.6 Unicode

Tag content and attribute values can be bytestrings or Unicode strings. When a tag is printed, Unicode strings are encoded to bytestrings. The encoding used can be defined by the function set_encoding(encoding)

If you don't specify an encoding, the system default encoding (sys.getdefaultencoding()) is used

Inside a Karrigell script, the encoding defined by SET_UNICODE_OUT() is also used by HTMLTags, so you don't have to call set_encoding()
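
The effect of the encoding choice can be seen with a short stand-alone sketch. The functions below are hypothetical stand-ins, not the HTMLTags implementation ; they only reproduce the rule stated above, which is to use the chosen encoding if one was set and the system default otherwise :

```python
import sys

# Illustrative sketch only: NOT the HTMLTags source code.
_encoding = None  # hypothetical module-level setting


def set_encoding(encoding):
    """Stand-in for the module function of the same name."""
    global _encoding
    _encoding = encoding


def render(text):
    """Encode Unicode text with the chosen encoding, else the system default."""
    return text.encode(_encoding or sys.getdefaultencoding())


title = u"Caf\xe9"  # Unicode string containing a non-ASCII character

set_encoding("utf-8")
as_utf8 = render(title)

set_encoding("latin-1")
as_latin1 = render(title)

print(repr(as_utf8))
print(repr(as_latin1))
```

The same Unicode string gives two different bytestrings, so the encoding passed to set_encoding() should match the one declared for the page.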