7.5. HTMLTags - generate HTML in Python
7.5.1 Overview
The HTMLTags module defines one class for each valid HTML tag, named in uppercase letters. To create a piece of HTML, the general syntax is :
t = TAG(content, key1=val1,key2=val2,...)
so that print t results in :
<TAG key1="val1" key2="val2" ...>content</TAG>
For instance :
print A('bar', href="foo") ==> <A href="foo">bar</A>
Attributes whose names clash with Python keywords or built-in names (class, type) must be capitalized :
print DIV('bar', Class="title") ==> <DIV Class="title">bar</DIV>
To generate HTML attributes without value, give them the value True :
print OPTION('foo',SELECTED=True,value=5) ==> <OPTION value="5" SELECTED>foo
For non-closing tags such as <IMG> or <BR>, the print statement does not generate the closing tag
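The rendering rules above can be illustrated with a small stand-alone function. This is only a sketch, not the HTMLTags source : the name render, the VOID_TAGS tuple and the alphabetical ordering of attributes are assumptions made for the example, and the real module may order attributes differently.

```python
# Illustrative stand-in, NOT the HTMLTags implementation.
# VOID_TAGS is an assumed subset of the non-closing tags.
VOID_TAGS = ("IMG", "BR", "INPUT", "HR")

def render(tag, content="", **attrs):
    """Return the HTML text for one tag, following the rules described above."""
    parts = [tag]
    for key in sorted(attrs):          # sorted only to make the output predictable
        value = attrs[key]
        if value is True:
            parts.append(key)          # attribute without value
        else:
            parts.append('%s="%s"' % (key, value))
    opening = "<%s>" % " ".join(parts)
    if tag in VOID_TAGS:
        return opening                 # non-closing tag : no </TAG>
    return "%s%s</%s>" % (opening, content, tag)

print(render("A", "bar", href="foo"))
print(render("BR"))
```

Passing the attributes as keyword arguments is what makes the capitalization rule necessary : class=... is a syntax error in Python, whereas Class=... is accepted.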
7.5.2 Tags concatenation
To add a "brother" to a tag (an element at the same level in the tree) use the addition operator :
print B('bar')+INPUT(name="bar") ==> <B>bar</B><INPUT name="bar">
You can also repeat an instance using the multiplication operator :
print TH(' ')*3 ==> <TH> </TH><TH> </TH><TH> </TH>
If you have a list of instances, you can concatenate the items with the function Sum() :
Sum([ (I(i)+':'+B(i*i)+BR()) for i in range(100) ])
generates 100 lines, one for each integer from 0 to 99, showing the integer followed by its square
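To see how the addition and multiplication operators can produce concatenated HTML, here is a minimal sketch built on Python's operator overloading. The class Fragment and the helper concat are hypothetical names used for illustration ; they are not part of HTMLTags and do not claim to reproduce how the module works internally.

```python
class Fragment(object):
    """Hypothetical stand-in holding already rendered HTML text."""

    def __init__(self, html):
        self.html = html

    def __add__(self, other):
        # "brother" : the other fragment (or a plain string) is placed right after
        other_html = other.html if isinstance(other, Fragment) else str(other)
        return Fragment(self.html + other_html)

    def __mul__(self, count):
        # repetition : the same HTML text, count times
        return Fragment(self.html * count)

    def __str__(self):
        return self.html


def concat(fragments):
    """Join a list of fragments, in the spirit of Sum()."""
    result = Fragment("")
    for fragment in fragments:
        result = result + fragment
    return result


print(Fragment("<B>bar</B>") + Fragment('<INPUT name="bar">'))
print(concat([Fragment("<I>%s</I>" % i) + ":" for i in range(3)]))
```

The built-in sum() starts from the integer 0, so it cannot add such objects directly ; this is presumably why a dedicated function is provided.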
7.5.3 Building an HTML document
An HTML document is a tree of elements ; HTMLTags provides a simple way of building this tree
The content argument can be an instance of an HTMLTags class, so that you can nest tags, like this :
print B(I('foo')) ==> <B><I>foo</I></B>
If you think of the document as a tree, this means that the instance I('foo') is a child of the instance of class B
If you have to build a more complex tree, this approach forces you to keep track of the opening and closing brackets, and the code rapidly becomes difficult to read and maintain. It also means that you build the tree "bottom-up" : the inner elements must exist before the elements that contain them
An alternative is to build the tree "top-down" : build the nesting element first, then add the children. HTMLTags uses the operator <= as a synonym of "add child"
You can compare the 2 approaches with this example :
- "bottom-up"
# build lines first
lines = INPUT(name="zone1",value=kw.get("zone1",""))
lines += BR()+INPUT(name="zone2",value=kw.get("zone2",""))
lines += BR()+INPUT(Type="submit",value="Ok")
# build and print form
print FORM(lines,action="validate",method="post")
- "top-down"
# build form first
form = FORM(action="validate",method="post")
# add child elements
form <= INPUT(name="zone1",value=kw.get("zone1",""))
form <= BR()+INPUT(name="zone2",value=kw.get("zone2",""))
form <= BR()+INPUT(Type="submit",value="Ok")
print form
To build a complex document, the top-down approach is probably more readable :
# records is a list of objects with the attributes title and artist
head = HEAD()
head <= LINK(rel="Stylesheet",href="../doc.css")
head <= TITLE('Record collection')
body = BODY()
body <= H1('My record collection')
table = TABLE(Class="content")
table <= TR(TH('Title')+TH('Artist'))
for rec in records:
    table <= TR(TD(rec.title,Class="title")+TD(rec.artist,Class="Artist"))
body <= table
print HTML(head+body)
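The use of <= as "add child" relies on the fact that Python lets a class redefine its comparison operators. The sketch below shows the idea with a hypothetical Node class ; it is an illustration under that assumption, not the code of HTMLTags.

```python
class Node(object):
    """Hypothetical tree node : `<=` appends a child instead of comparing."""

    def __init__(self, tag, text=""):
        self.tag = tag
        self.text = text
        self.children = []

    def __le__(self, child):
        # called for `node <= child` ; the result of the expression is discarded
        self.children.append(child)
        return self

    def __str__(self):
        inner = self.text + "".join(str(child) for child in self.children)
        return "<%s>%s</%s>" % (self.tag, inner, self.tag)


# top-down : create the container first, then add the children
table = Node("TABLE")
row = Node("TR")
row <= Node("TD", "foo")
row <= Node("TD", "bar")
table <= row
print(table)
```

Because the children are stored in a list and rendered only when the node is printed, a child can still be modified after it has been added to its parent.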
7.5.4 Inspecting the document tree
Tags have 2 methods to find the elements that match certain conditions :
- get_by_tag(tag_name) : returns the list of the elements with the specified tag name
- get_by_attr(arg1=val1,arg2=val2...) : returns the list of the elements whose attributes match the specified conditions
For instance, if you have built a table and want to present odd and even rows in different styles, you can use get_by_tag() and change the attribute "Class" of the TD tags this way :
classes = ['row_even','row_odd']
lines = table.get_by_tag('TR')
for i,line in enumerate(lines):
    cells = line.get_by_tag('TD')
    for cell in cells:
        cell.attrs['Class'] = classes[i%2]
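A search of this kind can be pictured as a recursive walk over the tree. The following sketch uses a hypothetical Node class and the functions find_by_tag and find_by_attr ; it assumes that the search covers all descendants and excludes the starting element, which may differ from the actual behaviour of get_by_tag() and get_by_attr().

```python
class Node(object):
    """Hypothetical tree node with a tag name, attributes and children."""

    def __init__(self, tag, children=(), **attrs):
        self.tag = tag
        self.children = list(children)
        self.attrs = attrs


def find_by_tag(node, tag_name):
    """Return all descendants of node with the given tag name, in document order."""
    found = []
    for child in node.children:
        if child.tag == tag_name:
            found.append(child)
        found.extend(find_by_tag(child, tag_name))
    return found


def find_by_attr(node, **conditions):
    """Return all descendants of node whose attributes match every condition."""
    found = []
    for child in node.children:
        if all(child.attrs.get(key) == val for key, val in conditions.items()):
            found.append(child)
        found.extend(find_by_attr(child, **conditions))
    return found


# three rows of two cells, then alternate the class of the cells row by row
table = Node("TABLE", [Node("TR", [Node("TD"), Node("TD")]) for _ in range(3)])
classes = ["row_even", "row_odd"]
for i, row in enumerate(find_by_tag(table, "TR")):
    for cell in find_by_tag(row, "TD"):
        cell.attrs["Class"] = classes[i % 2]
```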
7.5.5 SELECT tags, checkboxes and radiobuttons
When building an HTML document, there is often a set of data (the result of a request to a database for instance) that should be presented to the end-user as a list of options in a SELECT tag, or as a list of radiobuttons or checkboxes. Generally, one or several of the options are selected or checked because they match a certain condition
HTMLTags provides special methods for the SELECT tag to initialize it from the set of data, and to mark one or several options as selected :
- from_list(data) : returns the SELECT tag with OPTION tags taken from the list data. Each OPTION tag has the item value as content and the item rank in the list as value :
s = SELECT().from_list(["foo","bar"]) ==>
<SELECT>
<OPTION value="0">foo
<OPTION value="1">bar
</SELECT>
- select(content=item) or select(value=item) : marks the options with the specified content or value as selected, and the other options as not selected. item can be a list of contents or values, for SELECT tags with the MULTIPLE option set :
s.select(content="bar") ==>
<SELECT>
<OPTION value="0">foo
<OPTION value="1" SELECTED>bar
</SELECT>
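The behaviour described for from_list() and select() can be sketched with plain data structures. The functions options_from_list, select_options and render_select are hypothetical helpers written for this illustration ; they only mimic the description above and are not the HTMLTags methods.

```python
def _as_list(item):
    """Accept a single item, a list of items, or None."""
    if item is None:
        return []
    if isinstance(item, (list, tuple)):
        return list(item)
    return [item]


def options_from_list(data):
    """One option per item : the item is the content, its rank is the value."""
    return [{"value": rank, "content": item, "selected": False}
            for rank, item in enumerate(data)]


def select_options(options, content=None, value=None):
    """Select the matching options and unselect all the others."""
    contents, values = _as_list(content), _as_list(value)
    for option in options:
        option["selected"] = (option["content"] in contents
                              or option["value"] in values)


def render_select(options):
    lines = ["<SELECT>"]
    for option in options:
        flag = " SELECTED" if option["selected"] else ""
        lines.append('<OPTION value="%s"%s>%s'
                     % (option["value"], flag, option["content"]))
    lines.append("</SELECT>")
    return "\n".join(lines)


options = options_from_list(["foo", "bar"])
select_options(options, content="bar")
print(render_select(options))
```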
For checkboxes and radiobuttons, HTMLTags provides 2 classes, CHECKBOX and RADIO. Instances of both classes are initialized with a list as the first argument, and attributes of the INPUT tags as other keyword arguments :
radio = RADIO(["foo","bar"],Class="menu")
Iterating on the RADIO instance yields tuples (content,tag) where content is the item in the original list :
for (content,tag) in radio:
    print content,tag
==>
foo <INPUT Type="radio" Class="menu" value="0">
bar <INPUT Type="radio" Class="menu" value="1">
When the instance is created, all the INPUT tags are unchecked. The method check(content=item) or check(value=item) is used to check the INPUT tags with the specified content or value :
radio.check(content="foo")
table = TABLE()
for (content,tag) in radio:
    table <= TR(TD(content)+TD(tag))
print table
==>
<TABLE>
<TR>
<TD>foo</TD>
<TD><INPUT Type="radio" Class="menu" value="0"></TD>
</TR>
<TR>
<TD>bar</TD>
<TD><INPUT Type="radio" Class="menu" value="1"></TD>
</TR>
</TABLE>
As for SELECT, item can be a list of contents or values, in case several checkboxes must be checked
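The iteration protocol described above, where each step yields a (content,tag) tuple, can be sketched with a generator. The class RadioGroup is a hypothetical stand-in : the way a checked INPUT is rendered (a trailing checked attribute here) and the fact that check() unchecks the other items are assumptions, not documented behaviour of RADIO or CHECKBOX.

```python
def _as_list(item):
    """Accept a single item, a list of items, or None."""
    if item is None:
        return []
    if isinstance(item, (list, tuple)):
        return list(item)
    return [item]


class RadioGroup(object):
    """Hypothetical stand-in for a group of radio buttons."""

    def __init__(self, items, **attrs):
        self.items = list(items)
        self.attrs = attrs
        self.checked = set()          # ranks of the checked items

    def check(self, content=None, value=None):
        contents, values = _as_list(content), _as_list(value)
        # assumption : items that do not match become unchecked
        self.checked = set(rank for rank, item in enumerate(self.items)
                           if item in contents or rank in values)

    def __iter__(self):
        extra = "".join(' %s="%s"' % (key, self.attrs[key])
                        for key in sorted(self.attrs))
        for rank, item in enumerate(self.items):
            flag = " checked" if rank in self.checked else ""
            yield item, '<INPUT Type="radio"%s value="%s"%s>' % (extra, rank, flag)


group = RadioGroup(["foo", "bar"], Class="menu")
group.check(content="foo")
for content, tag in group:
    print("%s %s" % (content, tag))
```

Yielding the content and the tag separately leaves the layout to the caller, who can place the label and the button in different table cells as in the example above.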
7.5.6 Unicode
Tag content and attribute values can be bytestrings or Unicode strings. When a tag is printed, Unicode strings are encoded to bytestrings. The encoding used can be defined by the function set_encoding(encoding). If you don't specify an encoding, the system default encoding (sys.getdefaultencoding()) is used
Inside a Karrigell script, the encoding defined by SET_UNICODE_OUT() is also used by HTMLTags - you don't have to use set_encoding()
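The mechanism of a module-level encoding that falls back to the system default can be sketched as follows. The names set_output_encoding and to_bytes are hypothetical and chosen for this example only ; the sketch shows the general technique, not the code behind set_encoding().

```python
import sys

# module-level setting, initialised with the system default encoding
_encoding = sys.getdefaultencoding()


def set_output_encoding(encoding):
    """Change the encoding used when Unicode strings are written out."""
    global _encoding
    _encoding = encoding


def to_bytes(text):
    """Leave bytestrings untouched, encode Unicode strings."""
    if isinstance(text, bytes):
        return text
    return text.encode(_encoding)


set_output_encoding("utf-8")
print(repr(to_bytes(u"caf\xe9")))
```

With "latin-1" instead of "utf-8" the same string is encoded to a single byte for the accented letter, which is why the encoding must match the charset announced to the browser.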