3.7 White Space Removal in Albatross

If you were paying close attention to the results of expanding the macros we created in section 3.5, you will have noticed that nearly all evidence of the Albatross tags has disappeared. The tags themselves are obviously gone. Less obvious is that the whitespace following the tags has been removed as well.

Let's have a look at the "doc" macro again.

<al-macro name="doc">
<html>
 <head>
  <title>Simple Content Management - <al-usearg name="title"></title>
 </head>
 <body>
  <h1>Simple Content Management - <al-usearg name="title"></h1>
  <hr noshade>
  <al-usearg>
 </body>
</html>
</al-macro>

We can capture the result of expanding the macro by firing up the Python interpreter and exercising the macro manually.

>>> import albatross
>>> text = '''<al-macro name="doc">
... <html>
...  <head>
...   <title>Simple Content Management - <al-usearg name="title"></title>
...  </head>
...  <body>
...   <h1>Simple Content Management - <al-usearg name="title"></h1>
...   <hr noshade>
...   <al-usearg>
...  </body>
... </html>
... </al-macro>
... '''
>>> ctx = albatross.SimpleContext('.')
>>> templ = albatross.Template(ctx, '<magic>', text)
>>> templ.to_html(ctx)
>>> text = '''<al-expand name="doc">
... <al-setarg name="title">hello</al-setarg>
... </al-expand>
... '''
>>> expand = albatross.Template(ctx, '<magic>', text)
>>> ctx.push_content_trap()
>>> expand.to_html(ctx)
>>> result = ctx.pop_content_trap()
>>> print result
<html>
 <head>
  <title>Simple Content Management - hello</title>
 </head>
 <body>
  <h1>Simple Content Management - hello</h1>
  <hr noshade>
  </body>
</html>

Not only have the <al-macro> and <al-expand> tags been removed, the whitespace that follows those tags has been removed as well. By default, Albatross removes all whitespace following an Albatross tag, provided that the whitespace begins with a newline. This behaviour should be familiar to anyone who has used PHP.

Looking further into the result you will note that the </body> tag is aligned with the <hr noshade> tag above it. This is the result of performing the <al-usearg> substitution (which had no content) and removing all whitespace following the <al-usearg> tag.

This whitespace removal nearly always produces the desired result, though it can be a real problem at times.

>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.title = 'Mr.'
>>> ctx.locals.fname = 'Harry'
>>> ctx.locals.lname = 'Tuttle'
>>> templ = albatross.Template(ctx, '<magic>', '''<al-value expr="title">
...  <al-value expr="fname">
...  <al-value expr="lname">
... ''')
>>> ctx.push_content_trap()
>>> templ.to_html(ctx)
>>> ctx.pop_content_trap()
'Mr.HarryTuttle'

Here the whitespace removal has run the three values together, which is definitely not the result we wanted.

You can always get around the problem by joining all of the <al-value> tags together on a single line. Remember that the whitespace removal only kicks in if the whitespace begins with a newline character. For our example this would be a reasonable solution.

>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.title = 'Mr.'
>>> ctx.locals.fname = 'Harry'
>>> ctx.locals.lname = 'Tuttle'
>>> templ = albatross.Template(ctx, '<magic>', '''<al-value expr="title"> <al-value expr="fname"> <al-value expr="lname">''')
>>> ctx.push_content_trap()
>>> templ.to_html(ctx)
>>> ctx.pop_content_trap()
'Mr. Harry Tuttle'

The other way to defeat the whitespace removal while keeping each <al-value> tag on a separate line is to place a single trailing space at the end of each line. This is a very bad idea, because the next person to modify the file might remove the space without realising how important it was.

Note that the first two lines of the template string in the following example end with a trailing space. The fact that you cannot see those spaces should give you a clue about how bad this technique is.

>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.title = 'Mr.'
>>> ctx.locals.fname = 'Harry'
>>> ctx.locals.lname = 'Tuttle'
>>> templ = albatross.Template(ctx, '<magic>', '''<al-value expr="title"> 
... <al-value expr="fname"> 
... <al-value expr="lname">
... ''')
>>> ctx.push_content_trap()
>>> templ.to_html(ctx)
>>> ctx.pop_content_trap()
'Mr. \nHarry \nTuttle'
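The three results so far all follow from one rule: whitespace after a tag is dropped only when it begins with a newline. The sketch below is a toy model of that rule in plain Python, not Albatross code. The regular expression, the expand helper and the values dictionary are illustrative assumptions, and the model only understands bare <al-value expr="..."> tags.

```python
import re

# Toy stand-in for the default rule. This is NOT how Albatross is implemented.
# The optional group matches trailing whitespace only if it starts with a newline.
TAG = re.compile(r'<al-value expr="(\w+)">(\n[ \t\n]*)?')

def expand(template, values):
    """Substitute each tag and drop newline-led whitespace that follows it."""
    return TAG.sub(lambda m: values[m.group(1)], template)

values = {'title': 'Mr.', 'fname': 'Harry', 'lname': 'Tuttle'}

# One tag per line: each tag is followed by a newline, so everything is stripped.
multi_line = ('<al-value expr="title">\n'
              ' <al-value expr="fname">\n'
              ' <al-value expr="lname">\n')

# All tags on one line: the separating spaces do not begin with a newline.
single_line = ('<al-value expr="title"> '
               '<al-value expr="fname"> '
               '<al-value expr="lname">')

# Trailing space before each newline: the whitespace begins with a space.
trailing_space = ('<al-value expr="title"> \n'
                  '<al-value expr="fname"> \n'
                  '<al-value expr="lname">\n')

print(repr(expand(multi_line, values)))
print(repr(expand(single_line, values)))
print(repr(expand(trailing_space, values)))
```

Writing the newlines and trailing spaces as explicit string escapes makes them visible, which is exactly what the trailing space technique fails to do in a template file.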

A much better way to solve the problem is to explicitly tell the Albatross parser that you want it to do something different with the whitespace that follows the first two <al-value> tags.

>>> import albatross
>>> ctx = albatross.SimpleContext('.')
>>> ctx.locals.title = 'Mr.'
>>> ctx.locals.fname = 'Harry'
>>> ctx.locals.lname = 'Tuttle'
>>> templ = albatross.Template(ctx, '<magic>', '''<al-value expr="title" whitespace="indent">
...  <al-value expr="fname" whitespace="indent">
...  <al-value expr="lname">
... ''')
>>> ctx.push_content_trap()
>>> templ.to_html(ctx)
>>> ctx.pop_content_trap()
'Mr. Harry Tuttle'

The above variation has told the Albatross parser to strip only the trailing newline, leaving the indent on the following line intact. The following table describes all of the possible values for the whitespace attribute.

Value       Meaning
'all'       Keep all following whitespace.
'strip'     Remove all whitespace - this is the default.
'indent'    Keep indent on following line.
'newline'   Remove all whitespace and substitute a newline.

Note that when the trailing whitespace does not begin with a newline the 'strip' and 'indent' whitespace directives are treated exactly like 'all'.
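
The table and the note above can be summarised as a small function. This is a sketch of the rules as described in this section, not the Albatross implementation. The function name surviving_whitespace is hypothetical, and two details are assumptions: 'indent' is taken to keep whatever follows the last newline, and 'newline' is taken to substitute a newline whatever the trailing whitespace looks like.

```python
def surviving_whitespace(ws, mode='strip'):
    """Return what is left of ws, the whitespace that follows a tag.

    A model of the documented rules only; not taken from Albatross itself.
    """
    if mode == 'all':
        return ws
    if mode == 'newline':
        # Assumption: the text does not say what happens when ws is empty
        # or does not begin with a newline; this sketch always substitutes.
        return '\n'
    if mode not in ('strip', 'indent'):
        raise ValueError('unknown whitespace mode: %r' % (mode,))
    if not ws.startswith('\n'):
        # 'strip' and 'indent' behave like 'all' in this case.
        return ws
    if mode == 'strip':
        return ''
    # 'indent'. Assumption: keep whatever follows the last newline.
    return ws[ws.rfind('\n') + 1:]

# Rebuild the last example by hand: two 'indent' tags, then a default tag.
name = ('Mr.' + surviving_whitespace('\n ', 'indent') +
        'Harry' + surviving_whitespace('\n ', 'indent') +
        'Tuttle' + surviving_whitespace('\n'))
print(repr(name))
```

Under those assumptions the hand-built string matches the 'Mr. Harry Tuttle' result shown earlier for the whitespace="indent" template.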