Formatting a Document
(What follows is a rather long and complicated discussion of how to apply PIKT to formatting and publishing web documents. You may wish to visit the doc_macros.cfg Samples page first for a brief peek at the art of PIKT document formatting before plunging into the extended discussion here.)
In addition to the usual monitoring and system configuration functions, you can employ PIKT to format and install documents--web page publishing, for example. The principal PIKT Maintainer (Robert Osterlund) uses PIKT to manage over 500 web pages, at last count, across several different websites, including of course pikt.org.
For this example, we will focus on another of his websites, Early MusiChicago. Early MusiChicago--hereinafter referred to as "EMC"--is a web portal to the Early Music scene in the Chicago area and beyond. Like pikt.org, earlymusichicago.org is PIKT-managed.
Nearly all EMC web pages follow a common format, a nested table structure like the following:
<table>
<tr>
<td>
[EMC logo]
</td>
<td>
[page title]
</td>
<td>
[EMC mascot]
</td>
</tr>
<tr>
<td>
[left sidebar]
</td>
<td>
<table>
<tr>
<td>
[main page content]
</td>
</tr>
<tr>
<td>
[footer]
</td>
</tr>
</table>
</td>
<td>
[right sidebar]
</td>
</tr>
</table>
The contents of the topmost row of the outer table--the site logo, title, and site mascot--are more or less the same across all pages. Likewise, the left and right sidebars and the footer are more or less identical from page to page. It is the part marked "main page content", the core of each page, where most of the page differences lie.
Since PIKT treats a web page as a "file", all web pages are sourced from the piktmaster's files.cfg. Here is the relevant line from files.cfg:
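The layout idea can be sketched in a few lines of Python. This is only an illustration of "shared frame, varying core" and not PIKT code: the function name assemble_page, the placeholder strings, and the second page ("String Instruments") are assumptions made up for the sketch.

```python
def assemble_page(title: str, main_content: str) -> str:
    """Wrap page-specific content in the shared outer/inner table layout."""
    # Shared components: fixed placeholders in this sketch.
    logo, mascot = "[EMC logo]", "[EMC mascot]"
    left, right, footer = "[left sidebar]", "[right sidebar]", "[footer]"
    # Inner (nested) table: main content above the footer.
    inner = (
        "<table>"
        f"<tr><td>{main_content}</td></tr>"
        f"<tr><td>{footer}</td></tr>"
        "</table>"
    )
    # Outer table: header row, then sidebars around the inner table.
    return (
        "<table>"
        f"<tr><td>{logo}</td><td>{title}</td><td>{mascot}</td></tr>"
        f"<tr><td>{left}</td><td>{inner}</td><td>{right}</td></tr>"
        "</table>"
    )


winds = assemble_page("Wind Instruments", "[winds content]")
strings = assemble_page("String Instruments", "[strings content]")
```

Only the two arguments differ between the pages; everything else comes from one place, which is the property the rest of this discussion relies on.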
#include <files/emc/doc_emc_files.cfg>
For this example, we will further focus on a single web page, the EMC Wind Instruments page. Here is the relevant line from the doc_emc_files.cfg file, itself another config file #include:
#include <files/emc/doc_instruments_files.cfg>
For the Wind Instruments page, here are the relevant portions of the doc_instruments_files.cfg file:
///////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////
=emcwebdir/instruments_winds.htm mode 644 uid 0 gid 0
///////////////////////////////////////////////////////////////////////////
#set cat = "Wind Instruments"
#set title = "Early MusiChicago Wind Instruments"
#set descr = "Early music wind instruments listed at Early MusiChicago."
#set keywords = "music\, early\, instruments\, businesses\, wind\,
recorder\, bagpipe\, cornamuse\, flute\, pipe\, tabor\,
tin\, whistle\, crumhorn\, shawm\, cornett\, serpent\,
chalumeau\, bassanello\, schreierpfeife\, panpipe\, penny"
///////////////////////////////////////////////////////////////////////////
#def longpage
#indent
#include <files/emc/doc_page_top_files.cfg>
<td width="100%">
#include <files/emc/doc_instruments_winds_files.cfg>
</td>
#set apollosaxes_aff
#set inst = "winds"
#set amazon_winds
#set keyword = "wind|bagpipe|clarinet|crumhorn|krumhorn|bassoon|oboe|pipe|
flute|recorder"
#include <files/emc/doc_page_bottom_files.cfg>
#unset apollosaxes_aff
#unset inst
#unset amazon_winds
#unset keyword
#unindent
#undef longpage
///////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////
The first line of the file stanza specifies the file name, permissions, and ownerships:
=emcwebdir/instruments_winds.htm mode 644 uid 0 gid 0
On the piktmaster system, we could "publish" (install) this file to our primary and test/backup web servers using the command:
# piktc -iv +E emc pages=instruments +D emc +F instruments_winds.htm
+H emcwebsys emctestsys
The '+E emc pages=instruments' instructs the piktc preprocessor to look only at the emc instruments #include files of files.cfg and ignore all other #include files (for example, pikt #include files, or system configuration #include files)--this to save preprocessing time.
The '+D emc' tells the piktc preprocessor that this is an emc page (as opposed to a pikt page or some other page type), therefore to customize certain conditionally defined macros (using #ifdef emc ... #endifdef) for the EMC setting.
If we have edited the core page content or any of the standard page components, before publishing the page we could diff the current page on the primary webserver against the changed version on the piktmaster:
# piktc -fv +E emc pages=instruments +D emc +F instruments_winds.htm
+H emcwebsys
Here is a command, for all EMC Instruments pages, to compare the MD5 checksum of existing pages on the webserver with the checksum of their counterpart versions in the piktmaster's central configuration:
# piktc -m5v +E emc pages=instruments +D emc +F =emcdocs_instruments
+H emcwebsys
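Conceptually, a checksum comparison of this kind boils down to hashing both versions of each page and reporting the names that differ. The Python sketch below illustrates that idea only; it is not how piktc is implemented, and the function names, the in-memory dictionaries, and the second file name are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def changed_pages(central: dict, deployed: dict) -> list:
    """Names whose deployed copy is missing or differs from the central copy."""
    return sorted(
        name
        for name, data in central.items()
        if name not in deployed or md5_hex(deployed[name]) != md5_hex(data)
    )


# Hypothetical page contents, keyed by file name.
central = {
    "instruments_winds.htm": b"<html>new</html>",
    "instruments_strings.htm": b"<html>same</html>",
}
deployed = {
    "instruments_winds.htm": b"<html>old</html>",
    "instruments_strings.htm": b"<html>same</html>",
}

print(changed_pages(central, deployed))  # ['instruments_winds.htm']
```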
Returning to the configuration files, we #set some environment variables for later access:
#set cat = "Wind Instruments"
#set title = "Early MusiChicago Wind Instruments"
...
The first #set directive (cat) specifies the title actually displayed on the web page (the one you see). The second #set directive (title) is for the HTML coding (more on this '#set title' in a moment).
We commence actual web page content with the #include line
#include <files/emc/doc_page_top_files.cfg>
Here are the contents of the doc_page_top_files.cfg file:
///////////////////////////////////////////////////////////////////////////
//
// doc_page_top_files.cfg
//
///////////////////////////////////////////////////////////////////////////
=doctype
<html>
<head>
#echo "<title>$title</title>"
=meta_content_language
=meta_content_type
#echo "=meta_description($descr)"
#echo "=meta_keywords($keywords)"
=meta_author(=emc_author)
=meta_copyright(=emc_copyright)
</head>
=body_emc
=pagetop
=pagetable
<tr>
#include <files/emc/doc_header_files.cfg>
</tr>
<tr>
<td width="0%" valign="top">
#include <files/emc/doc_sidebar_left_files.cfg>
</td>
<td width="100%" valign="top">
=coretable
<tr>
///////////////////////////////////////////////////////////////////////////
Virtually all EMC web page specifications begin with a doc_page_top_files.cfg #include. If we need to change something at the top of our pages--for example, add a new meta tag--we don't have to manually edit hundreds of different web pages on each of our webservers. Instead, we only have to edit this one config file on the piktmaster, doc_page_top_files.cfg, then reinstall all pages to both servers with:
# piktc -iv +E emc pages=all +D emc +F =emcdocs +H emcwebsys emctestsys
=body_emc is a macro defined in doc_emc_macros.cfg as:
body_emc <body background="http:\/\/earlymusichicago.org/images/BC4.jpg">
If we want to change the background image on every EMC web page, we would only have to change the =body_emc macro specification in doc_emc_macros.cfg, then reinstall all EMC pages everywhere.
What about this line in the header section?
#echo "<title>$title</title>"
Remember that earlier we had set the $title environment variable:
#set title = "Early MusiChicago Wind Instruments"
The #echo directive has PIKT evaluate $title, then echo back the string
<title>Early MusiChicago Wind Instruments</title>
as if we had hard-coded that line into the page specification. Using the #set-#echo tandem, we have effectively passed an argument, $title, from the parent doc_instruments_files.cfg file to one of its child #include files, doc_page_top_files.cfg, as if that #include file were a kind of subroutine. We sidestep having to hard-code the title and write the <head></head> block directly into each of the EMC web pages.
Each EMC web page is thus an assemblage of HTML code modules (#include files or macros). We only have to write them in one place, then #include them into each web page specification, using the #set-#echo trick and other tricks (see below) to tweak them as necessary.
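To make the "#include file as subroutine" idea concrete, here is a toy Python model of the #set, #include, and #echo interplay. It is a sketch under simplifying assumptions, not the PIKT preprocessor: the file names parent.cfg and top.cfg are stand-ins, files live in a dictionary, and only these three directives are recognized.

```python
import re

# Hypothetical in-memory "config files".
FILES = {
    "parent.cfg": [
        '#set title = "Early MusiChicago Wind Instruments"',
        "#include <top.cfg>",
    ],
    "top.cfg": [
        "<head>",
        '#echo "<title>$title</title>"',
        "</head>",
    ],
}


def render(name, env=None):
    """Expand one file; env carries #set variables into nested #includes."""
    env = {} if env is None else env
    out = []
    for line in FILES[name]:
        if m := re.fullmatch(r'#set (\w+) = "(.*)"', line):
            env[m.group(1)] = m.group(2)  # remember the variable
        elif m := re.fullmatch(r"#include <(.+)>", line):
            out.extend(render(m.group(1), env))  # child sees parent's env
        elif m := re.fullmatch(r'#echo "(.*)"', line):
            # Substitute $name references, then emit the resulting text.
            out.append(re.sub(r"\$(\w+)", lambda v: env[v.group(1)], m.group(1)))
        else:
            out.append(line)  # plain HTML passes through unchanged
    return out


print("\n".join(render("parent.cfg")))
```

The child file never mentions a specific title; it receives whatever the parent #set before the #include, which is the sense in which $title acts like an argument.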
Since $descr was earlier #set to "Early music wind instruments listed at Early MusiChicago.", and given the definition of the =meta_description() macro in doc_macros.cfg
meta_description(T) <meta name="description" content="(T)">
the line
#echo "=meta_description($descr)"
effectively becomes
<meta name="description" content="Early music wind instruments listed at
Early MusiChicago.">
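The substitution just described can be modeled in Python as follows. This is a rough sketch, not PIKT's actual expansion logic. The meta_keywords definition is assumed to parallel =meta_description(), and the handling of '\,' reflects a guess that commas are escaped in the keywords #set so that they are not taken as macro argument separators.

```python
import re

# Macro name -> (parameter names, body). meta_keywords is an assumed definition.
MACROS = {
    "meta_description": (["T"], '<meta name="description" content="(T)">'),
    "meta_keywords": (["T"], '<meta name="keywords" content="(T)">'),
}


def expand(text: str) -> str:
    """Expand =name(args) calls whose arguments contain no parentheses."""

    def repl(m):
        if m.group(1) not in MACROS:
            return m.group(0)  # unknown macro: leave as written
        params, body = MACROS[m.group(1)]
        # Split on unescaped commas only, then turn "\," into a plain comma.
        args = [
            a.strip().replace("\\,", ",")
            for a in re.split(r"(?<!\\),", m.group(2))
        ]
        for param, arg in zip(params, args):
            body = body.replace(f"({param})", arg)
        return body

    return re.sub(r"=(\w+)\(([^()]*)\)", repl, text)


descr = "Early music wind instruments listed at Early MusiChicago."
print(expand(f"=meta_description({descr})"))
print(expand("=meta_keywords(music\\, early\\, wind)"))
```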
A bit farther down in doc_page_top_files.cfg, we see the line
#include <files/emc/doc_header_files.cfg>
Here is that #include file:
<td width="0%">
=nbsp
<br>
=emclogo
=nbsp
<br>
</td>
<td width="100%" align="center">
=nbsp
<br>
=emctitle
<p>
#echo " =topcat($cat)"
=nbsp
<p>
</td>
<td width="0%">
=nbsp
<br>
=emcmascot
=nbsp
<br>
</td>
The =emcmascot macro is defined as:
emcmascot <img border="0" src="images/Musician246.jpg" width="120"
height="135" alt="Renaissance Lady Musician">
=topcat() is defined as:
topcat(T) <font face="Chaucer" size="6" color="#800000"
style="font-weight: 700"><b>(T)</b></font>
Moving into the next outer table row, we #include the file doc_sidebar_left_files.cfg:
///////////////////////////////////////////////////////////////////////////
//
// doc_sidebar_left_files.cfg
//
///////////////////////////////////////////////////////////////////////////
=nbsp
<br>
<b><font face="Chaucer" color="#800000">
=lnk(Home, index.htm)<br>
=lnk(Calendar, calendar.htm)<br>
....
=lnk(AboutUs, about_us.htm)<br>
=lnk(ContactUs, faq.htm#contact)<br>
</font></b>
<br>
<hr>
///////////////////////////////////////////////////////////////////////////
#ifndef veryshortpage
<br>
<center>
=lnk(=emctipjarmedium, support.htm" target="_blank)
</center>
<br>
<hr>
#endifdef // veryshortpage
///////////////////////////////////////////////////////////////////////////
#ifndef veryshortpage
# ifndef shortpage
<br>
<center>
#verbatim <files/emc/doc_featured_artist.cfg>
[/pikt/lib/configs/files/emc/doc_featured_artist.pl]
</center>
<br>
<hr>
# endifdef
#endifdef
///////////////////////////////////////////////////////////////////////////
=lnk() is defined in doc_macros.cfg as:
lnk(T, L) <a href="(L)">(T)</a>
So the "Home" line above becomes, after macro expansion:
<a href="index.htm">Home</a><br>
We use the =lnk() macro numerous times throughout the EMC web page specifications. As with many of the document formatting macros, the intent is to hide the "ugliness" of HTML code as much as possible, not to mention reduce the number of keystrokes we have to type.
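One detail in the sidebar file deserves a second look: the Tip Jar line passes 'support.htm" target="_blank' as the link argument. Because the macro body wraps (L) in its own pair of double quotes, the quote inside the argument closes the href early and slips in an extra attribute. The Python sketch below mimics the textual substitution to show the effect; "[tip jar image]" is a placeholder, since the definition of =emctipjarmedium is not shown here.

```python
# Body of the lnk(T, L) macro as defined in doc_macros.cfg.
LNK = '<a href="(L)">(T)</a>'


def lnk(text: str, link: str) -> str:
    """Plain textual substitution of the two macro parameters."""
    return LNK.replace("(L)", link).replace("(T)", text)


print(lnk("Home", "index.htm"))
# <a href="index.htm">Home</a>
print(lnk("[tip jar image]", 'support.htm" target="_blank'))
# <a href="support.htm" target="_blank">[tip jar image]</a>
```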
We always show the site navigation links. Whether we also display the EMC "Tip Jar" and the photo of a Featured Artist(s) depends on whether we have #def(ined) veryshortpage or shortpage. On a "very short page," we show the navigation links only; on a "short page," we add the Tip Jar but omit the Featured Artist(s) photo.
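The effect of the nested #ifndef blocks in the left sidebar can be checked with a small Python model. This is a simplified sketch, not the PIKT preprocessor: it understands only #ifdef, #ifndef, and #endifdef, and the bracketed strings stand in for the real sidebar content.

```python
import re

DIRECTIVE = re.compile(r"#\s*(ifdef|ifndef|endifdef)\b\s*(\w+)?")

SIDEBAR = [
    "[navigation links]",
    "#ifndef veryshortpage",
    "[tip jar]",
    "#endifdef // veryshortpage",
    "#ifndef veryshortpage",
    "#  ifndef shortpage",
    "[featured artist]",
    "#  endifdef",
    "#endifdef",
]


def preprocess(lines, defines):
    """Keep a line only if every enclosing condition is true."""
    out, stack = [], []  # stack: one boolean per open #ifdef/#ifndef
    for line in lines:
        m = DIRECTIVE.match(line.strip())
        if m and m.group(1) == "endifdef":
            stack.pop()
        elif m:
            is_defined = m.group(2) in defines
            stack.append(is_defined if m.group(1) == "ifdef" else not is_defined)
        elif all(stack):
            out.append(line)
    return out


print(preprocess(SIDEBAR, {"longpage"}))
# ['[navigation links]', '[tip jar]', '[featured artist]']
print(preprocess(SIDEBAR, {"shortpage"}))
# ['[navigation links]', '[tip jar]']
print(preprocess(SIDEBAR, {"veryshortpage"}))
# ['[navigation links]']
```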
Moving into the inner, nested table, we encounter the main page content with the line
#include <files/emc/doc_instruments_winds_files.cfg>
Here is a portion of that file:
<li>
=emnm(Crumhorn)
<img border="0" src="images/Crumhorn.jpg" align="right" width="60"
height="89" alt="Crumhorn">
<br>
=quarterrest =lnk(Crumhorn Home Page,
http:\/\/members.iinet.net.au/~nickl/crumhorn.html)
<br>
=quarterrest =lnk(The Crumhorn,
http:\/\/www.s-hamilton.k12.ia.us/antiqua/crumhorn.htm)
...
#ifdef amazon
<br>
=compactdisk =amazon_search_link_emc(Crumhorn CDs, crumhorn|krumhorn,
classical-music, =amazon_rank_emc)
<br>
=book =amazon_search_link_emc(Crumhorn Books, crumhorn|krumhorn,
books, =amazon_rank_emc)
#endifdef
<br>
<br>
</li>
The =quarterrest macro is defined in doc_emc_macros.cfg as
quarterrest <img border="0" src="images/QuarterRest1.jpg"
width="8" height="14">
Here, too, if for some reason we need to change the hundreds of instances of this image in the EMC web pages, we only need to change this single line in the doc_emc_macros.cfg file, then reinstall all pages everywhere.
We have wrapped the Amazon.com links with '#ifdef amazon ... #endifdef' directives. If for some reason we decide to deactivate or retire those links, we only need to set the amazon define to FALSE, which would effectively exclude all amazon-related lines.
Consider the macro invocation
... =amazon_search_link_emc(Crumhorn CDs, crumhorn|krumhorn,
classical-music, =amazon_rank_emc)
Here is the definition of =amazon_search_link_emc() from the doc_emc_macros.cfg file:
amazon_search_link_emc(T, K, M, R)
<a href="http:\/\/www.amazon.com/exec/obidos/external-search?
tag\==emcid&keyword\=(K)&mode\=(M)&rank\=(R)"
target="_blank">=fn((T))</a>
In its final form on the webserver, here is the amazon Crumhorn CDs line in all of its HTML ugliness:
<img border="0" src="images/CompactDisk1.jpg" width="9" height="14">
<a href="http://www.amazon.com/exec/obidos/external-search?
tag=emc-10&keyword=crumhorn|krumhorn&mode=classical-music&
rank=reviewrank" target="_blank"><font color="#000080">
Crumhorn CDs</font></a>
Contrast that with the compactness and readability of the source version above.
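How the compact source line turns into the long HTML line can be traced with a Python sketch of nested macro expansion. Several things here are assumptions: the values of =emcid and =amazon_rank_emc and the body of =fn() are inferred from the final HTML shown above rather than taken from doc_emc_macros.cfg, the macro body is joined onto one logical line, and the expansion order is simply one that happens to reproduce the result. It is not a description of PIKT's internals.

```python
import re

# Simple macros (values inferred from the final HTML).
SIMPLE = {"emcid": "emc-10", "amazon_rank_emc": "reviewrank"}

# Parameterized macros; the fn body is inferred, the other is from the text.
PARAM = {
    "fn": (["T"], '<font color="#000080">(T)</font>'),
    "amazon_search_link_emc": (
        ["T", "K", "M", "R"],
        r'<a href="http:\/\/www.amazon.com/exec/obidos/external-search?'
        r'tag\==emcid&keyword\=(K)&mode\=(M)&rank\=(R)" '
        r'target="_blank">=fn((T))</a>',
    ),
}


def expand(text: str) -> str:
    def simple(m):
        return SIMPLE.get(m.group(1), m.group(0))

    def param(m):
        if m.group(1) not in PARAM:
            return m.group(0)
        names, body = PARAM[m.group(1)]
        args = [a.strip() for a in m.group(2).split(",")]
        for name, arg in zip(names, args):
            body = body.replace(f"({name})", arg)
        return body

    while True:  # repeat until nothing is left to expand
        new = re.sub(r"(?<!\\)=(\w+)", simple, text)
        new = re.sub(r"(?<!\\)=(\w+)\(([^()]*)\)", param, new)
        if new == text:
            break
        text = new
    # Finally drop the backslashes that protected '=' and '/'.
    return re.sub(r"\\(.)", r"\1", text)


link = expand(
    "=amazon_search_link_emc(Crumhorn CDs, crumhorn|krumhorn, "
    "classical-music, =amazon_rank_emc)"
)
print(link)
```

The backslash before '=' keeps a literal equals sign from being read as the start of a macro reference, and the backslashes in 'http:\/\/' keep the double slash from being read as a comment; both are removed only at the very end.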
Moving along in the webpage definition, we #include the doc_page_bottom_files.cfg file:
///////////////////////////////////////////////////////////////////////////
//
// doc_page_bottom_files.cfg
//
///////////////////////////////////////////////////////////////////////////
</tr>
<tr>
<td width="100%">
#include <files/emc/doc_footer_files.cfg>
</td>
</tr>
</table>
</td>
<td width="0%" valign="top">
#include <files/emc/doc_sidebar_right_files.cfg>
</td>
</tr>
</table>
</body>
</html>
///////////////////////////////////////////////////////////////////////////
We won't delve deeply into the doc_footer_files.cfg and doc_sidebar_right_files.cfg files in the interest of brevity. We do want to point out a few more interesting tricks, however.
Recall that in the doc_instruments_files.cfg file, we #set (#unset) a bunch of environment variables before entering (after leaving) the doc_page_bottom_files.cfg file:
#def longpage
#indent
#include <files/emc/doc_page_top_files.cfg>
<td width="100%">
#include <files/emc/doc_instruments_winds_files.cfg>
</td>
#set apollosaxes_aff
#set inst = "winds"
#set amazon_winds
#set keyword = "wind|bagpipe|clarinet|crumhorn|krumhorn|bassoon|oboe|pipe|
flute|recorder"
#include <files/emc/doc_page_bottom_files.cfg>
#unset apollosaxes_aff
#unset inst
#unset amazon_winds
#unset keyword
#unindent
#undef longpage
In the right sidebar of every EMC page you will find Google ads; this never varies. After that, it depends on the environment variables we have #set and the #define's we have specified.
On some pages, you might find ads for CDs and/or books, with their subject matter matching a page's content. For example, on the Baroque Era page, you might find CDs of music written by Baroque composers and books about Baroque music.
On other pages, you might find ads for instruments, indeed particular types of instruments. For example, on the Recorder Businesses page, you might find recorder ads. (Note: The central content of the Recorder Businesses page--the list of recorder makers and sellers from around the world--is itself auto-generated by a combination of PIKT preprocessing tricks and Perl programs too complicated to describe here. By these same techniques, we auto-generate every one of the other instrument Businesses pages at EMC as well.)
On the Internet Radio Resources page you will probably find an advertisement for on-line radio services.
And so on. Depending on whether longpage, shortpage, or veryshortpage has been #def(ined), you might find two ads (in addition to the Google ads, which are always there), one ad only, or no other ads.
How do we achieve all of this context-sensitive ad customization? Without going into too much detail (because the ad setup is constantly evolving and we are tweaking ad placements constantly), here is a suggestive portion of the doc_sidebar_right_files.cfg file (subject to change at any time):
#ifdef amazon_two
# ifndef veryshortpage
# ifndef shortpage
<br>
<center>
#verbatim <files/emc/doc_amazon_asinlink_emc.cfg>
[/pikt/lib/configs/files/emc/doc_amazon_asinlink_emc.pl -books]
</center>
<br>
<hr>
# endifdef // shortpage
# endifdef // veryshortpage
#endifdef // amazon_two
In the doc_amazon_asinlink_emc.pl Perl script, we check the status of various program arguments and/or environment variables, for example
if ($ENV{amazon_books} || ($ARGV[0] eq "-books")) {
and generate the appropriate HTML code for the desired advertisement or effect, for example (shown with line wrapping not present in the actual Perl code):
sub amazon_recorder {
print <<EOFRECORDER;
<table border="1" bordercolor="#000000" bgcolor="#ffffff"
style="border-collapse: collapse" cellpadding="5" width="120">
<tr><td><center>
<a href="http://www.amazon.com/exec/obidos/external-search?
search-type=ss&tag=earlymusichicago-50&keyword=recorder
&amp;mode=mi-index&platform=gurupa" target="_blank">
<img src="/images/amazon/recorder.jpg" border="0"
alt="Recorders & accessories at Amazon.com"></a>
<br>
<a href="http://www.amazon.com/exec/obidos/external-search?
search-type=ss&tag=earlymusichicago-50&keyword=recorder
&amp;mode=mi-index&platform=gurupa" target="_blank">
<font color="#000080" size="-1" face="Arial, sans-serif">
Recorders & accessories at Amazon.com</font></a>
</center></td></tr></table>
EOFRECORDER
}
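For readers who do not use Perl, the argument-or-environment check shown before the subroutine can be expressed in Python roughly as follows. This is an analogy rather than a translation of the actual script: the function name wants_books is made up, and Perl's truth rules differ slightly (in Perl the string "0" is false, while this sketch treats any non-empty value as true).

```python
import os
import sys


def wants_books(argv, environ) -> bool:
    """True if amazon_books is set (non-empty) or the first argument is -books."""
    # Perl's $ARGV[0] is the first argument, i.e. argv[1] in Python.
    return bool(environ.get("amazon_books")) or argv[1:2] == ["-books"]


if __name__ == "__main__":
    print(wants_books(sys.argv, os.environ))
```

Either route selects the same ad: a variable #set in the page specification, or an explicit argument on the script's command line.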
Again, we won't go into greater detail about how all of this works, because it is all subject to frequent change. You might visit the Samples page, where we hope eventually to describe this all in full detail.
At the end of this lengthy narrative of how we handle our web page publishing, we do have to admit two things: this is advanced, complicated PIKT stuff; and other website content management systems exist as alternatives. The other content management systems have their own special capabilities and merits, but remember this: the system described here is the same system you might use to monitor and configure your machines, organize your system security, schedule processes, enhance your Unix/Linux command line, and so on and so forth.
Repeating a point made in the Managing System Security section of the Introduction, "[PIKT does have the] great virtue of involving just one system and command language to learn and use." If you appreciate that virtue and come to master PIKT in all of its power and complexity, you might be surprised at what you can accomplish.