I have been looking for a good open source HTML preprocessor for a long time. I found many different solutions, but none of them suited me. I wanted something simple and easy to learn, yet powerful and flexible, a tool I could fully control. In short: something conforming to the Unix spirit. When I found WPP I was convinced that it suits my needs perfectly. It is well designed and extensible. It has some built-in features which are important in web development, but nearly everything one could imagine can be achieved with its powerful Perl evals (Perl code embedded in the web page). Moreover, WPP has a nice home page (http://wpp.sf.net) and a very informative manual.
This article is not a WPP tutorial. If you want to learn the WPP basics, read the fine manual on the home page. This text is just a collection of advice on how to integrate make and WPP to make your life easier. It describes the process of creating a makefile from scratch, but even if make syntax is already familiar to you, you may still find some interesting tips in it.
Introduction
Parsing a web site with WPP is very similar to compiling source code. We have raw files ('source code') and html files ('object files') created by parsing the raws with wpp. Following this metaphor, it is natural to use the famous GNU make tool together with WPP, just as it is used to simplify the compilation of nearly every program. I will point out just a few advantages:
- Dependency resolution: only modified files and the files depending on them are rebuilt. This is extremely useful, especially when the site consists of many pages.
- Makefiles are very flexible and have a powerful syntax. One has to spend some time creating a makefile, but afterwards things become extremely simple: one just has to type make.
- Make is a standard tool, widely known and used. Why reinvent the wheel? Everything you learn may pay off in the future.
Makefiles can be very complicated. We will only talk about some basic features, but if you want to learn more, you should read the make info page, which is also available online: http://www.gnu.org/manual/make-3.80/
Basic rules
WPP is a great tool, but as the site you parse with it gets bigger, it also gets harder to maintain. You have to remember to rebuild every file by hand with wpp after every change, and if you forget, your site could end up out of date or even broken. Maybe it is not a big problem for a personal home page, but with important (e.g. company) sites people can't afford such mistakes. Moreover, I am lazy, and I want site maintenance to be as easy as possible. Fortunately, as you will learn later in this article, with help from make this task can be fully automated.
First of all you have to create a file called Makefile (or makefile) in your project's main directory. When you call make, the information in this file is used to recreate all destination files (htmls) whose corresponding source (raw) has been modified since the last parsing (to be precise: destination files that do not exist yet or whose modification date is earlier than that of the corresponding source file).
Every makefile consists of a set of rules. Every rule looks like this:
target : dependencies ...
	commands
	...
Remember that every line of 'commands' must begin with a tab character, NOT eight spaces! Using spaces is a common mistake made by novices, and it causes make to fail with a "missing separator" error message.
The target is a destination file name or just an alias. Dependencies are the source files for the current target. When one of these files is modified, the whole target needs to be rebuilt using 'commands'. Moreover, if one of the dependencies has its own rule in the makefile, that rule is entered first. Remember that calling make bogus starts with the target bogus, while make without arguments starts with the first (default) target.
Makefiles also support variables. They are defined using the VARIABLE = value syntax (without quotes, because make would treat them as part of the value). After that, every occurrence of $(VARIABLE) in the makefile is replaced with value.
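A minimal sketch of variable definition and expansion. The variable and target names are made up for illustration and do not appear in WPP or in the makefiles of this article. It also shows what happens to quotes in a definition: make keeps them as part of the value.

```make
# Hypothetical names, for illustration only.
PLAIN  = hello
QUOTED = "hello"

# 'make show' prints hello, then "hello" (quotes included)
show :
	@echo $(PLAIN)
	@echo '$(QUOTED)'
```

The second echo wraps the expansion in single quotes so that the shell does not remove the double quotes that make passed through.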
A simple makefile for WPP (in its standard configuration) located in a directory where raw files are located might look like:
# variable definition
WPP=/usr/bin/wpp
# first (default) rule without commands
all : ../about.html ../index.html
# rule to create about.html
../about.html : about.raw config
	$(WPP) about.raw
# rule to create index.html
../index.html : index.raw config templates/logo.tmpl
	$(WPP) index.raw
I bet you've already guessed that everything following '#' is a comment, and is silently ignored by make. We could try to translate this makefile into human language:
- Start with target 'all' (the first target).
- [Target 'all'] A rule for '../about.html' exists, so enter it:
  - [Target '../about.html'] If file about.raw or config was modified, we have to update file ../about.html using the command '/usr/bin/wpp about.raw'.
- [Target 'all'] A rule for '../index.html' exists, so enter it:
  - [Target '../index.html'] If file index.raw, config or templates/logo.tmpl was modified, we have to update file ../index.html using the command '/usr/bin/wpp index.raw'.
- [Target 'all'] Both dependencies are now up to date and 'all' has no commands of its own, so there is nothing more to do.
Pattern rules
Now, in order to update the site after modifying one or more raw files, we just have to type make. But there is a drawback: our makefile requires manual modification, because every time you add a new source file you also have to add a corresponding rule. This gets more and more inconvenient as your site grows. Fortunately you can overcome this using the pattern rules of make:
all: ../index.html ../about.html
# pattern rule to create EVERY html from raw
../%.html : %.raw
	@wpp -x $<
The default target 'all' just states that our site consists of (depends upon) two files. The second rule is used to recreate every file whose name begins with '../' and ends with '.html' (so both ../index.html and ../about.html match). It also states that ../file.html depends upon the file.raw source file, and when file.raw is modified, the destination file will be rebuilt with the 'wpp -x file.raw' command ('$<' is a built-in automatic variable which expands to the name of the first dependency). The '@' prefix before the command just tells make not to print the command on stdout before executing it.
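The automatic variables can be observed without WPP installed. The sketch below assumes a stand-in command (cp) in place of wpp and writes the html next to the raw file. Both are simplifications for illustration, not part of the setup described in this article.

```make
# Stand-in pattern rule: cp replaces wpp so the sketch runs anywhere.
# $@ is the target, $< the first dependency, $* the part matched by '%'.
%.html : %.raw
	@echo "target: $@  first dependency: $<  stem: $*"
	@cp $< $@
```

With a file index.raw in the current directory, running 'make index.html' reports index.html, index.raw and index for the three variables and then copies the file.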
Functions
Now, after adding a new file, you just have to update the dependencies of the 'all' rule; you don't have to create any new rules. But we are lazy, and we want something more: we don't want to have to remember to update anything. Of course this is possible, but it requires some additional modifications to our makefile:
DSTDIR = ../html
WPP = wpp -D DEFAULT_OUTPUTDIR=$(DSTDIR)
# list of all source raws in current directory
SRC = $(subst ./,, $(shell find -name "*.raw"))
# list of all htmls we want to create in DSTDIR
DST = $(addprefix $(DSTDIR)/, $(SRC:.raw=.html))
all : $(DST)
$(DSTDIR)/%.html : %.raw
	$(WPP) $<
As you may have noticed, to make the makefile more flexible we defined an additional variable called DSTDIR, which of course contains the name of the destination directory. WPP has to know about it, so we pass it to every 'wpp' instance using DEFAULT_OUTPUTDIR (see section 4 of the WPP Manual). Then we use a make feature which wasn't mentioned before: functions. Every function call looks like this:
$(function arguments)
First we use the 'shell' function to pass the 'find -name "*.raw"' command to the shell. As you probably know, it prints a list of all files matching the "*.raw" pattern in the directory hierarchy starting from the current directory. Its output might look like:
./index.raw
./about.raw
./contact/email.raw
Make takes care of the newline-to-space conversion, but we have to strip the leading './'. This can be done using the 'subst' function:
$(subst from,to,text)
which converts each occurrence of "from" in "text" into "to". In this way we create the SRC variable, which is a list of all raw files in the current directory and below. We can't use it in our default target yet, because that target has to depend on all the htmls (not raws) which make up the site. Note that we can't use 'find' to search for the htmls, because they might not exist yet: we are still trying to create or update them!
So, using the $(SRC:.raw=.html) substitution reference (similar to the subst function, but shorter, and applied only to the end of each word), we replace '.raw' at the end of each word with '.html' and get the resulting list. But the destination files have to be in DSTDIR, so we use yet another make function, "addprefix", to add the '$(DSTDIR)/' prefix before every word on our list. That's how the DST list is created.
Now, in the default target, we can simply depend on the whole $(DST) list of space-separated destination files which make up our site:
# default rule
all : $(DST)
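The variable-building steps of this section can be checked in isolation. The sketch below assumes a fixed sample list (the find output shown earlier) in place of the shell call, and adds a hypothetical 'show' target that is not part of the article's makefile. SRC_ALT is also an assumption of this sketch: patsubst strips only a leading './' from each word, whereas subst replaces every occurrence.

```make
# Fixed sample input instead of the shell call, so the result
# does not depend on the files on disk.
FOUND  = ./index.raw ./about.raw ./contact/email.raw
DSTDIR = ../html

# Same expressions as in the article, applied to the sample list.
SRC = $(subst ./,,$(FOUND))
DST = $(addprefix $(DSTDIR)/,$(SRC:.raw=.html))

# Alternative way to strip the prefix (not the article's choice).
SRC_ALT = $(patsubst ./%,%,$(FOUND))

# 'make show' prints the lists so each step can be inspected.
show :
	@echo "SRC     = $(SRC)"
	@echo "DST     = $(DST)"
	@echo "SRC_ALT = $(SRC_ALT)"
```

Running 'make show' lists index.raw, about.raw and contact/email.raw for SRC and SRC_ALT, and ../html/index.html, ../html/about.html and ../html/contact/email.html for DST.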
Automatic dependencies
We could just finish here. But wait a minute, what will happen if you modify some template you use in a file? Or you re-scale an image which is parsed by WPP using @HTML_IMAGE@ to automatically determine its width and height?
When you type make in such a case you will see:
make: Nothing to be done for `all'.
This is because make only knows that your site depends on every html file (the 'all' rule), and that every html file depends on its own raw (the pattern rule). It knows nothing about templates, images, WPP configuration or any other file that could change WPP's output. We have to tell make about such files, but as you already know we are lazy, and we don't want to do this by hand. That's where WPP's "-d" option comes in handy:
-d, --depend Generate dependencies for make.
It looks like the author foresaw what we would try to achieve :) This option forces WPP to look into each file and note every image used and (optionally) every RURL link. It also generates config file and template dependencies. The output is in make format, similar to:
../html/l5k/index.html: \
Config \
TEMPLATES/head.tmpl \
../html/portal.gif \
../html/index.raw
As you may guess, a backslash followed by a newline lets you continue the current line on the next one. We could append this output to our makefile on each make run, but rebuilding the dependencies every time would be a great waste of time and resources. It is wiser to store this information in a separate file, and rebuild it only when necessary:
# rule to build 'Makefile.dep', alias 'dep'
Makefile.dep dep :
	@$(WPP) -d $(SRC) > Makefile.dep
Moreover we have to append
include Makefile.dep
at the end of our makefile to make use of the generated dependency information. Our target is called 'Makefile.dep' because it generates this file, and it will be invoked by the include mentioned above if Makefile.dep doesn't exist. The 'dep' alias is just for your convenience, so you can just type:
make dep && make
if you suspect that your modifications have changed the dependencies, and just
make
otherwise.
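One possible refinement, offered as an assumption rather than as the method of this article: if Makefile.dep lists the raw files as its own dependencies, make regenerates it whenever a raw file is newer than it. The fragment below reuses WPP and SRC as defined earlier and would replace the combined 'Makefile.dep dep' rule. It belongs after the 'all' rule, so that 'all' stays the default target.

```make
# Fragment: assumes WPP and SRC are defined as earlier in the makefile.

# 'dep' remains an alias that regenerates the file on request
# (as long as no file named 'dep' exists).
dep :
	@$(WPP) -d $(SRC) > Makefile.dep

# Assumption of this sketch: any changed raw file may change the dependencies.
Makefile.dep : $(SRC)
	@$(WPP) -d $(SRC) > $@

include Makefile.dep
```

A new dependency introduced inside a template is still not detected this way, so 'make dep' remains useful after such changes.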
Cleaning up
Now you have a fully working system based on make. To finish, you could add one final rule, which is used to clean things up and remove every html file in the destination directory (warning: double check that it is correct or you could lose some important files!):
clean :
	find $(DSTDIR) -name "*.html" -exec rm -f {} \;
Now you could type 'make clean' to remove unnecessary files and free some disk space, or to rebuild the entire site from scratch.
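For reference, the pieces from this article can be assembled into one makefile roughly as follows. Two details are assumptions of this sketch and not taken from the text above: the .PHONY line, which tells make that 'all', 'dep' and 'clean' are aliases rather than files, and the explicit '.' passed to find, which some find implementations require. It needs WPP installed in order to run.

```make
DSTDIR = ../html
WPP    = wpp -D DEFAULT_OUTPUTDIR=$(DSTDIR)

# list of all source raws in the current directory and below
SRC = $(subst ./,,$(shell find . -name "*.raw"))
# list of all htmls we want to create in DSTDIR
DST = $(addprefix $(DSTDIR)/,$(SRC:.raw=.html))

# Assumption (not in the article): these targets are aliases, not files.
.PHONY : all dep clean

# default rule
all : $(DST)

# pattern rule to create every html from its raw
$(DSTDIR)/%.html : %.raw
	$(WPP) $<

# rule to build 'Makefile.dep', alias 'dep'
Makefile.dep dep :
	@$(WPP) -d $(SRC) > Makefile.dep

# remove every generated html (double check DSTDIR first!)
clean :
	find $(DSTDIR) -name "*.html" -exec rm -f {} \;

include Makefile.dep
```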
That's all, I hope that most of the simple tips included in this text will be helpful to you and will simplify your site management with WPP a lot. May the Source be with You!