This subject requires students to write and document shell and bash scripts, then show the output of those scripts in a web-based report. This web page is about a tool that simplifies all of that. The tool implements some simple extensions to standard PerlPod.
In fact, this page is an example of that tool. This is not an html file. It is actually a shell script made readable using some html/perlpod tricks. It also auto-includes some output files generated by this tool.
So, before reading on, have a look at the raw source for this file. Notice that it's a shell script with lots of commented lines.
It's useful to extend standard pod with some other facilities.
. sitemap.sh
Here, we turn off the next and section commands within sitemap.sh
next() { true; } # will auto-build site map, one day
section() { true; }
sitemap() {
#FILENAME #TITLE
#-------------- #------------------------------------
section Mining
pod percentile percentile:: percentile chops
pod bars bars:: simple histograms
pod nbins nbins:: discretize numeric data into N bins
pod ranges ranges:: extract ranges from data in ARFF or C4.5 format
pod stable stable:: find stable treatments
pod dos2unix dos2unix:: convert DOS to UNIX format
pod zapblanks zapblanks:: remove empty columns and rows
pod inspectlog.pod Log of doing some data mining
pod nbc nbc:: A naive bayes classifier
next; section Weka
pod commandlineweka.pod CS510-DM: Using WEKA from the command line
pod kinds.pod Different kinds of attributes and learners
pod wekatools.pod Command-line calls to WEKA learners
pod traintest traintest:: Generate train and test sets
pod randomarff randomarff:: generate randomly sorted arff file
next; section Modeling
pod assume assume:: generate and cache assumptions
pod cocomo cocomo:: cost-estimation with COCOMO
pod cocomoexpert.pod cocomo expert
pod dsl.pod Notes on domain-specific languages
pod kfarm1.pod Learning software processes
next; section Reference
pod functionpoints.pod Notes on function points
pod rx.pod Introduction to treatment learning
next; section Gawk
pod gawk4dm.pod Why all the scripting?
pod gawk101.pod Introduction to Gawk
pod gawk4ai.pod Use of Gawk for AI
pod gawk4teaching.pod Use of Gawk for teaching
pod lib.awk Useful Gawk library functions
pod readTableEg readTable.awk:: show reading tables into Gawk
next; section Pods
pod perlpod.pod How to build web pages quickly
pod gnuplot.pod Scientific plots and pods
pod site Advanced Podding
pod files files:: extracts files referenced in a pod
pod zips zips:: create zip files for site files
pod makemap makemap:: create a site map for site files
pod stuff stuff:: batch up all the site maintenance stuff
next; section Bash
pod freqx Simple frequency counts
pod template Template for bash applications
next; section Subject
pod index.pod CS510-DM: Special Topics: Data Mining
pod project.pod CS510-DM: Project
pod review.pod CS510-DM: Review questions
next; section Site
pod news.pod "What's new"
pod search.pod Search page
pod sitemap.pod "Site map file (auto-generated)"
}
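Because every line inside sitemap is an ordinary shell function call, redefining pod, section and next changes what the map does. The following is a minimal sketch of that idea, not code from site itself: the function demo_sitemap and the listing format are made up for illustration, and only two entries are copied from the map above.

```shell
# Hypothetical demo: reinterpret sitemap-style entries as a plain-text listing.
section() { printf '== %s ==\n' "$1"; }   # print a section heading
next()    { :; }                          # still a no-op, as in site
pod()     { file=$1; shift; printf '%-12s %s\n' "$file" "$*"; }

# A cut-down stand-in for the real sitemap function.
demo_sitemap() {
  section Pods
  pod perlpod.pod How to build web pages quickly
  next; section Bash
  pod freqx Simple frequency counts
}

demo_sitemap
```

The same map can therefore drive a build, a listing, or a site map, depending on which definitions are in force when sitemap is called.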
The entire site can be recreated, or just one particular file (the latter is useful for debugging, when you are continually remaking just one file).
podchecker approves of them.
=include statements for including
output and coding examples or even nested pod files. Included files
can include other files recursively.
This is a useful project management tool. Group members can work
on separate files in separate directories and one
manager can build the entire project by writing a pod file
with lots of =include statements.
First, if you have installed anything from this site before, save your config file to somewhere safe.
Copy the following files into your own directory from
~/timm/public_html/dm or http://www.cs.pdx.edu/~timm/dm/site.zip:
Make site executable:
chmod +x site
Edit sitemap.sh to include your files.
Edit big.css to change the appearance of the generated
html files.
Copy footer.timm to footer.yourname
and add in your own details. Now, when you write your
own pod files, add as a last line
#=include footer.yourname
Copy cs510.pod to yourheader.pod
and change the context bar (top right) stuff to reflect
what you want to see on all your pages. Change the line
#=include cs510.pod
to
#=include yourheader.pod
Get default settings:
Compare your safe version of config
with the new version you just copied and fix up any paths.
. config
Change the following variables, if appropriate:
CSS="big.css"
TOP="[TOP]"
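As a sketch only, a config written with default-value expansions lets a setting be overridden from the environment without editing the file. CSS and TOP are the variables named above. The gawk variable is an assumption: pod() calls $gawk, so it is presumably set somewhere, but its definition is not shown on this page.

```shell
# Hypothetical config fragment: keep a value already set in the
# environment, otherwise fall back to the default.
CSS="${CSS:-big.css}"     # style sheet passed to pod2html
TOP="${TOP:-[TOP]}"       # text of the "back to top" links
gawk="${gawk:-gawk}"      # assumed: the awk interpreter that pod() runs
```

With this form, something like CSS=small.css site would build with a different style sheet for one run (small.css is a made-up name).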
To rebuild the entire site:
site
To rebuild just one file:
site file
where file is a file mentioned in sitemap.sh.
Remaking whole site:
bash-2.05$ site
perlpod.pod ==> perlpod.html; tmp.pod pod syntax OK.
project.pod ==> project.html; tmp.pod pod syntax OK.
commandlineweka.pod ==> commandlineweka.html; tmp.pod pod syntax OK.
index.pod ==> index.html; tmp.pod pod syntax OK.
site ==> site.html; tmp.pod pod syntax OK.
Remaking just one file:
bash-2.05$ site perlpod.pod
perlpod.pod ==> perlpod.html; tmp.pod pod syntax OK.
(Don't worry if you can't understand the following code. You will soon, just maybe not today.)
pod() {
  file=$1            # file name = argument 1
  shift
  title=$*           # title = everything else
  # if a goal is set, skip every file except the goal
  [ -n "$goal" ] && [ "$goal" != "$file" ] && return 0
  stem=${file%.*}    # file name without its extension
  echo -n "$file ==> $stem.html; "
  # expand all =include lines into one temporary pod file
  $gawk -f rinclude.awk "$file" > tmp.pod
  if podchecker tmp.pod
  then pod2html -back "$TOP" -css "$CSS" --outfile="$stem.html" \
         --infile=tmp.pod -title "$title"
       chmod a+r "$stem.html"
  fi
}
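To see the goal test and the stem expansion without running podchecker or pod2html, here is a dry-run sketch. The names pod_dry and dry_map are hypothetical stand-ins for pod and sitemap; the two entries are copied from the site map above.

```shell
# Dry-run stand-in for pod(): print what would be built, build nothing.
pod_dry() {
  file=$1; shift
  # With a goal set, skip every entry except the one requested.
  [ -n "$goal" ] && [ "$goal" != "$file" ] && return 0
  stem=${file%.*}        # perlpod.pod -> perlpod; site (no dot) -> site
  echo "$file ==> $stem.html ($*)"
}

# Cut-down stand-in for sitemap().
dry_map() {
  pod_dry perlpod.pod How to build web pages quickly
  pod_dry site Advanced Podding
}

goal="";            dry_map    # no goal: both entries are reported
goal="perlpod.pod"; dry_map    # goal set: only perlpod.pod is reported
```

Note that ${file%.*} leaves a name without a dot unchanged, which is why the shell script site becomes site.html.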
{rinclude($0)}
function rinclude (line, x,a) {
sub(/^#/,"",line); # strip leading comments
split(line,a,/ /);
if ( a[1] ~ /^\=include/ ) { #looking for =include at start of line
while ( ( getline x < a[2] ) > 0) rinclude(x);
close(a[2])}
else {print line}
}
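The recursive include can be tried on throwaway files. This sketch assumes plain awk behaves like gawk for this script; the function name demo_include and the three file names are invented for the example, and the awk program is the one above with only layout changes.

```shell
# Build three small files in a temporary directory, then expand the includes.
demo_include() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    printf '%s\n' '#=head1 Report' '#=include body.pod' '#=cut' > top.pod
    printf '%s\n' 'Some text.' '=include output.txt'            > body.pod
    printf '%s\n' ' 42 lines processed'                         > output.txt
    awk '
      { rinclude($0) }
      function rinclude(line,   x, a) {
        sub(/^#/, "", line)               # strip one leading comment char
        split(line, a, / /)
        if (a[1] ~ /^=include/) {         # =include FILE: read FILE instead
          while ((getline x < a[2]) > 0) rinclude(x)
          close(a[2])
        } else print line
      }' top.pod
  )
  rm -rf "$dir"
}

demo_include
```

top.pod includes body.pod, which in turn includes output.txt, so the result is a single stream of pod with both levels of include expanded in place.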
[ -n "$1" ] && goal="$1"
sitemap
#include FILE F<FILE>
The files utility reports all such references and all
references in included files. See http://www.cs.pdx.edu/~timm/dm/files.html.
The zips utility bundles all referenced files (found by
files) into a zip file. This allows for easy
generation of one file for simple downloading.
See http://www.cs.pdx.edu/~timm/dm/zips.html.
The makemap utility builds a site map from sitemap.sh.
See http://www.cs.pdx.edu/~timm/dm/makemap.html.
The stuff utility calls all the above tools to
completely rebuild the site.
See http://www.cs.pdx.edu/~timm/dm/stuff.html.
# is useful for including commands without
executing them; e.g.
site
is rendered by
# site
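The trick works because the shell and the pod tools read the same line differently. A small sketch, using sed to imitate the one thing rinclude.awk does to each line (remove a single leading #); the name pod_view is hypothetical.

```shell
# What the pod tools see once the first "#" on each line is removed.
pod_view() { sed 's/^#//'; }

# The shell treats all three lines as comments and runs nothing.
printf '%s\n' '#=head1 Usage' '#' '# site' | pod_view
```

After stripping, "# site" becomes " site". A line that starts with white space is a verbatim paragraph in pod, so the command is displayed as typed but never executed.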
If a sitemap.sh title contains brackets or quotes, strange things
happen. Best to enclose all such titles in quotes.
included files would save some time.
Site remakes every file, even when its source is older than the generated
html file. This is hard to fix: if you block remakes
of non-updated files, you must also check, recursively, that none of the
included files have been updated either.
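One possible shape for that recursive check is sketched below. It is not part of site. The names includes_of and needs_rebuild are hypothetical, and the sketch assumes file names without spaces, include paths relative to the current directory, no circular includes, and a shell whose test supports -nt (bash does, though POSIX does not require it).

```shell
# List a pod file plus every file it pulls in via =include, recursively.
includes_of() {
  echo "$1"
  # Accept an optional leading "#", as rinclude.awk does.
  sed -n 's/^#\{0,1\}=include  *\([^ ]*\).*/\1/p' "$1" 2>/dev/null |
  while read -r inc; do
    includes_of "$inc"
  done
}

# Succeed (status 0) when the html file is missing, or when the source
# or any file it includes is newer than the html file.
needs_rebuild() {
  src=$1 html=$2
  [ -f "$html" ] || return 0
  for f in $(includes_of "$src"); do
    [ "$f" -nt "$html" ] && return 0
  done
  return 1
}
```

Inside pod(), a guard such as needs_rebuild "$file" "$stem.html" || return 0 would then skip files that are up to date, at the cost of scanning every included file on each run.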
included into tmp.pod before podchecker
tests them. This means that error messages are reported
using line numbers from tmp.pod and NOT the original source
code files.
Workaround: when tracking down errors from podchecker,
read tmp.pod.
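Once the offending line in tmp.pod is known, a small helper can point back to the file it came from. This is a sketch under assumptions: whence is a made-up name, tmp.pod is read from the current directory, and the match is a plain fixed-string search, so a line that occurs in several sources is reported once per occurrence.

```shell
# Hypothetical helper: report where line N of tmp.pod occurs in the sources.
# usage: whence N file...
whence() {
  n=$1; shift
  text=$(sed -n "${n}p" tmp.pod)
  [ -n "$text" ] || return 1     # blank or missing line: nothing to search for
  # /dev/null makes grep print file names even when given one source file
  grep -nF -- "$text" /dev/null "$@"
}
```

For example, whence 12 *.pod prints file:line:text for every source line containing the text of line 12 of tmp.pod. Lines that carry a leading # in the source still match, since the search is for a substring.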
Tim Menzies,
tim@menzies.us,
http://menzies.us
This page generated by Site:
see http://www.cs.pdx.edu/~timm/dm/site.html
This site is built using PerlPod. Style sheet switching method taken from Eddie Traversa's excellent and simple-to-apply tutorial: http://dhtmlnirvana.com/content/styleswitch/styleswitch1.html.
Search engine powered by ATOMZ http://www.atomz.com/search/. Note, the indexes to this site are only updated weekly (heh, it's a free service; what more ja want?).
Icons on this site come from http://www.sql-news.de/rubriken/olap.asp and http://www.ifnet.it/webif/centrodi/eng/toolbar.htm.
The JAVA machine learners used at this site come from the extensive data mining libraries found in the University of Waikato's Environment for Knowledge Analysis (the WEKA) http://www.cs.waikato.ac.nz/ml/weka/
Copyright (C) Tim Menzies 2004
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; see http://www.gnu.org/copyleft/gpl.html. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
The content provided from or through this web page is provided 'as is' and the author makes no warranties or representations regarding the accuracy or completeness of the information. Your use of this web page and information is at your own risk. You assume full responsibility and risk of loss resulting from the use of this web page or information. If your use of materials from this page results in the need for servicing, repair or correction of equipment, you assume any costs thereof. Follow all external links at your own risk and liability.