Advanced Pod-ding

Data Mining
CS 510 (DM)
Winter, 2004

Why all the scripting?

This subject requires students to write and document shell and bash scripts, then show the output of those scripts in a web-based report. This web page is about a tool that simplifies all of that. The tool implements some simple extensions to standard PerlPod.

In fact, this page is an example of that tool. This is not an html file. It is actually a shell script made readable using some html/perlpod tricks. It also auto-includes some output files generated by this tool.

So, before reading on, have a look at the raw source for this file. Notice that it's a shell script with lots of commented lines.
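To make the idea concrete, here is a minimal sketch of such a dual-purpose file. The file name demo.sh and its contents are invented for illustration, and the real pages may be laid out differently. Pod commands sit in comment lines, so bash ignores them, while stripping the leading # from each line (as rinclude.awk does, see Source code) exposes the pod. The code line is indented by one space, which pod treats as a verbatim paragraph.

```shell
# Hypothetical example: demo.sh is an invented file, not part of the site.
tmp=$(mktemp -d)
cat > "$tmp/demo.sh" <<'EOF'
#=head1 Demo
#
#This paragraph is documentation; bash sees only a comment.
#
 echo "hello from the script"
EOF

# View 1: run the file as an ordinary shell script.
as_script=$(bash "$tmp/demo.sh")
echo "$as_script"

# View 2: strip one leading '#' per line to reveal the pod text.
as_pod=$(sed 's/^#//' "$tmp/demo.sh")
echo "$as_pod"
rm -rf "$tmp"
```

The same file therefore has two readings: an executable script for the shell, and a pod document once the comment marks are removed.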

Motivation

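It's useful to extend standard pod with some other facilities: recursive file includes (lines of the form #=include FILE) and pod markup kept inside shell comments, so that one file can be both a runnable script and its own documentation.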


Usage

Installation

First, if you have installed anything from this site before, save your config file to somewhere safe.

Copy the following files into your own directory from ~/timm/public_html/dm or http://www.cs.pdx.edu/~timm/dm/site.zip:

Standard files
site (main driver) and config (path names)

Support code
Recursive includes: rinclude.awk and the site map sitemap.sh.

Sample output files
siteeg.out

Word processing support files
Style sheets: tiny.css; small.css; big.css. Standard includes: the cs510.pod header and the footer.timm footer.

Miscellaneous graphics
mining.jpg, psu_logo.jpg, timbeach.jpg, timbeach.png

Make site executable:

 chmod +x site

Configuration

Edit sitemap.sh to include your files.

Edit big.css to change the appearance of the generated html files.

Copy footer.timm to footer.yourname and add in your own details. Now, when you write your own pod files, add this as the last line:

 #=include footer.yourname

Copy cs510.pod to yourheader.pod and change the context bar (top right) stuff to reflect what you want to see on all your pages. Change the line

 #=include cs510.pod

to

 #=include yourheader.pod

Compare your saved version of config with the new version you just copied and fix up any paths. Then load the default settings:

 . config

Change the following variables, if appropriate:

Location of style file
 CSS="big.css"
String for "top of page" link
 TOP="[TOP]"
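Because config is loaded with `.` (the shell's source command), its variables land in the current shell and can be changed after loading. A small sketch of that behaviour follows. The file written here is a stand-in with made-up contents, not the real config, and whether site re-reads config on each run is not shown on this page.

```shell
# Stand-in config with invented contents; the real config may set more.
tmp=$(mktemp -d)
cat > "$tmp/config" <<'EOF'
CSS="big.css"
TOP="[TOP]"
EOF

. "$tmp/config"          # load the defaults into this shell
echo "defaults:  CSS=$CSS TOP=$TOP"

CSS="small.css"          # override one setting after loading
echo "overrides: CSS=$CSS TOP=$TOP"
rm -rf "$tmp"
```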

Command line

To rebuild the entire site:

 site

To rebuild just one file:

 site file

where file is a file mentioned in sitemap.sh.

Example output

Remaking whole site:

 bash-2.05$ site 
 perlpod.pod ==> perlpod.html; tmp.pod pod syntax OK.
 project.pod ==> project.html; tmp.pod pod syntax OK.
 commandlineweka.pod ==> commandlineweka.html; tmp.pod pod syntax OK.
 index.pod ==> index.html; tmp.pod pod syntax OK.
 site ==> site.html; tmp.pod pod syntax OK.

Remaking just one file:

 bash-2.05$ site perlpod.pod
 perlpod.pod ==> perlpod.html; tmp.pod pod syntax OK.


Source code

(Don't worry if you can't understand the following code. You will soon, just maybe not today.)

Pod: the engine room

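 pod() {
     file=$1    # file name = argument 1
     shift
     title=$*   # title = everything else
     # when a goal file is set, skip every other file
     [ -n "$goal" ] && [ "$goal" != "$file" ] && return 0
     stem=${file%.*}   # file name minus its extension
     echo -n "$file ==> $stem.html; "
     # expand #=include lines and strip leading comment marks
     $gawk -f rinclude.awk "$file" > tmp.pod
     if    podchecker tmp.pod   # only build html when the pod is valid
     then  pod2html -back "$TOP" -css "$CSS" --outfile="$stem.html" \
                    --infile=tmp.pod  -title "$title"
           chmod a+r "$stem.html"
     fi
 }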
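Two shell idioms do most of the work in pod(): the shift/$* pair, which splits the arguments into a file name and a title, and ${file%.*}, which removes the shortest trailing ".suffix". A standalone sketch follows; the function name split_args and the sample titles are made up for illustration.

```shell
# split_args is a hypothetical helper used only to illustrate the idioms.
split_args() {
    file=$1            # first argument: the file name
    shift              # drop it; the remaining arguments form the title
    title=$*           # remaining words joined by spaces
    stem=${file%.*}    # remove the shortest trailing ".something"
    echo "$file ==> $stem.html ($title)"
}

line1=$(split_args perlpod.pod Writing Perl Pod)
line2=$(split_args site Site builder)   # no extension: stem equals file
echo "$line1"
echo "$line2"
```

The second call shows why site ==> site.html appears in the example output: a name without a dot passes through ${file%.*} unchanged.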

Rinclude.awk: recursive include

 {rinclude($0)}
 function rinclude (line,    x,a) {   # x and a are local variables
   sub(/^#/,"",line);      # strip the leading comment character
   split(line,a,/ /);
   if ( a[1] ~ /^=include/ ) { # looking for =include at start of line
     # a[2] names the included file; recurse on each of its lines
     while ( ( getline x < a[2] ) > 0) rinclude(x);
     close(a[2])}
   else {print line}
 }
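To see the recursion at work, the sketch below runs the same logic over three throwaway files, where a.pod includes b.pod and b.pod includes c.pod. The file names and contents are invented, the awk program is inlined so the example stands alone, and plain awk is used on the assumption that nothing gawk-specific is needed here.

```shell
tmp=$(mktemp -d)
printf '%s\n' '#=head1 Top' '#=include b.pod' '#last line of a' > "$tmp/a.pod"
printf '%s\n' '#from b' '#=include c.pod'                     > "$tmp/b.pod"
printf '%s\n' '#from c'                                        > "$tmp/c.pod"

# Included names are relative, so run awk from inside the directory.
expanded=$(cd "$tmp" && awk '
  { rinclude($0) }
  function rinclude(line,    x, a) {
    sub(/^#/, "", line)                 # strip one leading comment mark
    split(line, a, / /)
    if (a[1] ~ /^=include/) {           # a[2] names the file to pull in
      while ((getline x < a[2]) > 0) rinclude(x)
      close(a[2])
    } else print line
  }' a.pod)
echo "$expanded"
rm -rf "$tmp"
```

The lines from c.pod appear between those of b.pod and the last line of a.pod, which is the nesting order one would expect from a recursive include.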

Main driver

 [ -n "$1" ] && goal="$1"
 sitemap
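The driver relies on sitemap calling pod once per page, with pod itself skipping every page except the goal. The contents of sitemap.sh are not shown on this page, so the sitemap function and page titles below are guesses, and pod is replaced by a stub that only prints what it would build.

```shell
# Stub: same goal test as the real pod(), but no pod2html call.
pod() {
    file=$1; shift
    [ -n "$goal" ] && [ "$goal" != "$file" ] && return 0
    echo "$file ==> ${file%.*}.html"
}

# Guess at the shape of sitemap.sh: one pod call per page.
sitemap() {
    pod perlpod.pod Perl pod notes
    pod project.pod The project
    pod index.pod   Home
}

goal=""
all=$(sitemap)                 # like running: site
goal="project.pod"
one=$(sitemap)                 # like running: site project.pod
echo "$all"
echo "$one"
```

With an empty goal every page is rebuilt; with a goal set, every call except the matching one returns early.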


Tool support

Finding file references in this pod
A file reference in this system is one of:
 #=include FILE
 F<FILE>

The files utility reports all such references, plus all references in included files. See http://www.cs.pdx.edu/~timm/dm/files.html.
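As a rough illustration of the idea only (this is not the author's files utility), both kinds of reference can be pulled out of a single file with sed and grep. The sketch uses the #=include form seen in the Configuration section, does not follow references into included files, and works on an invented sample file.

```shell
tmp=$(mktemp -d)
cat > "$tmp/page.pod" <<'EOF'
#=include cs510.pod
#See F<siteeg.out> and F<big.css> for details.
#=include footer.timm
EOF

# list_refs is a hypothetical helper: prints one referenced file per line.
list_refs() {
    # "#=include FILE" lines: keep the second word
    sed -n 's/^#=include  *\([^ ]*\).*/\1/p' "$1"
    # F<FILE> markers: there may be several on one line
    grep -o 'F<[^>]*>' "$1" | sed 's/^F<//; s/>$//'
}

refs=$(list_refs "$tmp/page.pod" | sort)
echo "$refs"
rm -rf "$tmp"
```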

Zipping up all relevant files
The zips utility bundles all referenced files (found by files) into a zip file. This allows for easy generation of one file for simple downloading. See http://www.cs.pdx.edu/~timm/dm/zips.html.

Making a site map
The makemap utility builds a site map from sitemap.sh. See http://www.cs.pdx.edu/~timm/dm/makemap.html.

Rebuilding site
The stuff utility calls all the above tools to completely rebuild the site. See http://www.cs.pdx.edu/~timm/dm/stuff.html.


Quirks

Tricks

Annoyances

Bugs


Credits

Author

Tim Menzies, tim@menzies.us, http://menzies.us

Software

This page generated by Site: see http://www.cs.pdx.edu/~timm/dm/site.html

Acknowledgements

This site is built using PerlPod.

Style sheet switching method taken from Eddie Traversa's excellent and simple-to-apply tutorial: http://dhtmlnirvana.com/content/styleswitch/styleswitch1.html.

Search engine powered by ATOMZ http://www.atomz.com/search/. Note that the indexes to this site are only updated weekly (hey, it's a free service; what more d'ya want?).

Icons on this site come from http://www.sql-news.de/rubriken/olap.asp and http://www.ifnet.it/webif/centrodi/eng/toolbar.htm.

The Java machine learners used at this site come from the extensive data mining libraries found in the University of Waikato's Environment for Knowledge Analysis (the WEKA) http://www.cs.waikato.ac.nz/ml/weka/.


Legal

Copyright

Copyright (C) Tim Menzies 2004

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; see http://www.gnu.org/copyleft/gpl.html. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Disclaimer

The content available from or through this web page is provided 'as is' and the author makes no warranties or representations regarding the accuracy or completeness of the information. Your use of this web page and information is at your own risk. You assume full responsibility and risk of loss resulting from the use of this web page or information. If your use of materials from this page results in the need for servicing, repair or correction of equipment, you assume any costs thereof. Follow all external links at your own risk and liability.
