Copyright (C) 1992, 1993 Portland State University and Trent A. Fisher
Printed 18 July 1994
This is Edition 1.0 of the PSU-CS Sysgroup Handbook, for the Computer Science Systems Staff, last updated 29 September 1993.
Published by Portland State University P.O. Box 751, CMPS Portland, OR 97207
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
The Computer Science Systems Group (Sysgroup) is dedicated to providing and maintaining computing resources for the CS faculty, staff and students. Sysgroup is responsible for:
Sysgroup's goal is to provide and maintain a wide range of hardware and software with the greatest quality possible. Unlike a Computer Center, it is our responsibility to be flexible to unusual requests, and fulfill them to the best of our ability.
This document explains the procedures which have been unwritten for many years, and others which are newly created for the changing needs of the department. It is not intended to be an all-inclusive set of rules and regulations, but instead, a set of guidelines (and some rules) to help shape the culture within the Systems Staff.
The fewer rules the better: Rules reduce freedom and responsibility. Enforcement of rules is coercive and manipulative, which diminishes spontaneity and absorbs group energy. The more coercive you are, the more resistant the group will become. Your manipulations will only breed evasions. Every law creates an outlaw. This is no way to run a group. (1)
Much of the work on the Systems is performed by students. These students generally start out as volunteers who work for their own edification and for the opportunity to become a paid member of the Sysgroup.
In some cases, a volunteer may be an underclassman or may not be attending PSU and, as such, not be entitled to an account on the CS machines. However, for this purpose, a volunteer may be granted a guest account. See section Guest Accounts, for more info.
In order to become a volunteer, a person must attend the Systems Workshop, pass a pre-test, and be interviewed by the Systems Manager. A file will be kept on each volunteer, which should, at minimum, contain a copy of their resume.
After a volunteer is accepted as a new member of Sysgroup, they will be put into the "sysgroup" mailing list.
In order to maintain a continuous supply of persons for doing systems work, a training program will be run. It will have approximately 1-2 hours of lecture per week, and will cover much of the necessary background material for Systems Administration (e.g. shell programming, filesystem structure, networking, troff, etc.) along with actual Systems Administration material.
Some early lectures have been video-taped and are available in the EE office. The materials for this workshop may be found in `/home/rigel/sys/group/workshop'.
Since the Systems Workshop is intended to train volunteers at a lowest common denominator, there are some people who may want to pursue specialized topics. To encourage this, special interest groups (SIGs) will be formed to discuss, and work on, their chosen subject area in greater detail than would be possible in the Systems Workshops.
Each SIG will have a coordinator, who will be charged with arranging meeting times and leading the group. People interested in doing this should post a message to `psu.systems.group'.
These SIGs may also produce lectures for the general Sysgroup population, and put them on videotape.
The most crucial SIG is CORE (Computer Operations Research E?) which concentrates on hardcore C programming, i.e. kernel hacking, device drivers, and similar sorts of systems programming.
This SIG is not intended to teach the basics of C programming (although, the participants may work on such a workshop). Some of the work will be done as part of job duties with the Systems Staff, other projects may be strictly voluntary.
This chapter contains the job descriptions for each member of The Systems Staff. Some general information follows, that applies to every member of The Systems Staff.
The Systems Staff consists of the following:
Except for the Systems Manager and the Systems Programmer, all job descriptions list all possible duties applicable under that title. When a person is hired, certain duties will be assigned out of these lists. This list may be renegotiated by both parties as necessary. It is quite possible for a person to have more than one job title which applies to them if they have assigned duties from several job titles.
The reason for this approach is that student employees have a wide range of skills and available working hours, and also because of natural turnover. The approach used here avoids having to rewrite job descriptions continually.
This section details the responsibilities and other vital information applicable to all members of the Systems Staff.
All members of the Systems Staff are responsible for following the policies within this document. In particular, responding to e-mail and emergencies are of paramount importance. See section Problem Response, for more info.
All members of the Systems Staff are expected to attend the weekly Sysgroup meetings and to provide a summary of work done over the week.
Also, all Systems Staff members will use the task system for tracking work done and in progress (See section Task System).
Any member of the Systems Staff may be exposed to the following working conditions. If necessary, reasonable accommodations can be arranged to avoid certain working conditions, provided the specific job description does not list them as a necessary condition.
Any member of the Systems Staff may have any number of volunteers entrusted to their supervision.
The Systems Manager is a full-time research assistant who manages all hardware and software in the department (see section Introduction, for more details.), and the Systems Staff. The Systems Manager is responsible for ensuring that the policies within this document are followed by the Systems Staff, including the Systems Manager.
The Systems Manager works under the Faculty Systems Committee and will meet with them periodically (every 3-4 weeks).
The Systems Manager's duties are:
The Systems Manager will work with the Electrical Engineering Systems Manager and Hardware Technician to maintain shared equipment. The Systems Manager will also work cooperatively with the Systems Staffs of other departments in the University, and shall maintain contact with the Systems Staffs of other institutions in the area (e.g. OSU, OGI).
The Systems Programmer is a full-time research assistant who develops, installs and maintains a wide range of system and application software. The Systems Programmer also administrates and co-administrates many systems.
The Systems Programmer must be an experienced Systems Administrator, and must be a proficient programmer, particularly in the languages suited to systems work (C, Perl, Bourne Shell, awk, sed, etc...), not to mention being skilled in installing and maintaining software.
The Systems Programmer works directly for the Systems Manager.
The specific duties are as follows:
The Systems Programmer will commonly have contact with the EE Hardware Technician, the EE Systems Manager, Telecommunications service, and System Administrators of other organizations (for UUCP).
The Hardware Technician maintains and repairs the various hardware which the CS Department has. This position is part-time (approx 10-20 hours per week).
The Hardware Technician should be able to diagnose common problems and replace components (e.g. flyback transformers).
The Hardware Technician will have to be able to work in all the working conditions specified above (See section General Info, for more info.)
The Hardware Technician has the following duties:
Some of the common contacts outside of the Systems Staff for this position are: The EE Hardware Technician, Physical Plant Electrician(s), Property Control Specialist, CS and EE Office Coordinator and Secretaries.
The X Administrator is a part-time (5-10 hours/week) position responsible for the operation and administration of the X terminals and the server machine(s) for them.
The X Administrator should be a proficient user of X windows, and be familiar with the overall structure of the system. Also, this person should have some system administration skills.
The X Administrator works directly under the Systems Manager, although it is possible that there may be more than one X Administrator, in which case they will also coordinate work with each other.
The X Administrator performs the following duties:
`xdm' is running on them.
This person will work closely with the X Programmer(s); see section X Programmer, for more information.
The X Programmer is a part-time position responsible for maintaining the X windows software on all the machines that run it.
The X Programmer should be an experienced C programmer, an experienced X windows user and, preferably, have some knowledge of X programming.
The X Programmer answers directly to the Systems Manager.
The duties for the position are:
The X Programmer will work closely with the X Administrator; see section X Administrator, for more info.
The Assistant Systems Administrators are part-time student employees who administrate small systems or co-administrate the larger CS machines. They answer directly to the Systems Manager. As the amount of time varies due to the different duties which fall under this category, the amount of time each duty takes is noted in brackets after the duty.
People in these positions must be experienced UNIX users and have either some experience or training in System Administration (see section Systems Workshop, for more info). Shell programming skills are also important to these positions.
The Assistant Systems Programmer is a part-time employee who works on specific programming projects. This may involve developing new software, or maintaining existing software. This person answers directly to the Systems Manager.
The Assistant Systems Programmer will be a proficient programmer, particularly in languages relevant to Systems Work (e.g. C, C++, Perl, Bourne Shell, awk, sed, etc.)
Any member of the Systems Staff who has special access to any of the department's resources will be expected to follow certain guidelines.
After-hours access is only available on a special-case basis. There is no after-hours access to the Mill Street Lab, only to the Systems Office.
Anybody with after-hours access is expected to act responsibly with this privilege; those who cannot will be relieved of the burden.
The only people who will be given keys are those on University payroll (Student employees, Teaching Assistants, etc.). The rooms for which keys will be given are (in order of preference): Terminal Room, Systems Office(s), Machine Room, and Computer Science Office. Other keys (such as master keys) will be given out if they are necessary for the performance of the job.
Anybody who has access to the Systems Office is assumed to be a trustworthy and responsible individual. Anybody who has a desk in the Systems Office is expected to follow the following guidelines:
`top', etc.).
Most of the manuals in the Systems Office may be borrowed, provided that they are checked out by someone known to the Systems Staff, and that the materials are listed on the check-out sheet.
Anybody working on important projects or any paid Systems person should keep a schedule in their `.plan' file and post it on the door to the Systems Office. This schedule should specify when they are in PCAT, in person. Also, a schedule should be given to the Systems Manager.
During a person's specified office hours (if they have any) the in-out board outside of the systems office should be used.
The top priority of the Systems Staff is the timely response to problems. Anything else we do is utterly meaningless (to the user populace) unless we ensure that all problems get responded to, and that our demeanour always suggests that we are genuinely interested in solving their problems.
The following sections detail how and when problems should be responded to.
As a member of the Systems Staff, it is vitally important that E-mail is used effectively, not only from a mechanical perspective (i.e. how do I read mail?) but also from a stylistic perspective, which is what the remainder of this document covers.
Due to the varied nature of sysgroup mailings, generalities cannot be avoided in describing them. Hopefully, after being in sysgroup for a while, everything described herein will become clear.
To maintain optimum relations with faculty, students and any others, the following conventions must be followed. If you are unable to follow these conventions, you will be put into a position where you will not need to.
When a message comes to sysgroup, there should be an initial response (ACK) which will either indicate that you will be working on the problem, or that you have found a solution. This initial response should also contain a time estimate as to when the work will be completed. This initial message must be sent within the following time frames. (See section Systems Manager, for more info.)
When you send an initial response you are implicitly taking the responsibility for seeing that task through to its completion. This does not mean you have to resolve the problem yourself, but if you are unable to, you are responsible for finding someone who is able to resolve it, and for ensuring that the problem is, ultimately, resolved.
Unless you don't know the answer to a person's problem, or you know that someone else will respond, you should never ignore a sysgroup message. It is our collective responsibility to maintain an optimum response time.
If the project takes a long time, status reports should be sent periodically. These status reports should be sent to the involved individuals and to sysgroup.
When the problem is corrected, a message should be sent informing the involved individuals that you think it is fixed, and asking them to make sure it is to their satisfaction.
Graphically, the process is like this:
Mail -.----------------,-----------------,-> Final
       \              /                 /
        \            /                 /
         `----> ACK --------> Status -<
                            (            )
                             `----------'
Most of the items in this section are taken directly from various USENET etiquette guidelines(2).
These are some E-mail guidelines specific to anybody responding to messages on sysgroup:
`msgs' is OK. Either way the 4 line rule still applies (q.v.).
We are using a mail monitoring tool called GIPR (3) to handle task management. It intercepts any mail going to Sysgroup and stores and indexes each message. This way when someone sends a problem to Sysgroup, GIPR initiates a 'task'. Each time someone sends a related message it is stored with the original 'task'. When a task is finished a message is sent, replying to the person who sent the original problem, marked as a 'resolution', and then GIPR removes it from the list of tasks to be done.
The first person to respond to a message will be assigned responsibility for the task (although this can be reassigned later).
Since most operations can be done from a mail-reader and messages are coordinated by their `In-Reply-To:' fields, this means that you should always keep a message relevant to a task you are working on.
Most operations on the task system should be doable from any mail-reader. Control information for GIPR is encoded into the subject line as keywords between square brackets.
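As a rough illustration of what "keywords between square brackets" looks like mechanically, the Bourne shell sketch below pulls every bracketed token out of a subject line. The function name `subject_keywords' and the sample keywords are hypothetical; GIPR's real keyword set is not reproduced here, and this is not how GIPR itself is implemented.

```sh
# Print each [bracketed] keyword found in a subject line, one per line.
# Hypothetical helper for illustration; not part of GIPR.
subject_keywords() {
    # Split the subject at every '[', skip the text before the first one,
    # then keep only the part of each piece that precedes a closing ']'.
    printf '%s\n' "$1" | tr '[' '\n' | sed -n '2,$s/].*//p'
}
```

For example, `subject_keywords 'Re: disk full [alpha] [beta]'` prints the two (made-up) keywords on separate lines. A bracket that is never closed is ignored.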
The specifications for the priority field were assigned by the Faculty Systems Committee, and, thus, it is very important that they be followed.
The priority field consists of a letter `A-F' and a digit. The letter signifies the broad priority category. The number specifies a relative priority within the priority category.
These priorities are set by putting a line like `Priority: pri' anywhere in your message.
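The priority format described above (a letter `A-F' followed by a digit, on a `Priority:' line anywhere in the message) can be checked with a few lines of Bourne shell. This is only a sketch: the function name `message_priority' is made up, and it assumes exactly one letter and one digit, matched case-sensitively, which may be stricter than what GIPR accepts.

```sh
# Extract and validate the first "Priority:" line from a message on stdin.
# Prints "<letter> <digit>" (for example "B 2") and returns 0 on success;
# prints nothing and returns 1 if no well-formed priority line is found.
# Assumption: one letter A-F followed by exactly one digit.
message_priority() {
    pri=$(sed -n 's/^Priority:[ ]*\([A-F]\)\([0-9]\)[ ]*$/\1 \2/p' | sed 1q)
    [ -n "$pri" ] || return 1
    printf '%s\n' "$pri"
}
```

A message can be piped through it before sending, e.g. `message_priority < draft`, to catch a mistyped priority such as `Z9'.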
Many operations can be done on the task list by using GIPR at the command line. Ideally, few of these should ever be used.
Here is a summary of the more important command line options (in approximate order of importance):
{User Option} -l [task numbers]
{User Option} -s [task numbers]
{User Option} -p [task numbers]
Purge a message which should never have been made a task. This should not be done lightly.
{User Option} -r [task numbers]
Resolve the given task(s). This should not be used, in general.
{User Option} -a [base task number] [task numbers]
Attach the given task(s) to the given base task. This is mainly used to aggregate related tasks together, and to make up for brain-damaged mailers.
{User Option} -M [task number]
Return an `In-Reply-To:' line for a given task. This is used when the original message has been lost/erased/&c.
{User Option} -E [task number] [estimate date]
Change a task's estimate date. This should not be used, in general.
{User Option} -P [task number] [priority]
Change a task's priority. This should not be used, in general.
When there is an emergency to deal with, and the Systems Manager is not present to handle it, those present must determine who is in charge. The order of succession in such a situation is:
In any case, a person with keys has preference over persons without.
If the person in charge cannot handle the situation (particularly if the person is in one of the latter two categories) their first responsibility is to find someone who can (via phone, pager, `talk', etc.)
When none of the previous methods for handling a problem work, it may be necessary to page someone. Currently we have only one pager; its number is 299-9490. Below are some notes for both the person with the pager and the one doing the paging.
Eventually, it would be nice to set up a priority/event code into the area code, so that the on-call person will have an idea of the importance and meaning of the page. If you have any ideas for this please let me know.
This is an outline of what a generic workstation in the CS department will be like. This configuration will ease administration of the increasing numbers of workstations, and make the computing environment more consistent between workstations.
Deviations from this are possible, but will have their costs (either in decreased service, or in local SW maintenance).
Some sort of Sun Sparc (IP[XC], ELC, 1+, 2, 10, etc.) with at least enough disk space to hold the OS (~400 megs).
The filesystems `/', `/usr', `/var' and the like will be local.
One or more local filesystems for a faculty member's use may exist with the naming convention of `/home/hostname'/food name. All other filesystems will be taken care of via the automounter.
`/usr/local' (including `X11') will be mounted from rigel, sirius or xavier.
Accounts will be distributed via NIS (YP), and all home directories will be mounted from rigel.
Depending on the application, some or all users may be excluded from the workstation (although NIS will still run)
Mail will not be handled locally, i.e. mail sent to a workstation will be forwarded to rigel.
Rigel's mail spool will be mounted such that mail can be read locally
routed will run, to lessen dependence on gateways
This section details various folklore and guidelines for people who have root access on the CS machines.
In an environment where many people have root access, it is very easy to step on other people's toes. There are two paradigms for avoiding this:
`vipw' is an example of the latter.
The approach taken in the Sysgroup is a combination of the two preceding paradigms. Some guidelines will be detailed hereafter, which should give an idea as to what some of the common problem areas are.
Beyond these guidelines, anybody with root access is assumed to be conscientious and sensitive enough to avoid most problem areas.
`sudo' whenever possible, and use it for single commands when possible.
There is a variety of sources for information and assistance when confronted with system problems.
There is a great deal of information in `/home/rigel/sys/group', the notable files are:
Also there are a number of local tools for diagnosing and fixing systems problems. They may be located in any of the following directories, but the first two are definitely preferred:
This section details various common problems and fixes for them. This is not (and can never be) all-inclusive. Some related information can be found in section Modems and section Security.
There are a number of things that can go wrong with a user's account.
One of the most common is a forgotten password. There are two ways to handle this:
Another common problem is a person's startup files getting messed up. The files which can cause the greatest problems are `.cshrc', `.login', and `.xsession'.
Sometimes a user's home directory will have incorrect permissions (via `chmod' experimentation) or incorrect ownership (via SysAdmin oversight).
Another occasional problem is the password file on a YP server and its clients getting out of sync. There should be a program called `makeyp' which will push changes out; if not, run `make' in `/var/yp'.
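The fallback just described can be written down as a small Bourne shell function. This is a sketch only: the name `push_yp_maps' and the `DRY_RUN' variable are made up for illustration, and it assumes it is run as root on the YP server.

```sh
# Push updated YP (NIS) maps: prefer the local `makeyp' tool if it is on
# the PATH, otherwise fall back to running make in /var/yp.
# With DRY_RUN=1 the chosen command is printed instead of executed.
push_yp_maps() {
    if command -v makeyp >/dev/null 2>&1; then
        cmd="makeyp"
    else
        cmd="cd /var/yp && make"
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$cmd"
    else
        sh -c "$cmd"
    fi
}
```

Running `DRY_RUN=1 push_yp_maps` first shows which of the two commands would be used on the current machine.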
- printer queues
A common problem (especially after a reboot) is the printer daemon silently dying. This can happen on any machine running `lpd', although it is most common on Suns. `lpq' will indicate this with `no daemon present'. If there is no `lpd' running on the machine, start another up.
Another strategy for dealing with misbehaving printer daemons is to kill all the `lpd''s and restart one `/usr/lib/lpd'.
Sometimes the print queue jams, i.e. the active job has printed but it never gets removed from the queue. In this case use `lprm' to remove the active job.
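The check for a dead printer daemon can be sketched in Bourne shell as follows. The function names are hypothetical, the daemon path `/usr/lib/lpd' is the one given above, and the restart must be run as root. The sketch only looks at the default printer's queue.

```sh
# Return 0 if the lpq output supplied on stdin reports a dead daemon.
lpd_is_dead() {
    grep 'no daemon present' >/dev/null
}

# Start a new printer daemon if lpq reports that none is present.
# Assumptions: run as root; the daemon lives at /usr/lib/lpd.
restart_lpd_if_dead() {
    if lpq 2>&1 | lpd_is_dead; then
        /usr/lib/lpd
    fi
}
```

Keeping the test (`lpd_is_dead') separate from the action makes it easy to try the check on saved `lpq' output before letting it restart anything.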
You can use `chkps -e root' to find jobs belonging to people no longer logged in. If they are not `nice''d you may kill them, and notify the people that you did so. There is a form letter for this.
Sometimes a runaway process will cause a pty to misbehave. Use `chkps' as above to find and kill any offending programs. If that doesn't work, try to determine the next free pty and use `lsof' to find the misbehaving processes.
The program `chkaddr' can help find mail loops (via `.forward' files).
This chapter details the guidelines for altering hardware and software which is in general use in the department.
All significant hardware changes should be preceded by a full backup to prevent any data loss due to the change. See section Backups. The following sections also apply to hardware modifications, as it typically requires downtime. See section Planned Downtime and section Routine Preventive Maintenance, for more information.
Test the software in question before installation or install it on an isolated machine for testing. This is particularly important if the software will affect large numbers of people (e.g. changing a login shell).
Changes to heavily used pieces of software require that the entire user community be notified. For widely used software, post notification to msgs. Otherwise, send mail to the user(s) of the software.
After the software is installed it should always be tested again. If testing is difficult due to a lack of understanding of the software (e.g. specialized computer languages), the person who requested the software installation should be solicited for assistance.
After the software is installed, another message should be posted specifying the changes made, and the extent of the testing.
Either the person who made the changes, or someone familiar with them, should be on hand during the next day to assist with any problems that may arise.
The old versions of the software should be kept in case problems with the new version arise.
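One simple way to keep the old version around is to copy it aside at install time. The Bourne shell sketch below does that; the function name `install_keep_old' and the `.old' suffix are assumptions made for illustration, not an established local convention.

```sh
# Install a new version of a program, keeping the previous one as
# <target>.old so it can be restored if problems arise.
# The ".old" suffix is an assumption, not a documented local convention.
install_keep_old() {
    new=$1
    target=$2
    if [ -f "$target" ]; then
        # Preserve mode and timestamps of the version being replaced.
        cp -p "$target" "$target.old" || return 1
    fi
    cp -p "$new" "$target"
}
```

Backing out is then a matter of copying `target.old' back over `target'. Note that only one previous version is kept; a second upgrade overwrites the saved copy.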
Any time any modification is made to a machine, or any significant event occurs (e.g. a system crash), a record should be kept of it.
After much consideration, it seems as though the most difficult part of getting people to keep records is making the system convenient. The following system is the simplest I have been able to work out.
A set of mailboxes, located inside the Systems Office, is dedicated to specific system information. Any information (logs, notifications, packing slips, etc.) about a machine should be put in the appropriate box. Make sure that the information is dated and specifies who wrote it.
The following sections specify the policies for the various types of system downtime: Planned, Unplanned, and Emergency.
Any system downtime should be announced well in advance of the planned date. The amount of lead time should be proportional to the amount of (projected) downtime and to the importance of the machine. Any general use machine should get at least one week's notice.
Downtime announcements should specify the following:
If you receive mail from a faculty member who needs the machine during this time, make the utmost effort to reschedule or arrange alternate resources for them.
Unused or single-user machines (diskless workstations, workstations which provide service to one person only) may be shut down without notification provided the machine is currently unused (and probably won't be soon), or the user of the machine agrees to the downtime.
Two time slots are reserved for times when any system may be shut down with very short notice. These time slots are
The Tutors (x4023) should always be informed of any crashes, and given a time estimate of recovery. Since they are the first contact for the user population, this is important.
Also, there will be white boards in the terminal room and on the door to the Systems Office for systems messages (i.e. `Eecs down, hardware problems, running diagnostics, back up by 12:30 (hopefully)'). If necessary, another one may be located outside the CS office.
There are situations when a machine may need to be rebooted either due to software failures or the entire OS locking up.
When a machine is in a state where it is no longer functioning properly for a majority of its users, and rebooting it is the only option, it should be rebooted. For example, if all the nfs daemons are dead on a diskless client server, which means none of its diskless nodes or YP clients will work, it is time to reboot.
If at all possible, attempt to perform a `shutdown' in an emergency situation; 5-10 minutes should be enough lead time. Make sure to mention (via `shutdown') that this is an emergency reboot.
If the entire OS is locked up (and you have double-checked) do not hesitate to crash the system and reboot.
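For the case where an orderly emergency shutdown is still possible, the Bourne shell sketch below assembles a suitable command line. It assumes a BSD-style shutdown(8) that accepts `-r +minutes message'; check the local man page before relying on that. The function name and the default reason text are made up.

```sh
# Build a BSD-style shutdown command for an emergency reboot.
# $1 = minutes of warning (5-10 is suggested above), $2 = short reason.
# Assumption: shutdown(8) accepts "-r +minutes message".
# The command is printed rather than run, so it can be reviewed first.
emergency_reboot_cmd() {
    mins=${1:-5}
    reason=${2:-"cause under investigation"}
    printf 'shutdown -r +%s "EMERGENCY reboot: %s"\n' "$mins" "$reason"
}
```

For example, `emergency_reboot_cmd 10 'nfs daemons dead'` prints a command that gives users ten minutes of warning and states that this is an emergency reboot; the printed line can then be run as root.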
Certain routine maintenance must be performed on the various computer hardware in the department.
As of yet, we have not determined all the hardware needing service and the frequency of such service.
Since we are on the Internet, it is vitally important to maintain security on all of our systems. There are crackers constantly wandering around looking for new playgrounds or bases of operations for themselves. This means that if we are broken into, we could be a launching point for attacks on other sites.
This section details the security plan for the machines managed by Sysgroup. For any system to be considered for addition to the trusted host cluster (i.e. be in `/etc/hosts.equiv'), it must pass all of the following points.
This section details what should be done when an account is left logged in(4). It is vitally important to follow these procedures because we all make mistakes and we hope that the person who comes upon our account is as nice as we were when we found someone else's account.
When confronted with an account that has been left logged in, one usually feels an incredible rush of power. You can now do anything you want to that account. You can tinker with the account's files. You can send all sorts of mail to other people from it. You can probably even crack into other systems from it. And, best of all, you can probably get away with it, too, if you're careful.
Or you may feel like teaching this person a lesson that will not be forgotten in a hurry. You can tinker with the settings, change their window setups, come up with what appears (at that time) to be cute changes to their tty driver, aliases, shell scripts, etc. In effect, humiliate the user so that they don't ever, ever leave themselves logged in again.
Before doing anything, consider the following:
We are in the business of running a computer center. We are trying to create a documented list of rules and regulations so that everyone knows what is expected of them and what happens if they break those rules. Any punitive action will be undertaken by the Systems Manager. Individual Sysgroup members may prevent someone from breaking a rule but may not render justice on the spot.
When faced with this situation, the following is the appropriate action to take:
If a user keeps doing this, the Systems Manager should be made aware of it so that the user may be dealt with personally. This is a chance to see if this is genuine ignorance on the user's part (which can be rectified) or sheer obstinacy. In the latter case, after repeated chats, their accounts should be de-activated and they will be referred to the Systems Committee for further action.
If the account in question belongs to a Faculty member, they should be notified, as usual, after which the Systems Manager should have a chat with the person. Any problems after this will be sent directly to the Systems Committee with no other action.
If the account in question belongs to a member of the Systems Staff, they will be dealt with much like Faculty, except that all action will take place internally. It is possible for a person to be dismissed from the Systems Staff for persisting.
Note that policies for Systems Offices are laid down elsewhere; see section Office Policies, for more info.
People's offices and desks are their own domain and they can do whatever they feel fit with them. Tampering with a person's login session in this situation will be considered equivalent to rifling through their desk. If you have a key to someone's office, it is assumed that you can treat the privilege responsibly and ethically.
The paragraph above is very important. There are many systems people who have keys to these offices and they are all considered trustworthy. If they prove themselves not to be, they will lose both their keys and their job.
Some people may give others permission to use their workstations. In this case, the rules for logging out are governed by that personal agreement. In the absence of such an agreement, no one touches a private workstation, whatever the reason. Allowing people to use a window to check on the load average or status of a system is a convenience that has to be personally authorized by the owner of a workstation. It is not an implied right.
The Backup Schedule is as follows:
All backups are recorded and logged for future reference. Vitally important backups should be stored off-site.
File restorations are done in the following order:
The following are some guidelines for using any of the exabyte tape drives:
This chapter details the policies for the installation, storage, and organization of software within the CS department. Some other relevant information can be found in section Software Change Policy.
This section details the guidelines for installing software on CS systems. Before installing any software be familiar with, and be prepared to follow, our software change policy; see section Software Change Policy, for more info.
There are two types of software on a computer system: the vendor-supplied OS and systems software, and locally installed software.
The former should be located in the directories `/bin', `/usr/bin', `/usr/ucb', and possibly others. All of these directories must be in every user's path. Efforts will be made to make this set of software consistent between various machines, by augmenting or replacing programs.
The latter is what this section concerns itself with.
In order to maintain software in a controlled way, software will be categorized according to the amount of effort Sysgroup will put towards its installation and maintenance. The three categories are described below, both in terms of what kind of software is covered, and in terms of what support can be expected.
Software in this category must be general purpose, receive widespread use, be stable, and be capable of being made available on all supported architectures in a uniform manner. This software should also not be a redundant service. The terms used above are defined, for our purposes, as follows:
Software in this category will be upgraded periodically, and when necessary for normal operation (see section Software Change Policy). When software in this category fails, every effort will be made to diagnose the problem and either work with the maintainers of the software to fix it, or fix it ourselves and supply the fix to the maintainers.
This is software which is supplied by some commercial entity. It will, in general, be treated similarly to supported software; however, we do not have source and are thus at the mercy of the supplier's support organization. The reason for this distinction is to make it clear to the users that if the software breaks we may be unable to fix it and will have to wait for the supplier to fix it.
This is any software which does not meet the preceding criteria. This may also be software which is being evaluated for possible future supported classification.
Little effort will be given to such software, unless a Sysgroup member has some spare time to spend maintaining it. If the software fails, Sysgroup will not be obligated to do anything. Volunteers who wish to work on the software will be given necessary assistance (i.e. manuals, assistance in installing it, &c).
This section describes how and where local software should be installed.
Any local software should be installed under the `/usr/local' hierarchy. Small software packages (as measured by the number of binaries) which are supported will be installed in `/usr/local/bin', with their libraries in `/usr/local/lib' and man pages in `/usr/local/man'.
Unsupported software will be installed in a separate hierarchy (i.e. with its own `bin', `lib' and `man') so that its support status is obvious. This hierarchy will be located in `/usr/local/uns'.
In the case of X software which uses imake, rather than using xmkmf (which will try to put everything with the standard X distribution) you should use the following command:
imake -DUseInstalled -I/usr/local/X11R5/lib/X11/config-uns
This is essentially what xmkmf does, with `-uns' attached.
Large software packages will be put in a separate hierarchy of their own. There are several benefits to this scheme.
These hierarchies will each contain `bin', `lib' and `man' directories (with the latter being an entire man hierarchy).
Note that in the case of large packages, there is no distinction between support levels made via location of said package. This distinction will have to be made elsewhere.
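The placement rules in this section can be summarised as a small decision function. The following is only a sketch: the function name `install_prefix' is hypothetical, and the location chosen for large packages (`/usr/local/<package>') is an assumption, since this handbook says only that they get a hierarchy of their own.

```python
def install_prefix(package: str, supported: bool, large: bool) -> str:
    """Return the directory that should hold a local package's bin, lib and man.

    This mirrors the prose rules only; it is not an existing Sysgroup tool.
    """
    if large:
        # ASSUMPTION: the handbook does not say where a large package's own
        # hierarchy lives; /usr/local/<package> is a guess for illustration.
        # Support level does not affect the location of large packages.
        return f"/usr/local/{package}"
    if supported:
        # Small supported packages share /usr/local/{bin,lib,man}.
        return "/usr/local"
    # Small unsupported packages go under the separate `uns' hierarchy.
    return "/usr/local/uns"
```

For example, a small unsupported tool would have its binaries placed in `install_prefix(name, False, False) + "/bin"', which is `/usr/local/uns/bin'.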
As a result of these packages, users will have to have an easy means for customizing their $PATH variable. All of the supported packages will be in everybody's default path. Unsupported programs can be easily added by setting a variable before the system default `.cshrc' is sourced.
Any software installed on the CS Systems in one of the standard $PATH directories must have a man page installed in the appropriate `man' hierarchy.
Any other documentation (i.e. papers, tutorials, etc.) should be stored in the Systems Office. If appropriate, the tutors should also be given a copy of it.
See section Software Support Policy, section Software Change Policy, section Source Archive and section Software Organization for more info.
Unsupported software may be installed by anyone in the `uns' group. In order to become a member of this group you must sign an agreement form. Check with the Systems Manager for details.
The sources for programs should be kept in `/usr/local/uns/src'.
Everything should be installed under `/usr/local/uns/bin', i.e. no packages. See section Software Organization for more info.
Any software installed on CS Systems should have its source code available to all users (unless restricted by copyrights) under a uniform source tree. All sources should appear under the /src directory, whether physically in that filesystem or linked there via NFS.
Any programs in the source archive should be in a directory with a name which consists of the package name, a hyphen, and the version number.
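The naming rule above is easy to apply mechanically. The sketch below builds and splits such directory names; the function names are hypothetical, and it assumes that version numbers never contain a hyphen (package names may), so the name is split at the last hyphen.

```python
def archive_dir(package: str, version: str) -> str:
    """Build the source archive path for a package: /src/<package>-<version>."""
    return f"/src/{package}-{version}"


def split_archive_dir(dirname: str) -> tuple:
    """Split '<package>-<version>' into (package, version).

    ASSUMPTION: the version contains no hyphen, so the LAST hyphen is the
    separator. Raises ValueError if either part is missing.
    """
    package, hyphen, version = dirname.rpartition("-")
    if not hyphen or not package or not version:
        raise ValueError(f"not of the form package-version: {dirname!r}")
    return package, version
```

With a made-up package, `archive_dir("foo", "1.2")' gives `/src/foo-1.2', and `split_archive_dir("foo-bar-1.2")' gives `("foo-bar", "1.2")'.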
If any changes are made within the source archive, src-godz@cs.pdx.edu should be notified.
Every machine in the CS department should have certain software installed to prevent confusion to the user population and us. Some software may not run on certain machines; this is acceptable, and a normal part of managing a heterogeneous network.
Some such software is:
The account system currently in use was written for the Engineering Computer Network at Purdue University by the following people:
David A. Curry - SRI International (while at Purdue University)
Samual D. Kimery - Purdue University
Kent C. De La Croix - Purdue University
Jeffery R Schwab - Purdue University
Changes were made locally by:
John Jendro
Dennis Gilbert
Gary Moyer
The current version of acmaint is 2.0.
acmaint 3.0 is being written by the folks at Purdue at the moment.
It will include:
The use of tcl
The restructuring of the database records
The option to use a real database program
The use of TCP instead of UDP in some parts of the account system
It is hoped that printer quotas can also be put in the database.
There are two types of records in the account database. The first is the user record which contains general information about the user. It contains the following fields:
uid
login
sid
fullname
office
offphone
homephone
mailbox
grouplist
affiliationlist
uninterp
There is one account record for every machine on which the user has an account.
machine
login
gid
passwd
classif
courselist
authdept
authorizer
expdate
shell
homedir
uninterp
For each group there is a group record.
gid
gname
passwd
authorizer
uninterp
Each user also has a host record which is a list of all hosts a user has an account on, and a student id record which allows a user's login name to be determined by student id number.
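To make the relationship between these records concrete, the sketch below models them as plain data classes and derives the host and student id lookups from them. Only the field names come from the lists above; the field types, the empty defaults, the reading of `sid' as the student id, and the helper `build_indexes' are all assumptions made for illustration, not the acmaint storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UserRecord:
    """General information about a user; one per user."""
    uid: int
    login: str
    sid: str            # ASSUMPTION: taken to be the student id
    fullname: str
    office: str = ""
    offphone: str = ""
    homephone: str = ""
    mailbox: str = ""
    grouplist: List[str] = field(default_factory=list)
    affiliationlist: List[str] = field(default_factory=list)
    uninterp: str = ""


@dataclass
class AccountRecord:
    """One per machine on which the user has an account."""
    machine: str
    login: str
    gid: int
    passwd: str = ""
    classif: str = ""
    courselist: List[str] = field(default_factory=list)
    authdept: str = ""
    authorizer: str = ""
    expdate: str = ""
    shell: str = ""
    homedir: str = ""
    uninterp: str = ""


@dataclass
class GroupRecord:
    """One per group."""
    gid: int
    gname: str
    passwd: str = ""
    authorizer: str = ""
    uninterp: str = ""


def build_indexes(
    users: List[UserRecord], accounts: List[AccountRecord]
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Derive the host record (login -> hosts) and the student id
    record (sid -> login) from the user and account records."""
    hosts: Dict[str, List[str]] = {}
    for acct in accounts:
        hosts.setdefault(acct.login, []).append(acct.machine)
    login_by_sid = {user.sid: user.login for user in users}
    return hosts, login_by_sid
```

Given one user record and three account records for the same login, `build_indexes' returns a host list with three entries for that login and a single student id mapping.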
The account system is composed of the following daemons.
acmaint_dbd
acmaint_burstd
acmaint_wired
acmaint_transd
acmaint_dbd
The following is a list of the valid commands to acmaint_dbd
The following programs customize Purdue's original account system to work according to our local procedures. These were written at PSU.
addme
aack
Menu item 1 looks at all of the entries in the queue and allows the administrator to change the queue record into a database record, and make the account.
Menu item 12 allows the administrator to disable an account with a seeme shell.
Menu item 13 allows the administrator to re-enable an account.
account_maint
Finger daemon
Login Name: johnj
Name in real life: John Jendro
Groups: wheel operator sources sworkers tip cmc games
Mail address:
This Person has Accounts on: jove.cs.pdx.edu cs.pdx.edu walt-suncs.cs.pdx.edu
Hostname: jove.cs.pdx.edu
Group id: 5
Classif:
Auth Dept:
Authorizer:
Expiration: never
Shell: /bin/csh
Home Dir: /home/rigel/sys/johnj
Created By: loadpwfile
Extra Info:
Hostname: cs.pdx.edu
Group id: 5
Classif:
Auth Dept:
Authorizer:
Expiration: never
Shell: /bin/csh
Home Dir: /home/rigel/sys/johnj
Created By: loadpwfile
Extra Info:
Hostname: walt-suncs.cs.pdx.edu
Group id: 100
Classif:
Auth Dept:
Authorizer:
Expiration: never
Shell: /bin/csh
Home Dir: /u/johnj
Created By: loadpwfile
Extra Info:
Currently, due to limited resources in the department, we are not offering any guest access. The only exceptions are:
In the future, it would be nice to have a guest machine which would have its own phone lines, &c. With user fees, this system could be self-supporting. OSU has a system like this.
However, with the advent of metro area networks, there are an increasing number of public access internet sites available in the area. A handout listing these will be kept on hand in the CS office.
A modem is considered up when it is able to answer the phone, connect at the proper baud rate and allow the user to get the prompt from the terminal server (currently malach).
It is the intention of Sysgroup that modems be "up", as defined above, at all times. If more than one is not up it is considered a serious malfunction of the system.
The program modemchk will be used to determine whether a modem is answering properly. If either it or the System Administrator(s) cannot log in, the modem will be considered down.
Factors beyond our control (i.e. line noise, telco problems) may cause the modems to be inaccessible. While we cannot be held responsible for these problems, we will do our best to narrow down the cause and correct it (if possible). Also, we cannot be responsible for problems at the user's end, i.e. improperly configured terminal emulators, etc.
The remainder of this chapter details methods for working with the modem pool.
First there are some facts about the modem that need to be gathered.
NOTE: This will not help with Telebits. If you are having a problem with a Telebit, contact the Modem Manager.
jove% cisco malach 'show line 77'
 Tty Typ    Tx/Rx    A Modem  Roty AccO AccI  Uses   Noise
* 77 TTY  2400/2400  F callin    -    2    3  1342  134446
Location: "Dialup (Ext. 3146 )", Type: "dialup"
Length: 24 lines, Width: 80 columns
Baud rate (TX/RX) is 2400/2400, no parity, 1 stopbits, 8 databits
The escape character is "^^", followed by "x"
The local hold character is disabled
No flowcontrol in effect.
Status: PSI Enabled, Ready, Active, Rcvd CR, No Exit Banner
Capabilities: Notification Set, Autobaud Full Range, Modem Callin
Idle EXEC timeout is 2 minutes.
Idle session timeout is 120 minutes.
Modem answer timeout is 90 seconds
Dispatch timeout is 50 milliseconds
Disconnect character is not set
Activation character is ^M (13)
No output characters are padded
No special data dispatching characters
Look at the second line of the output: the value under `Tx/Rx' is the speed of the modem.
cmc
Cmc is used to change the configuration of a modem (or to reset the modem's parameters). The format of the command is:
cmc sysname port answer|noanswer modem_type
or
cmc -m phone answer|noanswer
sysname is the name of the cisco terminal server (`malach')
port is the port number of the modem, in decimal or octal (with a leading 0).
phone is the telephone number that the modem is on.
`answer' or `noanswer' indicates if the modem should answer incoming calls.
modem_type is the type of modem; currently `hayes', `pp', `vbis' and `tbit' are defined.
For example:
jove% cmc -m 54206 answer
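When cmc is driven from a script, it can help to validate the arguments before running it. The sketch below only assembles argument lists in the two documented forms and interprets the port the way the description above states (octal with a leading 0, decimal otherwise). The helper names are hypothetical, and nothing here reflects how cmc itself is implemented.

```python
MODEM_TYPES = ("hayes", "pp", "vbis", "tbit")  # the types listed above


def parse_port(text: str) -> int:
    """Interpret a port number: octal if it has a leading 0, else decimal."""
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)   # raises ValueError on digits 8 or 9
    return int(text, 10)


def cmc_args(sysname: str, port: str, answer: bool, modem_type: str) -> list:
    """Arguments for: cmc sysname port answer|noanswer modem_type"""
    if modem_type not in MODEM_TYPES:
        raise ValueError(f"unknown modem type: {modem_type!r}")
    parse_port(port)  # validate only; the port is passed through as typed
    return ["cmc", sysname, port, "answer" if answer else "noanswer", modem_type]


def cmc_args_by_phone(phone: str, answer: bool) -> list:
    """Arguments for: cmc -m phone answer|noanswer"""
    return ["cmc", "-m", phone, "answer" if answer else "noanswer"]
```

Under this reading, port `77' and port `0115' name the same line, since octal 115 is decimal 77.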
modemchk
Modemchk dials a list of phone numbers and verifies that a modem answers and a malach login prompt is present.
To run modemchk, type:
modemchk -a
Following is a sample of modemchk's output:
Modem Check-up run at: Tue Dec 3 12:50:19 1991
53145: LOGIN,IN USE
53146: LOGIN
54054: BUSY
54111: LOGIN
54112: NO LOGIN
       2nd Try: NO LOGIN
...
55407: NO CARRIER
55408: OUTGOING MODEM FAILED TO RESPOND
A total of 22 numbers were called.
18: calls connected to a login prompt.
1: calls connected but could not login.
1: calls failed with no carrier.
1: calls failed with a busy signal.
1: calls failed because the outgoing modem did not wake.
Completed at: Tue Dec 3 13:03:54 1991
Following is a more detailed explanation of the meaning of the messages above:
modemchk was able to get the `malach>' prompt.
cmc to reset the modem.
modemchk tried to call was busy.
Sometimes the outgoing modem gets hosed. To fix this problem, run the following command:
cmc -m 54210 answer
If the modem is still hosed, then cycle the power on the modem. If the modem continues to be hosed, contact the modem manager.
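The rule that more than one modem being down is a serious malfunction can be checked mechanically against a modemchk report. The following is a rough sketch under several assumptions: that the report has one `number: STATUS' result per line, that `NO LOGIN' and `NO CARRIER' are the statuses that mean a modem is down (`BUSY' and an outgoing modem failure say nothing certain about the modem that was called), and that the function names are free to choose. The sample text is an abridged excerpt of the output shown above.

```python
import re

RESULT_RE = re.compile(r"^(\d+):\s*(.+?)\s*$")

# ASSUMPTION: only these statuses mean the called modem itself is down.
DOWN_STATUSES = {"NO LOGIN", "NO CARRIER"}


def parse_modemchk(report: str) -> dict:
    """Map each phone number in a modemchk report to its first-try status."""
    results = {}
    for line in report.splitlines():
        if line.startswith("A total of"):
            break  # the summary counts follow; they are not per-modem results
        match = RESULT_RE.match(line)
        if match:  # header and "2nd Try" lines do not match and are skipped
            results[match.group(1)] = match.group(2)
    return results


def down_modems(results: dict) -> list:
    """Return the sorted phone numbers whose status counts as down."""
    return sorted(n for n, status in results.items() if status in DOWN_STATUSES)


def serious_malfunction(results: dict) -> bool:
    """True if more than one modem is down."""
    return len(down_modems(results)) > 1


SAMPLE = """\
Modem Check-up run at: Tue Dec 3 12:50:19 1991
53145: LOGIN,IN USE
53146: LOGIN
54054: BUSY
54112: NO LOGIN
  2nd Try: NO LOGIN
55407: NO CARRIER
55408: OUTGOING MODEM FAILED TO RESPOND
A total of 22 numbers were called.
18: calls connected to a login prompt.
"""
```

On this excerpt, `down_modems(parse_modemchk(SAMPLE))' lists 54112 and 55407, so `serious_malfunction' reports true.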
This section applies primarily to printers which have quotas installed (at the moment, only lw3). On any printer with quotas, copies must be either paid for or pre-authorized.
The policy is as follows:
The procedure for getting/buying copies on a quota'ed printer is as follows:
There are two programs for manipulating printer quotas: lwquot and lwaddquot. These programs are rudimentary, and will be replaced with a more elaborate system.
The lwquot command will display a user's current allowance on a printer. Notice that it prints the totals; subtract the two numbers to find out how many more copies the user has. If no username is given, `$USER' is used.
% lwquot glatz
75 pages allowed
15 pages used
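Since lwquot prints only the totals, the subtraction is a natural thing to script. The sketch below assumes the output contains the phrases `pages allowed' and `pages used', each preceded by a number, as in the example above; the function name is hypothetical, and users with unlimited access are not handled.

```python
import re


def pages_remaining(lwquot_output: str) -> int:
    """Return allowed minus used, parsed from lwquot's output text."""
    allowed = re.search(r"(\d+)\s+pages allowed", lwquot_output)
    used = re.search(r"(\d+)\s+pages used", lwquot_output)
    if allowed is None or used is None:
        raise ValueError("unrecognised lwquot output")
    return int(allowed.group(1)) - int(used.group(1))
```

For the example above, 75 allowed less 15 used leaves 60 pages.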
To update a quota, use the command lwaddquot. However, the only way to give unlimited access is direct editing of the `quota' file.
Note: this command must be run on the machine which has the printer physically attached.
% lwaddquot glatz 100
This section describes several of the important files in the quota system, and for the printers in general. Note that these files are only relevant on the machine which actually has the printer connected to it.
1.00 jove:glennc
tail -f on this file.
Here is an example of a successful job:
psbanner: jove:glennc Job: x.c Date: Mon May 20 22:31:17 1991
psif: jove:glennc lw3 start - Mon May 20 22:31:25 1991
psif: end - Mon May 20 22:32:06 1991
And, here is an example of a failed job:
psbanner: jove:glennc Job: stdin Date: Sun May 19 16:00:39 1991
psif: jove:glennc lw3 start - Sun May 19 16:00:44 1991
%%[ Error: undefined; OffendingCommand: we ]%%
%%[ Flushing: rest of job (to end-of-file) will be ignored ]%%
psif: end - Sun May 19 17:37:59 1991
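The difference between the two examples is the `%%[ Error: ... ]%%' line, which makes failed jobs easy to pick out of the log. The sketch below extracts those lines; it assumes every error is reported in exactly the form shown in the failed example, and the function name is hypothetical.

```python
import re

# Matches lines such as: %%[ Error: undefined; OffendingCommand: we ]%%
ERROR_RE = re.compile(
    r"%%\[ Error: (?P<error>[^;]+); OffendingCommand: (?P<command>.+?) \]%%"
)


def postscript_errors(log_text: str) -> list:
    """Return (error, offending command) pairs found in a printer log."""
    return [
        (match.group("error"), match.group("command"))
        for match in ERROR_RE.finditer(log_text)
    ]
```

An empty list means no job in the given text reported a PostScript error; the `Flushing' line that follows an error is deliberately not matched.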
In order to keep the mass of mail aliases understandable, the following standards should be followed.
All the aliases above should have forwarding aliases on all machines, i.e. `sysgroup: sysgroup@cs.pdx.edu'.
Some aliases should be host specific, i.e. `root'.
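Forwarding entries of this form are regular enough to generate rather than type. The following is a sketch only: the function names are hypothetical, and the set of host-specific aliases contains just `root' because that is the only one named here.

```python
HOST_SPECIFIC = {"root"}  # ASSUMPTION: only `root' is named in the text


def forwarding_alias(alias: str, domain: str = "cs.pdx.edu") -> str:
    """Return an alias line that forwards to the same name at the domain."""
    return f"{alias}: {alias}@{domain}"


def forwarding_aliases(aliases, domain: str = "cs.pdx.edu") -> list:
    """Forwarding lines for every alias that is not host specific."""
    return [
        forwarding_alias(alias, domain)
        for alias in aliases
        if alias not in HOST_SPECIFIC
    ]
```

Calling `forwarding_aliases(["sysgroup", "root"])' yields only the `sysgroup' line, leaving `root' to be defined on each host.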
Most of the departmental mailing lists are under the control of `majordomo'.
Where possible, all new mailing lists should be put under the control of `majordomo'.
Sending the string `help' to `majordomo@cs.pdx.edu' will give you information on using it. Some other docs are in `/usr/local/majordomo/Description'.
This chapter details some general policies regarding the various Internet-accessible information services which Sysgroup maintains. All of these information services should have their own alias for the machine to be used when publicizing the service (i.e. `gopher.cs.pdx.edu').
The Computer Science Department has one anonymous FTP site on `ftp.cs.pdx.edu'.
The incoming directory should not be readable to the world to prevent the directory from being used for illicit purposes (i.e. crackers).
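This requirement can be verified from the directory's permission bits. The sketch below checks the world-read bit using the standard `os' and `stat' modules; the function name is hypothetical, and the actual location of the incoming directory is not assumed.

```python
import os
import stat


def world_readable(path: str) -> bool:
    """True if the 'other' read bit is set on path.

    Without read permission for others, anonymous users cannot list the
    directory, even if they can still write files into it.
    """
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

A directory with mode 733 (write and search for everyone, read only for the owner) passes this check, while one with mode 755 does not; mode 733 is given only as an example, not as the mode in use on `ftp.cs.pdx.edu'.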
The directories in `pub' should each contain an archive with a single theme (i.e. `rfc', `gnu', &c.). Directories for individuals should be put in a single directory (i.e. `faculty' or `people').
The departmental gopher server is located on `gopher.cs.pdx.edu'.
While the use of explicit and detailed standards serves only to hinder creativity and spontaneity, some general guidelines should be laid out to make cooperation among Sysgroup members easier.
In general, all coding styles serve to delineate methods for ensuring that source code is:
Anyone doing programming as part of Sysgroup should be able to ensure that their code meets the aforementioned criteria, and determine methods for doing so. If you are doubtful about a style, ask.
With the use of tools such as indent, specific indentation styles are easily dealt with if they are difficult for the reader to comprehend.
Dictating what text formatter or word processor an individual will use is an overbearing policy. As such, this section will specify some general requirements of any formatting system.
Any system used must be available to other members of Sysgroup, and preferably on a UNIX platform.
Any system must be able to produce a reasonable looking laser printed output and a reasonably accurate ascii approximation. This latter item should require a minimum of by-hand changes. Related to the former, any system should have the ability to produce PostScript files, so that the files can be printed by others at a later date.
The documentation systems preferred by the Systems Manager are troff and texinfo.
Straight ascii files are quite appropriate when they will be changed frequently or be eliminated soon. In these cases the time investment in getting something formatted nicely may not be worth the effort.
This section lists the various books and papers which are relevant to systems work. All items listed here have been read by someone in Sysgroup.
The Cuckoo's Egg, and his preceding papers What Do You Feed a Trojan Horse?, and Stalking the Wily Hacker.
Internetworking with TCP/IP
UNIX Network Programming
Managing NFS and NIS
This is an alphabetical list of all the relevant files and programs discussed in this document.