Auspex FAQ

Auspex FAQ Administrative Information

About the Editor

This FAQ was edited by Darrell Root, based heavily on posts to the Princeton Auspex majordomo list.

Darrell Root (rootd@ee.pdx.edu) currently works at the NASA Ames Research Center as a combination Fileserver Engineer (Sun and Network Appliance boxes) and Workstation Engineer (SGI, Sun). Darrell previously worked at Intel Corporation, where he co-owned 13 Auspex fileservers. He specializes in performance analysis and metrics (alas, the code I wrote at Intel is Intel property, and I can't distribute it to this group).

Disclaimer

The stuff in this FAQ does not represent the opinions of my employer, Auspex, or any other organization in the history of the planet. Although I quote emails sent to the list by other admins, the age of those emails means the information may be dated, obsolete, incorrect, or irrelevant, or the original author may have changed their opinion. Don't blame them. This list is just so low on activity that I've had to go back years to get some information.

In short, it's all my fault (but don't sue me...use at your own risk...this FAQ has no warranty of any kind...etc...).

In particular, code snippets are vulnerable to the multiple cut-paste operations, texinfo formatting, and HTML formatting they went through on the way to the version of this FAQ you now see. Do _NOT_ simply type in code you don't understand and expect it to work; there have been too many cut-and-pastes between the original post and the FAQ.

Who Contributed to the FAQ?

Ruth Milner and Snoopy are major contributors to this FAQ. They are both heavy contributors to the Auspex list, and they've both coordinated Auspex BOF's at LISA.

People who also had posts to the Auspex list integrated into the FAQ: Paul Graham, Dale Carstensen, Lawrence Rogers, Frank Lemmon, Wojtek Sylwestrzak, Ted Schroeder, Tom Perrine, Mark Dadgar, John Sasso, John Kupec, Jennifer Joy, Peter Van Epp, Adam Fox, Wayne Folta, Fritz Raab, Tom Wike, Matt Black, Ian Batten, Paul Joslin, Richard Wong, and Elmar Kurgpold.

My apologies to people I've missed (my record keeping has not always been perfect). Feel free to email me at rootd@ee.pdx.edu and I'll correct the FAQ.

You all have my thanks. Without this list I'm sure I would have suffered an extra crash or two.

Auspex Information Resources

How do I subscribe to this auspex email list?

Send email to "auspex-request@princeton.edu" with the word "help" in the body of the message. This is standard listproc software. Old messages are archived.

Does auspex have an email support address?

Yes: support@auspex.com. Be warned that some of the emails I sent to support@auspex.com seemed to disappear. I asked to be subscribed to a mailing list; they subscribed me but never told me they had completed my request, so I spent time requesting to be added a second and third time before finding out.

Does Auspex let customers access the bug database?

Auspex used to support "bugs access online" at "boa@auspex.com". Unfortunately, that bug interface no longer seems to function (the help information works, but any attempt to get data results in a "sh: /usr/local/bin/bugcsat: not found" error). It is obviously no longer supported (the example in the help information shows you how to obtain a buglist for Auspex OS 2.7 :-)

How do I find out about new patch and OS releases?

Send email to support@auspex.com and ask to be put on the "early warning email list". That list is used for new patch and OS releases. In theory, it would also announce critical bugs, but I remember a bug in a 1.9 patch that caused the server to come up without any ax_nfsd's; the patch was left unfixed on the server for two days after I reported it, and no email to "Early Warning" was ever sent, even though the bug was in the prerelease version of the 1.9Z1 patch available on Auspex's ftp server.

Does Auspex have an email address for unsupported software?

Auspex used to have a "bootleg@auspex.com" email address, but that was years ago and the address no longer works.

What is the Auspex ftp site?

ftp.auspex.com. Anonymous ftp is not supported, but there is a "patch" id with a fairly public password. Customers with support can also get accounts for core uploading.

What is the Auspex web site?

http://www.auspex.com of course!!

Auspex Hardware

ARRGH! I accidentally hit the break key!

If you're fast enough, you can use the "co" command to continue the system if it gets to the "HP>" prompt. If you wait too long (~3 minutes), an M16 timeout will occur and you'll be forced to reboot the system.

What is an M16 timeout?

Each board in the Auspex has one or two processors, each running its own operating system. They constantly send "M16 pings" to each other to signal that they are still functioning. If a board doesn't respond to an M16 ping within several minutes, the server will attempt to fix the problem by rebooting.

How do I disable the break key on a Link terminal?

Ted Schroeder wrote:

The terminal setup on the Link terminal actually allows you to disable the break key. Fortunately, when you need the BREAK key, you can enable it again and send the BREAK. All you need to do is leave some instructions around the console for the lucky devil who covers for you when you're out 8^)

Here's the procedure for disabling the BREAK key on a LINK terminal:

  1. Press Shift-Setup to get to the Setup screen.
  2. Press the F4 key to get to the Keyboard screen.
  3. Press the Down Arrow key seven times to move to the 'Break' item. This should be set to "170ms" by default.
  4. Press the Left Arrow key once to change this selection to "off".
  5. Press the F9 key to quit out of Setup.

Here's the procedure for enabling the BREAK key on a LINK terminal:

  1. Press Shift-Setup to get to the Setup screen.
  2. Press the F4 key to get to the Keyboard screen.
  3. Press the Down Arrow key seven times to move to the 'Break' item. This should be set to "off" if BREAK is disabled.
  4. Press the Right Arrow key once to change this selection to "170ms".
  5. Press the F9 key to quit out of Setup.

Can we buy disks from vendors other than Auspex?

For full-height drives, if you have an old sled (perhaps from a 1-2GB drive), you can probably successfully qualify some 5.25" drives. The old 9GB Seagate Elite-9 drive worked well (but it is now out of production). MTI used to sell part number NPA310800S (Seagate Elite-9) and MP3AUS (Auspex cable kits). Getting the power cable right is tough (we burned one drive out while working with MTI to specify the MP3AUS package).

10% of our Elite-9 disks were DOA, and one more died during use. MTI will replace them for five years. We also had two fileserver crashes while using ax_add_device to add the disks, although most of those were due to SCSI reset errors that the newer patchlevels of the OS can handle.

Note that the 1.0GB full-height sled and the 1.3GB full-height sled have reversed power-cable configurations. We typically use our 1.3GB sleds, but not our 1.0GB sleds, with the Elite-9 disks.

Make sure to set the SCSI ID to 0 or 1, and you'll need to set the "spinup scsi drive on command" jumper (by default that's off on most drives). Setting those jumpers will probably let any drive work in the old full-height cabinets.

With the half-height drives, Auspex lists the price at $400, but is very reluctant to sell them for obvious reasons. If you use RAID, I'd suggest using Auspex drives only (I have nightmares of losing a whole raidset).

Note that external-vendor disks may cause Auspex support to refuse to service your machine until you take them all out. Large Auspex customers will usually have enough clout to get around this, but if you're a 1-Auspex site then this is a serious consideration.

The third party vendors (below) may be able to sell you some empty full-height disk sleds. I wonder if they have any half-heights?

Do any third-party vendors resell Auspex equipment?

Union Computer Exchange
6233 Idylwood Lane
Edina, Minnesota 55436
612-935-7282
612-935-5056 (FAX)

David Bransky
Service Resourcing International, Inc.
4958 Corliss Road
Lyndhurst, Ohio 44124 USA
Tel: (216) 382-1400
Fax: (216) 382-2129
E-mail: dbransky@mail.multiverse.com

Where can I purchase a UPS for my Auspex?

General UPS advice

Get at least 2 bids for your UPS system. Compare how much power each vendor thinks you'll use. The less expensive vendor may be underestimating your power consumption.

Make sure you include maintenance prices in your quotes (the maintenance prices can be pretty awful).

Homegrown tty-based monitoring scripts are usually pretty flaky. Make sure the vendor supplies one, or that their UPS can be monitored with SNMP (and that they give you an SNMP client to check their UPS's SNMP agent).

Make sure there's some way for you to test/estimate how long your UPS will last given a power outage. Battery capacity and power load would be real nice numbers to have.

If you plan on doing periodic UPS tests, try to do one test before your production equipment goes on the UPS. You want to have the procedure down before you crash your servers figuring out the procedure.

Best Power Technology

Several people are happy with Best Power Technology.

Best Power Technology, Inc.
P.O. Box 280
Necedah, Wisconsin 54646 USA
Phone: 608-565-7200
Sales (toll free, outside WI): 1-800-356-5794

Some models have a serial port and a "shell" that has been successfully hooked up to a tty port and polled every minute with a perl script.

Another sysadmin was happy with the "power filtering" characteristics of a Best FERRUPS QFD 18kVA/X. They felt that their network equipment didn't "burp" as often with it (they were in an unreliable power area).

Liebert

Manuel Morales and Tom Limoncelli were both happy with their Liebert systems. You can monitor them with a tty or SNMP.

Deltec

Ruth Milner was happy with her Deltec system. She liked the 10-year pro-rated warranty on the batteries. She also liked the "estimated battery time remaining" figure she was able to get through the RS-232/tty interface during an outage. This was in an unreliable power area, so the Deltec was very well tested in outages.

Backups

What software is available to backup your Auspex?

Home Grown Scripts

As of the Nov-93 BOF, homegrown scripts were the most common backup method. This is the easiest and cheapest approach to set up, but reinventing the wheel is always prone to bugs. One common problem is the script not reporting stderr messages to the sysadmin. As with any backup solution, random restore checks are essential here.

Budtool

PDC produces Budtool, a high-end backup solution. Depending on what components you buy, you get a file-history-database, jukebox support, barcode-inventory support, dd-based-fast-backups (budturbo), and live-stable backups (budtool-live). PDC is also developing a native AFS backup solution.

PDC's support is moderate (it's hard to staff helpdesks with good people). Budtool has a long history of bugs. The PDC budtool classes don't have instructors with much barcode-inventory-system experience (our operators essentially ended up teaching that part of the class).

The file-history database can get very large. If it fills up the disk it will become corrupt. Rotate your database often, and set aside a 9GB disk for the database on a 500GB server.

Epoch

Two people commented on Epoch. One was happy. The other was ready to throw it in the trash.

Legato Networker

Two people commented on Networker. One complained that the client-server nature of the product taxed the Auspex's small host processor too much. The other complained of poor support (particularly related to bug disclosure).

Amanda

Snoopy (an Auspex guru) was very happy with the public-domain Amanda product.

Auspex's backup solution

Auspex sells a backup solution, including direct access from the SP to your tape drives (it'll cost you some disk slots, but give you great dd-based backup performance). I've never investigated, and would be happy to hear comments.

How come my backups are so slow?

Host Processor problems

Let's face it: even the HP8 is not the fastest CPU around, and dump is very CPU- and memory-intensive. Use vmstat to see if you're thrashing. Use a dd-based backup to reduce the CPU load if possible. Some people back up their systems from the Auspex's NFS clients.
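
A quick sketch of the check (column names per standard SunOS vmstat; the interpretation is a rule of thumb, not Auspex-specific):

    vmstat 5
    # watch the "po" (page-out) and "sr" (scan rate) columns:
    # sustained nonzero values while dump runs mean the HP is short on memory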

HP SCSI Bus

The SCSI interface on the HP7 (and probably the HP8) is about the same as an old SPARCstation-2 interface: narrow, not fast, 5MB/second. Using a dd-based backup, you can pump 3MB/second into a DLT4000. Hang four DLT4000's off your host-processor SCSI interface and you'll have 12MB/second of tape-drive capacity on a 5MB/second bus. It doesn't take a rocket scientist to figure out where the bottleneck is.

Put a SCSI card into one of the sbus slots on your HP. The new card will give you 10MB/second of capacity, for 15MB/second total. That's usually sufficient. You can always put a second SCSI card into another sbus slot.
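
A minimal sketch of a dd-based backup from a raw partition to tape (device names are hypothetical; pick a block size large enough to keep the drive streaming):

    # back up the raw partition to the no-rewind tape device
    dd if=/dev/rax0g of=/dev/nrst0 bs=512k
    # the restore is simply the reverse
    dd if=/dev/nrst0 of=/dev/rax0g bs=512k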

We use Budtool's Budturbo product, and after we restore we get a "bad superblock" error when trying to fsck.

Sounds like APR450. This is a bug where a clean filesystem gets a corrupt superblock under normal operations. There was a confirmed case of a filesystem fsck'ing clean, being up for a week with no unusual incidents (no disk errors, no full filesystems, no inode shortages) and then having a dirty superblock.

Normally, a dd on this filesystem would result in a new filesystem which also has a dirty superblock, but can be easily fsck'd.

Unfortunately, with budturbo, the backup|restore results in a filesystem which cannot be fsck'd at all.

We implemented a weekly fsck rotation on all our Auspex filesystems (we'd put an entry in /etc/cgfscklog, and then run ax_isolated -f). That minimized the problem enough that we were no longer getting failed restores from this cause.

I theorize that the data on the restored partition could possibly be recovered by writing a program that edits the raw disk device. This would require studying a disk with the dirty superblock and the restored "bad" end product of that filesystem put through budturbo. If someone is ready to lose their job over a failed restore, and wants to pay me a bunch of money to give it a shot, I'll be happy to try.

Backup Hardware

Exabyte 8mm

The Exabyte 10i carousel robots were pretty good, but the Exabyte 8mm tape drives suffer from chronic media errors.

DLT4000

These are excellent in terms of media errors. The Compac-III tapes can handle 20GB uncompressed. With compression they get into the 30GB range (depending on data type). Using dd, I was able to pump 3MB/sec into these drives (on a 10MB/sec SCSI bus, without contention).

The Compac-IV tapes handle 30GB/tape uncompressed, but are very expensive. The price has begun to drop now that there is finally more than one vendor.

Getting the st_conf.c configuration parameters right is always a pain. No matter how many times I ask the vendor what the best answer is, it's always wrong (we finally worked it out, but I left the parameters at my last job, sigh).

ATL 4/52 jukeboxes

These are nice reliable jukeboxes. Make sure you get one with the glass doors so you can see what's going on inside. Check out the SCSI configuration (if you get a differential ATL4/52, you'll either need a differential sbus card for your Auspex, or you'll need a differential<->single-ended converter).

Differential<->single-ended converters are bad news because if you accidentally bump the power cord, the next time you do anything on that scsi bus your host processor will crash.

We also suffered a crash while a maintenance guy was upgrading the firmware on the 4/52 at the same time one of our operators queried the tape drives.

Dataguard is a real good product to have when you're configuring devices on your host processor scsi interface.

Other Products

Andataco and IBM both sell "striped parallel" tape units, which get great transfer rates, but make me nervous because of the tape logistical problems.

Don't forget Auspex's solution. If you've worked with it let me know.

How fast should your backups be?

dd on dlt's

The host-processor SCSI port is a 5MB/sec narrow non-fast (I can never remember whether that's synchronous or asynchronous:-) SCSI bus.

A DLT4000 using dd can take about 3MB/sec.

Dirk Duerinck did some tests:

We just installed a DLT-drive in our AUSPEX. It's hooked up to the HP.

I did some tests with a 4GB file-system, and got these results:

- dd from disk to tape: between 37 and 62 minutes (unloaded versus loaded system)
- dd from tape to disk: 61 minutes
- dd from tape to /dev/null: 32 minutes
- ax_sputil copy from disk to disk: 29 minutes
- dd from disk to disk: 49 minutes
- dump from disk to tape: 58 minutes
- restore from tape to disk: 117 minutes

To get the best restore time, dd seems to be the best tool.

Try ddd

Peter Van Epp wrote:

While it has been some years (and a DLT2000, not a 4000, which is what I expect this is) since I played with dd on the Auspex, you probably want to give ddd a try. It spawns two tasks, one to read the disk/tape and another (connected via a pipe to the first) to write the tape/disk. This overlaps the I/O and may increase performance (assuming I/O bandwidth isn't what is currently limiting you). It is freeware; I expect an archie or web search will find it on an archive site somewhere.
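
If you can't find ddd, you can approximate the same overlap with two dd's in a pipeline, since the pipe lets the reader and writer run concurrently (device names are hypothetical):

    dd if=/dev/rax0g bs=512k | dd of=/dev/nrst0 bs=512k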

Human issues in backup engineering

You can't be a fileserver engineer without backups. You must conduct random backup validity checks. You must test your ability to restore an entire filesystem.

Treating backups like a "quality assurance" exercise is a good idea. Keep metrics on the number of random restore tests performed each week, and the number of failures.

Finally, make sure your customers know what your backup commitments are. Make a list of everything you back up. Advertise that list to your customers (even if that means other support people). Make it clear that if their personal workstation is not on the list, it's not being backed up. Then use the list as input to your random restore rotation (I remember a case where a Novell server was being backed up, but was not in the random restore rotation...guess what, all the backups were invalid).

Performance

How does the new 1.9 OS do for performance?

Awesome. The 1.9 OS is essentially a complete rewrite of much of the NFS code. The more experienced programmers improved many of the pathways. The number of ax_nfsd's was also increased. I'd estimate a 25% improvement on identical hardware. Denials of service due to extremely hot disks causing nfsd unavailability were greatly reduced as well--even under extreme nfs activity.

Note that this experience is mostly with AIX clients running NFSv2.

How does the new RAID feature do performance-wise?

Very good. A RAID-stripe is superior to a 2-way stripe for writes. I don't know how it compares to a 3-way stripe. It does, however, use a large amount of storage processor CPU. I haven't seen a server with 100% RAID being pounded on heavily, and I'm not sure whether the SP5 can survive that.

All my nfsd's are busy. Help!

If one filesystem is pounded on from hundreds of clients, it can cause that disk to go to 100% utilization. All the ax_nfsd's will submit IO requests to that disk, and then wait for it to be serviced. This can cause a denial-of-service problem to your entire server. You can see this if you have 40+ nfs jobs queued on your server.

ax_fsutil now has a "throttle" subcommand that can dynamically limit the number of nfs daemons that can service a particular filesystem. Use ax_perfmon to identify the busy filesystem, and then use "ax_fsutil throttle" to limit the number of daemons to around 2. If your jobs-queued figure drops below 10, then life is better (at least for most of your server; that specific filesystem is still hosed). Then slowly increase the throttle until the NFS jobs queued start to increase (usually around 8).
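
A sketch of the sequence (the exact throttle arguments are an assumption; check ax_fsutil's man page for your OS release):

    ax_perfmon                     # identify the 100%-busy filesystem
    ax_fsutil throttle /home2 2    # hypothetical: clamp /home2 to ~2 daemons
    # watch "NFS jobs queued"; once it drops below 10, raise the limit slowly
    ax_fsutil throttle /home2 8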

Now you have time to change the usage pattern, or migrate the filesystem to a striped configuration.

How can I run ax_perfmon with a filter file without having it suspend on tty input?

Redirect standard input and standard output to /dev/null.
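
Something like this (the -f filter-file flag is an assumption; substitute whatever option your ax_perfmon release uses):

    ax_perfmon -f my.filter < /dev/null > /dev/null &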

What are the consequences of setting "minfree" less than 10%?

The BSD filesystem tries to lay out a file's blocks in a particular cylinder group so the file can be read without long disk seeks. If a filesystem is extremely full, it is no longer possible to keep a file in adjacent cylinder groups, and the file will be fragmented across the disk. The 10% minfree is designed to prevent this.

The 10% figure, however, dates from the days of <100MB filesystems. In these days of >1GB filesystems, 10% is way too much. I'd suggest something in the 4% range for any filesystem larger than 2GB.
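
On a BSD-derived OS like the Auspex's, tunefs can change minfree on an existing filesystem (a sketch; the device name is hypothetical, and the filesystem should be idle or unmounted):

    tunefs -m 4 /dev/rax0g    # lower minfree from 10% to 4%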

Can you access ax_perfmon data through snmpd?

According to Paul Joslin, Auspex sells an enhanced snmpd daemon which includes ax_perfmon information.

What benchmark tools are available?

LADDIS

LADDIS is an old NFSv2 benchmark. Results for many system configurations can be found at http://www.spec.org in the "sfs" area. It's designed to measure the total NFS capacity of a machine. If your company is a member of SPEC, you're entitled to a free copy. Otherwise LADDIS costs about $1200.

LADDIS includes multi-filesystem multi-client support and job control.

LADDIS is not the easiest software to use. It essentially requires that every machine trust every other machine in /.rhosts and /etc/hosts.equiv. Trying to use it on production workstations is a big hassle.

The other problem with LADDIS is that it's NFSv2-specific. It directly generates NFS packets, rather than going through each workstation's operating system, so it's hard to compile on some platforms.

Note that most vendors (including Auspex) don't report LADDIS results at www.spec.org with RAID enabled. Yes, Sun can achieve ungodly LADDIS numbers with 361 non-RAID disks on a 24-cpu enterprise 6000. No, I wouldn't configure my system that way.

nhfsstone

This was the predecessor to LADDIS. It's OK for running one client against one server on one filesystem. One advantage is that it's free. But if you want to use nhfsstone on multiple clients, you'll need to hack up your own job-control scripts (and I'll bet that will take more than $1200 worth of effort).

nhfsstone is SunOS- and NFSv2-specific.

Bonnie

Bonnie is a benchmark designed for looking at disk throughput. I've never actually looked at it.

PostMark

This is a brand new filesystem benchmark program written by Network Appliance. It's designed to show performance for smaller files (typical of a usenet server or mail server). It is available at http://www.netapp.com/technology/level3/3022.html.

Third Party Tools

General

Third party tools are often complex to setup, difficult to use, but provide very useful data when you have problems. Make certain that you're proficient in their use _before_ you run into trouble.

Many of the manuals are of the "for idiots" variety. They tell you where to click the mouse for 500+ pages, but don't have a summary "for people with brains".

Some of the tools monitor the net, but that is problematic in a switched 100bt network (it's tough to monitor the whole net). Options include having your switch mirror as much traffic as possible out one port, or putting your monitoring machine on the same port as your fileserver.

AIM Sharpshooter

AIM Sharpshooter is a non-SNMP system management package that seems pretty good at providing useful information. It has an Ingres database and can print reports on NFS ops and other system information (such as disk utilization/aging). It also functions as an alarm console to alert you to client NFS response-time problems. The drawbacks: the Auspex doesn't have a NIT interface, so some Sun NFS server features are not available, and Sharpshooter requires one proxy Sun machine per network.

Concord Trakker

Concord Trakker focuses on NFS monitoring/management, but does not show information above the transmission layer and does not model the NFS client/server relationship as Sharpshooter does.

Concord pods must be on each subnet to collect data.

HP Openview

Generally pretty useful, but some items could be better. To run it on the Auspex, you had to replace the snmpd daemon.

HP Netmetrix

Useful for monitoring load on subnets and general packet catching. Not useful on the Auspex itself. This is one of the worst "point-and-click-here" manual offenders.

Spectrum

One user hated Spectrum ("ready to throw it out"). Another thought it was pretty good ("Superior to sunnet mangler and HP openview").

SunNet Manager

Two users hated it. They thought ping/netstat/vmstat were more useful and SunNet required a huge investment to get only marginal returns.

One person liked it for Sun-intensive networks. They integrated it with Cabletron Lanview and were happy with the reports and graphics.

Networking

I keep on seeing "Transmit retried more than 16 times - net jammed?" errors. What's wrong?

Possible causes:

  1. You just have a very busy network.
  2. A transceiver's SQE switch could be "on".
  3. The Lance chipset reports this all the time on all Suns, but most Suns don't tell you that it's happening. This is simply a net with tons of collisions.

Why am I getting horrid performance through my full-duplex 100bt interfaces?

Running 1.9 I had to turn off full-duplex 100bt because of performance problems. I'm not sure if it was due to auto-negotiation, or a real problem at full-duplex. Perhaps the situation is better at the new patch levels.

Doug Austin at Auspex reports that the problem has since been fixed. In your ifconfig command, you need to set the speed before the duplex mode. For example, use this:

    ifconfig ahme0 Speed100 fullduplex ...

but not this:

    ifconfig ahme0 fullduplex Speed100 ...
Also remember that the autonegotiation protocol for 10/100bt interfaces was not designed to deal with duplex mode. If one side is manually set to 100bt full-duplex, and the other side autonegotiates, then the autonegotiating side will incorrectly set 100bt half-duplex. This will result in horrid performance. Keep in mind that this problem is with the design of the 10/100bt autonegotiation protocol, not a problem with any vendor implementation. You'll be happy to know that this has been fixed in gigabit interfaces (of course, most gigabit interfaces can't be run at slower speeds due to the laser type, so duplex mode is just about the only thing they have to autonegotiate).

My rule of thumb: either set both sides to autodetect, or manually set both sides' speed and duplex mode. Both sides of the ethernet link have to be handled the same way.

NFS

df doesn't show > 2GB filesystems correctly on SunOS

One person took the Auspex df command and rdist'd it to all Suns. It worked fine.
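
A sketch of a Distfile for pushing the binary out with rdist (host names and the install path are hypothetical):

    HOSTS = ( sun1 sun2 sun3 )

    /usr/local/bin/df -> ${HOSTS}
            install /usr/local/bin/df;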

How do we keep the Auspex from "notifying" every NFS client in the building when it reboots?

Start the rpc.mountd daemon with the Auspex-specific -r option, which tells it not to make entries in rmtab. This actually improves performance.

Non NFS Filesystems

Dos/Windows/NT -> PCNFSD

Every PC user has to have a valid password entry on your Auspex for pcnfsd to work.

You can run pcnfsd on another machine to do authentication for the Auspex. It works fine.

One person reported that FTP Software's OnNet 2.0 is a good product.

NT->Intergraph NFS

Intergraph NFS has a limit on the number of concurrent open files, which causes problems in some environments. Intergraph, however, is a leading vendor of NT/Win95/Unix interoperability solutions, so make sure to give them a look.

NT->Samba

Snoopy runs Samba on his Auspex and is happy with it despite its slowness. He just has his customers access home directories on the Auspex and get most of their binaries from the local NT box.

Snoopy gives some suggestions on optimization (a sample smb.conf sketch follows the list):

  1. Try out different read and write parameters in the Samba config file. Think about the ratio of the Samba block sizes to the PC Ethernet cards' buffers. Experiment.
  2. Make sure that you install Samba in case-sensitive mode. Case-insensitive mode uses a ton of CPU on your Samba server.
  3. Make sure you have a high-end HP if you run Samba on your Auspex.
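
A minimal smb.conf sketch along those lines (option names from older Samba releases; the values are starting points to experiment with, not recommendations):

    [global]
       case sensitive = yes   ; avoid expensive case-folding lookups
       max xmit = 8192        ; Samba's block size; match your PC cards' buffers
       read size = 2048       ; begin overlapping disk and network I/O sooner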

Some people have had success statically NFS-mounting all the NFS filesystems onto a dual-processor Ultra-II and then exporting that as one Samba share.

Novell

I've heard a report that Novell has certified Novell Gateway on the Auspex (this required fixing some of Novell's file-locking bugs). Even so, this is very difficult because Novell tends to change the specs in poorly documented ways.

Appleshare

Intercon now lists Auspex as a supported platform.

Xinet (sales@xinet.com) sells KTalk and KSpool for print spooling to Apple LaserWriters.

The Columbia AppleTalk Package is freely available on the Internet, and one person had good results with that.

Public Domain Software

Do any of you have the Korn shell for the Auspex?

Auspex ships ksh-88, which they licensed from AT&T. It runs fine.

The public domain ksh is available from ftp.cs.mun.ca:pub/pdksh

Public Domain SNMP Daemon

Tom Wike reported that he successfully used ucd-snmpd version 3.1.1 on two 1.7M1Z3 Auspii. He had to increase the MAXDISKS figure in config.h. He likes this daemon because you can have it monitor processes and disk space and configure alarm conditions.

ftp.ece.ucdavis.edu:/pub/snmp/ucd-snmp.tar.gz

Do TCP Wrappers work on an Auspex?

Several people reported an unqualified yes.

Does Bind 4.9.5 work on an Auspex?

Richard Wong reports that 4.9.5-D1 compiled clean and works on his 1.8.1Z2 Auspex.

Using an Auspex as a Mailserver

Does Sendmail V8 Work?

Yes. Eric Allman did some development of Sendmail V8 on the Auspex, and answered this one personally. Enough said.

Auspex also now distributes sendmail V8 on their CD's, although it's probably not this week's version :-)

For the latest public domain version of sendmail, you may need the 4.9 bind libraries as well.

General Comments on an Auspex as a Mailserver

Auspex for NFS mail spool and Mail Delivery

The Auspex is a great NFS fileserver, /var/spool/mail included. Unfortunately, a mailserver uses up CPU (for mail delivery, for procmail processing, and for POP/IMAP stuff). With only one CPU, the Auspex doesn't scale for actual mail delivery.

Auspex as Mail Spool, but Not Delivery Agent

One person had two sparc-20's perform mail delivery, but NFS mount their mail spool off of Auspii.

This worked pretty well, except for AIX user boxes that screwed up their lock tables, so you had to break the inode (copy the mail file to a different inode).

Make sure you take down the mailservers before the Auspex, and boot the Auspex before booting the mailservers.

Locking Comparison With Solaris

Solaris 2.4 and later provide better NFS locking functionality than a 1.8 or earlier Auspex. That's important for a mailserver. I don't have enough experience with 1.9 file locking to make a judgement, but my intuition says it's much improved.

When will the Auspex be Solaris Based?

It was originally scheduled for 1.9, then 1.9.1, then 1.9.2. Sigh.

Auspex Software Configuration

Can you jumpstart solaris2.5 clients off the Auspex?

Ruth Milner has done this, although there are limitations.

If you want to put the CD-ROM contents on disk (a good thing), you need to NFS-mount the Auspex onto a Solaris 2.5 box and then do the copy on the Solaris box (using setup_install_server).

The Auspex doesn't understand the Solaris "package" format, and you can't patch the client OS areas once they're on the Auspex, so you need a standalone Sun system; use ufsdump/ufsrestore to copy the /opt and /usr areas to the Auspex, then serve those over NFS to the clients. Make sure that only this one client "knows" about the patches (the data is in its local /var/sadm area) because you'll need to use that client to apply patches.
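
A sketch of the copy, run on the standalone Sun (paths are hypothetical):

    # NFS-mount the Auspex area that will hold the clients' /usr
    mount auspex:/export/exec/Solaris_2.5 /mnt
    # dump the local /usr and restore it into the NFS-mounted area
    ufsdump 0f - /usr | (cd /mnt && ufsrestore rf -)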

If you support multiple processor architectures, you'll need to manually fix some stuff in /usr/platform.

For customization of jumpstart, check out ftp.fwi.uva.nl:/pub/solaris/auto-install

For automatic workstation installations, check out the auto-net mailing list (send email to auto-net-request@math.gatech.edu).

Selectively limiting logins with NIS

Matt Black wrote:

If you want every user to have an account on the Auspex (perhaps for PCNFSD purposes), but only want authorized people to log in to the server, you can do the following:

$ more /etc/passwd
root:byPPPPPPPPU2E:0:1:Operator:/:/usr/local/bin/bash
nobody:*:65534:65534::/:
daemon:*:1:1::/:
sys:*:2:2::/:/bin/csh
bin:*:3:3::/bin:
audit:*:9:9::/etc/security/audit:/bin/csh
sysdiag:*:0:1:Old System Diagnostic:/usr/diag/sysdiag:/usr/diag/sysdiag/sysdiag
sundiag:*:0:1:System Diagnostic:/usr/diag/sundiag:/usr/diag/sundiag/sundiag
black:ZHQQQQQQQQAFk:104:10:Matthew Black:/u2/black:/bin/csh
belanger:MIRRRRRRRRGx6:1265:10:Alan Belanger:/u6/f/belanger:/usr/local/bin/bash
+pei
+::0:0:::/bin/true

Only black, belanger, and pei can log in. Users black and belanger get their passwords from the local passwd file; pei gets his from NIS. Everyone else cannot log in; however, they still retain NFS access if they possess an NIS account.

Ian Batten suggested replacing /bin/true with a small script that says "this host is restricted". The script could even log the login attempt to a file.
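
A sketch of such a script (the log file path is hypothetical; point the passwd entry at it in place of /bin/true):

    #!/bin/sh
    # refuse the login, but leave a note and an audit trail
    echo "This host is restricted.  Contact the sysadmins for access."
    echo "`date`: login attempt by ${LOGNAME:-unknown}" >> /var/adm/restricted.log
    exit 1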

War Stories and Disaster Recovery

How do I List the Root Directory Without Booting the Auspex?

b * or bad(,<DISK>,)*

(with appropriate <DISK>)

ax_expand and 100% full filesystems

Using ax_expand to migrate a 100% full filesystem is risky. The ax_loadvpar process can hang in the kernel, leaving your vpartab in an intermediate state after the required reboot.

You can use the manual ax_mdetach commands to recover, and you will need to manually edit your ax_vpartab.

Lawrence Rogers sent in this neat alias to get a stack trace of a process (the whole thing should be on 1 line, but it's been reformatted for the web):

alias ptrace set slot='`'echo proc '|' crash '|' awk "'"'$3=='\!^'{print $12'"'"'`"; (echo trace $slot; echo quit)|crash; echo ""'
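
A usage sketch (assuming the alias is defined in csh, since it uses \!^ history substitution, and that crash(8) behaves as on SunOS):

    ptrace 1234    # print the kernel stack trace for process ID 1234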

Auspex Alternatives

Sun

Sun Microsystems' server line includes some very large machines with some very large LADDIS numbers (an Ultra Enterprise 6000 was clocked at 27,000+ LADDIS ops/sec, as opposed to the fastest Auspex, which was just above 10,000 ops/sec).

Beware those LADDIS numbers, though. The 27,000 LADDIS ops figure was achieved on a server with 361 disks, 24 CPUs, and 8GB of RAM. The disks were _not_ using RAID. Nobody would configure a production server like that (especially since you don't have full disk hotswappability; you'd be replacing multiple dead disks every downtime).

Solaris is a very stable OS (we're not talking 2.0-2.3 anymore:-). Crashes due to software problems are minimal. Solaris supports NFSv3 (including TCP!). Solaris has excellent NFS file locking reliability.

The problem on the Sun side is hardware. Their current line of SPARCstorage Arrays claims to be hotswappable, but you need to take an entire tray of disks offline in order to replace one disk. Since (for performance reasons) it's optimal to stripe raidsets across disk trays (translation: across disk controllers), that usually means lots of filesystem downtime every time you replace a disk. The new 2000-series SSA systems may solve this problem (not sure).

Another problem with Sun is the Veritas software. Yes, Veritas gives you RAID, but it's software RAID. This uses CPU on your server. The LADDIS numbers reported by Sun at www.spec.org do not include RAID (neither does Auspex, but at least the Auspex is hardware raid--if you consider the SP-based RAID implementation hardware like I do). I have personal experience with Sun Veritas RAID, Online Disk Suite RAID, and Auspex RAID. Without hard data, I "feel" that the Auspex RAID is superior.

One very nice thing about the Suns is the ability to buy the Veritas File System. In addition to providing useful features (dynamic inode creation as needed, ability to shrink live filesystems) this filesystem shows excellent performance. I'd love to have a vxfs-based filesystem on the Auspex.

Network Appliance

Network Appliance has a different philosophy than most other NFS server vendors: you don't want a complicated NFS server, you want an NFS toaster that spits out packets when you hit the "make toast" button.

As a result, Network Appliance's OS is very small and efficient (they mirror the OS on _every_ disk). A 512MB Network Appliance box can outperform (LADDIS-wise) a 1GB Sun box. When you consider that the Netapp box is using RAID, and the Sun is not, that's pretty impressive.

The minimal OS also helps Netapp keep up its track record of bringing out new technology relatively quickly. Netapp came out with RAID, NFSv3, and dual-protocol support long before Auspex. This may be an issue for IPv6 support next year.

Netapp also has an excellent reliability record, although their service isn't as good as Auspex's.

Of course, the minimal Netapp OS (<40 commands) can make the Netapp unable to adapt to different situations. On our Auspii, we had a "mkroute" daemon that would advertise RIP packets for all the Auspii's network interfaces out each interface. We could configure interfaces up and down, add networks, and deal with network failures, all without manually reconfiguring a route. Because the Auspex had a Unix-based OS, we were able to write a program to do all that work for us automatically. There's no way you could do that on a Netapp.

I think a high-end Auspex is faster than a high-end Netapp. The Netapp is still very fast, but not too adaptable. The prices are surprisingly comparable (especially at the low end, where Auspex recently cut their prices). I'm happy with either one in my environment.
