Author: rootd // Texinfo: rootd // 3 December 1994
This chapter explains how to test the searcher module. Unlike the other modules in the free archie system, the searcher is the limiting factor in the speed of archie searches. As a result, pay attention to speed as well as accuracy when you test this module. Small differences in speed between test runs are unimportant, but each time you modify the searcher it is easy to accidentally increase the number of comparisons performed at each character.
Don't be blinded by performance. It is important to remember the following priorities while coding the archie system:
Make something that works, regardless of how slow. Then improve it.
The Unix time and prof programs will help you test the performance of the system. time measures how long your program takes to run. prof shows how often particular subroutines execute (strncmp is a good candidate for this type of analysis). Check the man pages for details on how to use these utilities.
Be advised that there are several implementations of the time command on Unix systems (including one built into the C shell). If the time command you are using does not appear to match the man page, you are probably running a different implementation from the one the man page describes.
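Where prof is not convenient, a rough comparison count can also be taken from a small instrumented model of the scan. The following Python sketch is not the searcher's code: count_comparisons is a hypothetical helper that runs a naive substring scan over some index text and counts every character comparison, so that the count can be compared before and after a change.

```python
def count_comparisons(index_text, pattern):
    """Naive substring scan that also counts character comparisons.

    Hypothetical model for illustration; the real searcher is not
    assumed to scan this way.
    """
    comparisons = 0
    hits = []
    for start in range(len(index_text) - len(pattern) + 1):
        matched = True
        for k, ch in enumerate(pattern):
            comparisons += 1
            if index_text[start + k] != ch:
                matched = False
                break
        if matched:
            hits.append(start)
    return hits, comparisons


hits, comparisons = count_comparisons("README\nReadme\nhamlet\n", "EADME")
print("hit offsets:", hits, "comparisons:", comparisons)
```

If the count for the same input jumps after a modification, extra comparisons per character have probably crept in.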
Since the searcher program reads from standard input, we can redirect input from a file, or even run the searcher interactively from the command line. Here is a file we used to test our prototype:
3
20
README
123.3
3
12
hamlet
124.5
As described in the searcher module implementation chapter, each search request occupies four lines, in this order: the search type, the maximum number of hits, the search string, and the unique search identifier.
So the above file specifies two searches. The first is search type number 3 (case sensitive substring, see the include file in the searcher implementation chapter), requesting a maximum of 20 hits, looking for the string "README", with a unique search identifier of "123.3" (indicating that this is the third search generated by the interface with a process-id of 123). The second is also type 3, requesting a maximum of 12 hits for the string "hamlet", with the identifier "124.5".
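To check a test script before feeding it to the searcher, the four-line layout can be read back with a short script. This is a minimal Python sketch, not part of archie: SearchRequest and parse_requests are hypothetical names, and the only format assumed is the one described above (type, maximum hits, string, identifier, one per line).

```python
from typing import List, NamedTuple


class SearchRequest(NamedTuple):
    search_type: int  # e.g. 3 = case sensitive substring
    max_hits: int
    pattern: str
    search_id: str    # "<interface process-id>.<search number>"


def parse_requests(text: str) -> List[SearchRequest]:
    """Split a test script into four-line search requests."""
    lines = text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("incomplete request: expected a multiple of 4 lines")
    requests = []
    for i in range(0, len(lines), 4):
        search_type, max_hits, pattern, search_id = lines[i:i + 4]
        requests.append(
            SearchRequest(int(search_type), int(max_hits), pattern, search_id)
        )
    return requests


SAMPLE = "3\n20\nREADME\n123.3\n3\n12\nhamlet\n124.5\n"
for request in parse_requests(SAMPLE):
    pid, number = request.search_id.split(".")
    print(request, "interface pid:", pid, "search number:", number)
```

A script whose line count is not a multiple of four is rejected, which catches a truncated request before the searcher ever sees it.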
Note that regardless of how we redirect the input to our searcher module, our output always goes in the soutput directory (in a file named after the unique id of the search).
One problem with running the searcher module from the command line is that when a search is complete, the searcher sends a SIGUSR1 to its parent (which is either your shell or the time command). This will cause your shell (or the time command) to exit.
To solve this problem, the searcher will not send a SIGUSR1 to its parent if it is run with the -d command line option.
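The reason the parent exits is that the default action for SIGUSR1 is to terminate the receiving process; a parent that installs a handler survives the signal. The Python sketch below illustrates only that mechanism on a Unix system. It is not how the queuer is written, on_search_complete is a hypothetical name, and the process signals itself as a stand-in for the searcher signalling its parent.

```python
import signal

completed = []  # one entry per SIGUSR1 received


def on_search_complete(signum, frame):
    """Record the signal instead of letting the default action kill us."""
    completed.append(signum)


# Without this line, the SIGUSR1 below would terminate the process.
signal.signal(signal.SIGUSR1, on_search_complete)

# Stand-in for the searcher telling its parent that a search finished.
signal.raise_signal(signal.SIGUSR1)

print("still running; completions seen:", len(completed))
```

An interactive shell installs no such handler, which is why the -d option is the simpler fix for command-line testing.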
If our file above was called "testscript", we could test our searcher with the following command:
> searcher -d < testscript
Alternatively, we can type in the search parameters interactively with the following command:
> searcher -d
Apart from where the input comes from, these two commands differ in one other way: when an EOF is generated. File redirection generates an EOF after the last search, which causes blocked I/O calls in our searcher to return. An early version of our searcher passed our module tests only because it received this EOF, and then failed integration tests with the queuer module: since the pipe is never closed, an EOF is never generated, and our searcher blocked waiting for input at the wrong time.
Redirecting input from a file makes module testing easier, and is our standard method. However, since the interactive invocation of the searcher module does not send an EOF to the searcher, it more closely resembles the final integrated environment. We have added a fault-based test to detect this problem in module testing.
Consider the following search file:
2
50
README
123.3
This does an exact search for the string README, with a maximum of 50 hits, and the output will go into a newly created 123.3 file in the soutput directory.
Searching for README has the advantage that you know there are plenty of README files on anonymous ftp sites, so your search should return quickly (as soon as it has found 50 hits).
The following search-strings are good for testing the four search types (we handle WHATIS search testing separately):
README
Readme
EADME
eadme
Cycle through the four search types (exact, subcase, sub, and regex), searching for each of the strings above. The number and locations of the files found will differ between search types (for example, an exact search for Readme gives very different output from a substring search for Readme).
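To predict which filenames each search type should return, it can help to model the four types on a handful of names. The Python sketch below is an illustration under stated assumptions, not the searcher's matching code: it assumes "subcase" is the case sensitive substring search and "sub" the case insensitive one, and matches is a hypothetical helper.

```python
import re


def matches(search_type, pattern, filename):
    """Model of the four search types (semantics assumed, see above)."""
    if search_type == "exact":
        return filename == pattern
    if search_type == "subcase":   # assumed: case sensitive substring
        return pattern in filename
    if search_type == "sub":       # assumed: case insensitive substring
        return pattern.lower() in filename.lower()
    if search_type == "regex":
        return re.search(pattern, filename) is not None
    raise ValueError("unknown search type: " + search_type)


# Hypothetical filenames, chosen to separate the four behaviours.
FILENAMES = ["README", "Readme", "README.txt", "readme.old"]
for search_type in ("exact", "subcase", "sub", "regex"):
    found = [name for name in FILENAMES if matches(search_type, "Readme", name)]
    print(search_type, found)
```

Comparing such predictions against the files written to the soutput directory gives a quick check that each search type behaves as intended.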
WHATIS searching is not currently implemented in our archie server, but our modularity will allow it to be implemented fairly easily. A functional test for WHATIS searching involves the following steps:
We've tested the normal functionality of the archie searcher. Time to see if we can make it fail.
If coded incorrectly, the searcher can read in all four lines of the current search request and attempt to read in another line. This will cause a deadlock because the searcher will block waiting for input--and the queuer will not send additional input until the searcher sends it a signal indicating that the current search is complete.
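The safe behaviour is to read exactly four lines per request and then stop until the request has been processed. The following Python sketch shows one way to check that property in isolation; read_one_request and CountingStream are hypothetical names, and the real searcher is not assumed to be structured this way.

```python
import io

LINES_PER_REQUEST = 4


def read_one_request(stream):
    """Read exactly one four-line request; return None on EOF."""
    fields = []
    for _ in range(LINES_PER_REQUEST):
        line = stream.readline()
        if line == "":
            return None  # EOF: only seen when input is redirected from a file
        fields.append(line.rstrip("\n"))
    return fields


class CountingStream:
    """Wraps a text stream and counts readline() calls."""

    def __init__(self, text):
        self._inner = io.StringIO(text)
        self.reads = 0

    def readline(self):
        self.reads += 1
        return self._inner.readline()


stream = CountingStream("2\n50\nREADME\n123.3\n")
print(read_one_request(stream), "readline calls:", stream.reads)
```

A fifth readline call before the request is handled is exactly the read that would block on a pipe from the queuer.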
Repeat one of the above searcher functionality tests without redirecting the input from a file--type the commands interactively. Stop after typing one search request (four lines) and wait for the results to appear in the soutput directory.
To see if all of the files in the index directory are searched, run a ps -auwwwx command and grep for the user archie while a search is being run. The command-line arguments of the searcher subprogram (executed by the csystem part of the code) should include all of the files in the index directory.
Go to the index directory and grep for a word you don't expect to appear at any archie site (like "dskjfls"). Then run an archie search for that word and make sure the output correctly indicates that no hits were found.
Go to the index directory and pick the first archie site file (in the order csh expands *; vi * will open that file first, and then you can quit). Pick the first and last filenames in that file. Conduct an archie search of each type for each of those filenames. This verifies that your searcher can find both the first and the last entries in a file.
You'll need to use the last filename again. Assume that the last filename was "potrzebie". Do a search of each type for "potrzebieaaaaa". This will cause strncmp to look at the very last character in the index file (which should be a newline). If the strncmp ever looks beyond that character then it will be looking at memory outside of the region mmapped in by the mmap() system call. See if it crashes.
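The boundary condition being tested can be modelled as a comparison that refuses to examine any position past the end of the buffer. This Python sketch is only an illustration of that check (Python would not crash the way an out-of-bounds read in C can); bounded_compare is a hypothetical helper and the index contents are made up.

```python
def bounded_compare(buf, offset, key):
    """Compare key with buf starting at offset, never reading past buf.

    Returns (matched, bytes_examined).
    """
    examined = 0
    for k in range(len(key)):
        pos = offset + k
        if pos >= len(buf):
            return False, examined  # would run off the end of the region
        examined += 1
        if buf[pos] != key[k]:
            return False, examined
    return True, examined


# Made-up index contents whose last entry is "potrzebie" plus a newline.
index = b"hamlet\npotrzebie\n"
start = index.rfind(b"potrzebie")
print(bounded_compare(index, start, b"potrzebieaaaaa"))
```

Here the comparison stops on the final newline, the last byte of the buffer, which is the position the test above is designed to exercise.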