PSU CS199 — Introduction to Computer Science

Homework 2: Simple Statistics

Due Thursday 30 July 2009

Language Level: Beginning Student
Teachpacks: baseball.ss

This homework gives you experience writing functions that consume lists of numbers. You will write functions to compute some common statistical measures of a dataset, such as the mean, variance and standard deviation. I'm providing you with a teachpack called baseball.ss that defines two (relatively) large datasets containing historical data about Major League Baseball players.

Preliminaries

Download the following file onto your computer.

This teachpack defines two constants:

Load this teachpack into DrScheme (using the Language > Add Teachpack … menu).

Your Assignment

Part I

Write Scheme definitions for count (which should return the number of elements in a list), sum, mean, variance and stdev (standard deviation). Provide a contract, a purpose statement and test cases for each function. The test cases should be small enough that you can check the results by hand.
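As an illustration of the expected layout (contract, purpose, definition, then test cases), here is a minimal sketch for a function that is not part of this assignment. The function name product and the parameter name a-list are made up for this example, and the sketch assumes your version of DrScheme provides check-expect and check-within at the Beginning Student level.

```scheme
;; product : list-of-numbers -> number
;; to compute the product of all the numbers in a-list
;; (the product of the empty list is taken to be 1)
;; NOTE: product and a-list are illustrative names, not part of the homework.
(define (product a-list)
  (cond
    [(empty? a-list) 1]
    [else (* (first a-list) (product (rest a-list)))]))

;; Test cases small enough to check by hand
(check-expect (product empty) 1)
(check-expect (product (cons 5 empty)) 5)
(check-expect (product (cons 2 (cons 3 (cons 4 empty)))) 24)

;; Results that pass through sqrt may be inexact, so compare them
;; within a small tolerance instead of testing for exact equality.
(check-within (sqrt 2) 1.414 0.001)
```

The same pattern should carry over to each function in Part I: write the contract and purpose first, work out a few small cases by hand, and only then write the definition.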

Statisticians have various ways of defining variance and standard deviation. For this homework, please adhere to the following definitions:

Part II

Once all of your functions are working properly, use them to analyze the baseball data. Your results should correspond to the numbers in the Benchmarks section below.

Extra Credit

Write median, and test it.

Benchmarks

To make sure your numbers are not way off base, here are a few ballpark figures to check against your results:

Hand In Your Work

Put your names in a comment at the top of the definitions window. Save your file from DrScheme and attach it to an email message. In addition, put the names of both partners in the body of the email and in the name of the file. Submit your email to CS199Homework

Acknowledgements

This homework is based on a lab used in the University of Chicago's CS105 class, originally designed by Adam Shaw. We owe the baseball data and the ballpark humor to him.