Haskell track: assignment 4: stand-alone programs

Goals

In this lab you'll learn how to write stand-alone programs in Haskell.

Programs to write

This week's lab is divided into two parts:

Some very simple stand-alone programs, for practice.
Some slightly more complicated stand-alone programs, which actually do something interesting.

Let's look at these separately.

Simple stand-alone programs

Write a program called hello1 that prints the line "hello, world!" (without the quotes).
Write a program called hello2 that prints a prompt asking you to enter your name, grabs your name from stdin, and prints
```
  hello, <your name>!
```
(with <your name> appropriately substituted, of course).
Write a program called hello3 that prints out its command-line arguments, one per line.

None of these programs requires more than 7 lines of code.

More complex stand-alone programs

Now that you have that under your belt, it's time to write some more interesting programs. You're going to re-implement the Unix utility programs cat, sort and uniq in Haskell; call them hcat, hsort and huniq. They will not implement all the options of the Unix programs, of course (unless you want serious extra credit ;-)). However, each of them has to be able to read input either from a named file (or files, in the case of hcat) or from stdin, and they should all write their output to stdout. Use a quicksort algorithm for the hsort program. None of the three programs should add anything to the output (not even a newline) that wasn't in the original input.

Read (or at least skim) the man pages for cat, sort, and (especially) uniq to make sure you know what these programs do. In brief, cat copies (concatenates) files to the standard output stream, sort sorts the lines in a file, and uniq removes consecutive duplicated lines in a file (not all duplicated lines, just the consecutive ones).
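
To make "consecutive" concrete, here is a small sketch that uses the library function group from Data.List, which splits a list into runs of equal adjacent elements. The name runsOf and the sample input are made up for illustration, and the library sort is used only for the demonstration (the assignment still asks for your own quicksort in hsort).

```haskell
module Main where

import Data.List (group, sort)

-- Split a list of lines into runs of equal *adjacent* lines.
-- (runsOf is a hypothetical name; it is just Data.List.group.)
runsOf :: [String] -> [[String]]
runsOf = group

main :: IO ()
main = do
  let input = ["a", "a", "b", "a"]
  -- Three runs: the final "a" is not adjacent to the first two.
  print (runsOf input)          -- [["a","a"],["b"],["a"]]
  -- After sorting, all equal lines are adjacent, so there are two runs.
  print (runsOf (sort input))   -- [["a","a","a"],["b"]]
```

uniq keeps one line per run, so on the unsorted input it would print a, b, a, while on the sorted input it would print a, b. This is why uniq is so often used right after sort.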

For the hsort and huniq programs, do the sorting or uniq'ing in a separate function that does not have an IO type. Because of laziness, it's just as efficient to use this function on a list of all the strings in all the files as it would be to do the processing inside the input/output handling code, and it's much more modular. This is one of the hidden benefits of laziness; it helps you decompose problems better.
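
The shape described above might look roughly like the following sketch. It uses a deliberately unrelated placeholder task (reversing each line) so that the structure is visible without giving away hsort or huniq. The names processLines and readInput are hypothetical, and the imports assume a current GHC, where getArgs lives in System.Environment.

```haskell
module Main where

import System.Environment (getArgs)

-- Pure transformation: note that IO does not appear in the type.
-- (Placeholder task for illustration: reverse every line.)
processLines :: [String] -> [String]
processLines = map reverse

-- Read every named file in order, or stdin if no files are named.
readInput :: [FilePath] -> IO String
readInput []    = getContents
readInput files = concat <$> mapM readFile files

main :: IO ()
main = do
  args  <- getArgs
  input <- readInput args
  -- All the IO happens here; the real work is done by the pure function.
  putStr (unlines (processLines (lines input)))
```

One caveat with this sketch: unlines always ends its result with a newline, so input whose last line has no trailing newline gains one on output. Dealing with that case is left open here.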

Once you're done, it should be possible to do this:

% hcat *.hs | hsort | huniq | more

and see all the unique lines in your programs. (If you don't understand the above line, you need to read about pipes in shell commands. Do "man bash" to learn more.) For fun, try to figure out how much slower the Haskell versions are relative to the normal Unix utilities written in C.

Useful functions

Here are some functions you may find useful:

putStr and putStrLn
getChar
getLine
hSetBuffering
openFile
getContents
hGetContents
hClose
System.getArgs (found in System.Environment in current versions of GHC)
mapM_
interact (especially the idiom interact id -- what does that do?)
lines
the (!!) operator for accessing lists by index
the IO and System libraries (System.IO and System.Environment in current versions of GHC; use the import statement to import them)
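
As a rough illustration of how several of these functions fit together, here is a small sketch of a program that asks which command-line argument to print. It is not one of the assigned programs. The helper pick and the messages are made up for the example, and the module names assume a current GHC (System.IO and System.Environment).

```haskell
module Main where

import System.Environment (getArgs)
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

-- Pure helper (hypothetical name): a bounds-checked wrapper around (!!),
-- which by itself raises an error on an out-of-range index.
pick :: [a] -> Int -> Maybe a
pick xs i
  | i >= 0 && i < length xs = Just (xs !! i)
  | otherwise               = Nothing

main :: IO ()
main = do
  -- Turn off buffering so the prompt appears before the program
  -- blocks waiting for input.
  hSetBuffering stdout NoBuffering
  args <- getArgs
  putStr "Which argument (0-based)? "
  answer <- getLine
  case reads answer :: [(Int, String)] of
    [(i, _)] -> putStrLn (maybe "no such argument" id (pick args i))
    _        -> putStrLn "not a number"
```

Without the hSetBuffering call, a prompt written with putStr (no trailing newline) may sit in the output buffer and not show up until after the user has already typed something.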

To hand in

The files hello1.hs, hello2.hs, hello3.hs, hcat.hs, hsort.hs and huniq.hs.

Supporting files

A Makefile for compiling your code. Type make to make all the programs and make clean to get rid of the compiled programs and intermediate files.

References

As always, the tutorial A Gentle Introduction to Haskell, by Hudak, Peterson and Fasel.
The GHC user's guide, specifically the part relating to compiling stand-alone executables.