CS11 Erlang - Lab 2

Sieve of Eratosthenes, Process-Style

The Sieve of Eratosthenes is a simple way of generating prime numbers. The basic idea is this:

  1. Generate a list of numbers from 2 to some maximum value N.
  2. Starting with the first number in the list (that is, 2):
    1. The current number is prime.
    2. Remove from the list all multiples of the current number.
    3. Move forward to the next number that still remains in the list.
    4. Go back to step (a).

So for example, if you had the list:

  [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]

This is the general idea.

Specification

We are going to implement the Sieve of Eratosthenes, but with a twist! Instead of using a list of numbers, we will use a series of Erlang processes that function as sieves, and the messages will be the numbers. As numbers flow through the sequence of processes, you will end up with only prime numbers at the end.

  1. Create a new Erlang module called "proc_sieve". This module should export a single function called "generate", which takes a single argument "MaxN", and returns a list of all prime numbers between 2 and MaxN, inclusive.
  2. The generate function will first spawn a "sieve" process (described below), and then send this sieve process each number from 2 to MaxN in sequence. Each integer is sent as a separate message to the sieve process. Finally, when all integers have been sent, the generator will send a final message to its sieve process, but we will get to that in a moment.
  3. The "sieve" process should be implemented with a function that takes no arguments, because all of its information will be passed via messages. The sieve process follows a relatively simple set of rules:

    Subsequent incoming messages are handled like this:

    You can see from all of these details, that each sieve process will have two pieces of state: the sieve's value of N, which doesn't change throughout the lifetime of the process; and also the PID of the next sieve in the chain, if a "next sieve" has actually been created. You could, for example, use the atom undefined to represent the situation where the "next sieve" has not yet been created, and once a "next sieve" has been created, you can store the sieve's PID for that state value instead.

  4. Now that you see how the sieve processes work, you can see that the last thing that the generate function must do is to send a {done, self()} message to the first sieve process, and then receive the list of prime numbers from the first sieve process.

Sketch of Operation

This is a pretty complicated procedure to describe in words, so here are some simple sketches to illustrate what is going on. When the generator starts up, it spawns a single sieve process for filtering out all numbers that are multiples of 2:

Once the sieve process receives N = 2, its job is to filter out all multiples of 2. However, the first non-multiple of 2 that it sees is a new prime number, which we call P. Therefore, the sieve must start its own child-sieve and then send P on to the child:

The next value received from the generator is 4, and the N=2 sieve will filter out that value:

Of course, when the generator sends a value of 5, this passes through both the N=2 sieve and the N=3 sieve, because 5 is also prime. Thus, the N=3 sieve must now start a child sieve:

And so forth, until after running for quite a while, we end up with a situation like this:

At the end, of course, the generator needs to send its {done, self()} message to the first sieve in the chain, and then this "done" message propagates all the way to the end of the chain of processes. Finally, the last sieve in the chain receives the "done" message, and the actual prime numbers are collected from the end back to the start.

Testing and Printing Primes

Once you have completed your prime number generator, you need to test it to make sure it works properly. You probably want to start with a simple test, like MaxN = 100, just to make sure that it works correctly.

Make sure that your generator doesn't leak any processes! All sieve processes should terminate successfully at the end of a run, and it should be possible to run "generate" multiple times without any adverse behaviors. You can find out how your solution behaves by running the processes/0 BIF before running your code, and then again after running your code, to see if there are suddenly lots of extra processes. Note that there are quite a few processes running in a normal Erlang emulator, so make sure that you check what processes are running before you try out your code.

Once you have your code working pretty well, try to generate 10K or even 100K prime numbers. (My code takes approximately a minute to generate 100K primes; maybe you can do better!) You don't need to worry about getting a huge output from your program, because the Erlang shell will only print out a small number of list-elements before stopping.

But, it would be cool to actually see 100 or 200 or 500 prime numbers, so add one more function to your proc_sieve module, called gen_print(MaxN). This function's operation will be very simple:

Try out your new gen_print function with 200 or 500 primes, and make sure your results are correct!

All Done!

Once you have completed all of the above work, go ahead and submit your lab on csman.


Last updated October 18, 2012.
Copyright (C) 2009-2012, California Institute of Technology. All rights reserved.