## Tuesday, October 15, 2013

### Primality testing- I (brute force, Sieve of Eratosthenes)

If you are into algorithms, you would realize that Primality testing is a very important topic. In simple terms, Primality testing means finding if a number is prime or not. Lets discuss a few tricks for the same. There is actually a lot of number theory based on primality testing and it is impossible to cover all of them here. Nevertheless lets discuss some of them. This post starts with a simple brute-force algorithm and introduces Sieve of Eratosthenes. With these simple things covered, we will try and discuss more of primality testing in the following posts. Lets get started with the brute force approach!

NAIVE ALGORITHM:

Lets say the given number is n. The simplest(brute-force) way to find if a number is prime or not is to check if the number is divisible by any number in the range 2 to n-1. This follows from the definition of a prime number and should be easy to understand. At this moment, lets revisit a basic school grade mathematics property which says:
If a number, say n, is divisible by an integer k, then either k or n/k has to be <= sqrt(n)

This means you dont need to check if n is divisible by any of the numbers in [0, n-1], rather you can just check for [0, sqrt(n)]. Well, Give yourself a pat on the back because that seems like a good enough optimization. If you were making 10000 checks in the former case, you have reduced it to just 100 iterations with the addition of this property.  Lets have a look at a small python code to illustrate this (source: github):

[sourcecode language="python"]

def isPrime(n):
maxPossibleFactor = int( floor( sqrt(n) ) )
possibleFactors = range( 2, maxPossibleFactor+1 )
divisibleFactors = [ n % i == 0 for i in possibleFactors ]
return not any( divisibleFactors )

[/sourcecode]

That was easy. Right?

SIEVE OF ERAOSTHENES:

Lets move ahead and look at a slightly different question: Identify all the prime numbers below 1000. A naive way to do this is to call the above python function for [0-1000].  Sounds like bad... Can we do better? Yes, we can. Turns out there is a well known algorithm for doing this- Sieve of Eratosthenes. Lets see how this algorithm works (The explanation below is  copied from here):
Make a list of all the integers less than or equal to n (and greater than one). Strike out the multiples of all primes less than or equal to the square root of n, then the numbers that are left are the primes.

For example, to find all the primes less than or equal to 30, first list the numbers from 2 to 30.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The first number 2 is prime, so keep it (we will color it green) and cross out its multiples (we will color them red), so the red numbers are not prime.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The first number left (still black) is 3, so it is the first odd prime. Keep it and cross out all of its multiples. We know that all multiples less than 9 (i.e. 6) will already have been crossed out, so we can start crossing out at 32=9.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Now the first number left (still black) is 5, the second odd prime. So keep it also and cross out all of its multiples (all multiples less than 52=25 have already been crossed out, and in fact 25 is the only multiple not yet crossed out).
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The next number left, 7, is larger than the square root of 30, so there are no multiples of 7 to cross off that haven't already been crossed off (14 and 28 by 2, and 21 by 3), and therefore the sieve is complete. Therefore all of the numbers left are primes: {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}. Notice we just found these primes without dividing.

Now, lets look at a fun animation to illustrate the algorithm (source: wikipedia):

With the algorithm clear now, lets look at a python code (source: stackoverflow).

[sourcecode language="python"]
def primes_sieve2(limit):
a = [True] * limit # Initialize the primality list
a[0] = a[1] = False

for (i, isprime) in enumerate(a):
if isprime:
for n in xrange(i*i, limit, i): # Mark factors non-prime
a[n] = False
[/sourcecode]

If you have a trouble understanding some part of  this post or any other related question, please drop a comment. And by the time, we have the next post ready, you may try some questions related to prime numbers at http://projecteuler.net/problems. Happy coding :)