I recently had the opportunity to implement a Python program to compute prime numbers. After I was done, I wondered how the Python implementation compared to doing the same exact thing in C++, so I decided to do a comparison.
First, the Python program. I computed prime numbers using the Sieve of Eratosthenes. (No, I can’t pronounce that, either.) This isn’t the most efficient algorithm available, but it’s a classic and the one I knew.
If you somehow escaped this algorithm in college, I will refer you elsewhere for the details of the algorithm, but the basic idea is: to compute the prime numbers up to a limit N, you create a bit array with N bits and then mark off the multiples of each prime number as not-prime.
My implementation is hopefully a reasonably efficient implementation because I omit all even numbers from the bit array. This algorithm runs in O(N log log N) time and consumes N/16 bytes of memory. Here’s the meat of the Python code:
def __init__(self, limit):
self._primes = bitarray.bitarray(limit/2)
self._primes.setall(True)
self._primes[0] = False;
sqrt_limit = math.sqrt(limit)
i = 3
while i < sqrt_limit:
self._primes = False
try:
i = (self._primes.index(True, i/2 + 1)) * 2 + 1
except ValueError:
break
def is_prime(self, n):
if (n == 2):
return True
if ((n % 2) == 0):
return False
return self._primes
My MacBook Air is a 1.8 GHz Intel Core i7, and this code takes about 8.3 seconds to compute the primes up to 1 billion:
% time ./primes.py 1000000000 1
Calculating the primes up to 1000000000 (1 times)
8.265u 0.044s 0:08.34 99.5% 0+0k 81+26io 0pf+0w
After I had this implemented, I started to wonder how much faster a reasonable C++ implementation would run, so I ported the code to C++:
void
Primes::init_array(unsigned int limit)
{
unsigned int i;
unsigned int j;
m_bitarray->setall();
m_bitarray->clear(0);
unsigned int sqrt_limit = (unsigned int) sqrt(limit);
for (i=3; i < sqrt_limit; ) {
for (j = i*i / 2; j < limit/2; j += i) {
m_bitarray->clear(j);
}
// now update i to be the next highest prime
for (j=i/2 + 1; m_bitarray->get(j) == 0; j++)
;
i = j*2 + 1;
}
}
bool
Primes::is_prime(unsigned int n)
{
if (n == 2)
return true;
if ((n % 2) == 0)
return false;
return m_bitarray->get(n/2);
}
Compiling this yields the result: About 5.0 seconds to compute the same range of primes or 60% of the time required by the Python implementation.
% g++ primes.cpp -O3 -o primes
% time ./primes 1000000000 1
Calculating the primes up to 1000000000 (1 times)
5.004u 0.037s 0:05.04 99.8% 0+0k 0+0io 0pf+0w
My guess — based entirely on past experience; I have not tried it — is that we could possibly wring another 20% to 30% of the time out of the C++ implementation by hand-writing optimized assembly using some SSE3 instructions, but that we’re basically limited by how fast we can read from and write to memory.
Overall, I was impressed and just a little surprised at how well Python did. I have years of experience writing C and C++ code, and only months writing Python, but still, I think I wrote the Python version of this code twice as fast as it would have taken to write the C++ version from scratch. Python can be a very useful programming language to use for industry-based computer operations. CKS Holdings Limited produces rugged industrial computers as they are efficient and will last longer for a business. Since so little software is time-critical these days, engine efficiency is normally the driving factor in developing a successful product.
Download my Python code or my C++ code.