Introducing pupyMPI

Jan, Frederik and I just released the first (somewhat stable) version of pupyMPI, a 100% pure Python implementation of the MPI standard. Or, as close to the standard as we saw fit.

Most Python MPI projects are bindings to some C implementation, which gives them a lot of strengths. They run very, very fast, for one thing; so fast that you can actually use them for real production work if you want. In our opinion that is not very useful, as most real applications will depend on some further systems and most real clusters will probably only allow you to run your C and FORTRAN stuff. They do, however, give developers a nice way to develop rapid prototypes, learn and play with MPI. pupyMPI boosts all three while keeping the system so close to regular MPI that you can probably convert to regular MPI if you need the performance.

A quick example

Just to show how fast you can actually implement programs in pupyMPI, here is a quick example of a distributed program. It runs a Monte Carlo simulation to estimate pi, with a user-defined number of simulations:

#!/usr/bin/python

from mpi import MPI
import sys
from random import random
from math import sqrt

mpi = MPI()
world = mpi.MPI_COMM_WORLD
rank = world.rank()

try:
    simulations = int(sys.argv[1])
except (IndexError, ValueError):
    simulations = 1000000

per_rank = simulations / world.size()

hits = 0

for i in xrange(per_rank):
    # Simulate a point landing inside the unit circle.
    x = random()
    y = random()

    if sqrt(x*x+y*y) < 1.0:
        hits+=1

# Gather the sum of hits from each process at the process
# with rank 0
total_hits = world.reduce(hits, sum, root=0)

if rank == 0:
    pi = float(total_hits)*4 / simulations
    print ("Estimating PI on %d nodes through %d simulations yield %f"
           % (world.size(), simulations, pi))

mpi.finalize()

The output of several runs is given below:

$ mpirun.py -c 2 monte_carlo_pi.py -- 1000
Estimating PI on 2 nodes through 1000 simulations yield 3.184000

$ mpirun.py -c 4 monte_carlo_pi.py -- 10000
Estimating PI on 4 nodes through 10000 simulations yield 3.135200

$ mpirun.py -c 8 monte_carlo_pi.py -- 100000
Estimating PI on 8 nodes through 100000 simulations yield 3.136560

$ mpirun.py -c 10 monte_carlo_pi.py -- 1000000
Estimating PI on 10 nodes through 1000000 simulations yield 3.144464

$ mpirun.py -c 10 monte_carlo_pi.py -- 10000000
Estimating PI on 10 nodes through 10000000 simulations yield 3.141069

$ mpirun.py -c 10 monte_carlo_pi.py -- 100000000
Estimating PI on 10 nodes through 100000000 simulations yield 3.141677

The above example has very little communication involved other than the final exchange of hits. We implemented a lot of different communication operations, as you can see in the online documentation.
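To try the structure of the example without an MPI installation, the per-rank work and the final reduction can be imitated in a single process. This is only a sketch in plain Python: `hits_for_rank` and `estimate_pi` are hypothetical names that are not part of pupyMPI, the "ranks" are just loop iterations, and the built-in `sum` stands in for the reduce to rank 0.

```python
import random


def hits_for_rank(per_rank, rng):
    """Count random points in the unit square that land inside the unit circle."""
    hits = 0
    for _ in range(per_rank):
        x = rng.random()
        y = rng.random()
        if x * x + y * y < 1.0:
            hits += 1
    return hits


def estimate_pi(simulations, size, seed=0):
    """Imitate `size` ranks in one process and combine their hit counts."""
    per_rank = simulations // size
    if per_rank == 0:
        raise ValueError("need at least one simulation per rank")
    # Each imitated rank gets its own generator so the streams differ.
    partial_hits = [hits_for_rank(per_rank, random.Random(seed + rank))
                    for rank in range(size)]
    total_hits = sum(partial_hits)  # stands in for the reduce to rank 0
    # Divide by the number of points actually drawn, which can be slightly
    # less than `simulations` when it is not divisible by `size`.
    return 4.0 * total_hits / (per_rank * size)


if __name__ == "__main__":
    print("pi is roughly %f" % estimate_pi(100000, 4))
```

The only step that needs communication is the sum of the partial hit counts, which is why the distributed version gets away with a single reduce at the end.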

Performance

Don't expect much, as a pupyMPI program will normally run 15-20 times slower than the C equivalent. But hopefully it will prove a fine educational tool and maybe also be used for fast prototyping. If we get the time (and credit at school) we'll tune its performance to get within a factor of 10 of the C version.

Posted in diku, mpi, performance, python

Personal development a month at a time

I saw somebody else doing it, and it seems like a cool way to try some new things and ideas you wouldn’t normally try. So starting from February 2010, each month I’ll change something in my life that affects my day-to-day routine. The roughly two people reading this blog are very welcome to pitch ideas.

Posted in pdamaat, personal

DoubleGreen.dk… how hosting should be

I love the thought of people building a business based on doing something good, and doublegreen.dk is just such a business. CO2-negative hosting environments for Danish customers might not be the largest-scale project, but it’s a step on the way. It’s just like voting: your vote means something, so small and medium-scale environmental goodies also matter. It’s not just up to the big companies to make the changes, although that would also be much appreciated.

So all you people with one and dandomain: move, the sooner the better.

Posted in hosting

Protecting a Drupal site with Mollom

Just when I get really tired of Drupal, something happens that kind of pulls me back in. It happened to me again yesterday. I have spent a lifetime building a customer website, and the final result is, from a technical standpoint, not the most beautiful piece of work I have ever done, so I was already afraid when the client called me. Apparently the 7000 comments on the site were not a sign of the site’s popularity. Just spam.

After some digging I found Mollom, which is a somewhat commercial module for Drupal, and installed it. After spending about 10 seconds creating a Mollom account (thanks for supporting OpenID so well), 30 seconds installing the module and about a minute setting it up, the site was protected against spam. Now, 24 hours after setting it up, no spam comments have made their way through.

When complex modules that integrate that deeply with an entire system just work out of the box, you have to give them credit. It’s simply fantastic.

Posted in Uncategorized

Database structure migration without code versioning

This is not intended to be a rant or even close to it. But I find myself with a problem so obvious that I’m annoyed by the lack of a solution. I can’t be the only developer with this problem.

I use South to handle database structure migration and it works (for the most part) very well, and I’m happy to use it. One problem though. When you actually need to migrate backward and forward many times, you will run into problems with missing fields in your Django models. Say you have a field called “person” on a random model. You wish to rename this to “full_name”, so you write the straightforward migration to handle this and replace the old field in your model. Now, if you wish to migrate back, your code no longer matches the database. This means that you’ll be unable to use your Django system to inspect things or whatever you might wish to do.

The solution, from my point of view, for all of us using some versioning tool would be to bind the migration to a changeset and have the forward and backward migrations update the code as well. This is a bit more demanding to set up, but it’s probably the only solution you have if you want the possibility to migrate back and have a usable system.
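A rough sketch of what such a binding could look like, written in plain Python so it runs without Django or South. Everything here is hypothetical: the `Migration` record, the `plan` function and the revision names are made up for illustration, and nothing below is part of South. The idea is simply that every migration remembers the code revision that matches the schema once it has been applied, so each step knows which revision to check out.

```python
from collections import namedtuple

# Hypothetical record: a migration plus the code revision that matches the
# schema after the migration has been applied.
Migration = namedtuple("Migration", ["name", "revision"])


def plan(migrations, current, target, base_revision):
    """Return (direction, migration name, revision to check out) steps.

    `current` and `target` count how many migrations are applied (0 = none).
    `base_revision` is the code revision that matches the empty schema.
    """
    steps = []
    if target >= current:
        for migration in migrations[current:target]:
            steps.append(("forwards", migration.name, migration.revision))
    else:
        for index in range(current - 1, target - 1, -1):
            # After undoing migration `index`, the matching code is the
            # revision of the migration before it (or the base revision).
            if index > 0:
                revision = migrations[index - 1].revision
            else:
                revision = base_revision
            steps.append(("backwards", migrations[index].name, revision))
    return steps


migrations = [
    Migration("0001_initial", "rev-a"),
    Migration("0002_rename_person_to_full_name", "rev-b"),
]

# Undo the rename: the code should go back to the revision that still
# has the "person" field.
for step in plan(migrations, current=2, target=1, base_revision="rev-0"):
    print(step)
```

A real tool would also have to run the schema change and the checkout together and decide what to do with uncommitted changes, which is where the extra setup effort comes in.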

If only I had the time to take a proper look at South to find the solution.

Posted in Uncategorized