Distributed Promises for AppEngine

PromiseThis post is a plea for help. I’ve made something pretty cool, in a commercial context, that’s small and contained and useful and which I want to open source. I’ve got no experience in successfully creating open sourced code that other people might want to interact with, so I’m looking for guidance from someone who does, particularly in the Python and/or AppEngine communities.


 

AppEngine is a great platform, one I’ve been working in now, using Python, for a couple of years.

One of the ongoing annoyances for me has been the low-level nature of the abstractions available for distributed programming, particularly using task queues to kick off background tasks (which might want to kick off more background tasks and etc).

At the most basic, you can create tasks on a queue. The queue is processed, the task runs eventually. The task is a web handler (a “get” handler iirc). Simple, but it’s messy to set up if you want to do complex things, lots of boilerplate, lots of messing around with routes and so on.

Then there’s the excellent deferred library. It allows you to kick off a function with some arguments in a task, but hiding all of the task queue messiness. It makes tasks highly useful and usable. But there are still niggles.

Firstly, deferred functions are passed by name. Actually, it’s more complex than this; deferred takes a callable, which might be an object (must be picklable, is pickled), or a function (passed by name for the most part I think, maybe something else going on with built in functions). But in any case we have the following restrictions in the doc:

The following callables can NOT be used as tasks:
1) Nested functions or closures
2) Nested classes or objects of them
3) Lambda functions
4) Static methods

ie: you can’t use really interesting stuff.

The second problem is that you launch a process, but then what? Does it ever execute? How do you know when it’s complete? How can you compose these calls?

As a higher level alternative to using deferred, I’ve made a library that provides distributed promises.

It lets you do things like this:

The interesting thing here is that both the functions are run in (separate) background tasks. If the first one times out, the second one will receive a timeout error as the result.

You’ll recognise promises more or less from the javascript world. These are similar, but structured a little differently.

Javascript’s promises are used for handling an asynchronous environment inside a single process. So you can kick off something asynchronous, and have the handler for that be a closure over the resolve method, meaning you can signal the job is done in a finished handler, without needing to “return” from the originating function.

For python on appengine, if you want to manage asynchronous behaviour inside a single process, try ndb tasklets.

My distributed promises for appengine are instead for managing distributed processing; that is, algorithms coordination multiple tasks to do something.

The function you pass to your promise (via when, then, etc) will run in a separate task, and can signal it is done by calling resolve() which will trigger further tasks (eg: as above) if necessary.

Also, to make promises really work and be interesting, I’ve included the ability to pass inner functions and closures and so on through to the promises. So while defer takes only callables that are picklable or referenceable by name, promises take functions (must be functions, not other callables) which are completely serialised with their entire closure context as necessary, and reconstructed in the task in which they will run. So you can do things like this:

Note the functions are referring to names from outside their own scope. Full closures are allowed. Also you can pass lambda functions.

The when() function returns a promise. The then() function returns a promise. There’s an all() function which also returns a promise (as well as variations such as allwhen and allthen). These functions are completely chainable.

The all() functions allow you to do something when a bunch of other things finish, ie:

promiseSpace.all([promises...]).then(somethingfinal)

Python has great exception handling, so I’ve dispensed with Javascript’s separate resolve and reject methods; to reject, just pass an exception to resolve. Also dispensed with are separate success / fail handlers; instead you get a result object with a value property, which throws the exception if the result was actually an exception.

Here’s a more red-blooded example of a function for mapping over the datastore, and example of using that to make a count function, and an example of calling that function:

Start at the bottom and work upwards to see how a simple counting method is implemented.

There might be much better ways of writing this; I’m not especially good at using these promises yet!

So, in closing, if you can see some value in this and you could help me with how to best publish this code, please contact me.

Advertisements
Distributed Promises for AppEngine

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s