Following my article The Amazing Story Of AppEngine And The Two Orders Of Magnitude, I’ve made some initial adjustments to Syyncc’s performance settings, and a minor code tweak.
Instances
I made the following change to Syyncc’s performance settings today:
I’ve set Max Idle Instances to 1, and Min Pending Latency to 15 seconds. ie: hang it all, don’t start more instances unless the sky is falling in.
Syyncc’s site still seems responsive enough, and the app itself is working as before; I can’t detect any functional difference. But what’s important is that the average instance count has dropped significantly:
It’s dropped to an average of 4 or 5 instances, rather than the 10 to 15 it normally runs at. Not bad for shoving a couple of sliders around! And that’s without the code changes that will be necessary to stop 50 tasks being scheduled concurrently once every 2 minutes. That pattern causes a spike in activity, then nothing for most of the time, and is very likely the cause of the excess idle instances. It’s all detailed in the previous post, linked at the top.
Given that the impact from the performance tuning is obvious, I’ll go ahead with refactoring the bursty scheduling code in the next few days, and post the results of that.
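The refactor amounts to spreading those 50 tasks evenly across the 2-minute window instead of firing them all at once. A minimal sketch of the idea (plain Python, outside the taskqueue API; the task count and window are the figures from the previous post):

```python
def stagger_countdowns(num_tasks, window_seconds):
    """Spread num_tasks evenly across window_seconds.

    Returns a countdown (in seconds) for each task, so that instead of
    50 tasks all landing at once every 2 minutes, they arrive in a
    steady trickle that doesn't force AppEngine to spin up instances.
    """
    interval = window_seconds / float(num_tasks)
    return [int(i * interval) for i in range(num_tasks)]

# 50 tasks over a 120 second window: one task roughly every 2.4 seconds.
countdowns = stagger_countdowns(50, 120)
```

With the real Task Queue API, each value would be passed as the countdown argument to taskqueue.add(); the point is just that arrivals become uniform rather than bursty.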
A bit more detail:
Not quite good enough (I want to get down to under 2 average), but much better.
Datastore Reads
You’ll recall from the previous article I had the horrible code full of fetches with offsets. I’ve replaced it with this:
def SetAllNextChecks(cls):
    monitor_query = SSMonitorBase.all().filter("enabled", True)
    loopcount = 0
    for monitor in monitor_query:
        if not monitor.nextcheck:
            monitor.nextcheck = datetime.today() - timedelta(days=5)
            monitor.put()
        loopcount += 1
    logging.debug("SetAllNextChecks processed %s monitors" % loopcount)
SetAllNextChecks = classmethod(SetAllNextChecks)
So that’s a simple iteration through the query that should behave much better; the debug line says
D2011-09-04 23:19:24.189 SetAllNextChecks processed 1403 monitors
and I’m fairly sure that this is actually accurate (ie: it’s not actually touching over 100,000 objects!)
There’s no way to tell if it’s helping at the moment: Datastore Reads are only surfaced in the billing, and the billing lags a few days behind. So, I’ll report back midweek or so with the results of this change.
It’s a start
That’s it for now. Some preliminary success, but I’ve got a way to go. Stay tuned for the next update in a few days.
Wait to see your billing results….
You can (and should!) use [0] AppStats to look at your RPCs to find out if you did indeed solve the problem. Doing so should show the previous behavior pretty clearly.
[0] http://code.google.com/appengine/docs/python/tools/appstats.html
I didn’t know about AppStats, that’s gold, thanks!
But for this issue, it’s not going to show me what I want to see, is it? If I do a fetch like
foo.fetch(10, 20000)
I’ll see one rpc call, but in the datastore there’ll be 20010 reads which are invisible to me (and which are the problem). Is that not the case?
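To put a number on it: with offset-based fetching, the datastore still reads and discards every skipped entity, so the reads incurred per call are roughly offset + limit. A toy calculation (my own illustration, not AppEngine’s billing code):

```python
def offset_fetch_reads(limit, offset):
    """Approximate datastore reads for query.fetch(limit, offset):
    the offset entities are read and thrown away server-side,
    then the limit entities are read and returned."""
    return offset + limit

# The fetch in question: one RPC visible in the logs,
# but on the order of 20010 entity reads behind it.
reads = offset_fetch_reads(10, 20000)
```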
Yes and no. You can’t actually see how many reads are done, but you _can_ see the query being executed, and how long it took to run. Example output for a query that fetches 150+1 entities:
@26ms datastore_v3.RunQuery real=7393ms api=1396ms
Request: Query<app_='socghop', compile_=True, composite_index_=[], count_=151, filter_=[Query_Filter, Query_Filter], …>
Response: QueryResult<compiled_cursor_=CompiledCursor<position_=[CompiledCursor_Position]>, …>
The fact that the query takes several seconds to run should tell you that there are a lot of entities being fetched, and that you should make sure the query is doing what you’re expecting it to be doing. Your example would show up as a query like this happening _multiple times_, something that should definitely set off alarm bells :).
That might be true, the particular request overall took a long time to run, datastore response time in that bad section could have been part of it.
You can further optimize things in that loop by accumulating the results and storing them in a single batch put operation. And instead of using iterators, if you fetch in a batch, you’ll further reduce RPCs. Sverre’s suggestion of using appstats is a good one.
Absolutely, a fetch using cursors would be better. However, as I’ve said in the articles, the proper fix is to remove this code altogether.
What I’ve done here is the simplest, cleanest little bit of code I can which should show the difference clearly between the old situation and the new one. Think of it more as testing a hypothesis, rather than as a final fix of the code.
As to a batch put, that’s probably worth looking at. The new billing says that I’ll be paying for datastore writes too; not much, but it’s there. So they probably warrant optimisation. However, they’re a rounding error at the moment, bigger fish to fry!
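For what it’s worth, the batch-put shape would look something like this. It’s a sketch against an injected put_multi stand-in so it runs anywhere; on GAE the real call would be db.put(list_of_entities), and the batch size of 100 is an arbitrary choice:

```python
def set_all_nextchecks_batched(monitors, default_nextcheck,
                               batch_size=100, put_multi=None):
    """Accumulate changed monitors and write them in batches,
    instead of issuing one put() RPC per entity.

    put_multi is a stand-in for GAE's db.put(entities)."""
    pending = []
    processed = 0
    for monitor in monitors:
        if not monitor.get("nextcheck"):
            monitor["nextcheck"] = default_nextcheck
            pending.append(monitor)
        processed += 1
        if len(pending) >= batch_size:
            put_multi(pending)   # one RPC for up to batch_size writes
            pending = []
    if pending:
        put_multi(pending)       # flush the remainder
    return processed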
Can’t you query for only entities that are enabled and have nextcheck = None?
No, new ones will have no nextcheck, rather than a nextcheck set to None (I think).
I haven’t read the code for a while, but I think db.Model will store a None for all properties unless you’ve removed the property from your model or are using an Expando. You get charged a read for every entity that matches the query, so optimizing the query to return fewer results should save you money.
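The cost argument here is simple: every entity a query returns is charged as a read, so filtering server-side saves money. A toy comparison with plain Python stand-ins (the 40/1363 split is invented for illustration; only the 1403 total matches the log line above):

```python
def reads_charged(entities, predicate):
    """Each entity matched by a query costs one datastore read."""
    return sum(1 for e in entities if predicate(e))

monitors = ([{"enabled": True, "nextcheck": None}] * 40 +
            [{"enabled": True, "nextcheck": "set"}] * 1363)

# Current query: reads every enabled monitor, then filters in code.
broad = reads_charged(monitors, lambda m: m["enabled"])
# Narrower query (if None really is indexed): reads only the fixups.
narrow = reads_charged(
    monitors, lambda m: m["enabled"] and m["nextcheck"] is None)
```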
I use SQLite inside GAE; I don’t use the datastore.
How can you do that in production?
Emlyn O’Regan has written a really fine article explaining in great detail his experiences tuning his application for the new pricing model. He has also followed this article up with AppEngine Tuning #1.