AppEngine Tuning #1
Following my article The Amazing Story Of AppEngine And The Two Orders Of Magnitude, I’ve made some initial adjustments to Syyncc’s performance settings, and a minor code tweak.
I made the following change to Syyncc’s performance settings today:
I’ve set Max Idle Instances to 1, and Min Pending Latency to 15 seconds. In other words: hang it all, don’t start more instances unless the sky is falling in.
Syyncc’s site still seems responsive enough, and the app itself is working as before; I can’t detect any functional difference. But what’s important is that the average instance count has dropped significantly:
It’s dropped to an average of 4 or 5 instances, rather than the 10 to 15 it normally sits at. Not bad for shoving a couple of sliders around! And that’s before the code changes that will still be necessary to stop the behaviour where 50 tasks are scheduled concurrently once every 2 minutes. That leads to a spike in activity, then nothing for most of the time, and is very likely the cause of the excess idle instances. That’s all detailed in the previous post, linked at the top.
Given that the impact from the performance tuning is obvious, I’ll go ahead with refactoring the bursty scheduling code in the next few days, and post the results of that.
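To make the idea concrete, here is a minimal sketch of what "less bursty" could look like: instead of starting all 50 tasks at the same instant, give each one a delay so they are spread evenly over the 2 minute window. This is not Syyncc's actual code. The function name staggered_countdowns is made up for illustration, and the task URL mentioned in the comment is hypothetical. On App Engine, each delay could be passed as the countdown argument when adding a task to the task queue.

```python
def staggered_countdowns(task_count, window_seconds):
    """Return one start delay (in whole seconds) per task,
    spread evenly across the window instead of all at zero."""
    if task_count <= 0:
        return []
    # Integer arithmetic keeps the result exact and deterministic.
    return [(i * window_seconds) // task_count for i in range(task_count)]


# 50 tasks over a 120 second window, the numbers from this post.
countdowns = staggered_countdowns(50, 120)

# Assumption: on App Engine each delay would be used roughly like
#   taskqueue.add(url="/hypothetical/check", countdown=delay)
# which is left out here so the sketch runs anywhere.
for delay in countdowns[:5]:
    print("task starts after %s seconds" % delay)
```

The point of the design is simply that the same amount of work arrives as a steady trickle, so the scheduler has less reason to spin up instances that then sit idle.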
A bit more detail:
Not quite good enough (I want to get down to under 2 average), but much better.
You’ll recall from the previous article I had the horrible code full of fetches with offsets. I’ve replaced it with this:
    def SetAllNextChecks(cls):
        monitor_query = SSMonitorBase.all().filter("enabled", True)
        lopcount = 0
        for monitor in monitor_query:
            if not monitor.nextcheck:
                monitor.nextcheck = datetime.today() - timedelta(days=5)
                monitor.put()
            lopcount += 1
        logging.debug("SetAllNextChecks processed %s monitors" % lopcount)
    SetAllNextChecks = classmethod(SetAllNextChecks)
So that’s a simple iteration through the query that should behave much better; the debug line says
D2011-09-04 23:19:24.189 SetAllNextChecks processed 1403 monitors
and I’m fairly sure that this is actually accurate (i.e. it’s not actually touching over 100,000 objects!)
There’s no way to tell yet whether it’s helping: Datastore Reads are only surfaced in the billing, and the billing lags a few days behind. So, I’ll report back midweek or so with the results of this change.
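In the meantime, a back-of-envelope model shows why fetches with offsets can touch so many more objects than a plain iteration. It is a sketch under stated assumptions, not a measurement: it assumes every entity skipped by an offset still counts as touched, and the batch size of 10 is purely illustrative (I'm not claiming that's what the old code used). Both function names are made up for this example.

```python
def touched_with_offsets(total, batch_size):
    """Model: each fetch at a given offset touches the skipped
    entities plus the entities it actually returns."""
    touched = 0
    for offset in range(0, total, batch_size):
        fetched = min(batch_size, total - offset)
        touched += offset + fetched
    return touched


def touched_with_iteration(total):
    """Model: iterating the query once touches each entity once."""
    return total


monitors = 1403  # the count from the debug line above
print(touched_with_offsets(monitors, 10))
print(touched_with_iteration(monitors))
```

Under those assumptions the offset version grows roughly with the square of the number of monitors, while the iteration grows linearly, which is the whole reason for the rewrite.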
It’s a start
That’s it for now. Some preliminary success, but I’ve got a way to go. Stay tuned for the next update in a few days.