Smart Mobile Studio
  • News
  • Forums
  • Download
  • Store
  • Showcases
    • Featured demos
    • The Smart Contest 2013, Round 1 – Graphics
  • Documentation
    • Get the book
    • System requirements
    • Prerequisites
    • Getting started
      • Introduction
      • Application architecture
      • The application object
      • Forms and navigation
      • Message dialogs
      • Themes and styles
    • Project types
      • Visual project
      • Game project
      • Console project
    • Layout manager
    • Networking
      • TW3HttpRequest
      • TW3JSONP
      • Loading files
  • About

Focus, callstack and recursive processing

Posted on 29.09.2011 by Jon Lennart Posted in Documentation 6 Comments
Call stack limitations

Call stack limitations

The world of mobile computing is getting better and better, but there are some limitations when dealing with smart gadgets running iOS or Android. The first thing is obviously the speed, which doesn’t even come close to what an average home computer can muster. Even my 5 year old test machine can outrun any mobile device without making so much as a dent in the cpu-o-meter.

The second limitation people are going to discover, depending of-course on the line of work you do, is that mobile devices is the proverbial banker when it comes to memory – implying that your call stack is but a microbial-fraction of what we are used to. What does this mean? Well every time you call a method under Javascript, you eat up one call-slot. If this procedure in turns calls another, and another (and so on) it quickly gobbles up the meager call stack limit. The size of the call stack is varied and depends on the vendor, as of writing iOS is sticking to it’s near obsolete range of 100 recursive calls while Android is said (I couldn’t find any official information on this) to have a call stack of about 800.

To kiss or not to kiss, that is the question

Kiss (keep it simple stupid) is one way to go when it comes to mobile development. But there are times when you must resort to classical speed mechanisms to reach the full potential of a system. As such I am in the process of optimizing the base-classes for speed and simplicity (some form of balance preferably). The method is really simple and it’s known as loop expansion (or just “unrolling”). Let’s for sake of argument say you have a control with a lot of child objects, like a grid or a list. Instead of doing a simple for/next loop, like this:

[sourcecode language=”delphi”]
Procedure TMyObject.DoSomething;
var
x:Integer;
Begin
for x:=0 to FChildren.Count-1 do
Begin
process(FChildren[x]);
end;
end;
[/sourcecode]

The above example is perfectly valid, but it’s quite slow depending on the data-load you expect. A list control could be expected to handle hundreds if not thousands of elements – so we want to speed things up and minimize loop time:

[sourcecode language=”delphi”]
Procedure TMyObject.DoSomething;
var
mLongs:Integer;
mShort:Integer;
mCount:Integer;
mIndex:Integer;
begin
mCount:=FChildren.Count;
mLongs:=mCount shr 3;
mShort:=mCount mod 8;

while mLongs>0 do
Begin
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
process(FChildren[mIndex]);inc(mIndex);
dec(mLongs);
end;

while mShort>0 do
Begin
process(FChildren[mIndex]);inc(mIndex);
dec(mShort);
end;
end;
[/sourcecode]

Speed-wise, at least when it comes to crunching fixed data like pixels or typed JS arrays, is nearly incomparable to the first approach. It obviously produce more code – but the benefits are just to good to ignore (roughly 4 times as fast).

As long as your final build does not exceed 10 megabytes, which is Apple’s limit on single javascript files, it’s not a problem (and that would be a huge, huge project if you ask me).

Men first, women and children last

When doing recursive calls that affect an entire object model you have to deal with the men first, meaning of-course those elements that does not have children attach to them (oh this is gonna get me into so much trouble if my wife reads it). Why? Because we want to minimize the number of recursive calls (or steps taken) a control has to make. Let’s use the simple formula to demonstrate (this would also be loop expanded in the final release):

[sourcecode language=”delphi”]
Procedure TMyObject.DoSomething;
var
x:Integer;
mCount:Integer;
mStack: TObjectList;
mObj: TObject;
Begin
//avoid multiple calls to getCount in FChildren array
mCount:=FChildren.Count-1;
for x:=0 to mCount do
Begin
//minimize call to getItem in FChildren array
mObj:=FChildren[x];
if mObj.count=0 then
// No kids? Do singles first
process(mObj) else
Begin
//kids? Ok, we’ll handle that later
if mStack=NIL then
mStack:=TobjectList.Create;
mStack.add(mObj);
end;
end;

if mStack<>NIL then
Begin
mCount:=mStack.Count;
while mCount>0 do
Begin
mObj:=mStack[mCount-1];
process(mObj);
dec(mCount);
end;
mStack.free;
end;
end;
[/sourcecode]

And last but not least, the “process” method should be inlined if possible, thats 50% of the point here. If you can avoid a secondary call – then avoid it, that’s how you cut down on call-stack overload and as a nice benefit you get a speed boost.

Notes

#1 [29.09.11, 18:31]: Eric Grange pointed out (as did two readers on this page) that the same effect will happen if we apply FChildren.Count-1 and start from zero. That would be slightly faster to since we dont execute a continous subtraction of the index. A clear error on my part.

#2 [29.09.11, 18:31]: Interestingly, the -1 trick was originally used for string processing (hence starting at 1) and has nothing to do with speed per-see, except that you save an “if FChildren.Count>0 then” in some languages. But this tiny advantage is indeed lost by the use of [-1] in the loop itself. Some basic compilers will execute the code even if FChildren.count =0. This is an old habit from the days of Amiga BlitzBasic programming.

#3 [29.09.11, 18:35]: Also the “process()” call should only be concidered pseudo code. It’s the loop expansion that’s being demonstrated. Preferably you would work with the data directly at this point rather than call another proc, since the whole point is to limit the number of calls while speeding up a process. It’s not meant as a literal example, but as the bare-bones of a technique.

General programming Smart Mobile Studio tutorial
« The elegance of Javascript
Control complexity and mobile resource management »

6 thoughts on “Focus, callstack and recursive processing”

  1. Linas says:
    29.09.2011 at 12:43

    I’m not quite sure about your 1 index trick 🙂 There is no difference between your version and e.g. for i:=0 to FChildren.Count - 1 do . It won’t be executed if list is empty and this version is better and more readable IMO because there is no need to decrease indexes by 1 in the loop.

    • Jon Lennart Aasenden says:
      29.09.2011 at 18:25

      Yes, eric pointed out the same thing earlier. See notes 🙂

  2. gabr says:
    29.09.2011 at 16:28

    I don’t get it. Why is

    for x:=1 to FChildren.Count do

    better than

    for x:=0 to FChildren.Count-1 do

    ??

    Both loops execute zero times if the Count is 0.

  3. Jon Lennart Aasenden says:
    29.09.2011 at 18:19

    Updated the article, see notes at the end of the document. It was not the for/next loop index that was the speed boost – but rather the unrolling of the loop. The use of -1 is an old habit from blitzbasic in the early 90’s where the for/next would execute even if the target value was smaller than the initial value. My bad.

  4. gabr says:
    29.09.2011 at 18:45

    Actually, there is one big difference between those two loops – if the iteration variable is declared as ‘cardinal’ and you loop to .Count-1 and the .Count is 0, Delphi will raise an exception.

    • Jon Lennart Aasenden says:
      29.09.2011 at 19:13

      Yes but that would be a signed/unsigned error using longwords rather than integer. Focus here is on the loop expansion which avoids all of these and is much faster.

Comments are closed.

Pages

  • About
  • Feature Matrix
  • Forums
  • News
  • Release History
  • Download
  • Showcases
    • The Smart Contest 2013, Round 1 – Graphics
  • Store
  • Documentation
    • Creating your own controls
    • Debugging, exceptions and error handling
    • Differences between Delphi and Smart
    • Get the book
    • Getting started
      • Introduction
      • Local storage, session storage and global storage
      • Application architecture
      • The application object
      • Forms and navigation
      • Message dialogs
      • pmSmart Box Model
      • Themes and styles
    • Layout manager
    • Networking
      • Loading files
      • TW3HttpRequest
      • TW3JSONP
    • Prerequisites
    • Real data, talking to sqLite
    • System requirements
    • Project types
      • Visual project
      • Game project
      • Console project

Archives

  • December 2019
  • December 2018
  • November 2018
  • July 2018
  • June 2018
  • February 2018
  • September 2017
  • April 2017
  • November 2016
  • October 2016
  • September 2016
  • April 2016
  • March 2016
  • January 2016
  • October 2015
  • September 2015
  • July 2015
  • April 2015
  • January 2015
  • December 2014
  • October 2014
  • September 2014
  • August 2014
  • July 2014
  • June 2014
  • March 2014
  • February 2014
  • January 2014
  • December 2013
  • November 2013
  • October 2013
  • August 2013
  • July 2013
  • June 2013
  • May 2013
  • April 2013
  • March 2013
  • February 2013
  • January 2013
  • December 2012
  • November 2012
  • August 2012
  • July 2012
  • June 2012
  • May 2012
  • April 2012
  • March 2012
  • February 2012
  • January 2012
  • November 2011
  • October 2011
  • September 2011

Categories

  • Announcements (25)
  • Developers log (119)
  • Documentation (26)
  • News (104)
  • News and articles (16)

WordPress

  • Register
  • Log in
  • WordPress

Subscribe

  • Entries (RSS)
  • Comments (RSS)
© Optimale Systemer AS