How to Speed Up This Code?

LiveCode is the premier environment for creating multi-platform solutions for all major operating systems - Windows, Mac OS X, Linux, the Web, Server environments and Mobile platforms. Brand new to LiveCode? Welcome!

Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller, robinmiller

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: How to Speed Up This Code?

Post by mwieder » Fri Jul 31, 2015 7:00 am

Indeed.
Yes, every time I code a repeat loop, I look for ways to turn it into a 'repeat for each' construct if possible.
It's a dramatic difference.

jacque
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 7390
Joined: Sat Apr 08, 2006 8:31 pm
Contact:

Re: How to Speed Up This Code?

Post by jacque » Fri Jul 31, 2015 4:15 pm

I wonder how lineOffset along with its skip parameter would compare. I may try it later if no one else does. I'm not at the computer right now.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 5:49 pm

jacque wrote:I wonder how lineOffset along with its skip parameter would compare.
Remember that the only thing a computer understands is a byte. Everything else has to be figured out from examining sequences of bytes.

Lineoffset requires the engine to examine every byte, looking for both the string you specific and returns, counting those returns as it goes until it matches the string, and then returns the count.

The most common use of lineoffset is to then retrieve the data on the found line, which requires that the process be repeated a second time in its entirety, since LC doesn't maintain an understanding of the byte position the string was found at between calls.

The skip param helps tremendously when using the offset function, since characters occupy a fixed number of bytes (1 in v6 and earlier, 2 in v7 and later), so the skip option plays well with the only thing the computer inherently understands, moving instantly to the specified start location.

With lineoffset, however, the concept of a "line" is a human construct unknown to the computer, so even when we specify that it should start on line 200 it still needs to examine every character from the beginning just to figure out where line 200 is.

A while back I tested relative performance of lineoffset vs offset used with CRs as part of the search string, with the goal of measuring equivalent output but using two very different means. Of course offset can't be used in all cases where lineoffset can, but IIRC where it can the performance is many times faster, not surprising given that offset is all about fixed-length entities.

For this particular phone number list problem I'll bet there's a good way to handle this with good if not amazing efficiency, but the optimal solution would depend on the specifics of the data, and being a list of contact info I'd understand if that can't be publicly shared. But if equivalent dummy data could be posted I'd be interested in seeing if I can get some time over the weekend to poke around in it.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

jacque
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 7390
Joined: Sat Apr 08, 2006 8:31 pm
Contact:

Re: How to Speed Up This Code?

Post by jacque » Fri Jul 31, 2015 6:14 pm

All right, it could use offset then. I'm thinking that since offset can retain its relative position in the text just as "repeat for each" does, it will find target strings faster than examining every line.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com

sritcp
Posts: 431
Joined: Tue Jun 05, 2012 5:38 pm

Re: How to Speed Up This Code?

Post by sritcp » Fri Jul 31, 2015 7:13 pm

Thierry was kind enough to private mail me his code using lineOffset. Here it is:

Code: Select all

on rexSolution searchFor
   local N, tFound, tData,
   local L2skip = 0
   put field "RawData" into tData
   repeat while true
      put lineOffset( searchFor, tData, L2skip) into N
      if N = 0 then exit repeat
      put line (N + 3 + L2skip) of tData &cr  after tFound
      add (N+3) to L2skip
   end repeat
  
   put tFound into field "Output"
   put the number of lines of tFound into msg
end rexSolution
And, here are the results for my "needles in a haystack" dataset (which I have attached to an earlier message):

"repeat with i=1 to ...." method: 540 seconds (true!!!)
"repeat for each line ..." method: 3 seconds
"lineOffset" method: 0.3 seconds (yes!!!)

This is nuts! I think the penalty is too high for someone like me (an amateur) to rely on trial and error to land up with the right tool. There needs to be a set of guidelines relating the nature of problem to tool effectiveness (speed is only one of the criteria). In other words, something to bridge Richard's engine-level insights and my need not to go completely off-track (see test results above).

Aside, after Mark's comment above, I too am wondering if "repeat with i = ..." has a role to play (apart from tripping me up!). By using an external counter along with "repeat for each ...", we have all the advantages without any of the drawbacks of "... line i ...".

Regards,
Sri.

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 7:27 pm

sritcp wrote:This is nuts! I think the penalty is too high for someone like me (an amateur) to rely on trial and error to land up with the right tool. There needs to be a set of guidelines relating the nature of problem to tool effectiveness (speed is only one of the criteria). In other words, something to bridge Richard's engine-level insights and my need not to go completely off-track (see test results above).
It's possible to write slow code and fast code in every language. The difference with LiveCode is that it does a very good job of insulating us from machine logic, but the tradeoff is that the code is ultimately being executed by a machine where it might benefit from thinking like a machine.

With lower-level languages you have no choice but to learn machine logic first, so once you've spent months getting the necessary foundation to use the language at all you've already acquired the knowledge needed to use it for better efficiency.

Everything in computing involves tradeoffs. The easier it is to get started with a language is the degree to which it will be easy to get started writing slower code; over time we pick up what we need as we go so it generally evens out well.

The "repeat with..." form isn't bad, and in predecessor dialects of this family of languages it was the only repeat form available, so LiveCode pretty much had to support it to remain true to its linguistic roots. But LiveCode has taken its language into many areas much more performant, and along with that the learning curve has gone up in proportion to the increased number of language options.

The Dictionary notes that "repeat for each..." is faster, but I agree it would be helpful to offer additional notes there about why it's faster. I've added that to my to-do list for the community effort underway soon to assist with the documentation overhaul in progress now.

In the meantime, anyone here is free to use the Dictionary's Comments feature to add notes that might be beneficial to newcomers.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

sritcp
Posts: 431
Joined: Tue Jun 05, 2012 5:38 pm

Re: How to Speed Up This Code?

Post by sritcp » Fri Jul 31, 2015 7:29 pm

FourthWorld wrote:.......... and being a list of contact info I'd understand if that can't be publicly shared. But if equivalent dummy data could be posted I'd be interested in seeing if I can get some time over the weekend to poke around in it.
Hi Richard:
It is public data. I have attached a zip file to an earlier message.

Regards,
Sri

sritcp
Posts: 431
Joined: Tue Jun 05, 2012 5:38 pm

Re: How to Speed Up This Code?

Post by sritcp » Fri Jul 31, 2015 7:39 pm

FourthWorld wrote:........ In the meantime, anyone here is free to use the Dictionary's Comments feature to add notes that might be beneficial to newcomers.
Would it be possible to link from each dictionary entry to a discussion of the finer points of that entry? This way, it wouldn't clutter up the dictionary, but let us drill deeper when necessary. The discussion would be different from forum posts such as here, in that it would focus solely on issues relating to the use of that entry.

Regards,
Sri

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: How to Speed Up This Code?

Post by mwieder » Fri Jul 31, 2015 7:59 pm

I'm surprised and delighted to find that the lineOffset approach is so fast (and I should have suspected that Thierry would come up with a blazingly-fast solution). I expected that lineOffset might be in the same ballpark as 'repeat for each', but not that it would be so much faster.

I do think that searching and sorting algorithms will probably have to be hand-crafted for individual situations (in the current case, the fax number is always three lines beneath the title), and there will be cases where only the 'repeat with i=...' form will be effective. Fortunately LiveCode offers many different ways to get a solution to any given problem, and there's no one-size-fits-all optimization. I think that comes from experience and also from the sort of group-think that we've just had here where we can all learn from each other and collectively hone the best approach.

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 8:14 pm

sritcp wrote:It is public data. I have attached a zip file to an earlier message.
Thanks. I'd missed that; very helpful to have the raw data.

Thierry's contributions here are always excellent, so it's not surprising to see he's once again come up with a very speedy solution.

But given how lineoffset and repeat for each work under the hood, I was surprised to find the former faster than the latter.

So now that I have the data set I put both his function and Mark Wieder's into a test harness that calls them as functions, and removes everything not essential to the task:

Code: Select all

local sData

on mouseUp
   put fld "RawData" into sData
   -- Test 1 - lineoffset:
   put the millisecs into t
   put rexSolution( "Superintendent") into r1
   put the millisecs - t into t1
   --  Test 2 - repeat for each:
   put the millisecs into t
   put altSolution( "Superintendent") into r2
   put the millisecs - t into t2
   -- Show times, and whether results match:
   put "Test 1 - lineoffset:   "& t1 &" ms" & \
         cr& "Test 2 - repeat for each: "& t2 &" ms" & \
         cr&"Results match: "& (r1 = r2)
end mouseUp

-- Test 1 - lineoffset:
function rexSolution searchFor
   local L2skip = 0
   repeat while true
      put lineOffset( searchFor, sData, L2skip) into N
      if N = 0 then exit repeat
      put line (N + 3 + L2skip) of sData &cr  after tFound
      add (N+3) to L2skip
   end repeat
   return tFound
end rexSolution

-- Test 2 - repeat for each
function altSolution searchFor
   put 1 into i
   repeat for each line tLine in sData
      if searchFor is in tLine then
         put line (i+3) of sData & cr after tFound
      end if
      add 1 to i
   end repeat
   return tFound
end altSolution
Here the results I get are:

Test 1 - lineoffset: 356 ms
Test 2 - repeat for each: 168 ms
Results match: true

It's possible that I over-edited those functions, but the results match and the speeds are pretty much in line with what I'd expect with each method.

I think the biggest difference between the results here and the results you'd tested earlier is that Mark's original code contained some additional UI updating stuff that Thierry's didn't. Those UI additions would be very helpful for larger data sets, but in general I find when data is known to be smaller than 1 MB we can often process it quickly enough in LiveCode that we don't even need a watch cursor. :)
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Simon
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Re: How to Speed Up This Code?

Post by Simon » Fri Jul 31, 2015 8:24 pm

Oh Man! I need a faster machine;
Test 1 - lineoffset: 533 ms
Test 2 - repeat for each: 226 ms
Results match: true

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 8:29 pm

sritcp wrote:I was under the impression that LiveCodeJournal was in suspended animation; the tutorials in there are of some vintage and the last blog entry is over a year old. Glad to know it is coming alive again!
https://www.youtube.com/watch?v=dGFXGwHsD_A
:)

Thanks for the kind words. Shortly after I'd updated LCJ to use LiveCode top-to-bottom in a new CMS, I got swamped with some interesting new projects that have been very involving. On top of the LC community work I've been engaged with, it's meant that LCJ has sat dormant longer than I'd prefer, horribly overdue for a new responsive template and completion of the backlog of articles I've been amassing notes for.

With any luck I'll be able to clear my plate to get back to that soon, and in the meantime I'd like to remind everyone here that LiveCode Journal has always been a community resource, so feel free to write me if you'd like to contribute articles or other resources you'd like to share there.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

sritcp
Posts: 431
Joined: Tue Jun 05, 2012 5:38 pm

Re: How to Speed Up This Code?

Post by sritcp » Fri Jul 31, 2015 8:43 pm

The surprises keep on coming!

I told myself that, while I am on a voyage of discovery, let me use one more thing I have learnt, i.e., arrays are faster than lists.
So I put the field into a list and "split" the list into an array. Here's the code:

Code: Select all

on mouseUp
   local tArray, i, tTime, tOutput
   put the milliseconds into tTime
   put field "RawData" into tArray
   split tArray by return
   put 1 into i
   repeat for each element tElement in tArray
      if tElement is "Special Education Director" then put tArray[i + 3] & return after tOutput
      add 1 to i
   end repeat
   put tOutput into field "Output"
   put the milliseconds - tTime into msg
end mouseUp
The result: 56 ms (My machine is old; a 2010 MBP17")

And to think that I started out with 540000 ms!

Regards,
Sri.

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 9:02 pm

Simon wrote:Oh Man! I need a faster machine;
Test 1 - lineoffset: 533 ms
Test 2 - repeat for each: 226 ms
Results match: true
Linux fanboy response: "It's not the hardware. Switch to Linux!" :)

Of course the only thing Linux has to do with it is it lets me choose my components. I ran the test on a custom build with a modest but satisfying G3220, Haswell dual-core @ 3GHz.

What CPU are you running?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 10045
Joined: Sat Apr 08, 2006 7:05 am
Contact:

Re: How to Speed Up This Code?

Post by FourthWorld » Fri Jul 31, 2015 9:44 pm

sritcp wrote:

Code: Select all

on mouseUp
   local tArray, i, tTime, tOutput
   put the milliseconds into tTime
   put field "RawData" into tArray
   split tArray by return
   put 1 into i
   repeat for each element tElement in tArray
      if tElement is "Special Education Director" then put tArray[i + 3] & return after tOutput
      add 1 to i
   end repeat
   put tOutput into field "Output"
   put the milliseconds - tTime into msg
end mouseUp
The result: 56 ms (My machine is old; a 2010 MBP17")
It's good for speed, but now we need to double-check it for accuracy, as the results are very different from the output of the first two. In my tests the output from this one is a subset of the others, and most of the extras in the first two are duplicate fax numbers, but not all.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Post Reply