Need for Speed
Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller
Need for Speed
I’m generating large sets of integers (millions) which takes just a few seconds and then sorting them numerically (another couple seconds) and then deleting all the multiples which takes hours.
Repeat with zz=the number of lines of BigTemp down to 1 --delete duplicates
if line zz of BigTemp=line zz-1 of BigTemp then delete line zz-1 of BigTemp
end Repeat
How can the deletion process be speeded up? “Lines” deletes about twice as fast as “words”
Repeat with zz=the number of lines of BigTemp down to 1 --delete duplicates
if line zz of BigTemp=line zz-1 of BigTemp then delete line zz-1 of BigTemp
end Repeat
How can the deletion process be speeded up? “Lines” deletes about twice as fast as “words”
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
Can I ask a few questions first?
1. How are you generating this large list numbers?
2. What is the advantage of generating a list that can contain duplicates so you then have to 'weed the garden'?
1. How are you generating this large list numbers?
2. What is the advantage of generating a list that can contain duplicates so you then have to 'weed the garden'?
Re: Need for Speed
on mouseUp
put the milliseconds into timeOn
put empty into fld 1
repeat with xx=2 to 50000
put xx into zz
repeat until zz=1
if zz mod 2 =1 then
put ((zz*3)+1) into zz
put zz &return after BigTemp
else
put zz/2 into zz
put zz &return after BigTemp
end if
end repeat
end repeat
put BigTemp into fld 1
put the number of lines of BigTemp&return&return before fld 1
put (the milliseconds-timeOn)/1000&" seconds"
end mouseUp
This is part of an investigation into the Collatz numbers.
put the milliseconds into timeOn
put empty into fld 1
repeat with xx=2 to 50000
put xx into zz
repeat until zz=1
if zz mod 2 =1 then
put ((zz*3)+1) into zz
put zz &return after BigTemp
else
put zz/2 into zz
put zz &return after BigTemp
end if
end repeat
end repeat
put BigTemp into fld 1
put the number of lines of BigTemp&return&return before fld 1
put (the milliseconds-timeOn)/1000&" seconds"
end mouseUp
This is part of an investigation into the Collatz numbers.
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
Tangentially . . .
Why is SORT not very good?
- -
with this:
results in this:
- -
AND even I, with my limited knowledge of Mathematics, know that 989 is bigger than 99.
Why is SORT not very good?
- -
with this:
Code: Select all
on mouseUp
sort lines of fld "TANK" descending
end mouseUp
- -
AND even I, with my limited knowledge of Mathematics, know that 989 is bigger than 99.
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
So, over here [MacOS 13, 8 GB RAM] your script took just over 35 seconds to generate your list of numbers.
Re: Need for Speed
My iMac usually generates the list in under 10 seconds. A little over 5 million integers, it's kind of astonishing.
But that's all beside the point. How to eliminate duplicates in a similar timely fashion?
But that's all beside the point. How to eliminate duplicates in a similar timely fashion?
Re: Need for Speed
Firstly, the reason for the slow iteration through your loop is that in the "repeat with <index>" version the engine has to keep track of every one of those millions of lines or words and count through the list every time to reach the index.
Generally it is far better speedwise to use the "repeat for each <word/item/line>" version. You can't then directly manipulate the original list, so in the processing part you would normally make your checks and append correct or desired results to a new variable. In this case you might
Personally I find the best way of deduplicating a list is to add 1 to the contents of an array key where the keys are the values being deduplicated (works with numbers or text, whatever).
So when you have your zz variable, instead of appending it to BigTemp you wouldWhat that means is that the unique elements will have a value of 1, and the non unique ones will have a higher value but still only the one key. The values ate irrelevant, so you can just
Generally it is far better speedwise to use the "repeat for each <word/item/line>" version. You can't then directly manipulate the original list, so in the processing part you would normally make your checks and append correct or desired results to a new variable. In this case you might
Code: Select all
put empty into tPrevTemp
put empty into tNewBigTemp --not necessary if these are really transient variables
repeat for each line tTemp in BigTemp
if tTemp is not tPrevTemp then
. put tTemp & cr after tNewBigTemp
. put tTemp into tPrevTemp
end if
end repeat
--use tNewBigTemp in place of BigTemp instead
So when you have your zz variable, instead of appending it to BigTemp you would
Code: Select all
add 1 to BigTempArray[(zz)]
Code: Select all
put the keys of BigTempArray into tBigTemp
sort tBigTemp numeric ascending
Last edited by SparkOut on Sun Sep 11, 2022 12:08 am, edited 1 time in total.
-
- VIP Livecode Opensource Backer
- Posts: 10043
- Joined: Sat Apr 08, 2006 7:05 am
- Contact:
Re: Need for Speed
It would be interesting to see a comparison with the array option that used to be included as an example of finding unique values, e.g.:
Edit: Sparkout beat me to it.
Would be good to know results just the same.
Code: Select all
function UniqueVals pList
repeat for each line tLine in pList
put tLine into tA[tLine]
end repeat
get the keys of tA
sort lines of it numeric
return it
end UniqueVals

Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
Hmm: got a bit bogged down:
-
Code: Select all
on mouseup
put empty into fld "fLINES2"
put empty into fld "TANK2"
put fld "TANK" into XYZ
get DOUBLETROUBLE(XYZ)
put it after fld "TANK2"
put ((the number of lines in fld "TANK2") && "lines >>>") into fld "fLINES2"
end mouseup
function DOUBLETROUBLE WXYZ
split WXYZ by return and comma
combine WXYZ by return and comma
sort WXYZ numeric
return WXYZ
end DOUBLETROUBLE
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
Removed link as improved version below.
BE WARNED: On a Mac, once it has generated the Collatz numbers the stack BLOATS to about 89 MB.
BE WARNED: On a Mac, once it has generated the Collatz numbers the stack BLOATS to about 89 MB.

Last edited by richmond62 on Sun Sep 11, 2022 12:51 am, edited 1 time in total.
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
The commas are a curse.
This IS better (aesthetically at least, as it bungs a space at the end of each number rather than a comma):
-
UNFORTUNATELY one cannot do this:
This IS better (aesthetically at least, as it bungs a space at the end of each number rather than a comma):
-
Code: Select all
on mouseup
set the vis of img "COOKING" to true
put empty into fld "fLINES2"
put empty into fld "TANK2"
put fld "TANK" into XYZ
get DOUBLETROUBLE(XYZ)
put it after fld "TANK2"
put ((the number of lines in fld "TANK2") && "lines >>>") into fld "fLINES2"
set the vis of img "COOKING" to false
end mouseup
function DOUBLETROUBLE WXYZ
split WXYZ by return and space
combine WXYZ by return and space
sort WXYZ numeric
return WXYZ
end DOUBLETROUBLE
Code: Select all
function DOUBLETROUBLE WXYZ
split WXYZ by return and space
combine WXYZ by return
sort WXYZ numeric
return WXYZ
end DOUBLETROUBLE
-
- Livecode Opensource Backer
- Posts: 10078
- Joined: Fri Feb 19, 2010 10:17 am
Re: Need for Speed
Ah, got it at last . . .
-

Code: Select all
function DOUBLETROUBLE WXYZ
split WXYZ by return and space
combine WXYZ by return and null
sort WXYZ numeric
return WXYZ
end DOUBLETROUBLE
- Attachments
-
- Number Gen.livecode.zip
- Stack.
- (185.94 KiB) Downloaded 174 times
-
- VIP Livecode Opensource Backer
- Posts: 10043
- Joined: Sat Apr 08, 2006 7:05 am
- Contact:
Re: Need for Speed
Good call on using both split and combine. I was thinking of the old example for getting frequency counts, but that's not needed here.
The old scripting adage abides:
- Know the engine
- Trust the engine
- Use the engine
The old scripting adage abides:
- Know the engine
- Trust the engine
- Use the engine
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn