
Ways to compare two or more strings

Posted: Mon May 27, 2013 1:32 am
by archer2009BUSknbj
Is there a way to compare two strings to see how many words they have in common?

for example

a="apples pears oranges"

b="apples sultanas strawberries pears"

So they have 2 words in common.

Is there a command that reports how many words are in common, or a simple way to work it out?

Re: Ways to compare two or more strings

Posted: Mon May 27, 2013 1:44 am
by sturgis
Since your lists are space separated, it should be easy to use word chunking to do the check.
Here is one way.

Code:

-- It would make sense to compare the word counts and iterate through the shorter list for speed.
-- In this case, just using list a since I know it's the shortest.
put "apples pears oranges" into a


put "apples sultanas strawberries pears" into b

set the wholeMatches to true -- so that "apple" will not match "apples"

-- go through each word contained in variable a. 
repeat for each word tWord in a
-- If the word we're looking for (tWord) is in variable b, wordOffset returns its location (word number) in b.
-- If it's 0 there was no match. If it's > 0 there was a match, so tack the found word onto the list.
 if wordOffset(tWord,b) > 0 then put tWord & comma after tCommon
end repeat
-- get rid of the trailing comma
delete the last char of tCommon
put tCommon -- put the list of matches into the msg box. 
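The original question asked for a count rather than a list. A minimal sketch along the same lines, assuming a hypothetical function name `commonWordCount` (not a built-in) and iterating over whichever list is shorter, as the comment above suggests:

```livecode
-- commonWordCount is a hypothetical helper, not part of LiveCode.
-- It counts the words of the shorter list that also occur in the longer one.
function commonWordCount pA, pB
   local tShort, tLong, tWord, tCount
   -- pick the shorter list to iterate through
   if the number of words of pA <= the number of words of pB then
      put pA into tShort
      put pB into tLong
   else
      put pB into tShort
      put pA into tLong
   end if
   set the wholeMatches to true -- so "apple" does not match "apples"
   put 0 into tCount
   repeat for each word tWord in tShort
      if wordOffset(tWord, tLong) > 0 then add 1 to tCount
   end repeat
   return tCount
end commonWordCount
```

Called as `put commonWordCount("apples pears oranges", "apples sultanas strawberries pears")`, this should give 2 for the example data. Note that a word repeated in the shorter list is counted once per repetition, so it only matches the "words in common" idea when each list has no duplicates.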

Re: Ways to compare two or more strings

Posted: Mon May 27, 2013 1:25 pm
by bn
Hi Sturgis,

I know you like this stuff...

This gave me the opportunity to test the new LiveCode split command variant "as set"

Code:

on mouseUp
   put "apples sultanas strawberries pears" into tData1
   put "apples pears oranges" into tData2
   split tData1 by space as set -- needs LC 6.0 to work
   split tData2 by space as set -- needs LC 6.0 to work
   intersect tData1 with tData2
   combine tData1 with space as set -- needs LC 6.0 to work
   put tData1
end mouseUp

Kind regards
Bernd

Re: Ways to compare two or more strings

Posted: Mon May 27, 2013 1:40 pm
by sturgis
Hmm. It works with 5.5.4 too! An undocumented feature apparently. Going to try it with 5.5.1... Ok, 5.5.1 doesn't work, 5.5.2 does. Didn't bother to check 5.5.3. I like it when new stuff shows up! Thx for pointing me the right way, sure makes it easy to do the comparison.

Re: Ways to compare two or more strings

Posted: Mon May 27, 2013 1:50 pm
by bn
I actually always wanted to try intersect....

Here is a version that works with all versions of LiveCode/Revolution:

Code:

on mouseUp
   put "apples sultanas strawberries pears" into tData1
   put "apples pears oranges" into tData2
   repeat for each word aWord in tData1
      add 1 to tArray1[aWord] -- this also counts the number of occurrences of each word; not used here
   end repeat
   repeat for each word aWord in tData2
      add 1 to tArray2[aWord] -- this also counts the number of occurrences of each word; not used here
   end repeat
   intersect tArray1 with tArray2
   put the keys of tArray1
end mouseUp

BTW I am not sure which version is faster; wordOffset is very, very fast. Arrays always have an overhead.

Kind regards
Bernd
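To get the number of common words out of the array approach, the keys left after `intersect` can simply be counted. A minimal sketch, assuming a hypothetical function name `commonWords` (not a built-in) that wraps the same steps:

```livecode
-- commonWords is a hypothetical helper, not part of LiveCode.
-- It returns the words found in both lists, one per line.
function commonWords pA, pB
   local tArrayA, tArrayB, tWord
   repeat for each word tWord in pA
      put true into tArrayA[tWord] -- only the key matters here
   end repeat
   repeat for each word tWord in pB
      put true into tArrayB[tWord]
   end repeat
   intersect tArrayA with tArrayB -- keeps only keys present in both arrays
   return the keys of tArrayA
end commonWords

on mouseUp
   local tCommon
   put commonWords("apples pears oranges", "apples sultanas strawberries pears") into tCommon
   put the number of lines of tCommon -- count of distinct common words
end mouseUp
```

Because array keys are unique, repeated words are counted only once here. The order of the returned keys is not guaranteed, so sort the result if the order matters.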

Re: Ways to compare two or more strings

Posted: Mon May 27, 2013 2:11 pm
by sturgis
Would need a much, much bigger set to check before any speed differences would matter. Also, checked the 5.5.2 release notes: not a word about "as set", yet it is there. Not that I'm complaining.

There was also a discussion on the list about "skip lists," I wonder if/how split lists could be applied to a problem like this.

EDIT: Ok, so 5.5.3 works, 5.5.2 fails. :) It must be morning.
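On the speed question raised above, a rough way to compare the two approaches is to time them on larger generated lists with `the milliseconds`. This is only a sketch: the list size (5000 words) and the generated word names are arbitrary assumptions, and results will vary by machine and LiveCode version.

```livecode
on mouseUp
   local tA, tB, tStart, tWord, tCommon, tArrayA, tArrayB, tReport, i
   -- build two larger lists; 5000 words each is an arbitrary choice
   repeat with i = 1 to 5000
      put "word" & i & space after tA
      put "word" & (i * 2) & space after tB
   end repeat

   -- time the wordOffset approach
   set the wholeMatches to true
   put the milliseconds into tStart
   repeat for each word tWord in tA
      if wordOffset(tWord, tB) > 0 then put tWord & comma after tCommon
   end repeat
   put "wordOffset:" && (the milliseconds - tStart) && "ms" & return into tReport

   -- time the array / intersect approach
   put the milliseconds into tStart
   repeat for each word tWord in tA
      put true into tArrayA[tWord]
   end repeat
   repeat for each word tWord in tB
      put true into tArrayB[tWord]
   end repeat
   intersect tArrayA with tArrayB
   put "intersect:" && (the milliseconds - tStart) && "ms" after tReport

   put tReport -- show both timings in the message box
end mouseUp
```

The wordOffset loop rescans the second list for every word, while the array version does one pass over each list, so the gap between them should widen as the lists grow.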