Having the last word

richmond62
Livecode Opensource Backer
Posts: 10099
Joined: Fri Feb 19, 2010 10:17 am

Post by richmond62 » Mon Aug 26, 2019 3:34 pm

Here's something I fell foul of for the first time today . . .

I had a field called "fDATA2" containing some text:

"A man who had been soaked in water, and smothered in mud, and"

as one does 8) , and wanted to chop its end off like this:

Code:

on mouseUp
   repeat until the last word of fld "fDATA2" is "water"
      if the last word of fld "fDATA2" is "water" then
         --do nix
      else
         delete the last word of fld "fDATA2"
      end if
      wait 20 ticks
   end repeat
end mouseUp

and "blow me down", but it emptied the whole field . . .

Why, forbye?

Because LiveCode did not 'see' the word "water"; it did, however, 'see' the word "water,", which
was a right pain in the bum because . . .

Any text analysis program I write to do this sort of thing will have to trawl its way through
a textField bunging spaces before any punctuation marks.
Last edited by richmond62 on Mon Aug 26, 2019 4:29 pm, edited 1 time in total.

bogs
Posts: 5480
Joined: Sat Feb 25, 2017 10:45 pm

Post by bogs » Mon Aug 26, 2019 3:49 pm

Why, forbye?
because with a comma attached, "water" =/= "water,"

Maybe instead of...

Code:

repeat until the last word of fld "fDATA2" is "water"
it should be

Code:

repeat until the last word of fld "fDATA2" contains "water"
(not tested) :D

Post by richmond62 » Mon Aug 26, 2019 3:50 pm

Probably . . . but while Thou wast being clever, I was mucking around like this:

Code:

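on mouseUp
   -- Copy fld "fDATA" into fld "fDATA2" one char at a time,
   -- putting a space between each punctuation mark and the adjacent word
   put empty into fld "fDATA2"
   put 1 into KOUNT
   repeat until char KOUNT of fld "fDATA" is empty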
      switch char KOUNT of fld "fDATA"
         case "," 
            put " ," after fld "fDATA2"
            break
         case "." 
            put " ." after fld "fDATA2"
            break
         case ";" 
            put " ;" after fld "fDATA2"
            break
         case ":" 
            put " :" after fld "fDATA2"
            break
         case "!" 
            put " !" after fld "fDATA2"
            break
         case "?" 
            put " ?" after fld "fDATA2"
            break
         case ")" 
            put " )" after fld "fDATA2"
            break
         case "(" 
            put "( " after fld "fDATA2"
            break
         default
            put char KOUNT of fld "fDATA" after fld "fDATA2"
      end switch
      add 1 to KOUNT
   end repeat
end mouseUp

Post by richmond62 » Mon Aug 26, 2019 3:53 pm

"contains" does work, BUT . . . it leaves a trailing punctuation mark.

Post by bogs » Mon Aug 26, 2019 4:01 pm

Yes (tested it myself), BUT you can always at the end either ...
z.) ditch punctuation or
y.) simply grab characters 1 to 5 of the last word (and go on from there).

Either would be better than case'ing it to death.
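
A minimal sketch of the "ditch punctuation" option, assuming the same field "fDATA2" as the original handler. The helper name stripTrailingPunctuation and its list of punctuation marks are hypothetical, chosen only for illustration. The loop also stops once the field has no words left, so a target that never matches cannot keep the handler running forever.

```livecode
-- Hypothetical helper: return pWord with any trailing punctuation removed.
function stripTrailingPunctuation pWord
   repeat while pWord is not empty and the last char of pWord is in ",.;:!?)"
      delete the last char of pWord
   end repeat
   return pWord
end stripTrailingPunctuation

on mouseUp
   -- Stop once no words remain, so a missing target cannot loop forever.
   repeat while the number of words of fld "fDATA2" > 0
      if stripTrailingPunctuation(the last word of fld "fDATA2") is "water" then
         -- Swap "water," for the bare word, then finish.
         put "water" into the last word of fld "fDATA2"
         exit repeat
      end if
      delete the last word of fld "fDATA2"
   end repeat
end mouseUp
```

Comparing the stripped word with "is" rather than "contains" avoids stopping early on longer words such as "waterfall".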

Post by richmond62 » Mon Aug 26, 2019 4:25 pm

case'ing it to death
You have no concept of what that involves . . . my
Devawriter Pro contains something in the order
of 1000 switch statements, each containing about 3000 cases. 8)

At present Devawriter Pro is (very slowly)
going through a 'rationalisation' process so it is NOT such a "deadly CASE." :D

Post by bogs » Mon Aug 26, 2019 4:29 pm

Ah, so like the title for this thread *should* have been ~
Raymond Burr as Perry Mason in "The case of the Devawriter" :P

Post by richmond62 » Mon Aug 26, 2019 4:31 pm

Not really: Indic writing systems don't feature commas . . .

. . . they are far, far more bizarre. 8)

jacque
VIP Livecode Opensource Backer
Posts: 7392
Joined: Sat Apr 08, 2006 8:31 pm

Post by jacque » Mon Aug 26, 2019 4:37 pm

For the original question, it would probably work if you use "trueword" instead of "word".
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
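
A rough sketch of how the "trueword" suggestion might look in the original handler, assuming a LiveCode version that supports the trueWord chunk type. The comparison uses trueWord 1 of the last word, so "water," is tested as "water". Deleting by word rather than by trueWord is a choice made here so that attached punctuation is removed along with its word; it is not something the thread prescribes.

```livecode
on mouseUp
   -- Stop once no words remain, so a missing target cannot loop forever.
   repeat while the number of words of fld "fDATA2" > 0
      -- trueWord 1 of "water," should be "water", without the comma.
      if trueWord 1 of the last word of fld "fDATA2" is "water" then
         exit repeat
      end if
      -- Delete the whole whitespace-delimited word, punctuation included.
      delete the last word of fld "fDATA2"
   end repeat
end mouseUp
```

As written, this leaves the comma on "water," in the field, just as the "contains" version does.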

Post by richmond62 » Mon Aug 26, 2019 4:42 pm

Thanks: I'll give "trueword" a go. :D

Post by richmond62 » Wed Aug 28, 2019 8:12 pm

I haven't managed to get round to playing with "trueword" yet, but I have had some
fairly dirty thoughts . . .

1. How Anglo-Centric is "trueword"?

Well, let's try it with "De'ath" (this is a Huguenot name).

And, "just for fun", let's try it with "Жалба,".

And, because I am a sadistic old so-and-so, "স্কুল" in the text:

"আমার নাম জন রিচমন্ড ম্যাথিউসন এবং আমি একজন স্কুলশিক্ষক যিনি বুলগেরিয়ায় থাকেন এবং কর্মরত।"

So . . . "trueword" works for "De'ath" and "Жалба,", but NOT for "স্কুল" (Bengali) because the word
is written with sandhi elision as "স্কুলশিক্ষক" where 'school' is elided with 'teacher'.

https://en.wikipedia.org/wiki/Sandhi

So, frankly, "trueword" is only sufficient for languages that employ European writing systems.

2. How good is trueword in texts that employ "different" punctuation systems?

Well, for starters the whole thing would be useless for 'scriptura continua.' Leonardo da Vinci would have laughed.

I wonder how far "trueword" would get with the Greek άνω τελεία (as I don't remember
any Greek from when I was at school I'm not going to make a complete fool of myself here)?

¿I wonder about Spanish?

Well, well, well, it did OK with "¿Cuánto cuesta esa alfombra?", removing the "¿" from "Cuánto".

That's impressive.

Last edited by richmond62 on Wed Aug 28, 2019 10:41 pm, edited 2 times in total.

Post by jacque » Wed Aug 28, 2019 10:04 pm

From the dictionary:
A trueWord is a word chunk, delimited by Unicode word breaks, as determined by the ICU Library. When there are no alphabetic or numeric characters between two word breaks, that string is not considered by LiveCode to be a trueWord.
The examples include Chinese and Russian, but I don't see any RTL languages there. ICU word breaks are defined here: http://userguide.icu-project.org/boundaryanalysis

Edit: A more detailed explanation: http://www.unicode.org/reports/tr29/#Word_Boundaries It talks about the difficulties for various languages. Hebrew, a RTL language, is apparently mostly compatible with the default rules but may require some special adjustments. Languages that do not normally include spaces or punctuation for word breaks present substantial problems.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
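
A small sketch for comparing the two chunk types on punctuation that has been spaced away from its words, as in the character-by-character handler earlier in the thread. The sample string and variable names are made up for illustration. The word count should be nine, since "," and "." are whitespace-delimited chunks of their own; going by the dictionary entry quoted above, the trueWord count should be lower because those two chunks contain no alphabetic or numeric characters.

```livecode
on mouseUp
   local tText, tWordCount, tTrueWordCount
   put "soaked in water , and smothered in mud ." into tText
   -- word chunks are whitespace-delimited, so "," and "." each count as a word.
   put the number of words of tText into tWordCount
   -- Per the dictionary entry, chunks with no letters or digits are not trueWords.
   put the number of trueWords of tText into tTrueWordCount
   answer "words:" && tWordCount & return & "trueWords:" && tTrueWordCount
end mouseUp
```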

Post by richmond62 » Mon Sep 02, 2019 7:46 pm

As 'trueword' cuts the mustard for the vast majority of writing systems
I wonder what the utility of 'word' is at all, and wonder whether it might not
be a good idea to transfer the functionality of 'trueword' to 'word' and then
remove 'trueword' altogether.

FourthWorld
VIP Livecode Opensource Backer
Posts: 10049
Joined: Sat Apr 08, 2006 7:05 am

Post by FourthWorld » Tue Sep 03, 2019 12:53 am

richmond62 wrote:
Mon Sep 02, 2019 7:46 pm
As 'trueword' cuts the mustard for the vast majority of writing systems
I wonder what the utility of 'word' is at all, and wonder whether it might not
be a good idea to transfer the functionality of 'trueword' to 'word' and then
remove 'trueword' altogether.

There was a long discussion about that on the Use LiveCode list back when Mark Waddingham was putting Unicode in place.

Many options were discussed, but in the end it was determined that if what we now call trueWord were used with the "word" chunk type, the impact on existing code would be vastly damaging.

So to preserve legacy code while allowing the new Unicode parsing, trueWord became its own chunk type.

And FWIW, after we all struggled with what to call this new chunk type, the winning suggestion came from yours truly. :)
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Post by richmond62 » Tue Sep 03, 2019 7:15 am

Perhaps 'd'être' is not a word!

truewordOffset("d'être","Ce n'est pas tant d'être riche qui fait le bonheur, c'est de le devenir.") -- returns 5
