itemDelimiter misbehaving

Post by richmond62 » Tue Oct 22, 2019 10:32 am

I have a text that consists of several sentences ending in periods ("."):

If I might offer any apology for so exaggerated a fiction as the
Barnacles and the Circumlocution Office, I would seek it in the
common experience of an Englishman, without presuming to mention the
unimportant fact of my having done that violence to good manners, in the
days of a Russian war, and of a Court of Inquiry at Chelsea. If I might
make so bold as to defend that extravagant conception, Mr Merdle, I
would hint that it originated after the Railroad-share epoch, in the
times of a certain Irish bank, and of one or two other equally
laudable enterprises. If I were to plead anything in mitigation of the
preposterous fancy that a bad design will sometimes claim to be a good
and an expressly religious design, it would be the curious coincidence
that it has been brought to its climax in these pages, in the days of
the public examination of late Directors of a Royal British Bank. But,
I submit myself to suffer judgment to go by default on all these counts,
if need be, and to accept the assurance (on good authority) that nothing
like them was ever known in this land.

Now I want to "chop it up" into its component sentences, so I set the itemDelimiter to "."
and run through it:

Code:

on mouseUp
   set the itemDelimiter to "."
   put 1 into KOUNTX
   repeat until item KOUNTX of fld "noCR" contains "XQX"
      put KOUNTX into fld "K1"
      put ((item KOUNTX of fld "noCR") & ".") into line KOUNTX of fld "CHOPPED"
      add 1 to KOUNTX
   end repeat
end mouseUp

My source field is called "noCR" (because I've stripped it of carriage returns) and my results field is called "CHOPPED".

What I end up with is rubbish:

If I might offer any apology for so exaggerated a fiction as the
If I might
If I were to plead anything in mitigation of the
But,

This would seem to suggest that as well as "." (or in spite of my setting), line-ends are being treated as itemDelimiters.

Post by Klaus » Tue Oct 22, 2019 11:06 am

Hi Richmond,

Code:

   repeat until item KOUNTX of fld "noCR" contains "XQX"

Field "noCR" does not contain the string XQX, at least not in your example text. 8)

richmond62 wrote:
...line-ends are being treated as itemDelimiters.

NOPE!

Code:

   ...
   set itemdel to "."
   answer the num of items of fld "noCR"
   ## -> 4, which is correct!
   ...

No idea what is happening in your script (and I really do not want to examine it, its "logic" is giving me a headache),
but why not do it this way, Captain Cumbersome?

Code:

on mouseUp
   set itemDel to "."
   ## Accessing variables is a thousandfold faster than accessing fields
   put fld "noCR" into tText

   ## Remove unwanted linebreaks first, but add a SPACE,
   ## since some lines have a CR instead of the necessary SPACE in them
   replace CR with " " in tText

   ## Now collect all items in a CR delimited list
   repeat for each item tItem in tText
      put tItem & "." & CR after tNewText
   end repeat

   ## Remove possible leading SPACES in text
   replace (CR & " ") with CR in tNewText
   put tNewText into fld "chop suey"
end mouseUp

Tested and works! :D

Best

Klaus

Post by richmond62 » Tue Oct 22, 2019 12:14 pm

Klaus wrote:
Field "noCR" does not contain the string XQX

No, it doesn't, because it is a small sample of some text that has already undergone another routine.

Post by Klaus » Tue Oct 22, 2019 12:32 pm

richmond62 wrote:
Tue Oct 22, 2019 12:14 pm
Field "noCR" does not contain the string XQX
No, it doesn't because it is a small sample of some text that has already undergone another routine.

Ah, OK, I was not sure.

I see what went wrong in your script!

Code:

   ...
   put ((item KOUNTX of fld "noCR") & ".") into line KOUNTX of fld "CHOPPED"
   ...

Your ITEMS contain CRs, so they overwrite text that is already in fld "CHOPPED".
You put a multi-line item into a single line of your field, so the line numbers get messed up
and you end up with the rubbish you showed in your first example.
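
To make the overwriting concrete, here is a minimal sketch. It is not taken from the original stack: the handler name demoLineOverwrite and all variable names are made up for illustration. The first part reproduces the problem with two tiny two-line values; the second part shows one possible fix, which is to flatten each item to a single line before storing it.

```livecode
on demoLineOverwrite
   local tOut, tText, tItem, tSentence, tFixed

   -- The problem: each value holds two lines but is stored "into line N"
   put "A1" & CR & "A2" into line 1 of tOut  -- tOut is now: A1, A2
   put "B1" & CR & "B2" into line 2 of tOut  -- replaces A2; tOut is now: A1, B1, B2

   -- One possible fix: flatten each item so it occupies exactly one line
   put "A1" & CR & "A2. B1" & CR & "B2." into tText
   set the itemDelimiter to "."
   repeat for each item tItem in tText
      put tItem into tSentence           -- work on a copy of the loop variable
      replace CR with space in tSentence
      -- word 1 to -1 trims leading and trailing whitespace
      put (word 1 to -1 of tSentence) & "." & CR after tFixed
   end repeat
   delete char -1 of tFixed              -- drop the trailing CR
   answer tFixed
end demoLineOverwrite
```

In the first part the "A2" line is lost, which is the same pattern as the truncated sentences in the first post: each new item lands on top of the tail of the previous one.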

Post by Thierry » Tue Oct 22, 2019 3:08 pm

Hi,

and for those having arthritis in their fingers:

Code:

on mouseUp
   put fld "noCR" into tText
   put replaceText(tText, "\s*\n\s*", space) into tText
   put replaceText(tText, "\.\s*", "." & cr)
end mouseUp

Almost the same result as Klaus's solution.
I'll let the reader find the difference...

Regards,

Thierry

Post by FourthWorld » Tue Oct 22, 2019 3:50 pm

With the addition of Unicode, LiveCode now supports sentence as a chunk type. See the Dictionary for details.

That'll not only simplify the code, but also make it more robust, as it uses the ICU rules to determine sentence boundaries, including more punctuation than just periods.
Richard Gaskin

Post by Klaus » Tue Oct 22, 2019 6:23 pm

sentence 1 of the_above_mentioned_example_text = If I might offer any apology for so exaggerated a fiction as the

So this wonderful fact does not really help very much here... 8)
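
The one-line result above is what one would expect if line breaks count as sentence boundaries, which, as far as I know, is how the ICU rules treat them. A hedged sketch of how the sentence chunk might still be used here, assuming LiveCode 7 or later and the field names from earlier in the thread: turn the line breaks into spaces first, then loop over the sentences.

```livecode
on mouseUp
   local tText, tSentence, tNewText
   put fld "noCR" into tText

   -- Line breaks appear to end a sentence chunk, so turn them into spaces first
   replace CR with space in tText

   repeat for each sentence tSentence in tText
      -- word 1 to -1 trims any whitespace around the chunk
      put (word 1 to -1 of tSentence) & CR after tNewText
   end repeat
   delete char -1 of tNewText  -- drop the trailing CR
   put tNewText into fld "CHOPPED"
end mouseUp
```

Unlike the itemDelimiter approach, nothing has to be appended to each sentence, because the terminating punctuation stays part of the chunk.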

Post by richmond62 » Tue Oct 22, 2019 6:32 pm

Of that I'm not sure, but one thing that I do wit is that it is a problem deselecting Don't wrap
in the Properties palette of some fields as it is instantly reselected.

Post by richmond62 » Tue Oct 22, 2019 6:43 pm

For the benefit of non-Scots speakers I will interleave my remarks with 'owersettans.'

Anither problem wis ma retairns: Ah wis pittin this:

A problem I was encountering was to do with carriage returns: I was scripting thus:

Code:

   put fld "SOURCE" into SALSA
   replace numToCodePoint(0x000d) with " " in SALSA
   put SALSA into fld "noCR"

and wis still wi a text wi retairns, bit eftir a day wi ma rid-gouns Ah kent it war better til dae this:

and I was still ending up with text that contained some type of carriage return, but after teaching my pupils I worked
out a better way to get at things:

Code:

   put fld "SOURCE" into SALSA
   replace numToCodePoint(0x000d) with " " in SALSA
   replace numToCodePoint(0x000a) with " " in SALSA
   put SALSA into fld "noCR"

and aal warkt!

and, Lo! everything made me quake with ecstatic hoo-hahs!

Fit aa those retairns winnae CR, monie o them wis EOL, an Ah wisnae that mensefu
i the forenoon. :D

As all those carriage returns were, in point of fact, not carriage returns, many of them were end-of-line signals, and I was
goofy and sleepy in the morning.


But that numToCodePoint(0x000a) wis aye yaisefu. 8)

But that numToCodePoint(0x000a) fixed my gaffe.

E'en if Ah wis a peedie bit glaikit.

Despite my goofiness.
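
The two replace lines can be wrapped into a small helper. This is only a sketch: the function name stripLineBreaks is hypothetical, and it adds one step that is not in the post above, namely handling a U+000D/U+000A pair first so that a Windows-style line ending becomes one space rather than two.

```livecode
function stripLineBreaks pText
   local tCR, tLF
   put numToCodePoint(0x000d) into tCR  -- carriage return, U+000D
   put numToCodePoint(0x000a) into tLF  -- line feed, U+000A

   -- Pairs first, so each CR+LF yields a single space
   replace (tCR & tLF) with space in pText
   replace tCR with space in pText
   replace tLF with space in pText
   return pText
end stripLineBreaks
```

It could be called as: put stripLineBreaks(fld "SOURCE") into fld "noCR". As far as I know, LiveCode's own CR and return constants stand for the line feed character (U+000A), which would explain why replacing only U+000D left line breaks behind.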

And for those of you who "need educating," look no further than here:


https://youtu.be/T_Lk7qivXbw. 8)
Last edited by richmond62 on Tue Oct 22, 2019 7:25 pm, edited 5 times in total.

Post by Klaus » Tue Oct 22, 2019 7:01 pm

Sorry, I don't speak Klingon! 8)

Post by richmond62 » Tue Oct 22, 2019 7:05 pm

Klaus wrote:
Sorry, I don't speak Klingon!

No, I don't suppose you do; nor, for that matter, do I.

But you are lucky enough to live in a State where your language is not despised.

Nor, for that matter, do you have another language imposed on you at school.

Imagine having Dutch imposed on you in school in Germany:

"Wat een emmer stront!"
Last edited by richmond62 on Tue Oct 22, 2019 7:28 pm, edited 1 time in total.

Post by Klaus » Tue Oct 22, 2019 7:18 pm

I really do not despise your language (Gaelic?), sorry if I sounded that way!
I was just making a joke about the fact that I do not understand it.

Post by richmond62 » Tue Oct 22, 2019 7:23 pm

Gaelic . . . no, not Erse (in either sense of the word).

That's a language spoken by people in the North and West of Scotland and is about as
closely related to Scots as Albanian is to German.

https://www.lallans.co.uk/

Post by Klaus » Tue Oct 22, 2019 7:31 pm

AHA! :D
Thank you.

Post by richmond62 » Tue Oct 22, 2019 9:09 pm

This will select lines of text that contain words with the prefix "un-".

It is, obviously, as yet, something pretty crude.

It should be made faster for starters.
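
The attached stack's script is not shown in the thread, so the following is not its code, only a sketch of one way such a filter might be written. The function name linesWithPrefix is hypothetical, and the trueWord chunk assumes LiveCode 7 or later. It uses "repeat for each" throughout, which is generally the fast way to walk through text held in a variable.

```livecode
function linesWithPrefix pText, pPrefix
   local tLine, tWord, tResult
   repeat for each line tLine in pText
      repeat for each trueWord tWord in tLine
         if tWord begins with pPrefix then
            put tLine & CR after tResult
            exit repeat  -- one match is enough for this line
         end if
      end repeat
   end repeat
   delete char -1 of tResult  -- drop the trailing CR
   return tResult
end linesWithPrefix
```

It could be called as: put linesWithPrefix(fld "noCR", "un") into fld "CHOPPED". Note that a plain prefix test also catches words such as "under" or "uncle", where "un" is not a prefix in the grammatical sense.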
Attachment: Text Chewer.livecode.zip
Here's the stack.