OK. I give up (Word delimiting)
Re: OK. I give up
Richard.
Yes, changing my parsing thinking from "the number of words" to "the number of trueWords" makes this go away.
Thanks.
Craig
Re: OK. I give up
Spoke too soon.
The trueWord keyword carries, I suppose, a certain amount of Unicode, er, baggage. Asking for either the number of words or the number of trueWords in the following string:
Code: Select all
(2 Lenses @ ") [L: 250
gives 4 in both cases. Asking for trueWord 4 gives 250. Asking for word 4 gives ") [L: 250. Something is trumping the very real spaces in that string. That snippet:
Code: Select all
") [L:
although it contains spaces, is seen both as a single word and a single trueWord. I am trying, in this case, to parse out the string "). It seems that simple spaces just do not cut it. I would have thought that the first task of trueWords is to always yield to spaces.
It is the quote that is the center of this issue.
Craig
-
- VIP Livecode Opensource Backer
- Posts: 10052
- Joined: Sat Apr 08, 2006 7:05 am
- Contact:
Re: OK. I give up
trueWord uses the natural language rules now available to us in the IBM Unicode library to parse out what are usually true words.
This generally works well when parsing strings containing natural language.
But it seems of little or no value when parsing strings of arbitrary characters not at all like natural language.
If you want to parse by spaces only, maybe set the itemDel to space and parse by items.
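A minimal sketch of that itemDel approach, using the string from earlier in the thread. The handler name spacePieces is made up for illustration, and skipping empty items is an assumption about how runs of spaces should be treated:

```livecode
-- Hypothetical helper: returns the space-delimited pieces of pText, one per line.
function spacePieces pText
   local tPieces
   -- the itemDelimiter is local to this handler and resets when it exits
   set the itemDelimiter to space
   repeat for each item tPiece in pText
      -- runs of spaces produce empty items; skip them
      if tPiece is empty then next repeat
      put tPiece & return after tPieces
   end repeat
   delete the last char of tPieces -- drop the trailing return
   return tPieces
end spacePieces

on mouseUp
   local tText
   put "(2 Lenses @" && quote & ") [L: 250" into tText
   put line 4 of spacePieces(tText) -- the two characters ")
end mouseUp
```

Items take no notice of quotes, so the unmatched quote does not swallow the rest of the string the way it does with word.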
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: OK. I give up
Richard.
I thought I was the only one odd enough to set the itemDel to space.
But in this case, I need to catch certain character strings, and drill into them.
I will find a workaround, but there still remains a glitch in either the way we think about words, or the way the dictionary describes them.
Craig
Re: OK. I give up
It seems the only limitation of the Dictionary is that it doesn't discuss the edge case of no closing quote. If you can turn up how the HC team described that and include it in the report, that'll help them avoid one more creative writing exercise.
But for the implementation itself, "word" seems very conformant with the Mother Tongue. And when we need something beyond what HC could do, we now also have trueWord.
Richard Gaskin
Re: OK. I give up
You can create your own function for whatever you intend a word to be.
Just use repeat for each char... and with switch / case you can create any combination.
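As a rough sketch of that idea (the handler name plainWords and the choice of space, tab, and return as delimiters are assumptions; adjust the case list to taste):

```livecode
-- Hypothetical helper: returns the pieces of pText, one per line,
-- splitting on space, tab, and return and treating quotes as ordinary characters.
function plainWords pText
   local tWords, tCurrent
   repeat for each char tChar in pText
      switch tChar
         case space
         case tab
         case return
            -- a delimiter ends the piece being collected, if there is one
            if tCurrent is not empty then
               put tCurrent & return after tWords
               put empty into tCurrent
            end if
            break
         default
            -- everything else, quotes included, belongs to the current piece
            put tChar after tCurrent
      end switch
   end repeat
   -- flush the final piece when the text does not end with a delimiter
   if tCurrent is not empty then put tCurrent & return after tWords
   delete the last char of tWords -- drop the trailing return
   return tWords
end plainWords
```

Removing the tab case would give HC-style delimiting, where only spaces and returns separate words.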
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
Re: OK. I give up
Richard.
The most recent (!) HC Script Language guide gives spaces and returns as delimiters. Note that tabs are not mentioned, and for good reason: in HC, tabs do not delimit words.
I will file a report to QCC complaining about the single quote issue, and see what they say.
FWIW, I simply changed the quote in the string to ASCII 210, and all is well. That character looks just fine when giving a measurement in inches.
Craig
Re: OK. I give up
I think I missed something. I thought the issue was about a string with an opening quote but no closing quote. How did tabs enter into this?
Richard Gaskin
Re: OK. I give up
Richard,
No, you had mentioned:
"But for the implementation itself, "word" seems very conformant with the Mother Tongue."
I tried all this in HC, discovered it also has the single quote malaise, and, unlike LC, HC does not support tabs as word delimiters. So HC is different, is all I meant. And the fact that LC includes tabs as word delimiters ought to have broken some HC stacks that were ported over.
Craig
Re: OK. I give up
Figures, I finally get back to this, and Craig answered his own question
Well, I'm posting a pic of my homework ANYWAY, just BECAUSE
Re: OK. I give up
Filed report # 21513
Craig
Re: OK. I give up
Bug confirmed.
Craig
Re: OK. I give up
So, bug confirmed eh? Good job !
I added a 3rd box and eliminated any quotes; the results turned out quite different, as you probably already knew, but closer to what I would expect them to be.
Re: OK. I give up
Thanks. I don't mind seeing this changed; it's such an edge case I don't mind either way. But IIRC there was an earlier post confirming that the behavior we see in LC matches HC's behavior - is that correct? If so, I wonder if some old code may break as a result of improving this beyond what HC did.
Richard Gaskin
Re: OK. I give up
Richard.
It may be a marginal issue, but if LC considers tabs to be word delimiters and HC does not, the number of words in a particular processed string may give unexpected results. I am constantly taking strings with tabs and counting words. HC would not have condoned that.
Craig