word breaks in Russian Unicode text

Post by sp27 » Sat May 14, 2011 6:30 am

Step 4 in Devin Asay's article on Unicode at /spaces/lessons/buckets/1412/lessons/20441-Unicode "moves" two Russian words from one field to another:

set the unicodeText of fld "other" to word 1 to 2 of the unicodeText of fld "this"

I can reproduce his experience with the words that he used, but when I use the complete Russian alphabet, I find that the word breaks are not where I expect them. For instance, word 1 to 2 of this string:

АБВГДЕЖ ЗИЙКЛМНОПРСТУ ФХЦЧШЩЪЫЬЭЮЯ

should contain 7 letters in word 1 and 13 letters in word 2. However, Devin's command, when applied to this text in field "this", results in this display in field "other":

АБВГДЕЖ ЗИЙКЛМНОП

As you can see, the second word is cut off just before its 10th letter, Р. That character is just another letter of the alphabet, not punctuation or anything like that.

With this string of lower case letters in field "this":

абв где жзи йкл мно прс туф хцч шщъ ыьэ юя

the script

set the unicodeText of fld "other" to word 1 to 2 of the unicodeText of fld "this"

produces the expected result:

абв где

but if I change "word 1 to 2" to be "word 2 to 3", the display in field "other" is in Chinese characters:
㌀㐄㔄 㘀㜄㠄

Am I doing something terribly wrong? If anyone is working with Unicode, can they share their experience or perhaps confirm my findings?

This is with LC 4.6.1, Windows 7 64-bit.

Thanks,

Slava

Re: word breaks in Russian Unicode text

Post by Mark » Sat May 14, 2011 10:06 am

Hi Slava,

Looks like Devin made a small mistake.

set the unicodeText of fld "other" to word 1 to 2 of the unicodeText of fld "this"

should be

set the unicodeText of fld "other" to the unicodeText of word 1 to 2 of fld "this"

Best,

Mark
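
A likely explanation for both symptoms, offered as a sketch and not as a statement about LiveCode internals: in the original form the word chunk seems to be applied to the raw UTF-16 bytes of the unicodeText, so any byte equal to 0x20 counts as a space. Р is U+0420, which a little-endian machine stores as the bytes 20 04, so word 2 ends right before it. A real space is stored as 20 00, so every word after the first starts on the stray 00 byte, and the misaligned byte pairs decode as CJK characters. The Python below (not LiveCode, chosen only so the bytes are easy to inspect) imitates that behaviour. `byte_words` is a hypothetical helper, and little-endian UTF-16 is an assumption based on the Windows 7 setup mentioned above.

```python
def byte_words(text, first, last):
    """Imitate a byte-oriented 'word first to last' chunk on UTF-16LE data.

    Assumption: only the byte 0x20 is treated as a word delimiter.
    """
    raw = text.encode("utf-16-le")

    # Record (start, end) byte offsets of every run of non-0x20 bytes.
    spans = []
    start = None
    for i, b in enumerate(raw):
        if b == 0x20:
            if start is not None:
                spans.append((start, i))
                start = None
        elif start is None:
            start = i
    if start is not None:
        spans.append((start, len(raw)))

    # Take everything from the start of word `first` to the end of word `last`.
    chunk = raw[spans[first - 1][0]:spans[last - 1][1]]

    # Drop a dangling odd byte so the chunk can be read as 16-bit units again.
    chunk = chunk[:len(chunk) - len(chunk) % 2]
    return chunk.decode("utf-16-le", errors="replace")


upper = "АБВГДЕЖ ЗИЙКЛМНОПРСТУ ФХЦЧШЩЪЫЬЭЮЯ"
lower = "абв где жзи йкл мно прс туф хцч шщъ ыьэ юя"

r_bytes = "Р".encode("utf-16-le")      # b'\x20\x04': the low byte looks like a space
cut_short = byte_words(upper, 1, 2)     # "АБВГДЕЖ ЗИЙКЛМНОП"
looks_fine = byte_words(lower, 1, 2)    # "абв где"
misaligned = byte_words(lower, 2, 3)    # starts U+3300 U+3404 U+3504: CJK code points
```

Under the same assumption, the corrected form asks the field for its words first and only then converts the result to unicodeText, so the chunk boundaries never fall inside a two-byte character.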