Yet Another Unicode Question

WaltBrown
Posts: 466
Joined: Mon May 11, 2009 9:12 pm

Yet Another Unicode Question

Post by WaltBrown » Wed Aug 20, 2014 2:31 pm

I have a field with long lists of button labels. Here are the first two lines:
0.0; Languages; Available command languages; Bulgarian; Croatian; Czech; Danish; Dutch; English; Estonian; Finnish; French; German; Greek; Hungarian; Icelandic; Irish; Italian; Latvian; Lithuanian; Macedonian; Maltese; Norwegian Bokmål; Norwegian Nynorsk; Polish; Portuguese; Raeto-Romance Ladin; Raeto-Romance Surmiran; Raeto-Romance Sursilvan; Raeto-Romance Rumantsch Grischun; Romanian; Russian; Slovak; Slovene; Spanish; Swedish; Turkish;
1.1; Yes; Confirm operation; Да; Da,Izvrši; Ano; Ja,Udfør;Ja;Yes,Confirm;Jah,Kinnita;Jatka,Kyllä;OK,Oui;Ja,OK,Ausführen;Επιβεβαίωση; Igen,Oké;Staðfesta;Cinnte;Sì,Confermo;Jā;Taip;Да,Продолжи;Iva;Ja;Ja;Tak;Sim;Schi; Ea;Gie;Gea;Da;Да; Áno;Da,Potrdi,Ja;Sí,Confirmar;Ja,OK;Onayla;
The language list (0.0) is my selector for changing some of the button labels in my stacks based on the ETSI European Language selection. The second line (1.1) is one of a large number (74 in command language release 2.1.1) of command names in the aforementioned languages. The UTF8 text cut and pasted nicely into the field (and into this window) and allowed formatting, changing punctuation, etc. But when I select one of the items and set a button's label to it, it generally fails for the languages with non-Latin characters (totally for the Cyrillic-based ones and Greek, and partially for Turkish). I tried variations of useUnicode, unicodeLabel, etc. I believe the issue is twofold. One part is my ignorance of the specific details of the Unicode implementation in LC versions. The documents seem to say I can use UTF16 on button labels? I tried setting the unicodeLabels of the buttons but that also failed miserably, giving me a wide array of ideograms instead. Is there a UTF8 to UTF16 conversion method?

The other problem, obviously not in LC's domain, is the ETSI language list has almost no correlation to the Unicode language list, but that's for solution in ETSI TC HF, not here :-)

Thanks,
Walt Brown
Omnis traductor traditor
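
On the closing question: converting UTF8 to UTF16 amounts to decoding the UTF8 bytes to text and re-encoding that text as UTF16. The sketch below uses Python rather than LiveCode, purely to make the bytes visible. It assumes little-endian UTF16 (an assumption about the platform, not something stated in the post) and uses the Bulgarian entry "Да" from line 1.1. It also shows one plausible source of the ideograms: UTF8 bytes read as if they were already UTF16.

```python
# Illustration in Python, not LiveCode. Assumes little-endian UTF-16.
label = "Да"  # Bulgarian entry from line 1.1

utf8_bytes = label.encode("utf-8")  # four bytes: D0 94 D0 B0

# UTF-8 bytes misread as UTF-16: each byte pair becomes one unrelated character.
misread = utf8_bytes.decode("utf-16-le")

# Proper conversion: decode as UTF-8 first, then encode as UTF-16.
utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-le")

print([hex(ord(ch)) for ch in misread])  # code points in the CJK and Hangul ranges
print(utf16_bytes.hex())
```

The misread characters are valid Unicode, which is why nothing reports an error; the label simply shows the wrong glyphs.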

endernafi
VIP Livecode Opensource Backer
Posts: 296
Joined: Wed May 02, 2012 12:23 pm

Re: Yet Another Unicode Question

Post by endernafi » Wed Aug 20, 2014 4:51 pm

Hi there Walt,

Using Unicode in Livecode is always tricky.
Generally, this simple method should work, assuming that the selected file is saved with UTF8 encoding.

Code: Select all

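on mouseUp
    answer file "Select Label File"
    if it is empty then exit mouseUp -- the dialog was cancelled
    -- read the UTF8 file and convert its contents to UTF16 for the label
    set the unicodeLabel of me to uniEncode(url("file:" & it), "utf8")
end mouseUp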
I've tested the above code as follows:

* Put all your sample labels into a txt file.
** The editor should be the simplest one, I suggest Sublime Text or Notepad++ respectively for Mac or Windows.
** The editor's *Encoding Option for Save* should be UTF8.

* Read that file via url("file:" ...) function.

* Encode it via uniEncode( ..., "UTF8").

* Put it into a button with set the unicodeLabel of ..., or into a field with set the unicodeText of ...

Here is the result (or proof that it works):
Screen Shot 2014-08-20 at 18.39.43.jpg
That's the safest way both for desktop and mobile.
Use a separate UTF8 encoded resource file for your text.

Btw, I don't know who translated your labels;
but using *OK* or *Ja* for Turkish is totally, utterly, completely wrong.
Being a professional translator for 8 whole years, I can suggest these for Turkish:
OK, Confirm -> Tamam, Onayla
OK, Confirm Operation -> Tamam, İşlemi Onayla
Yes, Confirm -> Evet, Onayla
Yes, Confirm Operation -> Evet, İşlemi Onayla


Hope it helps...

Best,

~ Ender
~... together, we're smarter ...~
__________________________________________

macOS Sierra • LiveCode 7 & xCode 8

FourthWorld
VIP Livecode Opensource Backer
Posts: 10052
Joined: Sat Apr 08, 2006 7:05 am

Re: Yet Another Unicode Question

Post by FourthWorld » Wed Aug 20, 2014 5:56 pm

endernafi wrote:Using Unicode in Livecode is always tricky.
Hopefully that should be past tense, to read "Unicode in LiveCode prior to v7 was always tricky".

It would be helpful if Walt had a little time to see if what he wants to do can be done easily and gracefully in v7 (dp10 was just released this morning):
http://downloads.livecode.com/livecode/

As with all software at all times (and slightly more so since v7 is still in testing), you'll want to make sure you have good backups before using it.

But the sweeping changes in v7 will make it the reference platform for the future, so even those folks not depending on Unicode would do well to work with it as much as possible, to ensure that it works flawlessly when the final build is released.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Post by endernafi » Wed Aug 20, 2014 6:03 pm

FourthWorld wrote: Hopefully that should be past tense, to read "Unicode in LiveCode prior to v7 was always tricky".
Hi Richard,

Well no, it's not past tense since v7 isn't officially released; it's still Developer Preview as you've stated, too.
But you have a point, of course.
Hopefully, v7 will change my sentence to this:
Using Unicode was tricky; thanks to Livecode 7.0, it's like a breeze now.
8)


~ Ender

Post by FourthWorld » Wed Aug 20, 2014 6:49 pm

True, v7 isn't final yet, but it is available and needs testing.

If we put off testing until after release, we'll have made the one choice that guarantees v7 won't have been adequately tested before release. :)

Given the powerful scope of changes in v7, the value of testing v7 now can't be overstated if we want to be able to rely on the final version when it's released.

Post by WaltBrown » Wed Aug 20, 2014 7:01 pm

Dang, I just extracted all the Turkish and posted it here, but my message didn't post!

Ender, thanks. Check http://www.google.com/url?sa=t&rct=j&q= ... 1344,d.aWw for the actual language selections. The OK and Ja you saw were Swedish, just before the Turkish entry for Onayla.

Your suggestions helped somewhat but not completely. Maybe it's Windows. I may wait for LC7 rather than waste time debugging code that will be deprecated anyway.

Best, Walt

Post by WaltBrown » Wed Aug 20, 2014 7:06 pm

Oh and by the way Ender, I like your signature - "Together we are smarter". True. I have a somewhat more cynical version - "Organizations get smarter proportional to the square root of the number of members"...

Post by WaltBrown » Wed Aug 20, 2014 7:17 pm

Here are two snapshots of my stack, the English and Turkish buttons (You can see the Unicode issue).
EnglishSnapshot.png
TurkishSnapshot.png
Edit: Except the last button, this was a quickie and I lost the last entry so it put the button name instead.
Last edited by WaltBrown on Wed Aug 20, 2014 7:21 pm, edited 2 times in total.

Post by endernafi » Wed Aug 20, 2014 7:17 pm

WaltBrown wrote:Ender, thanks. Check thatLongUrl for the actual language selections.
I've read all Turkish sections; it's actually a pretty good translation.
So, my apologies to the translator in her/his absence 8)

WaltBrown wrote:I may wait for LC7 rather than waste time debugging code that will be deprecated anyway.
That sounds like a fair decision, the trouble of debugging deprecated code and all...

WaltBrown wrote:Your suggestions helped somewhat but not completely. Maybe it's Windows.
It may be Windows or some other thing.
It's kinda hard to explain the *thing*.
Let me try, though:
If the Unicode text passes through an internal channel and is not handled properly, it may be corrupted.
Example?
Putting the content of the resource file into a custom property, then reading it back from that custom property.
This is a delicate process and should be handled carefully.
There may be similar situations that corrupt the Unicode text.

Anyway, if it's not urgent, you should wait for, or start trying, LiveCode 7.

WaltBrown wrote:"Organizations get smarter proportional to the square root of the number of members"
Very true 8)


Regards,

Post by endernafi » Wed Aug 20, 2014 9:01 pm

Walt hi,

You got me curious so I've installed Windows 7 on a VM and tried my proposed solution:
Screen Shot 2014-08-20 at 22.50.13.jpg
Clearly your problem is a bit quirky.
Please try this stack and the accompanying *labels* file:
Walt_UnicodeTest.zip
If it doesn't produce the result shown in the screenshot above,
then you'll certainly know the problem is related to your particular Windows installation.
If it does work as expected, then you can dig into your code.


Best,

~ Ender

Post by WaltBrown » Thu Aug 21, 2014 5:20 am

Thanks Ender, that was a good starting point. Your stack works perfectly. I could also do:

Code: Select all

set the unicodeLabel of me to uniEncode(item 9 of url("file:" & it), "utf8")
I can also get the proper field display with:

Code: Select all

set the unicodeText of fld "fFileData" to uniEncode(url("file:" & it),"utf8")
I can also get the chunk into a local variable. This works:

Code: Select all

put item 9 of url("file:" & it) into tChunk
set the unicodeLabel of me to uniEncode(tChunk, "utf8")
It fails when I need to transfer a chunk of the field, in a variable or directly, to set the label. The following all fail:

Code: Select all

set the unicodeLabel of me to uniEncode(item 9 of fld "fFileData", "utf8")

set the unicodeLabel of me to uniEncode(item 9 of the unicodeText of fld "fFileData", "utf8")

// This one SHOULD work!
put item 9 of fld "fFileData" into tChunk
set the unicodeLabel of me to uniEncode(tChunk, "utf8")

put uniEncode(item 9 of fld "fFileData","utf8") into tChunk
set the unicodeLabel of me to uniEncode(tChunk, "utf8")

put item 9 of the unicodeText of fld "fFileData" into tChunk
set the unicodeLabel of me to uniEncode(tChunk,"utf8")

put uniEncode(item 9 of the unicodeText of fld "fFileData","utf8") into tChunk
set the unicodeLabel of me to uniEncode(tChunk,"utf8")

put uniEncode(item 9 of the unicodeText of fld "fFileData","utf8") into tChunk
set the unicodeLabel of me to tChunk
Arrrrgh! I should have put this away hours ago, when I decided to wait for LC7. I have a hard time dropping problems though. It seems I cannot get the data into and out of a field without it getting massaged somehow.

Post by WaltBrown » Thu Aug 21, 2014 5:40 am

I think I finally got it. This seems to work:

Code: Select all

put url("file:" & it) into tChunk
set the unicodeText of fld "fFileData" to uniEncode(tChunk,"utf8")
set the unicodeLabel of me to the unicodeText of item 9 of fld "fFileData"
I will go back to the original app and see if I can get the labels correct now, without using an external text file. I still can't put a chunk of a field into a variable; I'll need to pass around the chunk descriptor instead.
Walt

Post by endernafi » Thu Aug 21, 2014 10:40 am

WaltBrown wrote:

Code: Select all

// This one SHOULD work!
put uniEncode(item 9 of fld "fFileData","utf8") into tChunk
set the unicodeLabel of me to uniEncode(tChunk, "utf8")
...
put uniEncode(item 9 of the unicodeText of fld "fFileData","utf8") into tChunk
set the unicodeLabel of me to tChunk
Walt,

You're encoding the chunk twice;
that is, you're encoding an already encoded text.
That's not the way.
You have to decode it first.
Try this instead:

Code: Select all

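-- tRawText holds the UTF8 text read from the file
set the unicodeText of fld "fFileData" to uniEncode(tRawText, "utf8")
-- convert the field's UTF16 content back to UTF8 before chunking
put uniDecode(the unicodeText of fld "fFileData", "utf8") into tChunk
-- encode only the selected item for the label
set the unicodeLabel of me to uniEncode(item 9 of tChunk, "utf8")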
Here is the result:
Screen Shot 2014-08-21 at 12.37.54.jpg
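
Why encoding twice breaks the text can be sketched outside LiveCode. The Python below is only a model: uni_encode_utf8 and uni_decode_utf8 are hypothetical stand-ins for a "UTF8 in, UTF16 out" conversion and its reverse, not LiveCode's own functions, and little-endian UTF16 is assumed. The second pass treats the UTF16 bytes as if they were UTF8, so every byte is widened again.

```python
# Model in Python, not LiveCode. Both helpers are hypothetical stand-ins.
def uni_encode_utf8(data: bytes) -> bytes:
    """Interpret data as UTF-8 and return it as little-endian UTF-16."""
    return data.decode("utf-8", errors="replace").encode("utf-16-le")

def uni_decode_utf8(data: bytes) -> bytes:
    """Interpret data as little-endian UTF-16 and return it as UTF-8."""
    return data.decode("utf-16-le", errors="replace").encode("utf-8")

raw = "Onayla;Да".encode("utf-8")   # UTF-8 text as read from the file

once = uni_encode_utf8(raw)         # correct UTF-16
twice = uni_encode_utf8(once)       # encoded again: each UTF-16 byte treated as a character
round_trip = uni_encode_utf8(uni_decode_utf8(once))  # decode first, then encode

print(len(once), len(twice))        # the double-encoded data is twice as long
print(round_trip == once)           # decoding before re-encoding restores the data
```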

Best,

~ Ender

Post by WaltBrown » Thu Aug 21, 2014 3:46 pm

Thanks! That did the trick. I was manipulating chunks of the data in the file after putting it into local variables, assuming I could leave them UTF8 encoded. That almost worked. Decoding the entire field contents into a local variable, manipulating, then re-encoding the selected label after manipulating the contents worked. Once I moved from the examples we have been trading into the actual app, I had to crawl through the execution path, and remove ALL intermediate uniDecode/uniEncode steps. I can only imagine that, once UTF16 or UTF32 dependent text is involved, I will have to leave the data uniEncoded during the chunking and manipulation processes and trust LC7 to be consistent.
Thanks again.
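
One plausible reason chunking the encoded data went wrong, sketched in Python rather than LiveCode: in UTF16 the delimiter itself occupies two bytes, so splitting the raw bytes on a one-byte ";" leaves a stray zero byte at the front of every later piece. The sample line, the ";" delimiter and little-endian byte order are assumptions made for the illustration.

```python
# Illustration in Python, not LiveCode. Assumes little-endian UTF-16 and ";" as delimiter.
line = "Ja,OK;Onayla;Да"
utf16 = line.encode("utf-16-le")

# Splitting the encoded bytes on the one-byte delimiter leaves the delimiter's
# second (zero) byte at the start of the next piece, so later pieces are misaligned.
byte_pieces = utf16.split(b";")

# Splitting the decoded text first, then encoding only the chosen item, stays aligned.
items = utf16.decode("utf-16-le").split(";")
label_utf16 = items[1].encode("utf-16-le")

print([len(p) for p in byte_pieces])    # odd lengths mark the misaligned pieces
print(label_utf16.decode("utf-16-le"))
```

This matches the order that worked above: decode the whole text, pick the item, then encode only that item for the label.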

Post by endernafi » Thu Aug 21, 2014 7:01 pm

You're most welcome,
glad that I could help...

~ Ender