Getting question marks instead of Arabic
Hello,
I am trying to save Arabic XML content from the internet to a file, but I get question marks in place of the Arabic Unicode characters. I am using LiveCode 7.0.
My commands:
put URL "domain name/content" into ss
put textDecode(ss,"UTF-8") into url "file:/Users/maradi/Desktop/arabictest.txt"
I appreciate the help.
Re: Getting question marks instead of Arabic
It may be that because Unicode is a binary (non-ASCII) format you'll need to use "binfile:" where you have "file:".
If that works then we have a philosophical point to discuss with the engine team: now that v7 aims to support Unicode transparently, should we consider making the "file" specifier as Unicode-savvy as everything else?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Getting question marks instead of Arabic
Hi Richard,
I tried binfile: and got the same result: all the Arabic is replaced by ?????. As for your question, my answer is yes.
Thanks for the help.
Re: Getting question marks instead of Arabic
Hello,
This workaround did it, using a field as an intermediate:
put URL "http://www.xxx.com/xxx/rss" into field 1
put textDecode(field 1,"UTF-8") into field 1
put textEncode(field 1,"UTF-8") into url "file:/Users/maradi/Desktop/arabictest.txt"
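A possible reading of why this works: the decisive step is probably the textEncode call before the write rather than the field itself. The first attempt wrote the result of textDecode straight to the file, so the text was likely converted to the platform's native encoding, which cannot represent Arabic. Below is a minimal sketch of the same steps without the field, assuming the fetched data arrives as raw UTF-8 bytes. The handler name and its parameters are hypothetical; only textDecode, textEncode, and the "binfile:" specifier come from this thread.

```livecode
-- Hypothetical handler: saveFeedAsUTF8, pFeedURL and pOutputPath are
-- illustrative names, not part of LiveCode or of the original script.
command saveFeedAsUTF8 pFeedURL, pOutputPath
   local tRawBytes, tText

   -- Fetch the feed (assumed here to arrive as raw UTF-8 bytes).
   put URL pFeedURL into tRawBytes

   -- Decode the UTF-8 bytes into LiveCode's internal text representation.
   put textDecode(tRawBytes, "UTF-8") into tText

   -- Encode back to UTF-8 explicitly and write the bytes unchanged,
   -- so no conversion to the native encoding can take place.
   put textEncode(tText, "UTF-8") into URL ("binfile:" & pOutputPath)
end saveFeedAsUTF8
```

It could be called as saveFeedAsUTF8 "http://www.xxx.com/xxx/rss", "/Users/maradi/Desktop/arabictest.txt". I have not verified this on 7.0, so treat it as something to try rather than a confirmed fix.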
Re: Getting question marks instead of Arabic
FourthWorld wrote:
It may be that because Unicode is a binary (non-ASCII) format you'll need to use "binfile:" where you have "file:".
If that works then we have a philosophical point to discuss with the engine team: now that v7 aims to support Unicode transparently, should we consider making the "file" specifier as Unicode-savvy as everything else?
I'm pretty sure they've considered that, wouldn't it break all existing scripts?
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
Re: Getting question marks instead of Arabic
Read Fraser's explanation on the blog: http://livecode.com/blog/2014/04/02/exa ... ting-text/
Near the end he gives this example:
put url("binfile:input.txt") into tInputEncoded
put textDecode(tInputEncoded, "UTF-8") into tInput
…
put textEncode(tOutput, "UTF-8") into tOutputEncoded
put tOutputEncoded into url("binfile:output.txt")
If you are using the open file/socket/process syntax, you can have the conversion done for you:
open tFile for utf-8 text read
Unfortunately, the URL syntax does not offer the same convenience. It can, however, auto-detect the correct encoding to use in some circumstances: when reading from a file URL, the beginning of the file is examined for a “byte order mark” that specifies the encoding of the text. It also uses the encoding returned by the web server when HTTP URLs are used. If the encoding is not recognised, it assumes the platform’s native text encoding is used. As the native encodings do not support Unicode, it is usually better to be explicit when writing to files, etc.
EDIT: Oh, never mind, I see you're already doing that. I think I'd write to support and see if they think there's a problem somewhere.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com