Getting question marks instead of Arabic

Aradim · Post by **Aradim** » Sun Oct 19, 2014 7:11 pm

Hello,

I am trying to save Arabic XML content from the internet to a file, but I get question marks in place of the Arabic Unicode characters. I am using LiveCode 7.0.

My commands:

put URL "domain name/content" into ss

put textDecode(ss,"UTF-8") into url "file:/Users/maradi/Desktop/arabictest.txt"

I appreciate the help.

FourthWorld · Post by **FourthWorld** » Sun Oct 19, 2014 8:04 pm

It may be that because Unicode is a binary (non-ASCII) format you'll need to use "binfile:" where you have "file:".

If that works then we have a philosophical point to discuss with the engine team: now that v7 aims to support Unicode transparently, should we consider making the "file" specifier as Unicode-savvy as everything else?

Aradim · Post by **Aradim** » Sun Oct 19, 2014 8:38 pm

Hi Richard,

I tried "binfile:" and got the same result: all the Arabic is replaced by ?????. As for your question, the answer is yes.

Thanks for the help.

Aradim · Post by **Aradim** » Sun Oct 19, 2014 9:33 pm

Hello,

This workaround did it, using a field as an intermediate:

put URL "http://www.xxx.com/xxx/rss" into field 1
put textDecode(field 1,"UTF-8") into field 1
put textEncode(field 1,"UTF-8") into url "file:/Users/maradi/Desktop/arabictest.txt"

jacque · Post by **jacque** » Mon Oct 20, 2014 8:55 pm

FourthWorld wrote: It may be that because Unicode is a binary (non-ASCII) format you'll need to use "binfile:" where you have "file:".

If that works then we have a philosophical point to discuss with the engine team: now that v7 aims to support Unicode transparently, should we consider making the "file" specifier as Unicode-savvy as everything else?

I'm pretty sure they've considered that. Wouldn't it break all existing scripts?

jacque · Post by **jacque** » Mon Oct 20, 2014 9:03 pm

Read Fraser's explanation on the blog: http://livecode.com/blog/2014/04/02/exa ... ting-text/

Near the end he gives this example:

put url("binfile:input.txt") into tInputEncoded
put textDecode(tInputEncoded, "UTF-8") into tInput
…
put textEncode(tOutput, "UTF-8") into tOutputEncoded
put tOutputEncoded into url("binfile:output.txt")

If you are using the open file/socket/process syntax, you can have the conversion done for you:

open tFile for utf-8 text read

Unfortunately, the URL syntax does not offer the same convenience. It can, however, auto-detect the correct encoding to use in some circumstances: when reading from a file URL, the beginning of the file is examined for a “byte order mark” that specifies the encoding of the text. It also uses the encoding returned by the web server when HTTP URLs are used. If the encoding is not recognised, it assumes the platform’s native text encoding is used. As the native encodings do not support Unicode, it is usually better to be explicit when writing to files, etc.

EDIT: Oh, never mind, I see you're already doing that. I think I'd write to support and see if they think there's a problem somewhere.
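For anyone landing here later, the "be explicit when writing" advice in the quoted passage might look roughly like the sketch below. It is only an illustration, not an official recipe: the handler name writeUTF8File and its parameters are made up for this example, and it assumes pText already holds correctly decoded Unicode text (for instance, the contents of a field that displays the Arabic properly).

```livecode
-- Hypothetical helper: writes pText to pPath as UTF-8 bytes.
-- Assumes pText is already valid Unicode text inside LiveCode 7.
command writeUTF8File pText, pPath
   local tBytes
   -- textEncode converts LiveCode's internal text into raw UTF-8 bytes
   put textEncode(pText, "UTF-8") into tBytes
   -- "binfile:" writes those bytes unchanged; "file:" is meant for text
   -- in the platform's native encoding, which cannot represent Arabic
   put tBytes into URL ("binfile:" & pPath)
   -- the result is empty on success, otherwise it holds an error string
   return the result
end writeUTF8File
```

It could then be called along these lines, with the output path being just an example:

```livecode
writeUTF8File field 1, specialFolderPath("desktop") & "/arabictest.txt"
if the result is not empty then answer "Could not write file:" && the result
```

The point of encoding first is that the question marks usually appear when Unicode text is converted to the native encoding on the way out, so handing the URL already-encoded bytes sidesteps that conversion.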
