revZipEnumerateItems mistake with diacritical chars

Posted: Wed Jan 09, 2013 7:02 pm
by jmburnod
It seems revZipEnumerateItems doesn't like diacritical characters, and it seems necessary to use urlEncode(MyfileName) on the file names to keep them.

I tested it with two small zip files (LC 5.02)

• Zip 1
Names of the files in the zip (not urlEncoded):
"août.png"
"affûter.png"

revZipEnumerateItems returns:
Problucirc/affuÃÇter.png
__MACOSX/Problucirc/._affuÃÇter.png
Problucirc/AouÃÇt.png
__MACOSX/Problucirc/._AouÃÇt.png

Names of the files on disk:
"affuÃÇter.png"
"AouÃÇt.png"

• Zip 2
Names of the files in the zip (urlEncoded):
"aff%9Eter.png"
"_ao%9Et.png"

revZipEnumerateItems returns:
ucircencoded/aff%9Eter.png
__MACOSX/ucircencoded/._aff%9Eter.png
ucircencoded/ao%9Et.png
__MACOSX/ucircencoded/._ao%9Et.png

Names of the files on disk:
"aff%9Eter.png"
"ao%9Et.png"

Do you consider this a bug or not?

Best regards
Jean-Marc

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Wed Jan 09, 2013 8:42 pm
by Mark
Hi Jean-Marc,

No, this isn't a bug. The file names in your file are stored as UTF8:

     put uniDecode(uniEncode("AouÃÇt","UTF8"))
     -- Août
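
A minimal sketch of applying the same conversion to every name in an archive, assuming the archive really stores its names as UTF-8. The handler name listZipItemsDecoded is hypothetical (it is not a built-in), and error handling is left out:

```livecode
-- Hypothetical helper: list the items of a zip archive with their
-- names converted from UTF-8 to the native encoding.
function listZipItemsDecoded pArchivePath
   local tRawNames, tName, tDecoded
   revZipOpenArchive pArchivePath, "read"
   put revZipEnumerateItems(pArchivePath) into tRawNames
   revZipCloseArchive pArchivePath
   repeat for each line tName in tRawNames
      -- same UTF-8 -> UTF-16 -> native round trip as in the one-liner above
      put uniDecode(uniEncode(tName, "UTF8")) & return after tDecoded
   end repeat
   delete the last char of tDecoded -- drop the trailing return
   return tDecoded
end listZipItemsDecoded
```

It could be called as put listZipItemsDecoded(tZipPath), where tZipPath is assumed to hold the full path of the zip file.
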
Kind regards,

Mark

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Thu Jan 10, 2013 12:38 am
by jmburnod
Hi Mark,
"The file names in your file are stored as UTF8"
Why? I expected the same result as when I decompress the zip with the Finder.

Kind regards
Jean-Marc

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Thu Jan 10, 2013 12:56 am
by Mark
Hi Jean-Marc,

Did you try to compress files in the Finder and decompress them with LiveCode? Did you get a result different from if you compress and decompress them with LiveCode only?

Kind regards,

Mark

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Thu Jan 10, 2013 10:44 am
by jmburnod
Hi Mark,
"Did you try to compress files in the Finder and decompress them with LiveCode?"
Yes.
"Did you get a result different from if you compress and decompress them with LiveCode only?"
Yes. revZipper keeps the original names "août" and "affûter".

How can I tell whether a zip was compressed with the Finder or with LiveCode?

Kind regards
Jean-Marc

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Sat Jan 12, 2013 11:03 pm
by Mark
Hi Jean-Marc,

Take the 7th and 8th bytes of the zip file, which hold the 16-bit general purpose bit flag of the first entry, and look at bit 11 of that value, counting bits from 0 (that's bit 3 of the 8th byte, mask 0x08, because the value is stored low byte first). If this bit is 0, the file names are probably Latin 1 encoded but may be MacRoman encoded. If the bit is 1, the file names are UTF-8 encoded.

(If you look at the ZIP specs, you will probably read 6th and 7th byte, but that's because the specs start counting at 0 rather than 1. You can find the specs at http://qery.us/3dl and search for "UTF-8").
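
A minimal sketch of this check, assuming the file begins with the local file header of its first entry (the usual layout). The function name zipNamesFlaggedUTF8 is hypothetical, and only the first entry is inspected:

```livecode
-- Hypothetical helper: returns true if the first entry of the zip has the
-- UTF-8 file name flag set, false if not, and empty if the file cannot be
-- read or does not start with a local file header.
function zipNamesFlaggedUTF8 pZipPath
   local tHeader, tHighByte
   open file pZipPath for binary read
   if the result is not empty then return empty
   read from file pZipPath for 8
   put it into tHeader
   close file pZipPath
   -- a local file header starts with the bytes "P", "K", 3, 4
   set the caseSensitive to true
   if char 1 to 4 of tHeader is not ("PK" & numToChar(3) & numToChar(4)) then return empty
   -- the 8th byte is the high byte of the little-endian flag word,
   -- so bit 11 of the word is the bit with value 8 in this byte
   put charToNum(char 8 of tHeader) into tHighByte
   return (tHighByte bitAnd 8) is not 0
end zipNamesFlaggedUTF8
```

For example, put zipNamesFlaggedUTF8(tZipPath) with tZipPath holding the full path of the archive. A result of false only tells you that the flag is not set; it does not say which encoding was actually used.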

Kind regards,

Mark

Re: revZipEnumerateItems mistake with diacritical chars

Posted: Sun Jan 13, 2013 11:00 am
by jmburnod
Hi Mark,
Thanks.
I'll come back with results as soon as I've learned to swim in this new swimming pool :D
Kind regards

Jean-Marc