revZipEnumerateItems mistake with diacritical chars

jmburnod
VIP Livecode Opensource Backer
Posts: 2729
Joined: Sat Dec 22, 2007 5:35 pm

Post by jmburnod » Wed Jan 09, 2013 7:02 pm

It seems revZipEnumerateItems doesn't like diacritical characters, and it seems necessary to use urlEncode(MyfileName) to keep them.

I tested it with two small zips (LC 5.02)

• Zip 1
Name of files in the zip (not urlEncoded):
"août.png"
"affûter.png"

revZipEnumerateItems returns:
Problucirc/affuÃÇter.png
__MACOSX/Problucirc/._affuÃÇter.png
Problucirc/AouÃÇt.png
__MACOSX/Problucirc/._AouÃÇt.png

Name of files on the disk:
"affuÃÇter.png"
"AouÃÇt.png"
• Zip 2
Name of files in the zip (urlEncoded):
"aff%9Eter.png"
"_ao%9Et.png"

revZipEnumerateItems returns:

ucircencoded/aff%9Eter.png
__MACOSX/ucircencoded/._aff%9Eter.png
ucircencoded/ao%9Et.png
__MACOSX/ucircencoded/._ao%9Et.png

Name of files on the disk:
"aff%9Eter.png"
"ao%9Et.png"
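
The %9E in the urlEncoded names is the MacRoman byte value of "û", so those names contain only ASCII characters and pass through the zip unchanged. A minimal sketch of that round trip, written in Python rather than LiveCode. It assumes urlEncode on a Mac works on the MacRoman bytes of the name, which is what the listing above suggests; the helper names are hypothetical.

```python
from urllib.parse import quote, unquote


def mac_url_encode(name):
    """Percent-encode a file name using its MacRoman byte values (assumption)."""
    return quote(name, safe="", encoding="mac_roman")


def mac_url_decode(encoded):
    """Reverse mac_url_encode: turn %XX escapes back into MacRoman characters."""
    return unquote(encoded, encoding="mac_roman")


encoded = mac_url_encode("aff\u00fbter.png")  # "affûter.png"
print(encoded)                   # aff%9Eter.png
print(mac_url_decode(encoded))   # affûter.png
```

The encoded name is plain ASCII, so it does not matter which encoding the zip tool assumes for file names.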
Do you consider this a bug or not?

Best regards
Jean-Marc
https://alternatic.ch

Mark
Livecode Opensource Backer
Posts: 5150
Joined: Thu Feb 23, 2006 9:24 pm

Post by Mark » Wed Jan 09, 2013 8:42 pm

Hi Jean-Marc,

No, this isn't a bug. The file names in your file are stored as UTF8:

     -- uniEncode reads the bytes as UTF8; uniDecode converts the result to the native encoding
     put uniDecode(uniEncode("AouÃÇt","UTF8"))
     -- Août
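
The same repair can be sketched outside LiveCode. The garbled name is what UTF-8 bytes look like when they are displayed as MacRoman, and the "û" is stored as "u" followed by a combining circumflex. The Python sketch below is an illustration under those assumptions, not the LiveCode method above, and the function name is hypothetical.

```python
import unicodedata


def repair_macroman_mojibake(name):
    """Undo UTF-8 bytes that were displayed as MacRoman text."""
    raw = name.encode("mac_roman")   # recover the original byte values
    text = raw.decode("utf-8")       # read those bytes as UTF-8
    # Compose "u" + combining circumflex into the single character "û".
    return unicodedata.normalize("NFC", text)


mojibake = "Aou\u00c3\u00c7t"  # displays as AouÃÇt
print(repair_macroman_mojibake(mojibake))  # prints Août
```
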
Kind regards,

Mark
The biggest LiveCode group on Facebook: https://www.facebook.com/groups/livecode.developers
The book "Programming LiveCode for the Real Beginner"! Get it here! http://tinyurl.com/book-livecode

Post by jmburnod » Thu Jan 10, 2013 12:38 am

Hi Mark,
"The file names in your file are stored as UTF8"
Why?
I expected the same result as when I decompress the zip with the Finder.

Kind regards
Jean-Marc

Post by Mark » Thu Jan 10, 2013 12:56 am

Hi Jean-Marc,

Did you try to compress files in the Finder and decompress them with LiveCode? Did you get a result different from if you compress and decompress them with LiveCode only?

Kind regards,

Mark

Post by jmburnod » Thu Jan 10, 2013 10:44 am

Hi Mark,
"Did you try to compress files in the Finder and decompress them with LiveCode?"
Yes.
"Did you get a result different from if you compress and decompress them with LiveCode only?"
Yes. revZipper keeps the original names "août" and "affûter".

How can I tell whether a zip was compressed with the Finder or with LiveCode?

Kind regards
Jean-Marc

Post by Mark » Sat Jan 12, 2013 11:03 pm

Hi Jean-Marc,

Take the 7th and 8th bytes of the zip file and read them together as one little-endian 16-bit value. Look at bit 11 of that value, counting from 0 (that's bit 3 of the 8th byte, mask 0x08). If this bit is 0, the file names are probably Latin 1 encoded but may be MacRoman encoded. If the bit is 1, the file names are UTF8 encoded.

(If you look at the ZIP specs, you will probably read 6th and 7th byte, but that's because the specs start counting at 0 rather than 1. You can find the specs at http://qery.us/3dl and search for "UTF-8").
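
A minimal sketch of this check, in Python rather than LiveCode. It only looks at the first entry of the archive and assumes the file starts with a local file header; the function name is hypothetical. The flag reports what the archiver declared, and an archiver is not obliged to set it, so a cleared bit is a hint and not proof.

```python
import struct

UTF8_FLAG = 0x0800  # bit 11 of the general purpose bit flag


def first_entry_uses_utf8(zip_bytes):
    """Return True or False for the UTF-8 flag of the first entry,
    or None if the data does not start with a local file header."""
    if len(zip_bytes) < 8 or zip_bytes[:4] != b"PK\x03\x04":
        return None
    # The 16-bit flag is little-endian in bytes 7 and 8 (offsets 6 and 7).
    (flags,) = struct.unpack_from("<H", zip_bytes, 6)
    return bool(flags & UTF8_FLAG)


# Hand-built 8-byte headers for illustration only.
print(first_entry_uses_utf8(b"PK\x03\x04\x14\x00\x00\x08"))  # True
print(first_entry_uses_utf8(b"PK\x03\x04\x14\x00\x00\x00"))  # False
```

For a real archive, read the first 8 bytes of the file in binary mode and pass them to the function.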

Kind regards,

Mark

Post by jmburnod » Sun Jan 13, 2013 11:00 am

Hi Mark,
Thanks.
I'll come back with results as soon as I've learnt to swim in this new swimming pool :D
Kind regards

Jean-Marc