
FTP and accented characters in filenames [solved]

Posted: Thu Sep 28, 2017 11:57 am
by ittarter
Hello,

I have a set of filenames some of which have accented characters (typical Spanish and French accents), which my application can't find. It correctly downloads filenames without accents.

I was wondering if it's possible to use unicode somehow to query the FTP server differently, so that it's able to find the files?

Here's the code.

Code: Select all

     put x into FtpFileName
      if last char in FtpFileName is not in "abcdefghijklmnopqrstuvwxyz1234567890" then
         delete last char in FtpFileName --this is to remove carriage return character at the end of the filename as it's stored in my SQL database, never been a problem before
      end if
      put URL FtpFileName into url ("binfile:" & LocalFileName) --ACTUAL FILE COPY command
      put the result into tError
      if tError is not empty then
         put tError --no errors show up but it creates a 0 KB file with the correct filename including accents
      end if
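
One way to narrow down a 0 KB result is to fetch the file into a variable first, so that the result is read right after the FTP request and not after the local write. This is only a sketch under that assumption. The handler name downloadChecked and its two parameters are made up for illustration.

```livecode
-- Hypothetical handler: downloads pFtpUrl and writes it to pLocalFile
-- only if the transfer actually returned data.
on downloadChecked pFtpUrl, pLocalFile
   local tData, tError
   -- Fetch into a variable so the download error can be checked on its own
   put URL pFtpUrl into tData
   put the result into tError
   if tError is not empty or tData is empty then
      answer "Download failed for" && pFtpUrl & return & tError
      exit downloadChecked
   end if
   -- Write the bytes only after a non-empty download
   put tData into URL ("binfile:" & pLocalFile)
   if the result is not empty then
      answer "Could not write" && pLocalFile & return & the result
   end if
end downloadChecked
```

Called as downloadChecked FtpFileName, LocalFileName, this should avoid creating an empty local file when the server returns nothing for the requested name.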

Re: FTP and accented characters in filenames

Posted: Thu Sep 28, 2017 4:06 pm
by jmburnod
Hi ittarter,
I use this for URLs with accented characters:

Code: Select all

get "boîte"
  put urlEncode(unidecode(uniencode(it), "UTF8")) into tURL
Best regards
Jean-Marc

Re: FTP and accented characters in filenames

Posted: Fri Sep 29, 2017 3:55 pm
by ittarter
Hi Jean-Marc,

Thanks for the idea, but it isn't working yet. I've tried several variations (see below). The last one is the only one that produces any good files, but files with accented names are still empty.

I've double-checked my FTP server and the source files are definitely NOT empty...

If I store the urlEncode result in a variable, it looks something like this (here, for Española):
ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
But I have to do the url function as below, because urlencode doesn't result in a string that my FTP server can read. Maybe urlencode is just for HTTP?

Code: Select all

      --put uniencode(FtpFileName,"UTF8") into FtpFileName
      --put uniencode(LocalFileName,"UTF8") into LocalFileName
      --put urlEncode(unidecode(uniencode(FtpFileName), "UTF8")) into FtpFileName
      --put urlEncode(unidecode(uniencode(LocalFileName), "UTF8")) into LocalFileName
      put URL unidecode(uniencode(FtpFileName), "UTF8") into url ("binfile:" & LocalFileName)
EDIT: url FtpFileName (after uniencode and unidecode) puts a long, unintelligible string into the variable that I can't copy out. In Notepad, Notepad++ and this forum's post field it shows up as
‰PNG

Don't know if this is useful information, but there it is.

Re: FTP and accented characters in filenames

Posted: Fri Sep 29, 2017 4:31 pm
by jacque
The unicode functions have been deprecated, the new way is to use textEncode and textDecode. You encode the file name into UTF 8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF 16 unless you convert it.

An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.

Code: Select all

if last char of FtpFileName is cr then delete last char of FtpFileName 
And then:

Code: Select all

put textEncode(FtpFileName, "UTF8") into FtpFileName
The file will be in UTF 8 after download, so to read it in LC use textDecode to convert it to UTF 16 before working with it.
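
A minimal sketch of that round trip, assuming LC 7 or later (where textEncode and textDecode exist). The handler name showUtf8RoundTrip is hypothetical. It applies to text only, not to binary content such as audio files.

```livecode
-- Hypothetical handler: encodes a name to UTF-8 bytes and decodes it again.
on showUtf8RoundTrip
   local tName, tBytes, tBack
   put "Español.mp3" into tName
   -- textEncode converts LiveCode text into UTF-8 bytes (binary data)
   put textEncode(tName, "UTF8") into tBytes
   -- textDecode converts UTF-8 bytes back into LiveCode text
   put textDecode(tBytes, "UTF8") into tBack
   -- ñ takes two bytes in UTF-8, so the byte count exceeds the character count
   answer "chars:" && (the number of chars in tName) & return & \
         "bytes:" && (the number of bytes in tBytes) & return & \
         "round trip ok:" && (tBack is tName)
end showUtf8RoundTrip
```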

Re: FTP and accented characters in filenames

Posted: Fri Sep 29, 2017 5:06 pm
by ittarter
LC 8.1.2, by the way.
jacque wrote:The unicode functions have been deprecated, the new way is to use textEncode and textDecode. You encode the file name into UTF 8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF 16 unless you convert it.

Code: Select all

put textEncode(FtpFileName, "UTF8") into FtpFileName
No, this doesn't change anything. Files without special characters are good, files with special characters are 0 KB. Here's the string that results from textencode:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3
And here's my code

Code: Select all

put URL textEncode(FtpFileName, "UTF8") into url ("binfile:" & LocalFileName)
I can put the textencode as a separate step but it doesn't change the result.

Sigh. Still stuck.

An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.
Good idea, code updated, thanks!
The file will be in UTF 8 after download, so to read it in LC use textDecode to convert it to UTF 16 before working with it.
They're mp3 files so I don't think I have to encode them to UTF 16.

Re: FTP and accented characters in filenames

Posted: Fri Sep 29, 2017 5:10 pm
by MaxV
ittarter wrote: ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
You don't have to urlEncode everything, just the parts with non-ASCII chars, so

Code: Select all

put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlencode("Española.mp3")) into myAudio
If the name or password contains non-ASCII chars, urlEncode them too! :D
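
A hedged sketch of per-segment encoding, for anyone who wants to try it. The function names ftpEncodeSegment and ftpEncodePath are hypothetical helpers, not part of LiveCode. The sketch assumes the server stores pathnames as UTF-8 and that the FTP library decodes %XX escapes in the path. Neither assumption is confirmed in this thread.

```livecode
-- Hypothetical helper: percent-encodes one path segment from its UTF-8 bytes.
function ftpEncodeSegment pSegment
   local tEncoded
   -- Convert to UTF-8 bytes first, then percent-encode those bytes
   put urlEncode(textEncode(pSegment, "UTF8")) into tEncoded
   -- urlEncode writes a space as "+"; a path usually wants "%20" instead.
   -- A literal "+" in the name has already been escaped by urlEncode.
   replace "+" with "%20" in tEncoded
   return tEncoded
end function

-- Hypothetical helper: encodes every segment but keeps the "/" separators.
function ftpEncodePath pPath
   local tSegment, tResult
   set the itemDelimiter to "/"
   repeat for each item tSegment in pPath
      put ftpEncodeSegment(tSegment) & "/" after tResult
   end repeat
   delete last char of tResult
   return tResult
end function
```

Usage would look like put URL ("ftp://name:password@mycoolwebsite/" & ftpEncodePath("Conversations/Spanish 0101/Vocab/Española.mp3")) into myAudio. Encoding each segment separately keeps the slashes intact, which a single urlEncode over the whole path would turn into %2F.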

Re: FTP and accented characters in filenames

Posted: Fri Sep 29, 2017 5:44 pm
by jacque
You're right, mp3 files won't need encoding. That makes all the difference.

So follow MaxV's advice and urlEncode the file name.

Re: FTP and accented characters in filenames

Posted: Sat Sep 30, 2017 6:35 am
by shaosean
Here is the RFC for Internationalization of the File Transfer Protocol: <https://tools.ietf.org/html/rfc2640>. Section 3 has information about pathnames.

Re: FTP and accented characters in filenames

Posted: Sat Sep 30, 2017 10:52 am
by ittarter
MaxV wrote: You don't have to urlEncode everything, just the parts with non-ASCII chars, so

Code: Select all

put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlencode("Española.mp3")) into myAudio
This does not change the result, unfortunately. Filenames containing special characters yield 0 KB files, and filenames that do not contain special characters are fine.

My code is:

Code: Select all

set the itemdel to "/"
put item 1 to -2 of FtpFileName & "/" & textEncode(item -1 of FtpFileName, "UTF8") into x
set the itemdel to comma
put URL x into url ("binfile:" & LocalFileName)
The resulting url is:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3

Re: FTP and accented characters in filenames

Posted: Sat Sep 30, 2017 4:19 pm
by jacque
You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.

Re: FTP and accented characters in filenames

Posted: Sun Oct 01, 2017 5:49 pm
by ittarter
jacque wrote:You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.
I believe that the line

Code: Select all

put URL xxx into url ("binfile:" & LocalFileName)
does the same as

Code: Select all

put URLencode(xxx) into url ("binfile:" & LocalFileName)
In any case I've done both, and neither works.

For the latter (using urlencode), the resulting urlencode is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3

As before, it results in a 0 KB file when the filename has special characters, and there are no problems when the filename doesn't have special characters.

Re: FTP and accented characters in filenames

Posted: Mon Oct 02, 2017 1:04 pm
by MaxV
ittarter wrote:For the latter (using urlencode), the resulting urlencode is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3
Dear ittarter,
FTP usually needs a user name and password, so the URL must be something like:
ftp://name:password@someSite/path/to/the/file

If you don't start it with exactly "ftp://", LiveCode can't know what you need.

So your code:

Code: Select all

put URLencode(xxx) into url ("binfile:" & LocalFileName)
is wrong.

use

put URL ("ftp://name:password@site" & urlencode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)

Re: FTP and accented characters in filenames

Posted: Mon Oct 02, 2017 3:07 pm
by ittarter
MaxV wrote: FTP usually needs a user name and password, so the URL must be something like:
ftp://name:password@someSite/path/to/the/file
Yes, but if my servername and password were wrong, then the files without special characters would not successfully download either. But, as I've said in each of my posts, they download no problem. I'm simply writing "xxx" as a place-holder for servername and password, but in the real code, I'm using my real servername and password.

It's ONLY the files with special characters in their filenames that don't successfully download.

The code is

Code: Select all

set the itemdel to "/"
put "ftp://" & item 1 of FtpFileName & "/" & textEncode(item 2 to -1 of FtpFileName, "UTF8") into FtpFileName
set the itemdel to comma
put URL FtpFileName into url ("binfile:" & LocalFileName)
A sample FtpFileName that DOESN'T download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/Un país.mp3
A sample FtpFileName that DOES download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/12.mp3

I don't understand why one works and the other does not...

Re: FTP and accented characters in filenames

Posted: Tue Oct 03, 2017 3:01 am
by shaosean
Your code still shows textEncode and not urlEncode. Perhaps try changing that, as mentioned previously.

Re: FTP and accented characters in filenames

Posted: Tue Oct 03, 2017 8:03 am
by ittarter
Shaosean,

I need both textencode(FTPfilename) AND URL FTPfilename to get ANY files to correctly transfer.

If I eliminate textencode OR if I use URLencode in place of URL FTPfilename, nothing is correctly transferred.

For example, if I adapt this (which MaxV advised earlier) to my server, just using url and urlencode,

Code: Select all

put URL ("ftp://name:password@site" & urlencode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)
it results in all 0 KB files. And anyway, in the notes for urlencode, it says it's for HTTP servers only.

Anyway, thanks for all your support, guys, nothing has worked yet for filenames with special characters...