FTP and accented characters in filenames [solved]
Posted: Thu Sep 28, 2017 11:57 am
by ittarter
Hello,
I have a set of filenames some of which have accented characters (typical Spanish and French accents), which my application can't find. It correctly downloads filenames without accents.
I was wondering if it's possible to use unicode somehow to query the FTP server differently, so that it's able to find the files?
Here's the code.
Code:
put x into FtpFileName
if last char in FtpFileName is not in "abcdefghijklmnopqrstuvwxyz1234567890" then
   -- this is to remove the carriage return character at the end of the filename as it's stored in my SQL database; never been a problem before
   delete last char in FtpFileName
end if
put URL FtpFileName into url ("binfile:" & LocalFileName) -- ACTUAL FILE COPY command
put the result into tError
if tError is not empty then
   put tError -- no errors show up, but it creates a 0 KB file with the correct filename, including accents
end if
Re: FTP and accented characters in filenames
Posted: Thu Sep 28, 2017 4:06 pm
by jmburnod
Hi ittarter,
I use this for url with accented char:
Code:
get "boîte"
put urlEncode(unidecode(uniencode(it), "UTF8")) into tURL
Best regards
Jean-Marc
Re: FTP and accented characters in filenames
Posted: Fri Sep 29, 2017 3:55 pm
by ittarter
Hi Jean-Marc,
Thanks for the idea, but it isn't working yet. I've tried several variations (see below). The last one is the only one that results in any good files, but files with accented names are still empty.
I've double-checked my FTP server and the source files are definitely NOT empty...
If I store the urlEncode result in a variable, it looks something like this (here, for española):
ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
But I have to do the url function as below, because urlencode doesn't result in a string that my FTP server can read. Maybe urlencode is just for HTTP?
Code:
--put uniencode(FtpFileName,"UTF8") into FtpFileName
--put uniencode(LocalFileName,"UTF8") into LocalFileName
--put urlEncode(unidecode(uniencode(FtpFileName), "UTF8")) into FtpFileName
--put urlEncode(unidecode(uniencode(LocalFileName), "UTF8")) into LocalFileName
put URL unidecode(uniencode(FtpFileName), "UTF8") into url ("binfile:" & LocalFileName)
EDIT: Fetching url FtpFileName (after uniencode and unidecode) puts some crazy ‰PNG stuff in the variable that I can't copy out; it's a long string of unintelligible characters. In Notepad, Notepad++, and this forum's post field it shows up as
‰PNG
Don't know if this is useful information, but there it is.
Re: FTP and accented characters in filenames
Posted: Fri Sep 29, 2017 4:31 pm
by jacque
The unicode functions have been deprecated; the new way is to use textEncode and textDecode. You encode the file name into UTF-8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF-16 unless you convert it.
An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.
Code:
if last char of FtpFileName is cr then delete last char of FtpFileName
And then:
Code:
put textEncode(FtpFileName, "UTF8") into FtpFileName
The file will be in UTF-8 after download, so to read it in LC use textDecode to convert it to UTF-16 before working with it.
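A minimal sketch of that round trip, for text content only: textEncode produces the UTF-8 bytes, and textDecode turns UTF-8 bytes back into LiveCode text. The variable names are placeholders invented for this example, not part of anyone's code in this thread.

```livecode
-- tName, tBytes and tText are hypothetical names used only for illustration
put "Española.mp3" into tName

-- text -> binary data holding the UTF-8 bytes of the name
put textEncode(tName, "UTF-8") into tBytes

-- binary UTF-8 data -> LiveCode text again
put textDecode(tBytes, "UTF-8") into tText

-- tText holds the same text that tName started with
if tText is tName then put "round trip OK"
```

This only matters for data that is actually text; binary files should be written out unchanged.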
Re: FTP and accented characters in filenames
Posted: Fri Sep 29, 2017 5:06 pm
by ittarter
LC 8.1.2, by the way.
jacque wrote:The unicode functions have been deprecated; the new way is to use textEncode and textDecode. You encode the file name into UTF-8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF-16 unless you convert it.
Code:
put textEncode(FtpFileName, "UTF8") into FtpFileName
No, this doesn't change anything. Files without special characters are good, files with special characters are 0 KB. Here's the string that results from textencode:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3
And here's my code
Code:
put URL textEncode(FtpFileName, "UTF8") into url ("binfile:" & LocalFileName)
I can put the textencode as a separate step but it doesn't change the result.
Sigh. Still stuck.
jacque wrote:An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.
Good idea, code updated, thanks!
jacque wrote:The file will be in UTF-8 after download, so to read it in LC use textDecode to convert it to UTF-16 before working with it.
They're mp3 files so I don't think I have to encode them to UTF 16.
Re: FTP and accented characters in filenames
Posted: Fri Sep 29, 2017 5:10 pm
by MaxV
ittarter wrote:
ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
You don't have to urlEncode everything, just the parts with non-ASCII characters, so:
Code:
put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlencode("Española.mp3")) into myAudio
If the name or password contains non-ASCII characters, urlEncode them as well!

Re: FTP and accented characters in filenames
Posted: Fri Sep 29, 2017 5:44 pm
by jacque
You're right, mp3 files won't need encoding. That makes all the difference.
So follow MaxV's advice and urlEncode the file name.
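One way this could be sketched is to escape each path segment on its own, so the "/" separators and the "ftp://user:password@host" prefix stay untouched. The handler name encodeFtpPath and the example host are made up for illustration. The sketch also assumes the server stores its filenames as UTF-8 and that the escapes are decoded before the path reaches the server; neither is confirmed anywhere in this thread.

```livecode
-- Hypothetical helper: percent-encode each path segment as UTF-8 bytes,
-- leaving the "/" separators as they are.
function encodeFtpPath pPath
   local tResult, tSegment, tEncoded
   set the itemDelimiter to "/"
   repeat for each item tSegment in pPath
      -- textEncode yields the UTF-8 bytes; urlEncode escapes those bytes
      put urlEncode(textEncode(tSegment, "UTF-8")) into tEncoded
      -- urlEncode writes a space as "+"; "%20" is used here instead
      -- (assumption: a literal "+" would not be read as a space)
      replace "+" with "%20" in tEncoded
      put tEncoded & "/" after tResult
   end repeat
   delete the last char of tResult -- drop the trailing "/"
   return tResult
end encodeFtpPath

-- Possible usage (placeholder credentials and host):
-- put "ftp://user:password@example.com" & \
--       encodeFtpPath("/Conversations/Spanish 0101/Vocab/Española.mp3") into tURL
-- put URL tURL into URL ("binfile:" & LocalFileName)
```

If the server expects a different encoding for its filenames, the "UTF-8" argument would need to change accordingly.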
Re: FTP and accented characters in filenames
Posted: Sat Sep 30, 2017 6:35 am
by shaosean
Here is the RFC for Internationalization of the File Transfer Protocol: <https://tools.ietf.org/html/rfc2640>. Section 3 has information about pathnames.
Re: FTP and accented characters in filenames
Posted: Sat Sep 30, 2017 10:52 am
by ittarter
MaxV wrote:
You don't have to urlEncode everything, just the parts with non-ASCII characters, so:
Code:
put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlencode("Española.mp3")) into myAudio
This does not change the result, unfortunately. Filenames containing special characters yield 0 KB files, and filenames that do not contain special characters are fine.
My code is:
Code:
set the itemdel to "/"
put item 1 to -2 of FtpFileName & "/" & textEncode(item -1 of FtpFileName, "UTF8") into x
set the itemdel to comma
put URL x into url ("binfile:" & LocalFileName)
The resulting url is:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3
Re: FTP and accented characters in filenames
Posted: Sat Sep 30, 2017 4:19 pm
by jacque
You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.
Re: FTP and accented characters in filenames
Posted: Sun Oct 01, 2017 5:49 pm
by ittarter
jacque wrote:You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.
I believe that the line
Code:
put URL xxx into url ("binfile:" & LocalFileName)
does the same as
Code:
put URLencode(xxx) into url ("binfile:" & LocalFileName)
In any case I've done both, and neither works.
For the latter (using urlEncode), the resulting string is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3
As before, it results in a 0 KB file when the filename has special characters, and there are no problems when the filename doesn't have special characters.
Re: FTP and accented characters in filenames
Posted: Mon Oct 02, 2017 1:04 pm
by MaxV
ittarter wrote:For the latter (using urlencode), the resulting urlencode is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3
Dear ittarter,
FTP usually needs a user name and password, so the URL must be something like:
ftp://name:password@someSite/path/to/the/file
If you don't start it with exactly "ftp://", LiveCode can't know what you need.
So your code:
Code:
put URLencode(xxx) into url ("binfile:" & LocalFileName)
is wrong.
Use:
put URL ("ftp://name:password@site" & urlencode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)
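To make the distinction concrete, here is a small sketch: urlEncode is a function that only returns an encoded copy of a string, while the URL keyword is what actually performs the transfer. The address below is a placeholder, and tEncoded and tData are names invented for the example.

```livecode
-- urlEncode only transforms text; no network access happens here
put urlEncode("Spanish 0101") into tEncoded -- tEncoded is now "Spanish+0101"

-- reading from the URL keyword is what triggers the download
-- (placeholder address, not a real server)
put URL "ftp://name:password@example.com/path/12.mp3" into tData
if the result is not empty then
   answer "Transfer failed:" && the result
end if
```

So writing the output of urlEncode into a binfile URL stores the encoded string itself rather than the remote file.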
Re: FTP and accented characters in filenames
Posted: Mon Oct 02, 2017 3:07 pm
by ittarter
Yes, but if my servername and password were wrong, then the files without special characters would not successfully download either. But, as I've said in each of my posts, they download no problem. I'm simply writing "xxx" as a place-holder for servername and password, but in the real code, I'm using my real servername and password.
It's ONLY the files with special characters in their filenames that don't successfully download.
The code is
Code:
set the itemdel to "/"
put "ftp://" & item 1 of FtpFileName & "/" & textEncode(item 2 to -1 of FtpFileName, "UTF8") into FtpFileName
set the itemdel to comma
put URL FtpFileName into url ("binfile:" & LocalFileName)
A sample FtpFileName that DOESN'T download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/Un paÃs.mp3
A sample FtpFileName that DOES download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/12.mp3
I don't understand why one works and the other does not...
Re: FTP and accented characters in filenames
Posted: Tue Oct 03, 2017 3:01 am
by shaosean
Your code still shows textEncode and not urlEncode; perhaps try changing that (as mentioned previously).
Re: FTP and accented characters in filenames
Posted: Tue Oct 03, 2017 8:03 am
by ittarter
Shaosean,
I need both textencode(FTPfilename) AND URL FTPfilename to get ANY files to correctly transfer.
If I eliminate textencode OR if I use URLencode in place of URL FTPfilename, nothing is correctly transferred.
For example, if I adapt this (which MaxV advised earlier) to my server, just using url and urlencode,
Code:
put URL ("ftp://name:password@site" & urlencode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)
it results in all 0 KB files. And anyway, in the notes for urlencode, it says it's for HTTP servers only.
Anyway, thanks for all your support, guys, nothing has worked yet for filenames with special characters...