FTP and accented characters in filenames [solved]

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

FTP and accented characters in filenames [solved]

Post by ittarter » Thu Sep 28, 2017 11:57 am

Hello,

I have a set of filenames some of which have accented characters (typical Spanish and French accents), which my application can't find. It correctly downloads filenames without accents.

I was wondering if it's possible to use unicode somehow to query the FTP server differently, so that it's able to find the files?

Here's the code.

Code: Select all

put x into FtpFileName
if last char in FtpFileName is not in "abcdefghijklmnopqrstuvwxyz1234567890" then
   delete last char in FtpFileName -- removes the carriage return at the end of the filename as it's stored in my SQL database; never been a problem before
end if
put URL FtpFileName into url ("binfile:" & LocalFileName) -- ACTUAL FILE COPY command
put the result into tError
if tError is not empty then
   put tError -- no errors show up, but it creates a 0 KB file with the correct filename including accents
end if
Last edited by ittarter on Wed Oct 18, 2017 5:27 am, edited 1 time in total.
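
One way to narrow down a silent 0 KB download, sketched under assumptions: fetch into a variable first, check both the result and the data, and only then write the local file. The handler name downloadFile and its parameters are made up for illustration. In the one-line form above, the result may only reflect the final file write, which could be why no error ever shows up.

```livecode
-- Hypothetical handler: download one FTP URL to a local file,
-- stopping early if the server reported an error or sent nothing.
command downloadFile pFtpUrl, pLocalFile
   local tData
   -- fetch into a variable so the FTP step can be checked on its own
   put URL pFtpUrl into tData
   if the result is not empty then
      answer "Download failed:" && the result
      exit downloadFile
   end if
   if tData is empty then
      answer "The server returned no data for" && pFtpUrl
      exit downloadFile
   end if
   -- only now write the bytes to disk, unchanged
   put tData into URL ("binfile:" & pLocalFile)
end downloadFile
```

Calling it as `downloadFile FtpFileName, LocalFileName` should at least show whether the FTP request itself is failing for the accented names.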

jmburnod
VIP Livecode Opensource Backer
Posts: 2729
Joined: Sat Dec 22, 2007 5:35 pm

Re: FTP and accented characters in filenames

Post by jmburnod » Thu Sep 28, 2017 4:06 pm

Hi ittarter,
I use this for url with accented char:

Code: Select all

get "boîte"
put urlEncode(unidecode(uniencode(it), "UTF8")) into tURL
Best regards
Jean-Marc
https://alternatic.ch

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Fri Sep 29, 2017 3:55 pm

Hi Jean-Marc,

Thanks for the idea, but it isn't working yet. I've tried several variations (see below). The last one is the only one that results in any good files, but the files with accented names are still empty.

I've double-checked my FTP server and the source files are definitely NOT empty...

If I store the urlEncode result in a variable, it looks something like this (here, for Española):
ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
But I have to use the URL keyword as below, because urlEncode doesn't produce a string that my FTP server can read. Maybe urlEncode is just for HTTP?

Code: Select all

--put uniencode(FtpFileName,"UTF8") into FtpFileName
--put uniencode(LocalFileName,"UTF8") into LocalFileName
--put urlEncode(unidecode(uniencode(FtpFileName), "UTF8")) into FtpFileName
--put urlEncode(unidecode(uniencode(LocalFileName), "UTF8")) into LocalFileName
put URL unidecode(uniencode(FtpFileName), "UTF8") into url ("binfile:" & LocalFileName)
EDIT: url FtpFileName (after uniencode and unidecode) puts some crazy ‰PNG stuff in the variable. I can't copy it out; it's a long, unintelligible string. In Notepad, Notepad++ and this forum's post field it shows up as
‰PNG

Don't know if this is useful information, but there it is.

jacque
VIP Livecode Opensource Backer
Posts: 7393
Joined: Sat Apr 08, 2006 8:31 pm

Re: FTP and accented characters in filenames

Post by jacque » Fri Sep 29, 2017 4:31 pm

The unicode functions have been deprecated; the new way is to use textEncode and textDecode. You encode the file name into UTF-8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF-16 unless you convert it.

An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.

Code: Select all

if last char of FtpFileName is cr then delete last char of FtpFileName 
And then:

Code: Select all

put textEncode(FtpFileName, "UTF8") into FtpFileName
The file will be in UTF-8 after download, so to read it in LC use textDecode to convert it to UTF-16 before working with it.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Fri Sep 29, 2017 5:06 pm

LC 8.1.2, by the way.
jacque wrote:The unicode functions have been deprecated; the new way is to use textEncode and textDecode. You encode the file name into UTF-8 before requesting it and decode it when reading it. I think this may be the problem, since LC will be sending everything in UTF-16 unless you convert it.

Code: Select all

put textEncode(FtpFileName, "UTF8") into FtpFileName
No, this doesn't change anything. Files without special characters are good, files with special characters are 0 KB. Here's the string that results from textencode:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3
And here's my code

Code: Select all

put URL textEncode(FtpFileName, "UTF8") into url ("binfile:" & LocalFileName)
I can put the textencode as a separate step but it doesn't change the result.

Sigh. Still stuck.

jacque wrote:An easier way to remove the trailing carriage return is to use the constant "CR". That way you won't accidentally strip off any non-Roman characters.
Good idea, code updated, thanks!
jacque wrote:The file will be in UTF-8 after download, so to read it in LC use textDecode to convert it to UTF-16 before working with it.
They're mp3 files, so I don't think I have to convert them to UTF-16.

MaxV
Posts: 1580
Joined: Tue May 28, 2013 2:20 pm

Re: FTP and accented characters in filenames

Post by MaxV » Fri Sep 29, 2017 5:10 pm

ittarter wrote: ftp%3A%2F%2Fname%3Apassword%1111.111.111.111%2FConversations%2FSpanish+0101%2FVocab%2FEspa%F1ola.mp3
You don't have to urlEncode everything, just the parts with non-ASCII characters:

Code: Select all

put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlEncode("Española.mp3")) into myAudio
If the name or password contains non-ASCII characters, urlEncode them too! :D
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
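
A minimal sketch of the "encode only the parts that need it" idea, assuming LC 7 or later (where textEncode exists) and assuming the server expects UTF-8 file names, which nobody in the thread has confirmed. The function name ftpEncodePath is hypothetical. It percent-encodes each path segment separately so the "/" separators survive, and it swaps the "+" that urlEncode writes for a space with %20, since "+" only means "space" in HTTP form data.

```livecode
-- Hypothetical helper: percent-encode a server path one segment at a time.
-- Assumes the FTP server stores file names as UTF-8.
function ftpEncodePath pPath
   local tEncoded, tSegment, tPiece
   set the itemDelimiter to "/"
   repeat for each item tSegment in pPath
      -- convert the text to UTF-8 bytes first, then percent-encode them
      put urlEncode(textEncode(tSegment, "UTF-8")) into tPiece
      -- urlEncode turns a space into "+"; use %20 in a path instead
      -- (a literal "+" in the name has already become %2B, so this is safe)
      replace "+" with "%20" in tPiece
      put tPiece & "/" after tEncoded
   end repeat
   delete last char of tEncoded -- drop the extra trailing "/"
   return tEncoded
end function
```

It could then be used along these lines, with the host and credentials as placeholders:

```livecode
put URL ("ftp://name:password@mycoolwebsite" & ftpEncodePath("/Conversations/Spanish 0101/Vocab/Española.mp3")) into URL ("binfile:" & LocalFileName)
```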

jacque
VIP Livecode Opensource Backer
Posts: 7393
Joined: Sat Apr 08, 2006 8:31 pm

Re: FTP and accented characters in filenames

Post by jacque » Fri Sep 29, 2017 5:44 pm

You're right, mp3 files won't need encoding. That makes all the difference.

So follow MaxV's advice and urlEncode the file name.

shaosean
Posts: 906
Joined: Thu Nov 04, 2010 7:53 am

Re: FTP and accented characters in filenames

Post by shaosean » Sat Sep 30, 2017 6:35 am

Here is the RFC for Internationalization of the File Transfer Protocol: <https://tools.ietf.org/html/rfc2640>. Section 3 has information about pathnames.
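
RFC 2640 specifies UTF-8 for pathnames on servers that support it, so it may help to compare what the two encodings of the same name look like. This is only a sketch for the message box, assuming LC 7 or later. The first line uses the platform's native single-byte encoding, the second encodes the UTF-8 bytes; the two forms posted elsewhere in this thread (%F1 and %C3%B1 for ñ) look like exactly this difference.

```livecode
on mouseUp
   local tName
   put "Española.mp3" into tName
   -- line 1: plain urlEncode, which works from the native encoding
   -- line 2: urlEncode applied to the UTF-8 bytes of the same name
   put urlEncode(tName) & return & urlEncode(textEncode(tName, "UTF-8"))
end mouseUp
```

Whichever form matches how the server actually stores its file names is the one to send.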

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Sat Sep 30, 2017 10:52 am

MaxV wrote: You don't have to urlEncode everything, just the parts with non-ASCII characters:

Code: Select all

put URL ("ftp://name:password@mycoolwebsite/Conversations/Spanish+0101/Vocab/" & urlEncode("Española.mp3")) into myAudio
This does not change the result, unfortunately. Filenames containing special characters yield 0 KB files, and filenames that do not contain special characters are fine.

My code is:

Code: Select all

set the itemdel to "/"
put item 1 to -2 of FtpFileName & "/" & textEncode(item -1 of FtpFileName, "UTF8") into x
set the itemdel to comma
put URL x into url ("binfile:" & LocalFileName)
The resulting url is:
ftp://xxx/Conversations/Spanish 0101/Vocab/Español.mp3

jacque
VIP Livecode Opensource Backer
Posts: 7393
Joined: Sat Apr 08, 2006 8:31 pm

Re: FTP and accented characters in filenames

Post by jacque » Sat Sep 30, 2017 4:19 pm

You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Sun Oct 01, 2017 5:49 pm

jacque wrote:You need urlEncode, not textEncode. Easy to make that slip after all that's gone on in the thread so far.
I believe that the line

Code: Select all

put URL xxx into url ("binfile:" & LocalFileName)
does the same as

Code: Select all

put URLencode(xxx) into url ("binfile:" & LocalFileName)
In any case I've done both, and neither works.

For the latter (using urlencode), the resulting urlencode is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3

As before, it results in a 0 KB file when the filename has special characters, and there are no problems when the filename doesn't have special characters.
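
For what it's worth, the two lines above are not equivalent, and a small sketch may make the difference clearer. urlEncode is a function that returns a changed copy of its text and never contacts a server, while the URL keyword is what actually performs the transfer. The host, credentials and file name below are placeholders, not a real server.

```livecode
on mouseUp
   local tEncoded, tData
   -- urlEncode only rewrites the text; nothing is downloaded here
   put urlEncode("Un país.mp3") into tEncoded
   -- the URL keyword fetches the file's contents from the server
   put URL "ftp://name:password@example.com/12.mp3" into tData
   -- writing tEncoded to a binfile would save the encoded text itself,
   -- whereas writing tData saves the downloaded bytes
end mouseUp
```

So urlEncode belongs inside the expression that builds the address, and the URL keyword still has to wrap the finished address.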

MaxV
Posts: 1580
Joined: Tue May 28, 2013 2:20 pm

Re: FTP and accented characters in filenames

Post by MaxV » Mon Oct 02, 2017 1:04 pm

ittarter wrote:For the latter (using urlencode), the resulting urlencode is:
ftp%3A%2F%2Fxxx%2FConversations%2FSpanish+0101%2FVocab%2FEspa%C3%B1ol.mp3
Dear ittarter,
FTP usually needs a user name and password, so the URL must be something like:
ftp://name:password@someSite/path/to/the/file

If you don't put exactly "ftp://" at the start, LiveCode can't know which protocol you need.

So your code:

Code: Select all

put URLencode(xxx) into url ("binfile:" & LocalFileName)
is wrong.

Use:

put URL ("ftp://name:password@site" & urlEncode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Mon Oct 02, 2017 3:07 pm

MaxV wrote:FTP usually needs a user name and password, so the URL must be something like:
ftp://name:password@someSite/path/to/the/file
Yes, but if my servername and password were wrong, then the files without special characters would not successfully download either. But, as I've said in each of my posts, they download no problem. I'm simply writing "xxx" as a place-holder for servername and password, but in the real code, I'm using my real servername and password.

It's ONLY the files with special characters in their filenames that don't successfully download.

The code is

Code: Select all

set the itemdel to "/"
put "ftp://" & item 1 of FtpFileName & "/" & textEncode(item 2 to -1 of FtpFileName, "UTF8") into FtpFileName
set the itemdel to comma
put URL FtpFileName into url ("binfile:" & LocalFileName)
A sample FtpFileName that DOESN'T download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/Un país.mp3
A sample FtpFileName that DOES download:
ftp://name.password@IP/Conversations/Spanish 0102/Vocab/12.mp3

I don't understand why one works and the other does not...
Last edited by ittarter on Tue Oct 03, 2017 7:38 am, edited 1 time in total.

shaosean
Posts: 906
Joined: Thu Nov 04, 2010 7:53 am

Re: FTP and accented characters in filenames

Post by shaosean » Tue Oct 03, 2017 3:01 am

Your code still shows textEncode and not urlEncode; perhaps try changing that (as mentioned previously).

ittarter
Posts: 151
Joined: Sat Jun 13, 2015 2:13 pm

Re: FTP and accented characters in filenames

Post by ittarter » Tue Oct 03, 2017 8:03 am

Shaosean,

I need both textencode(FTPfilename) AND URL FTPfilename to get ANY files to correctly transfer.

If I eliminate textencode OR if I use URLencode in place of URL FTPfilename, nothing is correctly transferred.

For example, if I adapt this (which MaxV advised earlier) to my server, just using url and urlencode,

Code: Select all

put URL ("ftp://name:password@site" & urlEncode("/Conversations/Spanish 0101/Vocab/Española.mp3")) into url ("binfile:" & LocalFileName)
it results in all 0 KB files. And anyway, in the notes for urlencode, it says it's for HTTP servers only.

Anyway, thanks for all your support, guys, nothing has worked yet for filenames with special characters...
