character encoding - never ending story
Hi all,
I have a MariaDB (10.0.29) database with character set latin1 and collation latin1_german1_ci.
When I save the text "(ÄÖÜäöüßÉ)" from LiveCode to the DB without any special handling, the database stores "(????????)".
After searching the forum, I added the following line to my LiveCode script:
put unidecode(uniencode("(ÄÖÜäöüßÉ)"),"utf8") into utf8text
and sent this to the database.
Result: characters are stored correctly.
But when retrieving the data again, I get "(ÄÖÜäöüßÉ)"
Now I do not know how to translate them back to the original in LiveCode. I tried:
put unidecode(uniencode("(ÄÖÜäöüßÉ)"),"utf8") into utf8text (does not work)
mactoiso("(ÄÖÜäöüßÉ)") results in "(ÄÖÜäöüßÉ)"
The following solution works, but it seems very inelegant to me:
function Umlautdecodierung eingabe
   -- Ä
   replace "Ä" with "Ä" in eingabe
   -- Ö
   replace "Ö" with "Ö" in eingabe
   -- Ü
   replace "Ü" with "Ü" in eingabe
   -- ä
   replace "ä" with "ä" in eingabe
   -- ö
   replace "ö" with "ö" in eingabe
   -- ü
   replace "ü" with "ü" in eingabe
   -- ß
   replace "ß" with "ß" in eingabe
   -- É
   replace "É" with "É" in eingabe
   return eingabe
end Umlautdecodierung
 
I hope one of you has a better idea for solving this problem.
By the way: my preferred solution would be to avoid conversion altogether, since it makes scripting much more complicated. Perhaps you know the magic configuration for my MariaDB.
Best regards
Ulrich
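The symptom described in this post (text stored correctly, but garbled on retrieval) is what happens when UTF-8 bytes are read back as a single-byte encoding. Below is a minimal sketch in Python, used here only because it makes the bytes visible. It assumes the client interprets the returned bytes as CP1252, which the thread does not confirm (a Mac client would use a different native encoding). Under that assumption, one decode step can replace the per-character replace list:

```python
# Sketch only: assumes the garbling is UTF-8 bytes misread as CP1252.
original = "(ÄÖÜäöüßÉ)"

# What the database holds after the text was sent as UTF-8.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as CP1252 turns each accented letter into two characters.
garbled = utf8_bytes.decode("cp1252")

# Undo the misreading in one step instead of one replace per character.
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

The LiveCode counterpart would presumably be a single decode of the retrieved data rather than a list of replacements.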
Re: character encoding - never ending story
I think you will also need to set the DB's default character set to UTF8.
See here: https://mariadb.com/kb/en/mariadb/setti ... ollations/
Andy .... LC CLASSIC ROCKS!
Re: character encoding - never ending story
The uniEncode and uniDecode functions are deprecated and shouldn't be used with LC 7 or above, though they do still work. The new functions are much easier to work with, and I recommend them: see textEncode() and textDecode() in the dictionary. They do all the work for you, as long as you know the correct character set you're working with (and you do).
Or you can follow AndyP's suggestion, and set the database to use UTF8.
Jacqueline Landman Gay         |     jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
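To illustrate the point about knowing the character set: encoding and decoding only round-trip when both sides use the same charset. The sketch below uses Python's `str.encode` and `bytes.decode` as stand-ins for textEncode() and textDecode(). The helper names `to_db` and `from_db` are hypothetical and do not come from the thread.

```python
def to_db(text: str, charset: str) -> bytes:
    """Encode text into the bytes the database expects (hypothetical helper)."""
    return text.encode(charset)


def from_db(data: bytes, charset: str) -> str:
    """Decode bytes read from the database back into text (hypothetical helper)."""
    return data.decode(charset)


text = "ÄÖÜäöüßÉ"

# Same charset on both sides: the text survives the round trip.
latin1_ok = from_db(to_db(text, "latin-1"), "latin-1") == text
utf8_ok = from_db(to_db(text, "utf-8"), "utf-8") == text

# Mismatched charsets: these Latin-1 bytes are not valid UTF-8.
try:
    from_db(to_db(text, "latin-1"), "utf-8")
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(latin1_ok, utf8_ok, mismatch_failed)
```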
Re: character encoding - never ending story
Code: Select all
put urlencode("ÄÖÜäöüßÉ")
Code: Select all
put urldecode("%C4%D6%DC%E4%F6%FC%DF%C9")
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
Hans-Helmut
Posts: 57
Joined: Sat Jan 14, 2017 6:44 pm
Re: character encoding - never ending story - mySQL
I am lost at the moment and am asking for help. I have to use UTF-8 encoded international characters, mainly Russian, German and English together.
Even though everything looks fine when I browse the database on the server with phpMyAdmin, it does not work in LiveCode, whatever settings I try.
The Russian character string is "аловуе" - (alowue)
Selecting this record results in "??????"
Settings
LiveCode: 8.1.4 (rc 2)
OS: Windows 2000, 64bit, latest update
Server: Localhost via UNIX socket
Server type: MySQL
Server version: 5.6.33-log - MySQL Community Server (GPL)
Protocol version: 10
User: b@localhost
Server charset: UTF-8 Unicode (utf8)
Server connection collation: utf8_general_ci
User language: English
Database: b_address
Table: party
Column: name
Column collation: utf8_general_ci
Code: Select all
# Simplified code snippet without error checking:
on mouseUp
   global gConnectionID
   put "SELECT name FROM party" into tSQL
   put revDataFromQuery(tab, cr, gConnectionID, tSQL) into tList
   put textDecode(tList, "UTF-8") into field "data"
end mouseUp
Hans-Helmut
Re: character encoding - never ending story
I am still stuck with Russian text in MySQL and LC... )
For now, I cannot go through PHP or server-side scripting. I need to use the direct connection as described.
All settings in the MySQL database are for UTF-8.
Russian characters are visible on the server side. But they do not render on the client side.
Executing through LiveCode using textDecode()
1. Special Latin-1 characters are not shown. A "Müller" will become "Mller". // Why? Wrong.
2. Any Russian character will not render at all: "Димитрий" will become "?????????" // Why? Wrong.
Executing without textDecode()
1. Special Latin-1 characters are shown. A "Müller" is still "Müller" with "u-Umlaut".
2. Any Russian character will not render: "Димитрий" will become "?????????"
Is this a bug in LiveCode?
Is there still something wrong on the server-side settings?
I really need this to work. I cannot use PHP for it, and installing LiveCode Server is not permitted.
Thanks for any help.
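One possible reading of these symptoms, offered as an assumption rather than a diagnosis: if the connection between LiveCode and MySQL is using Latin-1 instead of UTF-8, Cyrillic cannot be represented and is replaced with question marks before it reaches the client, while "ü" arrives as a single Latin-1 byte that is not valid UTF-8. A Python sketch of both effects:

```python
# Assumption: the connection charset is Latin-1 even though the tables are UTF-8.

# Cyrillic has no Latin-1 representation, so each character becomes "?".
russian_over_latin1 = "Димитрий".encode("latin-1", errors="replace")

# "ü" fits in Latin-1 as the single byte 0xFC ...
german_over_latin1 = "Müller".encode("latin-1")

# ... but 0xFC is not valid UTF-8, so a lenient UTF-8 decode drops it.
decoded_as_utf8 = german_over_latin1.decode("utf-8", errors="ignore")

print(russian_over_latin1)
print(decoded_as_utf8)
```

The lenient decode mirrors the reported "Mller", though how LiveCode itself treats invalid bytes is not confirmed here. If this reading is right, no amount of decoding on the client can recover the Russian text, because the question marks are already in the data that arrives.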
Re: character encoding - never ending story
Hans-Helmut wrote: All settings in the MySQL database are for UTF-8.
Russian characters are visible on the server side. But they do not render on the client side.
Before doing your SELECT, try executing this query:
Code: Select all
   revExecuteSQL gConnectionID, "SET NAMES 'utf8'"
Re: character encoding - never ending story
As mentioned above, you need to use textDecode() to translate the incoming text to a format LC can use. Most databases use UTF8 so I think it's safe to assume that.
Code: Select all
put textDecode(data, "UTF8") into tString
Edit: I just saw you are already using textDecode so ignore the above.
Jacqueline Landman Gay         |     jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
Re: character encoding - never ending story
Use the urlencode and urldecode functions: all characters are translated to plain ASCII, so the data can be stored safely in a database, and urldecode brings them back into your charset.
The urlencode and urldecode functions are the best way to preserve data. See http://livecode.wikia.com/wiki/URLEncode
Examples:
put urlencode("Müller")
=
M%FCller
put urlencode(textEncode("Димитрий","UTF8"))
=
%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9
put urldecode("M%FCller")
=
Müller
put textdecode(urldecode("%D0%94%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B9"),"UTF8")
=
Димитрий
As you can see, the urlencode function always uses just plain ASCII that is compatible with any charset, so the data are compatible with any database in the world!!!

Why did I add textEncode/textDecode for the Russian chars? Because my PC is UTF16, but URLencode/urldecode works only with UTF8 chars. Livecode always works with PC encoding, in my case UTF16, so I needed to add textEncode for chars like Russian, which have different hexadecimal values from UTF8 on my PC.
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
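The examples in this post can be reproduced outside LiveCode. Here is a sketch in Python, where `urllib.parse.quote` and `unquote` stand in for urlencode and urldecode (an analogy, not the same implementation). It suggests that the percent-encoded text is plain ASCII, but that the bytes behind it still depend on which encoding was chosen, so the decoding side has to use the same one:

```python
from urllib.parse import quote, unquote

# Percent-encode using Latin-1 bytes: one byte for the umlaut.
mueller = quote("Müller", encoding="latin-1")

# Percent-encode using UTF-8 bytes: two bytes per Cyrillic letter.
dimitri = quote("Димитрий", encoding="utf-8")

# Decoding must name the same encoding that was used for encoding.
mueller_back = unquote(mueller, encoding="latin-1")
dimitri_back = unquote(dimitri, encoding="utf-8")

print(mueller)
print(dimitri)
```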
Re: character encoding - never ending story
Quote: my PC is UTF16, but URLencode/urldecode works only with UTF8 chars.
Actually, textEncode/textDecode work with nine different encodings:
"ASCII"
"UTF-16"
"UTF-16BE"
"UTF-16LE"
"UTF-32"
"UTF-32BE"
"UTF-32LE"
"UTF-8"
"CP1252"
Quote: Livecode always works with PC encoding
When importing or opening files, LC uses the machine native encoding which will vary depending on the OS.
Jacqueline Landman Gay         |     jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
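As a rough illustration of why the choice among these encodings matters, the sketch below encodes the same six characters under Python's counterparts of the names listed in this post. The codec names are Python's, not LiveCode's, and Python's plain "utf-16" and "utf-32" codecs prepend a byte order mark, which may or may not match what LiveCode does.

```python
text = "Müller"

# Python codec names standing in for the LiveCode encoding names listed in the post.
codecs = ["utf-8", "utf-16", "utf-16-be", "utf-16-le",
          "utf-32", "utf-32-be", "utf-32-le", "cp1252"]

# Byte length of the same six characters under each encoding.
sizes = {name: len(text.encode(name)) for name in codecs}

# ASCII cannot represent "ü" at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

for name, size in sizes.items():
    print(name, size)
```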
