problems with diacritical characters in text
Posted: Tue Feb 10, 2009 1:58 am
by matthiasr
Hi,
I have a text file (CSV, separated with ;) that contains diacritical characters. To import it into Excel, I have to select OEM, or codepage 850 or 858, for the diacritical characters to display correctly.
Is Revolution able to convert the text file from one codepage to another?
Regards,
Matthias
Posted: Tue Feb 10, 2009 7:03 am
by Janschenkel
The 'read from file' and 'write to file' commands automatically convert between the operating system's "native" encoding and the ISO encoding that Revolution uses internally. If the file isn't employing the "native" encoding, it's hard for Rev to guess which one it is, right?
Revolution doesn't offer a way to set which external encoding to convert from. But you can still write your own encoders/decoders in pure Revolution script, though it will be up to you to read and write the files as binary and convert to the right ISO characters.
Here's a basic and very incomplete example, assuming an encoding where all characters are shifted "to the right".
Code: Select all
on mouseUp
   answer file "Pick a text file in MySillyEncoding"
   if the result is "Cancel" then exit mouseUp
   put it into tFile
   -- now read the contents of the file
   open file tFile for binary read
   read from file tFile until EOF
   put it into tEncodedData
   close file tFile
   -- then start converting it
   put empty into tData
   repeat with tIndex = 1 to the number of bytes in tEncodedData
      put byteToNum(byte tIndex of tEncodedData) into tNumber
      -- undo the shift: every byte value moves down by one, and 0 wraps around to 255
      if tNumber is 0 then
         put 255 into tNumber
      else
         subtract 1 from tNumber
      end if
      put numToChar(tNumber) after tData
   end repeat
   answer tData
end mouseUp
While not a very practical example, it points you in the right direction: read the files in as binary data, and then perform some sort of mapping from the other encoding to the internal ISO encoding that Revolution uses.
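As a more concrete instance of such a mapping, here is a minimal sketch for the German characters in codepage 850. It assumes Rev is running on Windows, so that the internal 8-bit encoding agrees with Windows-1252 for these characters. The handler name cp850ToAnsi is made up for illustration, and only seven characters are mapped; every other byte passes through unchanged, so a complete converter would need the full 128-entry table.

```
function cp850ToAnsi pEncodedData
   -- lookup table: codepage 850 byte value -> Windows-1252 byte value
   -- (assumed target encoding: Windows ANSI; German characters only)
   put 228 into tMap[132] -- ä
   put 246 into tMap[148] -- ö
   put 252 into tMap[129] -- ü
   put 196 into tMap[142] -- Ä
   put 214 into tMap[153] -- Ö
   put 220 into tMap[154] -- Ü
   put 223 into tMap[225] -- ß
   put empty into tData
   repeat with tIndex = 1 to the number of bytes in pEncodedData
      put byteToNum(byte tIndex of pEncodedData) into tNumber
      -- swap in the mapped value if this byte has a table entry
      if tMap[tNumber] is not empty then
         put tMap[tNumber] into tNumber
      end if
      put numToChar(tNumber) after tData
   end repeat
   return tData
end cp850ToAnsi
```

In the mouseUp handler above, the repeat loop could then be replaced by a single line such as: put cp850ToAnsi(tEncodedData) into tData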
HTH,
Jan Schenkel.
Posted: Tue Feb 10, 2009 8:40 am
by matthiasr
Hi Jan,
Thanks for the reply.
If the file isn't employing the "native" encoding, it's hard for Rev to guess which one, right?
I had hoped that I could tell Rev to convert from the OEM (MS-DOS) character set to Windows ANSI. Until now my apps have only had to deal with Windows text, but now I have to use a text file in the OEM character set.
So there is no function for it? What a pity.
So the other way would be to replace the wrong characters with the correct ones. This should not be too difficult, as the standard German alphabet has only three diacritical characters (ä, ö, ü) and one special character (ß).
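A rough sketch of that replacement idea follows, assuming the file has been read as binary from a codepage 850 source and that the target is Windows ANSI. The function name fixGermanOemText is hypothetical. The upper-case umlauts are included as well, which makes seven pairs in total. The caseSensitive is set to true so that replace matches only the exact byte value and not what Rev might treat as its upper- or lower-case partner.

```
function fixGermanOemText pData
   -- each line holds "codepage 850 value,Windows ANSI value"
   -- order: ä, ö, ü, Ä, Ö, Ü, ß (assumed mapping, German characters only)
   put "132,228" & cr & "148,246" & cr & "129,252" & cr & \
         "142,196" & cr & "153,214" & cr & "154,220" & cr & \
         "225,223" into tPairs
   -- match exact byte values only
   set the caseSensitive to true
   repeat for each line tPair in tPairs
      replace numToChar(item 1 of tPair) with numToChar(item 2 of tPair) in pData
   end repeat
   return pData
end fixGermanOemText
```

None of the seven target values is also a source value, so doing the replacements one after another cannot convert a character twice. Any other codepage 850 character above 127 is left as it is and will still display incorrectly.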
Regards,
Matthias