problems with diacritical characters in text

Posted: Tue Feb 10, 2009 1:58 am
by matthiasr
Hi,

I have a text file (CSV, separated with ;) that contains diacritical characters. If I want to import it into Excel, I have to select OEM or codepage 850 or 858 to get the diacritical characters displayed correctly.

Is Revolution able to convert the text file from one codepage to another?

Regards,

Matthias

Posted: Tue Feb 10, 2009 7:03 am
by Janschenkel
The 'read from file' and 'write to file' commands automatically convert between the operating system's "native" encoding and the ISO encoding that Revolution uses internally. If the file isn't employing the "native" encoding, it's hard for Rev to guess which one, right?

Rev doesn't offer a way to specify which external encoding to convert from. But you can still write your own encoders/decoders in pure Revolution script, though it will be up to you to read and write the files as binary and convert to the right ISO characters.

Here's a basic and very incomplete example, assuming a made-up encoding in which every character code is shifted up by one.

Code:

on mouseUp
  answer file "Pick a text file in MySillyEncoding"
  if the result is "Cancel" then exit mouseUp
  put it into tFile
  -- now read the contents of the file
  open file tFile for binary read
  read from file tFile until EOF
  put it into tEncodedData
  close file tFile
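  -- then start converting it: every byte in MySillyEncoding is one higher
  -- than the real character code, so subtract 1 (0 wraps around to 255)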
  put empty into tData
  repeat with tIndex = 1 to the number of bytes in tEncodedData
    put byteToNum(byte tIndex of tEncodedData) into tNumber
    if tNumber is 0 then
      put 255 into tNumber
    else
      subtract 1 from tNumber
    end if
    put numToChar(tNumber) after tData
  end repeat
  answer tData
end mouseUp
While not a very practical example, it points you in the right direction: read the file in as binary data, and then map each character from the other encoding to the internal ISO encoding that Revolution uses.
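
To make that mapping idea a bit more concrete, here is a minimal sketch of a table-driven version. The function name translateBytes and the "source,target" table format are made up for illustration, and it assumes Rev on Windows, where numToChar() of an ISO 8859-1 code gives the character you expect. The two pairs in the usage comment are the code page 850 values for ä and ö; a real table would need many more entries.

```livecode
-- Hypothetical helper: translate binary data through a lookup table.
-- pMap holds one "sourceByte,targetByte" pair per line;
-- bytes that are not listed are passed through unchanged.
function translateBytes pData, pMap
  local tTable, tResult
  -- build an array keyed by the source byte value
  repeat for each line tPair in pMap
    put item 2 of tPair into tTable[item 1 of tPair]
  end repeat
  -- walk the data byte by byte and look each one up
  repeat with tIndex = 1 to the number of bytes in pData
    put byteToNum(byte tIndex of pData) into tNumber
    if tTable[tNumber] is not empty then
      put tTable[tNumber] into tNumber
    end if
    put numToChar(tNumber) after tResult
  end repeat
  return tResult
end translateBytes

-- Usage, with tEncodedData read in binary mode as in the handler above:
--   put "132,228" & return & "148,246" into tMap
--   put translateBytes(tEncodedData, tMap) into tData
```

Because each byte is looked up exactly once, a translated byte can never be translated a second time by a later table entry.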

HTH,

Jan Schenkel.

Posted: Tue Feb 10, 2009 8:40 am
by matthiasr
Hi Jan,
Thanks for the reply.
Quote:
If the file isn't employing the "native" encoding, it's hard for Rev to guess which one, right?

I had hoped that I could tell Rev to convert from the OEM (MS-DOS) character set to Windows ANSI. Until now I have only dealt with Windows-only text in my apps, but now I have to use a text file in the OEM character set.

So there is no function for it? What a pity.

So the other way would be to replace the wrong characters with the correct ones. This should not be too difficult, as the normal German alphabet has only three diacritical characters (ä, ö, ü, plus their uppercase forms) and one special character (ß).
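
A minimal sketch of that replacement approach, assuming the file really is in code page 850 and that these seven are the only non-ASCII characters in it (any other high byte is left untouched and will still display wrongly). The handler name oemGermanToISO is made up for illustration. The byte values are the code page 850 and ISO 8859-1 codes of the German characters.

```livecode
-- Hypothetical helper: fix the German special characters in data that was
-- read in binary mode from a code page 850 (OEM) file.
-- Assumes Rev's internal encoding matches ISO 8859-1, as on Windows.
function oemGermanToISO pData
  -- replace follows the caseSensitive property, which is false by default;
  -- with high bytes that could also alter unrelated characters
  set the caseSensitive to true
  -- "OEM byte,ISO byte" pairs for: ä ö ü Ä Ö Ü ß
  put "132,228 148,246 129,252 142,196 153,214 154,220 225,223" into tPairs
  repeat for each word tPair in tPairs
    replace numToChar(item 1 of tPair) with numToChar(item 2 of tPair) in pData
  end repeat
  return pData
end oemGermanToISO

-- Usage (tFile is the path of the OEM file, e.g. from "answer file"):
--   put URL ("binfile:" & tFile) into tOemData
--   put oemGermanToISO(tOemData) into tText
```

None of the target values appears as a source value in the list, so running the replacements one after another cannot convert a character twice.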

Regards,

Matthias