Page 1 of 2
Odd Excel data?
Posted: Fri Oct 14, 2016 3:19 pm
by dunbarx
I see I am in the beginner's section. Again.
I have written to excel forever. For the first time, also since forever, I am trying to read from it.
All works fine, open an excel file, read from it, close file. But the data I see is not the text of the spreadsheet. It looks like this:
Code: Select all
PK ! ;Hé@l ƒ [Content_Types].xml ¢(†
And it makes no difference what the actual contents of that spreadsheet are. The it variable contains the above string. I assume I am not doing something correctly.
OS 10.9, LC 6.7.9
Craig Newman
Re: Odd Excel data?
Posted: Sat Oct 15, 2016 4:51 pm
by jacque
I know little about the file format in Excel but I assume you have to open the file as binary. The content is probably UTF8 or native, you'd need to check exactly how it's stored, but you'll likely also need to re-encode it as UTF16 which is what LC uses.
It would be much easier to do that in LC 7 or 8 where unicode translation is simpler and more accurate.
Re: Odd Excel data?
Posted: Sat Oct 15, 2016 5:40 pm
by FourthWorld
The "PK" there is telling. Many file formats use "magic numbers" (which often aren't numbers at all) in the first few bytes of the file to identify its format. LiveCode, for example, uses "REVO5500", where the first four bytes identify that it's a LiveCode stack and the next four are the stack file format version.
"PK" is often used to denote "PKZip", or what we more commonly call simply "Zip".
Recent versions (mid-last-decade and forward) of Microsoft Office, and pretty much all versions of LibreOffice/OpenOffice, use multi-part data bundled into common Zip format.
You can use LC's revZip external to explore them and extract what you need. It's been a while since I did this, but last time I poked around I was able to find one of the files within the Zip bundle that contained the cell contents (others contain styling info, formulas, etc.).
Another option might be Curry Kenworthy's Excel lib. He announced this some time ago, but I don't see it on his site right now, so if extracting via revZip is too cumbersome you might drop a note to Curry to see where he's at with that:
http://lists.runrev.com/pipermail/use-l ... 71751.html
Re: Odd Excel data?
Posted: Sat Oct 15, 2016 11:39 pm
by dunbarx
Jacque, Richard.
Thanks for the replies. But are you both saying that I cannot read the file "directly"? The test spreadsheet I made in trying to debug this mess contained one cell, "A1", with "XYZ" in it.
It makes no difference if the file is open or not. Same results, no errors. Actually, I am happy about that, because I want to read from an open file that is being updated by another process.
I will try the several suggestions, and perhaps save the spreadsheet as text-only and see if that helps. But why on earth should I need to? I have no issue writing data to a spreadsheet, where all comes out just fine. Why cannot I simply read back the same data in the same format I just wrote?
Craig
Re: Odd Excel data?
Posted: Sat Oct 15, 2016 11:52 pm
by FourthWorld
dunbarx wrote:I will try the several suggestions, and perhaps save the spreadsheet as text-only and see if that helps. But why on earth should I need to?
Applications use formats that support the full range of savable data. In a word processor, for example, in addition to the text there may be style runs, pagination, indexes, and other elements. In a spreadsheet, in addition to the data contained in the cells there may be data about the formatting of the contents those cells, the formatting of the cell borders, formulas, charts, and perhaps even multiple sheets containing many different collections of all of those.
There's a lot more to a spreadsheet than the cell contents.
I have no issue writing data to a spreadsheet, where all comes out just fine. Why cannot I simply read back the same data in the same format I just wrote?
How did you write the data to a spreadsheet?
My hunch would be that you wrote a plain-text file, and then imported it, yes? If so, the format on disk is not the format in memory, and if you then save the imported data to a native Excel-format file you'll see the same Zip format as when making a new file in Excel.
Think of it like importing a text file into a field: if you wrote an app that allows you to open a text file and it creates a stack and a field within it to display the text, if you then save the stack file you'll have a lot of other things in that file beyond just the textual contents of the field.
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 3:41 am
by dunbarx
I get all that, Richard, and thanks for the information.
I would not have been surprised to get back a lot of extra formatting data, but I was indeed surprised to find that the actual contents was simply not part of that information.
In other words, I would not complain that I might be required to parse out the pertinent "actual" data, but am miffed that it does not appear at all.
Also, even if I simply type in "XYZ" in a new spreadsheet, save it and then try to read the contents, no dice. Same crap comes back.
In other words, I am calm about the fact that there is information I do not need, but upset that the whole point of reading data from that application is wasted.
In still other words, what do I have to do to get back "XYZ", regardless of what else comes over as lagniappe?
I will try to open as binary, as Jacque suggests, but I thought that the default, "text" would have made do. I am sure I will one day resolve this and even understand the nuances.
Craig
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 4:22 am
by FourthWorld
Zip is a compression format, so the contents you're looking for won't be visible in the raw data.
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 6:15 am
by jacque
Richard got it. I knew I'd seen that "PK" before but I couldn't remember where.
Craig, it's a zip file. You have to decompress it before you can read the contents.
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 8:51 am
by [-hh]
PK = Phil Katz
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 10:24 am
by AxWald
Craig,
what you have is an .xlsx, the new XML-based Excel file format.
You need to open it as a zip container, you'll find a folder "xl" in it. Within this folder there's the data:
- "./xl/workbook.xml" has the names of the single sheets of the document, in the "<sheet>" section.
- "./xl/sharedStrings.xml" has some data. First column, sometimes more.
- "./xl/worksheets/sheet[x].xml" has the remaining data, where "[x]" is "SheetID" from "workbook.xml"
There might be more, I only had a quick look at it. But these you can access easily, using revZip and simple text file processing. Further investigation was omitted, because I'm allergic to XML ;-)
Have fun!
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 3:30 pm
by dunbarx
Thanks all.
So now I am depressed.
What I wanted to do was to, from LC, read a "live" excel spreadsheet. That is, one that is being updated via another process (written to from an external electronic device)
To explain, I want to type into a spreadsheet, switch to LC and invoke a read of that spreadsheet, I take that text and do stuff with it. I could switch back and forth, typing and reading as the data changes. But if I have to unpack a zip file after each read, unless I can do that under script control, I cannot let that process run untended.
Untended.
The other gadget has a feature to write explicitly to Excel. It does not output to just any old text editor, one that stays in the clear.
I wonder if I can load an old version of Excel that works the old-fashioned way? Did I mention I was depressed?
Craig
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 6:13 pm
by AxWald
Hmm.
dunbarx wrote:What I wanted to do was to, from LC, read a "live" excel spreadsheet. [...] I wonder if I can load an old version of Excel that works the old-fashioned way?
You'd need to save quite often in Excel, else you'd not have fresh data on disc ...
If you'd like to try, just set the output format for "save as" to one of the old versions, this way even a new Excel should write the old formats.
(There might be a format "97-2003 (*.xls)" - this is what I use whenever I have to write MS Office formats; it's the least incompatible)
Or, if you really need to grab live data from Excel, I'd try "do ... as VBScript" or VB via "shell".
Never did VB besides database scripting in Access, but it shouldn't be this hard to find a VB freak that can help you - and Excel would, that's sure, give you even unsaved data each time you ask for. Oh, I actually found
an example!
No reason to feel depressed yet ;-)
Have fun!
Re: Odd Excel data?
Posted: Sun Oct 16, 2016 8:18 pm
by FourthWorld
What is this external device that outputs data specifically in Microsoft Excel's very complex format?
Does it offer any simpler, more common options?
Re: Odd Excel data?
Posted: Mon Oct 17, 2016 5:30 am
by dunbarx
@AxWald.
I will try that, creating a file in an older format that may not be so "modern".
@Richard.
What is this external device that outputs data specifically in Microsoft Excel's very complex format?
It is a light meter, a very sophisticated one, that can output via USB to an excel spreadsheet. A front end manages the discourse. Interestingly, I am never going to save the spreadsheet itself. It is only a vehicle to get data from that gadget into LC. After each session I will likely empty the contents. The reason for that being that the gadget will continually write CSV data, appending forever, and I do not want to exceed the limits of the spreadsheet itself. I will trigger an event externally, and read only the last bits of data at the "end" of the file. Each new trigger will invoke a new write to excel, a new read into LC, and a new result for me after I separate out the new info.
An untended live process with events about every ten seconds.
Craig
Re: Odd Excel data?
Posted: Mon Oct 17, 2016 6:10 am
by FourthWorld
Does the gadget write in Excel or in CSV format?