Stripping text from Word documents



Post by jpottsx1 » Thu Feb 18, 2010 5:21 pm

Is it possible to read the contents of .DOC and .PDF files into text fields, stripping out the formatting on import? I just want the actual text with CR/LF line breaks and none of the other garbage that comes with the "read anyfile" demo.
Jeff G potts


Post by Klaus » Thu Feb 18, 2010 6:25 pm

Hi Jeff,

No, this is not possible without extreme effort!
DOC and PDF are NOT plain text files, as you have seen, so this will not work right "out of the box".

If you are on OS X, you could use the shell function and "textutil" to convert a DOC or PDF to plain text or RTF and work with the converted file.
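
The textutil suggestion above could look roughly like the following. This is only a sketch under a few assumptions: the function name doc_to_text is made up for illustration, the input is a Word file, and the machine is a Mac with textutil on the PATH. The textutil man page lists txt, html, rtf, rtfd, doc, docx, wordml, odt and webarchive as formats; PDF does not appear in that list, so PDF input may need a different tool.

```shell
# doc_to_text FILE  (hypothetical helper name, not part of any library)
# Print the plain text of a Word document to stdout using macOS textutil.
doc_to_text() {
  # Refuse missing files up front so the failure is the same on every system.
  if [ ! -f "$1" ]; then
    echo "doc_to_text: no such file: $1" >&2
    return 1
  fi
  # textutil ships with macOS only; bail out where it is not installed.
  if ! command -v textutil >/dev/null 2>&1; then
    echo "doc_to_text: textutil not found (macOS only)" >&2
    return 127
  fi
  # -convert txt selects plain-text output; -stdout writes the result to
  # standard output instead of creating a .txt file next to the input.
  textutil -convert txt -stdout "$1"
}
```

From LiveCode, the same textutil command line could be passed to the shell function and the returned text placed in a field. Paths that contain quotes or other shell metacharacters would need extra escaping, which this sketch does not attempt.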


Best

Klaus


Post by Curry » Sat Apr 24, 2010 6:57 am

The WordLib library can import Word files. Its forte is the newer DOCX format (Word 2007) and OpenOffice files, but it does provide basic support for legacy Word DOC files, and it seems that's just what you're after. It does a pretty good job of extracting the text.

To get the plain text with no styles, just import the file and then, for example, "put field 1 into field 1" to clear any formatting.

(I hope to provide full formatting support for the legacy DOC files in a future version, and the more registered users I have, the more I will be able to develop the library!)
Best wishes,

Curry Kenworthy

LiveCode Development, Training & Consulting
http://livecodeconsulting.com/

WordLib: Conquer MS Word & OpenOffice
SpreadLib: "Excel-lent" spreadsheet import/export
http://livecodeaddons.com/
