Extracting data from Word documents

Andycal

Post by Andycal » Fri Oct 19, 2007 9:56 am

Bit left-field this, but has anyone got any ideas whether it's possible to use RunRev to extract data from Word documents?

Specifically, I've noticed some recruitment sites will actually take a CV and then extract the address, phone number and other information automatically.

Mark

Post by Mark » Fri Oct 19, 2007 12:56 pm

On Windows, use the following VBScript:

Code:

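Option Explicit

' Usage: cscript doc2txt.vbs C:\file.doc
' Opens the document in Word and saves a copy as C:\file.doc.rtf.
Const wdFormatRTF = 6

Dim objWord
Dim strFile

If WScript.Arguments.Count < 1 Then
	WScript.Echo "Usage: doc2txt.vbs C:\file.doc"
	WScript.Quit 1
End If

strFile = WScript.Arguments(0)

' Requires Microsoft Word to be installed on this machine.
Set objWord = WScript.CreateObject("Word.Application")

objWord.Documents.Open strFile
objWord.ActiveDocument.SaveAs strFile & ".rtf", wdFormatRTF
objWord.ActiveDocument.Close

' Quit Word so no hidden WINWORD.EXE process is left running.
objWord.Quit

Save it as doc2txt.vbs, then call it using the shell function.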

On Mac OS X, you can use textutil. For more info, look here:
<http://www.hmug.org/man/1/textutil.php>
or type "man textutil" in the Terminal.
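
To make the two routes concrete, here is a rough sketch of the command line each platform needs. It is written in Python only to keep it short and checkable; the same strings can be handed to the shell function. The names build_command and convert are made up for this example, and the script name doc2txt.vbs is assumed from the usage message in the script.

```python
import subprocess
import sys


def build_command(doc_path, platform=sys.platform, script="doc2txt.vbs"):
    """Return the command line that converts doc_path on the given platform."""
    if platform.startswith("win"):
        # cscript runs the VBScript in the console; //nologo hides the banner.
        return ["cscript", "//nologo", script, doc_path]
    if platform == "darwin":
        # By default textutil writes the converted file next to the original.
        return ["textutil", "-convert", "txt", doc_path]
    raise NotImplementedError("no converter sketched for " + platform)


def convert(doc_path):
    """Run the converter; raises CalledProcessError if it fails."""
    subprocess.run(build_command(doc_path), check=True)
```

On the Mac side, "txt" can be swapped for "rtf" if you want the same output format the VBScript produces.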

Best,

Mark

Andycal

Post by Andycal » Fri Oct 19, 2007 1:18 pm

Gotcha, nice one!

FourthWorld

Post by FourthWorld » Fri Oct 19, 2007 7:04 pm

This post on the Rev discussion list describes a new library for pulling text from Word 2007 documents:

http://lists.runrev.com/pipermail/use-r ... 03526.html
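
This is not that library, just a rough sketch of why Word 2007 files are easier to handle without Word installed: a .docx is a zip archive, and the body text sits in the w:t elements of an XML file inside it. The sketch is in Python for brevity. The function name docx_text is made up here, and it assumes the main part is at word/document.xml, which is where Word normally puts it. Headers, footnotes and text boxes are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(path_or_file):
    """Return the plain text of a .docx, one line per paragraph."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        lines.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(lines)
```

Once the text is out as plain lines like this, picking out phone numbers or addresses is a separate pattern-matching job.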
Richard Gaskin