Manipulating a large CSV file (beginner question)

Creating desktop or client-server database solutions?

Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller, robinmiller

Post Reply
will_harrison
Posts: 5
Joined: Mon May 09, 2011 3:01 pm

Manipulating a large CSV file (beginner question)

Post by will_harrison » Mon May 16, 2011 6:45 pm

Hi,

I've signed up to the seminar to learn LiveCode for enterprise deployment of iPad applications. My first is an app that allows sales people to search a large price list to create a request for quote with a customer. the database for the pricelist exist as a CSV file of about 160,000 individual products/options. The first question I have is...is a CSV file with a defined delimiter the best format? Secondly, what is the best way to access this file?

I'm hoping the semainar can get me going in the right direction. My programming skills are rusty and I don't want to take the time to learn Objective C. LiveCode seems to be the best alternative to get this app up and running.

Thanks in advance for the response. Will

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: Manipulating a large CSV file (beginner question)

Post by mwieder » Mon May 16, 2011 7:35 pm

Will-

Well, I don't think csv is the best format for *anything*, but that's beside the point <g>.

If you're dealing with csv files already, then you can just treat them as text files. Hopefully they're delimited by something reasonable, like real tabs. If so, check out the itemDelimiter property. The size of the csv file shouldn't be a problem as long as it can fit into memory all at once. In that case, just read in the lines something like this (note: the code below is completely untested)

-- first find the csv file on the disk
answer file "where is the csv file"
if it is not empty then
put it into tFile
put url ("file:" && tFile) into tCSV
end if
set the itemDelimiter to tab

-- now you can pick out the product you're interested in
-- assuming the first item in each line of the csv file is the product id you could do
filter tCSV with tProductID & "*"
-- and you've got the line of the csv file you want

-- or you could go the long way around:
put empty into tFound
repeat for each line tLine in tCSV
if item 1 of tLine is tProductID then
put tLine into tFound
exit repeat
end if
end repeat
if tFound is not empty then
-- found what we're looking for
put item 2 of tFound into tPrice
end if

will_harrison
Posts: 5
Joined: Mon May 09, 2011 3:01 pm

Re: Manipulating a large CSV file (beginner question)

Post by will_harrison » Wed May 18, 2011 9:59 pm

Thanks for your response to my question. That helps immensely.

I did another look at how my data file is formatted. It is actually a text file (.txt) with a ~ delimiter. I can export the data with a different delimiter but I don't think that will matter. The file consists of records that have a product id. If a product has options, the product id is the same for the "root" product and the option records. I'll want to find the "root" and then list all the options for choosing.

Any ideas on how to make this work are appreciated!

Will

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: Manipulating a large CSV file (beginner question)

Post by mwieder » Thu May 19, 2011 2:08 am

I think the filter command is going to be your friend here. Without actual sample lines from your csv file I'll have to guess a bit, but (assuming a field delimiter of "~", the entire tCSV text already loaded into tCSV, and the product id you're interested in at tProductID)

filter tCSV with "*~" & tProductID & "~*"

leave tCSV with just the lines from the csv file that contain a matching product id field. This is rather brute force, since you could possibly get some matches from other fields, but I don't know details about the csv layout. A more stringent filter string would help narrow this down.

will_harrison
Posts: 5
Joined: Mon May 09, 2011 3:01 pm

Re: Manipulating a large CSV file (beginner question)

Post by will_harrison » Sat Dec 29, 2012 1:55 am

I got off trace for a few months with other projects and am back on this task. Now I clearly realize that a ODBC link will be a better choice. Do you agree?

Will

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: Manipulating a large CSV file (beginner question)

Post by mwieder » Sat Dec 29, 2012 2:37 am

Wow - this is a nostalgic thread. Talk about getting off track - I had to read the whole thread again to remember what this was about <g>.

So... the data is already in a database and it gets exported to a csv file? If that's the case, then accessing the database directly would be a much better way to go. And since you're talking ODBC, I take it that this is a Windows machine?

will_harrison
Posts: 5
Joined: Mon May 09, 2011 3:01 pm

Re: Manipulating a large CSV file (beginner question)

Post by will_harrison » Sat Dec 29, 2012 11:22 pm

Yes, it is in a Access database and could stay there instead of exporting out to a csv or other txt file. I'd like to do open it on a Windows or Mac. My first task is to open it up and then search for a partnumber. I could send a sample script that I've written already to open and then get the data. Would you mind critiquing what I've started? I did set up the ODBC links to the database. The preliminary work I've done opens the database ok, but doesn't get the data. What would you like me to send you?

Thanks for the help. Will

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: Manipulating a large CSV file (beginner question)

Post by mwieder » Sun Dec 30, 2012 12:28 am

Will-

I long ago banished Office products from my computers, so I wouldn't be able to help with the database itself. Maybe you could post some of the offending code here. By "opens the database ok", I assume you mean you've got a numeric id to work with, and that's a great start - it's usually the stumbling block most folks run into. So... "doesn't get the data"... you're throwing a SQL select statement at the database and it's returning an empty result or an error code? Are you using revQueryDatabase to return a recordset or revDataFromQuery to get the data itself?

will_harrison
Posts: 5
Joined: Mon May 09, 2011 3:01 pm

Re: Manipulating a large CSV file (beginner question)

Post by will_harrison » Tue Jan 01, 2013 8:16 pm

Happy New Year!

I've made some progress. I exported out my database from MS-Access and was able to create a SQLite database using a Firefox SQL database manager. Now I will go back to my code and see if I can open and not get the revdberr. Yes, I'm using the revDataFromQuery.

Here is the code segment for my "retrieve" button:

on mouseUp
-- check the global connection ID to make sure we have a database connection
global gConnectionID
if gConnectionID is not a number then
answer error "Please connect to the database first."
exit to top
end if

-- construct the Price List (this selects all the data from the specified table)
put "Active Price List Only" into tTableName -- set this to the name of a table in your database
put "SELECT * FROM tTableName " into tQuery

-- query the database
get revDataFromQuery(tab, cr, gConnectionID, tQuery)

-- check the result and display the data or an error message
if item 1 of tData = "revdberr" then
answer error "There was a problem querying the database:" & cr & tData
else
put tData into field "Data"
set the dgText of group "datagrid 1" to tData

--This closes the Database connection
revCloseDatabase tDatabaseID
end if
end mouseUp

Thanks for the help. Will

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Contact:

Re: Manipulating a large CSV file (beginner question)

Post by mwieder » Tue Jan 01, 2013 10:12 pm

The first thing that jumps out at me is that you're not putting anything into tData before checking item 1 of it for an error. Try this...

Code: Select all

-- query the database
put revDataFromQuery(tab, cr, gConnectionID, tQuery) into tData

-- check the result and display the data or an error message
if item 1 of tData = "revdberr" then
...but I'm also a bit uneasy about using spaces in table names. YMMV.

Post Reply