Using CSV as reference file
Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller
Using CSV as reference file
Hi,
I need to be able to import a CSV file to use as reference data and 'link' to this data via a given input key. Not sure the best, most efficient way to attack this particular problem, so here goes.
I have an external CSV file which is developed and updated periodically by a 3rd party and is access via HTML - e.g. put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData.
It contains Post Code data. It has 14 columns, of which only 4 are of interest to me. The 'key' (that I need) to the CSV file is in the 2nd field (column). The CSV file is delimited by comma and each field is enclosed in quotes e.g. "222","2001","Sydney GPO","Sydney Metro Area","NSW". The first row of the CSV contains the field names e.g. "id","postcode","name","area","state"
What I need to understand:-
1. The best structure to import this data into
2. Make field 2 (postcode) the key of data structure
3. Easiest way to get rid of column 1 (id) - if necessary
4. Most efficient way to lookup values in this file by key
The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.
Any advice would be most welcomed.
Regards
I need to be able to import a CSV file to use as reference data and 'link' to this data via a given input key. Not sure the best, most efficient way to attack this particular problem, so here goes.
I have an external CSV file which is developed and updated periodically by a 3rd party and is access via HTML - e.g. put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData.
It contains Post Code data. It has 14 columns, of which only 4 are of interest to me. The 'key' (that I need) to the CSV file is in the 2nd field (column). The CSV file is delimited by comma and each field is enclosed in quotes e.g. "222","2001","Sydney GPO","Sydney Metro Area","NSW". The first row of the CSV contains the field names e.g. "id","postcode","name","area","state"
What I need to understand:-
1. The best structure to import this data into
2. Make field 2 (postcode) the key of data structure
3. Easiest way to get rid of column 1 (id) - if necessary
4. Most efficient way to lookup values in this file by key
The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.
Any advice would be most welcomed.
Regards
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
I am a pain in the bum because I tend to write in these reply boxes as I try things out.
1.
2. Import the CSV file as a text file and strip out the double quotes.
- -
Rude, Crude, and Inefficient.
Got to get that code away from fields.
IMPORT button:
REMOVE QUOTES button:
After 45 minutes the REMOVE button had chewed its way through about a third of the imported text.
1.
Which 4? The first, the last, 4 distributed in some odd fashion through each row?It has 14 columns, of which only 4 are of interest to me.
2. Import the CSV file as a text file and strip out the double quotes.
- -
Rude, Crude, and Inefficient.
Got to get that code away from fields.
IMPORT button:
Code: Select all
on mouseUp
answer file "Choose a CSV file to import"
if the result = "cancel"
then exit mouseUp
else
set the text of fld "prePROC" to URL ("file:" & it)
end if
put "&&&" after fld "prePROC"
put the number of chars in fld "prePROC" into fld "KOUNT"
end mouseUp
Code: Select all
on mouseUp
put empty into fld "PROC1"
put empty into fld "KOUNT2"
put fld "prePROC" into rawDDD
repeat until char 1 of rawDDD is "&"
add 1 to fld "KOUNT2"
if char 1 of rawDDD is quote then
delete char 1 of rawDDD
else
put char 1 of rawDDD after fld "PROC1"
delete char 1 of rawDDD
end if
end repeat
end mouseUp
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
Introduced a COMBO button:
Still pretty slow.
* AND, for some "funny" reason does NOT strip out the quotes.
Code: Select all
on mouseUp
answer file "Choose a CSV file to import"
if the result = "cancel"
then exit mouseUp
else
put URL ("file:" & it) into rawDDD
put "&&&" after rawDDD
end if
-------------
put empty into fld "KOUNT2"
put empty into fld "PROC1"
repeat until char 1 of rawDDD is "&"
add 1 to fld "KOUNT2"
if char 1 of rawDDD is quote then
delete char 1 of rawDDD
else
put char 1 of rawDDD after procDDD
delete char 1 of rawDDD
end if
end repeat
put rawDDD into fld "PROC!"
end mouseUp
* AND, for some "funny" reason does NOT strip out the quotes.
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
At the risk of being offensive I have cut that enormously long CVS file down to 10 lines:
This leaves us with something manageable for development.
This leaves us with something manageable for development.
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
I don't think I understand that: as my Granny said, "A picture is worth a thousand words."The 'key' (that I need) to the CSV file is in the 2nd field (column).

- -
If you could download that picture and mark the key I think things would be easier.
Do you mean by "key" the factor that is to be used to order the data?
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
If that is what you mean . . .
- -
button REMOVE QUOTES & FIRST LINE & REORDER
- -
button REMOVE QUOTES & FIRST LINE & REORDER
Code: Select all
on mouseUp
put empty into fld "PROC1"
put empty into fld "KOUNT2"
put fld "prePROC" into rawDDD
repeat until char 1 of rawDDD is "&"
add 1 to fld "KOUNT2"
if char 1 of rawDDD is quote then
delete char 1 of rawDDD
else
put char 1 of rawDDD after fld "PROC1"
delete char 1 of rawDDD
end if
end repeat
delete line 1 of fld "PROC1"
---
set the itemDelimiter to ","
put 1 into LYNE
repeat until line LYNE of fld "PROC1" is empty
put empty into TAIL
put item 1 of line LYNE of fld "PROC1" into TWOO
put item 2 of line LYNE of fld "PROC1" into WUN
put 3 into ITEMM
repeat until item ITEMM of line LYNE of fld "PROC1" is empty
put (item ITEMM of line LYNE of fld "PROC1") & "," after TAIL
add 1 to ITEMM
end repeat
put WUN & "," & TWOO & "," & TAIL into line LYNE of fld "PROC1"
add 1 to LYNE
end repeat
end mouseUp
Last edited by richmond62 on Tue May 26, 2020 10:26 am, edited 1 time in total.
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
You're telling me!The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.

I don't think you need to LOOP through the file, but you do need to READ through the comma-delimited items in the file
after it has been processed in the way I have demonstrated.
The key (and that does not mean your key) to the whole problem (which comes down to speed) lies with the word item.
- -
button LOOKUP:
Code: Select all
on mouseUp
set the itemDelimiter to ","
put empty into fld "PROC2"
ask "Key value? "
put it into KEYY
put it into fld "KEYYY"
put 1 into LYNE
put 1 into LYNE2
repeat until line LYNE of fld "PROC1" is empty
if item 1 of line LYNE of fld "PROC1" contains KEYY then
put line LYNE of fld "PROC1" into line LYNE2 of fld "PROC2"
add 1 to LYNE2
else
--do nix
end if
add 1 to LYNE
end repeat
end mouseUp
- Attachments
-
- OZ_PC.livecode.zip
- Here's the stack.
- (2.36 KiB) Downloaded 216 times
-
- Posts: 738
- Joined: Thu Sep 11, 2014 1:49 pm
Re: Using CSV as reference file
Made a little example on how I would import the file into an array and then show it in a TreeView.
takes about 4 seconds.
Below is the code in the button and you need a TreeView named 'TreeViewData'.
Hope this helps some?
regards,
Paul
takes about 4 seconds.
Below is the code in the button and you need a TreeView named 'TreeViewData'.
Code: Select all
on mouseUp
set the arraydata of widget "TreeViewData" to empty
answer file "Choose a CSV file to import"
if the result = "cancel"
then exit mouseUp
else
put URL ("file:" & it) into gCSVData
end if
----------------------------------------------------------
replace quote with "" in gCSVData
replace comma with tab in gCSVData
put line 1 of gCSVData into tColumnDescr
set the itemdelimiter to tab
put the number of items of tColumnDescr into tNbrOfColumns
delete line 1 of gCSVData -- do not want description row
put 0 into tCounter
repeat for each line tLineData in gCSVData
add 1 to tCounter
repeat with tColumnNbr = 1 to tNbrOfColumns
put item tColumnNbr of tColumnDescr into tColumnName
put item tColumnNbr of tLineData into tColumnData
put tColumnData into aArrayData[tCounter][tColumnName]
end repeat
end repeat
set the arraydata of widget "TreeViewData" to aArrayData
end mouseUp
regards,
Paul
Re: Using CSV as reference file
Wow - richmond62, thanks for your efforts here.
The 'key' I want from the data is in the 'postcode' field (column 2) in the spreadsheet.
The other fields I need are in columns 3,4 & 11 (locality, state, sa3name).
Is it possible to get rid of the quotes with something like:-
put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData
replace quote with empty in tPostcodeData
mrcoollion, thanks will test this tomorrow.
The 'key' I want from the data is in the 'postcode' field (column 2) in the spreadsheet.
The other fields I need are in columns 3,4 & 11 (locality, state, sa3name).
Is it possible to get rid of the quotes with something like:-
put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData
replace quote with empty in tPostcodeData
mrcoollion, thanks will test this tomorrow.
Re: Using CSV as reference file
My real issue then is being able to prompt for postcode input and display the postcode detail from the table, specifically sa3name, locality & state. The imported postcode file is a reference file - I don't need to display it. What is most efficient way to link input postcode to the above table?mrcoollion wrote: ↑Tue May 26, 2020 11:30 amMade a little example on how I would import the file into an array and then show it in a TreeView.
takes about 4 seconds.
ImportCSV01.zip
Below is the code in the button and you need a TreeView named 'TreeViewData'.
Hope this helps some?Code: Select all
on mouseUp set the arraydata of widget "TreeViewData" to empty answer file "Choose a CSV file to import" if the result = "cancel" then exit mouseUp else put URL ("file:" & it) into gCSVData end if ---------------------------------------------------------- replace quote with "" in gCSVData replace comma with tab in gCSVData put line 1 of gCSVData into tColumnDescr set the itemdelimiter to tab put the number of items of tColumnDescr into tNbrOfColumns delete line 1 of gCSVData -- do not want description row put 0 into tCounter repeat for each line tLineData in gCSVData add 1 to tCounter repeat with tColumnNbr = 1 to tNbrOfColumns put item tColumnNbr of tColumnDescr into tColumnName put item tColumnNbr of tLineData into tColumnData put tColumnData into aArrayData[tCounter][tColumnName] end repeat end repeat set the arraydata of widget "TreeViewData" to aArrayData end mouseUp
regards,
Paul
In SQL terms:-
select sa3name, locality, state from aArrayData where postcode = inputpostcode
-
- Livecode Opensource Backer
- Posts: 366
- Joined: Tue Apr 10, 2012 9:18 am
Re: Using CSV as reference file
how often is this csv file updated? where does this file live normally? who puts it on the server?I have an external CSV file which is developed and updated periodically by a 3rd party and is access via HTML - e.g. put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData.
I ask, because a slightly different approach could be to have a simple script on your server that imports/updates a mysql database every night ( or more often).
then you work with the database rather than having to churn through a csv file. The server would do the heavy work for you so your app could just request a small bit of info at a time.
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
Knowing 'that' the whole procedure can be shortened, and a list field can be produced that contains ONLY columns 2,3,4 & 11.Wow - richmond62, thanks for your efforts here.
The 'key' I want from the data is in the 'postcode' field (column 2) in the spreadsheet.
The other fields I need are in columns 3,4 & 11 (locality, state, sa3name).
mrcoollion's idea with the array is very clever, but I am not sure how you get at the data you need at the end of it.
At present I have a 5 minute break in the middle of online teaching, so I will take a quite a bit until I get round to that.

Re: Using CSV as reference file
Hi,
Result: "Processed 18272 lines in 154 millisecs" - now this didn't take this long.
Your sanitized data are now saved in a custom property to wait for further processing:
Another button, to get the entries matching a certain postcode (in fld "postcode_fld"):
Searching for postcode "4816": "Filtered 41 lines from total 18272 in 20 millisecs" - no matter what, it's always ~20 millisecs here.
Have fun!
Really? A button to read 'em all:
Code: Select all
on mouseUp
answer file "What file?"
if it is empty then exit mouseUp
put URL ("file:" & it) into myVar
put the millisecs into t1 -- we got data, start timer!
delete line 1 of myVar -- kill header line
repeat for each line L in myVar -- extract used data
put clearQuote(item 2 of L) & tab & \
clearQuote(item 3 of L) & tab & \
clearQuote(item 4 of L) & tab & \
clearQuote(item 11 of L) & CR after myData
end repeat
delete char -1 of myData
set the c_myData of this stack to myData -- save for later
put the millisecs - t1 into myTime
answer "Processed" && the number of lines of myData && "lines in" && myTime && "millisecs"
end mouseUp
function clearQuote what
replace quote with empty in what
return what
end clearQuote
Your sanitized data are now saved in a custom property to wait for further processing:
Code: Select all
0200 ANU ACT
0200 Australian National University ACT
0800 DARWIN NT Darwin City
0801 DARWIN NT Darwin City
...
Another button, to get the entries matching a certain postcode (in fld "postcode_fld"):
Code: Select all
on mouseUp
put the c_myData of this stack into myVar -- load data
put fld "postcode_fld" into myPC
if myPC is not in myVar then -- just to be sure ...
answer error "Your postcode doesn't occur!"
exit mouseUp
end if
put the millisecs into t1 -- start timer
set itemdel to tab
repeat for each line L in myVar
if item 1 of L = myPC then -- filter matching lines
put L & CR after myData
end if
end repeat
delete char -1 of myData
put myData into fld "result_fld" -- display results
put the millisecs - t1 into myTime
answer "Filtered" && the number of lines of myData && "lines from total" \
&& the number of lines in myVar && "in" && myTime && "millisecs"
end mouseUp
Have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
-
- Posts: 738
- Joined: Thu Sep 11, 2014 1:49 pm
Re: Using CSV as reference file
Could not resist.
Here is a version that imports the CSV, strips all necessary columns, and has a filter field in which you can type the postal code to filter the list.
It is pretty fast, maybe someone can make it faster
Regards,
Paul (MrCoolLion)
Here is a version that imports the CSV, strips all necessary columns, and has a filter field in which you can type the postal code to filter the list.
It is pretty fast, maybe someone can make it faster

Regards,
Paul (MrCoolLion)
-
- Livecode Opensource Backer
- Posts: 10099
- Joined: Fri Feb 19, 2010 10:17 am
Re: Using CSV as reference file
Fantastic!Could not resist.
