Another data storage question

Philhold · Post by **Philhold** » Sun Mar 15, 2009 11:06 pm

The application I am working on will have a small database, less than 100 records and a user interface. There may (will probably) be a need in the future to provide a replacement, updated, user interface whilst retaining the data.

So that I don't start off down the wrong road, what is the best option for data storage so that I can make updating the UI trivial for non techy users?

Thanks

Phil

PS I hope that this is not too general a question for this forum.

FourthWorld · Post by **FourthWorld** » Mon Mar 16, 2009 7:26 am

Not too general at all. I suspect a good many will benefit from this thread you've started.

You could look at data storage in three tiers:

1. Cards, using shared background fields for storage
Good for data sets < 5000 records

2. Custom properties, lists of data stored in a stack file
Good for data sets < 50,000 records (depending on the specifics of what you're doing with the data)

3. DBMS engines like SQLite, MySQL, Valentina, etc.
Good for probably more data than you'll ever need.

There are tradeoffs with each approach, as each has its own unique strengths and weaknesses making each ideally suited for some uses but not others.

For a data set as small as you describe, unless you plan on growing it to become 50 times larger than you currently anticipate, storing the data in shared fields on cards might work well.

The downside for that in your case is that you have the foresight to anticipate needing to make UI changes, and IMO there's little benefit to storing data in fields if those fields aren't also for display.

So you may find the best balance of simplicity and flexibility storing your data as lists in a custom property of a stack file. Then, using Rev's convenient chunk expressions to get data by line and item, you can fill in fields in the UI for display and editing.

On the storage side, this article by Sarah Reichelt at revJournal.com may be helpful:
http://www.revjournal.com/tutorials/sav ... ution.html

And of course if you have any questions about ways to handle the parsing of your data, just post 'em here and we'll chime in. I've spent a fair amount of time benchmarking various different ways to parse text in Rev, since as you know a big chunk of my job is all about parsing data feeds and web templates. So if I can be of help I'd be glad to see what I can do.

Happy scripting -

Philhold · Post by **Philhold** » Mon Mar 16, 2009 10:46 am

Hi Richard,

Many thanks, I think that the article you pointed to provides the answer I was looking for.

I think I may be contemplating trying to run before I can even stand up. Terms like "custom property" just don't mean anything and then you realise that they can (I think) just be chunks of text delimited how ever you want and you just need to write little routines to write them out and read them back in again. Or am I missing something even more simple?

Could you point to snippets that write and read common text data storage formats like tab delimited, CSV etc. Or are custom properties not like that at all?

Sorry to sound so thick but I just can't see the wood for all these trees.

Cheers

Phil

PS I always seem to start using dodgy analogies when I'm floundering.

FourthWorld · Post by **FourthWorld** » Tue Mar 17, 2009 4:23 pm

CSV is the devil's playground. I could go on for pages as to why it is among the worst formats ever devised, but in short there is no single standard, and the myriad ad hoc variants that exist aren't even implemented consistently among products from a single vendor (e.g., Microsoft). Because it simply must die, I won't bother describing the horrors that go into parsing it. It's a favor to the user, and to the computing industry as a whole, to encourage them to use a saner format like tab-delimited instead.

A strong benefit of tab-delimited data in Rev is that you can just put the contents of the data set into a field for display, since Rev's multi-column list fields already use tabs as column delimiters.

As for properties, one could think of them like low-overhead fields that you never see. Indeed, the text of a field is just another property, so the syntax is similar:

Code: Select all

set the text of fld 1 to tMyData
set the uData of fld 1 to tMyData

But with properties you get both more and less than you do with fields: more flexibility, and less overhead.

On the overhead, this post to the use-rev list uses a metaphor no one but me finds humorous to describe the order-of-magnitude more work the engine has to do to store data in a field than it does for a property:
http://lists.runrev.com/pipermail/use-r ... 11477.html

On the flexibility, custom properties in Rev are implemented to mirror the syntax used for its associative arrays. This means you can slice and dice your data however best suits your retrieval needs.

If your data is well suited as a flat table, you could just dump a tab- and return-delimited chunk into a field with the line for uData above.*

But if your data is hierarchical by nature, you can use syntax like:

Code: Select all

set the uData["employee"]["startDate"] to "4/4/2004"

Efficient use of array syntax relative to "repeat for each line" in a return-delimited list is a deep topic, more than we want to get into here. In many cases array notation provides faster access to specific elements, but not in all cases, and the simplicity of tab-delimited data makes maintenance a breeze.

For the moment I'll assume tabbed data will be useful for you, and you're encouraged to experiment with arrays if you find yourself working with hierarchical data.

Some things to look up in the Dictionary for working with tab- and return-delimited data:

- lineOffset: lets you find the line number for a matching string

- itemOffset: does the same for items within a string

- itemDelimiter: can be set to any value; defaults to comma, but can be set to tab

- repeat for each: very fast way to traverse a list

- filter: this very flexible command makes one-liners of many filtering tasks

- hilitedLines: the numbers of the lines currently selected in a list field, delimited by commas

- hilitedText: the actual text of selected lines in a list field, delimited by returns.
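As a rough illustration of two of the items above, here is a minimal sketch that uses the filter command on tab-delimited records. It assumes the data lives in a custom property named uData of a stack named "MyData" (as in the handler further down), and that a list field named "Results" exists; the field name and the search string "Smith" are made up for the example.

```livecode
-- Script of a button; "Results" is a hypothetical list field.
on mouseUp
   -- Get a copy of the data to work on:
   put the uData of stack "MyData" into tData
   -- Keep only the lines that contain "Smith" anywhere in the record:
   filter tData with "*Smith*"
   -- Tabs in the data line up with the columns of a list field:
   put tData into fld "Results"
end mouseUp
```

Because filter works on a copy in a variable, the stored property is left untouched.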

Here's a simple handler as an example for filling in a form from the contents of a record associated with a selection in a list field. It assumes that your data is a list of contacts (like the Congress Contacts example included in your WebMerge installation, which I use for a lot of simple testing like this), which has eight fields for each record and the first field is a unique identifier, and that your form fields are in a group named "form":

Code: Select all

-- Script of list field:
on selectionChanged
   set the itemdel to tab
   -- Get the first item of our list:
   put item 1 of the hilitedText of me into tRecordID
   -- Get the data to work on:
   put the uData of stack "MyData" into tData
   -- Find the record; note that putting delimiters around the string we're 
   -- looking for prevents finding erroneous substrings:
   put lineoffset(cr& tRecordID &tab, cr&tData) into tLineOffset
   if tLineOffset = 0 then
     answer "No record found for ID "&quote& tRecordID &quote&"."
     exit to top
   end if 
   -- Now that we know which line, get that line's data:
   put line tLineOffset of tData into tRecord
   -- Fill in the fields in our "Form" group:
   put 0 into i
   repeat for each item tItem in tRecord
       add 1 to i
       put tItem into fld i of grp "Form"
   end repeat
end selectionChanged

(That's off the top of my head, so if it has a bug just consider that an "exercise for the reader" <g>)

In actual practice you'd probably want to put the form-filling stuff into a more generalized handler stored somewhere farther along in the message path so it could be used by other objects. But this will hopefully at least get you started on your own explorations of this sort of data management.
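One way such a generalized handler might look is sketched below. The handler name fillForm and its parameters are hypothetical; it assumes the record is a single tab-delimited line and that the fields in the named group are in the same order as the items in the record.

```livecode
-- Placed in the card or stack script so any object can call it.
-- pRecord: one tab-delimited line of data
-- pGroupName: the name of the group holding the form fields
on fillForm pRecord, pGroupName
   set the itemDelimiter to tab
   put 0 into i
   repeat for each item tItem in pRecord
      add 1 to i
      -- Stop if the record has more items than the group has fields:
      if i > the number of flds of grp pGroupName then exit repeat
      put tItem into fld i of grp pGroupName
   end repeat
end fillForm
```

The list field's selectionChanged handler could then end with a single call such as `fillForm tRecord, "Form"`.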

* What's with the "u"? Hungarian-lite:
http://www.fourthworld.com/embassy/arti ... style.html

Philhold · Post by **Philhold** » Tue Mar 17, 2009 5:43 pm

Hi Richard,

Absolutely brilliant, many thanks! I've pasted your post into a Yojimbo note until I decide to buy Scripters Scrapbook when it will go in there.

Whilst busy doing other things today I've been thinking that perhaps the best option is to write an export/import routine to allow for the GUI to be updated and to allow it to be transferred to a different application if that became necessary at some stage in the future.
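A minimal sketch of what such an export/import pair could look like, assuming the data is kept as tab-delimited text in the uData property of a stack named "MyData" as described earlier in the thread. The handler names and the pFilePath parameter are hypothetical.

```livecode
-- Write the stored data out to a plain text file:
on exportData pFilePath
   put the uData of stack "MyData" into URL ("file:" & pFilePath)
   if the result is not empty then
      answer "Export failed:" && the result
   end if
end exportData

-- Read a text file back in and store it in the custom property:
on importData pFilePath
   put URL ("file:" & pFilePath) into tData
   if the result is not empty then
      answer "Import failed:" && the result
      exit importData
   end if
   set the uData of stack "MyData" to tData
   -- Custom properties persist only once the stack file is saved:
   save stack "MyData"
end importData
```

Since the exported file is ordinary tab-delimited text, it could also be opened in a spreadsheet or read by a different application.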

Meanwhile I think I'll go away and digest your post and try to learn how to use tab delimited format in custom properties as that sounds like a really good idea.

Best wishes

Phil

PS I like the bucket down the hall analogy

Philhold · Post by **Philhold** » Wed Mar 18, 2009 5:54 pm

Hi Richard,

I had a play around with your script last night and together with fwFormMaker and the data from the WebMerge Demo I downloaded I got the whole thing working exactly as expected.

I munged a bit of your data into tab delimited and cut and pasted that into the "Property Contents" panel of the stack's custom properties pane and called it uData.

Now having done that I understand a bit about how custom properties work. It takes some getting used to that any object can have lots of custom properties each with loads of data stored in them.

I'm now playing around with looping scripts to add data and will get onto a script to update existing data later. But ...

I find the lack of pre-existing routines for doing this kind of basic stuff a tad frustrating. If I want to make an SQL database there's no problem because the INSERT, SELECT, UPDATE and DELETE techniques are all fairly standard at the lower level. So you can climb on board and have something working fairly quickly. With other types of data storage however there are too many choices. You are provided with a selection of "design your own wheel packs" but it would be much better if a choice of wheels was part of Runtime Revolution and all you needed to decide was which one and how fast you wanted to spin it. At least at first. Once you got used to standard data storage techniques you could then move on to trying to design something even better for your purposes.

I found another of your articles here:
http://www.sonsothunder.com/devres/revo ... stk001.htm which helps a little further to explain what is required.

Best wishes

Phil

FourthWorld · Post by **FourthWorld** » Wed Mar 18, 2009 10:54 pm

Boundaries. They define our space, giving us the freedom to explore in finite space. Life is hard without them.

I hear ya' on reinventing the wheel, but the difficulty is that folks tend to fall into two camps on data storage: they either use a database, or they use a million different solutions specific to their app.

Rev provides pretty good support for using DBs as data stores, made even easier with Trevor's libDB:
http://www.bluemangolearning.com/developer/revolution/

But generic data stores for document-centric apps like I tend to make are a bigger challenge, because everyone likes to do things differently: some use stacks like I do, but others use stacks with properties in nested arrays, others store stuff in fields in their UI stacks (I know, I know, but for quick-n-dirty apps it's so simple to do that it's hard to resist), and others define their own formats like most of the rest of the world does.

Most folks see data storage in a rather Boolean way: If you have really small data sets, consider fields on cards; for everything else use a database.

I love DBs for massive data sets, but for document-centric stuff I need more flexibility and I prefer control over how the data is stored. And DB files just aren't well suited from an API standpoint for use as document files.

This way of using custom props as generic data stores we've been discussing here is not as common as you might think, even though it's darned convenient.

It might be soon, though: I've been crafting a library to make that sort of thing as simple as I can while providing reasonable performance and flexibility.

Drop me an email and we'll pick this conversation up there. It's a long way from being documented, but perhaps with your help it can become something others can use.

ambassador@fourthworld.com

gyroscope · Post by **gyroscope** » Wed Mar 18, 2009 11:02 pm

If I may join in here in this thread Phil and Richard,

I've been crafting a library to make that sort of thing as simple as I can while providing reasonable performance and flexibility.

That sounds terrific; especially if there was step by step documentation on how to set up a custom property for a series of text fields per record; I'd pay for that...

Simon Knight · Post by **Simon Knight** » Mon Sep 27, 2010 8:49 am

Hi,

I hope it is o.k. to post to a thread that is over a year old - well here goes

I am creating an application that uses a data description file which is in TSV format (I agree with Richard CSV is almost hopeless!). The idea is that the app will read in the data in the file and store its content in a custom property which will persist between different sessions. Once loaded I only expect to have to update the data description to correct errors.

At present my custom property is storing the file as a direct copy of the TSV file because I think I read somewhere that it was not possible to use an array as a custom property. When my application runs it processes the information stored in the custom properties into arrays which are then used by the rest of the application.

It seems that I may have been making things harder than necessary, as the snip of code and the description from above imply that custom properties can be treated/used like arrays:

Code: Select all

set the uData["employee"]["startDate"] to "4/4/2004"

Am I correct?

bn · Post by bn » Mon Sep 27, 2010 10:06 am

Hi Simon,
good to see you around.

set the uData["employee"]["startDate"] to "4/4/2004"

is that from a working script or is it an example script? I can not get the syntax to work. For a custom property it lacks the object descriptor. E.g. set the uData[x][x] of object to xxx.

From all I know, and what I do, is to store arrays in custom properties and, when working with them, take the array out of the custom property and put it into a local or script-local variable. Then I work with the array from there and restore the custom property once I am finished. That works very well.

As far as I know you can not work directly into a custom property.
From the user guide p 234

Changing a part of a property
Like built-in properties, custom properties are not containers, so you cannot use a chunk expression to change a part of the custom property. Instead, you put the property's value into a variable and change the variable, then set the custom property back to the new variable contents:
put the lastCall of this card into myVar
put "March" into word 3 of myVar
set the lastCall of this card to myVar

So basically you could import your data as tab delimited, turn that into an array which you store in a custom property. When accessing the array in a script retrieve the array from the custom property and put it into a variable. etc.
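The retrieve, modify, restore cycle described above might be sketched like this. The property name uEmployees and the handler name are hypothetical, and the sketch assumes a version of Rev that supports multi-dimensional arrays.

```livecode
on updateStartDate
   -- Copy the array out of the custom property into a variable:
   put the uEmployees of this stack into tEmployees
   -- Change one element of the array in the variable:
   put "4/4/2004" into tEmployees["employee"]["startDate"]
   -- Write the whole array back to the custom property:
   set the uEmployees of this stack to tEmployees
end updateStartDate
```

Note that the object reference ("of this stack") is required both when reading and when setting the property.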
regards
Bernd

Simon Knight · Post by **Simon Knight** » Mon Sep 27, 2010 11:45 am

Hi Bernd,

It's good to hear from you. I've been busy during the summer doing outdoor activities, so I've not had much time for programming. The snag is I've forgotten some of the things I learnt at the beginning of the year.

The code snip is a direct copy from a post by Richard from FourthWorld. I've quoted a little more below.

If your data is well suited as a flat table, you could just dump a tab- and return-delimited chunk into a field with the line for uData above.*

But if your data is hierarchical by nature, you can use syntax like:
CODE: SELECT ALL
set the uData["employee"]["startDate"] to "4/4/2004"

Efficient use of array syntax relative to "repeat for each line" in a return-delimited list is a deep topic, more than we want to get into here. In many cases array notation provides faster access to specific elements, but not in all cases, and the simplicity of tab-delimited data makes maintenance a breeze.

This was a surprise to me as I believe that a custom property may not be an array.

Your statement

From all I know, and what I do, is to store arrays in custom properties and, when working with them, take the array out of the custom property and put it into a local or script-local variable. Then I work with the array from there and restore the custom property once I am finished. That works very well.

interests me; do you mean that you are able to use syntax like:

Code: Select all

set the  uCustomProperty of myStack to sMyDataArray

to store an array in a custom property?
At present I am doing the following:

Code: Select all

 --populate the array that will be used to store the data
   set the itemDelimiter to tab
   put empty into s138Fields  -- set the local array variable to empty
   -- s138Fields is an array variable
   put the c138Fields of this stack into tFieldData  --copy the property var into tFieldData
   repeat for each line tLine in tFieldData  --Loop through tFieldData and copy the first item into local s138Fields
      put tline into s138Fields[item 1 of tline] --stores the data line indexed by field index e.g. 138-121
   end repeat

which is a bit harder, note I used c to denote a custom variable in the code above. I've just realised that I can probably answer the question with a little experimentation.....

best wishes
Simon

Klaus · Post by **Klaus** » Mon Sep 27, 2010 11:52 am

Simon Knight wrote:...
This was a surprise to me as I believe that a custom property may not be an array.

With the introduction of multi-dimensional arrays, Rev also enabled custom properties to store them.
And I like it

Best

Klaus

bn · Post by bn » Mon Sep 27, 2010 12:16 pm

Hi Simon,

set the uCustomProperty of myStack to sMyDataArray

Yes, that is what I do.
As Klaus pointed out, you can also store multidimensional arrays in a custom property, and you can send arrays as a parameter, which also was not possible before. So now arrays behave almost like any other variable. That makes it a lot easier since you don't have to remember special cases.

The only difference with regards to custom properties that I know of is that you can not read out elements of an array that is in a custom property directly (hence the need to put the custom property into a variable) but suppose your custom property cData contains "1,4,8" you could say

Code: Select all

put item 2 of the cData of this card into field "x"

I think it is best to just take the custom properties into a variable, work with that variable and set the custom property to that variable if you want to reflect the changes of the variable in the custom property.

As to what Richard wrote

set the uData["employee"]["startDate"] to "4/4/2004"

I don't get this to work and I guess Richard gave an example for a multi-dimensional array with the syntax slightly off.

regards

Bernd

Simon Knight · Post by **Simon Knight** » Mon Sep 27, 2010 2:43 pm

Thanks Bernd,

I have been having a play and think I have it working o.k. I have managed to select a number of items in an array based on one of the two keys (using intersect) and am now trying to extract the data elements as a list to be displayed in a list control to allow the user to make a selection.

All good clean fun!

Simon

FourthWorld · Post by **FourthWorld** » Mon Sep 27, 2010 2:53 pm

bn wrote:As to what Richard wrote
set the uData["employee"]["startDate"] to "4/4/2004"
I dont get this to work and I guess Richard gave an example for a multi-dimensional array with the syntax slightly off.

Can I use the "I was pre-coffee" excuse?

I'm not sure what I was thinking at the time, but the correct syntax for storing that two-dimensional value would be:

Code: Select all

set the employee["startdate"] of me to "4/4/04"

That gives you a property set named "employee", with an element named "startdate" containing the value "4/4/04".

Because Rev objects can have multiple property sets, you can store two-dimensional arrays easily, and with much better performance than any other means of persisting array data, such as using arrayEncode to store to a file or storing the array as a property element.

However, as noted by the posters above, if your data has more than two dimensions you have little choice but to use one of the slower methods for storage.
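As a small sketch of reading such a property set back, the following stores one element, retrieves it, and then fetches the whole set as an array through the customProperties property. The handler name is hypothetical, and it assumes the script runs in the object that owns the property set.

```livecode
on demoPropertySet
   -- Store a value in the "employee" property set of this object:
   set the employee["startDate"] of me to "4/4/04"
   -- Read the single element back:
   put the employee["startDate"] of me into tStartDate
   -- Read the whole set as an array, then list its element names:
   put the customProperties["employee"] of me into tEmployee
   put the keys of tEmployee into tElementNames
   answer "startDate =" && tStartDate & cr & "Elements:" && tElementNames
end demoPropertySet
```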

LiveCode Forums.
