Posted: Thu Apr 09, 2009 12:48 am
by sturgis
Besides the speed issue, it just seems easier to me to eat a whole file at once, then do with it as you see fit. That's what I used to do with Perl, and it worked great.

I have to say I'm loving Rev. With languages I've used in the past there were a few small "AH HAH!" moments and a bunch of "I'm totally lost again" moments. With Rev, the reverse has been true so far: lots of "AH HAH!" and not much time spent totally lost.

Thx again for the file reading tests and education. It'll be useful to know as I progress.
massung wrote:
sturgis wrote: @massung, ... In the case of the script Chris is working on, and since, as you stated, Rev is serial, cumulative would be the answer here, right?
Correct.

Just for fun, a follow-up: I ran one more test on my 32 MB file, doing the caching myself, to compare against what Rev should be doing under the hood (but isn't). I read 4K at a time, parsed each line from that 4K, then read 4K more, rinse, repeat...

Total time = 0.72 seconds

As you can see, using Rev to read a (large) file line-by-line is just horrendously slow. You'll be much better off just doing it yourself.

That's very sad. :(

Jeff M.

Posted: Thu Apr 09, 2009 12:50 am
by sturgis
Grats Chris, glad it worked.
chris9610 wrote: OK, the jig is up. sturgis was correct!

I added 2 lines of code.

Code: Select all

revexecuteSQL gConID, "begin;"
before the repeat
and

Code: Select all

revexecuteSQL gConID, "commit;"
after end repeat

This solved the problem and execution time went from 25+ minutes to 12 seconds. No other changes.

This is fast enough for me.
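
For anyone landing here later, a minimal sketch of what that change looks like around a whole import loop. Only gConID comes from Chris's post; the handler name, tData, and the "parts" table with its two columns are made up for illustration, and it assumes the database accepts BEGIN, COMMIT and ROLLBACK as plain statements. Per the Dictionary, revExecuteSQL leaves a row count in the result on success and an error string on failure, which is what the check relies on.

Code: Select all

global gConID

-- importLines, tData and the "parts" table are hypothetical
on importLines tData
  local tOK, tSQL
  put true into tOK
  -- one transaction for the whole batch instead of one per INSERT
  revExecuteSQL gConID, "BEGIN;"
  repeat for each line tLine in tData
    -- NOTE: values are spliced in unescaped only to keep the sketch short
    put "INSERT INTO parts (name, qty) VALUES ('" & item 1 of tLine & "', " & item 2 of tLine & ");" into tSQL
    revExecuteSQL gConID, tSQL
    -- a non-numeric result is an error string
    if the result is not a number then
      put false into tOK
      exit repeat
    end if
  end repeat
  if tOK then
    revExecuteSQL gConID, "COMMIT;"
  else
    revExecuteSQL gConID, "ROLLBACK;"
  end if
end importLines
The likely reason for the speedup is that, without an explicit transaction, each INSERT is committed on its own, so the database pays the commit cost once per row instead of once per batch.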

Posted: Thu Apr 09, 2009 7:28 pm
by massung
Just thought of one more test to run, and thought this might really help you out, Chris:

Code: Select all

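-- open read-only; a bare "open file" opens the file for update
open file tFile for read

-- read the entire file in one shot; the data lands in "it"
read from file tFile until eof

-- copy the data out of "it", which many commands overwrite
put it into tData

-- walk the lines in order, without "get line n"
repeat for each line tLine in tData
  -- TODO: your SQL stuff here
end repeat

close file tFile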
On my 35 MB file, this code executes in about 0.25 seconds. Evidently the repeat just creeps forward through the contents of the file, whereas using get line n from ... must start again from the beginning of the string each time.

Anyway, that should yield you very fast results, I'd think.

Jeff M.

Posted: Fri Apr 10, 2009 2:41 pm
by Bernard
Just wanted to say, Jeff, that "repeat for each" is the form that experienced Rev developers use. I remember being told about it years ago by Richard Gaskin.

I just went to check, and the Dictionary does say that "this form is much faster than the with countVariable = startValue to endValue form when looping through the chunks of a container", and gives an example that pretty much matches what you've been doing.

By and large, I have to say that I'm rarely disappointed by the speed of Rev.

Posted: Fri Apr 10, 2009 3:25 pm
by sturgis
Hey, would my own short code examples run faster with

Code: Select all

repeat 60000 times
and then incrementing my own counter, rather than using the repeat with i = 1 to 60000 form?
Or do they run about the same?

Posted: Sat Apr 11, 2009 11:47 am
by Bernard
do they run about the same?
For a plain loop counter I don't think your suggestion will make any difference. The 'repeat for each' form offers a huge increase in speed when one is processing chunks from a container. Without it, I believe Rev's engine has to scan the container from the start to find chunk n on every iteration, and it's this repeated seek that slows down a counter-based loop. On a small enough container (e.g. 10 lines) there's unlikely to be any perceptible difference between the alternative looping methods.
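
If you want to see the difference on your own data, a rough timing sketch along these lines should do it. The handler name compareLoops and the tData parameter are just names picked for the example, and the numbers you get will depend on the size of the container and on your machine.

Code: Select all

-- compareLoops and tData are hypothetical names
on compareLoops tData
  local tStart, tCount, tReport

  -- counter form: every "line i of tData" is looked up by position
  put the number of lines of tData into tCount
  put the milliseconds into tStart
  repeat with i = 1 to tCount
    get line i of tData
  end repeat
  put "repeat with:" && (the milliseconds - tStart) && "ms" & return into tReport

  -- "for each" form: the lines are handed over one after another
  put the milliseconds into tStart
  repeat for each line tLine in tData
    get tLine
  end repeat
  put "repeat for each:" && (the milliseconds - tStart) && "ms" after tReport

  answer tReport
end compareLoops
Try it with a few hundred lines first and then with a few tens of thousands; the gap should widen as the container grows.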

Where you might see a speed increase is with an alternative form of revExecuteSQL, where you use bind variables and an array holding the appropriate values. There's information in the Dictionary, and it's also covered in this Newsletter about the 2.9 release: http://www.runrev.com/newsletter/februa ... etter1.php
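
As a rough sketch of that array form, assuming a connection id in the global gConID as used earlier in the thread: the handler name addPart and the "parts" table with its columns are invented for the example. The placeholders :1 and :2 are filled from elements 1 and 2 of the array, and the array is passed by name, as a quoted string, not by value.

Code: Select all

global gConID

-- addPart and the "parts" table are hypothetical
on addPart pName, pQty
  local tValues
  put pName into tValues[1]
  put pQty into tValues[2]
  -- :1 and :2 are replaced by tValues[1] and tValues[2];
  -- note that "tValues" is the variable's name in quotes
  revExecuteSQL gConID, "INSERT INTO parts (name, qty) VALUES (:1, :2)", "tValues"
  -- row count on success, error string on failure
  return the result
end addPart
A side benefit of this form is that the values are not spliced into the SQL string, so quotes inside the data don't need escaping by hand.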