Detecting and extracting a section of text and calculating
Hello everyone,
I am new to RevMedia and creating my first program. I am finding it difficult to find the appropriate commands to search a line of a text document for a predefined piece of text, read a set number of characters after it, and then do a calculation. Take the following line:
<Sample21131>IX=21131 TC=147 IN=1 TT=L16:03:12:00 TB=00000000 RF=2 RW=1 RA=1 HW=0 IS=PAL OS=625 LA=114 LP=229 DN=0 DL=0 MH=0 DW=1 DA=1 CP=1 UP=153 UM=79 UA=121 VP=164 VM=104 VA=127 NR=1 MD=1 A1=1282 A2=1296 P1=2981 P2=3016 S1=0 S2=0 A3=0 A4=0 P3=0 P4=0 S3=1 S4=1 SV=1 CC=0</Sample21131>
This is what I would like to do within RevMedia:
After scanning the document for <Sample####> (the sample number is incremental within the file; in this case it's 21131), I need to find the text TT=L within the line and extract the number following it (16:03:12:00), then take out the : between the numbers and perform a calculation on the number.
I am not having much success trying to learn how to do this, so hopefully someone can help me out.
Kind regards,
Scott
Re: Detecting and extracting a section of text and calculating
Hi Scott,
Welcome!
I'm a poor Frenchy and I'm very happy when I can answer before the English staff.
If I understand what you want, you can try this stack.
The script of btn "searchMyTime":
Code:
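on mouseUp
   put fld "fMydata" into bufT
   put "TT=L" into bufS
   put the num of chars of bufS into nbc
   -- offset returns 0 when the marker is not found
   put offset(bufS,bufT) into UnOff
   if UnOff = 0 then
      put empty into fld "fMyTime"
      exit mouseUp
   end if
   -- position of the first char after "TT=L"
   put UnOff+nbc into depPosTime
   -- start with an empty range in case no valid char follows
   put depPosTime-1 into EndPosTime
   put "0123456789:" into TheCharOK
   -- walk forward while the chars are digits or colons
   repeat with i = depPosTime to depPosTime+100
      put char i of bufT into UnC
      if UnC is not empty and UnC is in TheCharOK then
         put i into EndPosTime
      else
         exit repeat
      end if
   end repeat
   put char depPosTime to EndPosTime of bufT into MyTime
   put MyTime into fld "fMyTime"
end mouseUp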
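Once the timecode has been extracted, the calculation step asked about in the first post could be sketched roughly as follows. This is only a sketch and not part of the attached stack: the function name timecodeToFrames and its pFPS parameter are made up for illustration, and the default of 25 frames per second is an assumption based on the IS=PAL value in the sample line.

```livecode
-- Hypothetical helper, not part of the attached stack.
-- Splits "HH:MM:SS:FF" on the colons and returns the total number of frames.
-- pFPS is an assumed frame rate; it defaults to 25 (PAL).
function timecodeToFrames pTimecode, pFPS
   local tHours, tMinutes, tSeconds, tFrames
   if pFPS is empty then put 25 into pFPS
   set the itemDelimiter to ":"
   -- anything that is not four colon-separated parts is rejected
   if the number of items of pTimecode is not 4 then return empty
   put item 1 of pTimecode into tHours
   put item 2 of pTimecode into tMinutes
   put item 3 of pTimecode into tSeconds
   put item 4 of pTimecode into tFrames
   return ((tHours * 60 + tMinutes) * 60 + tSeconds) * pFPS + tFrames
end timecodeToFrames
```

Called at the end of the button script as timecodeToFrames(MyTime), this turns 16:03:12:00 into the single number 1444800, which can then be compared with or subtracted from other timecodes.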
Jean-Marc
Attachment: searchmytime.rev.zip (1.57 KiB)
https://alternatic.ch
Re: Detecting and extracting a section of text and calculating
Hi Scott,
Welcome to the forum.
I made a little stack with a lot of comments that should give you some ideas.
Look up anything you don't know in the dictionary, or ask here.
The stack has the .livecode suffix. If you change that to .rev, or open the file from within RevMedia, it will work.
Best regards
Bernd
Edit: I just saw that Jean-Marc has posted a script. Two is better than one.
Attachment: extractingFromLinesScottM.livecode.zip (2.29 KiB)
Re: Detecting and extracting a section of text and calculating
Thank you Jean-Marc and Bernd, your help is much appreciated.
I used various forms of BASIC about ten years ago and have only just started to get back into developing software. I need to get my head around the scripting language and syntax of RevMedia. The manual helps, but I find learning from others is a lot quicker, so you may find me posting frequently as I continue to learn.
Kind Regards,
Scott
Re: Detecting and extracting a section of text and calculating
Hi Bernd and Jean-Marc,
I have another problem. The files I need to extract the information from average 90 MB in size and are XML with an average of 28000 lines of text. Bernd, your example is great, but when I copy and paste a complete XML file into the 'origData' field, RevMedia hangs when trying to extract the values.
I have managed to let the user load the whole file into a field, but what I would like to do is read one line of text from the file, extract the TT value, and repeat this until the end of the file is reached, eliminating the need to load all the information into memory. Is this possible?
Kind regards,
Scott
Re: Detecting and extracting a section of text and calculating
Marc,
Could you post a couple of lines representative of the data?
You can access the file line by line, but this is slow. Even the repeat structure I used in the example is slow. There is another form, 'repeat for each line aLine in tData', that is a lot faster. And I guess you could load 90 MB into RAM.
Once we figure out why the script hangs I will post a faster handler; the one I posted is just easier to understand. You would also have to indicate in what form you want the extracted data. Do you want a list of it?
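The two ways of walking the data mentioned above might look roughly like this. Both functions are only sketches with made-up names, and both assume that every timecode is exactly 11 characters long (HH:MM:SS:FF) and follows directly after TT=L, as in the sample line from the first post.

```livecode
-- Sketch 1: load the whole file into memory, then use "repeat for each".
function timecodesFromMemory pPath
   local tData, tLine, tPos, tList
   put URL ("file:" & pPath) into tData
   repeat for each line tLine in tData
      put offset("TT=L", tLine) into tPos
      if tPos > 0 then
         -- the 11 chars after "TT=L" are taken as the timecode
         put (char (tPos + 4) to (tPos + 14) of tLine) & return after tList
      end if
   end repeat
   delete the last char of tList -- drop the trailing return
   return tList
end timecodesFromMemory

-- Sketch 2: read one line at a time, so the file is never held in memory as a whole.
function timecodesFromFile pPath
   local tLine, tStatus, tPos, tList
   open file pPath for read
   if the result is not empty then return empty -- file could not be opened
   repeat forever
      read from file pPath until return
      put the result into tStatus -- "eof" once the end of the file is reached
      put it into tLine
      put offset("TT=L", tLine) into tPos
      if tPos > 0 then
         put (char (tPos + 4) to (tPos + 14) of tLine) & return after tList
      end if
      if tStatus is "eof" then exit repeat
   end repeat
   close file pPath
   delete the last char of tList -- drop the trailing return
   return tList
end timecodesFromFile
```

Both return one timecode per line. The second keeps memory use low at the cost of one read per line, which is the trade-off described above.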
regards
Bernd
Re: Detecting and extracting a section of text and calculating
Hi,
Sorry for the late reply. The XML is large, so here is a link to download the zipped file from YouSendIt. The link will be valid for 1 week.
https://www.yousendit.com/download/dkly ... VG52Wmc9PQ
As for the data, I would like to check the time code of each sample in the file and see whether it is sequential. If it is not, that would be considered a break in time. I would also like to count any duplicate time codes. The result would be the number of time code breaks and duplicates, if any, in the XML, plus a generated text document listing the break and duplicate points.
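The check described above could be sketched along these lines. The function name countTimecodeIssues, its parameters, and the input format (one HH:MM:SS:FF timecode per line, in file order) are all assumptions made for illustration, as is the default of 25 frames per second. A wrap past midnight is not handled and would be counted as a break.

```livecode
-- Hypothetical sketch: counts breaks and duplicates in a list of timecodes.
-- pTimecodes holds one "HH:MM:SS:FF" per line; pFPS defaults to 25 (assumed).
-- Returns "breaks,duplicates".
function countTimecodeIssues pTimecodes, pFPS
   local tTC, tFrames, tPrevious, tBreaks, tDuplicates
   if pFPS is empty then put 25 into pFPS
   put 0 into tBreaks
   put 0 into tDuplicates
   set the itemDelimiter to ":"
   repeat for each line tTC in pTimecodes
      -- turn HH:MM:SS:FF into one running frame number
      put (item 1 of tTC) * 3600 + (item 2 of tTC) * 60 + (item 3 of tTC) into tFrames
      put tFrames * pFPS + (item 4 of tTC) into tFrames
      if tPrevious is not empty then
         if tFrames = tPrevious then
            -- same timecode as the sample before it
            add 1 to tDuplicates
         else if tFrames <> tPrevious + 1 then
            -- neither a repeat nor exactly one frame later
            add 1 to tBreaks
         end if
      end if
      put tFrames into tPrevious
   end repeat
   return tBreaks & comma & tDuplicates
end countTimecodeIssues
```

Comparing running frame numbers rather than the text itself means the rollovers from frames to seconds, minutes and hours need no special handling.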
I am currently teaching myself about chunks in RevMedia. Is this the right path to take to do the above?
Kind regards,
Scott
Re: Detecting and extracting a section of text and calculating
Scott,
I downloaded your data and made a small stack to extract the occurrences of multiple samples with the same timecode. Have a look and see the comments in the script. It works on your test data but does little error checking, so if your data does not always have exactly the same structure you might have to adapt it or add error checks.
As for the increment in the timecode: there is some inconsistency in your data I don't understand:
<Sample8829>IX=8829 TC=133 IN=1 TT=L15:54:59:23 TB=00000000
<Sample8830>IX=8830 TC=134 IN=1 TT=L15:54:59:24 TB=00000000
<Sample8831>IX=8831 TC=135 IN=1 TT=L15:54:00:00 TB=00000000
<Sample8832>IX=8832 TC=136 IN=1 TT=L15:55:00:01 TB=00000000
<Sample8833>IX=8833 TC=137 IN=1 TT=L15:55:00:02 TB=00000000
<Sample8834>IX=8834 TC=138 IN=1 TT=L15:55:00:03 TB=00000000
If you look at the data you will notice that the rollover from seconds to minutes is not consistent: the first entry for the new minute (15:55) keeps the old minute and resets the rest to zero (15:54:00:00) instead of 15:55:00:00.
I am not sure if that is the way you want your data or if this is an error in the time stamping. This happens for all the increments in minutes and also affects the increment in hours:
<Sample16329>IX=16329 TC=209 IN=1 TT=L15:59:59:23 TB=00000000
<Sample16330>IX=16330 TC=210 IN=1 TT=L15:59:59:24 TB=00000000
<Sample16331>IX=16331 TC=211 IN=1 TT=L15:59:00:00 TB=00000000
<Sample16332>IX=16332 TC=212 IN=1 TT=L16:00:00:01 TB=00000000
<Sample16333>IX=16333 TC=213 IN=1 TT=L16:00:00:02 TB=00000000
That is one of the reasons why I did not attempt to look for the breaks: I am not sure about this. It would be flagged as an error, but I don't know if it is supposed to be this way. What happens if you have more than 24 hours?
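One way to pin down what "sequential" means for this data is to compute the timecode that should follow a given one and compare it with what the file actually contains. The following is only a sketch under stated assumptions: the name nextTimecode is made up, 25 frames per second is inferred from the frame values running up to 24 in the samples, and wrapping the hour back to 00 after 23 is a guess, since the data shown here does not cover that case.

```livecode
-- Hypothetical helper: returns the timecode one frame after pTimecode.
-- pFPS defaults to 25 (assumed); the hour wraps to 00 after 23 (assumed).
function nextTimecode pTimecode, pFPS
   local tH, tM, tS, tF
   if pFPS is empty then put 25 into pFPS
   set the itemDelimiter to ":"
   put item 1 of pTimecode into tH
   put item 2 of pTimecode into tM
   put item 3 of pTimecode into tS
   put (item 4 of pTimecode) + 1 into tF
   -- carry frames into seconds, seconds into minutes, minutes into hours
   if tF >= pFPS then
      put 0 into tF
      add 1 to tS
   end if
   if tS >= 60 then
      put 0 into tS
      add 1 to tM
   end if
   if tM >= 60 then
      put 0 into tM
      add 1 to tH
   end if
   if tH >= 24 then put 0 into tH
   return format("%02d:%02d:%02d:%02d", tH, tM, tS, tF)
end nextTimecode
```

With this, nextTimecode("15:54:59:24") gives 15:55:00:00, whereas Sample8831 carries 15:54:00:00, so a plain comparison would report that sample as a break.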
regards
Bernd
Attachment: scanTimeCode ScottM.livecode.zip (5.92 KiB)
Re: Detecting and extracting a section of text and calculating
Wow Bernd,
Thank you, this is very much appreciated. I am amazed how quickly it loaded and produced the list. I have written a similar program in thinBasic (I wrote it in 1 day) that checks for duplicates and time code errors. It is very, very slow. It would have taken me a very long time to work this out and do it in RevMedia, though I am learning. I have thinBasic create a text file with the results structured as follows.
271591 = 14:18:30:00
272091 = 14:18:00:00
272268 = 14:19:07:01 (Duplicate TC)
272269 = 14:19:07:03
273591 = 14:19:50:00
274091 = 14:20:10:00
274309 = 14:20:28:17 (Duplicate TC)
274310 = 14:20:28:19
274841 = 14:20:40:00
275091 = 14:20:50:00
275341 = 14:21:00:00
276591 = 14:21:00:00
277027 = 14:22:17:10 (Duplicate TC)
277028 = 14:22:17:12
277824 = 14:22:49:07 (Duplicate TC)
277825 = 14:22:49:09
277841 = 14:22:40:00
278091 = 14:22:50:00
278341 = 14:23:00:00
278841 = 14:23:20:00
279591 = 14:23:50:00
280091 = 14:24:10:00
280261 = 14:24:26:19 (Duplicate TC)
280262 = 14:24:26:21
280341 = 14:24:20:00
280479 = 14:24:35:12 (Duplicate TC)
280480 = 14:24:35:14
280868 = 14:24:51:01 (Duplicate TC)
280869 = 14:24:51:03
281032 = 14:24:57:15 (Duplicate TC)
281033 = 14:24:57:17
281091 = 14:24:00:00
281158 = 14:25:02:16 (Duplicate TC)
281159 = 14:25:02:16 (Duplicate TC)
281160 = 14:25:02:16 (Duplicate TC)
281161 = 14:25:02:16 (Duplicate TC)
281162 = 14:25:02:15
281163 = 14:25:02:15 (Duplicate TC)
281164 = 14:25:02:15 (Duplicate TC)
281165 = 14:25:02:15 (Duplicate TC)
281166 = 14:25:02:15 (Duplicate TC)
281167 = 14:25:02:15 (Duplicate TC)
TC Break Count = 739
Duplicate TC Count = 289
Total TC Issues = 1028
The first number is the sample number taken from the XML. The second is the timecode. The list contains the time code breaks (any time code that is not sequential, like the ones you are seeing) and the duplicates. The end of the text file is then tagged with the number of time code breaks, the number of duplicates, and the total number of time code issues.
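A report in the layout shown above could be assembled along these lines. This is a sketch only: the handler names reportLine and writeReport and all of their parameters are invented for illustration, and the caller is assumed to have already collected the issue lines and the two counts.

```livecode
-- Hypothetical helper: builds one line such as "272268 = 14:19:07:01 (Duplicate TC)".
function reportLine pSample, pTimecode, pIsDuplicate
   local tLine
   put pSample & " = " & pTimecode into tLine
   if pIsDuplicate is true then put " (Duplicate TC)" after tLine
   return tLine
end reportLine

-- Hypothetical helper: appends the three summary lines and writes the text file.
-- pLines holds the issue lines, one per line; pPath is the output file path.
command writeReport pPath, pLines, pBreaks, pDuplicates
   local tReport
   put pLines into tReport
   put return & "TC Break Count = " & pBreaks after tReport
   put return & "Duplicate TC Count = " & pDuplicates after tReport
   put return & "Total TC Issues = " & (pBreaks + pDuplicates) after tReport
   put tReport into URL ("file:" & pPath)
   return the result -- empty when the file was written without error
end writeReport
```

The total is simply the sum of the two counts, which matches the figures in the listing (739 + 289 = 1028).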
I am grateful for your support, Bernd, and I am learning very fast from your efforts. I was really amazed how quickly RevMedia performed the task. I will be purchasing the complete version of LiveCode by the end of the week.
Kind Regards,
Scott
Re: Detecting and extracting a section of text and calculating
Scott,
I added the calculation of the time code. The format is not what you want but it puts everything into the field.
No guarantees though. I did test the code against a small sample with known problems, but it is up to you to make sure it is working correctly.
It is also up to you to adapt the format to your liking.
But with the comments and some reading up on LiveCode you should fairly soon be able to do that.
Good luck, and don't hesitate to ask when you get stuck.
regards
Bernd
Edit: Oh, don't be disappointed, now it takes 2.5 seconds to take your 90 MB, 300,000-plus lines apart.
Attachment: scanTimeCode ScottM-2.livecode.zip (12.25 KiB)
Re: Detecting and extracting a section of text and calculating
Lol Bernd, 2.5 seconds, so sloooow
Thank you for your help, I have learnt a lot in a short time.
Kind Regards,
Scott
