Regex not working
Posted: Thu Dec 17, 2020 5:26 pm
by micro04
I cannot get the regex expression to work correctly.
I am trying to find lines matching the pattern X12345Y12345 in a very long list. The pattern [X] works, but I want to make sure I match
X followed by a 5-digit number, then Y followed by a 5-digit number. I found a regex test web site, but the LiveCode version of regex seems different.
Code: Select all
filter lines of field "fld_text" matching "[X]\d{5}" into field "fld_results"
Re: Regex not working
Posted: Thu Dec 17, 2020 6:39 pm
by dunbarx
Hi.
I am no regex guy at all, but the old-fashioned way would be:
Code: Select all
on mouseUp
   repeat with y = 1 to 10000
      put "X" & random(88888) + 10000 & "Y" & random(88888) + 10000 into line y of temp
   end repeat
   put "X12345Y12345" into line random(10000) of temp
   ---actual routine starts here
   put 1 into tIndex
   repeat for each line tLine in temp
      if tLine = "X12345Y12345" then
         exit repeat
      end if
      add 1 to tIndex
   end repeat
   answer tIndex
end mouseUp
Not sure how long your list is. The list above is 10,000 lines; only a second or two was needed to find the line of interest.
The entire first section is just to create a list with one valid entry somewhere in it. You can simply substitute (pseudo):
for that section.
Craig
Re: Regex not working
Posted: Thu Dec 17, 2020 11:08 pm
by micro04
Thanks for replying.
I do not think this would work for what I am trying to do.
I need to find lines of data that match this general pattern. I am not trying to find X12345Y12345 specifically. I need to find any line that contains only
X{5 digit number}Y{5 digit number}.
examples:
X89888Y45678
X45898Y34833
I need regex to find all lines that match the pattern. The field I am reading also contains other information that I do not want in the final list.
Re: Regex not working
Posted: Thu Dec 17, 2020 11:47 pm
by dunbarx
Hi.
Piece of cake. Before all this fancy regex stuff, we actually worked for a living.
This handler looks for a properly formatted string.
Code: Select all
on mouseUp
   --insert your data into temp
   put 1 into tIndex
   repeat for each line tLine in temp
      if char 1 of tLine = "X" and char 2 to 6 of tLine is a number and char 7 of tLine = "Y" and char 8 to 12 of tLine is a number \
            and the length of tLine = 12 then
         exit repeat
      end if
      add 1 to tIndex
   end repeat
   answer tIndex
end mouseUp
Craig
Re: Regex not working
Posted: Fri Dec 18, 2020 12:05 am
by dunbarx
Here is a test handler that you can step through to see how it worked in the old days. To be fair, the "filter" command and regex itself are much more modern and faster.
Code: Select all
on mouseUp
   put "X123Y123" into line 1 of temp
   put "X1234567Y123" into line 2 of temp
   put "X12345Y12345" into line 3 of temp
   put "X123Y1234567" into line 4 of temp
   put 1 into tIndex
   breakpoint
   repeat for each line tLine in temp
      if char 1 of tLine = "X" and char 2 to 6 of tLine is a number and char 7 of tLine = "Y" and \
            char 8 to 12 of tLine is a number and the length of tLine = 12 then exit repeat
      add 1 to tIndex
   end repeat
   answer tIndex
end mouseUp
Craig
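Both handlers above stop at the first matching line and report its index. To collect every matching line instead, as micro04's earlier post asks, a minimal sketch might look like the one below. The field names are borrowed from the first post, and `isFiveDigits` is a hypothetical helper, not a built-in. It is there because `is a number` also accepts values such as `12.45` or `-1234`. The `caseSensitive` property is set because LiveCode string comparison ignores case by default.

```livecode
on mouseUp
   -- Assumption: source data is in field "fld_text", results go to field "fld_results"
   set the caseSensitive to true -- otherwise "x" = "X" is true
   put field "fld_text" into tData
   put empty into tMatches
   repeat for each line tLine in tData
      if the length of tLine = 12 and char 1 of tLine = "X" and char 7 of tLine = "Y" \
            and isFiveDigits(char 2 to 6 of tLine) and isFiveDigits(char 8 to 12 of tLine) then
         put tLine & cr after tMatches
      end if
   end repeat
   delete the last char of tMatches -- drop the trailing return
   put tMatches into field "fld_results"
end mouseUp

-- Hypothetical helper: true only if pText is exactly five characters, all 0-9
function isFiveDigits pText
   if the length of pText is not 5 then return false
   repeat for each char tChar in pText
      if tChar is not in "0123456789" then return false
   end repeat
   return true
end isFiveDigits
```

Building the result in a variable and writing to the field once at the end avoids touching the field on every iteration.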
Re: Regex not working
Posted: Fri Dec 18, 2020 12:41 am
by micro04
That should help.
Still want to find out why the regex was not working. Livecode regex must be different from other programming languages.
Re: Regex not working
Posted: Fri Dec 18, 2020 1:23 am
by dunbarx
A guy named "Thierry" might just chime in soon.
Craig
Re: Regex not working
Posted: Fri Dec 18, 2020 4:41 am
by hpsh
seems this works for me
Code: Select all
-- Sent when the mouse is released after clicking
-- pMouseButton specifies which mouse button was pressed
on mouseUp pMouseButton
   put empty into field "result"
   repeat for each line tLine in field "input"
      if matchtext(tLine, "[X]\d{5}") then
         put tLine & cr after field "result"
      end if
   end repeat
end mouseUp
using 2 scrolling fields named input and result, hope it helps but this is written at 4 am
edited because I still don't get the difference between am and pm at age 52 LOL
Re: Regex not working
Posted: Fri Dec 18, 2020 5:11 am
by hpsh
darn it, had to check this filter thingy
Code: Select all
-- Sent when the mouse is released after clicking
-- pMouseButton specifies which mouse button was pressed
on mouseUp pMouseButton
   local mText
   put field "input" into mText
   filter lines of mText matching regex "[X]\d{5}"
   put mText into field "result"
end mouseUp
happy coding folks

Re: Regex not working
Posted: Fri Dec 18, 2020 11:27 am
by micro04
Thanks,
Got it working. The word regex must be in the command line.
Code: Select all
filter lines of field "fld_text" matching regex "[X]\d{5}[Y]\d{5}" into field "fld_results"
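Without the `regex` keyword, `filter` treats the pattern as a wildcard pattern, which is why the original line did not behave like the regex test site. One further point: the pattern above is not anchored, so as far as I can tell a line such as `abcX12345Y123456` would also pass, because the pattern is found somewhere inside it. If the goal is lines that contain only the pattern, a sketch along these lines may be closer. It assumes the same field names and that the lines have no trailing spaces.

```livecode
on mouseUp
   -- Work on a variable rather than on the field directly
   put field "fld_text" into tData
   -- ^ and $ anchor the pattern to the start and end of each line,
   -- so lines with extra text before or after the pattern are dropped
   filter lines of tData matching regex "^X\d{5}Y\d{5}$" into tMatches
   put tMatches into field "fld_results"
end mouseUp
```

The square brackets around a single letter, as in `[X]`, are optional; `X` on its own matches the same character.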
Re: Regex not working
Posted: Fri Dec 18, 2020 2:47 pm
by dunbarx
How long does the regex solution take to find a line among 10,000?
Craig
Re: Regex not working
Posted: Fri Dec 18, 2020 6:36 pm
by FourthWorld
dunbarx wrote: ↑Fri Dec 18, 2020 2:47 pm
How long does the regex solution take to find a line among 10,000?
Comparative benchmarks with real-world uses would be very interesting.
Many years ago I ran one which favored looping chunk expressions, though I don't imagine that would be any sort of universal rule.
With enough comparisons we may be able to discern patterns that can guide us to the most efficient option for a given type of task.
Re: Regex not working
Posted: Fri Dec 18, 2020 7:04 pm
by jacque
Mark Waddingham once told me there's no specific answer. Execution time depends on the content of the data, length of the data, and structure of the regex. If timing is important, you'd need to test both methods per each example.
Re: Regex not working
Posted: Fri Dec 18, 2020 7:38 pm
by FourthWorld
jacque wrote: ↑Fri Dec 18, 2020 7:04 pm
Mark Waddingham once told me there's no specific answer. Execution time depends on the content of the data, length of the data, and structure of the regex. If timing is important, you'd need to test both methods per each example.
Exactly. It's not the algo, but the application of the algo to a certain type of problem. But they aren't random, they follow patterns.
I've done enough comparative benchmarking between arrays and chunks to have a fairly useful sense of when to use each. With enough benchmarking of regex vs chunks, similarly useful guidance may emerge.
Re: Regex not working
Posted: Fri Dec 18, 2020 8:26 pm
by hpsh
for me, 10000 lines of random text, with some hits and misses, takes 5 ticks with the filter variation, and something like 160 with the repeat for each loop
but if the loop puts the hits into a string first, and that string is put into the result field afterwards, the timing is pretty much the same
so it seems pretty fast to me, but yeah, the more lines, the slower it will go
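For anyone who wants to repeat the comparison on their own data, here is a rough timing sketch. It assumes the same two fields named input and result, uses `the milliseconds` rather than ticks, and builds the loop result in a variable so that field updates are not part of the measurement. The numbers will depend on the data and the pattern, as jacque noted above.

```livecode
on mouseUp
   put field "input" into tData
   
   -- Time the filter command
   put the milliseconds into tStart
   filter lines of tData matching regex "[X]\d{5}" into tFiltered
   put the milliseconds - tStart into tFilterTime
   
   -- Time the loop, collecting hits in a variable
   put the milliseconds into tStart
   put empty into tLooped
   repeat for each line tLine in tData
      if matchText(tLine, "[X]\d{5}") then put tLine & cr after tLooped
   end repeat
   put the milliseconds - tStart into tLoopTime
   
   put tFiltered into field "result"
   answer "filter:" && tFilterTime && "ms, loop:" && tLoopTime && "ms"
end mouseUp
```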