I have been working on an application that uses regular expressions to extract email addresses from my customer's website.
on mouseUp
   put text of field "myHTML" into myHTMLvar
   put matchText(myHTMLvar,"([\w]+@[\w]{2,}+\.[\w]{2,}(?R)?)",myEMAIL) into myresult
   answer myresult
   answer myEMAIL
end mouseUp
I get the first email address just fine, but I do not understand how to get PCRE to do recursion. PCRE does not accept the /g modifier, and after several days of searching for an alternative (like a combination of (?m), ^ and $), I am at a loss for how to solve this problem via PCRE regex. I know it is possible, but I am just not getting it.
All the PHP examples online use a function called preg_match_all, but that does not appear to be valid in LiveCode.
I assume I may have to use a chunk variant to find where the first match hits and then start the next match from that point, thus writing a recursive function myself, but I am hoping that is not the case.
Any insight is appreciated.
Rick
Not following RegEx recursion in PCRE
Re: Not following RegEx recursion in PCRE
Hello Ret,
ret wrote: I assume I may have to use a chunk variant to find where the first match hits and then start the next match from that point, thus writing a recursive function myself
Yes, you are right; that's the way to go.
Otherwise, I have a commercial (non-free) library with a more powerful API
for dealing with this kind of problem.
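The do-it-yourself loop described above can be sketched roughly as follows. This is only an illustration, not the library mentioned here: the handler name allMatches and its parameters are hypothetical, and it assumes the pattern contains one capturing group, since matchChunk only fills its position variables for parenthesized sub-expressions. The pattern in the usage handler leaves out (?R), which in PCRE recurses into the pattern itself rather than repeating the search.

```livecode
# Hypothetical helper: returns every match of pRegex in pText, one per line.
# Assumes pRegex wraps the part of interest in one pair of parentheses.
function allMatches pText, pRegex
   local tStart, tEnd, tFound
   repeat while matchChunk(pText, pRegex, tStart, tEnd)
      # guard against an empty match, which would never advance the search
      if tEnd < tStart then exit repeat
      put char tStart to tEnd of pText & cr after tFound
      # restart the search just past the end of this match
      delete char 1 to tEnd of pText
   end repeat
   # remove the trailing return
   delete last char of tFound
   return tFound
end allMatches

# Usage sketch: list every address found in the field from the question
on mouseUp
   answer allMatches(the text of field "myHTML", "(\w+@\w{2,}\.\w{2,})")
end mouseUp
```

Because pText is passed by value, deleting from it inside the function leaves the field's text untouched.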
Kind regards,
Thierry
Last edited by Thierry on Thu Nov 17, 2022 12:33 pm, edited 2 times in total.
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
Re: Not following RegEx recursion in PCRE
Thierry - thank you so much for the advice and link to code example. This worked great. I very much appreciate your help!
Here is how I modified your code for solving my problem. I am using this with the revBrowserGet functions to populate a "myHTML" field on my main LiveCode card from the company's website, but this code would work for any website containing email data. I am glad to share the revBrowserGet code I am using with anyone who wants to see it. This works in 7.x and 8.x versions of LiveCode.
on mouseUp
   local RX, myHTMLvar, myresult
   local myEMAIL, my2ndEMAIL
   local p1Start, p1End
   put "([\w]+@[\w]{2,}+\.[\w]{2,}(?R)?)" into RX
   put text of field "myHTML" into myHTMLvar
   # matchChunk reports where the next match starts and ends
   repeat while matchChunk(myHTMLvar, RX, p1Start, p1End)
      # matchText runs the same pattern again to capture the matched text
      put matchText(myHTMLvar, RX, myEMAIL) into myresult
      put myEMAIL & cr after my2ndEMAIL
      #commenting next two lines out, but useful during testing
      #answer myresult
      #answer my2ndEMAIL
      # drop everything up to the end of this match, then search again
      delete char 1 to p1End of myHTMLvar
   end repeat
   answer my2ndEMAIL
end mouseUp
Thanks again for saving me a lot of time. I was really losing sleep on this one over the past few days.
Rick
Re: Not following RegEx recursion in PCRE
Hi Rick, I'm happy that you made it.
ret wrote: Thanks again for saving me a lot of time. I was really losing sleep on this one over the past few days.
And yes, this forum is great for sharing experiences.

If you need to replace patterns in a text, remember that
LiveCode's replaceText() function already replaces every match,
just like the '/g' modifier in Perl and other languages.
Hence, there is no need for a repeat while loop.
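As a small illustration of that point, here is a sketch that masks every address in a text with a single replaceText() call. The handler name maskEmails and the replacement string are made up for this example, and the pattern is the one from this thread without the (?R) part.

```livecode
# Hypothetical helper: replaces every email-like match with a placeholder.
function maskEmails pText
   # replaceText substitutes all matches of the pattern, not only the first
   return replaceText(pText, "\w+@\w{2,}\.\w{2,}", "[email removed]")
end maskEmails

# Usage sketch: mask the addresses in the field from the question
on mouseUp
   answer maskEmails(the text of field "myHTML")
end mouseUp
```

Extracting the matches, as opposed to replacing them, still needs the matchChunk loop.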
Regards,
Thierry