Not following RegEx recursion in PCRE

ret
Posts: 2
Joined: Fri May 10, 2013 1:28 am

Post by ret » Thu Aug 06, 2015 6:25 am

I have an application I have been working on that uses regular expressions to extract email addresses from my customer's website.

on mouseUp
   put text of field "myHTML" into myHTMLvar

   put matchText(myHTMLvar,"([\w]+@[\w]{2,}+\.[\w]{2,}(?R)?)",myEMAIL) into myresult

   answer myresult
   answer myEMAIL

end mouseUp

I get the first email address just fine, but I don't understand how to get PCRE to do recursion. PCRE does not accept the /g modifier, and after several days of searching for an alternative (such as a combination of (?m) with ^ and $), I am at a loss for how to solve this problem with a PCRE regex alone. I know it is possible, but I am just not getting it.

All the PHP examples online use a function called preg_match_all, but that does not appear to be valid in LiveCode.

I assume I may have to use a chunk variant to find where the first match hits and then start the next match from that point, thus writing a recursive function myself, but I am hoping that is not the case.

Any insight is appreciated.

Rick

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Re: Not following RegEx recursion in PCRE

Post by Thierry » Thu Aug 06, 2015 10:57 am

ret wrote: I assume I may have to use a chunk variant to find where the first match hits and then start the next match from that point, thus writing a recursive function myself

Hello Ret,

Yes, you are right; that's the way to go.


Otherwise, I have a non-free library with a more powerful API
for dealing with this kind of problem.

Kind regards,

Thierry
Last edited by Thierry on Thu Nov 17, 2022 12:33 pm, edited 2 times in total.
!
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
!


Re: Not following RegEx recursion in PCRE

Post by ret » Thu Aug 06, 2015 12:39 pm

Thierry - thank you so much for the advice and link to code example. This worked great. I very much appreciate your help!

Here is how I modified your code for solving my problem. I am using this with the revBrowserGet functions to populate a "myHTML" field on my main LiveCode card from the company's website, but this code would work for any website containing email data. I am glad to share the revBrowserGet code I am using with anyone who wants to see it. This works in 7.x and 8.x versions of LiveCode.

on mouseUp
   local RX
   local x
   local myEMAIL
   local my2ndEMAIL

   put "([\w]+@[\w]{2,}+\.[\w]{2,}(?R)?)" into RX

   put text of field "myHTML" into myHTMLvar

   repeat while matchChunk(myHTMLvar, RX, p1start, p1End)
      put matchText(myHTMLvar, RX, myEMAIL) into myresult
      put myEMAIL & cr after my2ndEMAIL

      #commenting next two lines out, but useful during testing
      #answer myresult
      #answer my2ndEMAIL

      delete char 1 to p1End of myHTMLvar
   end repeat

   answer my2ndEMAIL

end mouseUp
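
For anyone reusing this, the loop above can also be wrapped in a small function. The following is only a sketch, not Thierry's original code: the function name allMatches and the t/p variable names are made up for illustration. It assumes the pattern has one capturing group around the text to collect and that this group never matches an empty string (otherwise the loop would not advance). It takes the match directly from the positions returned by matchChunk, so the extra matchText call is not needed. The example pattern leaves out the (?R)? part, on the assumption that the loop already takes care of the repetition.

```livecode
# Hypothetical helper: returns every match of pRegex in pText, one per line.
# pRegex must contain one capturing group around the text to collect.
function allMatches pText, pRegex
   local tStart, tEnd, tFound
   repeat while matchChunk(pText, pRegex, tStart, tEnd)
      # tStart and tEnd are the character positions of the first capture group
      put char tStart to tEnd of pText & cr after tFound
      # drop everything up to the end of this match, then search again
      delete char 1 to tEnd of pText
   end repeat
   # remove the trailing return
   delete last char of tFound
   return tFound
end allMatches

on mouseUp
   local tEmails
   put allMatches(the text of field "myHTML", "(\w+@\w{2,}\.\w{2,})") into tEmails
   answer tEmails
end mouseUp
```

Because LiveCode passes parameters by value unless told otherwise, deleting characters from pText inside the function leaves the contents of the field untouched.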

Thanks again for saving me a lot of time. I was really losing sleep on this one over the past few days.

Rick


Re: Not following RegEx recursion in PCRE

Post by Thierry » Thu Aug 06, 2015 1:46 pm

ret wrote: Thanks again for saving me a lot of time. I was really losing sleep on this one over the past few days.

Hi Rick, I'm happy that you made it.

And yes, this forum is great for sharing experiences :)

If you ever need to replace a pattern in a text, remember that
the replaceText() LC function already behaves
like the '/g' modifier in Perl and other languages: it replaces every match.
Hence, no need for a repeat while ...
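
As a small illustration of the point about replaceText(), here is a minimal sketch. The sample text, the placeholder addresses and the "[hidden]" replacement string are invented for the example.

```livecode
on mouseUp
   local tText
   put "Contact rick@example.com or sales@example.org" into tText
   # a single call handles every match in tText, so no loop is needed
   put replaceText(tText, "\w+@\w{2,}\.\w{2,}", "[hidden]") into tText
   answer tText
end mouseUp
```

Both placeholder addresses should come back replaced, since replaceText() works on all matches rather than just the first one.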

Regards,

Thierry
