Regex for LiveCode

stam
Posts: 3061
Joined: Sun Jun 04, 2006 9:39 pm

Re: Regex for LiveCode

Post by stam » Sat Aug 31, 2024 2:37 pm

Hi Kaveh,
What is the pattern you’re trying to capture?

Why not just use the filter command (i.e. filter with or without “*●*”)?

Do you need regex? And if so, what groups are you trying to capture?

kaveh1000
Livecode Opensource Backer
Posts: 539
Joined: Sun Dec 18, 2011 7:23 pm

Post by kaveh1000 » Sat Aug 31, 2024 2:55 pm

I am trying to convert this:

Code:

Collection	47
Collection	●
CollectionIdentifier	50
CollectionIdentifier	●
CollectionIDType	51
CollectionIDType	●
CollectionSequence	52
CollectionSequenceNumber	55
CollectionSequenceType	53
CollectionSequenceTypeName	54
CollectionType	48
CollectionType	●

into this:

Code:

Collection	47	●
CollectionIdentifier	50	●
CollectionIDType	51	●
CollectionSequence	52
CollectionSequenceNumber	55
CollectionSequenceType	53
CollectionSequenceTypeName	54
CollectionType	48	●

where all spaces are tabs. So in BBEdit on Mac, search for:

Code:

^(.+)\t([0-9]+)\n\1\t●

and replace with:

Code:

\1\t\2\t●
Kaveh
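
For readers who want to try the search and replace above outside BBEdit, here is a minimal sketch in Python rather than LiveCode. The pattern and replacement strings are copied from the post; the function name `merge_dots` and the shortened sample data are assumptions made for illustration. `re.MULTILINE` is set because the pattern relies on `^` matching at the start of every line.

```python
import re

# Pattern and replacement copied from the post. re.MULTILINE makes ^ match at
# the start of each line, which the search above depends on.
PATTERN = re.compile(r"^(.+)\t([0-9]+)\n\1\t●", re.MULTILINE)
REPLACEMENT = r"\1\t\2\t●"


def merge_dots(text):
    """Fold each 'name<TAB>●' line into the 'name<TAB>number' line directly above it."""
    return PATTERN.sub(REPLACEMENT, text)


# Shortened version of the sample data from the post (hypothetical subset).
sample = "\n".join([
    "Collection\t47",
    "Collection\t●",
    "CollectionSequence\t52",
    "CollectionSequenceNumber\t55",
    "CollectionType\t48",
    "CollectionType\t●",
])

print(merge_dots(sample))
```

The backreference `\1` inside the pattern is what ties the dot line to the line above it: a pair is merged only when both lines start with exactly the same name, so `CollectionSequence` and `CollectionSequenceNumber` are left alone.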

Post by stam » Sat Aug 31, 2024 4:17 pm

Yeah, I see. Much shorter code with regex… but sadly you can’t do that with LiveCode's regex support.

Still, should only be a few lines with a couple of filters and a couple of loops. Definitely not as elegant but should be performant enough…

Post by stam » Sat Aug 31, 2024 10:56 pm

Sorry, I meant to post this earlier, and I'm guessing you've already solved this, but for the sake of discussion, a non-regex way of doing this could be:

Code:

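function findDots pText
    local tOutput, tDots, temp
    -- collect every line that ends with the dot marker
    filter pText with "*●" into tDots
    set the itemdelimiter to tab
    repeat for each line tLine in pText
        -- skip a line whose name repeats the previous line's name (the dot line)
        if item 1 of tLine = temp then next repeat
        put item 1 of tLine into temp
        -- add the dot as item 3 when this name also appears in the dot lines
        if temp & tab is in tDots then put "●" into item 3 of tLine
        put tLine & return after tOutput
    end repeat
    delete the last char of tOutput
    return tOutput
end findDots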

Running this with your source text in field 1:

Code:

on mouseUp
    put findDots(field 1) into field 2
end mouseUp

puts the following into field 2:

Code:

Collection	47	●
CollectionIdentifier	50	●
CollectionIDType	51	●
CollectionSequence	52
CollectionSequenceNumber	55
CollectionSequenceType	53
CollectionSequenceTypeName	54
CollectionType	48	●

I'm sure you've solved this already; it's just a quick exercise ;)

Post by kaveh1000 » Sun Sep 01, 2024 9:40 am

Thank you so much Stam. Yes, I did something similar. It shows what a wonderful language LiveCode is. But this method gets slow if there are, say, 1000s of lines. The thing about regex is that it is fast, so it would be a great addition to LiveCode. I'll give you another example of what I need to do often, and that is to colour text. See the attached screenshot. This is an XML file with 1000s of lines. I started by doing a loop with matchChunk and setting the colour of the relevant text, but it takes a long, long time. Instead I now get the htmlText of the field, do one simple regex on the whole text, then set the htmlText to the result. It takes 4 seconds!!
Attachments
Screenshot 2024-09-01 at 09.31.33.png
Kaveh
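
The actual regex used for the colouring is not shown in the post, so the following is only a sketch of the general idea: take the whole htmlText as one string, wrap every XML tag in a colour element in a single pass, and write the result back. It is in Python rather than LiveCode. The pattern, the function name `colour_tags` and the colour value are all assumptions, as is the premise that literal angle brackets appear in the htmlText as the entities `&lt;` and `&gt;`.

```python
import re

# Assumption: in the field's htmlText, literal XML angle brackets are stored as
# the entities &lt; and &gt;, so <title> appears as &lt;title&gt;.
# Hypothetical pattern: an escaped opening or closing tag with no nested entities.
TAG = re.compile(r"&lt;/?[A-Za-z][^&<>]*&gt;")


def colour_tags(html_text, colour="#0000FF"):
    """Wrap every escaped XML tag in a <font color=...> element in one pass."""
    return TAG.sub(
        lambda m: f'<font color="{colour}">{m.group(0)}</font>',
        html_text,
    )


line = "<p>&lt;title&gt;Regex for LiveCode&lt;/title&gt;</p>"
print(colour_tags(line))
```

Tags whose attributes contain further entities (for example escaped quotes) would not match this simplified pattern and would need a broader one.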

Post by stam » Sun Sep 01, 2024 9:49 am

kaveh1000 wrote:
Sun Sep 01, 2024 9:40 am
Thank you so much Stam. Yes, I did something similar. It shows what a wonderful language LiveCode is. But this method gets slow if there are say 1000s of lines.

Agreed - I would much prefer to use regex for stuff like this, and it’s annoying that the PCRE implementation is incomplete even though it was introduced many years ago.
Given the current state of play, I very much doubt we’ll see any improvements on this in the near to medium term future…

Having said that - we’ve all seen that tweaking code can massively speed things up, often by orders of magnitude, and since there is no direct regex option for this you could always post your code (and/or a real sample text with 1000s of lines) and see what others come up with if you’re finding your code slow…

Post by kaveh1000 » Sun Sep 01, 2024 10:49 am

Agree with all.

I am not sure this is possible, but I wonder if there is a way of having an external (non-LiveCode) module that does the regex job and sends the result back to LiveCode. Just thinking outside the box. Or an open library that can be embedded in LiveCode. FYI, I have written my own version that does basic backreferencing, e.g. \2\1 in replace, but it is far from complete, e.g. no lookaround. Happy to share that if it is of interest.
Kaveh

Post by stam » Sun Sep 01, 2024 11:05 am

If you have a generalisable solution that can be used for backreferencing agnostically of the calling handler, I’d be happy to add that to the wiki, fully acknowledging you of course.

As for an external library, that should in theory be possible with a LiveCode Builder library, but I’m afraid I can’t help there; perhaps others can.
Last edited by stam on Sun Sep 01, 2024 12:33 pm, edited 1 time in total.

Post by kaveh1000 » Sun Sep 01, 2024 11:09 am

Thanks. I'll take a look at the code again and see if it is worth uploading. :-)
Kaveh

Post by stam » Sun Sep 01, 2024 6:45 pm

kaveh1000 wrote:
Sun Sep 01, 2024 9:40 am
But this method gets slow if there are say 1000s of lines.

Well, you see, I take these things as a challenge ;)

I confirmed that with several thousand lines the handler I posted was terribly slow: parsing 5000 lines took 19,913 milliseconds!
I had a few ideas which only made things slower. BUT: I got there in the end.

With the array-based handler below, I can successfully parse your example text replicated to 5000 lines in 68 milliseconds!
Can't imagine it would be much quicker with regex!!!

The handler first creates an array where the key is item 1 of each of your lines. It puts the number into one subkey of this and, if a dot exists, puts the dot into a different subkey. Optionally you can combine these 2 subkeys into a string separated by tabs, and then it's a simple matter of using combine to create text.

I'm quite happy with the speed increase (from 19,913 to 86 milliseconds; it would probably be quicker still if I didn't bother generating text to return), which goes to show the value of arrays and what a great language this is ;)

Here's a handler that takes text and returns text. It can be truncated if you want it to return an array instead (either with subkeys or as a simple 1-dimensional array):

Code:

function parseWithArrays pText
    local tArray, tKeys

    // create a numerically keyed array, each element containing a line
    split pText by return

    // create an array where the key is item 1 of the source, the number is subkey 1 and, if a dot exists, it is subkey 2
    repeat for each element tElement in pText
        split tElement by tab
        if tElement[2] is not "●" then 
            put tElement[2] into tArray[tElement[1]][1]
        else
            put tElement[2] into tArray[tElement[1]][2]
        end if
    end repeat
    
    // optionally combine subkeys into 1
    put the keys of tArray into tKeys
    repeat for each line tKey in tKeys
        if tArray[tKey][2] is "●"  then 
            put tArray[tKey][1] & tab & "●"  into tArray[tKey]
        else 
            put tArray[tKey][1] into tArray[tKey]
        end if
        delete variable tArray[tKey][1]
        delete variable tArray[tKey][2]
    end repeat
    
    // if combined into single key, optionally create plain text  using 'combine'
    combine tArray using return and tab
    
    return tArray
end parseWithArrays
Stam
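
The grouping idea described in this post (key on the first item, keep the number and the dot in separate slots, then join them back into text) can also be sketched outside LiveCode. The version below is in Python and is not a translation of the handler above. The names `parse_with_dict`, `number` and `dot` are made up for illustration, and it relies on Python dicts preserving insertion order, so the output follows the order in which names first appear.

```python
def parse_with_dict(text):
    """Group lines by their first tab-separated item, then rebuild the text."""
    entries = {}
    for line in text.split("\n"):
        if not line:
            continue  # ignore empty lines
        name, _, value = line.partition("\t")
        slot = entries.setdefault(name, {})
        if value == "●":
            slot["dot"] = value      # plays the role of the second subkey
        else:
            slot["number"] = value   # plays the role of the first subkey

    # Combine the two slots of each entry into one tab-separated line.
    out = []
    for name, slot in entries.items():
        fields = [name, slot.get("number", "")]
        if "dot" in slot:
            fields.append("●")
        out.append("\t".join(fields))
    return "\n".join(out)


# Shortened version of the sample data from the thread (hypothetical subset).
sample = "\n".join([
    "Collection\t47",
    "Collection\t●",
    "CollectionSequence\t52",
    "CollectionSequenceNumber\t55",
    "CollectionType\t48",
    "CollectionType\t●",
])

print(parse_with_dict(sample))
```

Unlike the regex from earlier in the thread, this approach also pairs a dot line with its number line when the two are not adjacent, because the lookup is by key rather than by position.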

Post by kaveh1000 » Sun Sep 01, 2024 8:40 pm

Thank you so much Stam. I am going to take time to study this carefully, especially as I am not too good at arrays.

BTW the task I said was much faster using regex is the coloring of the text, and that is what I used regex for.

Love to continue the conversation!
Kaveh

Post by stam » Sun Sep 01, 2024 9:14 pm

Yeah, my go-to is regex as well - it's lean and it's mean and the internet is full of solutions and suggestions for using it.

But every now and again LiveCode makes us do things differently. Often that's fine, but sometimes it can be slow. Still, I've seen many examples on the forum where people had slow handlers and it was possible to speed them up by at least an order of magnitude, if not more, so it's always worth persisting...

Well, at least it makes for a fun pastime ;)

FourthWorld
VIP Livecode Opensource Backer
Posts: 10043
Joined: Sat Apr 08, 2006 7:05 am

Post by FourthWorld » Mon Sep 02, 2024 6:09 am

kaveh1000 wrote:
Sun Sep 01, 2024 8:40 pm
BTW the task I said was much faster using regex is the coloring of the text, and that is what I used regex for.
May we see the code you're using for that? StyledText arrays are often surprisingly efficient.
Richard Gaskin

Post by kaveh1000 » Mon Sep 02, 2024 7:07 am

Of course. Please see three files here: https://drive.google.com/drive/folders/ ... sp=sharing

So I use regex to convert the single-color htmlText to colored text. That takes 4 seconds or so in LiveCode. I'm interested in a non-regex solution. :-)
Kaveh

bn
VIP Livecode Opensource Backer
Posts: 4163
Joined: Sun Jan 07, 2007 9:12 pm

Post by bn » Mon Sep 02, 2024 9:14 am

Hi Kaveh,

Here is an example of using styledText to colorise the XML file you posted as "originaltext.text".

It contains some inline HTML, which I do not colorise as you did not colorise it either:
<d104>&lt;P>Vijay Kumar

Using styled text takes a bit of practice and is not a cure-all, but your use case is pretty straightforward.

Kind regards
Bernd
Attachments
useStyledTextKaveh.livecode.zip
