Regex expression

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Re: Regex expression

Post by Thierry » Wed Jul 03, 2013 8:45 pm

Klaus wrote: Ladies and Gentlemen,

attention please, this is the moderator speaking!
Thierry wrote: Regexps are like cheese..
You are not allowed to mention "cheese" with postings < 10000!
:-D :-D :-D

Oops, sorry, Mister Moderator!

As I'll be in Cologne in a week or so,
I'll bring you a French baguette, a bottle of red wine and a camembert to earn your forgiveness :)

Best,

Thierry
!
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
!

Klaus
Posts: 14177
Joined: Sat Apr 08, 2006 8:41 am

Re: Regex expression

Post by Klaus » Wed Jul 03, 2013 8:54 pm

Agreed, my friend! :-)

kimberlyn
Posts: 18
Joined: Mon Jul 01, 2013 1:40 pm

Re: Regex expression

Post by kimberlyn » Thu Jul 04, 2013 10:46 am

Well,

My file contains lines like this

Code: Select all

ITEM1,10,3,50,ATK,DFS,SPE,[GOF,30,33,0];
ITEM1,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;TTF,21,33,;JDK,10,20,45;...];
ITEM1,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;FFG,12,13,,];
As you can see, the format is always the same, but the number of parameters inside the "[]" is unlimited. The only certainty is that they come in groups of 4 parameters.

What I'm trying to do is write a regex to test whether the lines have the right format, which is:

Code: Select all

par1,nb1,nb2,nb3,par2,par3,par4,[par5,nb4,nb5,nb6;par5,nb7,nb8,nb9;...];
I hope that makes it clearer what I'm trying to do :)

kimberlyn
Posts: 18
Joined: Mon Jul 01, 2013 1:40 pm

Re: Regex expression

Post by kimberlyn » Thu Jul 04, 2013 10:08 pm

up ! :)

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Re: Regex expression

Post by Thierry » Wed Jul 10, 2013 5:35 pm

kimberlyn wrote:up ! :)
Hi,

Glad that you made it!

However, I think a few LiveCoders might be interested in a complete working example,
so here is one.

I tested with the following data; the first 3 lines are valid, the others are not:

Code: Select all

ITEM1,10,3,50,ATK,DFS,SPE,[GOF,30,33,0];
ITEM2,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;TTF,21,33,;JDK,10,20,45];
ITEM3,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;FFG,12,13,];
ITEM4,10,3,50,ATK,DFS,SPE [FFG,12,13,];
ITEM5,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;;FFG,12,13,];
ITEM6,10,3,50,ATK,DFS,SPE,[GOF,30,33,0;]FFG,12,13,];
ITEM7,10,3,50,ATK,DFS,SPE,[];
ITEM8,10,3,50,ATK,DFS,SPE,[GOF,30,33,0,FFG,12,13,];

Code: Select all

local myLog

on mouseup
   put empty into myLog
   put field "Ftest" into myText
   repeat for each line aLine in myText
      -- Validate a list of 8 items with a comma separator
      -- the last item is enclosed by brackets, end of line is ;
      -- and split the line in 2 parts
      get "(?x)            " & \ # whitespace is ignored in the regex
      " \A                 " & \ # beginning of the line
      " ( (?: [^,]+,){7} ) " & \ # first 7 non-empty items separated by comma plus capture it
      " \[ ( [^\]]+ ] )    " & \ # the eighth item enclosed by square brackets; capture it with its closing ]
      " ;\z                "     # last char of the line is a semicolon
      if not matchText( aLine, IT, part1, part2 ) then
         myput "Bad1:", aLine
         next repeat
      end if
      
      -- Validate part1, ie: ITEM3,10,3,50,ATK,DFS,SPE,
      get "(?x)            " & \ # whitespace is ignored in the regex
      " \A                 " & \ # beginning of the line
      " [\w]+ ,            " & \ # 1st item: one or more word chars (letters, digits, underscore)
      " (?: [\d]+,){3}     " & \ # items 2 to 4 are non-empty numbers. do not capture ()
      " (?: [\w]+,){3}     " & \ # items 5 to 7: one or more word chars each. do not capture ()
      " \z                 "     # end of line
      if not matchText( part1, IT ) then
         myput "Bad2:", part1
         next repeat
      end if
      
      -- Validate part2, ie: GOF,30,33,0;FFG,12,13,]
      get "(?x)            " & \ # whitespace is ignored in the regex
      " \A                 " & \ # beginning of the line
      "(?:                 " & \ # 1st block, do not capture ()
      "  [\w]+             " & \ # 1st item: one or more word chars (letters, digits, underscore)
      "  (?:,[^,;\]]*){3}  " & \ # 3 items, each possibly empty, comma separator, do not capture ()
      "  [;\]]             " & \ # ending with a ; or ]
      ")+                  " & \ # end of 1st block: 1 or more times 
      " \z                 "     # end of line
      if not matchText( part2, IT ) then
         myput "Bad3:", part2
         next repeat
      end if
      
      myput "OK 1:", aLine
   end repeat
   put myLog
end mouseup

on myput p1, p2
   put p1 & tab & p2 &cr after myLog
end myput
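
For readers who only need a yes/no answer, the three checks above can probably be folded into one pattern. This is a minimal sketch, not part of the tested handler: the function name isValidItemLine and its variable names are made up for illustration. It gives up the Bad1/Bad2/Bad3 diagnostics in exchange for brevity, and it should accept and reject the same eight test lines.

```livecode
-- Hypothetical helper (not from the tested handler above):
-- returns true if pLine has the expected format
function isValidItemLine pLine
   local tBlock, tPattern
   -- one block inside the brackets: a word, then 3 possibly empty values
   put "\w+(?:,[^,;\]]*){3}" into tBlock
   -- 1 word, 3 numbers and 3 words, each followed by a comma
   put "\A\w+,(?:\d+,){3}(?:\w+,){3}" into tPattern
   -- then [block;block;...] and the final semicolon
   put "\[" & tBlock & "(?:;" & tBlock & ")*\];\z" after tPattern
   return matchText(pLine, tPattern)
end isValidItemLine
```

Inside the repeat loop it could be used as: if isValidItemLine(aLine) then myput "OK 1:", aLine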
Regards,

Thierry

kimberlyn
Posts: 18
Joined: Mon Jul 01, 2013 1:40 pm

Re: Regex expression

Post by kimberlyn » Fri Jul 19, 2013 8:54 am

Oh, thank you, I hadn't seen your answer!

May I ask one last question?
All of this works very well. But what about a line like this:

Code: Select all

ITEM6,ATK,DFS,SPE,[GOF,30,33,0], 2, [FFG,12,13; PPT, 10, 0]; 
I'm focusing on this part:
2, [FFG,12,13; PPT, 10, 0].

The "2" means that I have two components, and I want to take the second value of each one (12 and 10 here).
How can I do that, given that the length and number of elements before this part vary?
I'm thinking of using the "]," which tells me where the first part ends.

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Re: Regex expression

Post by Thierry » Sat Jul 20, 2013 10:14 am

kimberlyn wrote: when you have a line like this:

Code: Select all

ITEM6,ATK,DFS,SPE,[GOF,30,33,0], 2, [FFG,12,13; PPT, 10, 0]; 
I'm focusing on this part:
2, [FFG,12,13; PPT, 10, 0].
The "2" means that I have two components, and I want to take the second value of each one (12 and 10 here).

As you're focusing on the last 2 items of your text line, I would try to capture these 2 items.
For clarity, I've dropped all the extra spaces in your text;
if you need them, it's easy to add \s* wherever necessary.
Unfortunately, a regex can't capture a variable number of repeated groups (a repeated group keeps only its last match), so you have to mix regex and LiveCode,
which works pretty well in most cases.

Here is a working piece of code.

Code: Select all

on mouseup
   get "ITEM6,ATK,DFS,SPE,[GOF,30,33,0],2,[FFG,12,13;PPT,10,0];"
   if not matchText( IT, ",(\d+),\[([^\]]+)];\z", N, lastPart ) then exit to top
   -- lastPart contains:  FFG,12,13;PPT,10,0
   replace ";" with cr in lastPart
   if N is not the number of lines of lastPart then answer "N incorrect"
   repeat for each line aLine in lastPart
      put item 2 of aLine &cr after whateveryoulike
   end repeat
   answer "Got it! N is " & N &cr& "result:" &cr& whateveryoulike
end mouseup
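
If the extra spaces have to stay in the data, the \s* idea might look like the sketch below. It is an untested variant under assumptions: the function name secondValues and its variable names are hypothetical, and it returns empty both when the line does not match and when the count disagrees with the number of components.

```livecode
-- Hypothetical variant: returns the 2nd value of each component,
-- one per line, or empty if the line is malformed
function secondValues pLine
   local tCount, tLastPart, tResult
   -- \s* tolerates optional spaces around the separators and at the end of the line
   if not matchText(pLine, ",\s*(\d+)\s*,\s*\[([^\]]+)\]\s*;\s*\z", tCount, tLastPart) then
      return empty
   end if
   -- one component per line
   replace ";" with cr in tLastPart
   if tCount is not the number of lines of tLastPart then return empty
   repeat for each line tBlock in tLastPart
      -- word 1 to -1 strips the spaces around the item
      put word 1 to -1 of item 2 of tBlock & cr after tResult
   end repeat
   delete the last char of tResult
   return tResult
end secondValues
```

Called with the original line, spaces included, it should return 12 and 10 on two lines.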
All the best,

Thierry
