LiveCode Forums.

Posted: **Fri Jan 16, 2015 6:39 am**

I have an RSS feed (XML) file that I need to extract links from. This is the part I need to extract the link from (there are several links in this file):

Code: Select all

<weblink>
<![CDATA[
https://www.domainName.com/p.php?l=0&p=0056&id=171
]]>
</weblink>

I'm having a difficult time figuring it out. Can anyone give me a hand or point me in the right direction? Thanks.

Posted: **Fri Jan 16, 2015 8:34 am**

Hi Shawn,
Isn't this just:

Code: Select all

put lineOffset("<![CDATA[",myXML) into myVar
add 1 to myVar

Then you get to use the "lines to skip" parameter of lineOffset to find the next one.

Simon
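To spell out the "lines to skip" idea, here is a minimal sketch, not tested against a real feed. It assumes each URL sits on the line directly after its `<![CDATA[` marker, as in the sample above. The names `cdataLinks` and `pXML` are made up for illustration.

```livecode
function cdataLinks pXML
   local tSkip, tFound, tLinks
   put 0 into tSkip
   repeat forever
      -- lineOffset counts from the first line after the skipped ones; 0 means no more markers
      put lineOffset("<![CDATA[", pXML, tSkip) into tFound
      if tFound = 0 then exit repeat
      -- tSkip now holds the absolute line number of the marker just found
      add tFound to tSkip
      -- assumption: the link is on the next line; word 1 to -1 trims surrounding spaces
      put word 1 to -1 of line (tSkip + 1) of pXML & return after tLinks
   end repeat
   delete last char of tLinks
   return tLinks
end cdataLinks
```

Something like `put cdataLinks(myXML) into fld "fld2"` would then list one link per line.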

Posted: **Fri Jan 16, 2015 4:39 pm**

Hmmm. Not having any luck. I'll continue trying and post some code.

Posted: **Fri Jan 16, 2015 5:12 pm**

OK. I went with the PHP file instead of the XML file and almost have what I need. Here's some code; any help is greatly appreciated.
-- Things I need to do:
A) Loop through fld "fld1" and find a random link.
B) Find the string plus the next 4 characters (see the second block of code).
* The second code block is what I'm trying to achieve, but it obviously doesn't work the way I've written it.

Code: Select all

on mouseUp
   put URL "http://mydomain.com/rss.php" into tURL
   put tURL into fld "fld1"
   find string "https://www.myotherdomain.com/show.php?l=0&u=17156&id=" in fld "fld1"
   put the foundText into tFound
   put tFound into fld "fld2"
end mouseUp

This is what I'd like

Code: Select all

on mouseUp
   put URL "http://mydomain.com/rss.php" into tURL
   put tURL into fld "fld1"
   find random string "https://www.myotherdomain.com/show.php?l=0&u=17156&id=" & + 4 char in fld "fld1"
   put the foundText into tFound
   put tFound into fld "fld2"
end mouseUp

Posted: **Fri Jan 16, 2015 6:28 pm**

So you know how you want the target link to start? Maybe the "begins with" operator will help: http://livecode.com/developers/api/6.0. ... ns%20with/

Posted: **Fri Jan 16, 2015 7:02 pm**

I can find instances of the URL using this (although not all of them in one go), but I also need the next few characters, which change every time but are always 5 digits.

Code: Select all

on mouseUp
   find characters "https://www.mydomain.com/rss.php?z=0&p=156&id="
end mouseUp
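One possible way to pick up the prefix plus the digits that follow is the offset function with its "chars to skip" parameter, working on a variable rather than the field. This is only a sketch: `linksStartingWith` and its parameters are hypothetical names, and it assumes the wanted digits come immediately after the prefix.

```livecode
function linksStartingWith pPrefix, pData, pExtra
   local tSkip, tPos, tStart, tLinks
   put 0 into tSkip
   repeat forever
      -- offset counts from the first char after the skipped ones; 0 means no more matches
      put offset(pPrefix, pData, tSkip) into tPos
      if tPos = 0 then exit repeat
      put tSkip + tPos into tStart -- absolute position of this match
      -- take the prefix plus the next pExtra characters
      put char tStart to (tStart + length(pPrefix) + pExtra - 1) of pData & return after tLinks
      -- continue searching after the prefix just found
      put tStart + length(pPrefix) - 1 into tSkip
   end repeat
   delete last char of tLinks
   return tLinks
end linksStartingWith

on mouseUp
   local tData, tLinks
   put URL "http://mydomain.com/rss.php" into tData
   put linksStartingWith("https://www.myotherdomain.com/show.php?l=0&u=17156&id=", tData, 5) into tLinks
   -- "any line" picks one of the collected links at random
   put any line of tLinks into fld "fld2"
end mouseUp
```

Collecting all matches first and then using `any line` covers the "random link" part without needing a random find.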

Posted: **Sat Jan 17, 2015 3:27 am**

Hi shawn,
Can you post some of the XML/PHP output, whatever is returned?

And never loop through a field (way slow); always put its contents into a variable and loop over that.

Simon
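As a rough illustration of looping over a variable instead of the field, the sketch below copies the field text once and uses `repeat for each line`. The field names come from the earlier posts; the `https://` test with "begins with" is just an assumed filter, so adjust it to whatever your links really start with.

```livecode
on mouseUp
   local tData, tLine, tLinks
   -- read the field once; everything after this works on the variable
   put fld "fld1" into tData
   repeat for each line tLine in tData
      -- assumption: a wanted link starts its line with https://
      if tLine begins with "https://" then put tLine & return after tLinks
   end repeat
   delete last char of tLinks
   put tLinks into fld "fld2"
end mouseUp
```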

Posted: **Thu Jan 22, 2015 4:29 pm**

This page could help you. It explains how to create an RSS feed reader using the LiveCode XML functions: http://livecodeitalia.blogspot.it/2014/ ... e-rss.html
Use the Google Translate button on the right to translate it into your language.

Posted: **Wed Jan 28, 2015 11:16 pm**

I have been working on learning regex and I thought this question would be a good one to try to see if you could extract the URLs with regex.

If the data you are looking for is always in the form <![CDATA[ the url]]> then the regex <!\[CDATA\[(.*)\]\] will capture the URL.

I put a couple of lines with urls in this format in the following code to test.

Code: Select all

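on mouseUp
   local tURLtoExtract, tURL, tStart, tEnd, tSuccess
   -- two sample lines, each holding one CDATA-wrapped URL
   put "<![CDATA[https://www.domainName.com/p.php?l=0&p=0056&id=181]]>"  & CR & "<![CDATA[https://www.domainName.com/p.php?l=0&p=0064&id=151]]>" into tURLtoExtract
   -- matchText puts the text captured by (.*) into tURL
   put matchtext(tURLtoExtract, "<!\[CDATA\[(.*)\]\]",tURL) into tSuccess
   -- matchChunk puts the start and end character positions of that capture into tStart and tEnd
   put matchchunk(tURLtoExtract, "<!\[CDATA\[(.*)\]\]",tStart,tEnd) into tSuccess
   put tSuccess into line 1 of msg
   put tStart into line 2 of msg
   put tEnd into line 3 of msg
   put tURL into line 4 of msg
end mouseUp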

matchText finds the first match and puts the captured URL in tURL. It won't return subsequent matches, so you would have to iterate through your text to find them. If there is only one URL per line in your feed, you could iterate over each line.

If that does not work, you could also use matchChunk, which puts the start and end positions of the match into the variables you pass to it. You could have a repeat loop that uses the end position to delete characters up to that point in the text and then use matchText and matchChunk again to get the next URL.

Not sure if this will do what you want but would be interested to see if it did.

Martin
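For the one-URL-per-line case, the iteration could look something like the sketch below. It assumes each CDATA section opens and closes on the same line, which is not how the sample in the first post is laid out, so treat it as a starting point. `cdataURLs` is a made-up name, and the pattern uses a non-greedy `(.*?)` with the closing `>` added so the capture stops at the first `]]>`.

```livecode
function cdataURLs pText
   local tLine, tURL, tURLs
   repeat for each line tLine in pText
      -- matchText returns true when the line holds a CDATA section and fills tURL with its content
      if matchText(tLine, "<!\[CDATA\[(.*?)\]\]>", tURL) then
         put tURL & return after tURLs
      end if
   end repeat
   delete last char of tURLs
   return tURLs
end cdataURLs
```

Calling `put cdataURLs(tURLtoExtract) into msg` with the two test lines from the handler above should list both URLs, one per line.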
