=^..^= I can not find my head(ers)... ;)
Posted: Tue Apr 15, 2014 4:55 pm
by Mariasole
Ciao livecoders!
It is probably a stupid question, but I cannot find any information about it...
Is it possible to download only the "headers" of a web page without downloading all the content?
Let me give an example. There is a very large HTML page that is updated every so often.
Can I have a function that downloads only the headers of this page without downloading all of it?
Thanks to the HTTP headers I could check the last modification date without consuming all the bandwidth of the server!
I tried to use the function "libUrlLastRHHeaders()", but it seems to me that I first have to download the web page before getting the last headers...
Is there something like this in LiveCode:
Code:
give me the header of the page "http://www.example.com" without download the web page // :wink:
Thank you all for your help, and for helping me not to lose my head(er)
=^..^=
Mariasole
The Live Code
pasionaria
Re: =^..^= I can not find my head(ers)... ;)
Posted: Tue Apr 15, 2014 5:23 pm
by Thierry
Mariasole wrote:
Code:
give me the header of the page "http://www.example.com" without download the web page // :wink:
Hi Maria,
You can always use the curl shell command as shown below (pseudo-code);
you need to check the curl options, as I don't know them by heart.
Code:
get shell("curl .... --head https://.......")
From memory, the --head option should give you only the head part.
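A minimal sketch of how that shell call could be wrapped in a LiveCode function. The handler name headersViaCurl is hypothetical, and it assumes curl is installed and on the PATH of the machine running the stack (not guaranteed, especially on Windows). The URL is wrapped in double quotes for the shell, which is enough for simple addresses but is not full shell escaping.

```livecode
-- Hypothetical helper: returns the raw response headers for pUrl.
-- Assumes the curl command-line tool is available on this machine.
function headersViaCurl pUrl
   local tHeaders
   -- avoid a console window flashing up on Windows
   set the hideConsoleWindows to true
   -- -s     : silent mode, no progress meter
   -- --head : send a HEAD request, so only headers come back
   put shell("curl -s --head " & quote & pUrl & quote) into tHeaders
   return tHeaders
end headersViaCurl
```

For example, `put headersViaCurl("http://www.example.com")` in the message box should show the status line followed by the header lines, without the page body.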
Regards,
Thierry
Re: =^..^= I can not find my head(ers)... ;)
Posted: Tue Apr 15, 2014 8:08 pm
by Mariasole
Thanks Thierry! Molto gentile!
I went to look for curl... I have no idea how it works, I'll have to study it...
Is there no "native" solution for LC?
Thanks again....
=^..^= Mariasole
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 7:05 am
by Thierry
Mariasole wrote: Thanks Thierry! Molto gentile!
I went to look for curl... I have no idea how it works, I'll have to study it...
Here is a nice tutorial:
http://code.tutsplus.com/tutorials/tech ... --net-8470
and some code to test:
Code:
get shell("curl http://www.google.fr") -- fetches the whole page
get shell("curl -s http://www.google.fr") -- -s: silent mode, no progress meter
get shell("curl -s --head http://www.google.fr") -- --head: headers only
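Once a header block has come back, the modification date mentioned in the first post can be picked out of it. Here is a small sketch; the function name headerValue is hypothetical and it simply assumes one "Name: value" pair per line, which is what curl --head normally prints.

```livecode
-- Hypothetical helper: returns the value of the header called pName
-- from a block of raw HTTP headers, or empty if it is not present.
function headerValue pHeaders, pName
   local tLine
   -- header lines end in CRLF; drop the CR so it does not end up in the value
   replace numToChar(13) with empty in pHeaders
   repeat for each line tLine in pHeaders
      -- string comparison is case-insensitive by default in LiveCode
      if tLine begins with (pName & ":") then
         -- skip the name and the colon, then trim surrounding whitespace
         return word 1 to -1 of (char (length(pName) + 2) to -1 of tLine)
      end if
   end repeat
   return empty
end headerValue
```

Usage, with the same test URL as above: `put headerValue(shell("curl -s --head http://www.google.fr"), "Last-Modified")`. Not every server sends a Last-Modified header, so an empty result is possible.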
There is no solution "native" for LC?
Umm, the last time I needed it (a couple of years ago), I didn't find anything that worked well.
But as we have had a plethora of new LC versions since then, maybe that's not true anymore?
Sadly, if no one has given you an answer... you know what I mean
Regards,
Thierry
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 11:48 am
by Klaus
Hi all,
Mariasole wrote:There is no solution "native" for LC?
no, not at the moment.
best
Klaus
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 1:13 pm
by Thierry
Hi Maria,
So, after Klaus's statement, which doesn't surprise me much,
I highly recommend spending a bit of time with curl.
It's a powerful and serious tool, and it is not that difficult to work with...
Maybe other ideas will come...
Good luck,
My 2 cents,
Thierry
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 1:53 pm
by atout66
Hi Maria,
The script below loads the URL and it seems that it's what you want to avoid...
Anyway, if you just want to see the header part of a specific page, it could help you a bit:
Code:
on mouseUp
   ## We assume we search between the divs <thePrimString> | <theEndString>
   ## and that you know the name of the next div after the <header> div
   local thePrimString , theEndString , theUrl , theTxt
   local theFirstLine , theLastLine -- integer
   put empty into fld "temp" -- the field where you'll look at the unformatted <header>
   put "<div id=" & quote & "header" & quote & ">" into thePrimString -- for : <div id="header">
   put "<div id=" & quote & "home" & quote & ">" into theEndString -- for : <div id="home"> the next one after <header>
   put "http://livecode.com/" into theUrl
   put URL theUrl into theTxt
   ###########
   put lineOffset(thePrimString,theTxt) into theFirstLine
   put lineOffset(theEndString,theTxt) into theLastLine
   put line theFirstLine to (theLastLine -1) of theTxt into theTxt
   set the htmltext of fld "temp" to theTxt -- copy without HTML tags
end mouseUp
//////////////////////////////////////////////////
My 2 cents too
Kind regards, Jean-Paul.
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 2:06 pm
by Thierry
My 2 cents too
Ummm, your code does certainly extract the header part of the full url.
Often we want to have only the header and avoid downloading the whole web pages
for speed concerns..
Regards,
Thierry
Re: =^..^= I can not find my head(ers)... ;)
Posted: Wed Apr 16, 2014 3:22 pm
by FourthWorld
Mariasole, this is a bit of a funky workaround, but one way to obtain the header info without having to download the whole file is to trick the system by canceling the download shortly after it's started:
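Code:
function GetHttpHeader pUrl
   load url pUrl
   repeat 9999 -- arbitrarily large number to avoid a runaway loop in case I forgot anything here
      get urlStatus(pUrl)
      if it is "error" then return "Error getting headers for "& pUrl
      --
      -- while loading, urlStatus is "loading,bytesTotal,bytesReceived", so test item 1 only
      if item 1 of it is not among the items of "loading,cached" then
         wait 10 millisecs with messages
      else
         exit repeat
      end if
   end repeat
   unload url pUrl
   -- libURLLastRHHeaders() holds the headers received from the remote host;
   -- libURLLastHTTPHeaders() would return the headers that were sent with the request
   return libURLLastRHHeaders()
end GetHttpHeader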
I just threw that together so it may benefit from some tweaking, but hopefully the idea will be helpful.
Note that the "load url" command this relies on is non-blocking, so you'll want to prevent the user from triggering multiple calls to this while it's running (which hopefully won't take long).
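One possible way to guard against those repeated calls is a script-local flag in the script of the button that starts the fetch. This is only a sketch: the variable name sFetching is hypothetical, the field "temp" is borrowed from the earlier example in this thread, and it assumes the GetHttpHeader function above is available in the message path.

```livecode
-- Script-local flag (hypothetical name): true while a fetch is in progress.
local sFetching

on mouseUp
   -- GetHttpHeader waits "with messages", so further clicks can arrive
   -- while it is still running; ignore them
   if sFetching is true then exit mouseUp
   put true into sFetching
   put GetHttpHeader("http://www.example.com") into fld "temp"
   put false into sFetching
end mouseUp
```

Disabling the button for the duration of the call would be another option with the same effect.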