Keith Devens .com
Thursday, July 2, 2009
I have never met a man so ignorant that I couldn't learn something from him. – Galileo Galilei

Keith Gaughan (http://hereticmessiah.weblogs.com/) wrote:
Keith (http://www.keithdevens.com/) wrote:
Yeah, I started to put the message I just put there a while ago, but my computer was wonky and then I had to go out for the night, so I didn't get to do it.
Nicolas Hoizey <nhoizey@php.net> (http://www.phpheaven.net/) wrote:
It seems there is a bug somewhere in the URL creation.
I tried using your script with the following URL:
http://www.phpheaven.net/rubrique14.html
It should give me this RSS feed:
http://www.phpheaven.net/rss2.xml
But instead it gives me this:
http://www.phpheaven.net/rubrique14.html/rss2.xml
Anyway, great job!
Keith (http://www.keithdevens.com/) wrote:
Hi Nicolas. I should have been clearer in my documentation. The $location you're supposed to pass as the second parameter to the getRSSLocation function is supposed to be the base location, not the page itself. So in your case, the base location would be http://www.phpheaven.net/, not http://www.phpheaven.net/rubrique14.html.
However, it was easy to change the behavior to match what you expected, so I changed the code. I also updated it to work with error_reporting(E_ALL) on. So grab the new code and give it a try, and let me know if you have any problems.
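To make the difference between a page URL and its base location concrete, here is a minimal sketch. The helper name `baseLocation` is hypothetical and is not part of the script discussed here; it only relies on PHP's built-in `parse_url()`.

```php
<?php
// Hypothetical helper (not part of the original script): reduce a page URL
// to the base location that relative feed links are resolved against.
function baseLocation($url) {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return false; // not an absolute URL we can work with
    }
    $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';
    $base = $scheme . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $base .= ':' . $parts['port'];
    }
    // A URL without a path is treated as the site root.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    // Keep everything up to and including the last slash of the path.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $base . $dir;
}

echo baseLocation('http://www.phpheaven.net/rubrique14.html'), "\n";
```

With Nicolas's URL this yields the base location, so a relative feed link such as rss2.xml is appended to the site root rather than to the page name.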
Jason DeFillippo (http://jason.defillippo.com/blog/) wrote:
Love it! Only suggestion would be to return an array of all possible hits. I have 2 RSS feeds on my site. One .rdf and one .xml but running the site's html thru your func it just returns the rdf. Beautiful work though. I'll definitely be using it.
Keith (http://www.keithdevens.com/) wrote:
Jason, I'll consider it, but considering that this code will be used for web-based applications, where it's not easy to pop up a dialogue box to choose which feed you want, and in the absence of further metadata (like "full posts" or "excerpts") to allow the code to automatically choose a preferred version, it's likely that returning more than one result would be a mis-feature.
However, the code should have predictable behavior when faced with more than one option (for instance, always return the first feed listed), a possibility I never considered in the first place. So, I should probably look at the code to see what happens and see if that's preferable.
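One way to get that predictable behavior is sketched below, assuming the candidate feeds have already been collected into a list in the order they appear in the HTML. The function name `firstFeed` and the example URLs are made up for illustration and are not part of the script.

```php
<?php
// Hypothetical helper, not from the original script: pick one feed
// deterministically when a page advertises several.
// $feeds is assumed to be a list of feed URLs in document order.
function firstFeed(array $feeds) {
    foreach ($feeds as $feed) {
        if (is_string($feed) && $feed !== '') {
            return $feed; // the first usable entry always wins
        }
    }
    return false; // nothing usable was listed
}

// Example with two feeds on one site (made-up URLs).
echo firstFeed(array('http://www.example.com/index.rdf', 'http://www.example.com/index.xml')), "\n";
```

Sticking to document order keeps the result stable across runs, which is the property described above; a smarter preference would need metadata the `<link>` tags don't carry.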
Jason DeFillippo (http://jason.defillippo.com/blog/) wrote:
Gathering all feed possibilities was what I needed in my web app, so I just hacked the feature into your func. No worries :-) Thanks for the quick reply.
Keith (http://www.keithdevens.com/) wrote:
Good! Glad you were able to make do. There was no way I would have gotten to it soon, so it's good that you did what you did.
Nicolas (http://nicolashb.free.fr) wrote:
Has anyone implemented this in ASP?
Keith (http://keithdevens.com/) wrote:
Dunno, I did a quick Google search just now and didn't see anything.
philip (http://www.philipandrew.com/) wrote:
Does work for http://www.upsaid.com/beyan/. See, there is an RSS feed for this page at http://www.101h.com/beyan/feed.xml
Thanks!
Max wrote:
Keith, could you give me a live example: which values should I assign to $html and $location in order to get the feed from http://www.phpheaven.net/?
I tried this with no luck:
$location = 'http://www.phpheaven.net/';
$html = getFile($location);
getRSSLocation($html, $location);
Thanks,
Max
Keith (http://keithdevens.com/) wrote:
Max, I tried exactly what you gave me and it worked fine, returning http://www.phpheaven.net/rss1.xml.
Note that you can use file_get_contents() instead of getFile() if you'd like, now that there's a function built into PHP that does exactly that.
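As a rough illustration of that swap, the sketch below wraps `file_get_contents()` in a small function. `fetchPage` is a hypothetical name, not something from the script, and fetching an http:// URL this way depends on the `allow_url_fopen` setting being enabled. The call to `getRSSLocation()` only runs if the script from this page has been loaded.

```php
<?php
// Sketch: fetch a page with the built-in file_get_contents().
// fetchPage() is a hypothetical wrapper, not part of the original script.
function fetchPage($location) {
    // Suppress the warning and report failure via the return value instead.
    $html = @file_get_contents($location);
    return ($html === false || $html === '') ? false : $html;
}

$location = 'http://www.phpheaven.net/';
// getRSSLocation() comes from the script discussed here; skip if it isn't loaded.
if (function_exists('getRSSLocation')) {
    $html = fetchPage($location);
    if ($html !== false) {
        echo getRSSLocation($html, $location), "\n";
    }
}
```

Checking for `false` before calling the function makes a crashed or unreachable server show up as a fetch failure rather than as a missing feed.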
Max wrote:
Thanks Keith, it appeared the server I was working on had crashed... Works like a charm now! Thank you :-)
steve (http://www.buzznet.com) wrote:
Keith. This kicks major ass. I love your brute-force approach. I was messing with using SAX parsers etc. What a mess. Your solution is brief and accurate! Thanks, dude.
some body wrote:
FYI, I just found your function in the code for the zfeeder web aggregator application:
Enej wrote:
Great job. I was wondering what happens if the user actually puts in a URL to the feed instead of the HTML site?
It would be cool if it would point you to the feed regardless.
Thanks.
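A possible way to handle that case is to check whether the fetched content already looks like a feed before searching it for `<link>` tags. The sketch below is an assumption about how this could be done, not something the script does; `looksLikeFeed` is a hypothetical name, and the check only recognizes `rss`, `rdf:RDF`, and `feed` root elements.

```php
<?php
// Hypothetical check (not part of the original script): does this document
// look like a feed rather than an HTML page?
function looksLikeFeed($content) {
    // Skip an optional XML declaration and leading comments,
    // then test the name of the root element.
    $pattern = '/^\s*(?:<\?xml.*?\?>\s*)?(?:<!--.*?-->\s*)*<(rss|rdf:RDF|feed)\b/is';
    return preg_match($pattern, $content) === 1;
}

var_dump(looksLikeFeed('<?xml version="1.0"?><rss version="2.0"></rss>'));
var_dump(looksLikeFeed('<html><head></head><body></body></html>'));
```

If the check passes, the caller could simply return the URL it was given; otherwise it would fall back to the usual auto-discovery on the HTML.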
81.10.126.86 wrote:
how about trackback URL auto discovery ?
Mark_S wrote:
I tried this with no luck
$location = 'http://www.phpheaven.net/';
$html = getFile($location);
getRSSLocation($html, $location);
-------------------------
I'm struggling to get this to work. I know it's mostly down to my PHP confusion with calling functions, like Max above. Any help would be appreciated, as my searches for auto-discovery bring me back to this page time and time again.
-------------------------
Does the above code make a .php page "auto-discovered", so to speak? How do I include it in my PHP? An example would be much appreciated.
The page I would like to be auto-discovered is a PHP page. Using the default tags on an HTML page, I can get auto-discovery to work, but I cannot get a PHP page to auto-discover!
I'm at newbie/novice level.
Thanks in advance, Mark.
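For readers with the same question: the function on this page finds feeds on other people's pages. Making your own page discoverable is a matter of the `<link>` tag in the HTML your PHP outputs, the same tag the code here searches for. A minimal sketch, where `feedLinkTag` and the example.com URL are hypothetical:

```php
<?php
// Hypothetical helper: build the auto-discovery <link> tag that belongs
// inside the <head> of the HTML a PHP page prints.
function feedLinkTag($href, $title = 'RSS') {
    return '<link rel="alternate" type="application/rss+xml" title="'
        . htmlspecialchars($title, ENT_QUOTES)
        . '" href="' . htmlspecialchars($href, ENT_QUOTES) . '">';
}

// Made-up feed URL for illustration.
echo feedLinkTag('http://www.example.com/rss.xml', 'My feed'), "\n";
```

As long as that tag ends up in the `<head>` of the generated page, it should not matter whether the file extension is .php or .html.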
Cristian wrote:
I am sorry Keith Devens, but I had to modify the function to get all the feeds on the page. This is the code:
function getRSSLocation($html, $location){
    if(!$html or !$location){
        return false;
    }
    #search through the HTML, save all <link> tags
    # and store each link's attributes in an associative array
    preg_match_all('/<link\s+(.*?)\s*\/?>/si', $html, $matches);
    $links = $matches[1];
    $final_links = array();
    $link_count = count($links);
    for($n=0; $n<$link_count; $n++){
        $final_link = array(); #start fresh so attributes don't carry over from the previous tag
        $attributes = preg_split('/\s+/s', $links[$n]);
        foreach($attributes as $attribute){
            $att = preg_split('/\s*=\s*/s', $attribute, 2);
            if(isset($att[1])){
                $att[1] = preg_replace('/([\'"]?)(.*)\1/', '$2', $att[1]);
                $final_link[strtolower($att[0])] = $att[1];
            }
        }
        $final_links[$n] = $final_link;
    }
    #now figure out which ones point to an RSS file
    $full_url = array();
    for($n=0; $n<$link_count; $n++){
        $link = $final_links[$n];
        $rel = isset($link['rel']) ? strtolower($link['rel']) : '';
        $type = isset($link['type']) ? strtolower($link['type']) : '';
        $href = ''; #reset per link so an earlier match isn't added twice
        if($rel == 'alternate' and isset($link['href'])){
            #'text/xml' is a kludge to make the first version of this still work
            if($type == 'application/rss+xml' or $type == 'text/xml'){
                $href = $link['href'];
            }
        }
        if(!$href){
            continue;
        }
        if(preg_match('/^https?:\/\//i', $href)){ #if it's absolute
            $full_url[] = $href;
            continue;
        }
        #otherwise, 'absolutize' it
        $url_parts = parse_url($location);
        #only made it work for http:// links. Any problem with this?
        $url = "http://$url_parts[host]";
        if(isset($url_parts['port'])){
            $url .= ":$url_parts[port]";
        }
        if($href[0] != '/'){ #it's a relative link on the domain
            $path = isset($url_parts['path']) ? $url_parts['path'] : '/';
            $url .= dirname($path);
            if(substr($url, -1) != '/'){
                #if the last character isn't a '/', add it
                $url .= '/';
            }
        }
        $full_url[] = $url . $href;
    }
    #return every feed found, or false if there were none
    return $full_url ? $full_url : false;
}
Jake wrote:
Great code! Thanks a million, and thanks to Cristian for the ability to pull the feeds in an array.
Comments closed.
Well, he's gone and fixed it now.