RSS Feed not returning HTML?


seancoady
Jan-15-2007, 11:48 PM
Hi,

I'm trying to retrieve Smugmug's Popular Photo RSS feed from within a client application and I'm seeing strange results: it returns HTML instead of RSS XML.

This is easy to see by fetching the feed with wget on the Unix command line:

wget "http://www.smugmug.com/hack/feed.mg\?Type=popular\&Data=today\&format=rss200"

(note the '?' and '&' characters are backslash-escaped to avoid being interpreted by the Unix shell)
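One thing worth checking here: under POSIX quoting rules, a backslash inside double quotes is only removed before a few characters (such as `"`, `\`, `$` and the backtick), so `\?` and `\&` inside the quotes would reach wget with the backslashes still attached. The sketch below uses Python's `shlex` module as a stand-in for the shell's word splitting; that substitution is an assumption, and the actual shell may differ in edge cases. What wget then does with a backslash in a URL is not something this thread establishes.

```python
import shlex

# The command as typed in the post: backslash escapes inside double quotes.
quoted_cmd = r'wget "http://www.smugmug.com/hack/feed.mg\?Type=popular\&Data=today\&format=rss200"'
# The same escapes, but without the surrounding double quotes.
bare_cmd = r'wget http://www.smugmug.com/hack/feed.mg\?Type=popular\&Data=today\&format=rss200'

# shlex.split follows POSIX-style quoting rules by default.
quoted_url = shlex.split(quoted_cmd)[1]
bare_url = shlex.split(bare_cmd)[1]

print(quoted_url)  # backslashes are kept inside double quotes
print(bare_url)    # backslashes are consumed when unquoted
```

If this is what happened, either the double quotes or the backslashes alone would be enough, but not both together.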

What comes back is HTML of a generic page on Smugmug. Here are the first few lines that come back:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="description" content="The ultimate in photo sharing. Easily create online photo albums. Share, store, organize and print." />

...
(and later in the document)

<body onload=" smugLoad();" class="loggedIn pureSmugCSS bodyColor_Black">
<div id="bodyWrapper">
<!-- These extra divs/spans may be used as catch-alls to add extra imagery. -->
<div id="extraDiv1"><span></span></div><div id="extraDiv2"><span></span></div><div id="extraDiv3"><span></span></div>
...

As you can see, there's nothing RSS-like about the response. It's an HTML page.
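For a client application, a quick way to catch this case is to look at the first element of the response before handing it to an RSS parser. The helper below is a rough heuristic, not a real XML parser; `root_tag` is a hypothetical name, and `rss_reply` is a made-up minimal RSS skeleton rather than actual Smugmug output.

```python
import re
from typing import Optional

# Leading constructs to skip: XML declarations, comments, DOCTYPE declarations.
_PROLOG = re.compile(r"\s*(?:<\?.*?\?>|<!--.*?-->|<![^>]*>)", re.DOTALL)
_FIRST_TAG = re.compile(r"\s*<([A-Za-z_][\w:.-]*)")


def root_tag(document: str) -> Optional[str]:
    """Return the lowercase name of the first element, or None if there is none."""
    pos = 0
    while True:
        match = _PROLOG.match(document, pos)
        if match is None:
            break
        pos = match.end()
    first = _FIRST_TAG.match(document, pos)
    return first.group(1).lower() if first else None


# First lines of the HTML page quoted above.
html_reply = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">'''
# Hypothetical minimal RSS document, for comparison only.
rss_reply = '<?xml version="1.0"?>\n<rss version="2.0"><channel></channel></rss>'

print(root_tag(html_reply))
print(root_tag(rss_reply))
```

A client could refuse to parse anything whose first element is not `rss` and log the URL it actually requested.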

What's strange is that the feed does show up correctly if I enter the URL (http://www.smugmug.com/hack/feed.mg?Type=popular&Data=today&format=rss200) in a browser. Firefox renders the response as RSS XML. Any idea why Firefox is seeing it differently?

Confused...

devbobo
Jan-16-2007, 02:52 AM
Sean,

The problem is the method you used to escape the characters. You need to use URL encoding, not \& and \?.

As below...


wget "http://www.smugmug.com/hack/feed.mg%3FType=popular%26Data=today%26format=rss20 0"


Cheers,

David

seancoady
Jan-16-2007, 08:00 AM
Thanks David, but even with URL encoding it is still returning HTML.

wget "http://www.smugmug.com/hack/feed.mg%3FType=popular%26Data=today%26format=rss20 0"

Wget downloads and saves a file called index.html which contains the HTML contents that I mentioned above.

What do you get back when you run that command?
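One possible explanation, which nobody in this thread confirms: once `?` and `&` are percent-encoded as `%3F` and `%26`, a URL parser no longer treats them as the query delimiter and parameter separator, so the whole string becomes part of the path and the query is empty. The sketch below shows this with Python's `urllib.parse`. How the Smugmug server responds to such a path is an assumption, but a generic HTML page would be consistent with it.

```python
from urllib.parse import parse_qs, urlsplit

# URL with '?' and '&' percent-encoded, as suggested in the thread.
encoded = "http://www.smugmug.com/hack/feed.mg%3FType=popular%26Data=today%26format=rss200"
# URL with literal '?' and '&', as pasted into the browser.
plain = "http://www.smugmug.com/hack/feed.mg?Type=popular&Data=today&format=rss200"

for url in (encoded, plain):
    parts = urlsplit(url)
    print("path: ", parts.path)
    print("query:", parts.query or "(empty)")
    print("params:", parse_qs(parts.query))
```

Only the second form carries `Type`, `Data` and `format` as query parameters.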

devbobo
Jan-16-2007, 08:21 PM
Sean,

Firstly, make sure that there is no space in 'format=rss200', unlike what is displayed above. For some reason, vBulletin is screwing up the display and inserting a space.

I tried this on a Windows port of wget, and it works perfectly.

Cheers,

David

seancoady
Jan-17-2007, 11:49 PM
I'm sure there's a perfectly good explanation, or I'm missing something very obvious, but it still doesn't work for me. I tried it on Windows XP in a DOS prompt using a wget binary and got the same result as on Linux. I made sure not to include the space or the '0' at the end. (I agree that was probably just a display issue with this reader. I also realized that wget works fine without the quotes when the URL is URL-escaped, so I left them out.)

wget http://www.smugmug.com/hack/feed.mg%3FType=popular%26Data=today%26format=rss20

However I still get a 10,675 byte index.html download which is an HTML page.


...even more confused

seancoady
Jan-17-2007, 11:52 PM
By the way, that is rss200 at the end. It's not showing up in vBulletin for some reason.

devbobo
Jan-18-2007, 02:13 AM
Sean,

PM me your email address and I will send you the exact command I am running and the output file.

Cheers,

David