Why so many problems with slideshows?


jfriend
Jul-16-2009, 06:35 PM
So, in the last two weeks, we've seen all sorts of slideshow problems.

1) We've seen issues of custom sizes used in the slideshow going completely wonky, sometimes downloading 10MB originals and displaying some small piece of that.
2) We've seen issues of slideshows just stalling after a few images and never going again (regardless of how long one waits).
3) We've seen issues of very small images (much smaller than the slideshow is declared or the images are). Occasionally a browser cache clear will make the problem go away, but many times it will not. Even that in itself indicates something is wrong in the Smugmug design if the slideshow occasionally requires a manual cache clear. We know our viewers won't know to do that so it will just be broken for them if that happens to them.

There is occasional participation in these threads from Smugmug personnel and we've seen a couple of acknowledgments of problems in the custom size generation, but that doesn't seem to explain item 2) above and we've never seen any status on when any of this is going to get fixed. We've seen a couple of cases where Andy said custom size problems are now supposed to be fixed, yet many problems remain.

Can someone from Smugmug please speak to this issue? Are you actually working on this issue? Are you working on all the issues that folks have seen here? Is anyone diligently following all the new daily threads on slideshow issues to see how to get them all fixed? Do you think everything is fixed now?

For a company that is usually pretty good at taking ownership for an issue and communicating with customers, there's been zero official communication on this issue that has affected a lot of slideshows. Nothing in the status feed. Nothing in the releases feed.

Please tell us what is going on with slideshow problems.

If you want some reading on slideshow issues:

http://www.dgrin.com/showthread.php?t=137248
http://www.dgrin.com/showthread.php?t=137172
http://www.dgrin.com/showthread.php?t=137169
http://www.dgrin.com/showthread.php?t=137167
http://www.dgrin.com/showthread.php?t=137262
http://www.dgrin.com/showpost.php?p=1160360&postcount=4097
http://www.dgrin.com/showthread.php?t=136883

Shizam
Jul-16-2009, 08:56 PM
Thanks for cataloging the threads that reference the issues, fortunately most of the issues stem from the same one or two problems that we're actively working on. Bear with us, as limp as that sounds.

jfriend
Jul-17-2009, 01:59 PM
Thanks for cataloging the threads that reference the issues, fortunately most of the issues stem from the same one or two problems that we're actively working on. Bear with us, as limp as that sounds.

Any idea when this is going to get fixed?

cabbey
Jul-17-2009, 09:24 PM
Any idea when this is going to get fixed?

I think a significant amount of it is already fixed and is just waiting for the previously badly generated images to age out of the cdn in the viewers geo.

jfriend
Jul-17-2009, 09:31 PM
I think a significant amount of it is already fixed and is just waiting for the previously badly generated images to age out of the cdn in the viewers geo.

How long does that take?

jlatas
Jul-18-2009, 06:18 AM
I think a significant amount of it is already fixed and is just waiting for the previously badly generated images to age out of the cdn in the viewers geo.

What makes you think that it's nearly fixed? The issues remain. Furthermore, it's not about bad images... sometimes an image displays correctly while at other times it does not. This isn't about caches and cookies and other such nonsense. This is about changes to the slideshows internally at SmugMug. I know they are working on it but let's be realistic about what's going on.

Andy
Jul-18-2009, 08:04 AM
What makes you think that it's nearly fixed? The issues remain. Furthermore, it's not about bad images... sometimes an image displays correctly while at other times it does not. This isn't about caches and cookies and other such nonsense. This is about changes to the slideshows internally at SmugMug. I know they are working on it but let's be realistic about what's going on.

It really isn't the slideshow, we didn't change that at all. It was to do with custom size creation.
Can you tell me where on your site you're having troubles? We'll have a look. Best is to write our heroes, http://www.smugmug.com/help/emailreal Thanks.

I viewed your homepage slideshow, it looks great!

jfriend
Jul-18-2009, 08:22 AM
It really isn't the slideshow, we didn't change that at all. It was to do with custom size creation.
Can you tell me where on your site you're having troubles? We'll have a look. Best is to write our heroes, http://www.smugmug.com/help/emailreal Thanks.

I viewed your homepage slideshow, it looks great!

Andy, I went to check out his slideshow at http://www.jlatas.com/. It starts out fine, then when it gets to an image of someone on a motorcycle coming at the camera, it just stops. I've waited more than 5 minutes and the slideshow is just stalled, doing nothing.

I'm not saying that there aren't custom size issues or caching issues, but when the slideshow just stops completely, that has to be because it isn't written to handle these types of unexpected circumstances properly. It should be able to continue on to other images unless the custom size generation is just completely down which it doesn't seem to be because I can view other people's slideshows.

Here's the image it's been stuck on for more than 5 minutes:

http://jfriend.smugmug.com/photos/594469107_E9BP2-L.jpg

The slideshow hasn't been reliable for around 2 weeks now.

Edit: 45 minutes later and the slideshow is still stalled.

jlatas
Jul-18-2009, 08:45 AM
It really isn't the slideshow, we didn't change that at all. It was to do with custom size creation.
Can you tell me where on your site you're having troubles? We'll have a look. Best is to write our heroes, http://www.smugmug.com/help/emailreal Thanks.

I viewed your homepage slideshow, it looks great!

All I know is that it was working fine for the couple of months prior to the news that you guys have tweaked the new journal and slideshow features. That is my slideshow photo of the woman on the motorcycle that jfriend is showing in the previous post. Like I said... all was fine until sometime early last week.

Can everyone else who is seeing issues please chime in here so that it doesn't look like it's just a few people who don't know how to clear their cache and cookies... (sarcasm intended!)

jfriend
Jul-18-2009, 09:35 AM
It really isn't the slideshow, we didn't change that at all. It was to do with custom size creation.
Can you tell me where on your site you're having troubles? We'll have a look. Best is to write our heroes, http://www.smugmug.com/help/emailreal Thanks.

I viewed your homepage slideshow, it looks great!

I cleared my disk cache, then restarted the slideshow at http://www.jlatas.com/ and it stalled fairly quickly (within a minute or so) on this image. It's been stuck there for 5 minutes again:

http://jfriend.smugmug.com/photos/594517176_eqNkx-M.jpg

Edit: 20 minutes later and the slideshow is still stalled.

kd2
Jul-18-2009, 09:43 AM
I'm having the same problems with the slideshow on my homepage. I noticed the problems after the new journal/slideshow features were announced. I haven't made any changes to my slideshow and it's been working fine for months prior.

I also sent a note to the help desk to see if they can look into this for me.

Andy
Jul-18-2009, 10:32 AM
but when the slideshow just stops completely, that has to be because it isn't written to handle these types of unexpected circumstances properly.

I agree that Sam could look for ways to handle this better - but the slideshow code hasn't changed, that's all I'm saying... the custom size issue is bringing out the need for some better error handling, I'll discuss it more with Sam this week.

jfriend
Jul-18-2009, 10:44 AM
I agree that Sam could look for ways to handle this better - but the slideshow code hasn't changed, that's all I'm saying... the custom size issue is bringing out the need for some better error handling, I'll discuss it more with Sam this week.

I'm glad you guys will discuss this. There have been issues with the slideshow stalling as far back as many months ago. Standard advice offered is to flush disk cache, restart browser, reinstall Adobe Flash. None of those should be needed. It should just work.

And when the custom sizes code goofs up (as long as it doesn't stop serving images altogether), the slideshow shouldn't stall. Worst case, it should be able to continue on to the next image if it doesn't get what it expected with one image. Granted, if the custom image turns out to be 10MB, there might be a long pause while that's downloaded, but it should then do something intelligent and continue.

Andy
Jul-18-2009, 10:55 AM
All I know is that it was working fine for the couple of months prior to the news that you guys have tweaked the new journal and slideshow features. That is my slideshow photo of the woman on the motorcycle that jfriend is showing in the previous post. Like I said... all was fine until sometime early last week.

Can everyone else who is seeing issues please chime in here so that it doesn't look like it's just a few people who don't know how to clear their cache and cookies... (sarcasm intended!)

I just viewed your slideshow again, and had no problems. So I expect you are still getting stale images from our CDN.

One thing you could do, to ensure fresh images delivered (vs. stuff from the CDN) is to change 600,600 in your slideshow parameters to something else... maybe 500,500 or 610, 610 .... this should force new sizes to come to you.

jfriend
Jul-18-2009, 11:21 AM
I agree that Sam could look for ways to handle this better - but the slideshow code hasn't changed, that's all I'm saying... the custom size issue is bringing out the need for some better error handling, I'll discuss it more with Sam this week.

Andy, in your discussion with Sam, here's a network trace from when this slideshow stalls. This was taken right after a browser cache clear:

# Result Protocol Host URL Body Caching Content-Type Process User-defined
0 200 HTTP photos.smugmug.com /photos/566496599_ARymr-590x590-0.jpg 108,537 public, max-age=31355786 Expires: Fri, 16 Jul 2010 16:42:16 GMT image/jpeg firefox:6136
1 200 HTTP photos.smugmug.com /photos/568609205_qpAiY-470x590-1.jpg 75,602 public, max-age=31621346 Expires: Mon, 19 Jul 2010 18:28:16 GMT image/jpeg firefox:6136
2 200 HTTP photos.smugmug.com /photos/564773574_fqnNs-590x590-1.jpg 259,473 public, max-age=31355998 Expires: Fri, 16 Jul 2010 16:45:49 GMT image/jpeg firefox:6136
3 200 HTTP photos.smugmug.com /photos/576176481_fJTbB-590x590-0.jpg 61,113 public, max-age=31606234 Expires: Mon, 19 Jul 2010 14:16:26 GMT image/jpeg firefox:6136
4 200 HTTP photos.smugmug.com /photos/564774159_tvnur-470x590-0.jpg 76,012 public, max-age=30788572 Expires: Sat, 10 Jul 2010 03:08:44 GMT image/jpeg firefox:6136
5 200 HTTP photos.smugmug.com /photos/564774015_dpkjh-469x590-0.jpg 93,322 public, max-age=31354031 Expires: Fri, 16 Jul 2010 16:13:03 GMT image/jpeg firefox:6136
6 200 HTTP photos.smugmug.com /photos/566496495_rPtFb-590x590-0.jpg 109,496 public, max-age=31605782 Expires: Mon, 19 Jul 2010 14:08:54 GMT image/jpeg firefox:6136
7 200 HTTP photos.smugmug.com /photos/569045434_aZVEn-590x590-1.jpg 131,056 public, max-age=31356016 Expires: Fri, 16 Jul 2010 16:46:09 GMT image/jpeg firefox:6136
8 200 HTTP photos.smugmug.com /photos/564773211_TAVJX-590x590-1.jpg 120,366 public, max-age=31606072 Expires: Mon, 19 Jul 2010 14:13:45 GMT image/jpeg firefox:6136
9 200 HTTP photos.smugmug.com /photos/576301040_7riaL-590x590-1.jpg 77,137 public, max-age=31354036 Expires: Fri, 16 Jul 2010 16:13:09 GMT image/jpeg firefox:6136
10 200 HTTP photos.smugmug.com /photos/569046043_cm8Mh-590x590-1.jpg 135,473 public, max-age=31292565 Expires: Thu, 15 Jul 2010 23:08:45 GMT image/jpeg firefox:6136
11 200 HTTP photos.smugmug.com /photos/569045347_GHCsf-590x590-0.jpg 117,337 public, max-age=31354020 Expires: Fri, 16 Jul 2010 16:13:04 GMT image/jpeg firefox:6136
12 200 HTTP photos.smugmug.com /photos/573833844_9jY9s-590x590-0.jpg 36,268 public, max-age=31605761 Expires: Mon, 19 Jul 2010 14:08:49 GMT image/jpeg firefox:6136
13 200 HTTP photos.smugmug.com /photos/564773422_SUhA3-590x590-1.jpg 166,052 public, max-age=31631686 Expires: Mon, 19 Jul 2010 21:20:58 GMT image/jpeg firefox:6136
14 200 HTTP photos.smugmug.com /photos/564793305_pNdpN-590x590-0.jpg 59,550 public, max-age=31606542 Expires: Mon, 19 Jul 2010 14:21:58 GMT image/jpeg firefox:6136
15 200 HTTP photos.smugmug.com /photos/564772884_iHevn-470x590-1.jpg 106,655 public, max-age=31605758 Expires: Mon, 19 Jul 2010 14:08:58 GMT image/jpeg firefox:6136
16 200 HTTP photos.smugmug.com /photos/564773762_bDDyt-470x590-0.jpg 100,075 public, max-age=31355866 Expires: Fri, 16 Jul 2010 16:44:39 GMT image/jpeg firefox:6136
17 200 HTTP photos.smugmug.com /photos/569045772_7TDNT-590x590-2.jpg 0 public, max-age=31440990 Expires: Sat, 17 Jul 2010 16:23:27 GMT image/jpeg firefox:6136
18 302 HTTP photos.smugmug.com /photos/569736972_Lc5JJ-590x590-0.jpg 0 public, max-age=30663277 Expires: Thu, 08 Jul 2010 16:21:38 GMT image/jpeg firefox:6136
19 200 HTTP photos.smugmug.com /photos/569736972_Lc5JJ-O.jpg 189,760 public, max-age=31608765 Expires: Mon, 19 Jul 2010 14:59:46 GMT image/jpeg firefox:6136
20 200 HTTP photos.smugmug.com /photos/564774309_RVM7H-468x590-1.jpg 0 public, max-age=31440982 Expires: Sat, 17 Jul 2010 16:23:27 GMT image/jpeg firefox:6136
21 200 HTTP photos.smugmug.com /photos/569045920_KmvFa-590x590-1.jpg 128,981 public, max-age=31606491 Expires: Mon, 19 Jul 2010 14:22:00 GMT image/jpeg firefox:6136
22 200 HTTP photos.smugmug.com /photos/569045638_9YyJB-590x590-2.jpg 245,833 public, max-age=31292750 Expires: Thu, 15 Jul 2010 23:13:04 GMT image/jpeg firefox:6136
23 200 HTTP photos.smugmug.com /photos/564774476_eXzVR-439x590-2.jpg 89,959 public, max-age=31631737 Expires: Mon, 19 Jul 2010 21:22:55 GMT image/jpeg firefox:6136

Network requests 17, 18 and 20 NEVER returned any data. The last image displayed in the slideshow was the one returned in #16, so the slideshow stalled forever waiting for request #17 to return something. There seem to be at least a couple of issues here:

1) Why does a response never come back from requests 17, 18 and 20? I have no way of knowing if this is a CDN problem or a Smugmug custom size generation request problem. It would be unusual for the CDN to return nothing - more likely for it to return a wrong size image which is not what is happening here, but I can't really tell which it is.

2) When the slideshow code encounters requests that don't come back, it stalls. It should have some time limit that it waits for a response and then either try one more time or skip on to the next image. I think this issue has been in the slideshow for a long time. I've documented this problem before in network traces (slideshow stalls when a network request doesn't return).
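
As a rough illustration of the check described in 2), here is a minimal Python sketch that flags the trace rows a slideshow should not wait on. The tuple layout and the name `suspect_requests` are assumptions made for this example; the row values are copied from requests 16 through 20 in the trace above.

```python
from typing import List, Tuple

# (request number, HTTP status, URL path, body size in bytes)
TraceRow = Tuple[int, int, str, int]

# Rows 16-20 copied from the network trace above.
TRACE: List[TraceRow] = [
    (16, 200, "/photos/564773762_bDDyt-470x590-0.jpg", 100075),
    (17, 200, "/photos/569045772_7TDNT-590x590-2.jpg", 0),
    (18, 302, "/photos/569736972_Lc5JJ-590x590-0.jpg", 0),
    (19, 200, "/photos/569736972_Lc5JJ-O.jpg", 189760),
    (20, 200, "/photos/564774309_RVM7H-468x590-1.jpg", 0),
]


def suspect_requests(rows: List[TraceRow]) -> List[int]:
    """Return the request numbers whose response carried no image data."""
    return [num for num, status, _path, size in rows
            if status != 200 or size == 0]


print(suspect_requests(TRACE))  # [17, 18, 20]
```

The flagged numbers match the three requests called out above as never returning data.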

If I look at request #17 in a little more detail, we see that it was a request for this URL:

http://photos.smugmug.com/photos/569045772_7TDNT-590x590-2.jpg

I cannot even get that image all by itself in my browser.

If I modify the URL to this:

http://photos.smugmug.com/photos/569045772_7TDNT-590x590.jpg

Then, I can fetch the image in my browser. So, perhaps something is wrong with the form of the URL or with the CDN for that form of the URL or with the custom size generation for that form of the URL.

The same is true for #18. If I take the -0 suffix off, the image works, but as the URL is requested, it doesn't work.

Hopefully this is enough new information that you guys can make some progress on both what is happening with the custom size requests and with the slideshow error handling.

I'm in Los Altos (on AT&T DSL), a few miles away from your Mtn View office so I might be seeing the same CDN as your Mtn View office. If you're in New York, you'd obviously be seeing a different CDN.

jlatas
Jul-18-2009, 12:34 PM
I just viewed your slideshow again, and had no problems. So I expect you are still getting stale images from our CDN.

One thing you could do, to ensure fresh images delivered (vs. stuff from the CDN) is to change 600,600 in your slideshow parameters to something else... maybe 500,500 or 610, 610 .... this should force new sizes to come to you.

Changing the params to 610,610 does indeed get me new images that are displayed correctly and that have not stalled at this point. Great! But I have done nothing with my images that are stored at SmugMug so wouldn't that mean that something is happening during your resizing algorithm that is causing the hangup??? Maybe you guys need to flush the cache on your servers!!!

Andy
Jul-18-2009, 12:39 PM
Changing the params to 610,610 does indeed get me new images that are displayed correctly and that have not stalled at this point. Great!

Cool.

Yeah the problem is the olde busted images are stuck in the CDN, our engineers tell us that they'll flush out but I do not know more details on how long it takes and such. I'm glad you're sorted out for now, sorry for the hassle!

cabbey
Jul-18-2009, 01:29 PM
1) Why does a response never come back from requests 17, 18 and 20? I have no way of knowing if this is a CDN problem or a Smugmug custom size generation request problem. It would be unusual for the CDN to return nothing - more likely for it to return a wrong size image which is not what is happening here, but I can't really tell which it is.

The cdn has a cached "image" that is zero bytes in size. There was a short time where we were returning this bogusity for a small percentage of images... I'm sorry to say *your* geo-specific cdn cache grabbed that image during that time and got that bad data. That's one of thousands of cache world wide, which is why that image loads fine for me and Andy. (love the mischievous smile on her face btw.)

2) When the slideshow code encounters requests that don't come back, it stalls. It should have some time limit that it waits for a response and then either try one more time or skip on to the next image.

Not to throw a fellow sourcerer under the bus or anything, but yeah, I agree with you on this.


http://photos.smugmug.com/photos/569045772_7TDNT-590x590-2.jpg

I cannot even get that image all by itself in my browser.

The slide show is using the browser to fetch the image... if it can't give it to the slideshow, I dunno why you think you can get it to load something different just by asking it nicely. :)

If I modify the URL to this:

http://photos.smugmug.com/photos/569045772_7TDNT-590x590.jpg

Then, I can fetch the image in my browser. So, perhaps something is wrong with the form of the URL or with the CDN for that form of the URL or with the custom size generation for that form of the URL.

The same is true for #18. If I take the -0 suffix off, the image works, but as the URL is requested, it doesn't work.

Right, as I said the other day, that url is wedged in the cdn with bogus data... either the wrong size image, or no image at all. By changing the url to another way to refer to the same image, you have bypassed the bogosity in the cdn and fetched your image.

This demonstrates a way for folks to solve this issue for themselves before the cdn flushes your broken images. Figure out which image is blocking your slideshow; it will be the *next* image to be displayed... you can use the thumbnails to figure it out. Go to the gallery and perform any action that will cause it to be modified: rotate, watermark, flip, etc., then do a second one to undo that. This will increment the serial number on the image twice, changing the url similar to John's hand hack above and faking the cdn out to think it's a new image.
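
The URL change John made by hand can be expressed as a small Python helper. This is only a sketch: it assumes the trailing `-N` before `.jpg` is the serial number mentioned above, and `strip_serial` is a made-up name, not part of any SmugMug API. It shows how the URL text changes, nothing more.

```python
import re

# Matches a trailing "-<digits>" just before the .jpg extension,
# e.g. the "-2" in "..._7TDNT-590x590-2.jpg".
_SERIAL = re.compile(r"-\d+(?=\.jpg$)")


def strip_serial(url: str) -> str:
    """Drop the trailing serial suffix, if any, leaving the rest intact."""
    return _SERIAL.sub("", url)


print(strip_serial("http://photos.smugmug.com/photos/569045772_7TDNT-590x590-2.jpg"))
# http://photos.smugmug.com/photos/569045772_7TDNT-590x590.jpg
```

Because the CDN caches per URL, the rewritten address is a different cache entry from the poisoned one, which is why the hand-edited form loaded.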

Hopefully this is enough new information that you guys can make some progress on both what is happening with the custom size requests and with the slideshow error handling.

Sorry John, no new info here... but you confirmed that you're seeing what we already fixed the source of... now just to wait for it to trickle through the affected CDNs.

I'm in Los Altos (on AT&T DSL), a few miles away from your Mtn View office so I might be seeing the same CDN as your Mtn View office. If you're in New York, you'd obviously be seeing a different CDN.

HQ's view of the relevant bits of the cdn:


marchhair:~ cabbey$ host photos.smugmug.com
photos.smugmug.com is an alias for www.smugmug.com.edgesuite.net.
www.smugmug.com.edgesuite.net is an alias for a539.b.akamai.net.
a539.b.akamai.net has address 209.107.213.61
a539.b.akamai.net has address 209.107.213.62

cabbey
Jul-18-2009, 01:57 PM
This isn't about caches and cookies and other such nonsense.

Who said anything about cookies? (I'd like a nice oatmeal craisin cookie though if you're baking. :)

This is about changes to the slideshows internally at SmugMug.

The reported failures I've seen have *all* boiled down, at root cause, to images served out to it, through the cdn, incorrectly. Now admittedly it coulda done better with what it got, and I'm sure Shizam will look into making it more fault tolerant.

I know they are working on it but let's be realistic about what's going on.

uhm. "they" includes "me", and I'm 100% realistic... we haven't had any reproducible cases of bad images served out by our code in several days now. All the current problems are a result of previously served out bad images being cached in the cdn... so yes, it *is* all about the cache... normally a tool for good, today it has an unfortunate side effect of caching mistakes and making them last longer.
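
To put a rough number on "last longer": the responses in the network trace earlier in the thread carry caching values such as `public, max-age=31355786`. A quick Python sketch converts that to days. Whether the CDN honors this header or applies its own lifetime is not stated in the thread, so the figure describes the header only; `max_age_days` is a name made up for this example.

```python
import re


def max_age_days(cache_control: str) -> float:
    """Return the max-age directive of a Cache-Control value, in days."""
    match = re.search(r"max-age=(\d+)", cache_control)
    if match is None:
        raise ValueError("no max-age directive found")
    return int(match.group(1)) / 86400  # 86400 seconds per day


print(round(max_age_days("public, max-age=31355786"), 1))  # 362.9
```

A freshness lifetime of roughly a year means any cache that respects the header could keep serving a bad entry for a long time unless it is explicitly flushed.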

jfriend
Jul-18-2009, 02:30 PM
Sorry John, no new info here... but you confirmed that you're seeing what we already fixed the source of... now just to wait for it to trickle through the affected CDNs.


I would have thought it was new info that when the slideshow code fails to get good data for one image, it stalls forever.

If the slideshow had different coding for that condition, your viewers probably wouldn't even notice that the slideshow was skipping a few images and that the CDN had a bunch of poisoned images. You'll notice from the network trace that the slideshow has even already pre-fetched images past the bad ones so it could just go right ahead with the ones that it successfully got and ignore the bad ones.

FYI, this is not new behavior for the slideshow. I've reported network traces like this many months ago - it's one of the reasons that slideshows stall.

If you want the slideshow to be robust regardless of CDN issues, hiccups in custom sizes, glitches in the network (which I would think is something you would want), the slideshow just needs better coding for this condition. Since the order of images in the slideshow isn't an ironclad contract with the user, but not stalling forever is, I would think it would be an algorithm something like this:

If I'm still waiting for this next image to come in and I've either gotten an invalid response or it's been longer than X time with no valid response on that request, I'll skip that image for now and go on to the following images in the slideshow.

If indeed, the CDN is just returning a zero length image, then this particular instance may be as simple as checking for that condition and skipping that request to go onto the next image. But, the general purpose robustness improvement would include the case where the request never comes back too or comes back with some sort of other invalid data.
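
The skip-and-continue idea above could be sketched roughly like this in Python. This is not SmugMug's slideshow code; `playable_slides` and `fake_fetch` are hypothetical names, and `fetch` is assumed to enforce its own time limit and raise `TimeoutError` (or another `OSError`) when a request fails or never comes back.

```python
from typing import Callable, Iterable, Iterator, Tuple


def playable_slides(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, data) for every slide that loads; skip the rest."""
    for url in urls:
        try:
            # fetch is expected to enforce its own time limit and raise
            # TimeoutError when no response arrives within it.
            data = fetch(url)
        except (TimeoutError, OSError):
            continue  # request failed or never came back: move on
        if not data:
            continue  # zero-length "image", like the poisoned CDN entries
        yield url, data


# Hypothetical stand-in for a real network fetch, for demonstration only.
def fake_fetch(url: str) -> bytes:
    if url == "b.jpg":
        return b""                      # zero-byte response
    if url == "c.jpg":
        raise TimeoutError("no reply")  # request never returned
    return b"jpeg-bytes"


shown = [url for url, _ in
         playable_slides(["a.jpg", "b.jpg", "c.jpg", "d.jpg"], fake_fetch)]
print(shown)  # ['a.jpg', 'd.jpg']
```

The bad slides are dropped silently and the show carries on with the images that did arrive, which is the behavior being asked for.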

Do you not see some actions that should be taken in the slideshow code here to prevent these stalls?

jlatas
Jul-18-2009, 06:14 PM
Who said anything about cookies? (I'd like a nice oatmeal craisin cookie though if you're baking. :)



The reported failures I've seen have *all* boiled down, at root cause, to images served out to it, through the cdn, incorrectly. Now admittedly it coulda done better with what it got, and I'm sure Shizam will look into making it more fault tolerant.



uhm. "they" includes "me", and I'm 100% realistic... we haven't had any reproducible cases of bad images served out by our code in several days now. All the current problems are a result of previously served out bad images being cached in the cdn... so yes, it *is* all about the cache... normally a tool for good, today it has an unfortunate side effect of caching mistakes and making them last longer.

Well then, where did the "bad images" come from in the first place? You say that you didn't have any bad images served up in 'several days now'... so that means there was a code issue at least several days ago. I just get tired of people always insisting that the user doesn't know enough to clear their cache or delete their cookies... yes, that was the advice given here several days ago... so don't be so 'smug' about that! There obviously is, or was, an issue with how images were being resized for the slideshows. Why did the issue with incorrect images being dished out occur in the first place? I still have yet to see anyone really say that yes, we had an issue a few days ago and bad data was being delivered from SmugMug servers...

Allen
Jul-18-2009, 06:49 PM
I've had at least a half dozen stalls in the last few months and in EVERY case
I regenerated the "non-retrieving photo" by re-adding the WM and the show
ran on through. Regenerating the display sizes fixed it. All these stalls
were with newly uploaded galleries using FSSS. Click the thumb after the
stalled photo and it would keep running.

kd2
Jul-19-2009, 11:21 AM
My slideshow is working just fine now and I haven't done anything. So something changed and it wasn't anything I did.

Andy
Jul-19-2009, 12:05 PM
My slideshow is working just fine now and I haven't done anything. So something changed and it wasn't anything I did.

Out with the bad, in with the good - the bad images out in our Content Delivery Network (CDN) were flushed for you... so that's why.

Allen
Jul-19-2009, 12:15 PM
Out with the bad, in with the good - the bad images out in our Content Delivery Network (CDN) were flushed for you... so that's why.

Guess I was doing this by regenerating new display sizes to each stalled photo. Kickin' the CDN.:D

Andy
Jul-19-2009, 12:18 PM
Kickin' the CDN.:D

Right in the junk!

Andy
Jul-19-2009, 12:20 PM
I just get tired of people always insisting that the user doesn't know enough to clear their cache or delete their cookies..

Most folks don't know this, you're very advanced and you 'get' it... we're sorry, we really hadn't encountered this error before, and some of us heroes, we tried some methods we're used to using to get fresh images. But in this case, there were some busted images in the CDN that flushing user's cache in a browser won't change.....

I'm really sorry.

kd2
Jul-19-2009, 12:48 PM
Out with the bad, in with the good - the bad images out in our Content Delivery Network (CDN) were flushed for you... so that's why.

Well, I'm back to being a totally happy camper! Thanks!

jfriend
Jul-20-2009, 08:36 AM
Shizam, Andy, Cabbey or Doc, can one of you respond here? I've got a smoking gun network trace (which is repeatable) on why slideshows stall and the only response I've gotten says that there's nothing new here and implies that nothing should be changed and we should all just wait for the CDN caches to clear.

If you want slideshows to stop stalling when anything goes wrong, some error handling in the slideshow code needs to be improved. It's that simple. In fact, if you fixed this now, slideshows wouldn't even stall with the bad CDN cache images out there now and we wouldn't have had to wait two weeks for CDN caches to clear.

If you don't care that slideshows are susceptible to stalls when an image doesn't get delivered properly, just let me know and I'll stop bringing these kinds of things to your attention.

Andy
Jul-20-2009, 09:02 AM
Shizam, Andy, Cabbey or Doc, can one of you respond here? I've got a smoking gun network trace (which is repeatable) on why slideshows stall and the only response I've gotten says that there's nothing new here and implies that nothing should be changed and we should all just wait for the CDN caches to clear.

If you want slideshows to stop stalling when anything goes wrong, some error handling in the slideshow code needs to be improved. It's that simple. In fact, if you fixed this now, slideshows wouldn't even stall with the bad CDN cache images out there now and we wouldn't have had to wait two weeks for CDN caches to clear.

If you don't care that slideshows are susceptible to stalls when an image doesn't get delivered properly, just let me know and I'll stop bringing these kinds of things to your attention.

Hello John, thanks for posting. We care immensely and we're really grateful you took the time to gather this info. I actually took two days off (Saturday and Sunday) to do some R&R. Sam was at his in-laws this weekend. I promise you that Sam will see your posts and he will reply.

I'm sorry to have upset you, and I'm really, really sorry I didn't reply to this thread over the weekend.

cabbey
Jul-20-2009, 01:08 PM
Shizam, Andy, Cabbey or Doc, can one of you respond here? I've got a smoking gun network trace (which is repeatable) on why slideshows stall and the only response I've gotten says that there's nothing new here and implies that nothing should be changed and we should all just wait for the CDN caches to clear.

Sorry John, "nothing new" != "nothing should be changed". Didn't mean to give you that impression.

jfriend
Jul-20-2009, 01:16 PM
Sorry John, "nothing new" != "nothing should be changed". Didn't mean to give you that impression.

I'm just waiting for someone at Smugmug to acknowledge that the slideshow error handling should be improved. When the slideshow receives a zero length response for an image, never receives a response for a requested image, or gets some kind of error when requesting an image, it should do something intelligent like skip that request and continue on rather than stall forever.

Andy
Jul-20-2009, 03:32 PM
I'm just waiting for someone at Smugmug to acknowledge that the slideshow error handling should be improved such that when it either receives a zero length response for an image, never receives a response for a requested image or gets some kind of error when requesting an image that the slideshow will do something intelligent like skip that request and continue on rather than stall forever.
I acknowledge that we could do better at this.

jfriend
Jul-20-2009, 04:24 PM
I acknowledge that we could do better at this.

Is someone going to file a bug so that code changes actually get made?

Andy
Jul-20-2009, 07:33 PM
Is someone going to file a bug so that code changes actually get made?
Already done :thumb

jfriend
Jul-20-2009, 07:35 PM
Already done :thumb

Thanks.

Shizam
Jul-21-2009, 06:29 AM
Is someone going to file a bug so that code changes actually get made?

I meant to post the status but got caught up in fiddling with the problem. It took a good portion of the day to create a reproducible test case between our dev server and my sandbox, but I eventually got it to occur and am working on a fix. The problem I was able to reproduce is SS sometimes stalling if you get an HTTP 200 response but a 0 length file.

Sam

jfriend
Jul-21-2009, 10:11 AM
I meant to post the status but got caught up in fiddling with the problem. It took a good portion of the day to create a reproducible test case between our dev server and my sandbox, but I eventually got it to occur and am working on a fix. The problem I was able to reproduce is SS sometimes stalling if you get an HTTP 200 response but a 0 length file.

Sam

I guess it's better to be working on the problem than talking to me about it. I just wasn't sure it had reached the right folks until late in the day. It's kind of hard to tell that from the outside sometimes.

Response 200 with a 0 length file is the one I was able to see that caused a problem. I'm glad you set up a reproducible case.

Since we know the custom size generation can occasionally hiccup and leave the CDN with some junk, it's probably worth protecting against response 200 and any sort of invalid file, not just 0 length. Presumably, you would encounter that a little bit later in the pipeline when you go to render it.

Another error that is likely to occur in the real world is an HTTP request that just never returns a response, either because of a TCP error from an interrupted connection or, with no official TCP error at all, because no response arrives within a reasonable amount of time.

And, of course, there are all the 400 and 500 series errors, which are less likely but should be handled in a way that keeps the slideshow going, assuming other requests succeed.
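
As a rough illustration of the cases listed above, here is a small TypeScript sketch that sorts the outcome of one image request into "show it" or "skip it, and why". The `ImageFetchOutcome` shape, the `decodable` flag and the reason strings are all assumptions made up for this example; I don't know how the real slideshow represents any of this.

```typescript
// Hypothetical summary of what came back for one image request.
type ImageFetchOutcome =
  | { kind: "response"; status: number; byteLength: number; decodable: boolean }
  | { kind: "networkError" } // e.g. interrupted connection
  | { kind: "timeout" };     // no response within a reasonable time

type SkipReason =
  | "network-error"
  | "timeout"
  | "http-error"
  | "empty-body"
  | "invalid-image";

// Returns null when the image is safe to display, otherwise the reason to skip it.
function skipReason(outcome: ImageFetchOutcome): SkipReason | null {
  if (outcome.kind === "networkError") return "network-error";
  if (outcome.kind === "timeout") return "timeout";
  // Anything outside 2xx, including the 400 and 500 series, is skipped.
  if (outcome.status < 200 || outcome.status >= 300) return "http-error";
  // Response 200 with a 0 length file: the case seen in the network trace.
  if (outcome.byteLength === 0) return "empty-body";
  // Response 200 with junk that cannot be rendered as an image.
  if (!outcome.decodable) return "invalid-image";
  return null;
}
```

Whatever the reason, the result is the same: the slideshow logs it if it wants to and goes on to the next image instead of waiting forever.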

Andy
Jul-21-2009, 07:34 PM
I guess it's better to be working on the problem than talking to me about it - I just wasn't sure it had reached the right folks until late in the day.

Hey John, you know me and us by now. We always do the right thing, and we never, ever ever ignore your posts. I even replied to you several times on a precious couple of days off that I took.

Sam's got a fix in testing already, thanks again.

jfriend
Jul-21-2009, 08:20 PM
Hey John, you know me and us by now. We always do the right thing, and we never, ever ever ignore your posts. I even replied to you several times on a precious couple of days off that I took.

Sam's got a fix in testing already, thanks again.

That's good news. Thanks. This will prevent many stalled slideshows, even when the CDN has bad data.

On the other part of your note, put yourself in my position. I report things that aren't acted upon so I never know what your position is on an issue until someone replies in the affirmative that you guys agree it's important and that it's going to get fixed. You want me to believe that you'll just do the right thing, but I don't know what you think the right thing is until you tell me.

I appreciate that you replied in the middle of days off, but your first replies only told me that you'd seen it, not that you guys were going to act on it.

Andy
Jul-22-2009, 03:39 AM
not that you guys were going to act on it.
Give us the benefit of the doubt, we deserve it :deal