what's the deal SM? more problems lately!

J.T. Registered Users Posts: 279 Major grins
edited March 19, 2008 in SmugMug Support
This is directed to any SM Support Hero or anyone else wanting answers.

It seems that lately there have been numerous times when SM service, and subsequently SM users' sites, have been down or interrupted for many reasons, some of which were never explained.

While I have a low amount of traffic and customers coming to my site, there are a lot more experienced users and professional photographers that have tons of traffic and business and rely on consistent service without interruptions. I agree that there will be occasional interruptions, but not as many as there have been recently. I think I may speak for many users when I say that SM has had a lot of issues lately. Haven't they?

In just the past few weeks there have been numerous outages and times where I / my family / customers can't access the site. While SM has gotten better at notifying us of the interrupted service when it occurs, it doesn't discount the fact that the interruptions are more frequent.

On my end, I guess I just want to know why there are so many problems and what SM is doing to prevent them or lessen the frequency. What's going on down there in Mountain View?

While I applaud SM for everything they do and for providing excellent customer service for all of us, I would just like some answers to what's going on!

Thanks for listening and again thanks for providing a great product for all of us!
John "J.T."
http://johnthiele.smugmug.com

Nikon D80 w/MB-D80 vertical grip
Tokina 50-135 f/2.8
Nikkor 50mm f/1.4D
Nikkor 18-55mm f/3.5-5.6G
Nikkor 70-300mm f/4.5-5.6G VR

RPS Studio Rotating Flash Bracket

SB 600

"Sometimes I do get to places just when God's ready to have somebody click the shutter." -- Ansel Adams

Comments

  • Andy Registered Users Posts: 50,016 Major grins
    edited January 18, 2008
    Hi J.T., thanks for posting.

    As soon as we were able to post last night, we did, here on Dgrin: http://www.dgrin.com/showthread.php?t=81777

    Last night's problem was not something that we expected (are they ever????), and I hope our Ops team will comment more when they can.

    We know it stinks and we spend a ton of time, energy and money to make this site as available as possible - and to minimize down times.

    I'm sorry that we were down, I really am. We don't like it any more than you do, I can promise you that.
  • dogwood Registered Users Posts: 2,572 Major grins
    edited January 18, 2008
    Andy wrote:

    As soon as we were able to post last night, we did, here on Dgrin: http://www.dgrin.com/showthread.php?t=81777

    Andy--

    Not to add fuel to the fire, and I know full well you and the other SM staff are more concerned about downtime than most of us (hey, I might have a client or two complain-- you have hundreds (thousands?) of customers wondering what's up?)--

    -- but last night dgrin was down for me at the same time the smugmug sites were down. Not a huge deal and I realize you were working on it and I appreciate the new logo/explanation rather than the smugmug homepage. Just saying that dgrin wasn't accessible (for me anyway) last night either.

    But again, I'm personally happy with the level of service I receive for the cost of SM, so don't take this the wrong way!

    Portland, Oregon Photographer Pete Springer

  • Andy Registered Users Posts: 50,016 Major grins
    edited January 18, 2008
    dogwood wrote:
    Just saying that dgrin wasn't accessible (for me anyway) last night either

    Hi Pete, yeah that sorta sucked bad, didn't it :uhoh
  • aquaticvideographer Registered Users Posts: 278 Major grins
    edited January 19, 2008
    Status page?
    Would it be difficult to have a members-only status page? Other sites (like .Mac) do that...it might be good to have something like that so SmugMug subscribers can be in the know if/when dgrin.com and SmugMug are down.

    Just a thought...:tiptoe
  • J.T. Registered Users Posts: 279 Major grins
    edited January 19, 2008
    aquaticvideographer wrote:
    Would it be difficult to have a members-only status page? Other sites (like .Mac) do that...it might be good to have something like that so SmugMug subscribers can be in the know if/when dgrin.com and SmugMug are down.

    Just a thought...:tiptoe

    I second that idea...or maybe even a mass email (obviously not spam) to members that indicates that their site is down temporarily. No one likes spam, but I personally wouldn't mind an email notification that my site is down!

    Just my 2 cents.
  • Andy Registered Users Posts: 50,016 Major grins
    edited January 19, 2008
    J.T. wrote:
    I second that idea...or maybe even a mass email (obviously not spam) to members that indicates that their site is down temporarily. No one likes spam, but I personally wouldn't mind an email notification that my site is down!

    Just my 2 cents.
    It would take more time to send those emails than the outages normally last, unfortunately.
  • J.T. Registered Users Posts: 279 Major grins
    edited January 19, 2008
    Andy wrote:
    It would take more time to send those emails than the outages normally last, unfortunately.

    Really? I'm not an IT guru, but what about keeping a draft email that states SM is currently down and experiencing a problem, sorry for the inconvenience, etc.? Then when an outage occurs, the email is sent to all affected members. That wouldn't work?
  • seth Registered Users Posts: 14 Big grins
    edited January 19, 2008
    I suggest a blog, used just for outage reporting, that we can subscribe to using RSS (or something similar). Here's an example where this works very well. Ideally it would not be hosted on one of SmugMug's servers.
  • DJKennedy Registered Users Posts: 555 Major grins
    edited January 20, 2008
    I was on earlier, but can't get onto my site now. I don't even get my site in read-only mode, nor do I get that fancy (but cool) error page - I just get sent to the smugmug site with the 'We're having some temporary difficulties..' text.

    I would have preferred the new fancy logo indicating difficulties, not the smugmug site as that would serve to throw people off, thinking they messed up the typing in of the url.

    But I do agree, this is happening more and more and more...esp the last few weeks.
    http://www.djkennedy.com

    What did Cinderella say when she left the photo shop? "One day my prints will come."

  • Andy Registered Users Posts: 50,016 Major grins
    edited January 20, 2008
    DJKennedy wrote:
    I was on earlier, but can't get onto my site now. I don't even get my site in read-only mode, nor do I get that fancy (but cool) error page - I just get sent to the smugmug site with the 'We're having some temporary difficulties..' text.

    I would have preferred the new fancy logo indicating difficulties, not the smugmug site as that would serve to throw people off, thinking they messed up the typing in of the url.

    Use your SmugMug nickname and you will get that.
  • DJKennedy Registered Users Posts: 555 Major grins
    edited January 20, 2008
    Andy wrote:
    Use your SmugMug nickname and you will get that.

    Thanks Andy, that works for me...but what about the general public that uses my URL? That won't work for them - not that I have thousands of hits a day or anything like that
  • Andy Registered Users Posts: 50,016 Major grins
    edited January 20, 2008
    Should be back to normal, folks. Sorry for the hassle!
  • cabbey Registered Users Posts: 1,053 Major grins
    edited January 20, 2008
    J.T. wrote:
    Really? I'm not an IT guru, but what about keeping a draft email that states SM is currently down and experiencing a problem, sorry for the inconvenience, etc.? Then when an outage occurs, the email is sent to all affected members. That wouldn't work?

    It's not the drafting, but the sending... it would take a few hours on a normal mail server to deliver all of the 450,000 mails (one per customer, using what I'm sure is by now an out-of-date number of SM customers) you're talking about. Plus, to do it right they'd need to host that server off site in order to handle an issue where they're offline entirely at HQ, like a quake that hits and takes out the fiber into/out of the NAP down the street where all the ISPs in the valley exchange traffic. That adds all sorts of issues.

    Also note that email is not a real-time mechanism... just normal email flow can sometimes take longer to get through to you than any of these recent outages have lasted, so by the time you hear about it, it's over. And how many of those mails would go to users that never tried to touch their site during that outage and otherwise wouldn't have known?

    This suggestion has come up a couple times, but frankly, it doesn't make any sense for the SM guys to pursue it.
    SmugMug Sorcerer - Engineering Team Champion for Commerce, Finance, Security, and Data Support
    http://wall-art.smugmug.com/
  • anderiv Registered Users Posts: 80 Big grins
    edited January 20, 2008
    Hey cabbey - great explanation.

    A few of my other hosting providers have started a "status" blog to update customers on the current status and/or ongoing issues. I think this is a great compromise between the "push" email mechanism and the status posts on the dgrin forums.

    If we visit our photo sites and something's not working, we could check the status blog to see if the problem is widespread. When you publish information in this fashion, we'd also be able to subscribe to the RSS feed, which would be quite nice.

    http://blogs.smugmug.com/status <--- perhaps that URL would be appropriate?

    ...just my $0.02.

    -Erik
    P.S. Smugmug web dudes - you may want to assign an appropriate 404 handler on the blogs.smugmug.com server :-)
    Erik Anderson
    http://andersonfam.org
    http://andersonfam.smugmug.com
    D70 | SB-600 | Nifty Fifty | Tamron 17-50 f/2.8 | Nikon 70-300 f/4-5.6G
  • Cindy Registered Users Posts: 542 Major grins
    edited January 21, 2008
    I want to know every time it's down and for how long!
    I want to know every time it's down and for how long, and it'd really, really be nice if we started getting some info as to why! Had I not logged on to dgrin tonight I'd not have known it was down today... but if customers were visiting my site and asked me later why they couldn't get there... or hey why couldn't I see my pics... I'd like an explanation to give them without having to search and by chance find the answer.

    JMHO.

    EDIT: In all fairness I do a lot more in-studio proofing (customer views on my computer so I don't bother with uploading to smugmug)... but I'd like that to eventually change.
    Cindy Colbert (Utterback) • Wishing You Co-Bear Love, Hugs & Laughter!!!
  • J.T. Registered Users Posts: 279 Major grins
    edited January 21, 2008
    Cindy wrote:
    I want to know every time it's down and for how long, and it'd really, really be nice if we started getting some info as to why! Had I not logged on to dgrin tonight I'd not have known it was down today... but if customers were visiting my site and asked me later why they couldn't get there... or hey why couldn't I see my pics... I'd like an explanation to give them without having to search and by chance find the answer.

    JMHO.

    EDIT: In all fairness I do a lot more in-studio proofing (customer views on my computer so I don't bother with uploading to smugmug)... but I'd like that to eventually change.

    Cindy you summed it all up! That's the whole reason why I started this thread.
  • LouwPhotography Registered Users Posts: 63 Big grins
    edited January 24, 2008
    I agree with this too. Here's an article that talks a bit about downtime issues: http://www.joelonsoftware.com/items/2008/01/22.html

    I hope it's something that's considered.
    anderiv wrote:
    Hey cabbey - great explanation.

    A few of my other hosting providers have started a "status" blog to update customers on the current status and/or ongoing issues. I think this is a great compromise between the "push" email mechanism and the status posts on the dgrin forums.

    If we visit our photo sites and something's not working, we could check the status blog to see if the problem is widespread. When you publish information in this fashion, we'd also be able to subscribe to the RSS feed, which would be quite nice.

    http://blogs.smugmug.com/status <--- perhaps that URL would be appropriate?

    ...just my $0.02.

    -Erik
    P.S. Smugmug web dudes - you may want to assign an appropriate 404 handler on the blogs.smugmug.com server :-)
  • bkatz Registered Users Posts: 286 Major grins
    edited January 25, 2008
    I think you have hit the nail on the head. As an internal vendor (regular job) at my company, as well as a customer of another internal IT shop, we want to know when something went down, how long the outage lasted, and at least a bare-minimum root cause (reason why).

    You guys do an awesome job, work hard to make our lives easier, and are worth every penny. But at the same time we have no real SLA with you (although one is implied), and I for one don't like that when I ask in every thread during an outage what the problem is/was, all I ever see is that "it is now fixed!" I am at the point where I don't ask very much anymore, and I can read between the lines of Don's blog that you guys are having some database issues.

    Keep up the good work but if you can - try and keep us informed or set an SLA where we know what info we can expect.....Just MHO
  • Cindy Registered Users Posts: 542 Major grins
    edited January 28, 2008
    Down again last night...
    Down again last night... no explanation that I can find...
    Discovered via the following link:
    http://www.dgrin.com/showthread.php?t=82709

    What's up smugmug?
  • J.T. Registered Users Posts: 279 Major grins
    edited January 28, 2008
    Andy,

    There have been a lot of people viewing this thread over the past week or so as many of us are concerned about the increased frequency that SmugMug has been down lately!

    After all these replies, many of them offering different suggestions, is anyone looking / will anyone look at these suggestions for notifying us when SmugMug is down?

    Secondly, with all of this discussion, you / SmugMug really haven't told any of us why the problems that cause SmugMug to go down unexpectedly occur!

    Yes, we all know that each of us hates having our site down for any amount of time, but honest answers to these two basic questions would be beneficial for all of us.

    Maybe you can't answer this, but I am sure someone else can!

    Thanks for listening to our concerns, as always!
  • Andy Registered Users Posts: 50,016 Major grins
    edited January 28, 2008
    J.T. wrote:
    Andy,

    There have been a lot of people viewing this thread over the past week or so as many of us are concerned about the increased frequency that SmugMug has been down lately!

    After all these replies, many of them offering different suggestions, is anyone looking / will anyone look at these suggestions for notifying us when SmugMug is down?

    Secondly, with all of this discussion, you / SmugMug really haven't told any of us why the problems that cause SmugMug to go down unexpectedly occur!

    Yes, we all know that each of us hates having our site down for any amount of time, but honest answers to these two basic questions would be beneficial for all of us.

    Maybe you can't answer this, but I am sure someone else can!

    Thanks for listening to our concerns, as always!

    We've addressed the issue of notification a lot. Far more often than not, the problem is fixed before we could get notices out. We pre-notify when there's maintenance (notice to your message panel).

    As to the "why" the outage occurred, I will ask our ops team to answer - they can't always comment but I will ask.

    Please be assured that we're working on making SmugMug as available as possible - and that over time, we've had an enormously strong record as far as uptime. We don't like downtime any more than you guys do.
  • J.T. Registered Users Posts: 279 Major grins
    edited January 28, 2008
    Andy wrote:
    We've addressed the issue of notification a lot. Far more often than not, the problem is fixed before we could get notices out. We pre-notify when there's maintenance (notice to your message panel).

    As to the "why" the outage occurred, I will ask our ops team to answer - they can't always comment but I will ask.

    Please be assured that we're working on making SmugMug as available as possible - and that over time, we've had an enormously strong record as far as uptime. We don't like downtime any more than you guys do.

    Thank you Andy!
  • darryl Registered Users Posts: 997 Major grins
    edited January 28, 2008
    Andy wrote:
    As to the "why" the outage occurred, I will ask our ops team to answer - they can't always comment but I will ask.

    I understand the reasons why you won't pre-announce features. Competitive advantage, raising users' expectations, etc.

    But giving us a post-mortem about downtime doesn't seem unreasonable. I'm actually a little surprised that SmugMug doesn't do this, considering Don's background in NOCs and ISPs.

    Webhosting companies and ISPs like Pair.com, Sonic.net and Dreamhost all provide this kind of information. I know SmugMug is "just photo hosting", but you guys run it like a NOC or ISP. As such, I'd expect the same levels of transparency and professionalism when it comes to downtimes. (Yah, that new "Sorry, we're broken" cartoon is cute, but I tend to agree with one of the Pros that thought it wasn't the most "professional" error message.)

    Sonic.net's MOTD archive: http://sonic.net/motd/
    Pair.com's System Notices: http://www.pair.com/support/system_notices.html
    DreamHost has (heh, Erik) a blog: http://www.dreamhoststatus.com/

    P.S. It seems especially surprising to not get this kind of information considering the usual amount of candor we get on these very forums.
    P.P.S. Note that I am *not* asking for an SLA. I know Don has blogged about SLAs before. I know *my* photos don't need "five 9s" of reliability. (Although I'm sure many of your commercial and professional users *would* like such an agreement.) I just want to know WTF happened. Would that really damage your reputation? I guess if the notices were like, "One of our dogs chewed through the fiber... again", then maybe. But I think you guys must feed your dogs ok. I saw one drinking out of his own water bowl in Don or Chris's office in the Scoble video! :-}
  • bkatz Registered Users Posts: 286 Major grins
    edited January 28, 2008
    Darryl has hit this spot on.

    I know that I talked about SLAs but what I really want to know is what happened....and eventually how you might keep it from happening again.

    I love you guys and the support you provide and I wouldn't trade you, but you know - even when the dog drinks from the toilet and gets the blue tongue you wanna know about it.......
  • Cindy Registered Users Posts: 542 Major grins
    edited February 1, 2008
    Ahhhhhhhh..... from the looks of it we've had problems again on:
    2/01/08 - http://www.dgrin.com/showthread.php?t=83203 and I just saw here on:
    1/31/08 - http://www.dgrin.com/showthread.php?t=83132
    Any reasons? Lots & lots of problems recently folks... you know I love you but frankly I'm beginning to get rather concerned...
  • J.T. Registered Users Posts: 279 Major grins
    edited February 4, 2008
    follow up
    Andy wrote:
    We've addressed the issue of notification a lot. Far more often than not, the problem is fixed before we could get notices out. We pre-notify when there's maintenance (notice to your message panel).

    As to the "why" the outage occurred, I will ask our ops team to answer - they can't always comment but I will ask.

    Please be assured that we're working on making SmugMug as available as possible - and that over time, we've had an enormously strong record as far as uptime. We don't like downtime any more than you guys do.

    Andy,

    Any luck with the Smug ops team giving you answers or specific reasons as to why the outages are occurring, and with increased frequency?

    Thanks
  • Andy Registered Users Posts: 50,016 Major grins
    edited February 4, 2008
    J.T. wrote:
    Andy,

    Any luck with the Smug ops team giving you answers or specific reasons as to why the outages are occurring, and with increased frequency?

    Thanks
    We've been busy working on this:
    http://www.dgrin.com/showthread.php?t=82969

    We're working on making the site as bullet proof as possible from an Ops standpoint, too. When Ops has something to tell us, all of us, they'll post here on Dgrin.
  • darryl Registered Users Posts: 997 Major grins
    edited March 10, 2008
    *Bump* (from this thread) and a summary:

    - SmugMug is great. Unscheduled downtimes are rare, and when they happen, they are generally pretty short.
    - SmugMug's customer service via e-mail and Dgrin is great.

    HOWEVER...

    - Other reputable companies that provide hosting or networking services have made it a policy to be as transparent as possible when downtimes occur.
    - In his blog, Don publicly complained about Amazon's lack of a status page for their services, which SmugMug builds on top of.
    I’ve asked Amazon repeatedly for an “Amazon Web Services Health” page that shows the current expected state of all their services. Then you can tell at a glance (and even poll and work into your own monitoring) whether any of the services are having problems. Something like Keynote’s Internet Health Report would be a good start, but as Jesse Robbins points out, trust.salesforce.com is the gold standard. This page could also double as a mechanism to let customers know what’s being worked on and current ETAs when there are problems.

    I'd say something snarky here, but honestly, I don't want to be a jerk. I'd just really really love if SmugMug rose to the expectations that Don has set for one of his key vendors.
  • Andy Registered Users Posts: 50,016 Major grins
    edited March 10, 2008
    darryl wrote:
    *Bump* (from this thread) and a summary:

    - SmugMug is great. Unscheduled downtimes are rare, and when they happen, they are generally pretty short.
    - SmugMug's customer service via e-mail and Dgrin is great.

    HOWEVER...

    - Other reputable companies that provide hosting or networking services have made it a policy to be as transparent as possible when downtimes occur.
    - In his blog, Don publicly complained about Amazon's lack of a status page for their services, which SmugMug builds on top of.



    I'd say something snarky here, but honestly, I don't want to be a jerk. I'd just really really love if SmugMug rose to the expectations that Don has set for one of his key vendors.

    But we did make this, directly as a result of you and others asking for it: http://smugmug.wordpress.com/

    We haven't had any downtime since then to publish anything about.
  • darryl Registered Users Posts: 997 Major grins
    edited March 10, 2008
    Andy wrote:
    But we did make this, directly as a result of you and others asking for it: http://smugmug.wordpress.com/

    We haven't had any downtime since then to publish anything about.

    Ahhh, I was just about to post a comment to Don's blog, and lo and behold, there that link is. Doh.

    Ok, well I'm happy the link is now here in Dgrin, and not buried in the comments of a blog entry. Will be subscribing to updates shortly.

    Thanks Andy and Don!