Gallery Tools and Uploader Limits

SamirD Registered Users Posts: 3,474 Major grins
edited January 5, 2011 in SmugMug Support
I need to know if there are any practical or absolute limits on the number of images that can be manipulated in a gallery or uploaded. Specifically, how many images will the "Move to Gallery" tool be able to handle? 5000? 10,000? 30,000? 50,000? How many images can the SM uploader handle in one session? 5000? 10,000? 30,000? 50,000?

The reason I ask is that I have 100GB that I've been trying to upload to SM for years. Now I have to get them online to make a 3rd copy of my images. (I've found that hard drives will have bits change over time, and a 3rd copy is the only way to tell which image is correct.) I've tried every single uploader and none of them works as consistently as the default SM one. (Yes, even StarExplorer crashed on me. KomodoDrop, sendtosmugmug, you name it...)

Because the default SM uploader will choke my fastest system if I try to upload more than 10 galleries simultaneously, I thought about uploading everything into one gallery and then splitting the images out to the different galleries. This upload will number somewhere between 20,000 and 100,000 images (I don't have my drives with me so I don't know the exact number). But there's no reason to try this all as one batch if there's a limit that I need to be aware of. Any assistance appreciated.
Pictures and Videos of the Huntsville Car Scene: www.huntsvillecarscene.com
Want faster uploading? Vote for FTP!

Comments

  • Andy Registered Users Posts: 50,016 Major grins
    edited January 19, 2010
    Samir, you're pushing the limits, for sure :)
    This is typically a browser/ram/computer thing, not a SmugMug thing.

    I really don't know the physical limits. I'll ask around internally if anyone has some suggestions.
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 19, 2010
    Andy wrote:
    Samir, you're pushing the limits, for sure :)
    As usual.
    Andy wrote:
    This is typically a browser/ram/computer thing, not a SmugMug thing.

    I really don't know the physical limits. I'll ask around internally if anyone has some suggestions.
    Thank you for the quick reply, Andy. Any information will be helpful. Specs of the system where they ran into/didn't run into the limits would be nice. I'll be trying this later today since Knology has upgraded their service to 2mb uploads.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited January 19, 2010
    Are you sure you want to use Smugmug for this?

    I'm using http://www.backblaze.com as my online backup. It is an extra annual fee, but is also unlimited storage, will take any kind of file type (not just images) and has a painless uploader that just works automatically all the time in the background.

    I find the most important characteristic of a backup system is that it's automatic because no system is good if it relies on me remembering to do a manual operation regularly. Backblaze is totally automatic. I now have 680MB backed up through them (it took several months on my puny DSL line to get that much up there), but it just worked in the background 24 hours a day.

    I'm also able to back up other important things such as family documents, Lightroom catalogs, music library, home videos, etc... And, it keeps my whole disk hierarchy intact. If I ever need to restore, it has a more typical restore interface, letting me restore anything from a single file to a whole directory tree to a whole drive at a time.
    --John
    Homepage | Popular
    JFriend's javascript customizations | Secrets for getting fast answers on Dgrin
    Always include a link to your site when posting a question
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 19, 2010
    jfriend wrote:
    Are you sure you want to use Smugmug for this?

    I'm using http://www.backblaze.com as my online backup. It is an extra annual fee, but is also unlimited storage, will take any kind of file type (not just images) and has a painless uploader that just works automatically all the time in the background.

    I find the most important characteristic of a backup system is that it's automatic because no system is good if it relies on me remembering to do a manual operation regularly. Backblaze is totally automatic. I now have 680MB backed up through them (it took several months on my puny DSL line to get that much up there), but it just worked in the background 24 hours a day.

    I'm also able to back up other important things such as family documents, Lightroom catalogs, music library, home videos, etc... And, it keeps my whole disk hierarchy intact. If I ever need to restore, it has a more typical restore interface, letting me restore anything from a single file to a whole directory tree to a whole drive at a time.
    I've looked into these systems and they rely too much on proprietary software for my tastes.

    Backup is actually not my main concern as I've got local backups of everything mirrored at two sites. But the problem I recently discovered is that consumer-level hard drives have data integrity issues. The data can change without warning or indication in less than a year. And this is without any indication of drive failure or a problem of any sort. Data integrity is now something I have to bring into the mix.

    I back up all photos to two 640gb external usb drives as well as a special archive category on SM. I've been uploading as I shoot stuff since 2009. But this doesn't help for the many hundreds of thousands of photos shot prior to 2008. These are on drives, but that's it. I want to get these on SM asap so I can empty drives that have the 3rd, 4th, 5th, and 6th copies of this data. I need the drives for expansion now as the 640gb is almost full. I've been meaning to do this for years, but now it's going to cost me if I don't.

    For regular data, I'm going to go with an external drive in addition to the two file servers (running RAID 1) at two different physical sites joined via a VPN. This should be pretty good for backup as well as integrity. I already have the file servers in place. I just have to adjust the capacities and migrate all the existing data, which currently resides on several sets of manually mirrored external drives. As these drives get freed, this will give me additional space for images going forward in the next few years.

    Previously, the main bottleneck on the upload project was upload bandwidth. And this coincidentally is getting solved as we speak since Knology is upgrading their Internet services. This will take my upload capability from just over 2mb to 15mb. The month-long upload should now only take a week or so. Now, if I can just get it queued up somehow.
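
As a rough sanity check on those numbers, upload time scales inversely with line rate. The sketch below is only an illustration: it assumes "mb" means megabits per second, that gigabytes are decimal, and that the line is fully used; the function names are made up for this example and nothing here reflects measured SmugMug throughput.

```python
def upload_days(size_gb: float, rate_mbit_s: float) -> float:
    """Ideal days to upload size_gb (decimal GB) at rate_mbit_s megabits/s."""
    bits = size_gb * 1e9 * 8                 # assumed: 1 GB = 10^9 bytes
    seconds = bits / (rate_mbit_s * 1e6)     # assumed: line fully saturated
    return seconds / 86400.0


def scaled_days(days_at_old_rate: float, old_rate: float, new_rate: float) -> float:
    """Scale a known upload duration to a new line rate (same efficiency assumed)."""
    return days_at_old_rate * old_rate / new_rate


if __name__ == "__main__":
    # A 30-day upload at 2 Mbit/s, moved to 15 Mbit/s.
    print(f"{scaled_days(30, 2, 15):.1f} days")
    # Ideal-line figure for 100 GB at each rate.
    for rate in (2, 15):
        print(f"{rate} Mbit/s: {upload_days(100, rate):.1f} days")
```

Real sessions will be slower than the ideal figure because of protocol overhead, retries, and server-side processing, so the ratio between the two rates is the more trustworthy number.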

    StarExplorer would have been ideal for this project, but I've never had any success with it. I actually bought it specifically for this purpose years ago. Back in the day, it froze randomly during large batches. Nikolai and I worked back and forth and we were able to make it usable for smaller batches, but then the SM uploader became better. I gave SE a shot yesterday with a 10GB batch on my newest and fastest system. It froze after uploading just over a hundred...out of 7000. It didn't even save the upload queue.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited January 19, 2010
    SamirD wrote:
    I've looked into these systems and they rely too much on proprietary software for my tastes.
    That's a bit of an odd statement. I don't see any more/less proprietary software in BackBlaze vs. Smugmug. Both use proprietary uploaders. Both use web interfaces for downloading. Both have a ton of proprietary software that make their sites work. The only way I know you'll get around the proprietary software issue is if you get generic web server hosting where you can use standard ftp for uploading into a standard (probably Linux) file system. There you will probably pay for your storage by the GB.

    FYI, I use StarExplorer for all my Smugmug uploads and it has saved my butt many times because I can do uploads to many different galleries in one unattended session and because it has great error recovery from a hiccup in either my net connection or Smugmug. If you have issues, Nikolai is usually pretty proactive in helping you figure out why it isn't working for you. But, I don't use Smugmug as a backup source.

    What led you to conclude that standard hard drives change single bits and that this happens frequently? That is not a typical behavior for disk drives. I'm not saying that it can't happen, but there's an awful lot of software in the world that counts on that not happening. Drives can certainly fail and one should be prepared for that, but random and invisible data corruption is not a common occurrence and not something that is supposed to ever happen without classifying it as a read failure. Single bits changing on a hard drive storing software applications would render software applications useless because one bit changing in a program file can ruin the whole program.

    FYI, I have 800GB of data with three separate hard drive copies at home (two hard drive backups) and most of it also in an online backup at BackBlaze. I lost two years of photos once so I'm pretty determined to keep that from happening again.
    --John
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 19, 2010
    jfriend wrote:
    I don't see any more/less proprietary software in BackBlaze vs. Smugmug. Both use proprietary uploaders. Both use web interfaces for downloading. Both have a ton of proprietary software that make their sites work. The only way I know you'll get around the proprietary software issue is if you get generic web server hosting where you can use standard ftp for uploading into a standard (probably Linux) file system. There you will probably pay for your storage by the GB.
    I agree with you that both are proprietary. But SM is something that's already part of my workflow for images. If I wanted to use SM for everything, I'd add a Smugvault. I've looked at rsync.net as I use it for my web server, but it's too expensive for large sized backups.
    jfriend wrote:
    FYI, I use StarExplorer for all my Smugmug uploads and it has saved my butt many times because I can do uploads to many different galleries in one unattended session and because it has great error recovery from a hiccup in either my net connection or Smugmug. If you have issues, Nikolai is usually pretty proactive in helping you figure out why it isn't working for you. But, I don't use Smugmug as a backup source.
    I've worked with him on getting things to work, but this is software and with the various system configurations out there, there's no guarantee anything will work. I've tried SE on eight different systems, many with different processors, operating systems, and specs. I've not been able to get the level of success that everyone else has. Just my luck, I guess.
    jfriend wrote:
    What led you to conclude that standard hard drives change single bits and that this happens frequently? That is not a typical behavior for disk drives. I'm not saying that it can't happen, but there's an awful lot of software in the world that counts on that not happening. Drives can certainly fail and one should be prepared for that, but random and invisible data corruption is not a common occurrence and not something that is supposed to ever happen without classifying it as a read failure. Single bits changing on a hard drive storing software applications would render software applications useless because one bit changing in a program file can ruin the whole program.
    The methodology was as follows:
    • All images are kept in directories by date. Copy images from memory card to 250gb drive. Use (now defunct) Seagate Software's Xcompare program to compare the files. This software is functionally equivalent to the DOS/Windows comp command, except that it traverses directories. Make sure everything compares okay.
    • Copy images from one 250gb drive to the other 250gb drive. Xcompare and verify that the copy is okay. All three sets of images are the same at this point--card, 250gb drive one, 250gb drive two.
    • One year later, I copied contents of 250gb drives to 640gb drives and xcompared. These compared with no errors. Continue adding newly shot images to 640gb using same methodology as with 250gb drives.
    • One year later, xcompare 250gb drives to 640gb drives again just to "make sure" before deleting. Three files miscompare.
    • Compare just those directories with the files. Files still miscompare.
    • Compare files between 250gb drives--compare okay.
    • Compare files between 640gb drives--compare error.
    • Compare every combination of 250gb and 640gb drives--out of 4 drives, 3 drives have the same file.
    • Check file on the 640gb drive with the different file for any indication of change--date change, access time difference, exif change, anything. Nothing found. Scan drive for defects. Nothing found. Run SMART utilities to query drive for errors not reported to operating system. Nothing found. Conclusion--the files automatically changed on the 640gb drives.
    • Alarmed at what I found, I compared the internal drives in one of my file servers which are also kept manually mirrored. Over 5 files found to miscompare.
    • Worked on designing new storage methodology using what I've dubbed RAID1.3--mirroring, but with three sets of data vs two.
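
For readers without Xcompare, the compare step in the list above can be approximated with a short script. This is a minimal sketch, not the tool used in the thread: it assumes two directory trees with the same layout, and the function name `compare_trees` is made up for illustration. It uses the standard library's `filecmp.cmp` with `shallow=False`, which compares file contents rather than just size and timestamp.

```python
import filecmp
from pathlib import Path


def compare_trees(left: str, right: str) -> list[str]:
    """Return relative paths that are missing on one side or differ byte for byte."""
    filecmp.clear_cache()  # avoid reusing results cached earlier in this process
    left_root, right_root = Path(left), Path(right)
    left_files = {p.relative_to(left_root) for p in left_root.rglob("*") if p.is_file()}
    right_files = {p.relative_to(right_root) for p in right_root.rglob("*") if p.is_file()}

    problems = []
    for rel in sorted(left_files | right_files):
        if rel not in left_files or rel not in right_files:
            problems.append(f"missing: {rel.as_posix()}")
        elif not filecmp.cmp(left_root / rel, right_root / rel, shallow=False):
            problems.append(f"miscompare: {rel.as_posix()}")
    return problems


if __name__ == "__main__":
    import sys
    for line in compare_trees(sys.argv[1], sys.argv[2]):
        print(line)
```

An empty result means every file exists on both sides and matches byte for byte, which corresponds to "compares okay" in the steps above.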
    jfriend wrote:
    FYI, I have 800GB of data with three separate hard drive copies at home (two hard drive backups) and most of it also in an online backup at BackBlaze. I lost two years of photos once so I'm pretty determined to keep that from happening again.
    I haven't lost much yet, but I don't plan to either. It's why I compare everything. Even in the DOS days when you'd never see a miscompare. But modern operating systems have shown me errors--only a few times, but enough to have me worry. And now drives can't be trusted--great. I've seriously considered moving to all enterprise class SCSI drives, but it's still cheaper to have three desktop drives than an enterprise one.

    As long as SM has a storage system with a target of 100% data integrity, I can trust my archive copies on SM to be the true, real copy. (I plan to download images after I upload them and compare them with the originals before I put this stamp on them.) Then the issue is how to get the images to SM. I have the bandwidth now, and I wish to know if the tools are up to the challenge.
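
The idea of using a third copy as a tie-breaker can be sketched as a simple majority vote over file hashes. This is an illustration only, assuming SHA-256 is an acceptable fingerprint and that the copies are ordinary local files (for example, a downloaded copy plus two drive copies); the helper names are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def odd_copies(paths: list[str]) -> list[str]:
    """Return the copies that disagree with the majority.

    Raises ValueError when no strict majority exists, which is always the
    case for two copies that differ.
    """
    digests = {path: sha256_of(path) for path in paths}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(paths) // 2:
        raise ValueError("no majority: cannot tell which copy is correct")
    return [path for path, digest in digests.items() if digest != winner]
```

With two copies a mismatch only says that something changed; with three, the odd one out can be identified, provided two copies have not been damaged in the same way.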
  • jfriend Registered Users Posts: 8,097 Major grins
    edited January 19, 2010
    SamirD wrote:
    As long as SM has a storage system with a target of 100% data integrity, I can trust my archive copies on SM to be the true, real copy. (I plan to download images after I upload them and compare them with the originals before I put this stamp on them.) Then the issue is how to get the images to SM. I have the bandwidth now, and I wish to know if the tools are up to the challenge.
    There's something fishy in the compare. If your disks were just randomly changing bits in your files, you'd have all sorts of system problems and, if I were you, I'd be very concerned about finding/fixing that issue before it messes up something seriously bad.

    I'd really like to know what bits in those files were actually different so you could see whether there is a logical explanation (like something touched the image metadata) or whether the different bits are in the extra bytes of a file sector/cluster outside the actual length of the file or whether a whole sector/cluster was clobbered.
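
One way to get that level of detail is to list the byte offsets where two copies differ and group them by sector. The sketch below is a hedged illustration: the 512-byte sector size is an assumption (cluster sizes vary by file system and format options), and the function names are invented for this example.

```python
def byte_differences(path_a, path_b, chunk_size=1 << 20):
    """Return (offset, xor_mask) for every differing byte in the common length."""
    differences = []
    position = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            block_a = file_a.read(chunk_size)
            block_b = file_b.read(chunk_size)
            if not block_a or not block_b:
                break
            if block_a != block_b:
                for index, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        differences.append((position + index, x ^ y))
            position += min(len(block_a), len(block_b))
    return differences


def summarize(differences, sector_size=512):
    """Count differing bytes and bits, and list the sectors they fall in."""
    bits = sum(bin(mask).count("1") for _, mask in differences)
    sectors = sorted({offset // sector_size for offset, _ in differences})
    return {"bytes": len(differences), "bits": bits, "sectors": sectors}
```

A single differing bit points toward a flipped bit somewhere in the chain, while hundreds of differing bytes confined to one sector or cluster would look more like a clobbered block or a cross-linked file.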

    Also, what file system are you using? Is it FAT32 or NTFS? FAT32 is notorious in OS crashes for occasionally getting cross-linked files and thus getting a whole wrong sector in a file, but NTFS is generally much more robust than that. But, without knowing what kind of corruption you have, we don't know if this is a possible cause or not.

    Anyway, I don't think I have much more helpful to say other than you really ought to find out what is corrupting your files. That is not normal behavior and would be very worrisome to me if I were you. It is not something that a disk just does on its own without generating read errors. Disk information is CRCed when written so any bit drift is detected and either fixed or reported as a read error to the OS.
    --John
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 20, 2010
    jfriend wrote:
    There's something fishy in the compare. If your disks were just randomly changing bits in your files, you'd have all sorts of system problems and, if I were you, I'd be very concerned about finding/fixing that issue before it messes up something seriously bad.
    I disagree. I've used this version of xcompare without fail for almost a decade. I initially didn't know if it would work as well with NTFS as it did with FAT, but I've done enough testing to validate its use. (The only limitation is on long filenames, for which it uses the truncated version to compare, so there can be compare errors due to comparing different files. But this is easily remedied by using regular comp in a directory with long filenames. It doesn't happen often since the storage spec for image files still uses the 8.3 format.) It's literally functionally identical to the command line compare command (comp). I even used to write batch files to do multi-directory comparisons using comp before xcomp came along.
    jfriend wrote:
    I'd really like to know what bits in those files were actually different so you could see whether there is a logical explanation (like something touched the image metadata) or whether the different bits are in the extra bytes of a file sector/cluster outside the actual length of the file or whether a whole sector/cluster was clobbered.
    I've thought about going into this level of detail. When I used comp on the files, it simply said they're different. The old version of comp would display the differences and only determine files to be different if there were more than 10 differences. I guess I can use FC to get more detail as to if it is a single bit or more, but I don't think the extra information will shed any more light on what's going on.

    If there were extra bytes on one of the files versus the other, the file sizes would have been different. I haven't studied the NTFS file system enough to know if the space occupied by a file versus the file size can skew or 'move' within the allocated space. I know in FAT32 it shouldn't. And what's interesting is that this has happened to two different sets of drives running two different file systems and two different OSs.
    jfriend wrote:
    Also, what file system are you using? Is it FAT32 or NTFS? FAT32 is notorious in OS crashes for occasionally getting cross-linked files and thus getting a whole wrong sector in a file, but NTFS is generally much more robust than that. But, without knowing what kind of corruption you have, we don't know if this is a possible cause or not.
    It's really interesting. The file server has two 160GB drives running FAT32 on a PentiumPro 180 with 98se. No partition approaches the old 137GB limit even though the system has a Promise IDE card that extends the limit. The boot partition is separate from the data partitions. (And amazingly it's still decently quick, transferring at almost 20MB/sec from one set of drives to the next.)

    The other set of drives are 250GB Seagate FreeAgent external USB drives running NTFS from the factory, and 640GB Maxtor OneTouch series external USB drives running NTFS from the factory. These are primarily connected to a Neoware XPe thin client that is primarily used for uploading to SM and copying off cards to the drives. At random, one of the 640GB drives also gets transported to a second site to be connected to a faster Internet connection for uploading to SM. It gets connected to an HP dc5750 business class desktop running XPP. I think transportation of the 640GB drives may have contributed to the bit changes on these drives, even though I use a laptop case to carefully transport the drives well within handling specs. But that still doesn't explain the errors on the FAT32 drives in the server, except that those drives have been in use longer.
    jfriend wrote:
    Anyway, I don't think I have much more helpful to say other than you really ought to find out what is corrupting your files. That is not normal behavior and would be very worrisome to me if I were you. It is not something that a disk just does on its own without generating read errors. Disk information is CRCed when written so any bit drift is detected and either fixed or reported as a read error to the OS.
    I think this isn't a one-time phenomenon, but one that happens regularly--the odds of it are just so small. It can explain random system crashes or applications not working one time but working fine the next, as well as a host of other intermittent system failures. There is a specification by all disk manufacturers for "recoverable read errors", and it's an exponentially small number. But as the number of bits on drives has increased, I'm sure this spec has remained rather consistent (they no longer post them on spec sheets), allowing these types of errors to start affecting things.

    Desktop class drives don't have as much crc information as their enterprise brothers. I don't think this would have happened if I had all SCSI drives. There's an interesting Intel white paper on this:
    http://www.intel.com/support/motherboards/server/sb/CS-029229.htm

    But I'm still looking for any info on the limits of the tools. I'm more concerned about how many images would be practical in the "Move to Gallery" tool. I know this limitation will depend more on the browser/OS/memory/etc., but a general guideline would be great. I've worked with over a thousand without too much hiccup, but I'd like to know where the real breaking points may be--2k? 4k? 10k? If I upload this much into a gallery and then can't move it, it will be a lot of wasted time.
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 25, 2010
    The largest number of images I uploaded in one batch was over 2300. I was able to load the same number in the move tool in the gallery. But after realizing that the move tool sorts by file name and not by the gallery setting (date shot in this case), I found that this methodology won't work for me at all. Looks like I'm back to multiple upload sessions on multiple computers.
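
If the work has to be split into multiple sessions anyway, the date directories can be grouped into batches that stay under a chosen image count. This is only a planning sketch: the 2000-image default is a hypothetical cap loosely based on the 2300-image batch reported above, not a documented SmugMug limit, and `plan_batches` is an invented helper that does no uploading itself.

```python
from pathlib import Path


def plan_batches(root: str, max_images: int = 2000) -> list[list[str]]:
    """Group the immediate subdirectories of root into batches of at most
    max_images files. A single directory larger than the cap gets a batch
    of its own."""
    batches, current, current_count = [], [], 0
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        count = sum(1 for f in folder.rglob("*") if f.is_file())
        if current and current_count + count > max_images:
            batches.append(current)
            current, current_count = [], 0
        current.append(folder.name)
        current_count += count
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    import sys
    for number, batch in enumerate(plan_batches(sys.argv[1]), start=1):
        print(f"session {number}: {', '.join(batch)}")
```

Keeping each date directory whole within a batch means each session maps cleanly onto one or more galleries, so nothing needs to be moved afterwards.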

    I wish I could just fly into SM headquarters and just plug into the network and upload directly. It would be worth the flight cost in the amount of time I'd save.
  • SamirD Registered Users Posts: 3,474 Major grins
    edited February 13, 2010
    SamirD wrote:
    I wish I could just fly into SM headquarters and just plug into the network and upload directly. It would be worth the flight cost in the amount of time I'd save.
    Can I do this? I'm driving myself crazy trying to upload this much data. And I've been working on this for two weeks straight...
  • SamirD Registered Users Posts: 3,474 Major grins
    edited January 5, 2011
    And the hard drive bits are still randomly changing. A recent compare of the video archive drives to the 640GB drives resulted in 4 files that miscompared out of almost 300,000. A small number, but miscompared nonetheless.