Smugmug uploader makes a big mess (170 duplicates)

jfriend Registered Users Posts: 8,097 Major grins
edited January 20, 2013 in Bug Reporting
The @#$%^& Smugmug uploader made a big mess today. I'm attempting to upload 1615 images to a single gallery (this will be the master gallery from which many smart galleries pull their images). To summarize: I use the default Smugmug uploader on Chrome, I have duplicate protection on, and I drop 1615 images into the uploader. When all is said and done, I have 1785 images in the gallery. That's 170 duplicates. Now I've got a giant mess to sort out to get rid of all the duplicates. I will probably have to write a script just to find all the dups and remove them.

I simply can't believe you guys can't design a reliable uploader. I've been encouraging you to fix this for 6 years now. You made some attempts with the latest uploader you released a little while ago, but under stress, it's just chock full of bugs that can make a giant mess.

Now, the upload sequence of events wasn't 100% normal (I'll describe it in a bit), but still, with duplicate protection on, I have no idea why your uploader would give me 170 dups. How difficult is it to do that part right?

Anyway, here's the sequence of events:
  1. Create new gallery
  2. Bring up default Smugmug uploader on a brand new computer using latest version of Chrome
  3. Drop 1615 JPEGs into the uploader
  4. Uploading starts
  5. After several hundred images, there's a neighborhood power outage. I'm using my laptop so the computer is not affected by the power outage, but the internet connection goes down.
  6. Power is down about 3-4 minutes and then comes back on.
  7. The uploader does not appear to recover even after the internet connection has been restored; it appears to still be trying to upload several images, but they aren't going anywhere.
  8. I wait several minutes. Uploader doesn't seem to be recovering.
  9. I close the uploader.
  10. Reopen the uploader and redrop all 1615 files, figuring that duplicate protection is on so it should be OK.
  11. It starts uploading again. After another few hundred files, I notice that two uploading images (it normally seems to do 3 at a time) are permanently stuck on the "verifying..." step that normally happens at the end of each image upload. I wait and wait and they never proceed beyond that step.
  12. Because uploading one image at a time is significantly slower than the usual three at a time, I close the uploader again, reopen it, and drop all 1615 files in again.
  13. Uploads start going again, 3 at a time.
  14. A few hundred images later, it again shows two images stuck on the verifying step, so it's going slower again. I'm on borrowed time on someone else's internet connection at their house for this upload, so I can't just walk away and do other things. So, after waiting for a while, I again close the uploader, reopen it, and drop all 1615 files in again.
  15. Finally, it finishes.
  16. I check the gallery and see that somehow my gallery has 1785 images in it from 1615 original images - apparently at least 170 dups. Now I don't even trust that all 1615 images are actually there. There could be more than 170 dups and some missing images.
  17. Big mess.

List of bugs/problems encountered:
  1. When the internet connection dropped and then recovered, the uploader did not resume properly.
  2. When uploading lots of images, some images get stuck on the verifying step. This happened on four images. It does not recover from that, and once an upload gets stuck, throughput slows down (there are valid performance reasons for doing three at once).
  3. When stopping the uploader and restarting with all images again, the uploader makes LOTS of duplicates.

I can't believe you guys can't make a reliable uploader. This single mess will probably cost me 4-8 hours to clean up and I've got to write code (beyond the means of most customers) just to figure out how many unique images really are there and which dups to get rid of.

FYI, for those of you who know I usually use StarExplorer for uploads like this: I was using a new laptop at a remote location, didn't have StarExplorer installed on that laptop yet (license file complications), had simplified my upload into one gallery, and falsely believed that Smugmug had made their uploader a lot more robust for big uploads. Apparently, it isn't yet up to the task.

Edit: I have verified via a script that all 1615 files got uploaded and there are 170 extra duplicates (no two dups are the same - there are 170 unique dups - so it's not like one image caused the duplication). Here's a sampling of the filenames and upload times when there are dups:

[Screenshot: a sampling of the duplicate filenames and their upload times]
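
In case it's useful, here's a minimal sketch of the kind of duplicate count my verification script does. It assumes you already have the gallery's filenames in a plain array; the function and variable names are just illustrative, not anything from the SmugMug API.

```javascript
// Rough sketch only - "galleryFilenames" is assumed to be an array of the
// filenames pulled down from the gallery; nothing SmugMug-specific here.
function countDups(filenames) {
  var counts = {};
  filenames.forEach(function (name) {
    counts[name] = (counts[name] || 0) + 1;   // tally each filename
  });
  var unique = Object.keys(counts).length;
  return {
    total: filenames.length,
    unique: unique,
    dups: filenames.length - unique           // extra copies beyond the first
  };
}

// e.g. countDups(galleryFilenames) -> { total: 1785, unique: 1615, dups: 170 }
```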
--John

Comments

  • rainforest1155 Registered Users Posts: 4,566 Major grins
    edited May 14, 2012
    John,

    I'm very sorry about your uploading troubles. I assume you are using the default html5 uploader.
    Can you tell how many photos were already uploaded in the gallery when the power went out?
    Would you be able to give the Simple uploader a try instead and see if that works better in your case?

    Looking at your jfriend site, I cannot see any recent uploads in the last 7 days. Which account / gallery were you uploading to so we could take a closer look?
    Sebastian
    SmugMug Support Hero
  • sujit1779 Registered Users Posts: 46 Big grins
    edited May 14, 2012
    Hi,

    May I suggest a desktop solution made by us, exactly for situations like this. It not only uploads images to Smugmug (it also supports uploading to Picasa, Flickr, SkyDrive, Dropbox, Box and Facebook), it checks for duplicates too, so you can be assured you won't get any duplicate images in your gallery :-) It can handle thousands of images with ease. Give it a try; I am sure you will like it :-) You can download it from www.picbackman.com or from CNET: http://download.cnet.com/PicBackMan/3000-13455_4-75650267.html?tag=mncol;1

    Thanks.
  • Andy Registered Users Posts: 50,016 Major grins
    edited May 14, 2012
    jfriend wrote: »
    This single mess will probably cost me 4-8 hours to clean up

    While we look into the issue for you, why not simply delete the gallery and upload again? While the files are uploading, you can be doing other things, so there's no 4-8 hours of clean up.

    I'm happy to have a Hero work on your gallery and delete the duplicates, if you don't want to do that, just email us at the help desk ATTN: Andy.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 14, 2012
    Andy wrote: »
    While we look into the issue for you, why not simply delete the gallery and upload again? While the files are uploading, you can be doing other things, so there's no 4-8 hours of clean up.

    I'm happy to have a Hero work on your gallery and delete the duplicates, if you don't want to do that, just email us at the help desk ATTN: Andy.
    I don't want to just reupload because I made specific arrangements to go over to a friend's house for a couple of hours, where they have much faster upload bandwidth, to do this upload. On my upload bandwidth, it would take 18-24 hrs to complete, and the whole time it's uploading, our home internet access is slowed down significantly. It's not easy to reschedule going to the friend's house today, and I need to get this fixed today.

    I'll contact you directly Andy about whether it makes sense to have a hero fix this or whether I should write the code to do so. I've written code that has identified the 170 duplicates, but haven't yet written code to delete them (that appears to involve using oauth with the API which I haven't done before).

    Sebastian, the gallery is in my friend.smugmug.com account.

    And it's in an unlisted gallery which I don't want Google to find at /Sports/Palo-Alto-Rowing-Club-2012/All-Regattas/22955010_vCZHfV.

    Andy or Sebastian, do you guys want to have anyone look at the results in that gallery before I start fixing/changing it?

    Sebastian, I'm not going to try this upload with the simple uploader. On my home bandwidth, this is about a 24 hour upload with our home internet access compromised the whole time - it's not a popular thing in the house to do. That's why I had arranged to go over to a friend's house with much faster upload access to do this upload. If I were going to upload again, I'd use StarExplorer which has been more reliable to me in the past. The uploader was whatever the default uploader would be in the Chrome browser (I assumed the default is the HTML5 uploader), but if you tell me how to identify one uploader from another, I could confirm which one it was. I always thought your uploaders should have some sort of visible name on them for this type of troubleshooting.

    If you guys want to experiment with different uploaders, I'll give you a DVD with the 1615 images on it, but I don't think there's anything special about these images. They are just JPEGs 3-6MB in size (depending upon how much cropping there was). Just run some tests yourself with long/large uploads.
    --John
  • Allen Registered Users Posts: 10,008 Major grins
    edited May 14, 2012
    Can StarExplorer be installed on a thumb drive and used anywhere?
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 14, 2012
    Allen wrote: »
    Can StarExplorer be installed on a thumb drive and used anywhere?
    I don't know. That would be handy if so. I'm guessing it probably would work because there's no real installation program. I think you just need the exe and license file and supporting DLL(s) together.

    Right now I'm working on what's the best way to fix the gallery that's screwed up.
    --John
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 14, 2012
    FYI, to anyone else following this thread. I ended up writing a script that uses the Smugmug API to find and remove the duplicates.

    I haven't gotten any acknowledgement from Smugmug about any of the three bugs I observed in the uploader though.
    --John
  • Andy Registered Users Posts: 50,016 Major grins
    edited May 14, 2012
    jfriend wrote: »
    I haven't gotten any acknowledgement from Smugmug about any of the three bugs I observed in the uploader though.

    We offered to fix the duplicate issue for you.

    I also stated that I'd have people looking into this. You haven't heard back from me because I don't have anything to say yet about the issue you filed - except what I did reply to you about, which is we're accepting millions and millions of files and haven't ever seen the issue you wrote us about. Thanks John!
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 14, 2012
    Andy wrote: »
    We offered to fix the duplicate issue for you.

    I also stated that I'd have people looking into this. You haven't heard back from me because I don't have anything to say yet about the issue you filed - except what I did reply to you about, which is we're accepting millions and millions of files and haven't ever seen the issue you wrote us about. Thanks John!
    I thanked you for the offer to fix it in separate correspondence. I elected to fix it myself because I felt more confident about getting the exact outcome I needed by doing the work myself.

    On your response, I guess I was expecting to hear something like: "Thanks for reporting those issues and putting together the detailed sequence of events - we'll file those three issues as bugs and have our sorcerers look into them. When I get more info, I'll post back."

    As a piece of feedback to you, when you say: "we're accepting millions and millions of files and haven't ever seen the issue you wrote us about", that offers me zero comfort. In fact, it makes me think you don't think what happened to me has a very high priority or maybe you don't even think it's a credible issue. I don't know if you intended it that way, but put yourself in my shoes. That statement does not make me feel better at all. I'd rather hear that you will file these as issues and have people look into them.

    The customer (me in this case) generally doesn't care that it doesn't happen to a lot of other people. If it happens to them, it's real and it seems important to them, and trying to deflect the importance of the issue by saying it isn't happening to anyone else just feels like you're telling me my issue isn't very important. These bugs wasted a lot of my time because your uploader can't reliably handle some circumstances. Those are facts.

    If you don't intend for it to handle 1600 files at a time or don't intend for it to handle an occasional internet connection hiccup or don't intend for the duplicate protection to be reliable, then just let us know so we can lower our expectations and continue to bother you about how you should have a more robust uploader. But, I was under the impression that you thought you now had a more robust uploader and when I finally tried to use it as such, it failed me significantly.
    --John
  • Andy Registered Users Posts: 50,016 Major grins
    edited May 14, 2012
    jfriend wrote: »
    I thanked you for the offer to fix it in separate correspondence. I elected to fix it myself because I felt more confident about getting the exact outcome I needed by doing the work myself.

    On your response, I guess I was expecting to hear something like: "Thanks for reporting those issues and putting together the detailed sequence of events - we'll file those three issues as bugs and have our sorcerers look into them. When I get more info, I'll post back."

    As a piece of feedback to you, when you say: "we're accepting millions and millions of files and haven't ever seen the issue you wrote us about", that offers me zero comfort. In fact, it makes me think you don't think what happened to me has a very high priority or maybe you don't even think it's a credible issue. I don't know if you intended it that way, but put yourself in my shoes. That statement does not make me feel better at all. I'd rather hear that you will file these as issues and have people look into them.

    The customer (me in this case) generally doesn't care that it doesn't happen to a lot of other people. If it happens to them, it's real and it seems important to them, and trying to deflect the importance of the issue by saying it isn't happening to anyone else just feels like you're telling me my issue isn't very important. These bugs wasted a lot of my time because your uploader can't reliably handle some circumstances. Those are facts.

    If you don't intend for it to handle 1600 files at a time or don't intend for it to handle an occasional internet connection hiccup or don't intend for the duplicate protection to be reliable, then just let us know so we can lower our expectations and continue to bother you about how you should have a more robust uploader. But, I was under the impression that you thought you now had a more robust uploader and when I finally tried to use it as such, it failed me significantly.

    Thanks for the feedback, John! You and I communicate by email and have for years. I told you I'd make sure the team saw this, and we'll certainly investigate. Until we can replicate internally we can't do anything - so we'll try and do just that, replicate it. Thanks again for the valuable feedback.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 14, 2012
    Andy wrote: »
    Thanks for the feedback, John! You and I communicate by email and have for years. I told you I'd make sure the team saw this, and we'll certainly investigate. Until we can replicate internally we can't do anything - so we'll try and do just that, replicate it. Thanks again for the valuable feedback.
    You can probably reproduce the dropped connection by running a laptop on battery and then just turning off a home router in the middle of an upload, leave it off for 2-3 minutes, then turn it back on and see if it starts uploading again in a timely manner. Since I was running a battery powered laptop, the power outage that I experienced caused the router and WiFi connection to lose power, thus taking down the home network, but not taking down the computer.

    It's hard to know exactly what circumstance triggered the duplicates problem. I would theorize that a smart developer doing a thorough code review of the parts of the code that handle duplicates could probably identify several likely causes in a few hours of code inspection, and could likely find issues more thoroughly and more quickly than someone trying to reproduce the issue with no inspection of the code. Said another way, some issues are fixed much more effectively via whitebox code inspection/review rather than blackbox testing. There are bugs worth filing for which you don't have a reproducible case, where you challenge the developer to figure out how the code could fail by examining the code, design a test case that exploits that weakness, and then fix it.

    It's also hard to know exactly what circumstance triggered the images that got stuck on the "verifying" step. It could have been a momentary connection issue where something got lost or it could have been a hiccup on the upload server that failed to send it or it could have been some sort of failure to see the right event in the client. Again, code review on the client could identify places where the client isn't properly protected against any of these issues or doesn't have a fallback code path if the verifying event is never received.
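
    To illustrate the kind of fallback code path I mean, here's a rough sketch (this is not SmugMug's actual client code; verifyFn just stands in for whatever request the uploader makes at the verify step). If the response never comes back, a timeout retries a couple of times and then fails that file instead of leaving the upload slot stuck forever.

    ```javascript
    // Hypothetical sketch, not SmugMug's code: protect the "verifying" step
    // with a timeout so a lost response can't leave the upload slot stuck.
    function verifyWithTimeout(verifyFn, timeoutMs, maxRetries, done) {
      var attempt = 0;

      function tryOnce() {
        attempt++;
        var timedOut = false;
        var timer = setTimeout(function () {
          timedOut = true;
          if (attempt < maxRetries) {
            tryOnce();                              // retry the verify request
          } else {
            done(new Error('verify timed out'));    // give up and free the slot
          }
        }, timeoutMs);

        verifyFn(function (err) {                   // verifyFn(callback) is the stand-in verify call
          if (timedOut) { return; }                 // ignore a response that arrives after we gave up
          clearTimeout(timer);
          done(err || null);
        });
      }

      tryOnce();
    }
    ```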
    --John
  • Andy Registered Users Posts: 50,016 Major grins
    edited May 14, 2012
    jfriend wrote: »
    You can probably reproduce the dropped connection by running a laptop on battery and then just turning off a home router in the middle of an upload, leave it off for 2-3 minutes, then turn it back on and see if it starts uploading again in a timely manner. Since I was running a battery powered laptop, the power outage that I experienced caused the router and WiFi connection to lose power, thus taking down the home network, but not taking down the computer.

    It's hard to know exactly what circumstance triggered the duplicates problem. I would theorize that a smart developer doing a thorough code review of the parts of the code that handle duplicates could probably identify several likely causes in a few hours of code inspection, and could likely find issues more thoroughly and more quickly than someone trying to reproduce the issue with no inspection of the code. Said another way, some issues are fixed much more effectively via whitebox code inspection/review rather than blackbox testing. There are bugs worth filing for which you don't have a reproducible case, where you challenge the developer to figure out how the code could fail by examining the code, design a test case that exploits that weakness, and then fix it.

    It's also hard to know exactly what circumstance triggered the images that got stuck on the "verifying" step. It could have been a momentary connection issue where something got lost or it could have been a hiccup on the upload server that failed to send it or it could have been some sort of failure to see the right event in the client. Again, code review on the client could identify places where the client isn't properly protected against any of these issues or doesn't have a fallback code path if the verifying event is never received.
    Thanks John - will make sure our QA guys see this.
  • SamirD Registered Users Posts: 3,474 Major grins
    edited May 15, 2012
    jfriend wrote: »
    FYI, to anyone else following this thread. I ended up writing a script that uses the Smugmug API to find and remove the duplicates.
    I wish I would've seen this thread sooner. :( I paid someone to write a script to do this over a year ago and would have let you use it. It's probably still a bit buggy, but has saved my butt in the past:
    http://www.huntsvillecarscene.com/smug/duplicates.php

    If you don't mind sharing your script, I think it would help a lot of us. A fellow SM'r local to me called me up after something similar happened to him on a 1600+ image gallery.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 15, 2012
    SamirD wrote: »
    I wish I would've seen this thread sooner. :( I paid someone to write a script to do this over a year ago and would have let you use it. It's probably still a bit buggy, but has saved my butt in the past:
    http://www.huntsvillecarscene.com/smug/duplicates.php

    If you don't mind sharing your script, I think it would help a lot of us. A fellow SM'r local to me called me up after something similar happened to him on a 1600+ image gallery.
    Samir, my script isn't made for general consumption. It's wired to my particular gallery and has all sorts of debugging for me to make sure I was being safe about what was getting deleted.
    --John
  • SamirD Registered Users Posts: 3,474 Major grins
    edited May 18, 2012
    jfriend wrote: »
    Samir, my script isn't made for general consumption. It's wired to my particular gallery and has all sorts of debugging for me to make sure I was being safe about what was getting deleted.
    Ahh, that makes sense. What was your algorithm? I think mine was to enumerate the image tree and compare the filename of the index with the previous item--if they were the same, show both as a duplicate. Did you find a better or more complete way?
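
    In sketch form, the adjacent-compare idea is roughly this (JavaScript just for illustration; my actual tool is a PHP page, and the names here are made up):

    ```javascript
    // Illustrative sketch of the adjacent-compare idea: sort by filename,
    // then flag any name equal to the one right before it.
    function findDupsBySort(filenames) {
      var sorted = filenames.slice().sort();
      var dups = [];
      for (var i = 1; i < sorted.length; i++) {
        if (sorted[i] === sorted[i - 1]) {
          dups.push(sorted[i]);   // same name as the previous entry -> duplicate
        }
      }
      return dups;
    }
    ```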

    And just a word of thanks to the guys at Smugmug for the uploaders.

    I had no idea how good they were until I tried to upload a couple of thousand thumbnails totaling just 50MB to Facebook. It's been over 12hrs and they're still not uploaded. Duplicates? Yes. Missing ones? Yes. It's a complete nightmare. Facebook is the most alpha-level website in production I've ever seen. I'm not looking forward to using it at all.
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 18, 2012
    SamirD wrote: »
    Ahh, that makes sense. What was your algorithm? I think mine was to enumerate the image tree and compare the filename of the index with the previous item--if they were the same, show both as a duplicate. Did you find a better or more complete way?

    And just a word of thanks to the guys at Smugmug for the uploaders.

    I had no idea how good they were until I tried to upload a couple of thousand thumbnails totaling just 50MB to Facebook. It's been over 12hrs and they're still not uploaded. Duplicates? Yes. Missing ones? Yes. It's a complete nightmare. Facebook is the most alpha-level website in production I've ever seen. I'm not looking forward to using it at all.
    My general approach was to use the API to get a list of all image filenames in the gallery and one by one put them into a javascript object (used like a hashtable). When I get to the next filename, I see if it's already in the object. If it is, then this one is a dup and I add it to the dups array.

    When I'm done, I loop through the dups array and remove each of those images using the API.

    Note: this algorithm assumes that different images will never have the same filename. I know that is true for my images because I have a date/time code in my filenames along with the original camera-generated file number, but unique filenames are not necessarily true for other people's images, so a more careful algorithm would probably also check the image size and perhaps the last modified time.
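
    In rough sketch form, the core of it looks something like this. The fetch and delete calls and the field names below are placeholders rather than the actual SmugMug API method names; only the dedup logic itself is shown.

    ```javascript
    // Sketch of the approach described above. fetchGalleryImages() and
    // deleteImage() stand in for the SmugMug API calls; "FileName" and "id"
    // are illustrative field names, not necessarily what the API returns.
    function findDuplicates(images) {
      var seen = {};                  // filename -> true, used like a hashtable
      var dups = [];
      images.forEach(function (img) {
        if (seen[img.FileName]) {
          dups.push(img);             // filename already seen, so this one is a dup
        } else {
          seen[img.FileName] = true;
        }
      });
      return dups;
    }

    // Usage sketch - assumes unique filenames identify unique images (true for
    // my timestamp-renamed files; a generic tool should also compare size/date):
    // fetchGalleryImages(albumId, function (images) {
    //   findDuplicates(images).forEach(function (dup) {
    //     deleteImage(dup.id);       // placeholder for the API delete call
    //   });
    // });
    ```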
    --John
  • SamirD Registered Users Posts: 3,474 Major grins
    edited May 19, 2012
    I also assumed that different images won't have the same filename. But duplicate files, whatever they may be, will by their very nature have the same file name. So is our logic that far off?
  • jfriend Registered Users Posts: 8,097 Major grins
    edited May 19, 2012
    SamirD wrote: »
    I also assumed that different images won't have the same filename. But duplicate files, whatever they may be, will by their very nature have the same file name. So is our logic that far off?
    True dups will have the same filename, but if you're shooting with more than one body or combining photos from more than one photographer, and you don't rename your files or make sure they each have unique file numbers/names, then it's possible to have non-dups with the same filename. That was my point. I rename all camera-generated filenames to have a timestamp in the filename to avoid this ever happening in my workflow. A generic tool meant for anyone to use could not make such assumptions about the workflow, so it would have to be more careful about what was really a dup. That's one of the differences in the amount of work required to develop robust software for anyone to use vs. developing software for one particular isolated use.
    --John
  • Nikolai Registered Users Posts: 19,035 Major grins
    edited May 19, 2012
    I know it's probably too late and not very interesting for this thread's primary posters, but I just deployed a new version of S*E that has image duplicate removal functionality, available to S*E Pro users via the Albums4 Pro package, and to Pro+/Studio users as is.

    It takes into account image file name, file size, image dimensions and original date, if available.

    You can activate it from the context menu of both albums list and Category/Subcategory tree (so you can check the entire Category for dups:-)

    FWIW, it was this thread that brought this problem to my attention, so hopefully somebody else will not suffer through what John and Samir did.

    HTH
    Nikolai
    "May the f/stop be with you!"
  • SamirD Registered Users Posts: 3,474 Major grins
    edited May 21, 2012
    jfriend wrote: »
    True dups will have the same filename, but if you're shooting with more than one body or combining photos from more than one photographer, and you don't rename your files or make sure they each have unique file numbers/names, then it's possible to have non-dups with the same filename. That was my point. I rename all camera-generated filenames to have a timestamp in the filename to avoid this ever happening in my workflow. A generic tool meant for anyone to use could not make such assumptions about the workflow, so it would have to be more careful about what was really a dup. That's one of the differences in the amount of work required to develop robust software for anyone to use vs. developing software for one particular isolated use.
    That's why in my implementation, I would show the thumbnail of each original duplicate in case they were actually different files. The final decision would be made by the user as to what gets deleted.

    But you're absolutely right about development of tools like this. Lots of edge cases when you expand the use to many users compared to individual use.
    Nikolai wrote: »
    I know it's probably too late and not very interesting for this thread's primary posters, but I just deployed a new version of S*E that has image duplicate removal functionality, available to S*E Pro users via the Albums4 Pro package, and to Pro+/Studio users as is.

    It takes into account image file name, file size, image dimensions and original date, if available.

    You can activate it from the context menu of both albums list and Category/Subcategory tree (so you can check the entire Category for dups:-)

    FWIW, it was this thread that brought this problem to my attention, so hopefully somebody else will not suffer through what John and Samir did.

    HTH
    Nikolai
    Thank you very much for the update, Nikolai! It's been a while since I've visited SE, and now that I've got some newer computers, I'll have to take a look at it again.
  • f300 Registered Users Posts: 7 Beginner grinner
    edited August 13, 2012
    Gosh, it's killing me, this "Skip Duplicates" (mis)feature just doesn't work reliably. I have a gallery on the largish side (1600 pics or so), and first of all, no matter what I do, once I select and drop all the pictures it says it ignores 1000 of them and uploads the others. Fine. I try to upload them in batches of 100. Some work (as in you upload them, and when you try to upload them again they're skipped). Some don't. You can drop them again and again and some 60-70 pics will reupload.
    This is really getting way beyond annoying.
  • sujit1779 Registered Users Posts: 46 Big grins
    edited August 13, 2012
    f300 wrote: »
    Gosh, it's killing me, this "Skip Duplicates" (mis)feature just doesn't work reliably. I have a gallery on the largish side (1600 pics or so), and first of all, no matter what I do, once I select and drop all the pictures it says it ignores 1000 of them and uploads the others. Fine. I try to upload them in batches of 100. Some work (as in you upload them, and when you try to upload them again they're skipped). Some don't. You can drop them again and again and some 60-70 pics will reupload.
    This is really getting way beyond annoying.

    Hey, I don't know if it's OK to write here. But we have a product, PicBackMan, which helps you upload thousands of pictures without duplicates. You can download it at www.picbackman.com

    Sujit
    Disclosure: I am involved with PicBackMan as a developer.
  • Nikolai Registered Users Posts: 19,035 Major grins
    edited August 14, 2012
    f300 wrote: »
    Gosh, it's killing me, this "Skip Duplicates" (mis)feature just doesn't work reliably. I have a gallery on the largish side (1600 pics or so), and first of all, no matter what I do, once I select and drop all the pictures it says it ignores 1000 of them and uploads the others. Fine. I try to upload them in batches of 100. Some work (as in you upload them, and when you try to upload them again they're skipped). Some don't. You can drop them again and again and some 60-70 pics will reupload.
    This is really getting way beyond annoying.

    What uploader are you using?
    "May the f/stop be with you!"
  • f300 Registered Users Posts: 7 Beginner grinner
    edited August 14, 2012
    I'm using the "default web-based one", which I think is the HTML5 one. Tried Firefox, Chrome, and also changing computers. And it's not like it happened once because "I was too fast"; this has been ongoing for more than a week already. I don't use Smugmug much (I think these are the first uploads I've tried to do this year) and usually I would just nuke the gallery and reupload, but now it's a big gallery and I'm not sure I can reupload "in one go". Also, I don't fancy splitting it into 5-10 batches and then uploading and checking each batch.

    If only 1000 pics are read from the gallery and checked against the uploaded pics, that's fine, but I need to know it.

    Are the other uploaders checking for duplicates (I guess I'll just try for myself anyway)?
  • rainforest1155 Registered Users Posts: 4,566 Major grins
    edited August 14, 2012
    Can you try the Simple uploader instead of the default html5 uploader to see if that works better in your case? The Simple uploader also has duplicate detection.

    Note that the duplicate detection of our web uploaders only works for photos already in the gallery when the uploader is opened up. So to have it consider photos you just uploaded, go to the gallery and open the uploader again.
    It won't take any photos into account that you just uploaded with the same or any other window.
    Sebastian
    SmugMug Support Hero
  • Nikolai Registered Users Posts: 19,035 Major grins
    edited August 14, 2012
    f300 wrote: »
    I'm using the "default web-based one", which I think is the HTML5 one. Tried Firefox, Chrome, and also changing computers. And it's not like it happened once because "I was too fast"; this has been ongoing for more than a week already. I don't use Smugmug much (I think these are the first uploads I've tried to do this year) and usually I would just nuke the gallery and reupload, but now it's a big gallery and I'm not sure I can reupload "in one go". Also, I don't fancy splitting it into 5-10 batches and then uploading and checking each batch.

    If only 1000 pics are read from the gallery and checked against the uploaded pics, that's fine, but I need to know it.

    Are the other uploaders checking for duplicates (I guess I'll just try for myself anyway)?

    My Star*Explorer (http://www.starexplorer.com) does that and then some... And you don't have to break it down into batches of 100... :-) It has a free 30-day trial, so as long as you're on Windows, you may try it and see if it meets your needs...
    "May the f/stop be with you!"
  • f300 Registered Users Posts: 7 Beginner grinner
    edited August 14, 2012
    Yeah, thanks. Actually, I even went and tried your SE about a week ago, but it said the trial had expired (it definitely wasn't installed on this box at the time, but I think I tried it once years ago; sorry, I don't remember why I wasn't impressed...). FYI, maybe if you had automatic order processing I might have bought it on the spot, just to forget about this duplicate mess :-)
    But really for my limited use it is too much, in more ways than one.

    I'm testing the "simple" (Java) uploader now. It is much slower (more than one second per duplicate skipped, on a quad core no less...), but that isn't particularly bad. What's worse is that the first time it crashed; then I checked my Java version and it seemed to be behind (and it couldn't update automatically). So I updated Java and now it seems to be even slower (and it crashed just when starting the first time).

    Not giving up yet; if I find one way that works, it's fine (I really don't want much, just basic functionality), but if not, next year I'll have to move on from Smugmug (oh, and get something with real nested categories support; that would be a breath of fresh air).
  • rainforest1155 Registered Users Posts: 4,566 Major grins
    edited August 15, 2012
    What browser and operating system do you use the Simple uploader on? I haven't used it extensively in a while, but in the past I used it a lot and never had problems with it crashing on Firefox.
    Sebastian
    SmugMug Support Hero
  • f300 Registered Users Posts: 7 Beginner grinner
    edited August 15, 2012
    XP and the latest stable firefox.
    I wouldn't worry about this part (crashing) yet.
    Last try yesterday it worked for hours (I said it takes more than one second to skip one duplicate - it's in fact way more than 1s). It did manage to skip everything fine (at least at first sight) yesterday but then I didn't have time to let it upload completely. Hopefully it will run through today and anyway even if it does I plan to run it once more to see if it skips everything as expected. I'll report back in any case.

    Can somebody with a largish (1500+ pics) gallery and access to the original folder make a test in the meantime? Just drag+drop all files in the HTML5 uploader and see if it says (for example) 1600 duplicates or it starts again uploading something like 600 pics (if it starts uploading you can cancel it fast enough before it uploads anything, just keep an eye on it and kill it so you don't end up with duplicates as well).
  • SamirD Registered Users Posts: 3,474 Major grins
    edited August 24, 2012
    f300 wrote: »
    Gosh, it's killing me, this "Skip Duplicates" (mis)feature just doesn't work reliably. I have a gallery on the largish side (1600 pics or so), and first of all, no matter what I do, once I select and drop all the pictures it says it ignores 1000 of them and uploads the others. Fine. I try to upload them in batches of 100. Some work (as in you upload them, and when you try to upload them again they're skipped). Some don't. You can drop them again and again and some 60-70 pics will reupload.
    This is really getting way beyond annoying.
    I used to run into this problem a lot and developed a small 'duplicate detector' that I can share with you if you want to try it on your gallery; just PM me. It will help find the duplicates and give you a link to them so you can delete them.
    f300 wrote: »
    Can somebody with a largish (1500+ pics) gallery and access to the original folder make a test in the meantime? Just drag+drop all files in the HTML5 uploader and see if it says (for example) 1600 duplicates or it starts again uploading something like 600 pics (if it starts uploading you can cancel it fast enough before it uploads anything, just keep an eye on it and kill it so you don't end up with duplicates as well).
    I regularly upload large batches like this and use the Simple uploader without any issues. I have 5Mb upload bandwidth, so that may make a difference, as I don't know how the SM uploaders work on less bandwidth. And skipping can take forever, especially on large galleries. Videos make that even worse.

    I always have to check the number of files uploaded though as the uploader can miss some or upload dupes. Finding a single dupe in a batch of 2000 is a real pain, so I developed a tool that can help me find it in a second. I used to use a technique that used the uploader, but that just takes waaaay too long on large galleries.