Summary of this very technical post:
This post records various experiments in bulk loading the large number of photos and accompanying data for the Artefacts. Some of the issues concern cleaning up the data from the Artefacts database. Others concern uploading a reasonable number of photos at a time while avoiding timeouts. At the time of writing, the first 150 artefacts out of about 700 have been uploaded successfully with their data.
Importing artefact photos:
Imported Access file into Excel
Removed all HTML tags and entities (search for < and &)
Filled in blanks on Description column
Not sure how it will handle .doc files in the image column!
Expand multiple filenames manually
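The clean-up steps above were done by hand in Excel. As a rough illustration only, the same steps could be scripted along these lines. The column names "filename" and "description", the semicolon separator between multiple filenames, the placeholder text for blank descriptions and the sample filenames are all assumptions for the sketch, not details taken from the real spreadsheet.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <p> or </b>.
TAG_RE = re.compile(r"<[^>]+>")


def strip_html(text):
    """Remove HTML tags first, then decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub("", text)).strip()


def expand_rows(rows, separator=";"):
    """Yield one row per filename, copying the other columns.

    The separator is an assumption; use whatever the image column contains.
    """
    for row in rows:
        for name in row["filename"].split(separator):
            name = name.strip()
            if name:
                yield {**row, "filename": name}


def clean_rows(rows, placeholder="No description"):
    """Expand multi-image rows, strip HTML and fill blank descriptions."""
    cleaned = []
    for row in expand_rows(rows):
        description = strip_html(row.get("description", ""))
        cleaned.append({**row, "description": description or placeholder})
    return cleaned


if __name__ == "__main__":
    sample = [
        {"filename": "SWEHS000001.jpg; SWEHS000001a.JPG",
         "description": "<p>Flint &amp; bone</p>"},
        {"filename": "SWEHS000002.jpg", "description": ""},
    ]
    for row in clean_rows(sample):
        print(row)
```

Tags are removed before entities are decoded, so an encoded "&lt;" in a description is not mistaken for the start of a tag.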
Plugin used is WP Photo Album, website http://wppa.nl/
Custom Datafields needed (set in the album in Table II J-10) are:
copyright (not populated for artefacts)
Noticed both JPG and jpg (different cases in the suffix). I don’t think this matters as long as the filename in the data matches the actual filename.
Took everything up to image 70, with some “multi-image rows” left in for test load
Zipped the photos and the CSV file into a single zip file.
Attempted to upload zip file testload04082016.zip (via Photo Albums / Upload Photos / Box C)
Error 405 – not allowed. Assumed that this was because I had more than 20 photos.
Split into 4 parts, 20 – 20 – 20 – ***30*** photos
It imported all four files (even the 30 photo one) and didn’t overwrite duplicate photos already in the Album (good), but it didn’t import the descriptions for some photos on some of the “multi-image rows” (not what I hoped, but it was worth trying!). Will need to reload these as single lines.
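Splitting the photos into several smaller zip files was done manually here. A minimal sketch of automating it is below. The batch size of 20, the archive names, the list of image suffixes and the optional csv_path argument are assumptions for illustration. The 30 photo file also imported, so 20 is not a confirmed limit of the plugin.

```python
import zipfile
from pathlib import Path

# Suffixes treated as photos; compared in lower case so JPG and jpg both match.
PHOTO_SUFFIXES = {".jpg", ".jpeg", ".png"}


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_batches(photo_dir, out_dir, batch_size=20, csv_path=None):
    """Write batch01.zip, batch02.zip, ... each holding up to batch_size photos.

    If csv_path is given, the same CSV is copied into every archive
    (an assumption; the CSV can also be uploaded on its own).
    """
    photos = sorted(
        p for p in Path(photo_dir).iterdir()
        if p.is_file() and p.suffix.lower() in PHOTO_SUFFIXES
    )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for number, batch in enumerate(chunk(photos, batch_size), start=1):
        archive = out_dir / f"batch{number:02d}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for photo in batch:
                # Store only the bare filename so the zip has no subfolders.
                zf.write(photo, arcname=photo.name)
            if csv_path is not None:
                zf.write(csv_path, arcname=Path(csv_path).name)
        archives.append(archive)
    return archives
```

Storing bare filenames keeps the archive flat, which avoids nesting one folder or zip inside another.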
Tried separating lines out
Got Invalid header. First item must be ‘name’, ‘photoname’ or ‘filename’ (but I didn’t zip the file!)
Moved filename to column 1, put into zip file. Tried again, also with a photo included in zip file, but that didn’t work.
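The “Invalid header” message says the first item must be ‘name’, ‘photoname’ or ‘filename’. Moving that column to the front could be done with a small script rather than in Excel. This is only a sketch: matching the header case-insensitively and writing it back in lower case are assumptions, since the plugin’s exact rules are not stated in the error message.

```python
import csv
import io

# Header names quoted in the plugin's "Invalid header" message.
ACCEPTED_FIRST = ("name", "photoname", "filename")


def move_key_column_first(csv_text):
    """Return the CSV text with the name/photoname/filename column first."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("CSV is empty")
    header = [h.strip() for h in rows[0]]
    key_index = next(
        (i for i, h in enumerate(header) if h.lower() in ACCEPTED_FIRST),
        None,
    )
    if key_index is None:
        raise ValueError("No 'name', 'photoname' or 'filename' column found")
    # Assumption: write the key header in lower case to match the message.
    header[key_index] = header[key_index].lower()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in [header] + rows[1:]:
        writer.writerow([row[key_index]] + row[:key_index] + row[key_index + 1:])
    return out.getvalue()


if __name__ == "__main__":
    print(move_key_column_first("Description,Filename\nAxe head,SWEHS000001.jpg\n"))
```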
Couldn’t delete superfluous files from “depot” either.
Uploaded the “separate lines” CSV to depot using Filezilla.
Deleted all Artefact photos.
Put all 88 photos up to no. 70 into a new zip file and uploaded, but it only imported one photo. Probably uploaded the wrong file.
Tried again, got 405 not allowed – timed out?
Used Filezilla to upload zip file to depot
“File test artefact load 04082016 no 2/SWEHS000057.zip is of an unsupported filetype and has been ignored during extraction” – not surprising, it was a zip file, but otherwise extracted 88 files. “1 Zipfiles extracted. 1 CSVs imported, 0 photos processed. 34 photos skipped.”
Used Filezilla to upload a new version of the zip file of photos to depot.
87 files imported, 1 already there.
Photos imported, but not the data.
Deleted all the photos again.
Put photos and csv file in a single zip.
Used Filezilla to move new zip file to depot.
Ran Import again.
Again, photos loaded but no data.
CSV file copied to depot, Import run.
Invalid header. First item must be ‘name’, ‘photoname’ or ‘filename’ – I’d imported the wrong one.
Right CSV file copied to depot, Import run.
Photos, and most – but not all – have data!
Need to work out which ones out of the following don’t have data, and why…
The above are not in the data extract from the database, so not surprising that no data is uploaded.
These don’t have enough leading zeroes in the filename. The filenames in the data are correct (checked against the database), but there are duplicate image files with one fewer zero! Solution – entries deleted from WordPress album.
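Both problems found here (photos with no row in the data extract, and duplicate files that differ only in the number of leading zeroes) could be spotted before uploading. The sketch below assumes filenames consist of a letter prefix followed by a number, as in SWEHS000057. The sample names and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Letter prefix, then a run of digits, then the rest (suffix such as .jpg).
NAME_RE = re.compile(r"^([A-Za-z]+)(\d+)(.*)$")


def padding_key(filename):
    """Return the filename with leading zeroes removed from its number."""
    match = NAME_RE.match(filename)
    if not match:
        return filename
    prefix, digits, rest = match.groups()
    return f"{prefix}{int(digits)}{rest}"


def reconcile(csv_names, photo_names):
    """Return (photos with no CSV row, groups differing only in zero padding)."""
    csv_set = set(csv_names)
    without_data = sorted(n for n in photo_names if n not in csv_set)
    groups = defaultdict(list)
    for name in photo_names:
        groups[padding_key(name)].append(name)
    padding_duplicates = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    return without_data, padding_duplicates


if __name__ == "__main__":
    missing, duplicates = reconcile(
        ["SWEHS000057.jpg"],
        ["SWEHS000057.jpg", "SWEHS00057.jpg", "SWEHS000099.jpg"],
    )
    print("No data row:", missing)
    print("Zero-padding duplicates:", duplicates)
```

The comparison is exact on case, so a file named with JPG and a data row using jpg would be reported as having no data row.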
Converted all artefact database to CSV and removed all except required columns – file: “artefacts 20160807.csv”
Expanded multi-image lines from 71 to 150.
Removed a couple of duplicate files with DC in filename.
Put images 71 to 150 and CSV into zip file: “artefact upload 71 to 150.zip”
Transferred zip file to Depot “/firstname.lastname@example.org/” using Filezilla.
Removed old CSV file from Depot.
Photos / Import
It extracted the photos, showing images.
Clicked Import. “Time out. 76 photos imported. Please restart this operation.”
50 photos left in Depot. Clicked Import again. “50 single photos imported”
Album Admin shows no data imported for the new photos.
Copied master CSV file “artefacts 20160807.csv” to Depot using Filezilla.
Imported – “1 CSVs imported, 193 photos processed. 1863 photos skipped”
All photos appear to have data attached, no duplicates – success!