Posted 1st August 2023
Restoring a large number of files out of Glacier on Windows is difficult. There are examples on Stack Overflow that work in a Unix environment with awk, but that is no good on Windows, which does not have awk. The official AWS advice is to program your own bulk operation, and gives no examples!
There is one suggestion from AWS: use the s3api list-objects call to list all Glacier objects and feed the result into a restore-object command. The problem is that s3api is restricted to returning a maximum of 1,000 results per call, so multiple calls need to be made to get a complete list. Note that if you are using several different types of Glacier storage you will need to run the command filtered for each storage class.
I have an easier, faster solution!
For simple step-by-step instructions scroll down. For an explanation of each step and the reasons why read on.
The first step I took was to get a list of the files that I could not download, using the aws s3 sync command. The reason I use this command is that it downloads everything it can while reporting an error for every Glacier object it has to skip, which gives me my file list for free.
My command was simply:
aws s3 sync s3://bucketname/folderpath/ .
The output from this command will be a list of errors, every file that is stored as Glacier will give an error like this:
warning: Skipping file s3://bucket/path/filename.txt. Object is of storage class GLACIER. Unable to perform download operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 download help for additional parameter options to ignore or force these transfers.
I then pasted the list of thousands of files into Notepad and used find/replace to convert it into a list of restore-object commands.
You should then have a long list of commands looking something like this:
aws s3api restore-object --bucket bucketname --restore-request Days=25,GlacierJobParameters={"Tier"="Bulk"} --key path/filename1.txt --output text
aws s3api restore-object --bucket bucketname --restore-request Days=25,GlacierJobParameters={"Tier"="Bulk"} --key path/filename2.txt --output text
aws s3api restore-object --bucket bucketname --restore-request Days=25,GlacierJobParameters={"Tier"="Bulk"} --key path/filename3.txt --output text
...
Where the --key parameter is the path and filename from the warning output above.
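If you would rather script the conversion than do find/replace in Notepad, the same transformation can be done in a few lines of Python. This is a sketch, not part of my original method: the function name is my own, and you would substitute your bucket name, restore window, and tier.

```python
import re

def warnings_to_restore_commands(warning_text, bucket, days=25, tier="Bulk"):
    """Convert 'aws s3 sync' GLACIER warnings into restore-object commands.

    Each warning line looks like:
    warning: Skipping file s3://bucket/path/file.txt. Object is of storage
    class GLACIER. ...
    """
    commands = []
    for line in warning_text.splitlines():
        # Capture the object key between the bucket name and the trailing
        # ". Object is of storage class GLACIER" message.
        m = re.match(
            r"warning: Skipping file s3://[^/]+/(.+?)\. "
            r"Object is of storage class GLACIER",
            line,
        )
        if m:
            key = m.group(1)
            commands.append(
                f"aws s3api restore-object --bucket {bucket} "
                f'--restore-request Days={days},GlacierJobParameters={{"Tier"="{tier}"}} '
                f"--key {key} --output text"
            )
    return commands
```

Paste the warning output into a text file, read it in, and write the returned commands out as your batch file.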
Save your result as a batch file, double-click it, and hey presto, you are done! Your files are now being restored. To check the status of a restoration you can call this command:
aws s3api head-object --bucket bucketname --key path/filename.txt --output text
The result will let you know whether the file is still being restored: the Restore value shows ongoing-request="true" while the restore is in progress, and once it changes to "false" the restore is complete.
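If you are checking many files, the Restore value can be interpreted with a small helper. This is a sketch of my own (the function name is hypothetical); with boto3, the same string comes back as the "Restore" key of the head_object response, and the key is absent when no restore has been requested.

```python
def restore_status(restore_header):
    """Interpret the Restore value returned by a head-object call.

    Returns one of:
      'not-requested' - no restore has been requested (header absent)
      'in-progress'   - restore is still running
      'restored'      - restore finished; the object is downloadable
    """
    if restore_header is None:
        # head-object returns no Restore value for an unrestored object
        return "not-requested"
    if 'ongoing-request="true"' in restore_header:
        return "in-progress"
    # ongoing-request="false" (usually with an expiry-date) means done
    return "restored"
```

Loop over your key list, call head-object for each, and feed the Restore value through this helper to see which files are ready.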
Once the restore is completed then run this command:
aws s3 sync s3://bucketname/path/ . --force-glacier-transfer
The --force-glacier-transfer flag is required because sync otherwise skips objects whose storage class is GLACIER, even after they have been restored. Your job is then complete! No complex programming or 3rd party software required.