Wednesday, July 19, 2017

PowerShell and file detection in S3 buckets using Get-S3Object

I am currently working on a project where I need to load data into an Amazon S3 bucket. The data comes from various on-premises sources and eventually feeds into Gainsight.  If the data load process fails, I need to check whether there are any files in the configured "error" folder.  No files, no problem.  When I tried to do this check with PowerShell I ran into a couple of snags, and I wanted to share the pain points here.
First, install the AWS Tools for Windows PowerShell from https://aws.amazon.com/powershell/
Follow the directions exactly to avoid errors and headaches.  Once the tools are installed, the task of checking for files can begin.

Import-Module AWSPowerShell
$bucket = "xxxx"
$path = "sample/input/error"
$AKey="xxxx"
$SKey="xxxx"
$region = "xxxx"

# -StoreAs saves the keys as the named profile that Initialize-AWSDefaults loads on the next line
Set-AWSCredentials -AccessKey $AKey -SecretKey $SKey -StoreAs For_Move
Initialize-AWSDefaults -ProfileName For_Move -Region $region

$objects = Get-S3Object -BucketName $bucket -Key $path

foreach ($object in $objects)
{
   # The listing includes an entry for the path itself as well as the files in it.
   # Strip the known path from each key (escaped, because -replace takes a regex)
   # and trim the leading slash; the path entry is left empty and can be skipped.
   $localfilename = ($object.Key -replace [regex]::Escape($path), '').TrimStart('/')
   if ($localfilename -ne '')
   {
        Write-Host $localfilename
   }
}

Import-Module AWSPowerShell - I put this in to make sure the references are loaded correctly
$bucket = "xxxx" - the name of the S3 bucket you are working with
$path = "sample/input/error" - the path in that bucket that you want to watch
$AKey="xxxx" - your Amazon access key
$SKey="xxxx" - your secret key

Make sure you know your region. You can specify it in the Get-S3Object call, or you can set it once as a default with Initialize-AWSDefaults, as the script above does.
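
As a small sketch of specifying the region on the call: the AWS cmdlets accept a -Region parameter, so the region can be passed to Get-S3Object directly. The variables are the same placeholders used in the script above, and the call needs valid credentials to return anything.

```powershell
# Pass the region on the call itself instead of relying on a stored default.
$objects = Get-S3Object -BucketName $bucket -Key $path -Region $region
```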

The most important thing here is that Get-S3Object does just what its name says: it returns an object, and that object may or may not be a file.  Interestingly, when you list the objects under a path, the path entry itself comes back as an object as well.  So if I have one file in that path I will see two objects: the first is the path, the second is the file.  So how do you get rid of the path?  I take $object.Key and replace the known path string with something that is easy to find, in this case an empty string ($object.Key -replace $path,'').  Then, when comparing or looping, you can exclude anything that comes back as that empty string.
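
To try the filtering idea without touching a real bucket, here is a minimal sketch that runs against a hard-coded list of keys. Get-FileNamesUnderPath is a hypothetical helper name, not part of the AWS module, and the sample keys (including failed_load.csv) are made up for illustration. It assumes the path entry comes back with a trailing slash; if yours does not, the same pattern still reduces it to an empty string.

```powershell
# Hypothetical helper (not an AWS cmdlet): takes S3 keys and a folder path
# and returns the file names found under that path.
function Get-FileNamesUnderPath {
    param(
        [string[]] $Keys,
        [string]   $Path
    )

    # -replace treats its pattern as a regular expression, so escape the path
    # and anchor it to the start of the key. The optional slash removes the
    # separator between the path and the file name.
    $pattern = '^' + [regex]::Escape($Path.TrimEnd('/')) + '/?'

    foreach ($key in $Keys) {
        $name = $key -replace $pattern, ''
        # The path entry itself collapses to an empty string; skip it.
        if ($name -ne '') { $name }
    }
}

# Made-up keys standing in for $objects.Key from a real Get-S3Object call.
$sampleKeys = @(
    'sample/input/error/',
    'sample/input/error/failed_load.csv'
)

$errorFiles = @(Get-FileNamesUnderPath -Keys $sampleKeys -Path 'sample/input/error')

if ($errorFiles.Count -gt 0) {
    Write-Host "Found $($errorFiles.Count) error file(s): $($errorFiles -join ', ')"
} else {
    Write-Host 'No files, no problem.'
}
```

Wrapping the result in @( ) keeps it an array even when only one name (or none) comes back, so the Count check behaves the same in every case.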