Import files from Google Drive to S3

Preparing to import the file from Google Drive

  • Open the file you would like to import
  • Ensure that all columns have headers. Columns without headers will be lost
  • Click Share in the top right corner of the sheet
  • If the document is unnamed, name it
  • Paste the service account email address you have been provided into the email box
  • Ensure the suggested email matches the service account email and select it
  • In the new window, open the dropdown on the right-hand side and select Viewer
  • Uncheck the Notify people checkbox
  • Click Share
  • You will be asked to confirm sharing outside the organisation; click Share anyway
  • Your file is now available for import

Getting the file details

  • You will need to obtain the document ID from the URL

  • The document ID is the portion of the URL between https://docs.google.com/file/d/ and /edit#gid=0. See the example below

    [Example image: the file ID highlighted within the document URL]
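
    As an illustration (the placeholder below is not a real ID), the URL has this structure:

      https://docs.google.com/file/d/<FILE_ID>/edit#gid=0

    where <FILE_ID> is the portion you need to copy.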

Setting up the copier lambda

  • Before setting up the copier lambda, ensure that the relevant department configuration for that account is set up in AWS

    • see the Adding a department section in managing-departments.md
  • Open the Data Platform Project. To view this project you'll need a GitHub account (which you can create yourself using your Hackney email) and to have been added to the 'LBHackney-IT' team (request this from Rashmi Shetty). If you don't have the correct permissions, you'll get a '404' error.

  • Navigate to the main terraform directory (data-platform/terraform)

  • Open the 65-g-drive-to-s3 terraform file

  • Switch to 'edit mode' (using the edit button in the top right)

  • Copy one of the existing module blocks, paste it at the bottom of the file and update the following fields (an illustrative example follows this list):

    • module "your-unique-module-name" (the label of the module block; it is helpful to keep the same naming convention as your dataset/folder)
    • lambda_name = "Your lambda name" (this is what you'll see in the AWS Lambda console; it can be the same as your module name)
    • file_id = "Your document ID - see the Getting the file details section above"
    • file_name = "The name of the file you are importing, including the file extension and using underscores instead of spaces"
    • service_area = "The name of the service area folder you would like to store the file in, e.g. housing or social-care" (if this folder doesn't already exist in S3 you can name it here and this script will create it)
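
    A minimal sketch of what the finished block might look like. The module source path, the example names and the file details below are illustrative assumptions - copy a real module block from 65-g-drive-to-s3 rather than this sketch, and keep any other fields that block already sets:

      # Illustrative only: the module name, source path and file details are placeholders
      module "housing-repairs-gdrive-import" {
        source = "../modules/g-drive-to-s3"          # assumed path; match the existing modules

        lambda_name  = "housing-repairs-gdrive-import"
        file_id      = "your-google-document-id"     # see Getting the file details above
        file_name    = "housing_repairs_2021.csv"    # extension included, underscores not spaces
        service_area = "housing"                     # S3 folder; created if it doesn't exist yet
      }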
  • Committing your changes: The Data Platform team needs to approve any changes to the code, so your change won't happen automatically. To submit your change:

    • Provide a description to explain what you've changed
    • Select the option to create a new branch for this commit (i.e. the code you've changed). You can just use the suggested name for your branch.
    • Once you click 'Propose changes' you'll have the opportunity to add even more detail if needed before submitting it for review.
    • You'll receive an email to confirm that your changes have been approved.