The purpose of this assignment was to apply cloud ETL skills to big data from two of Amazon’s publicly available product review datasets. The goal was to perform the ETL process entirely in the cloud and upload a DataFrame to an RDS instance.
This project required the use of Amazon Web Services (AWS), Relational Database Service (RDS), pgAdmin, PySpark, and Google Colab.
We started by creating an AWS RDS instance to connect to from pgAdmin.

We then registered our AWS RDS server in pgAdmin, which displays the databases for both datasets.



Before loading, we created the schema for our target tables in pgAdmin.

Once the schema was created, we moved on to the loading process.
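
As a rough illustration, the load step might look like the sketch below, assuming the PySpark DataFrame is written over JDBC to a PostgreSQL RDS instance. The helper names (`build_jdbc_config`, `load_to_rds`), the endpoint, the database, the table name, and the credentials are all placeholders for illustration, not the values used in this project.

```python
def build_jdbc_config(host, database, user, password, port=5432):
    """Return the JDBC URL and connection properties for a PostgreSQL RDS instance."""
    url = f"jdbc:postgresql://{host}:{port}/{database}"
    properties = {
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }
    return url, properties


def load_to_rds(df, table, url, properties, mode="append"):
    """Write a PySpark DataFrame to an existing table over JDBC."""
    # With mode="append", rows are added to the table created in pgAdmin.
    df.write.jdbc(url=url, table=table, mode=mode, properties=properties)


# Placeholder values (assumptions): replace with your own endpoint and credentials.
url, properties = build_jdbc_config(
    host="<rds-endpoint>",
    database="<database>",
    user="<user>",
    password="<password>",
)
# load_to_rds(review_df, "<table_name>", url, properties)  # review_df is hypothetical
```

For the write to work, the PostgreSQL JDBC driver has to be available to the Spark session in Colab.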

We then checked in pgAdmin that the load was successful (one example).
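
The same check can also be done from the notebook. The sketch below is one possible approach, not the method used in this project: it reads the table back over JDBC and compares its row count with the source DataFrame. It assumes the table was empty before the load; the function names, `review_df`, and the connection values are hypothetical.

```python
def loaded_row_count(spark, table, url, properties):
    """Read the table back over JDBC and return its row count."""
    return spark.read.jdbc(url=url, table=table, properties=properties).count()


def load_matches(source_rows, loaded_rows):
    """True when the table holds exactly as many rows as the source DataFrame."""
    return source_rows == loaded_rows


# Hypothetical usage, with the same JDBC URL and properties used for the write:
# ok = load_matches(
#     review_df.count(),
#     loaded_row_count(spark, "<table_name>", url, properties),
# )
```

In pgAdmin, the equivalent check is a `SELECT COUNT(*)` on the loaded table in the query tool.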



