Currently, data files (in either CSV or JSON format) are sitting in S3 buckets at the following [login to view URL]. All source files share the same exact layout:
Customer_Name|Market_Name|Product_Name|Quantity|Amount
ABC|Seattle|Shoes|10|1000
We expect to see normalized data in Redshift in the following dimension and fact tables:
Dim_Customer, Dim_Market, Dim_Product and Fact_Sales
We are open to any of the following approaches.
Approach 1:
a) Using RDS (either the MS SQL or MySQL flavor) as a staging database
b) Moving data from S3 to RDS (we haven't worked out this step yet; it looks like there is no COPY :( command for RDS), then developing SSIS ETL packages or SQL code to model the data and export it as separate files ([login to view URL], [login to view URL], dim_product.csv, and [login to view URL]) to S3
c) Scheduling an AWS Data Pipeline job to load the files from S3 into Redshift
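To illustrate the modeling work in step b), here is a minimal sketch of the normalization itself: splitting the pipe-delimited sales rows into the three dimensions plus the fact table, with generated surrogate keys. The `normalize` function, the key-column names, and the in-memory dictionaries are all assumptions for illustration, not the required design; in practice this logic would live in an SSIS package or SQL code against the staging database.

```python
import csv
import io

def normalize(rows):
    """Split denormalized sales rows into dimension lookups and fact rows.

    rows: iterable of dicts with Customer_Name, Market_Name,
    Product_Name, Quantity, Amount (as in the sample layout).
    Returns (dims, facts): dims maps each dimension name to a
    {natural_key: surrogate_key} dict; facts is a list of fact rows.
    Key-column names (customer_id, etc.) are hypothetical.
    """
    dims = {"customer": {}, "market": {}, "product": {}}

    def surrogate(dim, name):
        # Assign the next surrogate key the first time a value is seen.
        table = dims[dim]
        if name not in table:
            table[name] = len(table) + 1
        return table[name]

    facts = []
    for r in rows:
        facts.append({
            "customer_id": surrogate("customer", r["Customer_Name"].strip()),
            "market_id": surrogate("market", r["Market_Name"].strip()),
            "product_id": surrogate("product", r["Product_Name"].strip()),
            "quantity": int(r["Quantity"]),
            "amount": float(r["Amount"]),
        })
    return dims, facts

# Sample input in the layout shown above.
sample = (
    "Customer_Name|Market_Name|Product_Name|Quantity|Amount\n"
    "ABC |Seattle|Shoes|10|1000\n"
)
reader = csv.DictReader(io.StringIO(sample), delimiter="|")
dims, facts = normalize(reader)
```

Each `dims` dict can then be written out as its own CSV (the Dim_Customer, Dim_Market, and Dim_Product extracts) alongside a Fact_Sales CSV carrying only surrogate keys and measures, ready for the Data Pipeline load in step c).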
Approach 2:
a) Using a cloud data integration tool (such as FlyData, SnapLogic, or Qubole) or any other open-source cloud ETL tool.
Please respond only if you have worked with Amazon Redshift. Thank you.