Examples and custom spark images for working with the spark-on-k8s operator on AWS
Updated Feb 14, 2021 - Dockerfile
Extract, transform, and load data for analytic processing using AWS Glue
A pipeline within AWS to capture schema changes in S3 files and to update them in a DB.
Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.
1️⃣ Querying Parquet files from S3 using AwsWrangler. 2️⃣ Querying Redshift tables using Glue & AwsWrangler
This workshop builds a serverless data lake architecture using Amazon Kinesis Firehose for streaming data ingestion, AWS Glue for data integration (ETL, catalog management), Amazon S3 for data lake storage, and Amazon Athena for SQL big data analytics.
This is a case study showing how to deploy "Wait-for-Callback" using Step Functions
ETL using application streaming and creating a Data Lake
A toolkit that provides an object-oriented interface for working with Parquet datasets on AWS
🐋 Docker image for AWS Glue Spark/Python
Uses AWS Kinesis Analytics to gather metrics (CPU, memory) from various computers, aggregate the Kinesis stream data (with Flink), and store the stream data in an AWS S3 bucket, which Amazon Athena uses to run various analytics queries, with charts rendered in Grafana.
This project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), using Kinesis streams, and finally storing and querying the data in Amazon Athena.
The Athena adapter plugin for dbt (https://getdbt.com)
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).