v0.1.0: Update README.md

Pre-release

@JohnChe88 released this 23 Feb 19:44
· 146 commits to main since this release
a4852aa

Spark on AWS Lambda is a standalone installation of Spark that runs on AWS Lambda inside a Docker container. It is a cost-effective option for event-driven pipelines that process smaller files, where heavier engines such as Amazon EMR or AWS Glue add overhead cost and run more slowly.

Release 0.1.0 Features:

  1. Dockerfile with PySpark and its dependencies installed.
  2. Sample script that reads and writes a CSV file on Amazon S3.
  3. Authentication and authorization framework for connecting to Amazon S3.
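
For illustration only, here is a minimal sketch of what features 2 and 3 could look like together. It is not the script shipped in this release. The function names `s3a_conf_from_env` and `copy_csv`, the app name, and the bucket paths are hypothetical. The sketch assumes that credentials arrive through the standard `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` environment variables that Lambda sets for the execution role, and that the `hadoop-aws` connector is available on the Spark classpath in the container image.

```python
import os


def s3a_conf_from_env(env=None):
    """Build Spark settings for the S3A connector from credential variables.

    Hypothetical helper: reads the standard AWS credential environment
    variables and maps them to Hadoop S3A configuration keys.
    """
    env = os.environ if env is None else env
    conf = {
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.access.key": env["AWS_ACCESS_KEY_ID"],
        "spark.hadoop.fs.s3a.secret.key": env["AWS_SECRET_ACCESS_KEY"],
    }
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        # Temporary (role-based) credentials need the session token and
        # the matching S3A credentials provider.
        conf["spark.hadoop.fs.s3a.session.token"] = token
        conf["spark.hadoop.fs.s3a.aws.credentials.provider"] = (
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"
        )
    return conf


def copy_csv(input_path, output_path, env=None):
    """Read a CSV from S3 and write it back out (assumed workflow)."""
    # Imported lazily so the config helper can be used without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master("local[*]").appName("csv-on-lambda")
    for key, value in s3a_conf_from_env(env).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    try:
        df = spark.read.option("header", "true").csv(input_path)
        df.write.mode("overwrite").option("header", "true").csv(output_path)
    finally:
        spark.stop()
```

A call might look like `copy_csv("s3a://example-bucket/input/data.csv", "s3a://example-bucket/output/")`, where the bucket name is a placeholder. Access to the bucket itself would still be governed by the IAM policy attached to the Lambda execution role.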