An accomplished Data & Analytics consultant. A Data Science & FinTech instructor and mentor at USC, ASU, Pepperdine, and Pluralsight. Author at Packt Publishing.
Stars: 8 repositories written in Scala
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Spark: The Definitive Guide's Code Repository
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Automated data quality suggestions and analysis with Deequ on AWS Glue





