DataExpert.io

Education

San Francisco, California 29,807 followers

Data Engineering education, solutions, and evangelism

About us

EcZachly Inc is a company dedicated to inspiring and educating the next generation of data talent!

Industry: Education
Company size: 2-10 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2023

Updates

  • DataExpert.io reposted this

    View profile for Zach Wilson

    Founder of DataExpert.io | ADHD | 1m Followers | Dogs

    A majority of those with ADHD and autism are unemployed. The amount of untapped potential in these brilliant people is immense.

    I'm going to be live with Jhillika talking about how to grow your career while having ADHD tomorrow at 3 PM Pacific on Riverside, YouTube, LinkedIn, and X!

    I'll be talking about:
    - how creating a "working with me" document can be life changing for those who are neurodivergent
    - things I do to manage my ADHD so it doesn't get in the way as much
    - how you can use resources like Mentra to land jobs

    Make sure to share this with your neurodivergent friends!

  • DataExpert.io reposted this

    View profile for Zach Wilson

    Founder of DataExpert.io | ADHD | 1m Followers | Dogs

    Everybody and their dog is claiming to be an AI engineer nowadays!

    Delia and Madeline understand that wrapping ChatGPT with Python makes you a full stack developer, NOT AN AI engineer.

    If you use ChatGPT to make data quality checks for your pipelines, you’re a data engineer, not an AI engineer!

    If you’re fine tuning models, you’re an AI engineer.
    If you’re building models, you’re an AI engineer.
    If you’re building evaluation sets, you’re an AI engineer.

    If you aren’t doing any of those activities, you’re a spicy full stack developer!

  • DataExpert.io reposted this

    View profile for Gayathri Ramesh

    Data Engineer | Spark & PySpark | SQL | ADF | Azure Databricks | Python | Microsoft Certified | Learning Everyday

    Successfully started Week 1: Dimensional Data Modeling of the "Finish the YouTube Boot Camp" hosted by Zach Wilson and DataExpert.io, and completed the Day 1 lecture and lab sessions 💪 Here's the gist of the topics and my insights from the lesson:

    Day 1: Working with Complex Data Types - Struct, Array etc.

    1. Know Your Consumer
    Before beginning any data modeling, it is imperative to understand in depth who the end user of the data is. Whether the data is used for analytics by Data Analysts / Data Scientists, consumed by downstream jobs managed by Data Engineers to curate master data with other pipeline dependencies, or fed into ML models or executive dashboards, the data model varies in its use of complex vs. flat data types, storage and compression, and ease of query and accessibility.

    2. OLTP vs. Master Data vs. OLAP Continuum
    Understanding the differences between modeling a transactional system, such as an application database requiring low latency and low volume, and an analytical system, such as cubes used for quick analysis on aggregations, while also finding the sweet spot in between, where the master data sits: deduped and optimized for completeness of entity definitions, from which other datasets can be created.

    3. Cumulative Table Design
    CT designs are very commonly used to create master data, where you hold on to all of the dimensions that existed right up until a specific time (until purged or hibernated). Such designs are beneficial for tracking state transitions of different metrics, e.g. for growth accounting, which can be used to analyze patterns and model. In particular, the design serves well for computing cumulative metrics, using complex data types such as array of struct to combine the changing values.

    4. Complex Data Types
    The usage of complex data types depends on the type of modeling and the end user, ranging from most compact for transactional purposes to most usable for analytics, with upstream staging or master data residing somewhere in between. Complex data types such as struct, map, array, and nested types such as array of struct are commonly used to compact datasets.

    5. Temporal Cardinality Explosion, Compression & Run-length Encoding
    Explored the importance of considering cardinality when working with dimensions that have a time aspect, and the need to sort data correctly before compressing, such as when using the Parquet format with run-length encoding. Also, complex data types such as array of struct can be used to combine the temporal dimension values, which prevents Spark shuffle from ruining compression in distributed environments.

    Thank you, Zach Wilson & DataExpert.io for the incredible session! Day 2, loading 🔜 🚀

    #bootcamp #zachwilson #dataexpertio #dataengineering #freeyoutubebootcamp #finishtheyoutubebootcamp #rampup #upskilling #onwardsandupwards
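
To illustrate the post's point about sorting data before run-length encoding, here is a minimal sketch in Python. It is a toy encoder, not Parquet's actual implementation, and the column values and the `run_length_encode` helper are hypothetical names chosen for the example.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


# Hypothetical low-cardinality dimension column, in arrival order.
unsorted_column = ["US", "IN", "US", "BR", "IN", "US", "BR", "IN"]
# The same values after sorting, so equal values sit next to each other.
sorted_column = sorted(unsorted_column)

unsorted_runs = run_length_encode(unsorted_column)
sorted_runs = run_length_encode(sorted_column)

print(len(unsorted_runs))  # 8: no value repeats consecutively, so nothing compresses
print(sorted_runs)         # [('BR', 2), ('IN', 3), ('US', 3)]
```

The unsorted column needs one run per row, while the sorted column needs only one run per distinct value, which is why a shuffle that scrambles row order can hurt compression.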

  • Kunmi put in the work and is improving his situation!

    View profile for Kunmi A.

    Data Analyst 📍 Research 📍Data Acquisition 📍 Data Engineering 📍 GIS

    I spent most of the past 7 weeks on DataExpert.io's free Data Engineering Bootcamp led by Zach Wilson! This incredible course provided me with hands-on experience and a deeper understanding of core data engineering concepts. Over the last 7 weeks, I’ve focused on optimizing systems, managing data pipelines, and developing scalable solutions that bring real impact to businesses. Here’s a look at some of the key takeaways for me:

    🔹 Flink & Apache Spark: From sessionization logic in Flink to optimizing joins and aggregations in Apache Spark, I’ve successfully implemented solutions that enhanced data pipeline performance and ensured data integrity. It was also a great opportunity to experiment with different data partitioning and aggregation techniques for more efficient query execution.

    🔹 Experimentation & Metrics: I’ve designed and executed A/B tests to enhance user engagement in a music streaming app (e.g. Apple Music). My focus on testing personalized playlists, onboarding processes, and social features has provided valuable insights on how user experience and retention can be significantly improved through data-driven decisions.

    🔹 SQL & PySpark: Converting PostgreSQL queries to SparkSQL and building PySpark jobs to handle Slowly Changing Dimensions (SCD) transformations was an exciting challenge. My work on backfill query conversions and unit testing ensured the integrity of data transformation processes.

    🔹 Data Pipeline Ownership: I have taken ownership of multiple data pipelines, ensuring smooth operations, monitoring, and troubleshooting through comprehensive runbooks and on-call schedules. I’m committed to maintaining robust systems and ensuring data flows seamlessly, even during unforeseen challenges.

    Each assignment has taught me something new, whether it's refining my approach to data pipelines, improving collaboration with teams, or driving product improvement experiments. I'm excited to continue expanding my skill set and exploring more innovative ways to harness the power of data to solve real-world challenges.

    Big thanks to Zach for making this a free resource and of course to the Discord community for providing troubleshooting tips every step of the way!

  • Dibyanshu is committing the necessary energy to actually upskill and become better!

    View profile for Dibyanshu Kumar

    Microsoft Certified Data Platform Engineer | Lakehouse | Big Data | Spark | Python | Kafka | DWH | CICD | Docker | Azure, AWS | EXL, Deloitte

    This journey felt different - it took late nights and a bit of sweat to complete all the assignments, but it was absolutely worth it! Thanks to Zach Wilson for the great content and clear deadlines that kept me motivated to stay on track. If you're still thinking about joining, give it a try - it's totally worth it! https://lnkd.in/gn58ARjJ #bootcamp #dataengineering #learning

  • DataExpert.io reposted this

    View profile for Stéfano Vivas

    Data Engineer | SQL Server | Python | Spark | Databricks | Azure Certified

    Hey folks! After seven weeks of a lot of hard work, I have successfully completed the DataExpert.io Data Engineering Bootcamp! 🏆

    📚 The topics covered were:
    - Dimensional Data Modeling
    - Fact Data Modeling
    - Apache Spark Fundamentals
    - Applying Analytical Patterns
    - Real-time pipelines with Flink and Kafka
    - Data Visualization and Impact
    - Data Pipeline Maintenance
    - KPIs and Experimentation
    - Data Quality Patterns

    Out of 30,100 participants, only 65 of us (0.2%) successfully completed all the homework assignments! I’m proud to be one of them! 💪

    A big shoutout to Zach Wilson and the amazing DataExpert.io team for creating such a comprehensive and impactful program. 🙌 I would also like to thank the Discord community, who were always available to answer questions and share knowledge. Finally, I would like to thank the people who liked my posts with class notes and helped disseminate my content across the network.

    If you want to review my class notes, you can access them through the address below: https://lnkd.in/dW5h7HKd

  • DataExpert.io reposted this

    View profile for Ragavi Ashok

    Seeking full time roles in Data domain| Master’s in Advance Data Analytics | Research Assistant @UNT

    One of the best things I accomplished this January was completing Zach Wilson’s Free Data Engineering Bootcamp, a challenging yet incredibly rewarding experience. 💪 This bootcamp required around 40 hours of dedication, packed with learning core concepts, engaging in hands-on labs, and tackling homework assignments. It wasn’t easy, but every bit of effort was absolutely worth it. 🌟

    A huge thank you to Zach Wilson for making this bootcamp accessible to all. His ability to explain concepts through his work experiences at Facebook, Airbnb, and Netflix made the content engaging, relatable, and easy to grasp. 🙌

    Through this workshop, I gained hands-on experience with essential data engineering topics, including:
    - Dimensional Data Modeling
    - Fact Data Modeling
    - Apache Spark Fundamentals
    - Analytical Patterns and KPIs
    - Unit Testing PySpark Pipelines
    - Real-Time Pipelines with Flink and Kafka

    Out of 30,100 participants, only 65 of us (0.2%) successfully completed all the homework assignments! I’m proud to be one of them! Every module brought exciting new challenges, and I couldn’t wait to dive into the next one.

    If you’re looking to enhance your data engineering skills, I highly recommend this bootcamp! (It’s open until February 7th.) DataExpert.io

    I’m now working on a personal project inspired by what I’ve learned and will share it once it’s complete. Stay tuned! Here’s to more growth, learning, and victories in 2025!

  • DataExpert.io reposted this

    Beginning the year strong by being one of the 65 out of 30,100 people (0.2%) who completed Zach Wilson's (DataExpert.io) free data engineering bootcamp! 🔥

    📚 The topics covered were:
    - Dimensional Data Modeling
    - Fact Data Modeling
    - Apache Spark Fundamentals
    - Applying Analytical Patterns
    - Real-time pipelines with Flink and Kafka
    - Data Visualization and Impact
    - Data Pipeline Maintenance
    - KPIs and Experimentation
    - Data Quality Patterns

    Huge thanks to Zach and the DataExpert.io team for producing all 20+ hours of lessons and labs and making them available. 🙏 Also thanks to the DataExpert.io community for the much needed solidarity. I am inspired by all of you hard-working and capable DE's! 😎

    Check out on-going and future events at DataExpert.io Academy here 👉 https://www.dataexpert.io/

  • DataExpert.io reposted this

    View profile for Amanda C.

    Analytics Engineer | Instructor | SQL | Looker | dbt

    After some well-deserved holidays, I’m back on track! Last year was truly a game-changer for me. I decided to focus on my career, becoming a data visualization instructor at a bootcamp while balancing a full-time role as a Data Analyst. Amid it all, I enrolled in the Analytics Engineering course by Zach Wilson at DataExpert.io. It wasn’t easy: attending classes at 3 AM, keeping up with exercises, and relearning how to learn. I had to adapt my methods, work intensely, and push myself daily. But every hour invested, every interaction, and every moment of growth proved worth it. This certificate represents more than just hard work. It stands for my dedication to continuous improvement. https://lnkd.in/d-CqrhEv
