Dagster Labs


Software Development

San Francisco, California · 11,818 followers

Building out Dagster, the data orchestration platform built for productivity.

About us

Building out Dagster, the data orchestration platform built for productivity. Join the team that is hard at work setting the standard for developer experience in data engineering. Dagster GitHub: https://github.com/dagster-io/dagster

Industry: Software Development
Company size: 11-50 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2018
Specialties: data engineering, data orchestration, open source software, and SaaS


Updates

  • Dagster Labs reposted this

    View profile for Adrian Brudaru

    Open source pipelines - dlthub.com

    I just recorded a "how to deploy dlt to Dagster" demo as part of our upcoming course. Since Dagster is easy peasy, the whole thing takes only 10 minutes. My microphone, though, decided to be a diva and forced me to re-record a handful of times. Want to get notified when the course comes out? Sign up for our education newsletter: https://lnkd.in/eRfYu-Gs

  • Airlift is a powerful new toolkit that makes it easy to bring all of your Airflow DAGs into the Dagster control plane. Without modifying your Airflow DAGs, you get the operational and observability benefits of Dagster, including:

    ⚙️ Rich metadata and run history
    🔍 Data quality checks
    🔗 Dependency mapping and automation

    Instead of a lengthy migration project, you get immediate value and build momentum toward a more powerful data platform. Check out Airlift today!

  • Dagster Labs reposted this

    View profile for Pedram Navid

    Chief Dashboard Officer @ dagster

    I was speaking with a senior engineer the other day who proudly showed me how his team built a modern data stack from scratch that helped reduce their cloud bill. Impressive on paper, but as we dug deeper, I couldn't help but wonder: are they actually saving money, or just shifting costs?

    When we evaluate data infrastructure decisions, we often fixate on the monthly cloud bill while ignoring the true total cost of ownership. The equation isn't just about dollars spent on services - it's about engineering time, maintenance overhead, and opportunity cost.

    Self-managed infrastructure comes with significant hidden costs:

    - Engineering time spent babysitting open-source tools instead of building business value
    - Increased complexity that compounds over time
    - The expertise required to troubleshoot when things inevitably break
    - Missed opportunities while maintaining infrastructure instead of innovating

    I've seen teams choose Kafka over managed CDC solutions, only to spend weeks debugging configuration issues. I've watched organizations replace BigQuery and Snowflake with homegrown DIY solutions, then struggle with performance tuning that cloud warehouses handled automatically.

    Don't get me wrong, there are valid reasons to build custom infrastructure. Multi-region requirements, specific performance needs, or genuine cost efficiency at scale can justify the build approach. But too often, teams make these decisions without fully accounting for the long-term maintenance burden.

    The most successful data teams I've worked with focus on building platforms that enable self-service while minimizing maintenance overhead. They use managed services strategically, reserving custom solutions for areas with genuine competitive advantage.

    In my experience, data leaders are almost always able to see and balance these trade-offs, but IC engineers are often stuck in the "that's a simple weekend project" mentality.

    What's your experience? Have you built a custom data stack that truly saved money, or found yourself drowning in maintenance costs? Would love to hear your perspective on the build vs. buy decision.

  • Community Spotlight! Kubox AI is an on-demand data platform designed to build and deploy analytics applications anywhere. It combines open-source Kubernetes with Dagster, Dask, and Jupyter notebooks, making it easy to scale and manage complex data workloads. Kubox offers the simplicity of SaaS with the flexibility of PaaS, minimizing overhead while providing a vendor-neutral data infrastructure. Check out the repo today!

  • Dagster Labs reposted this

    View profile for Alexander Noonan

    Developer Advocate & Data Engineer

    In my role, I speak to many data practitioners across many industries, which is fantastic! Here are a few observations:

    While we obsess over tools, frameworks, and infrastructure costs, the biggest challenges in data engineering are fundamentally human problems. Technical leaders who aren't technical, teams that build data silos to prevent access to data, and software engineers who don't treat data pipelines as first-class citizens are killing our productivity.

    The data tooling ecosystem has expanded dramatically in the last decade. We have incredible options for orchestration, transformation, and storage. Yet many organizations still struggle with the basics:

    - Teams duplicating engineering efforts in silos
    - Leadership refusing to invest in proper data quality measures
    - Engineers introducing tools without consulting actual users
    - Software teams refusing to fix data problems upstream
    - Stakeholders expecting miracles while underinvesting in infrastructure

    What's particularly frustrating is how these issues compound. Poor data quality leads to a lack of trust, duplicate systems, increased costs, and limited investment in better solutions. Everyone pays lip service to 'garbage in, garbage out,' but nobody wants to invest in fixing data quality issues.

    What's your biggest challenge in data engineering today? Is it technical complexity, or is it the human element that's holding your team back?

    #DataEngineering #DataQuality #DataLeadership #DataStrategy

  • Dagster Labs reposted this

    View profile for Pedram Navid

    Chief Dashboard Officer @ dagster

    Apache Iceberg is promising freedom, but is it just adding to fragmentation?

    I know the story is that many organizations are exploring Iceberg for its promise of breaking vendor lock-in, but something tells me the reality is more nuanced. Sure, some companies are successfully using Iceberg to bridge silos. I’ve heard of one organization that's connecting Athena users with their Snowflake/dbt teams through a unified table format.

    But freedom comes with responsibility. I’ve also heard from engineers reporting unexpected challenges with their AWS integration, small file performance issues, and operational overhead.

    The most compelling use cases I'm seeing aren't about "switching" to Iceberg entirely, but strategically applying it where it makes sense:

    - Multi-engine flexibility (using combinations of Spark, Trino, and StarRocks)
    - GDPR compliance through better metadata management
    - Breaking down data silos between teams with different tooling preferences

    What's clear is that Iceberg isn't a silver bullet, though. It's a strategic architectural choice that requires understanding data patterns, security requirements, and engineering capabilities. In that respect, nothing new.

    Where do you stand on open table formats? Is your organization considering Iceberg or Delta Lake, or sticking with its current solution?

    #DataEngineering #ApacheIceberg #DataLakehouse #ModernDataStack #DataArchitecture

  • Dagster Labs reposted this

    View profile for Nilton Kazuyuki U.

    Executive Manager | Business Intelligence, Data Engineering, Generative AI, Machine Learning and Data Observability | Professional Educator, Teacher and Speaker | Tableau and Snowflake Global Ambassador

    I'd like to understand whether this perception is correct: in my circle, the leading data-flow orchestrator in the global market is Apache Airflow. I know many people who work as Staff and Data Engineers, and the vast majority say that Airflow, although open source, is still the darling of data engineering because it is very flexible and offers many control and observability features. But it has one problem: it is easy to get lost among so many DAGs if you don't have a template! Over the last few days I posted about Airflow, which I know in more depth, and a truckload of comments came in saying that Dagster Labs and Airbyte are excellent and gaining a large share of the market! I did not have that perception. Is that really what is happening?

  • Dagster Labs reposted this

    View profile for Pedram Navid

    Chief Dashboard Officer @ dagster

    Does Kimball matter in 2025? As the old saying goes, if you want to learn something new, read something old.

    We've all seen the pattern. Teams abandon dimensional modeling for One Big Table. Delivery speeds up, everyone celebrates, and then an inevitable complexity crisis hits.

    The tradeoff isn't just theoretical. I've watched teams struggle with massive denormalized tables that become unmaintainable as platforms scale. Performance issues creep in. Data inconsistencies multiply. Trust erodes. Heads begin to roll.

    Dimensional modeling isn't just a set of made-up arbitrary constraints. It emerged from practical experience with the challenges of organizing data at scale. I'm not advocating for rigid adherence to textbook star schemas, but there is wisdom in applying structure where it matters most.

    We've been sold a bad bill of goods: just dump data in S3 and create a data lake, a lakehouse, a data swamp. The best teams I've seen have all built foundational models while creating targeted denormalized views for specific use cases.

    When resources tighten and compute costs matter, the pendulum swings back toward efficiency and structure. The organizations that thrive long-term are those that invest in modeling from the start.

    What's your experience with dimensional modeling in modern data platforms?

    #DataEngineering #DataModeling #DataPlatforms

  • Dagster Labs reposted this

    View profile for Pedram Navid

    Chief Dashboard Officer @ dagster

    "MS Fabric destroyed 3 months of work." Want the story?

    An engineer connected their Fabric workspace to DevOps and it wiped all their artifacts irreversibly. Microsoft's response? "It's a known issue" – with documentation conveniently uploaded the same day.

    This highlights a critical truth about data platforms: architecture choices have real consequences. Microsoft Fabric, despite heavy enterprise marketing, is still effectively in beta. As one engineer put it: "It's released to the public broken, users essentially opt in to a beta release that's not labeled as such."

    The fundamental challenge for data teams isn't just building pipelines – it's creating scalable platforms that enable self-service for data consumers. That's where Dagster takes a fundamentally different approach:

    • Data-centric architecture: Organize around actual data assets, not just tasks
    • End-to-end lineage: Track data across your entire platform
    • Developer experience: Local testing, branch deployments, and modern Python frameworks
    • Built-in data quality: Asset checks and testing baked in
    • Unified control plane: One interface for all pipelines

    The true cost of choosing the wrong platform goes far beyond licensing:

    • Lost engineering time from platform limitations or failures
    • Delayed business initiatives waiting on data
    • Reduced team morale from fighting with tooling
    • Technical debt from necessary workarounds
    • Opportunity cost of delayed innovation

    High interest rates cure all ailments, and they should also cure the impulse to chase shiny new objects without proper evaluation. As one data engineer who abandoned Fabric put it: "The salary and benefits were pretty crazy, but not enough for me to lose my soul."

    Your architecture choices matter. Choose wisely.

    #DataEngineering #DataPlatform #Dagster #DataOrchestration


Funding

Dagster Labs 3 total rounds

Last Round

Series B

US$ 33.0M
