Aman Dhesi’s Post

Something new - hiring founding engineers

Introducing Superpipe Studio: a free and open-source tool to curate datasets, run evals, conduct experiments and optimize your LLM pipelines for accuracy, speed and cost.

We've spent the last year building with LLMs and working closely with a number of companies, and we encountered the same problems repeatedly:
- evaluating LLM software is hard, because...
- collecting high-quality, representative, correctly labeled data is hard...
- which makes it hard to objectively compare different techniques, models and parameters...
- which prevents continuous iteration and improvement of LLM software

If you want to build a winning AI product, it all comes down to continuous evaluation and optimization. Unlike traditional software, you can't build it once, write some unit tests and sleep well at night. You need a virtuous cycle where product usage generates data that goes into your eval system, which is then used to evaluate new models and techniques or to fine-tune your own model.

We tried the existing tools on the market and found that none of them worked well for implementing the virtuous cycle we needed, so we built a lightweight tool for internal use: Superpipe Studio. It helps you do 3 things:
1. curate and manage datasets
2. run experiments and compare them on accuracy/speed/cost
3. monitor your production pipelines (and expand your datasets with production data)

We think everyone should have these abilities, so we're making Studio completely free and open-source with no restrictions. Studio is a work in progress and rough around the edges, but still robust. We encourage you to try it and modify it to suit your needs.

The gold standard of AI software is Tesla/Waymo's full self-driving. Not every AI system needs five 9s of accuracy, but the process for making your AI system better is the same one that Tesla and Waymo followed. And it all starts with high-quality labeled data and a robust evaluation and optimization system.

(GitHub and docs in the comments)
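To make "compare experiments on accuracy/speed/cost" concrete, here is a minimal sketch in plain Python. It is not Superpipe Studio's API: the `evaluate` helper, the `EvalResult` fields, the two toy pipelines, the four-row dataset and the per-call costs are all hypothetical stand-ins chosen for illustration. A real pipeline would call an LLM where the toy functions sit.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalResult:
    name: str
    accuracy: float       # fraction of rows where the output matched the label
    avg_latency_s: float  # mean wall-clock seconds per row
    total_cost: float     # assumed flat cost per call times number of rows


def evaluate(
    name: str,
    pipeline: Callable[[str], str],
    dataset: List[Tuple[str, str]],
    cost_per_call: float,
) -> EvalResult:
    """Run one pipeline over a labeled dataset and summarize the run."""
    if not dataset:
        raise ValueError("dataset must contain at least one labeled row")
    correct = 0
    start = time.perf_counter()
    for text, label in dataset:
        if pipeline(text) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    n = len(dataset)
    return EvalResult(name, correct / n, elapsed / n, cost_per_call * n)


# Hypothetical stand-ins for two LLM pipelines under comparison.
def keyword_pipeline(text: str) -> str:
    negative_words = ("terrible", "awful", "bad")
    return "negative" if any(w in text.lower() for w in negative_words) else "positive"


def always_positive_pipeline(text: str) -> str:
    return "positive"


# Hypothetical labeled dataset: (input text, expected label).
dataset = [
    ("great product", "positive"),
    ("terrible support", "negative"),
    ("love it", "positive"),
    ("awful, do not buy", "negative"),
]

results = [
    evaluate("keyword", keyword_pipeline, dataset, cost_per_call=0.002),
    evaluate("always-positive", always_positive_pipeline, dataset, cost_per_call=0.0005),
]

for r in results:
    print(f"{r.name}: accuracy={r.accuracy:.2f} "
          f"latency={r.avg_latency_s:.6f}s cost=${r.total_cost:.4f}")
```

Running the same labeled dataset through every candidate is what makes the comparison objective: the cheaper pipeline here costs a quarter as much but gets only half the rows right, and that trade-off only becomes visible once accuracy, latency and cost are reported side by side.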
