Is prompt engineering dead? Not yet, but it's a great example of the speed of AI advancements and the challenge of navigating tech waves with what can appear to be rapidly diminishing marginal returns.

Prompt engineering is the practice of structuring prompts to increase the accuracy and effectiveness of LLMs. It's a hot topic at the moment because it helps solve for many of the quirks that produce poor-quality outputs. Today prompt engineering is done by teams of people and is one of the new roles to emerge with this wave of AI. But recent research suggests that LLMs can produce better outcomes than humans when prompting themselves.

Challenge 1: LLMs are updated and enhanced so frequently that it's difficult to build lasting frameworks around any one model.
Challenge 2: As LLMs get smarter, prompt-engineering capabilities will be embedded into the models themselves and become a feature of the LLM.

Does this make investing in prompt engineering today a waste? Heck no. It just needs to be part of a larger LLMOps, and ultimately DataOps, motion. Why? LLMs don't speak English; they just do a lot of math. And ultimately, having *usable* proprietary data and a killer overall platform is the key to winning over the long run.

We've been chatting with a lot of founders lately about their strategies for navigating the hard questions that come with embedding AI into their platforms: Should I fine-tune my own model? Which model should I use? Should I build a prompt-engineering team? We definitely don't have all the answers, but thankfully we know a lot of folks who are actively navigating these questions and/or building the scaffolding to help others do so. We're working on a short essay with tips and tricks for early- and growth-stage companies - if you're interested in contributing, please reach out!

#ai #data #software
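A minimal sketch of the self-prompting idea mentioned above, assuming the OpenAI Python SDK and an illustrative model name (neither is from the post): the model first rewrites the raw task into a sharper prompt, then answers its own improved version.

```python
# Minimal self-prompting sketch: the model engineers its own prompt before
# answering. Model name and example task are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def self_prompted_answer(task: str) -> str:
    # Step 1: have the model rewrite the raw task into a clearer prompt.
    improved = ask(
        "Rewrite the following task as a clear, specific prompt for an LLM. "
        "Return only the rewritten prompt.\n\nTask: " + task
    )
    # Step 2: answer the improved prompt.
    return ask(improved)

print(self_prompted_answer("explain prompt caching to a PM in two sentences"))
```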
Andrew Steele’s Post
More Relevant Posts
-
DeepSeek AI: 𝗥𝗲𝘃𝗼𝗹𝘂𝘁𝗶𝗼𝗻𝗶𝘇𝗶𝗻𝗴 𝗔𝗜 𝗥𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴! 🐋

I'm excited to share the groundbreaking achievements of DeepSeek-R1, an open-source AI model that's redefining the boundaries of machine reasoning. Its performance across critical benchmarks is truly remarkable.

𝗕𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:
✦ MATH-500: 97.3% accuracy, surpassing OpenAI's 96.4%
✦ Codeforces: 96.3% performance, rivaling industry leaders
✦ AIME 2024: 79.8% success rate, leading the competition
✦ GPQA Diamond: 71.5%, demonstrating exceptional reasoning
✦ MMLU: Impressive 90.8% in general knowledge assessment
✦ SWE-bench Verified: Competitive 49.2% in software engineering tasks

𝗪𝗵𝗮𝘁 𝗺𝗮𝗸𝗲𝘀 𝗗𝗲𝗲𝗽𝗦𝗲𝗲𝗸-𝗥𝟭 𝘀𝗽𝗲𝗰𝗶𝗮𝗹:
✦ Advanced chain-of-thought reasoning for complex problem-solving
✦ Built-in self-verification for enhanced reliability
✦ Efficient architecture that runs smoothly on standard hardware

The implications are significant. We're witnessing AI that can handle increasingly complex reasoning tasks with human-like performance. Whether you're in research, software development, or AI implementation, DeepSeek-R1's capabilities open new possibilities for practical applications.

𝗧𝗿𝘆 𝗶𝘁 𝘆𝗼𝘂𝗿𝘀𝗲𝗹𝗳:
├ 𝘼𝙄 𝘾𝙝𝙖𝙩: https://chat.deepseek.com
└ 𝘿𝙚𝙚𝙥𝙎𝙚𝙚𝙠-𝙍1 𝙤𝙣 𝙃𝙪𝙜𝙜𝙞𝙣𝙜𝙁𝙖𝙘𝙚: https://lnkd.in/dEp5sH6M

#AIResearch #DeepSeek #MachineLearning #DeepLearning #ReinforcementLearning #LLMs #AI
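If you'd rather try R1 from code than the chat UI, here is a minimal sketch assuming DeepSeek's OpenAI-compatible endpoint and the "deepseek-reasoner" model name. Both are assumptions based on DeepSeek's public docs, not details from the post; verify before relying on them.

```python
# Minimal sketch: calling DeepSeek-R1 via its OpenAI-compatible API.
# Base URL and model name are assumptions; check DeepSeek's docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "If 3x + 5 = 20, what is x?"}],
)
print(resp.choices[0].message.content)
```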
-
I've seen firsthand how using AI and Machine Learning can change the game for data governance in software development. Imagine having a single source of data that's always accurate. Cool, right? AI and Machine Learning can help us with that. They clean up messy data, find patterns, and make sure everything is up-to-date. It's like having a super smart assistant that never sleeps. When we use them for data governance, software development becomes smoother and less error-prone. This means faster projects and better products. Who wouldn’t want that? Have you started using AI and Machine Learning for your data yet? #AI #MachineLearning #DataGovernance #SoftwareDevelopment
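One deliberately tiny illustration of "finding patterns in messy data": flag anomalous records for human review with scikit-learn's IsolationForest. The column names, values, and contamination rate are all illustrative assumptions, not from the post.

```python
# Minimal sketch of ML-assisted data quality: flag outlier records for a
# data steward to review. Data and parameters are illustrative.
import pandas as pd
from sklearn.ensemble import IsolationForest

df = pd.DataFrame({
    "order_amount": [120, 95, 110, 105, 98000, 101],  # one suspicious value
    "items": [3, 2, 3, 2, 2, 3],
})

model = IsolationForest(contamination=0.2, random_state=0)
df["suspect"] = model.fit_predict(df[["order_amount", "items"]]) == -1

print(df[df["suspect"]])  # records worth a second look
```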
-
🚀 Excited about the future of computer vision? Check out the complete MLOps cycle that transforms your projects from concept to deployment!

🔍 **Key Stages:**
1. **Data Collection & Preprocessing**: Start strong with high-quality data!
2. **Model Development**: Optimize performance through training and tuning.
3. **Model Deployment**: Seamless CI/CD pipelines ensure your model is production-ready!
4. **Monitoring & Maintenance**: Keep an eye on performance and retrain when necessary.

🔧 Leverage automation tools like MLflow and BentoML for efficiency (a minimal tracking sketch follows below). Collaboration is key!

📈 To dive deeper into MLOps for computer vision, contact us via WhatsApp 👉 https://lnkd.in/eXCvFp6H or message us on LinkedIn! Don't forget to subscribe to our newsletters and follow us for the latest in AI and martech 👉 https://lnkd.in/efEUt3pR

[Source: https://lnkd.in/eb9DfKWt]

🌐 #MLOps #ComputerVision #AI #ML #DataScience
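Since the post names MLflow, here is a minimal experiment-tracking sketch for stage 2 (model development). The experiment, parameter, and metric names are illustrative assumptions.

```python
# Minimal MLflow experiment-tracking sketch for a computer-vision run.
# Names and values are illustrative assumptions.
import mlflow

mlflow.set_experiment("cv-defect-detector")

with mlflow.start_run():
    mlflow.log_param("backbone", "resnet50")
    mlflow.log_param("learning_rate", 1e-3)
    # ... training loop would go here ...
    mlflow.log_metric("val_accuracy", 0.93)
    mlflow.log_metric("val_loss", 0.21)
```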
-
The more I think about it, the clearer it becomes that those leading this Generative AI wave have little understanding of software development. It seems they are building technologies for their own sake, rather than as the most efficient solutions to actual problems. They are not engineers but programmers, as a wise man once told me. They are opting for complex methods to solve problems that could be addressed with much simpler techniques, which are more computationally efficient and thus cheaper.

We see many frameworks built around using Large Language Models (LLMs) as the orchestration layer. We are now being sold on Ambient AI. In simple terms, this involves calling functions on a schedule or based on triggers, which is standard practice in software development. Then there is using LLMs as an orchestration layer (Agentic AI), with planning, self-reflection, tool usage, self-evaluation, and decisions through consensus. This is extremely computationally expensive, and the same problems can be solved with traditional software that calls foundation models when required - at much lower cost and with less complexity, which makes it more maintainable.

In my opinion, over time we will stick with traditional software as the orchestration layer, as it is deterministic and explainable. Inference will be performed with foundation models as required, as we have always done with machine learning.

The future of Artificial Intelligence lies not in complexity for complexity's sake but in the elegant simplicity of efficient engineering.

#artificialintelligence #machinelearning #engineering
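A minimal sketch of the pattern argued for here: deterministic code as the orchestration layer, calling a foundation model only for the step that needs it. The OpenAI client, model name, and ticket categories are illustrative assumptions.

```python
# Deterministic orchestration with a model call only as the fallback.
# Client setup, model name, and categories are assumptions.
from openai import OpenAI

client = OpenAI()

def classify_ticket(text: str) -> str:
    lowered = text.lower()
    # Deterministic rules first: cheap, explainable, easy to test.
    if "refund" in lowered or "invoice" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "account"
    # Fall back to the model only for the ambiguous remainder.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Classify this support ticket as billing, account, "
                       "or other. Reply with one word.\n\n" + text,
        }],
    )
    return resp.choices[0].message.content.strip().lower()

print(classify_ticket("Please reset my password"))  # rule hit, no model call
```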
-
The areas where AI should be put to work in the course of building software.

AI, and especially LLMs, has been around for the past four years. But I still see us scratching the surface with this technology, just like the internet in its early days. Here are the key areas where you should be using it, starting now:

1. Creating the core architecture of a codebase.
2. Memory test cases.
3. Code commenting.
4. Identifying grade-C issues in the code.
5. Solving business logic in chunks.
6. Learning to prompt.

Coding and testing time can be reduced by 20-30% with the right use of AI.

What would you add to the list?

Follow Rohan Girdhani for more such insights about tech.
-
Last week was a blast attending The AI Furnace 🧨🔥's 100 Hot Startups in NYC. Talking with brilliant minds like Alexander Bricken from Anthropic was full of new perspectives on building solutions with LLMs. Sharing one thought-provoking conversation.

🚀 𝗘𝗻𝗵𝗮𝗻𝗰𝗲 𝗟𝗟𝗠 𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝘄𝗶𝘁𝗵 𝗣𝗿𝗼𝗺𝗽𝘁 𝗖𝗮𝗰𝗵𝗶𝗻𝗴 🚀

Prompt caching is an effective way to improve the speed and efficiency of large language models (LLMs) like Claude. Here's a quick guide:

𝗞𝗲𝘆 𝗔𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀:
• Conversational Agents: Extend chat sessions without reprocessing entire documents.
• Coding Assistants: Retain code summaries for faster autocompletion and generation.
• Large Document Handling: Process lengthy documents and images quickly without increasing response time.
• Agentic Search & Tool Use: Streamline multiple tool calls by iterating on cached inputs.
• Many-shot Prompting: Efficiently include multiple examples to boost output quality.

𝗕𝗲𝗻𝗲𝗳𝗶𝘁𝘀:
• Cost Savings: Cut prompt repetition costs by up to 90%.
• Faster Response: Reduce latency by up to 85%, beneficial for both GPU- and CPU-based models.
• Scalability: Decrease computational load for more scalable AI solutions.
• Energy Efficiency: Conserve energy by reducing redundant processing.

𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀:
• Implementation Complexity: Integrating prompt caching requires careful planning - it's not a one-click fix.
• Cache Management: Outdated cache data can lower response quality if not properly managed.
• Memory Limitations: Large-scale systems may encounter memory constraints.

I would love to know your thoughts!

#Anthropic #LLM #AI #MachineLearning #CachePrompting
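For the curious, a minimal sketch of what this looks like with the Anthropic Python SDK: mark a large, stable prefix with cache_control so repeated calls can reuse it. The model name, document contents, and question are illustrative assumptions; check Anthropic's prompt-caching docs for current details.

```python
# Minimal prompt-caching sketch with the Anthropic SDK: a large system
# prompt is marked cacheable so follow-up calls skip reprocessing it.
# Model name and document text are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

big_reference_doc = "...thousands of tokens of product documentation..."

resp = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=512,
    system=[{
        "type": "text",
        "text": big_reference_doc,
        "cache_control": {"type": "ephemeral"},  # cache this stable prefix
    }],
    messages=[{"role": "user", "content": "Summarize the refund policy."}],
)
print(resp.content[0].text)
```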
-
Earlier this week, I promised a synthesis on scaling laws plateauing - here it is. I recommend focusing on stages B & C below.

To simplify, let's break it down into three stages of language model (LLM) evolution:

A. Pre-training: Models train on base datasets to recognize patterns.
B. Post-training: Models are fine-tuned on domain-specific data for expertise.
C. Thinking/Inference: Models simulate first-principles reasoning or iterative solutions for complex prompts.

Compute is vital across all three stages, but scaling laws are flattening primarily on A and partially on B.

The key takeaways: B & C are critical for #AI to succeed in real-world use cases. They require highly specialized, high-quality data for meaningful inferences and action-driven responses. Generic #foundationmodels, built on vast public datasets, lack domain specificity. While they enable enterprise scaling, they often emulate intelligence rather than operating on first principles. This is acceptable in domains like medicine, where augmentation helps scale, but risky in industrial or operational scenarios where 98% efficacy isn't enough (e.g., safety-critical processes or inventory forecasting).

The upside? Compute costs have dropped by a factor of 1M in 10 years and continue to fall (currently by a factor of 3). This means:

• Foundation models like #OpenAI's GPT are being rivaled by open-source models (#DeepseekR1-lite recently matched GPT 4.0-preview on AIME & Math benchmarks).
• Enterprises should prioritize post-training (B) or inference-time reasoning (C) with high-quality data to achieve production-grade use cases. While compute and data costs may seem high initially, they are expected to drop over time as memory and compute efficiency improve.

It's worth noting that no model has yet been trained on #NVIDIA's cutting-edge #Blackwell GPUs - exciting developments lie ahead! On that note, there is at least one more scaling wave approaching...

#LLM #AITrends #GenerativeAI #EnterpriseAI #DeepLearning #eyparthenon
-
Use the right tool for the right job. GenAI can do a lot of things and belongs in your toolbox, but like every tool it has advantages and disadvantages. Using GenAI for everything is costly and less reliable where plain software excels.
-
AI Engineering by Chip Huyen
Chapter 6: RAG, Agents, & Memory

🔑 Key Themes:
1️⃣ RAG: External Knowledge Integration: RAG enhances models by retrieving information from external sources, reducing hallucinations. It uses term-based retrieval (keyword matching; fast and simpler - e.g., Elasticsearch) and embedding-based retrieval (semantic understanding, better context - vector databases). Optimization includes chunking, reranking, query rewriting, and contextual retrieval.
2️⃣ Agents: AI as Action Takers: Agents go beyond text generation, using tools (web browsers, code interpreters, APIs) to interact with the world. This requires careful consideration of read (perception) and write (action) capabilities, sophisticated planning, and control flows (sequential, parallel, if statements).
3️⃣ Planning: Task Decomposition & Execution: Agents need a robust planning process for complex tasks, which includes generating plans, validating them against heuristics (e.g., invalid tool calls), and executing those plans with techniques like function calling.
4️⃣ Memory: Context & Persistence: Memory systems (internal knowledge, short-term context, long-term external storage) are critical for managing information overflow, sustaining context across sessions, and boosting consistency.

🔑 Key Takeaways:
1️⃣ RAG for Enhanced Accuracy & Context: When models require extensive knowledge beyond their internal parameters, integrate RAG using term-based or embedding-based methods.
2️⃣ Agents for Interactive AI: Agents are key to building AI systems that can take actions and use external resources. Effective planning and the right tooling choices are essential for complex tasks.
3️⃣ Planning Is Important to Achieve the Goal: Focus on implementing robust planning processes, with validation and reflection as important steps to avoid issues.
4️⃣ Memory Sustains Performance: Implement memory systems to manage session history, enable personalization, and improve the consistency of AI responses. Focus on different memory-management strategies.

#AI #AIEngineering #RAG #Agents #LLMs #MachineLearning #DeepLearning #ChipHuyen #PromptEngineering #MemorySystems #AIArchitecture
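To make the term-based retrieval idea in theme 1 concrete, here is a minimal, self-contained sketch using TF-IDF keyword matching with cosine similarity. The corpus and query are illustrative; a production system would use Elasticsearch/BM25 or an embedding model, as the chapter describes.

```python
# Minimal term-based retrieval sketch for RAG: TF-IDF + cosine similarity.
# Corpus and query are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Refunds are processed within 5 business days.",
    "The API rate limit is 100 requests per minute.",
    "Password resets require email verification.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)

query = "how long do refunds take"
scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]

best = scores.argmax()
print(docs[best])  # the chunk to prepend to the LLM prompt
```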
-
Founder & CEO of Click Therapeutics · 6mo
Prescient!