New research from Meta on RL for coding. SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution hits a 41.0% solve rate on SWE-bench Verified with Llama 3. https://lnkd.in/dcRjtaUx
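The core trick in SWE-RL is a rule-based reward: the model's generated patch is scored by its sequence similarity to the ground-truth patch (the paper uses Python's difflib), so near-misses earn partial credit instead of zero. A minimal sketch of that idea, not the authors' actual implementation (the paper additionally returns -1 for malformed patches, omitted here):

```python
import difflib

def swe_rl_reward(predicted_patch: str, oracle_patch: str) -> float:
    """Rule-based reward in the spirit of SWE-RL: similarity between
    the model's patch and the ground-truth patch, in [0, 1]."""
    return difflib.SequenceMatcher(None, predicted_patch, oracle_patch).ratio()

# A near-miss patch earns a partial reward instead of 0.
print(swe_rl_reward("def add(a, b):\n    return a + b\n",
                    "def add(a, b):\n    return a + b  # fixed\n"))
```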
Kentauros AI
Software Development
Wilmington, Delaware · 219 followers
See us on the web: https://www.kentauros.ai/ Come talk with us on Discord: https://discord.gg/hhaq7XYPS6
About us
Build, deploy, and share AI agents with ease on the AgentSea platform.
- Website: https://kentauros.ai
- Industry: Software Development
- Company size: 2-10 employees
- Headquarters: Wilmington, Delaware
- Type: Privately Held
- Founded: 2023
Locations
- Primary: Wilmington, Delaware 19808, US
Employees at Kentauros AI
- Jeffrey Huckabay, Senior Software Engineer at Kentauros
- Mariya Davydova, AI Agents Builder | Applied AI Advocate | Founder | Getting Things Done
- Patrick Barker, Founder / CTO
- Tapa Dipti Sitaula, Product Manager | AI & Full-Stack Solutions Builder | Volunteer Digital Consultant | MBA, Georgia Tech. MS, Carnegie Mellon University.
Updates
- An HF engineer who monitors every PR to their repos estimates that "about 20% of those are now written by AI, run by various users and companies. In some cases I chat with the AI during the PR review and it makes improvements." https://lnkd.in/dt2bjumF
- Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models. "MinionS, in which the remote model decomposes the task into easier subtasks over shorter chunks of the document, that are executed locally in parallel. MinionS reduces costs by 5.7× on average while recovering 97.9% of the performance of the remote model alone." https://lnkd.in/d89y3u9d
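A minimal sketch of the MinionS pattern described above. The chunking, prompts, and the `local_model`/`remote_model` callables are hypothetical stand-ins, not the paper's actual protocol code:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def minions_round(document: str, task: str,
                  remote_model: Callable[[str], str],
                  local_model: Callable[[str], str],
                  chunk_size: int = 2000) -> str:
    # 1. The (expensive) remote model decomposes the task into a simple
    #    subtask to run over each short chunk of the document.
    subtask = remote_model(
        f"Write one short instruction a small model can follow on a "
        f"document excerpt to help with this task: {task}")
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # 2. The (cheap) local model executes the subtask on every chunk in
    #    parallel; only these local calls touch the full document text.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda c: local_model(f"{subtask}\n\nExcerpt:\n{c}"), chunks))
    # 3. The remote model aggregates the short partial results, so it never
    #    reads the whole document, which is where the cost savings come from.
    return remote_model(
        f"Task: {task}\nCombine these partial results:\n" + "\n".join(partials))
```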
- Qwen models are getting reasoning! "<think>...</think> QwQ-Max-Preview 🤔 Today we release "Thinking (QwQ)" in Qwen Chat, backed by our QwQ-Max-Preview, a reasoning model based on Qwen2.5-Max. This model is still a preview. It is highly capable at math, coding, agentic tasks, etc. Compared with Qwen2.5-Max, it is much smarter and far more creative. Very soon we will release the official version of QwQ-Max, and we will open-weight both QwQ-Max and Qwen2.5-Max under the Apache 2.0 license! We will also provide smaller variants, e.g., QwQ-32B, which can be deployed on local devices. And since a great number of users have been asking for our app, we will release Android and iOS apps alongside the official QwQ-Max." Blog: https://lnkd.in/gPsYJ-gy Qwen Chat: https://chat.qwen.ai
- A fantastic geometry dataset for visual RL learners, used in the EasyR1 veRL fork: https://lnkd.in/dFh9ipiA
- More data this week! GneissWeb: Preparing High Quality Data for LLMs at Scale. "In this paper, we introduce GneissWeb, a large dataset yielding around 10 trillion tokens that caters to the data quality and quantity requirements of training LLMs." "The GneissWeb dataset consists of 10T high quality tokens distilled from 96 common-crawl snapshots of FineWeb." "models trained using GneissWeb still achieve a 1.75 percentage points advantage over those trained on FineWeb-V1.1.0" https://lnkd.in/dTNAfTxM
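The GneissWeb recipe pairs exact substring deduplication with an ensemble of quality annotators (classifier scores, readability, token-distribution checks) that a document must pass to be kept. A toy sketch of that ensemble-of-filters idea; the annotator fields and thresholds below are hypothetical, not the paper's tuned values:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Doc:
    text: str
    quality_score: float       # e.g., from a fastText quality classifier
    readability: float         # e.g., a readability score
    extreme_token_frac: float  # fraction of unusually frequent tokens

def ensemble_filter(docs: Iterable[Doc],
                    quality_min: float = 0.5,
                    readability_min: float = 0.4,
                    extreme_max: float = 0.1) -> Iterator[Doc]:
    """Keep a document only if every annotator votes 'keep'."""
    for d in docs:
        if (d.quality_score >= quality_min
                and d.readability >= readability_min
                and d.extreme_token_frac <= extreme_max):
            yield d
```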
- ByteDance released their full AI stack: AIBrix, an open-source initiative designed to deliver the essential building blocks for consistent, scalable GenAI inference infrastructure. AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs. The initial release includes the following key features (a request sketch follows the list):
  - High-Density LoRA Management: streamlined support for lightweight, low-rank adaptations of models.
  - LLM Gateway and Routing: efficiently manage and direct traffic across multiple models and replicas.
  - LLM App-Tailored Autoscaler: dynamically scale inference resources based on real-time demand.
  - Unified AI Runtime: a versatile sidecar enabling metric standardization, model downloading, and management.
  - Distributed Inference: scalable architecture to handle large workloads across multiple nodes.
  - Distributed KV Cache: enables high-capacity, cross-engine KV reuse.
  - Cost-efficient Heterogeneous Serving: mixed GPU inference to reduce costs with SLO guarantees.
  - GPU Hardware Failure Detection: proactive detection of GPU hardware issues.
  https://lnkd.in/g7yyVUs3
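Since the AIBrix gateway routes OpenAI-compatible requests, a deployed model can be queried like any OpenAI-style endpoint. A sketch under that assumption; the host, port, key, and model name here are hypothetical:

```python
# Sketch: querying a model served behind the AIBrix LLM gateway.
# Assumes an OpenAI-compatible endpoint; address and model are made up.
from openai import OpenAI

client = OpenAI(
    base_url="http://aibrix-gateway.example.internal:8888/v1",
    api_key="cluster-issued-key",  # hypothetical credential
)
resp = client.chat.completions.create(
    model="deepseek-r1-distill-llama-8b",  # hypothetical deployed model
    messages=[{"role": "user", "content": "Hello from AIBrix"}],
)
print(resp.choices[0].message.content)
```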
- Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models. "In this work, we present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, purposefully made for reinforcement learning (RL). To create Big-Math, we rigorously filter, clean, and curate openly available datasets, extracting questions that satisfy our three desiderata: (1) problems with uniquely verifiable solutions, (2) problems that are open-ended, and (3) problems with a closed-form solution." "we introduce 47,000 new questions with verified answers, Big-Math-Reformulated: closed-ended questions (i.e. multiple choice questions) that have been reformulated as open-ended questions through a systematic reformulation algorithm." https://lnkd.in/dD6ZWkhJ
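Uniquely verifiable, closed-form answers are what make a dataset like this RL-ready: the reward can be computed by direct comparison rather than a learned judge. A minimal sketch of such a verifier; the \boxed{...} answer convention is an assumption for illustration, not something Big-Math mandates:

```python
import re

def extract_boxed(text: str) -> str | None:
    """Pull the final \\boxed{...} answer out of a model response.
    Assumes the prompt asked the model to box its final answer."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def verifiable_reward(response: str, gold_answer: str) -> float:
    """1.0 if the model's closed-form answer matches the gold answer,
    else 0.0. Real verifiers also normalize fractions, units, etc."""
    pred = extract_boxed(response)
    if pred is None:
        return 0.0
    try:  # compare numerically when both sides parse as numbers
        return float(abs(float(pred) - float(gold_answer)) < 1e-6)
    except ValueError:
        return float(pred == gold_answer)

print(verifiable_reward(r"The answer is \boxed{42}", "42"))  # 1.0
```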
- The good folks at Llama Factory introduced "EasyR1: an efficient, scalable, multi-modality RL training framework. We have witnessed the remarkable success of the GRPO algorithm in DeepSeek R1. Now we have extended the veRL project to support vision-language models, enabling efficient RL training for Qwen2.5-VL models. After 30 training steps, it achieves a 5% performance gain in our experiments on the Geometry3k test set 🚀 We will integrate more RL algorithms and VLM architectures in future updates. Stay tuned for more advancements!" https://lnkd.in/eb_NpFy3
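GRPO, the algorithm EasyR1 builds on, sidesteps a learned value model by normalizing rewards within a group of responses sampled for the same prompt. A minimal sketch of that advantage computation, not EasyR1's actual code:

```python
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """GRPO-style advantages: for G responses sampled from one prompt,
    each response's advantage is its reward standardized against the
    group mean and std (no value network needed)."""
    mean, std = group_rewards.mean(), group_rewards.std()
    return (group_rewards - mean) / (std + eps)

# Example: 4 sampled answers to one prompt, rewards from a verifier.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))  # [ 1. -1. -1.  1.]
```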
- The DeepSeek team open-sourced FlashMLA, an efficient MLA decoding kernel for Hopper GPUs, optimized for serving variable-length sequences. "Achieving up to 3000 GB/s in memory-bound configuration and 580 TFLOPS in computation-bound configuration on H800 SXM5, using CUDA 12.6." https://lnkd.in/dTmWgYuy
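For context, the usage pattern shown in the repo README: scheduling metadata is computed once from the per-request cache lengths, then the kernel runs per layer against a paged KV cache. The tensor variables below are placeholders, as in the README itself; the surrounding decode-loop setup is left elided:

```python
from flash_mla import get_mla_metadata, flash_mla_with_kvcache

# cache_seqlens, s_q, h_q, h_kv: per-batch lengths and head counts
# (placeholders, defined by the caller's serving setup).
tile_scheduler_metadata, num_splits = get_mla_metadata(
    cache_seqlens, s_q * h_q // h_kv, h_kv
)

for i in range(num_layers):
    ...
    # Per-layer MLA decoding over variable-length sequences with a
    # paged KV cache (q_i, kvcache_i, block_table, dv from the caller).
    o_i, lse_i = flash_mla_with_kvcache(
        q_i, kvcache_i, block_table, cache_seqlens, dv,
        tile_scheduler_metadata, num_splits, causal=True,
    )
```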