New research from Meta on RL for coding. SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution hits a 41.0% solve rate on SWE-bench Verified with Llama 3. https://lnkd.in/dcRjtaUx
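The core trick in SWE-RL is a rule-based reward: the model's generated patch is scored by its sequence similarity to the ground-truth patch (the paper uses Python's difflib), so near-misses earn partial credit instead of zero. A minimal sketch of that idea, not the authors' actual implementation (the paper additionally returns -1 for malformed patches, omitted here):

```python
import difflib

def swe_rl_reward(predicted_patch: str, oracle_patch: str) -> float:
    """Rule-based reward in the spirit of SWE-RL: similarity between
    the model's patch and the ground-truth patch, in [0, 1]."""
    return difflib.SequenceMatcher(None, predicted_patch, oracle_patch).ratio()

# A near-miss patch earns a partial reward instead of 0.
print(swe_rl_reward("def add(a, b):\n    return a + b\n",
                    "def add(a, b):\n    return a + b  # fixed\n"))
```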
Kentauros AI
Software Development
Wilmington, Delaware · 219 followers
See us on the web: https://www.kentauros.ai/ Come talk with us on Discord: https://discord.gg/hhaq7XYPS6
About us
Build, deploy, and share AI agents with ease on the AgentSea platform.
- Website: https://kentauros.ai
- Industry: Software Development
- Company size: 2-10 employees
- Headquarters: Wilmington, Delaware
- Type: Privately Held
- Founded: 2023
Locations
- Primary: Wilmington, Delaware 19808, US
Employees at Kentauros AI
- Jeffrey Huckabay, Senior Software Engineer at Kentauros
- Mariya Davydova, AI Agents Builder | Applied AI Advocate | Founder | Getting Things Done
- Patrick Barker, Founder / CTO
- Tapa Dipti Sitaula, Product Manager | AI & Full-Stack Solutions Builder | Volunteer Digital Consultant | MBA, Georgia Tech. MS, Carnegie Mellon University.
Updates
- An HF engineer who monitors every PR to their repos estimates that "about 20% of those are now written by AI, run by various users and companies. In some cases I chat with the AI during the PR review and it makes improvements." https://lnkd.in/dt2bjumF
- Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models. "MinionS, in which the remote model decomposes the task into easier subtasks over shorter chunks of the document, that are executed locally in parallel. MinionS reduces costs by 5.7× on average while recovering 97.9% of the performance of the remote model alone." https://lnkd.in/d89y3u9d
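A minimal sketch of the MinionS pattern described above. The chunking, prompts, and the `local_model`/`remote_model` callables are hypothetical stand-ins, not the paper's actual protocol code:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def minions_round(document: str, task: str,
                  remote_model: Callable[[str], str],
                  local_model: Callable[[str], str],
                  chunk_size: int = 2000) -> str:
    # 1. The (expensive) remote model decomposes the task into a simple
    #    subtask to run over each short chunk of the document.
    subtask = remote_model(
        f"Write one short instruction a small model can follow on a "
        f"document excerpt to help with this task: {task}")
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # 2. The (cheap) local model executes the subtask on every chunk in
    #    parallel; only these local calls touch the full document text.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda c: local_model(f"{subtask}\n\nExcerpt:\n{c}"), chunks))
    # 3. The remote model aggregates the short partial results, so it never
    #    reads the whole document, which is where the cost savings come from.
    return remote_model(
        f"Task: {task}\nCombine these partial results:\n" + "\n".join(partials))
```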
- Qwen models are getting reasoning! "<think>...</think> QwQ-Max-Preview 🤔 Today we release "Thinking (QwQ)" in Qwen Chat, backed by our QwQ-Max-Preview, a reasoning model based on Qwen2.5-Max. This model is still a preview. It is highly capable at math, coding, agentic tasks, etc. Compared with Qwen2.5-Max, it is much smarter and far more creative. Very soon we will release the official version of QwQ-Max, and we will open-weight both QwQ-Max and Qwen2.5-Max under the Apache 2.0 license! We will also provide smaller variants, e.g., QwQ-32B, which can be deployed on local devices. And since a great number of users have been asking for our app, we will release Android and iOS apps alongside the official QwQ-Max." Blog: https://lnkd.in/gPsYJ-gy Qwen Chat: https://chat.qwen.ai
- A fantastic geometry dataset for visual RL learners, used in the EasyR1 veRL fork: https://lnkd.in/dFh9ipiA
- More data this week! GneissWeb: Preparing High Quality Data for LLMs at Scale. "In this paper, we introduce GneissWeb, a large dataset yielding around 10 trillion tokens that caters to the data quality and quantity requirements of training LLMs." "The GneissWeb dataset consists of 10T high quality tokens distilled from 96 common-crawl snapshots of FineWeb." "models trained using GneissWeb still achieve a 1.75 percentage points advantage over those trained on FineWeb-V1.1.0" https://lnkd.in/dTNAfTxM
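The GneissWeb recipe pairs exact substring deduplication with an ensemble of quality annotators (classifier scores, readability, token-distribution checks) that a document must pass to be kept. A toy sketch of that ensemble-of-filters idea; the annotator fields and thresholds below are hypothetical, not the paper's tuned values:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Doc:
    text: str
    quality_score: float       # e.g., from a fastText quality classifier
    readability: float         # e.g., a readability score
    extreme_token_frac: float  # fraction of unusually frequent tokens

def ensemble_filter(docs: Iterable[Doc],
                    quality_min: float = 0.5,
                    readability_min: float = 0.4,
                    extreme_max: float = 0.1) -> Iterator[Doc]:
    """Keep a document only if every annotator votes 'keep'."""
    for d in docs:
        if (d.quality_score >= quality_min
                and d.readability >= readability_min
                and d.extreme_token_frac <= extreme_max):
            yield d
```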
- ByteDance released their full AI stack: AIBrix, an open-source initiative designed to deliver the essential building blocks for consistent, scalable GenAI inference infrastructure. AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs. The initial release includes the following key features (a request sketch follows the list):
  - High-Density LoRA Management: streamlined support for lightweight, low-rank adaptations of models.
  - LLM Gateway and Routing: efficiently manage and direct traffic across multiple models and replicas.
  - LLM App-Tailored Autoscaler: dynamically scale inference resources based on real-time demand.
  - Unified AI Runtime: a versatile sidecar enabling metric standardization, model downloading, and management.
  - Distributed Inference: scalable architecture to handle large workloads across multiple nodes.
  - Distributed KV Cache: enables high-capacity, cross-engine KV reuse.
  - Cost-efficient Heterogeneous Serving: mixed GPU inference to reduce costs with SLO guarantees.
  - GPU Hardware Failure Detection: proactive detection of GPU hardware issues.
  https://lnkd.in/g7yyVUs3
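Since the AIBrix gateway routes OpenAI-compatible requests, a deployed model can be queried like any OpenAI-style endpoint. A sketch under that assumption; the host, port, key, and model name here are hypothetical:

```python
# Sketch: querying a model served behind the AIBrix LLM gateway.
# Assumes an OpenAI-compatible endpoint; address and model are made up.
from openai import OpenAI

client = OpenAI(
    base_url="http://aibrix-gateway.example.internal:8888/v1",
    api_key="cluster-issued-key",  # hypothetical credential
)
resp = client.chat.completions.create(
    model="deepseek-r1-distill-llama-8b",  # hypothetical deployed model
    messages=[{"role": "user", "content": "Hello from AIBrix"}],
)
print(resp.choices[0].message.content)
```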
- Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models. "In this work, we present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers, purposefully made for reinforcement learning (RL). To create Big-Math, we rigorously filter, clean, and curate openly available datasets, extracting questions that satisfy our three desiderata: (1) problems with uniquely verifiable solutions, (2) problems that are open-ended, and (3) problems with a closed-form solution." "we introduce 47,000 new questions with verified answers, Big-Math-Reformulated: closed-ended questions (i.e. multiple choice questions) that have been reformulated as open-ended questions through a systematic reformulation algorithm." https://lnkd.in/dD6ZWkhJ
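Uniquely verifiable, closed-form answers are what make a dataset like this RL-ready: the reward can be computed by direct comparison rather than a learned judge. A minimal sketch of such a verifier; the \boxed{...} answer convention is an assumption for illustration, not something Big-Math mandates:

```python
import re

def extract_boxed(text: str) -> str | None:
    """Pull the final \\boxed{...} answer out of a model response.
    Assumes the prompt asked the model to box its final answer."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def verifiable_reward(response: str, gold_answer: str) -> float:
    """1.0 if the model's closed-form answer matches the gold answer,
    else 0.0. Real verifiers also normalize fractions, units, etc."""
    pred = extract_boxed(response)
    if pred is None:
        return 0.0
    try:  # compare numerically when both sides parse as numbers
        return float(abs(float(pred) - float(gold_answer)) < 1e-6)
    except ValueError:
        return float(pred == gold_answer)

print(verifiable_reward(r"The answer is \boxed{42}", "42"))  # 1.0
```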
- The good folks at Llama Factory introduced "EasyR1: an efficient, scalable, multi-modality RL training framework. We have witnessed the remarkable success of the GRPO algorithm in DeepSeek R1. Now we have extended the veRL project to support vision-language models, enabling efficient RL training for Qwen2.5-VL models. After 30 training steps, it achieves a 5% performance gain in our experiments on the Geometry3k test set 🚀 We will integrate more RL algorithms and VLM architectures in future updates. Stay tuned for more advancements!" https://lnkd.in/eb_NpFy3
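GRPO, the algorithm EasyR1 builds on, sidesteps a learned value model by normalizing rewards within a group of responses sampled for the same prompt. A minimal sketch of that advantage computation, not EasyR1's actual code:

```python
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """GRPO-style advantages: for G responses sampled from one prompt,
    each response's advantage is its reward standardized against the
    group mean and std (no value network needed)."""
    mean, std = group_rewards.mean(), group_rewards.std()
    return (group_rewards - mean) / (std + eps)

# Example: 4 sampled answers to one prompt, rewards from a verifier.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))  # [ 1. -1. -1.  1.]
```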
- The DeepSeek team open-sourced FlashMLA, an efficient MLA decoding kernel for Hopper GPUs, optimized for serving variable-length sequences. "Achieving up to 3000 GB/s in memory-bound configuration and 580 TFLOPS in computation-bound configuration on H800 SXM5, using CUDA 12.6." https://lnkd.in/dTmWgYuy
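For context, the usage pattern shown in the repo README: scheduling metadata is computed once from the per-request cache lengths, then the kernel runs per layer against a paged KV cache. The tensor variables below are placeholders, as in the README itself; the surrounding decode-loop setup is left elided:

```python
from flash_mla import get_mla_metadata, flash_mla_with_kvcache

# cache_seqlens, s_q, h_q, h_kv: per-batch lengths and head counts
# (placeholders, defined by the caller's serving setup).
tile_scheduler_metadata, num_splits = get_mla_metadata(
    cache_seqlens, s_q * h_q // h_kv, h_kv
)

for i in range(num_layers):
    ...
    # Per-layer MLA decoding over variable-length sequences with a
    # paged KV cache (q_i, kvcache_i, block_table, dv from the caller).
    o_i, lse_i = flash_mla_with_kvcache(
        q_i, kvcache_i, block_table, cache_seqlens, dv,
        tile_scheduler_metadata, num_splits, causal=True,
    )
```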