Alternatives to CloudSight API

Compare CloudSight API alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to CloudSight API in 2026. Compare features, ratings, user reviews, pricing, and more from CloudSight API competitors and alternatives in order to make an informed decision for your business.

  • 1
    Google Cloud Vision AI
    Derive insights from your images in the cloud or at the edge with AutoML Vision or use pre-trained Vision API models to detect emotion, understand text, and more. Google Cloud offers two computer vision products that use machine learning to help you understand your images with industry-leading prediction accuracy. Automate the training of your own custom machine learning models. Simply upload images and train custom image models with AutoML Vision’s easy-to-use graphical interface; optimize your models for accuracy, latency, and size; and export them to your application in the cloud, or to an array of devices at the edge. Google Cloud’s Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Assign labels to images and quickly classify them into millions of predefined categories. Detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog.
  • 2
    Amazon Rekognition
    Amazon Rekognition makes it easy to add image and video analysis to your applications using proven, highly scalable, deep learning technology that requires no machine learning expertise to use. With Amazon Rekognition, you can identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content. Amazon Rekognition also provides highly accurate facial analysis and facial search capabilities that you can use to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases. With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs. For example, you can build a model to classify specific machine parts on your assembly line or to detect unhealthy plants. Amazon Rekognition Custom Labels takes care of the heavy lifting of model development for you, so no machine learning experience is required.
  • 3
    Azure Computer Vision
    Boost content discoverability, automate text extraction, analyze video in real time, and create products that more people can use by embedding vision capabilities in your apps. Use visual data processing to label content with objects and concepts, extract text, generate image descriptions, moderate content, and understand people’s movement in physical spaces. No machine learning expertise is required.
  • 4
    Imagga

    Imagga

    Build the next generation of Image Recognition Applications with Imagga's API. Empowering intelligent apps with our customizable machine learning technology. Automatically assign tags to your images. Powerful API for image analysis and discovery. Empower product discoverability in your application. Powerful API for building visual search capabilities. Unlock facial recognition in your applications. Powerful API for building face recognition. Train our image A.I. to better organize your photos in your own list of categories. Automatically categorize your image content. Powerful API for instant image classification. Automated adult image content moderation trained on state of the art image recognition technology. Automatically generate beautiful thumbnails. Powerful API for content-aware cropping. Let colors bring meaning to your product's photos. Powerful API for color extraction.
    Starting Price: $79 per month
  • 5
    fullmoon

    fullmoon

    Fullmoon is a free, open source application that enables users to interact with large language models directly on their devices, ensuring privacy and offline accessibility. Optimized for Apple silicon, it operates seamlessly across iOS, iPadOS, macOS, and visionOS platforms. Users can personalize the app by adjusting themes, fonts, and system prompts, and it integrates with Apple's Shortcuts for enhanced functionality. Fullmoon supports models like Llama-3.2-1B-Instruct-4bit and Llama-3.2-3B-Instruct-4bit, facilitating efficient on-device AI interactions without the need for an internet connection.
  • 6
    Hive Data
    Create training datasets for computer vision models with our fully managed solution. We believe that data labeling is the most important factor in building effective deep learning models. We are committed to being the field's leading data labeling platform and helping companies take full advantage of AI's capabilities. Organize your media with discrete categories. Identify items of interest with one or many bounding boxes. Like bounding boxes, but with additional precision. Annotate objects with accurate width, depth, and height. Classify each pixel of an image. Mark individual points in an image. Annotate straight lines in an image. Measure yaw, pitch, and roll of an item of interest. Annotate timestamps in video and audio content. Annotate freeform lines in an image.
    Starting Price: $25 per 1,000 annotations
  • 7
    SensePhoto

    SenseTime

    Based on deep learning technology, SensePhoto provides multi-camera and single-camera portrait blur, re-lighting, super-resolution, image quality enhancement, and intelligent album management to intelligent terminal devices. Universal port interfaces support hassle-free integration. Offers customers professional and speedy technical support. Provides a wide range of product features and produces high-quality professional image processing effects with industry-leading technology. Extensive experience in AI and deep learning, leading big data-driven image analysis algorithms, and a professional product development team. Proprietary technology empowers businesses and services. SenseTime is a leading AI software company focused on creating a better AI-empowered future through innovation, upholding a vision of advancing the interconnection of the physical and digital worlds with AI.
  • 8
    imgix

    Zebrafish Labs

    Powerful image processing, simple API. imgix transforms, optimizes, and intelligently caches your entire image library for fast websites and apps using simple and robust URL parameters. We don’t charge to create variations of your Master Images. You can be as creative with the service as possible. Over 100 real-time image operations, plus client libraries and CMS plugins for easy integrations with your product. Serve optimized images to every device quickly with a worldwide CDN optimized for visual content. Browse, search, sort, and organize all of your cloud storage images. Resize, crop, and enhance your images with simple URL parameters. Intelligent, automated compression that eliminates unnecessary bytes. Customers see images fast thanks to imgix's caching and global CDN. Introducing imgix Image Management. Transform your cloud bucket into a sophisticated platform that allows you to finally see what your images can do for you.
  • 9
    Cloudmersive

    Cloudmersive

    Cloudmersive offers a wide range of powerful APIs for various business needs, including virus scanning, document conversion, image recognition, and natural language processing (NLP). Their platform is designed for scalability and flexibility, providing solutions for both cloud and on-premise deployment. With over 16 programming languages supported, Cloudmersive allows businesses to integrate sophisticated functionalities like OCR, barcode scanning, and security threat detection into their applications with ease. Trusted by companies worldwide, Cloudmersive's APIs are engineered to enhance operational efficiency and ensure data security.
  • 10
    Eden AI

    Eden AI

    Eden AI simplifies the use and deployment of AI technologies by providing a unique API connected to the best AI engines. Your time is precious: we take care of providing you with the AI engine best suited to your project and your data. No need to wait for weeks to change your AI engine. You can do it for free in a few seconds. We make sure to get you the cheapest provider while ensuring equal performance.
    Starting Price: $29/month/user
  • 11
    Azure AI Services
    Build cutting-edge, market-ready AI applications with out-of-the-box and customizable APIs and models. Quickly infuse generative AI into production workloads using studios, SDKs, and APIs. Gain a competitive edge by building AI apps powered by foundation models, including those from OpenAI, Meta, and Microsoft. Detect and mitigate harmful use with built-in responsible AI, enterprise-grade Azure security, and responsible AI tooling. Build your own copilot and generative AI applications with cutting-edge language and vision models. Retrieve the most relevant data using keyword, vector, and hybrid search. Monitor text and images to detect offensive or inappropriate content. Translate documents and text in real time across more than 100 languages.
  • 12
    LiteRT

    Google

    LiteRT (Lite Runtime), formerly known as TensorFlow Lite, is Google's high-performance runtime for on-device AI. It enables developers to deploy machine learning models across various platforms and microcontrollers. LiteRT supports models from TensorFlow, PyTorch, and JAX, converting them into the efficient FlatBuffers format (.tflite) for optimized on-device inference. Key features include low latency, enhanced privacy by processing data locally, reduced model and binary sizes, and efficient power consumption. The runtime offers SDKs in multiple languages such as Java/Kotlin, Swift, Objective-C, C++, and Python, facilitating integration into diverse applications. Hardware acceleration is achieved through delegates like GPU and iOS Core ML, improving performance on supported devices. LiteRT Next, currently in alpha, introduces a new set of APIs that streamline on-device hardware acceleration.
  • 13
    ZETIC.ai

    ZETIC.ai

    Easily switch to server-less AI and start saving money today. It works on any NPU device and any OS. ZETIC.ai solves AI companies’ problems with on-device AI solutions using NPUs. Say goodbye to the enormous expenses of maintaining GPU servers and AI cloud services. Our server-less AI system reduces your costs significantly. Our automated pipeline ensures that the entire process is completed within just one day, streamlining your transition to on-device AI. We provide a tailored AI pipeline from data processing to deployment, including hardware-specific optimization and an on-device AI runtime library, ensuring a seamless conversion to on-device AI. Easily implement on-target on-device AI model libraries with our automated pipeline, while reducing massive GPU server costs and enhancing security with serverless AI to upgrade your AI. With ZETIC.ai’s unique technology, AI models can be ported directly to on-device AI applications without any loss.
  • 14
    Ai2 OLMoE

    The Allen Institute for Artificial Intelligence

    Ai2 OLMoE is a fully open source mixture-of-experts language model that is capable of running completely on-device, allowing you to try our model privately and securely. Our app is intended to help researchers better explore how to make on-device intelligence better and to enable developers to quickly prototype new AI experiences, all with no cloud connectivity required. OLMoE is a highly efficient mixture-of-experts version of the Ai2 OLMo family of models. Experience which real-world tasks state-of-the-art local models are capable of. Research how to improve small AI models. Test your own models locally using our open-source codebase. Integrate OLMoE into other iOS applications. The Ai2 OLMoE app provides privacy and security by operating completely on-device. Easily share the output of your conversations with friends or colleagues. The OLMoE model and the application code are fully open source.
  • 15
    Sirv

    Sirv

    Image CDN for resizing and optimizing your images for extremely fast delivery. Sirv automatically detects the optimal image dimensions, resolution, and format for each user. Automatic format conversion, so your website serves the best next-gen image formats such as WebP instead of PNG or JPEG. Entirely automated and relied upon by over 30,000 businesses for the best possible image optimisation. Easily organise, search and tag your images in Sirv's digital asset management (DAM) service at https://my.sirv.com. It's a pleasure to use: fast and simple. Create your free trial now and start benefiting from the fastest image CDN service of them all.
  • 16
    Azure AI Content Safety
    Azure AI Content Safety is a content moderation platform that uses AI to keep your content safe. Create better online experiences for everyone with powerful AI models that detect offensive or inappropriate content in text and images quickly and efficiently. Language models analyze multilingual text, in both short and long form, with an understanding of context and semantics. Vision models perform image recognition and detect objects in images using state-of-the-art Florence technology. AI content classifiers identify sexual, violent, hate, and self-harm content with high levels of granularity. Content moderation severity scores indicate the level of content risk on a scale of low to high.
  • 17
    DecentAI

    Catena Labs

    DecentAI provides:
    - Anonymized mobile access to hundreds of generative AI models: Explore models for text, image, audio, and vision.
    - Model Mixes and flexible model routing: Mix and match models, choose specific favorites, or let DecentAI select the best for you.
    - If one model is slow or unavailable, DecentAI seamlessly switches to another provider, ensuring a smooth and efficient experience.
    - Privacy-first design: Chats are stored on your device, not on our servers.
    - AI internet access: Allow models to pull in the latest information through anonymized web search.
    - Soon, you’ll be able to run models locally on your device and connect your own private models.
  • 18
    Foundry Local

    Microsoft

    Foundry Local is a local version of Azure AI Foundry that enables local execution of large language models (LLMs) directly on your Windows device. This on-device AI inference solution provides privacy, customization, and cost benefits compared to cloud-based alternatives. Best of all, it fits into your existing workflows and applications with an easy-to-use CLI and REST API.
  • 19
    BlackBerry Optics
    Our cloud-native BlackBerry® Optics provide visibility, on-device threat detection and remediation across your organization. In milliseconds. And our EDR approach effectively and efficiently hunts threats while eliminating response latency. It’s the difference between a minor security event—and one that’s widespread and uncontrolled. Identify security threats and trigger automated responses on-device with AI-driven security and context-driven threat detection rules to reduce detection and remediation time. Gain visibility with consolidated, AI-driven security and an enterprise-wide view of all endpoint activity, empowering detection and response capabilities for online and offline devices. Enable threat hunting and root cause analysis experiences with intuitive query language and up to 365 days of data retention options.
  • 20
    LFM2

    Liquid AI

    LFM2 is a next-generation series of on-device foundation models built to deliver the fastest generative-AI experience across a wide range of endpoints. It employs a new hybrid architecture that achieves up to 2x faster decode and prefill performance than comparable models, and up to 3x improvements in training efficiency compared to the previous generation. These models strike an optimal balance of quality, latency, and memory for deployment on embedded systems, allowing real-time, on-device AI across smartphones, laptops, vehicles, wearables, and other endpoints, enabling millisecond inference, device resilience, and full data sovereignty. Available in three dense checkpoints (0.35 B, 0.7 B, and 1.2 B parameters), LFM2 demonstrates benchmark performance that outperforms similarly sized models in tasks such as knowledge recall, mathematics, multilingual instruction-following, and conversational dialogue evaluations.
  • 21
    Zighra

    Zighra

    Seamlessly onboard and continuously protect users and enable passwordless access. Our real-time AI models are built to learn 10X faster than traditional algorithms. World’s first FIDO certified behavioral authentication technology that runs entirely on-device. Each of your customers is a unique human being. You know that, and Zighra knows how to prove that. Zighra’s patented technology delivers real-time behavioral intelligence and powerful security controls to continuously ascertain the identity of the customer, without the slightest disruption to user experience. With Zighra, you know exactly when you are interacting with your customer and when you are not, down to the very second. Flexible delivery options (on-premise, cloud, or on-device) allow choice. Users are asked to perform a specific action as an authenticator, such as holding the phone and swiping across the screen, to determine whether the user or a bot is trying to use the device.
  • 22
    Diagnosis Pad

    Diagnosis Pad

    Diagnosis Pad uses private on-device AI to generate diagnoses, guidance, transcriptions, and clinical notes in real time. Privacy: All AI processing happens offline and on your device. For maximum privacy, no data is sent to online servers. How to Use: Simply tap Start Session to begin transcribing your session and processing the on-device intelligence. Diagnosis: As the session progresses, the top three diagnoses will be generated. You can explore these in detail to understand why each is being suggested for your specific context. Recommendations: The top three recommendations will also be generated, and can be expanded for more detail as well. Notes: A summary of the transcript is generated at the end of the session. Settings: You can toggle having the diagnosis, recommendations, and notes generated live in-session or when the session has completed.
  • 23
    ABBYY Mobile Capture
    Mobile document capture and on-device text recognition. ABBYY Mobile Capture is an SDK that offers automatic data capture within your mobile app, providing real-time recognition and capturing photos of documents for on-device or back-end processing. A premium mobile onboarding process offers your customers a frictionless way to capture and provide self-servicing trailing documents to increase retention rates. Meet your customers’ expectations by minimizing manual interactions within your mobile apps and maximizing the ease-of-use for the end-user. Easy-to-integrate, pre-built, comprehensive mobile capture solution for your mobile application that saves development time and delivers best-quality results. Document processing and data capture with exceptional accuracy and ongoing learning continuously improves straight-through-processing rates. Automatically captures the best-quality image suitable for further back-end processing.
  • 24
    Apollo

    Liquid AI

    Apollo is a lightweight mobile application designed for fully on-device, cloud-free AI interactions, enabling users to engage with advanced language and vision models securely, privately, and with low latency. It supports a library of small foundation models from the company’s LEAP platform, allowing users to draft messages, emails, chat with a private AI assistant, craft digital characters, or use image-to-text capabilities, all without an internet connection and with no data leaving the device. Apollo is optimized for real-time responsiveness and offline operation, ensuring that inference happens entirely locally, with no API calls, servers, or user-data logging involved. It serves as both a personal AI playground and a testing bed for developers using LEAP models, letting one “vibe-check” how a model performs on their own mobile hardware before broader deployment.
  • 25
    Private LLM

    Private LLM

    Private LLM is a local AI chatbot for iOS and macOS that works offline, keeping your information completely on-device, safe, and private. It doesn't need the internet to work, so your data never leaves your device. It stays just with you. With no subscription fees, you pay once and use it on all your Apple devices. It's designed for everyone, with easy-to-use features for generating text, helping with language, and a whole lot more. Private LLM uses the latest AI models quantized with state-of-the-art quantization techniques to provide a high-quality on-device AI experience without compromising your privacy. It's a smart, secure way to get creative and productive, anytime and anywhere. Private LLM opens the door to the vast possibilities of AI with support for an extensive selection of open-source LLMs, including the Llama 3, Google Gemma, Microsoft Phi-2, and Mixtral 8x7B families and many more, on your iPhones, iPads, and Macs.
  • 26
    Gemma 3n

    Google DeepMind

    Gemma 3n is our state-of-the-art open multimodal model, engineered for on-device performance and efficiency. Made for responsive, low-footprint local inference, Gemma 3n empowers a new wave of intelligent, on-the-go applications. It analyzes and responds to combined images and text, with video and audio coming soon. Build intelligent, interactive features that put user privacy first and work reliably offline. Mobile-first architecture, with a significantly reduced memory footprint. Co-designed by Google's mobile hardware teams and industry leaders. 4B active memory footprint with the ability to create submodels for quality-latency tradeoffs. Gemma 3n is our first open model built on this groundbreaking, shared architecture, allowing developers to begin experimenting with this technology today in an early preview.
  • 27
    Geode

    OmniIntelliLink Pte. Ltd.

    Geode is an on-device AI application for capturing, understanding, and structuring meetings—processed on your own devices for privacy-sensitive professional work. Geode is built for professionals who need to capture conversations and extract structured insights without routing sensitive content through external processing infrastructure. Learn more at geodeclarity.com. On macOS, Geode performs transcription, speaker separation, and AI summarization directly on Apple Silicon. The iPhone app serves as a lightweight companion for recording and review, while compute-intensive AI processing is handled on the Mac. Geode does not transmit recordings, transcripts, or summaries for remote processing. User content is not used for AI model training. By keeping meeting data local and under the user’s control, Geode supports privacy-sensitive and regulated professional workflows, including legal, consulting, healthcare, and executive use cases.
    Starting Price: $8.99/month/user
  • 28
    DeepSeek-VL

    DeepSeek

    DeepSeek-VL is an open source Vision-Language (VL) model designed for real-world vision and language understanding applications. Our approach is structured around three key dimensions: We strive to ensure our data is diverse, scalable, and extensively covers real-world scenarios, including web screenshots, PDFs, OCR, charts, and knowledge-based content, aiming for a comprehensive representation of practical contexts. Further, we create a use case taxonomy from real user scenarios and construct an instruction tuning dataset accordingly. The fine-tuning with this dataset substantially improves the model's user experience in practical applications. Considering efficiency and the demands of most real-world scenarios, DeepSeek-VL incorporates a hybrid vision encoder that efficiently processes high-resolution images (1024 x 1024), while maintaining a relatively low computational overhead.
  • 29
    Azure AI Custom Vision
    Create a custom computer vision model in minutes. Customize and embed state-of-the-art computer vision image analysis for specific domains with AI Custom Vision, part of Azure AI Services. Build frictionless customer experiences, optimize manufacturing processes, accelerate digital marketing campaigns, and more. No machine learning expertise is required. Set your model to perceive a particular object for your use case. Easily build your image identifier model using the simple interface. Start training your computer vision model by simply uploading and labeling a few images. The model tests itself on these and continually improves precision through a feedback loop as you add images. To speed development, use customizable, built-in models for retail, manufacturing, and food. See how Minsur, one of the world's largest tin mines, uses AI Custom Vision for sustainable mining. Rely on enterprise-grade security and privacy for your data and any trained models.
    Starting Price: $2 per 1,000 transactions
  • 30
    Blitline

    Blitline

    Spend less & scale your apps with ease with Blitline’s Image Processing-as-a-Service (IPaaS). Blitline provides the most affordable Image Processing as a Service (IPaaS) solution for media and software companies that need bulk image and media processing at scale. From digital asset management (DAM) platforms and content management systems (CMS) to digital education sites and online marketplaces, the Blitline JSON API is a better alternative to Open Source solutions that bottleneck user experience innovations and expensive outsourced services that charge by the gigabyte and are primarily geared towards image and video formats only. Get started with Blitline today for an all-in-one enterprise solution that will boost your secure media processing performance and lower your total cost of ownership. Massive. We maintain a cluster of machines as big as anyone. Always on demand. Smart. We were the first to market in 2011 and have been growing ever since.
    Starting Price: $9 per month
  • 31
    Genspark AI Browser
    Genspark AI Browser is a desktop browser with built-in AI features that run on the user’s device; no internet is needed for core model responses. It includes agent tools that assist during web browsing, comparing products, analyzing reviews, finding better deals, and helping with informed decision-making on any site. There is an autopilot mode that can automatically browse feeds, gather information, access premium databases, and perform complex web tasks without user intervention. The browser includes ad-blocking so that banners, pop-ups, and intrusive ads are blocked automatically to provide a cleaner, faster browsing experience. There’s also an MCP store, which lets users connect their browser to over 700 tools to enable workflow automation. The emphasis is on privacy (on-device AI), speed, and reducing friction in browsing, shopping, research, or general web tasks.
  • 32
    Google AI Edge Gallery
    Google AI Edge Gallery is an experimental, open source Android app that demonstrates on-device machine learning and generative AI use cases, letting users download and run models locally (so they work offline once installed). It offers several features including AI Chat (multi-turn conversation), Ask Image (upload or use images to ask questions, identify objects, get descriptions), Audio Scribe (transcribe or translate recorded/uploaded audio), Prompt Lab (for single-turn tasks such as summarization, rewriting, code generation), and performance insights (metrics like latency, decode speed, etc.). Users can switch between different compatible models (including Gemma 3n and models from Hugging Face), bring their own LiteRT models, and explore model cards and source code for transparency. The app aims to protect privacy by doing all processing on the device, no internet connection needed for core operations after models are loaded, reducing latency, and enhancing data security.
  • 33
    Moondream

    Moondream

    Moondream is an open source vision language model designed for efficient image understanding across various devices, including servers, PCs, mobile phones, and edge devices. It offers two primary variants, Moondream 2B, a 1.9-billion-parameter model providing robust performance for general-purpose tasks, and Moondream 0.5B, a compact 500-million-parameter model optimized for resource-constrained hardware. Both models support quantization formats like fp16, int8, and int4, allowing for reduced memory usage without significant performance loss. Moondream's capabilities include generating detailed image captions, answering visual queries, performing object detection, and pinpointing specific items within images. Its design emphasizes versatility and accessibility, enabling deployment across a wide range of platforms.
  • 34
    Ministral 8B

    Mistral AI

    Mistral AI has introduced two advanced models for on-device computing and edge applications, named "les Ministraux": Ministral 3B and Ministral 8B. These models excel in knowledge, commonsense reasoning, function-calling, and efficiency within the sub-10B parameter range. They support up to 128k context length and are designed for various applications, including on-device translation, offline smart assistants, local analytics, and autonomous robotics. Ministral 8B features an interleaved sliding-window attention pattern for faster and more memory-efficient inference. Both models can function as intermediaries in multi-step agentic workflows, handling tasks like input parsing, task routing, and API calls based on user intent with low latency and cost. Benchmark evaluations indicate that les Ministraux consistently outperform comparable models across multiple tasks. As of October 16, 2024, both models are available, with Ministral 8B priced at $0.1 per million tokens.
  • 35
    SnappKit

    SnappKit

    SnappKit is a screenshot API built for developers who need reliable image generation without managing browser infrastructure.
    The problem: Setting up Puppeteer or Playwright means managing browser clusters, handling memory leaks, debugging timeout errors, and scaling infrastructure. It's weeks of work before you capture your first screenshot.
    The solution: One API call. Screenshots in under 2 seconds. 99.9% uptime.
    Key features:
    - URL to screenshot: Capture any webpage with full CSS rendering
    - HTML to image: Render raw HTML directly (perfect for dynamic OG images)
    - Multiple formats: PNG, JPEG, WebP output
    - Full customization: Viewport size, device emulation, full-page capture
    - Fast and reliable: Sub-2s response times, 99.9% uptime SLA
    Use cases:
    - Dynamic Open Graph image generation
    - Website thumbnails and link previews
    - Visual regression testing
    - PDF and report generation
    - Social media card automation
    Starting Price: $9/month
  • 36
    LFM2.5

    Liquid AI

    Liquid AI’s LFM2.5 is the next generation of on-device AI foundation models designed to deliver high-performance, efficient AI inference on edge devices such as phones, laptops, vehicles, IoT systems, and embedded hardware without relying on cloud compute. It extends the previous LFM2 architecture by significantly increasing the pretraining scale and reinforcement learning stages, yielding a family of hybrid models around 1.2 billion parameters that balance instruction following, reasoning, and multimodal capabilities for real-world agentic use cases. The LFM2.5 family includes Base (for fine-tuning and customization), Instruct (general-purpose instruction-tuned), Japanese-optimized, Vision-Language, and Audio-Language variants, all optimized for fast, on-device inference under tight memory constraints and available as open-weight models deployable via frameworks like llama.cpp, MLX, vLLM, and ONNX.
  • 37
    Ailiverse NeuCore
    Build & scale with ease. With NeuCore you can develop, train, and deploy your computer vision model in a few minutes and scale it to millions. It is a one-stop platform that manages the model lifecycle, including development, training, deployment, and maintenance. Advanced data encryption protects your information at all stages of the process, from training to inference. Fully integrable vision AI models fit easily into your existing workflows and systems, or even onto edge devices. Seamless scalability accommodates your growing and evolving business requirements. The image segmentation model divides an image into segments for the different objects within it. The text extraction model pulls text from images, making it machine-readable, and also works on handwriting. With NeuCore, building computer vision models is as easy as drag-and-drop and one-click. For more customization, advanced users can access provided code scripts and follow tutorial videos.
  • 38
    Ministral 3B

    Mistral AI

    Mistral AI introduced two state-of-the-art models for on-device computing and edge use cases, named "les Ministraux": Ministral 3B and Ministral 8B. These models set a new frontier in knowledge, commonsense reasoning, function-calling, and efficiency in the sub-10B category. They can be used or tuned for various applications, from orchestrating agentic workflows to creating specialist task workers. Both models support up to 128k context length (currently 32k on vLLM), and Ministral 8B features a special interleaved sliding-window attention pattern for faster and memory-efficient inference. These models were built to provide a compute-efficient and low-latency solution for scenarios such as on-device translation, internet-less smart assistants, local analytics, and autonomous robotics. Used in conjunction with larger language models like Mistral Large, les Ministraux also serve as efficient intermediaries for function-calling in multi-step agentic workflows.
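    The sliding-window attention mentioned above can be illustrated with a small sketch. This shows only the general idea of a causal sliding-window mask, where each token attends to itself and a fixed number of preceding tokens. It is not Mistral AI's implementation: how Ministral 8B interleaves its pattern is not described here, and the function names and the window size of 3 are hypothetical choices for illustration.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key position j.

    Illustrative only: position i sees itself and the (window - 1) positions
    before it, never a later position.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the (query, key) pairs that the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    # Hypothetical sizes: 6 tokens, window of 3.
    for row in sliding_window_causal_mask(6, 3):
        print("".join("x" if allowed else "." for allowed in row))
```

    With a window much smaller than the sequence, the number of allowed pairs grows roughly linearly with sequence length instead of quadratically, which is where the memory and speed savings come from.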
  • 39
    Voicekey

    Voicekey

    Voicekey is a patented voice biometrics product that uses stateless neural network (NN) technology/AI to help solve non-face-to-face identity authentication and identification security challenges. At heart, Voicekey is a computational NN/AI engine that is consumed on-device or server-based as part of an identity security application. The Voicekey processes involved in enrolment and verification are accessed on-device or server-based through an SDK, depending on the platform (Java, iOS, Android, Windows Mobile, and Windows), or through a RESTful API. Voicekey is a user-configurable software ‘lock’ that can only be opened by the voice of a registered user (the lock comes from the NN/AI technology).
  • 40
    dope.swg

    dope.security

    Your new SWG. Eliminate the datacenter and perform security checks directly on the endpoint for stronger privacy, better reliability, and up to 4x faster performance. The Fly-Direct architecture means all the functionality takes place on-device, without sacrificing performance. Users will find that speed, reliability, and privacy all improve when migrating from a legacy SWG. dope.swg features integrated URL filtering, anti-malware, Cloud Application Controls, Shadow IT, and user/group-based policies. It’s fully customizable: you decide where users can go. In the rare event that the dope.cloud is down, fail-safe features allow access to trusted company-defined websites while blocking new requests for user safety. dope.swg’s endpoint-driven proxy solves the reliability, performance, and privacy issues that customers face every day with legacy SWGs. Trial and install the proxy on your device with a few clicks.
    Starting Price: $60 per month
  • 41
    Ray2

    Luma AI

    Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion. It has a strong understanding of text instructions and can take images and video as input. Ray2 exhibits advanced capabilities as a result of being trained on Luma’s new multi-modal architecture scaled to 10x compute of Ray1. Ray2 marks the beginning of a new generation of video models capable of producing fast coherent motion, ultra-realistic details, and logical event sequences. This increases the success rate of usable generations and makes videos generated by Ray2 substantially more production-ready. Text-to-video generation is available in Ray2 now, with image-to-video, video-to-video, and editing capabilities coming soon. Ray2 brings a whole new level of motion fidelity. Smooth, cinematic, and jaw-dropping, transform your vision into reality. Tell your story with stunning, cinematic visuals. Ray2 lets you craft breathtaking scenes with precise camera movements.
    Starting Price: $9.99 per month
  • 42
    Qwen2-VL

    Alibaba

    Qwen2-VL is the latest version of the vision-language models based on Qwen2 in the Qwen model families. Compared with Qwen-VL, Qwen2-VL adds the following capabilities. SoTA understanding of images of various resolutions and ratios: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. Understanding videos of 20 min+: Qwen2-VL can understand videos over 20 minutes long for high-quality video-based question answering, dialog, content creation, etc. Agent that can operate your mobiles, robots, etc.: with its abilities in complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones and robots for automatic operation based on the visual environment and text instructions. Multilingual support: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images.
  • 43
    Doppel

    Doppel

    Detect phishing scams on websites, social media, mobile app stores, gaming platforms, paid ads, the dark web, digital marketplaces, and more. Identify the highest-impact phishing attacks, counterfeits, and more with next-gen natural language and computer vision models. Track enforcements with an auto-generated audit trail through our no-code UI that works out of the box. Stop adversaries before they scam your customers and team. Scan millions of websites, social media accounts, mobile apps, paid ads, etc. Use AI to categorize brand infringement and phishing scams. Automatically remove threats as they are detected. Doppel's system has integrations with domain registrars, social media, app stores, digital marketplaces, the dark web, and countless platforms across the Internet. This gives you comprehensive visibility and automated protection against external threats.
  • 44
    Kurate

    Genuus

    Build 'Love at first sight' personal experiences across digital channels and devices for your customers with our Experience Hub. Our hybrid cloud Content Management System allows you to manage, personalize, and distribute content across multiple channels. Extend content not just to ecommerce websites, corporate websites, mobile apps, and social media, but also to IoT devices, voice assistants, kiosks, and digital signage. A single 'source of truth' for all your digital marketing activities. DMPM empowers you to manage and segment contacts, execute social media, email, and SMS campaigns, and provide analytics of your digital marketing activities. Achieve your brand's performance KPIs through our AI-driven multi-channel marketing tool. Curate and manage all your media files, digital artworks, images, videos, architectural drawings, PowerPoint presentations, documents, and other media content. A must-have tool that supports your organization's digital transformation journey.
    Starting Price: $16.50 per user, per month
  • 45
    Palmyra LLM
    Palmyra is a suite of Large Language Models (LLMs) engineered for precise, dependable performance in enterprise applications. These models excel in tasks such as question-answering, image analysis, and support for over 30 languages, with fine-tuning available for industries like healthcare and finance. Notably, Palmyra models have achieved top rankings in benchmarks like Stanford HELM and PubMedQA, and Palmyra-Fin is the first model to pass the CFA Level III exam. Writer ensures data privacy by not using client data to train or modify their models, adopting a zero data retention policy. The Palmyra family includes specialized models such as Palmyra X 004, featuring tool-calling capabilities; Palmyra Med, tailored for healthcare; Palmyra Fin, designed for finance; and Palmyra Vision, which offers advanced image and video processing. These models are available through Writer's full-stack generative AI platform, which integrates graph-based Retrieval Augmented Generation (RAG).
    Starting Price: $18 per month
  • 46
    NetsPresso

    Nota AI

    NetsPresso is a hardware-aware AI model optimization platform that powers on-device AI across industries. Lightweight versions of LLaMA and Vicuna enable efficient text generation. BK-SDM is a lightweight version of Stable Diffusion models. VLMs combine visual data with natural language understanding. NetsPresso addresses issues with cloud- and server-based AI solutions, such as limited network access, excessive cost, and privacy breaches. NetsPresso is an automatic model compression platform that downsizes computer vision models until they are small enough to be deployed independently on small edge and low-specification devices. Because optimizing the target model is key, the platform combines a variety of compression methods, which lets it downsize AI models without causing performance degradation.
  • 47
    Aiko

    Aiko

    High-quality on-device transcription. Easily convert speech to text from meetings, lectures, and more. The transcription is powered by OpenAI's Whisper running locally on your device. The audio never leaves your device.
  • 48
    Alibaba Image Search
    Alibaba Cloud Image Search is an intelligent image search service that helps users find similar or identical images. Based on machine learning and deep learning, the product enables end users to take a screenshot or upload an image to search for desired products and fulfill other search requests. It allows your customers to use a product image to search for products in an image library. This feature simplifies the shopping process and is suitable for shopping scenarios where content-based image retrieval (CBIR) is required. After your customers search for products with an image, the system automatically recommends the same or similar products. This feature is suitable for product recommendation scenarios and improves the shopping experience of your customers.
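    As a rough illustration of content-based image retrieval in general, the sketch below ranks a small library of image feature vectors by cosine similarity to a query vector. It does not use or represent the Alibaba Cloud Image Search API: the function names, the three-dimensional vectors, and the product names are hypothetical, and a real system would obtain its feature vectors from a trained image model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query: list[float], library: dict[str, list[float]], top_k: int = 2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical feature vectors standing in for embedded product images.
    library = {
        "red_sneaker": [0.9, 0.1, 0.0],
        "red_boot": [0.8, 0.3, 0.1],
        "blue_mug": [0.0, 0.2, 0.9],
    }
    for name, score in search([1.0, 0.1, 0.0], library):
        print(f"{name}: {score:.3f}")
```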
  • 49
    QuickWhisper

    IWT Pty Ltd

    QuickWhisper is a macOS application for transcription, dictation, and AI summarization using OpenAI's Whisper model. It runs entirely on-device with no cloud dependency required. The application transcribes audio from local files, YouTube videos, online meetings, and system audio. QuickWhisper can record meetings with calendar integration while keeping the recording interface hidden during screen sharing. System-wide dictation works across all macOS applications, replacing keyboard input with voice. All transcription runs on your Mac. AI summarization is available through cloud providers (OpenAI, Anthropic, Google, xAI, Mistral, Groq) or on-device via Ollama and LM Studio. QuickWhisper also includes batch transcription, Watch Folders for automatic background transcription, speaker diarization, Apple Shortcuts integration, and webhooks for third-party service integration.
    Starting Price: $39 one-time payment
  • 50
    Smart Engines

    Smart Engines

    Green AI-powered scanner SDK for ID cards, passports, driver’s licenses, residence permits, visas, and other IDs, more than 1834 types in total. Provides an eco-friendly, fast, and precise scanning SDK for smartphone, web, desktop, or server that works fully autonomously. Extracts data from photos and scans, as well as from the video stream of a smartphone or web camera, and is robust to capturing conditions. No data transfer: ID scanning is performed on-device and on-premise. Automatically scans machine-readable zones (MRZ); all types of credit cards (embossed, indent-printed, and flat-printed); and barcodes (PDF417, QR code, AZTEC, DataMatrix, and others) on the fly with a smartphone’s camera. Provides high-quality MRZ, barcode, and credit card scanning in mobile applications on-device regardless of lighting conditions. Supports card scanning for 21 payment systems.