Announcing Nomic Embed Vision
All Nomic Embeddings are now multimodal with backwards compatibility.
Blog: https://lnkd.in/ewcnr28G
Nomic Embed Vision:
- Expands Nomic Embed into a high quality, unified embedding space for image, text, and multimodal tasks
- Outperforms both OpenAI CLIP and text-embedding-3-small
- Open weights and code to enable indie hacking, research, and experimentation
- Released in collaboration with MongoDB, LangChain, LlamaIndex, Amazon Web Services (AWS), Hugging Face, DigitalOcean and Lambda
Hugging Face Open Weight Models:
- v1: https://lnkd.in/eZBx2SWw
- v1.5: https://lnkd.in/e2y9aFje
Access on AWS Marketplace and in the Nomic Embedding API:
- https://lnkd.in/eCEd2ySs
- https://lnkd.in/eQFteaBx
Nomic AI
Technology, Information and Media
New York, NY 5,791 followers
Building explainable and accessible AI systems.
About us
Nomic AI builds tools to structure, understand, and collaborate with unstructured data (text, images, embeddings, video and audio). Our flagship product, Nomic Atlas, allows anyone, regardless of skill, to easily curate, visualize, and act on unstructured data at a massive scale. Users can also remove anomalies to build higher-quality ML models faster, while improving internal data collaboration and data quality.
- Website
- https://nomic.ai
- Industry
- Technology, Information and Media
- Company size
- 11-50 employees
- Headquarters
- New York, NY
- Type
- Privately Held
- Specialties
- AI, Unstructured Data, and MLOps
Locations
- Primary: 36 E 20th St, New York, NY 10003, US
Updates
- Improve your AI model performance with embedding visualization: https://lnkd.in/eaqpGdYq
- Announcing Nomic Embed Text V2, our new multilingual Mixture-of-Experts embedding model
  The highlights:
  - The first general purpose Mixture-of-Experts (MoE) embedding model
  - State of the art performance on the multilingual MIRACL benchmark for its parameter class
  - Supports over 100 languages
  - Fully open source training data, weights, & code
  - Apache 2.0 License
  This model release introduces the MoE architecture to embedding models. We also contribute a high-quality multilingual training dataset & recipe.
  Why MoE? It activates only a subset of model parameters during training and inference, encouraging only the most relevant model parameters to be used on inputs. This maintains strong performance on downstream tasks while cutting costs and memory usage.
  nomic-embed-text-v2-moe was trained on 1.6B high-quality text pairs across 100+ high-resource and low-resource languages. We used a recipe of consistency filtering to get high-quality data for contrastive pretraining, as well as hard-negative mining for supervised finetuning.
  Like our v1.5 text embedding model, nomic-embed-text-v2-moe was trained with Matryoshka Representation Learning, meaning you can reduce the dimension of your embedding vectors from 768 to 256 and see minimal downstream performance degradation while cutting storage costs by 2/3.
  Download the model:
  - https://lnkd.in/eGthrY2B
  Check out the training code and data:
  - https://lnkd.in/eB5_GJt4
  Read the blog post:
  - https://lnkd.in/ez8n4W-E
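To make the 768-to-256 reduction concrete, here is a minimal sketch of how a Matryoshka-style embedding is typically shortened: keep the leading components and re-normalize. The helper names `truncate_embedding` and `storage_saving` and the placeholder vector are assumptions for illustration, not part of any Nomic API, and the model card may prescribe additional steps.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


def storage_saving(full_dim, reduced_dim):
    """Fraction of per-vector storage saved by truncation."""
    return 1 - reduced_dim / full_dim


# Hypothetical 768-dimensional vector (placeholder values, not model output).
full = [math.sin(i + 1) for i in range(768)]
small = truncate_embedding(full, 256)

print(len(small))                 # number of components kept
print(storage_saving(768, 256))   # fraction of storage saved
```

Keeping 256 of 768 components stores one third of the original values per vector, which is where the 2/3 storage saving in the post comes from.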
- You can now run DeepSeek R1 Privately. Try it: https://lnkd.in/ejzkSnJ6
- Vector Search Any Hugging Face Dataset
  Introducing the Hugging Face Datasets Connector in Nomic Atlas
  With Atlas, you can:
  • Explore an entire Hugging Face dataset in a data map.
  • Generate, vector search, and download embeddings from the dataset.
  • Analyze datasets with powerful tools like vector search and topic modeling.
  • Easily deduplicate your Hugging Face datasets.
  • Go multiplayer with tagging, data collaboration, and share links.
  https://lnkd.in/eXeVPF68
- Explore 123,000 LinkedIn Job Postings: https://lnkd.in/eQEmxweb
- Introducing Open-Source, On-Device Inference-Time Compute in GPT4All
  - New model: GPT4All Reasoner v1
  - Support for Code Interpreter, Tool Calling and Code Sandboxing
  Inference-time compute is now available on every laptop in the world.
  Blog: https://lnkd.in/e-x8M5-q
- Web browsers are powerful tools for exploring massive datasets. And everybody already has them, which is why we're designing tools and software for the browser. Our fourth Data Mapping Series post shows what capabilities this unlocks: https://hubs.la/Q0302snL0