MongoDB Blog
Announcements, updates, news, and more
New in MongoDB Atlas Stream Processing: External Function Support
Today we're excited to introduce External Functions, a new capability in MongoDB Atlas Stream Processing that lets you invoke AWS Lambda directly from your streaming pipelines. The addition of External Functions to Atlas Stream Processing unlocks new ways to enrich, validate, and transform data in-flight, enabling smarter and more modular event-driven applications. This functionality is available through a new pipeline stage, $externalFunction.
What are external functions?
External functions allow you to integrate Atlas Stream Processing with external logic services such as AWS Lambda. This lets you reuse existing business logic, perform AI/ML inference, or enrich and validate data as it moves through your pipeline, all without needing to rebuild that logic directly in your pipeline definition.
AWS Lambda is a serverless compute service that runs your code in response to events, scales automatically, and supports multiple languages (JavaScript, Python, Go, etc.). Because there's no infrastructure to manage, Lambda is ideal for event-driven systems. Now, by using external functions, you can seamlessly plug that logic into your streaming workloads.
Where $externalFunction fits in your pipeline
MongoDB Atlas Stream Processing can connect to a wide range of sources and output to various sinks. The diagram below shows a typical streaming architecture: Atlas Stream Processing ingests data, enriches it with stages like $https and $externalFunction, and routes the transformed results to various destinations.
Figure 1. A high-level visual of a stream processing pipeline.
The $externalFunction stage can be placed anywhere in your pipeline (except as the initial source stage), allowing you to inject external logic at any step. Atlas Stream Processing supports two modes for invoking external functions—synchronous and asynchronous.
Synchronous execution type
In synchronous mode, the pipeline calls the Lambda function and waits for a response. The result is stored in a user-defined field (using the "as" key) and passed into the following stages.
let syncEF = {
  $externalFunction: {
    connectionName: "myLambdaConnection",
    functionName: "arn:aws:lambda:region:account-id:function:function-name",
    execution: "sync",
    as: "response",
    onError: "fail",
    payload: [
      { $replaceRoot: { newRoot: "$fullDocument.payloadToSend" } },
      { $addFields: { sum: { $sum: "$randomArray" } } },
      { $project: { success: 1, sum: 1 } }
    ]
  }
}
Let's walk through what each part of the $externalFunction stage does in this synchronous setup:
connectionName: the external function connection name specified in the Connection Registry.
functionName: the full AWS ARN or the name of the AWS Lambda function.
execution: indicates synchronous execution ("sync") as opposed to asynchronous ("async").
as: specifies that the Lambda response will be stored in the "response" field.
onError: the behavior when the operator encounters an error (in this case, "fail" stops the processor). The default is to add the event to the dead letter queue.
payload: an inner pipeline that lets you customize the request body. Using it reduces the size of the data passed and ensures only relevant data is sent to the external function.
This type is useful when you want to enrich or transform a document using external logic before it proceeds through the rest of the pipeline.
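To make the synchronous flow concrete, here is a rough sketch of what an event might look like before and after this stage, assuming the Lambda simply echoes back the fields it receives plus the computed sum. The field values are illustrative assumptions, not output from a real run.
// Hypothetical event entering the stage
{
  fullDocument: {
    payloadToSend: { success: true, randomArray: [1, 2, 3] }
  }
}
// After $externalFunction (execution: "sync", as: "response"), the Lambda's
// return value is added to the document under the "response" field:
{
  fullDocument: {
    payloadToSend: { success: true, randomArray: [1, 2, 3] }
  },
  response: { success: true, sum: 6 }
}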
Asynchronous execution type
In async mode, the function is called, but the pipeline does not wait for a response. This is useful when you want to notify downstream systems, trigger external workflows, or pass data into AWS without halting the pipeline.
let asyncEF = {
  $externalFunction: {
    connectionName: "EF-Connection",
    functionName: "arn:aws:lambda:us-west-1:12112121212:function:EF-Test",
    execution: "async"
  }
}
Use the async execution type for propagating information outward, for example:
Triggering downstream AWS applications or analytics
Notifying external systems
Firing off alerts or billing logic
Real-world use case: Solar device diagnostics
To illustrate the power of external functions, let's walk through an example: a solar energy company wants to monitor real-time telemetry from thousands of solar devices. Each event includes sensor readings (e.g., temperature, power output) and metadata like device_id and timestamp. These events need to be processed, enriched, and then stored in a MongoDB Atlas collection for dashboards and alerts.
This can easily be accomplished using a synchronous external function. Each event is sent to a Lambda function that enriches the record with a status (e.g., ok, warning, critical) as well as diagnostic comments. The processor waits for the enriched event to be returned and then writes it to the desired MongoDB collection.
Step 1: Define the external function connection
First, create a new AWS Lambda connection in the Connection Registry within Atlas. You can authenticate using Atlas's Unified AWS Access, which securely connects Atlas and your AWS account.
Figure 2. Adding an AWS Lambda connection in the UI.
Step 2: Implement the Lambda function
Here's a simple diagnostic function. It receives solar telemetry data, checks it against thresholds, and returns a structured result.
export const handler = async (event) => {
  const { device_id, group_id, watts, temp, max_watts, timestamp } = event;
  // Default thresholds
  const expectedTempRange = [20, 40]; // Celsius
  const wattsLowerBound = 0.6 * max_watts; // 60% of max output
  let status = "ok";
  let messages = [];
  // Wattage check
  if (watts < wattsLowerBound) {
    status = "warning";
    messages.push(`Observed watts (${watts}) below 60% of max_watts (${max_watts}).`);
  }
  // Temperature check
  if (temp < expectedTempRange[0] || temp > expectedTempRange[1]) {
    status = "warning";
    messages.push(`Temperature (${temp}°C) out of expected range [${expectedTempRange[0]}–${expectedTempRange[1]}].`);
  }
  // If multiple warnings, escalate to critical
  if (messages.length > 1) {
    status = "critical";
  }
  return {
    device_id,
    status,
    timestamp,
    watts_expected_range: [wattsLowerBound, max_watts],
    temp_expected_range: expectedTempRange,
    comment: messages.length ? messages.join(" ") : "All readings within expected ranges."
  };
};
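As a quick sanity check, the handler can be exercised locally with a made-up telemetry event before wiring it into the pipeline. The values below are illustrative assumptions, not sample data from the solar stream, and the snippet assumes the handler above is in scope (or imported) in a local test file.
// Hypothetical local test of the diagnostic handler
const sampleEvent = {
  device_id: "device_001",
  group_id: "group_A",
  watts: 180,        // below 60% of max_watts (240), so this triggers a warning
  temp: 45,          // above the expected range, a second warning escalates to "critical"
  max_watts: 400,
  timestamp: "2025-06-14T10:00:00Z"
};
handler(sampleEvent).then((result) => console.log(result));
// Expected shape: { device_id, status: "critical", timestamp,
//                   watts_expected_range, temp_expected_range, comment }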
Step 3: Create the streaming pipeline
Using VS Code, define a stream processor using the sample solar stream as input.
let s = {
  $source: {
    connectionName: 'sample_stream_solar'
  }
};
// Define the External Function
let EFStage = {
  $externalFunction: {
    connectionName: "telemetryCheckExternalFunction",
    onError: "fail",
    functionName: "arn:aws:lambda:us-east-1:121212121212:function:checkDeviceTelemetry",
    as: "responseFromLambda",
  }
};
// Replace the original document with the Lambda response
let projectStage = {
  $replaceRoot: { newRoot: "$responseFromLambda" }
};
// Merge the results into a DeviceTelemetryResults collection
let sink = {
  $merge: {
    into: {
      connectionName: "IoTDevicesCluster",
      db: "SolarDevices",
      coll: "DeviceTelemetryResults"
    }
  }
};
sp.createStreamProcessor("monitorSolarDevices", [s, EFStage, projectStage, sink]);
sp.monitorSolarDevices.start();
Once running, the processor ingests live telemetry data, invokes the Lambda diagnostics logic, and writes the enriched results—complete with status and diagnostic comments—to MongoDB Atlas.
Step 4: View enriched results in MongoDB Atlas
Explore the enriched data in MongoDB Atlas using the Data Explorer. For example, filter for all documents where status = "ok" after a specific date.
Figure 3. Data Explorer filtering for all documents with a status of "ok" from June 14 onwards.
Smarter stream processing with external logic
MongoDB Atlas Stream Processing external functions allow you to enrich your data stream with logic that lives outside the pipeline, making your processing smarter and more adaptable. In this example, we used AWS Lambda to apply device diagnostics in real time and store the results in MongoDB. You could easily extend this to use cases in fraud detection, personalization, enrichment from third-party APIs, and more.
Log in today to get started, or check out our documentation to create your first external function. Have an idea for how you'd use external functions in your pipelines? Let us know in the MongoDB community forum!
Introducing Query Shape Insights in MongoDB Atlas
As modern applications scale, databases are often the first to show signs of stress, especially when query patterns shift or inefficiencies arise. MongoDB has invested in building a robust observability suite to help teams monitor and optimize performance. Tools such as the Query Profiler and, more recently, Namespace Insights provide deep visibility into query behavior and collection-level activity. While powerful, these capabilities primarily focus on individual queries or collections, limiting their ability to surface systemic patterns that impact overall application performance.
Today, MongoDB is excited to announce Query Shape Insights, a powerful new feature for MongoDB Atlas that offers a high-resolution, holistic view of how queries behave at scale across clusters. Query Shape Insights delivers a paradigm shift in visibility by surfacing aggregated statistics for the most resource-intensive query shapes. This accelerates root cause analysis, streamlines optimization workflows, and improves operational efficiency.
Figure 1. Overview page of Query Shape Insights showing the most resource-intensive query shapes.
A new granularity for performance analysis
Previously, if a modern application experienced a traffic surge, it risked overloading the database with queries, causing rapid performance degradation. In those critical moments, developers and database administrators had to quickly identify the queries contributing most acutely to the bottleneck, which meant scrutinizing logs or per-query samples.
With the launch of Query Shape Insights, the top 100 query shapes are surfaced by grouping structurally similar queries—those sharing filters, projections, and aggregation stages—into defined query shapes. These query shapes are then ranked by total execution time, offering MongoDB Atlas users greater visibility into the most resource-intensive queries. Each query shape is enriched with detailed metrics such as execution time, operation count, number of documents examined and returned, and bytes read. These metrics are rendered as time series data, enabling developers and database administrators to pinpoint when regressions began, how long they persisted, and what triggered them.
Figure 2. Detailed view of a query shape, with a pop-up displaying associated metrics.
This new feature integrates seamlessly into the performance workflows teams use to monitor, debug, and optimize applications. Each query shape includes associated client metadata, such as application name, driver version, and host. This empowers teams to identify which services, applications, or teams impact performance. This level of visibility is particularly valuable for microservices-based environments, where inefficiencies might manifest across multiple teams and services.
Query Shape Insights adapts based on cluster tier to support varying workload sizes. Teams can analyze the performance data of each query shape over a 7-day window. This enables them to track trends, find changes in application behavior, and identify slow regressions that might otherwise be missed.
Integration with MongoDB's observability suite
Query Shape Insights was designed to enable MongoDB Atlas users to move from detection to resolution with unprecedented speed and clarity. Built directly into the MongoDB Atlas experience, this feature is a clear starting point for performance investigations. This is imperative for dynamic environments where application behavior evolves rapidly and bottlenecks must be identified and resolved quickly.
The Query Shape Insights dashboard offers comprehensive, time series–based analysis of query patterns across clusters. It enables teams to detect inefficiencies and understand when and how workloads have changed. Query Shape Insights answers critical diagnostic questions by surfacing the most resource-intensive query shapes. It identifies the workloads that consume the most resources and can help determine whether these workloads are expected or anomalous. It can also help identify the emergence of new workloads and reveal how workloads have changed over time.
To support this level of analysis, Query Shape Insights offers a rich set of capabilities, giving teams the clarity and speed they need to troubleshoot intelligently and maintain high-performing applications:
Unified query performance view: Monitor query shapes to rapidly identify and investigate bottlenecks.
Detailed query shape statistics: Track key metrics including execution time, document return counts, and execution frequency.
Interactive analysis tools: Drill down into query shapes to view detailed metadata and performance trends.
Flexible filtering options: Narrow analysis by shard/host, date range, namespace, or operation type.
Programmatic access: Leverage MongoDB's new Admin API endpoint to integrate query shape data with your existing observability stack.
After using Query Shape Insights, MongoDB Atlas users can pivot directly to the Query Profiler with filters pre-applied to the specific collection and operation type for more information beyond what Query Shape Insights provides. Once they have traced the issue to its root, users can continue their diagnostics journey with Performance Advisor, which recommends indexes tailored to the query shape, ensuring that cluster optimizations are data-driven and precise.
Query Shape Insights is a leap forward in how teams manage, investigate, and respond to performance issues with MongoDB. By introducing a high-level, shape-aware view of query activity, Query Shape Insights enhances traditional reactive troubleshooting with greater clarity. This enables teams to troubleshoot faster and monitor performance effectively.
Query Shape Insights is now available for all MongoDB Atlas dedicated cluster (M10 and above) deployments. Clusters must run MongoDB 8.0 or later to access this feature. Support for Cloud Manager deployments is planned for the future. Check out MongoDB's documentation for more details on Query Shape Insights. Start using Query Shape Insights today through your MongoDB Atlas portal.
Build Event-Driven Apps Locally with MongoDB Atlas Stream Processing
Building event-driven architectures (EDAs) often poses challenges, particularly when you're integrating complex cloud components with local development services. For developers, working directly from a local environment provides convenience, speed, and flexibility. Our demo application demonstrates a development workflow that balances local service integration with cloud stream processing, showcasing portable, real-time event handling using MongoDB Atlas Stream Processing and ngrok.
With MongoDB Atlas Stream Processing, you can streamline the development of event-driven systems while keeping all the components local. Used alongside ngrok, this demo application shows a secure way to interact with cloud services directly from your laptop, ensuring you can build, test, and refine applications with minimal friction and maximum efficiency.
Using MongoDB Atlas Stream Processing
MongoDB Atlas Stream Processing is a powerful feature within the MongoDB Atlas modern database that enables you to process data streams in real time using the familiar MongoDB Query API (and aggregation pipeline syntax). It integrates seamlessly with MongoDB Atlas clusters, Apache Kafka, AWS Lambda, and external HTTP endpoints.
Key takeaway #1: Build event-driven apps more easily with MongoDB Atlas Stream Processing
One of the primary goals of MongoDB Atlas Stream Processing is to simplify the development of event-driven applications. Instead of managing separate stream processing clusters or complex middleware, you can define your processing logic directly within MongoDB Atlas. This means:
A unified platform: Keep your data storage and stream processing within the same ecosystem.
Familiar syntax: Use the MongoDB Query API and aggregation pipelines you already know.
Managed infrastructure: Let MongoDB Atlas handle the underlying infrastructure, scaling, and availability for your stream processors.
Key takeaway #2: Develop and test locally, deploy globally
A significant challenge in developing event-driven systems is bridging the gap between your local development environment and cloud-based services. How do you test interactions with services running on your laptop? You can configure MongoDB Atlas Stream Processing to connect securely to HTTP services and even Apache Kafka instances running directly on your development machine! You can typically achieve this using a tunneling service like ngrok, which creates secure, publicly accessible URLs for your local services. MongoDB Atlas Stream Processing requires HTTPS for HTTP endpoints and specific Simple Authentication and Security Layer (SASL) protocols for Apache Kafka, making ngrok an essential tool for this local development workflow.
Introducing the real-time order fulfillment demo
To showcase these capabilities in action, we've built a full-fledged demo application available on GitHub.
Figure 1. High-level architecture diagram.
This demo simulates a real-time order fulfillment process using an event-driven architecture orchestrated entirely by MongoDB Atlas Stream Processing.
What the demo features
A shopping cart service: Generates events when cart items change.
An order processing service: Handles order creation and validation (running locally as an HTTP service).
A shipment service: Manages shipment updates.
Event source flexibility: Can ingest events from either a MongoDB capped collection or an Apache Kafka topic (which can also run locally).
Processors from Atlas Stream Processing: Act as the central nervous system, reacting to events and triggering actions in the different services.
An order history database: Centralizes status updates for easy tracking.
Figure 2. High-level sequence diagram of a flow.
How the demo uses MongoDB Atlas Stream Processing and local development
Event orchestration: MongoDB Atlas Stream Processing instances listen for shopping cart events (from MongoDB or Kafka).
Local service interaction: An ASP processor calls the order processing service running locally on localhost via an ngrok HTTPS tunnel (a rough sketch appears at the end of this article).
Kafka integration (optional): Demonstrates ASP connecting to a local Kafka broker, also tunneled via ngrok.
Data enrichment & routing: Processors enrich events and route them appropriately (e.g., validating orders, triggering shipments).
Centralized logging: All services write status updates to a central MongoDB collection that functions as a continuously materialized view of order status and history.
This demo practically illustrates how you can build sophisticated, event-driven applications using ASP while performing key development and testing directly on your laptop, interacting with local services just as you would in a deployed environment.
What the demo highlights
Real-world EDA: Provides a practical example of asynchronous service communication.
Orchestration powered by MongoDB Atlas Stream Processing: Shows how this service manages complex event flows.
Local development workflow: Proves the concept of connecting this service to local HTTP / Apache Kafka services via ngrok.
Flexible event ingestion: Supports both MongoDB and Apache Kafka sources.
Centralized auditing: Demonstrates easy status tracking via a dedicated history collection.
Get started with the demo!
MongoDB Atlas Stream Processing significantly lowers the barrier to entry for building robust, real-time EDAs. Its ability to integrate seamlessly with MongoDB Atlas, external services, and, crucially, your local development environment (thanks to tools like ngrok) makes it a powerful addition to the developer toolkit. Explore the demo project, dive into the code, and see for yourself how ASP can simplify your next event-driven architecture, starting right from your own laptop! Ready to see it in action? Head over to the GitHub repository! The repository's README.md file contains comprehensive, step-by-step instructions to get you up and running. In summary, you'll:
Clone the repository.
Set up a Python virtual environment and install dependencies.
Crucially, set up ngrok to expose your local order processing service (and Apache Kafka, if applicable) via secure tunnels. (Details in the README.md appendix!)
Configure your .env file with MongoDB Atlas credentials, API keys, and the ngrok URLs.
Run scripts to create the necessary databases, collections, and the MongoDB Atlas Stream Processing instance/connections/processors.
Start the local order_processing_service.py.
Run the shopping_cart_event_generator.py to simulate events.
Query the order history to see the results!
For detailed setup guidance, especially regarding ngrok configuration for multiple local services (HTTP and TCP / Apache Kafka), please refer to the appendix of the project's README.md.
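To make the local-service interaction concrete, a processor stage calling the locally tunneled order service might look roughly like the sketch below. This is not code from the demo repository: the connection name, route, and response field are assumptions, the HTTPS connection in the Connection Registry is assumed to point at the ngrok URL for the local service, and the exact operator fields should be checked against the $https documentation.
// Hypothetical $https stage calling the local order processing service through ngrok
let callOrderService = {
  $https: {
    connectionName: "orderServiceViaNgrok",  // assumed HTTPS connection backed by the ngrok URL
    method: "POST",
    path: "/orders",                         // assumed route exposed by the local service
    payload: [
      { $project: { cartId: 1, items: 1, total: 1 } } // send only the fields the service needs
    ],
    as: "orderServiceResponse"
  }
};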
Data Modeling Strategies for Connected Vehicle Signal Data in MongoDB
Today's connected vehicles generate massive amounts of data. According to an article from S&P Global Mobility, a single modern car produces nearly 25 GB of data per hour. To put that in perspective: that's like each car pumping out the equivalent of six full-length Avatar movies in 4K—every single day! Now scale that across millions of vehicles, and it's easy to see the challenge ahead. Of course, not all of that data needs to be synchronized to the cloud—but even a fraction of it puts significant pressure on the systems tasked with processing, storing, and analyzing it at scale.
The challenge isn't just about volume. The data is fast-moving and highly diverse—from telematics and location tracking to infotainment usage and driver behavior. Without a consistent structure, this data is hard to use across systems and organizations. That's why organizations across the industry are working to standardize how vehicle data is defined and exchanged. One such example is the Connected Vehicle Systems Alliance, or COVESA, which developed the Vehicle Signal Specification (VSS)—a widely adopted, open data model that helps normalize vehicle signals and improve interoperability.
But once data is modeled, how do you ensure it's persistent and available in real time? To meet these demands, you need a data layer that's flexible, reliable, and performant at scale. This is precisely where a robust data solution designed for modern needs becomes essential. In this blog, we'll explore data strategies for connected vehicle systems using VSS as a reference model, with a focus on real-world applications like fleet management. These strategies are particularly effective when implemented on flexible, high-performance databases like MongoDB, a solution trusted by leading automotive companies.
Is your data layer ready for the connected car era?
Relational databases were built in an era when saving storage space was the top priority. They work well when data fits neatly into tables and columns—but that's rarely the case with modern, high-volume, and fast-moving vehicle data. Telematics, GPS coordinates, sensor signals, infotainment activity, diagnostic logs—this data is complex, semi-structured, and constantly evolving. Trying to force it into a rigid schema quickly becomes a bottleneck.
That's why many in the automotive world are moving to document-oriented databases. A full-fledged data solution, designed for modern needs, can significantly simplify how you work with data, scale effortlessly as demands grow, and adapt quickly as systems evolve. A solution embodying these capabilities, like MongoDB, supports the demands of complex connected vehicle systems. Its features include:
Reduced complexity: The document model mirrors the way developers already structure data in their code. This makes it a natural fit for vehicle data, which often comes in nested, hierarchical formats.
Scale by design: MongoDB's distributed architecture and flexible schema design help simplify scaling. It reduces interdependencies, making it easier to shard workloads without performance headaches.
Built for change: Vehicle platforms are constantly evolving, and MongoDB makes it easy to update data models without costly migrations or downtime, keeping development fast and agile.
AI-ready: MongoDB supports a wide variety of data types—structured, time series, vector, graph—which are essential for AI-driven applications. This makes it a natural choice for AI workloads, simplifying data integration and accelerating the development of smart systems.
Figure 1. The MongoDB connected car data platform.
These capabilities are especially relevant in connected vehicle systems. Companies like Volvo Connect use MongoDB Atlas to track 65 million daily events from over a million vehicles, ensuring real-time visibility at massive scale. Another example is SHARE NOW, which handles 2TB of IoT data per day from 11,000 vehicles across 16 cities, using MongoDB to streamline operations and deliver better mobility experiences.
It's not just the data—it's how you use it
Data modeling is where good design turns into great performance. In traditional relational systems, modeling starts with entities and relationships, with a focus on minimizing data duplication. MongoDB flips that mindset. You still care about entity relationships—but what really drives design is how the data will be used. The core principle? Data that is accessed together should be stored together.
Let's bring this to life. Take a fleet management system. The workload includes vehicle tracking, diagnostics, and usage reporting. Modeling in MongoDB starts by understanding how that data is produced and consumed. Who's reading it, when, and how often? What's being written, and at what rate? Below, we show a simplified workload table that maps out entities, operations, and expected rates.
Table 1. Fleet management workload example.
Now, to the big question: how do you model connected vehicle signal data in MongoDB? It depends on the workload. If you're using COVESA's VSS as your signal definition model, you already have a helpful structure. VSS defines signals as a hierarchy: attributes (which rarely change, like tank size), sensors (which update often, like speed), and actuators (which reflect commands, like door lock requests). This classification is a great modeling hint.
VSS's tree structure maps neatly to MongoDB documents. You could store the whole tree in a single document, but in most cases it's more effective to use multiple documents per vehicle. This approach better reflects how the data is produced and consumed—leading to a model that's better suited for performance at scale. Now, let's look at two examples that show different strategies depending on the workload.
Figure 2. Sample VSS tree. Source: Vehicle Signal Specification documentation.
Example 1: Modeling for historical analysis
For historical analysis—like tracking fuel consumption trends—time-stamped data needs to be stored efficiently. Access patterns may include queries like "What was the average fuel consumption per km in the last hour?" or "How did the fuel level change over time?" Here, separating static attributes from dynamic sensor signals helps minimize unnecessary updates. Grouping signals by component (e.g., powertrain, battery) allows updates to be scoped and efficient. MongoDB Time Series collections are built for exactly this kind of data, offering optimized storage, automatic bucketing, and fast time-based queries.
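As a minimal sketch of this pattern—the collection, field, and signal names are illustrative assumptions, not part of the VSS specification—a time series collection for powertrain signals could be created and written to like this in mongosh:
// Create a time series collection for high-frequency sensor signals
db.createCollection("powertrainSignals", {
  timeseries: {
    timeField: "ts",        // timestamp of the reading
    metaField: "meta",      // per-vehicle/component metadata used for bucketing
    granularity: "seconds"
  }
});
// Insert one reading, grouped by component
db.powertrainSignals.insertOne({
  ts: new Date(),
  meta: { vehicleId: "VIN123", component: "powertrain" },
  speed_kmh: 82,
  fuel_level_pct: 54.2,
  fuel_consumption_l_per_100km: 6.1
});
Time-based queries (for example, the average consumption over the last hour) can then run against the collection with a standard aggregation pipeline.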
Example 2: Modeling for the last vehicle state
If your focus is real-time state—like retrieving the latest signal values for a vehicle—you'll prioritize fast reads and lightweight updates. Common queries include "What's the latest coolant temperature?" or "Where are all fleet vehicles right now?" In this case, storing a single document per vehicle or update group with only the most recent signal values works well. Updating fields in place avoids document growth and keeps read complexity low. Grouping frequently updated signals together and flattening nested structures ensures that performance stays consistent as data grows. (A minimal sketch of this last-state pattern appears at the end of this article.)
These are just two examples—tailored for different workloads—but MongoDB offers the flexibility to adapt your model as needs evolve. For a deeper dive into MongoDB data modeling best practices, check out our MongoDB University course and explore our Building with Patterns blog series. The right model isn't one-size-fits-all—it's the one that matches your workload.
How to model your vehicle signal data
At the COVESA AMM Spring 2025 event, the MongoDB Industry Solutions team presented a prototype to help simplify how connected vehicle systems adopt the Vehicle Signal Specification. The concept: make it easier to move from abstract signal definitions to practical, scalable database designs. The goal wasn't to deliver a production-ready tool—it was to spark discussion, test ideas, and validate patterns. It resonated with the community, and we're continuing to iterate on it. For now, the use cases are limited, but they highlight important design decisions: how to structure vehicle signals, how to tailor that structure to the needs of an application, and how to test those assumptions in MongoDB.
Figure 3. Vehicle Signal Data Model prototype high-level architecture.
This vehicle signals data modeler is a web-based prototype built with Next.js and powered by MongoDB Atlas. It's made up of three core modules:
Schema builder: This is where it starts. You can visually explore the vehicle signals tree, select relevant data points, and define how they should be structured in your schema.
Use case mapper: Once the schema is defined, this module helps map how the signals are used. Which signals are read together? Which are written most often? These insights help identify optimization opportunities before the data even hits your database.
Database exporter: Finally, based on what you've defined, the tool generates an initial database schema optimized for your workload. You can load it with sample data, export it to a live MongoDB instance, and run aggregation pipelines to validate the design.
Together, these modules walk you through the journey—from signal selection to schema generation and performance testing—all within a simple, intuitive interface.
Figure 4. Vehicle signal data modeler demo in action.
Build smarter, adapt faster, and scale more confidently
Connected vehicle systems aren't just about collecting data—they're about using it, fast and at scale. To get there, you need more than a standardized signal model. You need a data solution that can keep up with constant change, massive volume, and real-time demands. That's where MongoDB stands out. Its flexible document model, scalable architecture, and built-in support for time series and AI workloads make it a natural fit for the complexities of connected mobility. Whether you're building fleet dashboards, predictive maintenance systems, or next-gen mobility services, MongoDB helps you turn vehicle data into real-world outcomes—faster.
To learn more about MongoDB's connected mobility solutions, visit the MongoDB for Manufacturing & Mobility webpage. You can also explore the vehicle signals data modeler prototype and related resources on our GitHub repository.
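Returning to Example 2 above, here is a minimal sketch of the last-state pattern in mongosh. The collection name, signal fields, and vehicle identifier are illustrative assumptions:
// Upsert the latest known state, keeping a single document per vehicle
db.vehicleLatestState.updateOne(
  { _id: "VIN123" },                              // one document per vehicle
  {
    $set: {
      "powertrain.coolantTemp_c": 88,
      "powertrain.speed_kmh": 82,
      location: { type: "Point", coordinates: [13.4, 52.5] },
      updatedAt: new Date()
    }
  },
  { upsert: true }
);
// Reads stay cheap: the latest state for a vehicle comes back in a single lookup
db.vehicleLatestState.findOne({ _id: "VIN123" });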
Introducing Text-to-MQL with LangChain: Query MongoDB using Natural Language
We're excited to announce that we've added a powerful new capability to the MongoDB integration for LangChain: Text-to-MQL. This enhancement allows developers to easily transform natural language queries into MongoDB Query Language (MQL), enabling them to build new and intuitive application interfaces powered by large language models (LLMs). Whether you're building chatbots that interact with internal company data stored in MongoDB or AI agents that work directly with MongoDB, this LangChain toolkit delivers out-of-the-box natural language querying with Text-to-MQL.
Enabling new interfaces with Text-to-MQL
LLMs are transforming the workplace by enabling people to "talk" to their data. Historically, accessing and querying databases required specialized knowledge or tools. Now, with natural language querying enabled by LLMs, developers can create new, intuitive interfaces that give virtually anyone access to data and insights—no specialized skills required. Using Text-to-MQL, developers can build applications that rely on natural language to generate insights or create visualizations for their users. This includes conversational interfaces that query MongoDB directly, democratizing database exploration and interaction.
Robust database querying capabilities through natural language are also critical for building more sophisticated agentic systems. Agents leveraging MongoDB through MQL can interact autonomously with both operational and analytical data, greatly enhancing productivity across a wide range of operational and business tasks.
Figure 1. Agent components and how MongoDB powers tools and memory.
For instance, customer support agents leveraging Text-to-MQL capabilities can autonomously retrieve the most recent customer interactions and records directly from MongoDB databases, enabling faster and more informed responses. Similarly, agents generating application code can query database collections and schemas to ensure accurate and relevant data retrieval logic. In addition, MongoDB's flexible document model aligns more naturally with how users describe data in plain language. Its support for nested, denormalized data in JSON-like BSON documents reduces the need for multi-table joins—an area where LLMs often struggle—making MongoDB more LLM-friendly than traditional SQL databases.
Implementing Text-to-MQL with MongoDB and LangChain
The LangChain and MongoDB integration package provides a comprehensive set of tools to accelerate AI application development. It supports advanced retrieval-augmented generation (RAG) implementations through integrations with MongoDB for vector search, hybrid search, GraphRAG, and more. It also enables agent development using LangGraph, with built-in support for memory persistence. The latest addition, Text-to-MQL, can be used either as a standalone component in your application or as a tool integrated into LangGraph agents.
Figure 2. LangChain and MongoDB integration overview.
Released in version 0.6.0 of the langchain-mongodb package, the agent_toolkit class introduces a set of methods that enable reliable interaction with MongoDB databases, without the need to develop custom integrations.
The integration enables reliable database operations through the following pre-defined tools:
List the collections in the database
Retrieve the schema and sample rows for specific collections
Execute MongoDB queries to retrieve data
Check MongoDB queries for correctness before executing them
You can leverage the LangChain database toolkit as a standalone class in your application to interact with MongoDB from natural language and build custom text interfaces or more complex agentic systems. It is highly customizable, providing the flexibility and control needed to adapt it to your specific use cases. More specifically, you can tweak and expand the standard prompts and parameters offered by the integration. When building agents using LangGraph—LangChain's orchestration framework—this integration serves as a reliable way to give your agents access to MongoDB databases and execute queries against them.
Real-world considerations when implementing Text-to-MQL
Natural language querying of databases by AI applications and agentic systems is a rapidly evolving space, with best practices still taking shape. Here are a few key considerations to keep in mind as you build:
Ensuring accuracy
The generated MongoDB Query Language (MQL) relies heavily on the capabilities of the underlying language model and the quality of the schema or data samples provided. Ambiguities in schemas, incomplete metadata, or vague instructions can lead to incorrect or suboptimal queries. It's important to validate outputs, apply rigorous testing, and consider adding guardrails or human review, especially for complex or sensitive queries.
Preserving performance
Providing AI applications and agents with access to MongoDB databases can present performance challenges. The non-deterministic nature of LLMs makes workload patterns unpredictable. To mitigate the impact on production performance, consider routing agent queries to a replica set or using dedicated, optimized search nodes.
Maintaining security and privacy
Granting AI apps and agents access to your database should be considered with care. Apply common security principles and best practices: define and enforce roles and policies to implement least-privilege access, granting only the minimum permissions necessary for the task. Giving access to your data may involve sharing private and sensitive information with LLM providers. You should evaluate what kind of data should actually be sent (such as database names, collection names, or data samples) and whether that access can be toggled on or off to accommodate users.
Build reliable AI apps and agents with MongoDB
LLMs are redefining how we interact with databases. We're committed to providing developers the best paths forward for building reliable AI interfaces with MongoDB. We invite you to dive in, experiment, and explore the power of connecting AI applications and agents to your data. Try the LangChain MongoDB integration today! Ready to build? Dive into Text-to-MQL with this tutorial and get started building your own agents powered by LangGraph and MongoDB Atlas!
Unified Commerce for Retail Innovation with MongoDB Atlas
Unified commerce is often touted as a transformative concept, yet it addresses a long-standing challenge for retailers—disparate data sources and siloed systems. It's less of a revolutionary concept and more of a necessary shift to make long-standing problems more manageable. Doing so provides a complete business overview—and enables personalized customer experiences—by breaking down silos and ensuring consistent interactions across online, in-store, and mobile channels. Real-time data analysis enables targeted content and recommendations. Unified commerce boosts operating efficiency by connecting systems and automating processes, reducing manual work, errors, and costs while improving customer satisfaction. Positive customer experiences result in repeat customers, improving revenue and reducing the cost of customer acquisition.
MongoDB Atlas offers a robust foundation for unified commerce, addressing critical challenges within the retail sector and providing capabilities that enhance customer experience, optimize operations, and foster business growth.
Figure 1. Customer touchpoints in the retail ecosystem.
Retail businesses are shifting to a customer-centric and data-driven approach by unifying the customer journey for a seamless, personalized experience that builds loyalty and growth. While retail has long relied on omnichannel strategies with stores, websites, apps, and social media, these often involve separate systems, causing fragmented experiences and inefficiencies. Unified commerce—integrating physical and digital retail via a unified data platform—is a necessary evolution for retailers facing challenges with diverse platforms and data silos. Cloud-based data architectures, AI, and event-driven processing can overcome these hurdles, enabling enhanced customer engagement, optimized operations, and revenue growth. This integration delivers the frictionless customer experience that is crucial in today's digital marketplace.
Figure 2. Enabling a customer-centric approach with unified commerce.
MongoDB Atlas for unified commerce
MongoDB Atlas provides a strong foundation for unified commerce, addressing key challenges in the retail sector and offering capabilities that enhance customer experience, optimize operations, and drive business growth. MongoDB's flexible document model allows retailers to consolidate varied data, eliminating data silos. This provides consistent, real-time information across all channels for enhanced customer experiences and better decision-making. In MongoDB, diverse data can be stored without rigid schemas, enabling quick adaptation to changing needs and faster integration of siloed physical and digital systems.
Figure 3. Unified customer 360 using MongoDB.
Real-world adoption: Lidl, part of the Schwarz Group, implemented an automatic stock reordering application for branches and warehouses, addressing complex data and high volumes to improve supply chain efficiency through real-time data synchronization.
Real-time data synchronization for enhanced CX
In retail, real-time processing of customer interactions is crucial. MongoDB's change streams and event-driven architecture allow retailers to capture and react to customer behavior instantly. This enables personalized experiences like dynamic pricing, instant order updates, and tailored recommendations, fostering customer loyalty and driving conversions.
Figure 4. Real-time data in the operational data layer for enhanced customer experiences.
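As a simple sketch of this pattern—the collection and field names are illustrative, not taken from a specific customer implementation—a service can subscribe to inventory changes with a change stream and react as updates arrive. The snippet assumes a Node.js driver context where db is an already-connected Db instance:
// Watch inventory updates and react in real time
const changeStream = db.collection("inventory").watch([
  { $match: { operationType: "update" } }
]);
changeStream.on("change", (event) => {
  // e.g., push the new stock level to online and in-store channels
  console.log("Inventory changed:", event.documentKey._id, event.updateDescription.updatedFields);
});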
Atlas change streams and triggers enable real-time data synchronization across retail channels, ensuring consistent inventory information and preventing overselling on both physical and e-commerce platforms.
Real-world adoption: CarGurus uses MongoDB Atlas to manage vast amounts of real-time data across its platform and support seamless, personalized user experiences both online and in person. The flexible document model helps them handle the diverse data structures required for their automotive marketplace.
Scalability and high-traffic retail
MongoDB Atlas's cloud-native architecture provides automatic horizontal scaling, enabling retailers to manage demand fluctuations like seasonal spikes and product expansions without impacting performance, which is crucial for scaling unified commerce. MongoDB Atlas's auto-scaling and multi-cloud features allow retailers to handle traffic spikes during peak periods (holidays, flash sales) without downtime or performance issues. The platform automatically adjusts resources based on demand, ensuring responsiveness and availability, which is vital for positive customer experiences and maximizing sales.
Figure 5. Highly scalable MongoDB Atlas for high-traffic retail.
Real-world adoption: Commercetools modernized its composable commerce platform using MongoDB Atlas and MACH architecture and sustained high throughput for Black Friday. This demonstrates Atlas's ability to handle high-volume retail events through its scalability features.
AI and analytics integration
MongoDB Atlas enables retailers to gain actionable insights from unified commerce data by integrating with AI and analytics tools. This facilitates personalized shopping, predictive inventory, and targeted marketing across online and offline channels through data-driven decisions. Personalization is a key driver of customer engagement and conversion in the retail industry. MongoDB Atlas Search, with its full-text and vector search capabilities, enables retailers to deliver intelligent product recommendations, visual search experiences, and AI-powered assistants. By leveraging these advanced search and AI capabilities, retailers can help customers find the products they're looking for quickly and easily, provide personalized recommendations based on their interests and preferences, and create a more intuitive and enjoyable shopping experience.
Real-world adoption: L'Oréal improved customer experiences through personalized, inclusive, and responsible beauty across several apps. Retailers on MongoDB Atlas can leverage its unstructured data capabilities, vector search, and AI integrations to create real-time, AI-driven applications.
Seamless data integration
Atlas offers ETL/CDC connectors and APIs to consolidate diverse retail data into a unified operational layer. This single source of truth combines inventory, customer, transaction, and digital data from legacy systems, enabling consistent omnichannel experiences and eliminating the data silos that hinder unified commerce.
Figure 6. MongoDB Atlas for unified commerce.
Real-world adoption: MongoDB helps global retailers, like Adeo, unify cross-channel data into an operational layer for easy synchronization across online and physical platforms, enabling better customer experiences.
Advanced search capabilities
MongoDB Atlas provides built-in text and vector search capabilities, enabling retailers to create advanced search experiences for enhanced product discovery and personalization across online and physical channels.
Figure 7. Integrated search capabilities in MongoDB.
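For illustration, a basic Atlas Search query over a product catalog might look like the following. The index name, collection, and fields are assumptions made for the sake of the example:
// Simple full-text product search with Atlas Search (aggregation pipeline)
db.products.aggregate([
  {
    $search: {
      index: "default",                 // assumed search index name
      text: {
        query: "running shoes",
        path: ["name", "description"],
        fuzzy: { maxEdits: 1 }          // tolerate small typos
      }
    }
  },
  { $limit: 10 },
  { $project: { name: 1, price: 1, score: { $meta: "searchScore" } } }
]);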
Real-world adoption: MongoDB's data platform with integrated search enables retailers to improve customer experience and unify commerce. Customers like Albertsons use this for both customer-facing and back-office operations.
Composable architecture with data mesh principles
MongoDB supports a composable architecture that aligns with data mesh principles, enabling retailers to build decentralized, scalable, and self-service data infrastructure. Using a domain-driven design approach, different teams within the organization can manage their own data products (e.g., customers, orders, inventory) as independent services. This approach promotes agility, scalability, and data ownership, allowing teams to innovate and iterate quickly while maintaining data integrity and governance.
Figure 8. MongoDB Atlas enables domain-driven design for the retail enterprise data foundation.
Global distribution
For international retailers pursuing unified commerce, Atlas provides low-latency global data access, ensuring fast performance and data sovereignty compliance across multiple markets. MongoDB Atlas enables retailers to distribute data globally across AWS, Google Cloud, and Azure regions as needed, building distributed and multi-cloud architectures for low-latency customer access worldwide.
Figure 9. Serving always-on, globally distributed, write-everywhere apps with MongoDB Atlas global clusters.
Use cases: How unified commerce transforms retail
Unified commerce streamlines the retail experience by integrating diverse channels into a cohesive system. This approach facilitates customer interactions across online and physical stores, enabling features such as real-time inventory checks, personalized recommendations based on purchase history regardless of the transaction location, and frictionless return processes. The objective is to create a seamless and efficient shopping journey through interconnected and collaborative functionalities, using a modern data platform that enables the creation of such a data estate.
Always-stocked shelves and knowing what's where: Real-time inventory
Offer online ordering with delivery or pickup, providing stock estimates.
Store staff use real-time inventory to help customers and place orders, minimizing out-of-stocks.
Treating customers as individuals is a key aspect of retail. Retail enterprises need a unified view of customer data to offer personalized recommendations, offers, and content, and to offer dynamic pricing based on loyalty and market factors. Engaging customers on their preferred channels with consistent messaging and superior service builds lasting relationships. Seamless order orchestration is crucial, providing flexible fulfillment options (delivery, BOPIS, curbside, direct shipping) and keeping customers informed with real-time updates. Optimizing inventory across stores and warehouses ensures speedy, accurate fulfillment. Along with fulfillment, frictionless returns are vital, offering in-store returns for online purchases, efficient tracking, and immediate refunds. In the digital space, intelligent search and discovery are essential. Advanced search, image-based search, and AI chatbots simplify product discovery and support, boosting conversion rates and brand engagement. Leading retailers leverage MongoDB Atlas for these capabilities, powering AI recommendations, real-time inventory, and seamless omnichannel customer journeys to improve efficiency and satisfaction.
The future of unified commerce
To remain competitive, retailers should adopt flexible, cloud-based systems. MongoDB Atlas facilitates this transition, enabling unified commerce through real-time data, AI search, and scalable microservices for enhanced customer experiences and innovation. Visit our retail solutions page to learn more about how MongoDB Atlas can accelerate unified commerce.
Intellect Design Accelerates Modernization by 200% with MongoDB and Gen AI
It's difficult to overstate the importance of modernization in the age of AI. Because organizations everywhere rely on software to connect with customers and run their businesses, how well they manage the AI-driven shift in what software does—from handling predefined tasks and following rules to being a dynamic, problem-solving partner—will determine whether or not they succeed. Companies that want to stay ahead must evolve quickly. But this demands speed and flexibility, and most tech stacks weren't designed for the continuous adaptation that AI requires. This is where MongoDB comes in: we provide organizations with a structured, proven approach to modernizing critical applications, reducing risk, and eliminating technical debt.
Our approach to modernization has already led to successful, speedy, cost-effective migrations—and efficiency gains—for the likes of Bendigo Bank and Lombard Odier. So, I'm delighted to share the story of Intellect Design, one of the world's largest enterprise fintech companies, which recently completed a project modernizing critical components of its Wealth Management platform using MongoDB and gen AI tools. The company, which works with large enterprises around the world, offers a range of banking and insurance technology products.
Intellect's project with MongoDB led to improved performance and reduced development cycle times, and its platform is now better positioned to onboard clients, provide richer customer insights, and unlock more gen AI use cases across the firm. Alongside those immediate benefits, the modernization effort is the first step in Intellect Design's long-term vision to have its entire application suite seamlessly integrated into a single AI service the company has built on MongoDB: Purple Fabric. This would create a powerful system of engagement for Intellect's customers, but it is only possible once these key services have all been modernized.
"This partnership with MongoDB has transformed how we approach legacy systems, turning bottlenecks into opportunities for rapid innovation. With this project, we've not only modernized our Wealth Management platform, but have unlocked the ability to deliver cutting-edge AI-driven services to clients faster than ever before," said Deepak Dastrala, Chief Technology Officer at Intellect Design.
Legacy systems block scaling and innovation
Intellect Design's Wealth Management platform is used by some of the world's largest financial institutions to power key services—including portfolio management, systematic investment plans, customer onboarding, and know-your-customer processes—while also providing analytics to help relationship managers deliver personalized investment insights. However, as Intellect's business grew in size and complexity, the platform's reliance on relational databases and a monolithic architecture caused significant bottlenecks. Key business logic was locked in hundreds of SQL stored procedures, leading to batch processing delays of up to eight hours and limiting scalability as transaction volumes grew. The rigid architecture also hindered innovation and blocked integration with other systems, such as treasury and insurance platforms, reducing efficiency and preventing the delivery of unified financial services. In the past, modernizing such mission-critical legacy systems was seen as almost impossible—it was too expensive, too slow, and too risky.
Traditional approaches relied on multi-year consulting engagements with minimal innovation, often replacing old architecture with equally outdated alternatives. Without modern tools capable of handling emerging workloads like AI, efforts were resource-heavy and prone to stalling, leaving businesses unable to evolve beyond incremental changes. MongoDB's modernization methodology broke through these challenges with a structured approach, combining an agentic AI platform with modern database capabilities, all enabled by a team of experienced engineers.
MongoDB demonstrates AI-driven scalability with Purple Fabric
Before modernizing its Wealth Management platform, Intellect Design had already experienced the transformative power of a modern document database: the company began working with MongoDB in 2019, and its enterprise AI platform Purple Fabric is built on MongoDB Atlas. Purple Fabric processes vast amounts of structured and unstructured enterprise data to enable actionable compliance insights and risk predictions—both of which are critical for customers managing assets across geographies. An example of this is IntellectAI's work with one of the largest sovereign wealth funds in the world, which manages over $1.5 trillion across 9,000 companies. By taking advantage of MongoDB Atlas's flexibility, advanced vector search capabilities, and multimodal data processing, Purple Fabric delivers over 90% accuracy in ESG compliance analyses, scaling operations to analyze data from over 8,000 companies—something legacy systems simply couldn't achieve. This success demonstrated MongoDB's ability to handle complex AI workloads and was instrumental in Intellect Design's decision to adopt MongoDB for the modernization of its Wealth Management platform.
Overhauling mission-critical components
In February 2025, Intellect Design kicked off a project with MongoDB to modernize mission-critical functionality within its Wealth Management platform. Areas like customer onboarding, transactions, and batch processing all faced legacy bottlenecks—including slow batch processing times and resource-intensive analytics. With MongoDB's foundry approach to modernization—in which repeatable processes are used—along with AI-driven automation and expert engineering, Intellect Design successfully overhauled these key components within just three months, unlocking new efficiency and scalability across its operations. Unlike traditional professional services or large language model (LLM) code conversion, which focus solely on rewriting code, MongoDB's approach enables full-stack modernization, reengineering both application logic and data architecture to deliver faster, smarter, and more scalable systems. Through this approach, Intellect Design decoupled business logic from SQL stored procedures, enabling faster updates, reduced operational complexity, and seamless integration with advanced AI tools. Batch-heavy workflows were optimized using frameworks like LMAX Disruptor to handle high-volume transactional data loads, and MongoDB's robust architecture supported predictive analytics capabilities to pave the way for richer, faster customer experiences.
The modernization project delivered measurable improvements across performance, scalability, and adaptability:
With onboarding workflow times reduced by 85%, clients can now access critical portfolio insights faster than ever, speeding their decision-making and investment outcomes.
Transaction processing times improved significantly, preparing the platform to accommodate large-scale operations for new clients without delays.
Development transformation cycles were completed as much as 200% faster, demonstrating the efficiency of automating traditionally resource-intensive workflows.
This progress gives Intellect Design newfound freedom to connect its Wealth Management platform to broader systems, deliver cross-functional insights, and compete effectively in the AI era.
Speeding insights, improving analytics, and unlocking AI
While Intellect Design's initial project with MongoDB focused on modernizing critical components, the company is now looking to extend its efforts to other essential functionality within the Wealth Management platform. Key modules like reporting, analytics workflows, and ad hoc data insights generation are next in line for modernization, with the goal of improving runtime efficiency for real-world use cases like machine learning-powered customer suggestions and enterprise-grade reporting. Additionally, Intellect Design plans to apply MongoDB's approach to modernization across other business units, including its capital markets/custody and insurance platforms, creating unified systems that enable seamless data exchange and AI-driven insights across its portfolio.
By breaking free from legacy constraints, Intellect Design is unlocking faster insights, smarter analytics, and advanced AI capabilities for its customers. MongoDB's modernization approach, tools, and team are the engine powering this transformation, preparing businesses like Intellect Design to thrive in an AI-driven future. As industries continue to evolve, MongoDB is committed to helping enterprises build the adaptive technologies needed to lead—and define—the next era of innovation.
To learn more about how MongoDB helps customers modernize without friction—using AI to help them transform complex, outdated systems into scalable, modern systems up to ten times faster than traditional methods—visit MongoDB Application Modernization. Visit the Purple Fabric page for more on how Intellect Design's Purple Fabric delivers secure, decision-grade intelligence with measurable business impact. For more about modernization and transformation at MongoDB, follow Vinod Bagal on LinkedIn.
Boost Search Relevance with MongoDB Atlas’ Native Hybrid Search
We’re excited to introduce a native hybrid search experience that seamlessly combines the power of MongoDB Atlas’ native text search and vector search capabilities. Now in public preview, this capability leverages reciprocal rank fusion (RRF) to rank result sets from both text and vector searches, significantly improving relevance and user experiences. It simplifies application development by eliminating the need for separate search engines and vector databases, allowing developers to implement hybrid search out-of-the-box and quickly refine search accuracy for their use cases.
“The new $rankFusion aggregation stage has helped us implement hybrid search much more easily and efficiently, as we can combine the power of keyword search and semantic search under one umbrella in MongoDB Atlas. This has improved the context retrieval accuracy for our Eddy AI chatbot by 30%.”
Dr. Selvaraaju Murugesan, Head of Data Science, Kovai.co
Read the documentation to learn more.
Hybrid search: Precision meets semantic relevance
Search is the cornerstone of information discovery and application development. Whether you're shopping online, researching a medical condition, or analyzing business data, search helps us find the information we need. As technology evolves, so do user expectations for how they interact with search and how well it performs. Modern applications—from catalog search and product recommendations to retrieval-augmented generation (RAG) and emerging AI agents—require high-quality retrieval to handle unstructured data effectively. Relying solely on text or vector search as a single retrieval technique may not deliver the highest relevance and accuracy. While excellent for precise keyword retrieval, text search can struggle to understand nuanced language or identify semantically similar concepts. On the other hand, vector search is ideal for open-ended queries and high-dimensional data (images, audio, video, etc.), but it may miss crucial information that requires exact matches.
Consider the search query “quick and easy Italian dinner recipes.” While traditional text search efficiently identifies recipes containing these keywords in their titles, descriptions, or tags, it often lacks a deeper understanding of concepts like "quick and easy" or the nuances of Italian cuisine beyond literal terms. Adding vector search significantly enhances results by capturing semantically related ideas such as "simple," "minimal prep," or "10-minute meal," alongside recipes featuring classic Italian flavors like tomatoes, olive oil, garlic, oregano, and mozzarella, all suitable for a savory evening meal. This hybrid approach ultimately provides users with a more comprehensive and relevant search experience, balancing accuracy with broader discovery.
Combine the best of both worlds for engaging user experiences
Organizations are increasingly turning to hybrid search, using the combined strength of text and vector search to maximize the value of data and provide enhanced user experiences.
Improved accuracy for search, RAG, and AI agents: By leveraging both approaches, you deliver search results that are semantically relevant and precisely accurate. Hybrid search provides the foundation for more insightful and accurate responses from your generative AI models, reducing hallucinations.
Streamlined architecture: Organizations running hybrid search via separate search engines and vector databases typically encounter high complexity and costs.
MongoDB simplifies hybrid search implementations by providing this integrated capability in the operational database, eliminating the need to manage multiple systems, reducing costs, and increasing efficiency.
Enhanced user experiences: Users benefit from more relevant and comprehensive search results, increasing user satisfaction.
How the Financial Times spearheaded AI-powered content discovery with hybrid search
The Financial Times, the leading business news organization, has successfully implemented hybrid search, combining MongoDB’s native full-text and vector search capabilities to power content discovery on its website (aiming to deploy the capability across all its digital properties). This enables relevant results for natural language queries and delivers robust content recommendations, providing readers with an enhanced web experience. The highly relevant article suggestions boost engagement and retention by saving valuable time for busy executives—the Financial Times' core subscriber demographic. Hear more about the story in this video.
“Nowadays with AI becoming more popular for solving different problems, the ability to do vector search alongside the standard keyword search is crucial. MongoDB Atlas is not just offering a database or search. What drew our attention to it was that it's kind of a full package… It gives a lot of things that you can get from a platform out-of-the-box.”
Dimitar Terziev, Technical Director for Core Platforms, Financial Times
Get started with hybrid search in MongoDB Atlas
You can now streamline hybrid search implementations with a single $rankFusion aggregation stage, effortlessly combining full-text and vector search result sets into a unified ranked list that quickly surfaces the most relevant information. Leveraging native and powerful query capabilities within Atlas, you can flexibly combine $rankFusion with various MongoDB aggregation stages—$vectorSearch, $search, $geoNear, $sort, and $match. This lets you adjust weighting criteria, produce custom relevance scores, and sort results to address diverse requirements and use cases. See the demo of hybrid search in action. View the documentation to get started.
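To make the new stage concrete, here is a minimal mongosh sketch of a $rankFusion query for the recipe scenario described above. It is illustrative only: the recipes collection, the default and vector_index index names, the embedding field, and the query vector are assumptions rather than values from this post, so check the documentation for the exact options supported on your cluster.

db.recipes.aggregate([
  {
    $rankFusion: {
      input: {
        pipelines: {
          // Keyword matches on the recipe title and description.
          fullTextPipeline: [
            { $search: { index: "default", text: { query: "quick easy Italian dinner", path: ["title", "description"] } } },
            { $limit: 20 }
          ],
          // Semantic matches against a precomputed embedding of the same query.
          vectorPipeline: [
            { $vectorSearch: { index: "vector_index", path: "embedding", queryVector: [ /* query embedding */ ], numCandidates: 100, limit: 20 } }
          ]
        }
      },
      // Reciprocal rank fusion blends the two ranked lists; weights are optional.
      combination: { weights: { fullTextPipeline: 1, vectorPipeline: 1 } }
    }
  },
  { $limit: 10 },
  { $project: { title: 1, description: 1 } }
])

Equal weights treat both result sets the same; raising the vectorPipeline weight, for example, favors semantic matches when exact keywords matter less.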
MongoDB Atlas CLI: Full API Coverage and Faster Updates
We’re thrilled to announce that starting today, you can access every feature in the MongoDB Atlas Administration API from the MongoDB Atlas CLI. This significant enhancement means that you’ll also get new features quicker than ever, within just days of their launch. No more hoping for feature support or switching between interfaces. If it’s in the MongoDB Atlas Admin API, it’s in your Atlas CLI.
Full parity with the API
Until now, you had to wait for the Atlas CLI team to manually implement support for new MongoDB Atlas Administration API endpoints. Those days are over. Every MongoDB Atlas capability, whether it launched today or has been around for years, will now automatically become available from your command line. With the new atlas api subcommands, you get:
Full feature parity with the MongoDB Atlas Administration API.
Quicker access to future MongoDB Atlas Administration API features.
A unified, predictable command structure that makes automation easy.
Simplified handling of long-running operations with the --watch flag, eliminating the need for complex polling logic.
The ability to pin a desired API version, ensuring your scripts remain reliable even if you update the CLI.
The reason we built this
We prioritized this feature to ensure our users could use all capabilities exposed by the MongoDB Atlas Administration API through the Atlas CLI without delays. This isn’t about adding new capabilities to MongoDB Atlas but about making existing API functionality accessible through the Atlas CLI, eliminating the previous gap in functionality. Our goal was simple: if a feature exists in the MongoDB Atlas Administration API, you should be able to access it through the CLI immediately, not weeks or months later. Now you can.
The CLI simplifies API interactions
Now that the MongoDB Atlas CLI provides complete coverage of the MongoDB Atlas Administration API, users no longer need workarounds to make use of these endpoints. For example:
Authentication: The CLI makes API interactions simpler by automatically handling authentication. This means you don’t have to manage tokens or credentials on your own when making requests to endpoints that need authentication.
Monitoring long-running operations: The CLI offers a powerful --watch flag for all API subcommands. This flag automatically monitors long-running operations until they are completed, eliminating the need to implement polling loops manually. Without it, you would have to repeatedly check the operation status by calling the API directly.
Let’s take a look at an example of how the --watch flag simplifies waiting for long-running operations.
atlas api clusters createCluster --file clusterspec.json --watch
This command creates a cluster and waits until it’s fully provisioned before returning, eliminating the need for complex polling logic in your scripts.
Practical applications
The atlas api subcommands enable powerful workflows that were previously unavailable in the CLI:
Cluster outage simulation: Simulate regional outage scenarios directly through the CLI. You can now script tests that simulate an entire cloud provider’s region being down, helping ensure your applications remain resilient during actual outages.
Invoice investigation: Generate custom reports and retrieve billing information programmatically. Need to pull invoice data for your finance team? That’s now a simple CLI command away.
Access tracking: Monitor and manage user access patterns across your MongoDB Atlas resources, enhancing your security posture without leaving the command line.
These are just a few of the features now available through the new atlas api subcommands. Visit our documentation to explore the full range of available commands.
Robust and fully documented API subcommands
All atlas api subcommands are auto-generated from our OpenAPI specification, ensuring they stay up to date with the latest Atlas Administration API features. Additionally, these subcommands are versioned, which ensures that your scripts won’t break when the API updates—a critical feature for reliable automation. For detailed information on syntax and usage, please refer to our comprehensive documentation.
Status: In public preview and ready for your feedback
The introduction of atlas api subcommands represents a significant advancement in making MongoDB Atlas more accessible and automatable. By bringing the full power of the MongoDB Atlas Administration API to the command line, we’re enabling anyone who automates their MongoDB Atlas cloud to work more efficiently. Whether you’re managing infrastructure, implementing testing protocols, or generating reports, these new capabilities can transform your MongoDB Atlas experience—all without leaving the command line. As this feature is currently in public preview, we’re actively seeking your input. Here is how to get started:
Get the latest CLI: Update your Atlas CLI today to access these new subcommands.
Try an example: Start with this list of example Atlas CLI commands.
Provide feedback: Share your thoughts on how we can improve through our feedback forum.
Your feedback helps us understand how you’re using these capabilities and what improvements would make them even more valuable to your workflows. Learn more about the MongoDB Atlas CLI through our documentation.
MongoDB and deepset Pave the Way for Effortless AI App Creation
Building robust AI-powered applications has often been a complex, resource-intensive process. It typically demands deep technical and domain expertise, significant development effort, and a long time to value. For IT decision-makers, the goal is clear: enable AI innovation to achieve real business outcomes without compromising scalability, flexibility, or performance, and without creating bottlenecks for development teams serving business teams and customers. Solutions from deepset and MongoDB empower organizations to overcome these challenges, enabling faster development, unlocking AI's potential, and ensuring the scalability and resilience required by modern businesses.
Breaking barriers in AI development: The real-time data challenge
For many industries, real-time data access is critical to unlocking insights and delivering exceptional customer experiences. AI-driven applications rely on seamless retrieval and processing of structured and unstructured data to fuel smarter decision-making, automate workflows, and improve user interactions. For example, in customer service platforms, instant access to relevant data ensures fast and accurate responses to user queries, improving satisfaction and efficiency. Healthcare applications require immediate access to patient records to enable personalized treatment plans that enhance patient outcomes. Similarly, financial systems rely on real-time analysis of market trends and borrower profiles to make smarter investment and credit decisions and stay competitive in dynamic environments.
However, businesses often face challenges when scaling AI applications. These challenges include inconsistent data retrieval, where organizations struggle to efficiently query and access data across vast pools of information. Another challenge is complex query resolution, which involves interpreting multi-layered queries to retrieve the most relevant insights and provide smart recommendations. Data security concerns also pose obstacles, as businesses must ensure sensitive information remains protected while maintaining compliance with regulatory standards. Lastly, AI production-readiness is critical, requiring organizations to ensure their AI applications are properly configured and thoroughly tested to support mission-critical decisions and workflows with accuracy, speed, and adaptability to rapid changes in the AI ecosystem or world events. Addressing these challenges is vital for businesses looking to unlock the full potential of AI-powered innovations and maintain a competitive edge.
Transformative solution: deepset RAG expertise meets MongoDB Atlas Vector Search
We’re excited to announce a new partnership between deepset and MongoDB. By integrating deepset’s expertise in retrieval-augmented generation (RAG) and intelligent agents with MongoDB Atlas, developers can now more easily build advanced AI-powered applications that deliver fast, accurate insights from large and complex datasets.
“We're thrilled to partner with MongoDB and build out an integrated end-to-end GenAI solution to speed up the time to value of customers' AI efforts and help solve their complex use cases to deliver key business outcomes.”
Mark Ghannam, Head of Partnerships, deepset
What sets deepset apart is its production-ready product and documentation, its flexibility for solving complex use cases, and its library of ready-to-use templates, which let businesses quickly deploy common RAG and agent functionalities, reducing the time and effort required for development.
For teams needing customization, Haystack provides a modular, object-oriented design that supports drag-and-drop components, utilizing both standard integrations and custom components. This makes it highly accessible, enabling developers to configure workflows according to their specific application needs without requiring extensive coding knowledge. On top of Haystack, deepset’s AI Platform makes the prototype-to-production process of building AI applications even faster and more efficient. It extends Haystack’s building-block approach to AI application development with a visual design interface, qualitative user testing, side-by-side configuration/large language model (LLM) testing, integrated debugging, and hallucination scoring, in addition to expert service assistance and support. The platform’s Studio Edition is free for developers to try.
Through seamless integration with MongoDB Atlas Vector Search, deepset equips developers to easily incorporate advanced RAG and agent capabilities into their compound AI applications, a process known as LLM orchestration. Key features enable several transformative possibilities across industries. Intelligent chatbots allow businesses to deliver precise and context-aware customer interactions, significantly enhancing call center efficiency. Automated content tagging optimizes and streamlines workflows in content management systems, enabling faster categorization and discovery of information. Tailored educational, research, and media platforms personalize learning materials, research, and media content based on user questions and preferences, improving engagement and effectiveness while adhering to institution and brand guidelines. Industry-specific planning systems and workflow automations simplify complex processes, such as lending due diligence. By leveraging the deepset framework alongside MongoDB Atlas Vector Search, developers gain a powerful toolkit to optimize the performance, scalability, and user experience of their applications. This collaboration provides tangible benefits across industries like customer service, content management, financial services, education, defense, healthcare, media, and law—all while keeping complexity to a minimum.
Data security and compliance: A foundational priority
As organizations adopt advanced AI technologies, protecting sensitive data is paramount. MongoDB Atlas and deepset offer robust protections to safeguard data integrity. MongoDB and deepset provide industry-standard security measures such as encryption, access controls, and auditing, along with compliance certifications like ISO 27001, SOC 2, and CSA STAR. These measures help ensure that sensitive data is handled with care and that client information remains secure, supporting businesses in meeting their regulatory obligations across different sectors. Incorporating MongoDB Atlas into AI solutions allows enterprises using deepset's RAG and agent capabilities to confidently manage and protect data, ensuring compliance and reliability while maintaining operational excellence.
Shaping the future of AI-powered innovation
The partnership between MongoDB and deepset is more than a collaboration—it's a driving force for innovation. By merging cutting-edge language processing capabilities with the robust, scalable infrastructure of MongoDB Atlas, this alliance is empowering organizations to create tomorrow's AI applications, today.
Whether it’s intelligent chatbots, personalized platforms, or complex workflow automations, MongoDB and deepset are paving the way for businesses to unlock new levels of efficiency and insight. At the core of this partnership is deepset’s advanced RAG and agent technology, which enables efficient language processing and precise query resolution—essential components for developing sophisticated AI solutions. Complementing this is MongoDB’s reliable cloud database technology, providing unmatched scalability, fault tolerance, and the ability to effortlessly craft robust applications. The seamless integration of these technologies offers developers a powerful toolkit to create applications that prioritize fast time to value, innovation, and precision. MongoDB’s infrastructure ensures security, reliability, and efficiency, freeing developers to focus their efforts on enhancing application functionality without worrying about foundational stability. Through this strategic alliance, MongoDB and deepset are empowering developers to push the boundaries of intelligent application development. Together, they are delivering solutions that are not only highly responsive and innovative but also expertly balanced across security, reliability, and efficiency—meeting the demands of today’s dynamic markets with confidence.
Jumpstart your journey
Dive into deepset's comprehensive guide on RAG integration with MongoDB Atlas. Then try deepset Studio Edition (free) to start building. Transform your data experience and redefine the way you interact with information today! Learn more about MongoDB and deepset's partnership through our partner ecosystem page.
Spring Data MongoDB: Now with Vector Search and Queryable Encryption
MongoDB is pleased to announce new enhancements to the Spring Data MongoDB library with the release of version 4.5.0, increasing capabilities related to vector search, vector search index creation, and queryable encryption. Spring Data MongoDB makes it easier for developers to integrate MongoDB into their Java applications, taking advantage of a powerful combination of MongoDB features and familiar Spring conventions.
Vector search
Vector embeddings convert disparate types of data into numbers that capture meaning and relationships. Many types of data—words, sentences, images, even videos—can be represented by a vector embedding for use in AI applications. In MongoDB, you can easily store and index vector embeddings alongside your other document data—no need to manage a separate vector database or maintain an ETL pipeline. In MongoDB, an aggregation pipeline consists of one or more stages that process documents, performing operations such as $count and $group. $vectorSearch is an aggregation pipeline stage for handling vector retrieval. It was released in MongoDB 6.0, and improved upon in MongoDB 7.0 and 8.0. Using the $vectorSearch stage to pre-filter your data and perform a semantic search against indexed fields, you can easily process vector embeddings in your aggregation pipeline (see the mongosh sketch at the end of this post).
Vector search indexes
Like other retrieval techniques, indexes are a key part of implementing vector search, allowing you to narrow the scope of your semantic search and exclude irrelevant vector embeddings. This is useful in an environment where it isn’t necessary to consider every vector embedding for comparison. Let’s see how easy it is to create a vector search index with Spring Data MongoDB 4.5.0!
VectorIndex index = new VectorIndex("vector_index")
    .addVector("plotEmbedding", vector -> vector.dimensions(1536).similarity(COSINE)) // 1,536-dimension embedding field, cosine similarity
    .addFilter("year"); // allow pre-filtering on "year"

mongoTemplate.searchIndexOps(Movie.class)
    .createIndex(index);
As you can see, the VectorIndex class offers intuitive methods such as addVector and addFilter that allow you to define exactly, with native Spring Data APIs, the vector index you want to create. To actually execute a search operation that leverages the index, just issue an aggregation:
VectorSearchOperation search = VectorSearchOperation.search("vector_index")
    .searchType(VectorSearchOperation.SearchType.ENN)
    .path("plotEmbedding")
    .vector( ... )
    .limit(10)
    .numCandidates(150)
    .withSearchScore("score");

AggregationResults<MovieWithSearchScore> results = mongoTemplate
    .aggregate(newAggregation(Movie.class, search), MovieWithSearchScore.class);
Leverage the power of MongoDB to run sophisticated vector search, directly from Spring.
Queryable Encryption
Support for vector search isn’t the only enhancement found in 4.5.0. Now, you can pass encryptedFields right into your CollectionOptions class, giving Spring the context to understand which fields are encrypted. This context allows Spring to leverage the power of MongoDB Queryable Encryption (QE) to keep sensitive data protected in transit, at rest, or in use. QE allows you to encrypt sensitive application data, store it securely in an encrypted state in the MongoDB database, and perform equality and range queries directly on the encrypted data.
Let’s look at how easy it is to create an encrypted collection with Spring Data MongoDB:
CollectionOptions collectionOptions = CollectionOptions.encryptedCollection(options -> options
    .queryable(encrypted(string("ssn")).algorithm("Indexed"), equality().contention(0))
    .queryable(encrypted(int32("age")).algorithm("Range"), range().contention(8).min(0).max(150))
    .queryable(encrypted(int64("address.sign")).algorithm("Range"), range().contention(2).min(-10L).max(10L))
);

mongoTemplate.createCollection(Patient.class, collectionOptions);
By declaring upfront the options allowed for different fields of the new collection, Spring and MongoDB work together to keep your data safe! We’re excited for you to start incorporating these new features into applications built with Spring Data MongoDB. Here are some resources to help you get started:
Explore the Spring Data MongoDB documentation
Check out the GitHub repository
Read the release notes for Spring Data MongoDB 4.5.0
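As a companion to the Spring examples above, here is a minimal mongosh sketch of the kind of $vectorSearch aggregation described earlier, including a pre-filter. It follows the movie example above, but the query vector, the filter value, and the projected fields are illustrative assumptions rather than values from this post.

db.movies.aggregate([
  {
    $vectorSearch: {
      index: "vector_index",               // the search index created above
      path: "plotEmbedding",               // the indexed embedding field
      queryVector: [ /* embedding of the query text */ ],
      filter: { year: { $gte: 2015 } },    // pre-filter on the indexed "year" field
      numCandidates: 150,
      limit: 10
    }
  },
  { $project: { title: 1, score: { $meta: "vectorSearchScore" } } }
])

The filter runs against fields declared with addFilter when the index was created, narrowing the candidate set before the semantic comparison takes place.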
Teach & Learn: Professor Margaret Menzin, Simmons University
MongoDB’s Teach & Learn blog series interviews students and educators worldwide who are using MongoDB to enhance their classrooms. These stories highlight how MongoDB’s platform and resources are revolutionizing education and preparing tech professionals. The MongoDB for Educators program offers free resources and technology for creating interactive learning environments that connect theory and practice. Educators gain access to MongoDB Atlas credits, curriculum, certifications, and a global community. Unlocking potential: Integrating MongoDB to enhance learning in the classroom Professor Margaret Menzin is a dedicated educator at Simmons University, where she was instrumental in developing one of the first undergraduate data science majors in the United States. With a keen eye on industry trends, she revamped her database course to include NoSQL technologies like MongoDB, recognizing their growing importance in the professional world. Her approach blends practical skills with theoretical understanding, ensuring her students are well-prepared for real-world challenges. Professor Menzin also fosters a vibrant student community around MongoDB technology, empowering students to use these skills in their academic projects and future careers. Her MongoDB insights on curriculum and student engagement offer valuable perspectives for educators adapting to the evolving tech landscape, as you’ll see in our interview. 1. Tell us about your educational and professional journey and what initially sparked your interest in databases and MongoDB. At Simmons, we were one of the first US universities to offer an undergraduate major in data science, so we were very aware of the importance of NoSQL for handling big data. In 2017, I returned to teaching databases after a hiatus of about seven years—and when I looked at the textbooks, they hadn’t changed. But the world sure had. So, I checked the Stack Overflow survey of what professional developers were using and found that 25% of them were using MongoDB. With my colleague’s permission, I revised our course to be about one-third on NoSQL, and I had to develop my own materials. But my students adore using MongoDB. 2. What courses related to databases and MongoDB are you currently teaching? I teach a one-semester database course that’s required for all students majoring in computer science, data science, and information technology/cybersecurity. I also teach a course in full-stack web development, and students learn how to access MongoDB from Node.js. 3. What motivated you to incorporate MongoDB into your curriculum? I was motivated by what is happening in the real world, but as an instructor, I find that having students learn something else in addition to relational databases makes the discussions much livelier about atomicity, consistency, isolation, and durability (ACID) transactions and concurrency in relational database management systems (RDBMSs). Now, students see where ACID transactions are important and where they’re not. (Yes, I know that MongoDB supports ACID transactions.) Similarly, the design process is different for MongoDB and for entity-relationship design, and that highlights the strengths of each. Figure 1. Professor Margaret Menzin's students at Simmons University. 4. You have successfully built an active student community around MongoDB on your campus. Can you share some insights into how you achieved this and the impact it’s had on students? First, I tell students to put MongoDB on their curricula vitae because it gives them an edge. 
Second, students are so enthusiastic about MongoDB that they turn to it when they have to build projects for senior courses. I do require that students install the MongoDB Community Edition on their own computers, and—without any data to back this claim up—I think that makes it more likely that they will turn to it. And they do. This year, a group of four seniors built a complete software system for a nonprofit on our campus, and they chose to use MongoDB. (I was not the supervisor; they chose MongoDB because they liked it and thought it was the best choice.) 5. How do you design your course content to integrate MongoDB in a way that engages students and ensures practical learning experiences? In my course, I give students a set of comma-separated values (CSVs) for the Northwinds example (a pretty standard project with files for customers, products, orders, line items, etc.), and they denormalize the data. That is, they embed the line-item documents into the orders documents and do some computations, then embed the orders documents into the customers’ documents. They timed various operations with and without indexes. One thing I have learned is to put the exam on MongoDB before the project, so everyone on the team is ready to contribute to the project. I have a file of the approximately 5,000 restaurants in New York City that I use for the exam. 6. How has MongoDB supported you in enhancing your teaching methodologies and upskilling your students? First, my students make extensive use of the MongoDB documentation. Reading documentation is an important skill for students to learn, and MongoDB’s is excellent. Second, I have gone through all the MongoDB videos for teachers, and I especially use the ones on the design process. For the aggregation pipeline, we use the book Practical MongoDB Aggregations , linked to on your site, and the Mosh Hamedani videos on YouTube. And because I was one of the very early adopters among professors, I’ve had to develop a lot of my own materials, which I’ve shared. Figure 2. Professor Margaret Menzin's students at Simmons University. 7. Could you share a memorable experience or success story of a project from your time teaching MongoDB that stands out to you? After the first year I taught MongoDB, I asked my colleagues for feedback, and they suggested that I see what other people were doing on the Association for Computing Machinery (ACM) Special Interest Group on Computer Science Education (SIGCSE) LISTSERV. The result was a panel called “NoSQL is No Problem” for SIGCSE 2020. And there was a curated bibliography for various NoSQL platforms. 8. How has your role as a MongoDB educator impacted your professional growth and the growth of the student community at your university? As a faculty member, I am always trying to see what’s going to be important next and find out how to learn it. Students respond to that attitude. I also lean very heavily on small-group work and team projects in all my courses. Most of my database students are sophomores, and they don’t know each other well yet. So in any small-group work, I say, “Even if it’s your roommate, begin with ‘Hello, my name is…’” and they laugh, but it works. It happens that the database course (occurring fall of the sophomore year) is when we try to build a sense of cohesion among our majors. I also require my students to take out an ACM student membership so I can assign a variety of readings and videos, and that helps them build professional identities. 
And my students love the fact that this is cutting-edge and that they are moving away from textbooks. I’m sure that listing MongoDB among their skills on LinkedIn and elsewhere also helps them find internships. 9. What advice would you give to educators who are considering integrating MongoDB into their courses to ensure a successful and impactful learning experience for students? Allow about 30% of a first database course for the MongoDB work. It takes me about one and a half to two weeks to get students to install and learn basic MongoDB, and then another week and a half for the project. After that, use MongoDB as a jumping-off point to circle back to topics like forms of consistency, the CAP Theorem, design trade-offs, design decisions for distributed databases, and the choice of a database model. Comparing and contrasting MongoDB with an RDBMS is a very powerful way to summarize many of the key concepts in a database course. Finally, spending the last week on these high-level issues, when all of the students’ other courses are rushing to finish their projects, will make students very grateful. Apply to the MongoDB for Educators program and explore free resources for educators crafted by MongoDB experts to prepare learners with in-demand database skills and knowledge.