High-performance Rust-based vector search engine. discord.com/invite/qdrant

Most vector databases treat retrieval as a single operation. That's the wrong abstraction. Storing embeddings and returning nearest neighbors is a solved problem. The hard problem is what happens next. We solve it through composable vector search, built in Rust. Today, led by AVP, with Bosch Ventures, Unusual Ventures, Spark Capital, and 42CAP, we're announcing our $50M Series B to accelerate it. Learn more about Qdrant’s composable vector search and our latest funding round here: qdrant.tech/blog/series-b-an…
We are proud to share that this new X AI feature just announced by @elonmusk is powered by the Qdrant Vector Database. Congrats to the team! In Rust we Trust! 🦀✊😎
AI-based “See similar” posts feature is rolling out now
How does Grok by @xai access real-time knowledge of the world via the 𝕏 platform? Right, using a Vector Database, powered by Qdrant. In Rust, we trust! 🦀✊ Stay tuned.
Example of Grok vs typical GPT, where Grok has current information, but the other doesn’t
Community note
Grok received two segmented prompts. The typical GPT received a single prompt. Smaller chained prompts may result in a large language model providing more consistent answers. promptengineering.org/getting-starte
For 40 years, BM25 has been the standard for search engines. However, it falls short for modern RAG applications. Say hello to BM42: The combination of semantic and keyword search
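For readers who have not seen the baseline, here is a minimal sketch of classic BM25 scoring in plain Python. It is an illustration only, not BM42 and not Qdrant's implementation; the whitespace tokenizer and the parameter defaults `k1=1.2` and `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with the BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "rust vector search engine",
    "python web framework",
    "vector database in rust",
]
scores = bm25_scores("rust vector", docs)
```

The score depends only on term statistics, which is why keyword search needs no embedding inference.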
📘 𝐉𝐮𝐬𝐭 𝐎𝐮𝐭: 𝐓𝐡𝐞 𝐋𝐋𝐌 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫’𝐬 𝐇𝐚𝐧𝐝𝐛𝐨𝐨𝐤! For those building scalable LLM and RAG systems, 𝐓𝐡𝐞 𝐋𝐋𝐌 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫’𝐬 𝐇𝐚𝐧𝐝𝐛𝐨𝐨𝐤 has just launched, providing a complete, end-to-end guide to deploying real-world applications with industry best practices. Written by @iusztinpaul and @maximelabonne. A quick look inside: ▪ 𝐃𝐚𝐭𝐚 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐚𝐧𝐝 𝐑𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥: Steps for building data pipelines and making the most of Qdrant for efficient RAG retrieval. ▪ 𝐃𝐞𝐩𝐥𝐨𝐲𝐦𝐞𝐧𝐭 𝐚𝐭 𝐒𝐜𝐚𝐥𝐞: Straightforward advice for deploying on Amazon Web Services (AWS) with tools like @zenml_io and @Cometml’s Opik, from orchestration to monitoring. ▪ 𝐀𝐝𝐯𝐚𝐧𝐜𝐞𝐝 𝐓𝐞𝐜𝐡𝐧𝐢𝐪𝐮𝐞𝐬: Techniques for fine-tuning, preference alignment, and optimizing inference to keep things running smoothly in production. Early buyers get a limited-time discount on print and Kindle, plus a free PDF version. 🔗 Take a look on Amazon: buff.ly/4fkMXnv 🔗 Explore the Codebase: buff.ly/48veQ9E
PDFs can be tricky to pull clean text or images from. PyMuPDF4LLM is a new library designed to simplify that process. In his blog post, Benito Martin demonstrates how it works by: 📄 Extracting text in Markdown format 🖼️ Handling image extraction and embedding 🏷️ Chunking content with metadata for better context He also combines PyMuPDF4LLM with @qdrant_engine and @llama_index to build a more advanced system that handles both text and images in a vector database. Building RAG for PDFs? This is for you: medium.com/@benitomartin/bui…
⚕️ Multi-Agent Medical AI Assistant A full-stack system that combines agents, computer vision, hybrid RAG, and voice I/O into a unified pipeline for diagnosis, research, and patient interaction. The architecture includes: • LangGraph by @langchain for agent orchestration • PyTorch vision models • Hybrid RAG on Qdrant + @huggingface re-ranking • @UnstructuredIO for parsing medical PDFs • @tavilyai for real-time search • @elevenlabs voice interface • @FastAPI + Docker backend • Human-in-the-loop validation Built by Souvik Majumder 🚀 Code: github.com/souvikmajumder26/…
We solved *Semantic Search-as-you-Type* ⌨️ It is faster than embedding inference ⚡ It is Open-Source 👐 It is in Rust 🦀 We use it for our docs 📜 Read how 👇
🚀 Improving RAG Accuracy with Hierarchical Reranking Merging internal and external retrieval in a single pass often lets noise from one source weaken the other, lowering relevance and increasing hallucinations. In his latest blog, @pavan_mantha1 explains a two-stage reranking process that fixes this. Queries are embedded with Google Gemini and retrieved from Qdrant via @llama_index in hybrid mode (dense + BM25), while a web 🌐 search agent runs in parallel for fresh context. 1️⃣ Stage one ranks internal results by query relevance. 2️⃣ Stage two reorders that refined set using the external context as a second signal before synthesis by Gemini LLM. Results show a significant drop in hallucinations and high correctness scores on queries requiring both domain-specific and real-time context. 🔗 Full breakdown with code and evaluation results: medium.com/@manthapavankumar…
It looks like @elonmusk understands pretty well how vector similarity-powered AI recommendations work. A guest candidate for our Vector Space Talks podcast series.😀
🎥 GraphRAG from video with Qdrant and @neo4j Build a multimodal retrieval pipeline that turns raw video into a knowledge graph. This project combines object detection, vision-to-text grounding, temporal tracking, and storage in both graphs and vectors to enable retrieval that understands meaning, relationships, and time. 🔗 Check it out: athrael.net/blog/experiments…
🧠📚 DecipherIt – AI-Powered Research Assistant An open-source NotebookLM alternative built with multi-agent orchestration, semantic search, and real-time web access. Upload documents, paste URLs, or type a topic. DecipherIt turns it into a full research workspace with summaries, mindmaps, audio overviews, FAQs, and semantic Q&A. Under the hood: • @crewAIInc agents handle research, synthesis, FAQ gen, and more • Bright Data MCP bypasses geo-blocks and bot detection • Qdrant + OpenAI power vector-based semantic search • @lemonfoxai generates podcast-style voice summaries • Full-stack app with @FastAPI + Next.js + React 19 Built by @mtwn105 🚀 🔗 Demo: decipherit.xyz 🖥️ Code: github.com/mtwn105/decipher-…
📚🤖 Agentic Retrieval System for Local Libraries E-Library-Agent by @itsclelia is a self-hosted AI agent that ingests, indexes, and queries your personal book or paper collections. It’s built on top of ingest-anything and powered by @llama_index, Qdrant, and @Linkup_platform. The agent handles local ingestion, context-aware Q&A, and web-based discovery from a single interface. 🚀 See how the ingestion pipeline works in the diagram below. 🔗 Explore the project: github.com/AstraBert/e-libra…
🧠 Chunking changes everything in RAG. This benchmark post evaluated Fixed, Semantic, Agentic, and Recursive chunking in Agentic RAG. Built with @AgnoAgi, @qdrant_engine, @ragas_io, and @llama_index. And measured with relevant metrics: Context Recall, Faithfulness, Factual Accuracy, and more. Writeup by @pavan_mantha1: skillenai.com/visualizing-ch…
🚀 Learn how to build lightweight, real-time @AgnoAgi agents for medical and legal tasks without hogging resources in the latest tutorial by @pavan_mantha1. How? ➡️ Modular agents that are easy to update without a full rebuild ➡️ Tracking performance and interactions with @langfuse ➡️ Efficient deployment techniques to keep resource usage low 👉 Read the full blog: medium.com/@manthapavankumar…
📊 Building Hybrid Recommender Systems Qdrant isn’t just for RAG. Learn how to combine dense embeddings (from movie plots) with sparse user ratings to build a hybrid recommender system that supports both content-based and collaborative filtering. Recommendations are generated with pure vector search: nearest neighbors, unseen item filtering, and re-ranking with reciprocal rank fusion. Fully vector-native, with no external models or pipelines. Writeup by Nicola Procopio: medium.com/@nickprock/not-on…
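The fusion step named in the post can be sketched in a few lines. This is a plain-Python illustration of reciprocal rank fusion with unseen-item filtering, not the code from the linked writeup; the movie IDs, the helper names, and the constant `k=60` are assumptions made for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists: each item earns 1 / (k + rank) per list."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def recommend(rankings, seen, limit=3):
    """Fuse the rankings, then drop items the user has already seen."""
    fused = reciprocal_rank_fusion(rankings)
    return [item for item in fused if item not in seen][:limit]


# Hypothetical results: one list from dense plot embeddings,
# one from sparse rating vectors.
content_based = ["alien", "blade_runner", "dune"]
collaborative = ["dune", "alien", "arrival"]
fused = reciprocal_rank_fusion([content_based, collaborative])
picks = recommend([content_based, collaborative], seen={"alien"})
```

Because RRF uses only ranks, the dense and sparse scores never need to be on the same scale.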
🚀 PapersChat – Chat with Research Papers PapersChat provides an agentic AI interface for querying papers, retrieving insights from ArXiv & PubMed, and structuring responses efficiently. Powered by @llama_index (parsing), Qdrant (vector search), and @MistralAI (reasoning), it enables: ✔️ Semantic search for precise retrieval ✔️ Context-aware Q&A to extract insights ✔️ Hybrid retrieval combining vectors + metadata Runs locally via Docker for full control. Built by @StarryNightDev. Repo here: 🔗 buff.ly/xmxacqv
🔊Build an Audio RAG from scratch Transcribe → Embed → Store → Retrieve → Generate Turn long-form audio (MP3, WAV, M4A) into an interactive chat experience using: ☑️ @AssemblyAI for transcription with speaker labels ☑️ Qdrant for fast, persistent vector storage ☑️ @deepseek_ai R1 via SambaNova for LLM inference ☑️ @huggingface + @llama_index for text embeddings ☑️ @streamlit for the chat UI
Advanced Hybrid RAG with miniCOIL, LangGraph, and @deepseek_ai 🚀 @TRJ_0751 shows how to build a hybrid Customer Support RAG chatbot using miniCOIL to augment sparse retrieval with semantic awareness ➡️ LangGraph by @langchain orchestrates the hybrid flow with MMR and re-ranking ➡️ Opik tracks and evaluates each step of the pipeline ➡️ DeepSeek-R1 by @SambaNovaAI delivers low-latency, focused answers 👉 Read it here: medium.aiplanet.com/advanced…
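MMR (maximal marginal relevance), mentioned above, trades relevance against redundancy when picking results. The sketch below is a generic plain-Python version, not the LangGraph pipeline from the linked post; the vectors, document IDs, and the `lam` weight are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mmr(query, docs, k, lam=0.5):
    """Greedily select k doc ids, penalizing similarity to earlier picks."""
    selected = []
    remaining = list(docs)
    while remaining and len(selected) < k:
        def score(doc_id):
            relevance = dot(query, docs[doc_id])
            # Redundancy: closest match among the documents already chosen.
            redundancy = max(
                (dot(docs[doc_id], docs[s]) for s in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = {
    "a": [1.0, 0.0],
    "b": [0.99, 0.14],  # near-duplicate of "a"
    "c": [0.6, 0.8],    # less relevant, but different
}
diverse = mmr(query, docs, k=2, lam=0.3)
relevance_only = mmr(query, docs, k=2, lam=1.0)
```

With a low `lam`, the near-duplicate is skipped in favor of the more diverse document; with `lam=1.0`, the selection falls back to pure relevance ordering.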
🚀 Search 36M+ vectors in <15ms @akshay_pachaar built a high-speed RAG pipeline on PubMed—no GPUs, just @SambaNovaAI RDUs, @llama_index, and Qdrant. 🔹 Binary quantization cuts memory while keeping precision 🔹 RDUs run inference 10x faster than GPUs 🔹 No CUDA, no vendor lock-in—bring your own models 🔗 Code: buff.ly/4iikdwm
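To make the binary quantization claim concrete, here is a minimal plain-Python sketch of the idea: keep one sign bit per component and compare vectors by Hamming distance. It is an illustration under simple assumptions (float32 originals, hypothetical four-dimensional vectors), not Qdrant's internal implementation.

```python
def binarize(vector):
    """Keep only the sign of each component, packed into one Python int."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a, b):
    """Number of differing bits; smaller means more similar."""
    return bin(a ^ b).count("1")


def storage_bytes(dim, bits_per_component):
    """Bytes needed to store one vector of `dim` components."""
    return dim * bits_per_component // 8


query = binarize([0.5, -0.1, 0.3, -0.7])
doc = binarize([0.4, 0.2, 0.1, -0.9])
distance = hamming_distance(query, doc)  # the vectors differ in one sign
# float32 (32 bits per component) versus 1 bit per component
ratio = storage_bytes(1024, 32) // storage_bytes(1024, 1)
```

Going from 32 bits to 1 bit per component gives the 32x memory reduction; in practice the top candidates are typically rescored with the original vectors to recover precision.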
⚡ miniCOIL: a lightweight sparse neural retriever capable of generalization Sparse Neural Retrieval holds excellent potential, making term-based retrieval semantically aware. The issue is that most modern sparse neural retrievers rely heavily on document expansion (making inference heavy) or perform poorly out of domain. ❗️🧵 We present our latest attempt to make sparse neural retrieval usable.
🚀 Building WhatsApp Agents with @langchain, @GroqInc, @elevenlabs, and Qdrant 🎓 In this 2.5-hour course, @moteropedrido and @_jesuscopado will show you how to build Ava — a multi-modal WhatsApp AI agent that talks, sees, remembers, and responds. You’ll learn to: ➡️ Structure LangGraph-based workflows ➡️ Use Qdrant for long-term vector memory ➡️ Add STT + TTS with Whisper and ElevenLabs ➡️ Process and generate images with LLaVA and diffusion models ➡️ Deploy to WhatsApp via @googlecloud
🚀 Qdrant + n8n: Automating Processes Beyond Similarity Search Low-code tools like @n8n_io make it easy to turn ideas into reality — and combining them with vector search unlocks a whole new level of that reality. To support the flowgramming side of our community, we’ve collected a few Qdrant-based tools built for n8n agents. 👇
🖼️ Multimodal RAG with Colnomic, Qdrant & @Minio A CLI-first RAG pipeline that indexes and searches PDFs + images without OCR. Embed with @nomic_ai Colnomic, store vectors with Qdrant using binary quantization, manage objects via MinIO, and accelerate large-scale retrieval 13x using ColPali-style mean pooling + reranking. 👉 Check out the project: athrael.net/blog/little-scri…
🚀 Reranking in Qdrant is insanely efficient Most teams misuse multivector search by indexing every token-level vector causing RAM bloat and slow inserts. There’s a better way: store token-level vectors 𝐰𝐢𝐭𝐡𝐨𝐮𝐭 𝐢𝐧𝐝𝐞𝐱𝐢𝐧𝐠. ✅ Native ColBERT-style reranking ✅ Skip HNSW for multivectors; index only dense vectors ✅ Run fast retrieval + accurate rerank in a single API call No more wasting compute indexing 1000s of vectors per doc. Just clean, scalable late interaction that works. Built with FastEmbed. Works at scale. ➡️ See the tutorial: qdrant.tech/documentation/ad… ➡️ Notebook: github.com/qdrant/examples/b…
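The late-interaction scoring behind ColBERT-style reranking can be sketched without any library. The snippet below is a plain-Python illustration of MaxSim over a small candidate set, assuming the candidates were already retrieved with a dense vector; the token vectors and document IDs are hypothetical, and this is not the Qdrant API call from the tutorial.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens, doc_tokens):
    """For each query token take its best-matching doc token, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rerank(query_tokens, candidates):
    """Order candidate doc ids by MaxSim score, best first."""
    return sorted(
        candidates,
        key=lambda doc_id: maxsim(query_tokens, candidates[doc_id]),
        reverse=True,
    )


query_tokens = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    "b": [[1.0, 0.0], [1.0, 0.0]],  # matches only the first one
}
order = rerank(query_tokens, candidates)
```

MaxSim only runs over the handful of retrieved candidates, so the token-level vectors need to be stored but never searched through an index.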
Build an Agentic RAG pipeline using crewAI, Qdrant, and @langchain to analyze and summarize research papers. Key steps: ➡️ Load data from a PDF file. ➡️ Use RecursiveCharacterTextSplitter to index the data in chunks. ➡️ Insert metadata into Qdrant. ➡️ Set up a RetrievalQA pipeline. Finally, use @crewAIInc Agents to generate two documents: ➡️ The Researcher Agent conducts an in-depth analysis of the paper. ➡️ The Writer Agent summarizes the main insights. Check out this project by Benito Martin using the "Attention is all you need" paper as an example: github.com/benitomartin/crew…
DSPy is set to become the "PyTorch" for Language Programs 🔥 Our new tutorial takes you from beginner to pro on DSPy and how to compile a retrieve-then-read RAG with DSPy and Qdrant 🚀
🤖🚀 Multi-Agent Meeting Assistant on GCP Fully serverless system that transcribes meetings, summarizes them with LLM agents, stores context in Qdrant, and syncs tasks to Trello with results delivered directly in Slack. Uses @AgnoAgi for agent orchestration, @FastAPI on Cloud Run, and @OpenAI for embeddings + reasoning. 🔗 Read the full build + walkthrough: medium.com/@eduardovasquez_0…
🎙️ Meet Oliva: a multi-agent voice assistant built with LangGraph by @langchain, @superlinked, and Qdrant. It handles product search using natural voice input, routes tasks across specialized agents, and keeps everything fast, accurate, and context-aware. Built on a graph-based architecture with real-time voice via @livekit and @DeepgramAI, and fully open source. 👉 Try it out: buff.ly/6kxjS5m
🦝 Meet RAGcoon: Your Agentic RAG assistant for startup building. It combines structured reasoning with retrieval strategies that adapt to the complexity of each query. Depending on the case, it switches between hybrid search, hypothetical document generation, or sub-query decomposition. Runs on Qwen-32B via @GroqInc, built with @qdrant_engine for dense and sparse vector search, and evaluated with @llama_index for faithfulness and relevance. 🚀 Developed by @StarryNightDev 👉 Try it out: github.com/AstraBert/ragcoon
🚀 Advanced RAG Evaluation with miniCOIL, LangGraph, and @deepseek_ai In his new blog, @TRJ_0751 shows how to evaluate and monitor every component of a Hybrid Search RAG pipeline using: ➡️ LLM-as-a-Judge for binary evaluation of context relevance, answer relevance, and groundedness ➡️ Opik for trace logging and post-hoc feedback loops ➡️ DeepSeek-R1 powered by @SambaNovaAI ➡️ Qdrant as the vector store for dense and sparse (miniCOIL) embeddings LangGraph by @langchain manages the full pipeline, including parallel evaluation steps after generation. Each one runs a structured LLM prompt to check how accurate and grounded the answer is. 👉 Read the full walkthrough: medium.aiplanet.com/evaluate…
We put together a detailed end-to-end notebook so you can get started with jina-colbert-v1-en. Get started here: colab.research.google.com/dr…
Introducing jina-colbert-v1-en. It takes late interactions & token-level embeddings of ColBERTv2 and has better zero-shot performance on many tasks (in and out-of-domain). Now on @huggingface under Apache 2.0 licence huggingface.co/jinaai/jina-c…
🚨 𝗤𝗱𝗿𝗮𝗻𝘁 𝟭.𝟭𝟰 𝗶𝘀 𝗼𝘂𝘁! What’s new: ➡️ Score-Boosting Reranker Define custom scoring logic in your query. Boost titles, prefer recent docs, factor in geo proximity, apply business rules. ➡️ Incremental HNSW Indexing Upserts now grow the index without full rebuilds. Faster ingest, lower load. ➡️ Faster Batch Queries Now multi-threaded, even on single-segment setups. Up to 2.3× speedup. ➡️ Smarter Segment Optimization Better CPU and IO usage during merges. Indexing 400M vectors: 40h → 28h. More performance, same setup. 🚀
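The idea of score boosting can be illustrated with a small formula in plain Python. This is a conceptual sketch only: the function name, the weights, and the exponential recency decay are assumptions for illustration and do not reproduce Qdrant's query syntax.

```python
import math


def boosted_score(similarity, title_match, age_days,
                  title_boost=0.5, recency_weight=0.25, half_life_days=30.0):
    """Combine vector similarity with simple business rules."""
    # Recency halves every `half_life_days`: 1.0 for a new doc, 0.5 after 30 days.
    recency = math.exp(-math.log(2) * age_days / half_life_days)
    boost = title_boost if title_match else 0.0
    return similarity + boost + recency_weight * recency


fresh_title_hit = boosted_score(0.8, title_match=True, age_days=0)
older_body_hit = boosted_score(0.8, title_match=False, age_days=30)
```

Two documents with the same vector similarity end up ranked differently once the title match and recency terms are added.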
We’re excited to announce the launch of Qdrant Hybrid Cloud, the first-ever managed vector database you can deploy anywhere—cloud, on-premise, or edge— designed for true deployment flexibility, data sovereignty, privacy, and control. Why is this big? 🚀 Deployment flexibility and data sovereignty are critical as the industry moves from prototyping to deploying production-ready AI applications. Easy integration with existing systems complements these advantages, streamlining development and operations. Key benefits of the Hybrid Cloud: ✔️ Deploy Anywhere: Deploy Qdrant in any environment of choice with our Kubernetes-native design. ✔️ Full Data Sovereignty: Enjoy privacy control with decoupled data and control planes with complete database isolation. ✔️ Fully Managed: Enjoy the benefits of a managed vector database within your own environment. ✔️ Effortless Setup: One-line installation by simply adding your environment to your Qdrant Cloud account. Thank you to our trusted launch partners for their collaboration: @OracleCloud, @RedHat, @Vultr, @OVHcloud, @Scaleway, @DigitalOcean, STACKIT, @llama_index, @langchain, @AirbyteHQ, @CivoCloud, @JinaAI_, @Aleph__Alpha, @Haystack_AI by @deepset_ai.
⚡️ Fast Multi-Document RAG using Qdrant, @SambaNovaAI, @deepseek_ai, and LangGraph In Part 1 of his new series, @TRJ_0751 shows how to build a high-speed, memory-efficient RAG system that can handle multiple documents at a large scale. ➡️🧠 32x memory savings with Binary Quantization ➡️ Fast, focused LLM responses via DeepSeek-R1 ➡️ Modular orchestration with LangGraph by @langchain 👉 Read the full blog: medium.aiplanet.com/fast-mul…
⚡️ FastEmbed 0.3.0 is here! Now featuring Image embeddings (ResNet50), multimodal embeddings (CLIP), late interaction embeddings (ColBERT), and an innovative type of sparse embeddings. 🙌 GitHub: github.com/qdrant/fastembed/ Change Log: github.com/qdrant/fastembed/…
🎥 Semantic Video Search with @twelve_labs + Qdrant In most AI systems, search still revolves around text. Even in video, we often rely on transcripts, tags, or OCR. This one does it by meaning. In this new tutorial, @hrishikesh_ai shows how to build a video retrieval engine that understands video: → Embeds entire video files with visual + temporal context with TwelveLabs → Stores vectors in Qdrant 🦀, with public S3 URLs as payload → Runs semantic search with mood + intent filtering → Returns relevant video clips by meaning, not keywords 🔗 Blog: twelvelabs.io/blog/content-r… 💻 GitHub: github.com/Hrishikesh332/Twe…
Is Vision All You Need? 👀 It's not a secret that text chunking methods in RAG are resource-demanding and often result in the loss of significant visual context. But what if you could skip chunking entirely? In his latest blog, @skvark shares how VLMs are revolutionizing RAG. By indexing entire document pages as images, strategies like ColPali can now capture both text and visual context faster and with no need for chunking. Could this be the future of RAG? 👉 Read the full blog: buff.ly/4fsa532 🖥️ Try out the V-RAG demo on @SoftlandiaLtd’s GitHub: buff.ly/3Ym5oR6
🚀 Building RAG with native Qdrant nodes in @n8n_io Fully automated RAG pipeline that can be deployed, iterated, and extended entirely from within the n8n canvas. Chunk documents with @ChonkieAI, embed with @JinaAI_, generate vector payloads via @FastAPI, and return responses using a local @ollama instance running Qwen-14B. 👉 Read the tutorial by @pavan_mantha1: medium.com/@manthapavankumar…
Traditional RAG frameworks sometimes fall short in finding the most relevant content from large datasets. That’s why Darshil Modi created AutoMeta RAG—a framework that enhances data retrieval by adding dynamic metadata to RAG systems. Key benefits: ➡ More relevant and accurate results with metadata narrowing the search space ➡ Faster and precise searches using @qdrant_engine for fine-grained indexing ➡ Efficient and targeted searches with automatic metadata generation by LLMs ➡ Preserves data quality by using original data for final answers 🔗Learn more: thinkinbytes.medium.com/auto…
Exciting news! 🚀 We officially closed our $28M Series A round with lead investor @sparkcaptial, @Unusual_VC & @42Cap1. Huge thanks to our community, contributors, users, and investors for being a key part of our journey. 🙌 ▶️ Learn more: qdrant.tech/blog/series-a-fu…
Hey all! We actually did find a discrepancy with our previous benchmarks of bm42. Please don't trust us and always check performance on your own data. Our best effort to correct it is here: github.com/qdrant/bm42_eval
Okay, gloves off. What @qdrant_engine did with the BM42 post is unacceptable. They are misguiding the RAG community in a big way. 1) Presenting Quora as a relevant RAG question-answering dataset. It's not. 2) Presenting a fake result. Yes. Fake. - Quora might sound like a relevant RAG question-answering dataset. In reality, it is a question-to-question dataset with the task of finding duplicate questions. But Quora sounds appropriate for a RAG benchmark if you don't know the dataset. They report Precision@10 for BM42 is better than BM25 on Quora with 0.49. But how can Precision@10 be that for a dataset when the upper bound for Precision@10 is 0.2? It's fake. A baseline BM25 implementation on the dataset will have a recall@10 of 0.88, precision@10 0.12, and nDCG@10 0.78. Plus, you don't need to run embedding inference.
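The upper-bound argument in the quoted post follows from how Precision@k is defined. The sketch below assumes, for illustration, that a query has at most two relevant documents; the document IDs are hypothetical.

```python
def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k retrieved ids that are relevant."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


# Assumption for this example: only two relevant documents exist for the query.
relevant = {"dup_1", "dup_2"}
# A perfect ranking puts both at the top, followed by eight non-relevant ids.
perfect_ranking = ["dup_1", "dup_2"] + [f"other_{i}" for i in range(8)]
best_possible = precision_at_k(perfect_ranking, relevant, k=10)
```

Even a perfect ranker can score at most 2 / 10 = 0.2 under that assumption, so a reported value well above it signals an evaluation problem.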
🚀 Build Your Own SSE-Based Agent In part 3 of the MCP Playlist, @TRJ_0751 shows how to connect Qdrant's MCP Server with @AgnoAgi and Gemini 2.5 Flash to stream real-time responses from tool-based agents. The setup combines STDIO and SSE-based tools into a single structured workflow, with custom search tools powered by @DuckDuckGo. 🔗 Watch the full video: piped.video/watch?v=StXHQ3z5…
🤔 The Impact of Chunk Size & Overlap on RAG Pipeline Results
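To show what the two knobs control, here is a minimal fixed-size chunker in plain Python. Splitting on whitespace and counting words rather than tokens are simplifying assumptions for this sketch; it is not taken from the linked material.

```python
def chunk_words(text, chunk_size, overlap):
    """Split text into word chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g", chunk_size=4, overlap=2)
```

A larger overlap repeats more context across chunk boundaries at the cost of storing and embedding more chunks.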
Dense embedding models are not giving up yet! Surprisingly, they are also pretty good late interaction models! Please welcome ColBERT-like retrieval with just sentence transformers 🎉
Have you tried miniCOIL v1 on @huggingface yet? 🤗 Word-level, contextualized 4D sparse embeddings with automatic BM25 fallback. 👉 Try it out: huggingface.co/Qdrant/minico…
⚡️ News Flash! ⚡️ DSPy, the framework for solving advanced tasks with language models and retrieval models from Stanford, can now be turbocharged with Qdrant's retrieval prowess. 🦀 🔍 Search and Retrieve: DSPy + Qdrant = Lightning-fast NLP tasks.
🛡️ Building Resilient RAG Applications with Guardrails and Semantic Caching Create a system to deliver accurate data retrieval and safe content generation, even when handling complex queries. In this article, @pavan_mantha1 walks us through creating a robust RAG architecture that integrates @qdrant_engine, @LiteLLM, @Redisinc, and Llama-Guard-3-8b. Key takeaways: ✔ 𝐇𝐲𝐛𝐫𝐢𝐝 𝐒𝐞𝐚𝐫𝐜𝐡 𝐰𝐢𝐭𝐡 𝐐𝐝𝐫𝐚𝐧𝐭: Combines dense and sparse models to enhance the precision of data retrieval. ✔ 𝐄𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐜𝐲 𝐰𝐢𝐭𝐡 𝐋𝐢𝐭𝐞𝐋𝐋𝐌 𝐚𝐧𝐝 𝐑𝐞𝐝𝐢𝐬: Utilizes semantic caching to speed up processing and improve consistency. ✔ 𝐒𝐚𝐟𝐞𝐭𝐲 𝐰𝐢𝐭𝐡 𝐋𝐥𝐚𝐦𝐚-𝐆𝐮𝐚𝐫𝐝-3-8𝐛: Implements stringent pre- and post-processing checks to ensure content safety and relevance. 🔗 Go deeper into the implementation in the full article: buff.ly/3XsqAG5
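Semantic caching, as described above, returns a stored answer when a new query is close enough to an earlier one. The sketch below is an in-memory plain-Python illustration; the `SemanticCache` class, the 0.9 threshold, and the two-dimensional embeddings are hypothetical and stand in for the LiteLLM and Redis setup in the article.

```python
import math


def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class SemanticCache:
    """Reuse an answer when a query embedding is close to a cached one."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def put(self, embedding, answer):
        self.entries.append((embedding, answer))

    def get(self, embedding):
        best_score, best_answer = 0.0, None
        for cached, answer in self.entries:
            score = cosine(embedding, cached)
            if score > best_score:
                best_score, best_answer = score, answer
        # Only a sufficiently similar query counts as a cache hit.
        return best_answer if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "cached answer")
hit = cache.get([0.99, 0.05])   # a paraphrase of the cached query
miss = cache.get([0.0, 1.0])    # an unrelated query
```

A cache hit skips the LLM call entirely, which is where the speed and consistency gains come from.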
FastEmbed ⚡️, our lightweight and fast Python library for embedding generation, is now integrated into Langchain v0.0.335, enabling painless sentence embedding generation with minimal overhead and no cost for your Langchain projects.
Relevance feedback allows retrieval systems to iteratively improve search results in the direction of relevance. It’s been studied for over 60 years — yet remains absent in modern production-level neural search. We looked into the research field to understand why — and along the way, gathered this summary of methods proposed over the years. ⬇️
🚀 Building a 100% Local RAG with Gemma 3, @ollama, and @qdrant_engine In one of his latest live streams, @maxedapps sets up a fully local RAG app from scratch. If you're looking to get hands-on with local AI, this is the video to watch. 📹 Watch the complete 2-hour tutorial: piped.video/6diVTn3J7QE?si=Uw11… 💻 See the final app on Github: github.com/mschwarzmueller/b…
Qdrant 1.12.0 is out! 🚀 In this release, we focused on making large-scale data handling not only more efficient but also more insightful. ➡️ Distance Matrix API: Simplifies clustering by calculating distances between vectors in a single request, useful for tasks like grouping similar data points. You can also visualize these results directly in the Web UI’s Graph Exploration Tool. ➡️ Facet API for Metadata Cardinality: Refine searches and identify patterns in your dataset by aggregating and counting values for a specific payload field. ➡️ Geo and Text Index On-Disk Support: Reduce memory usage by moving text and geo indices to disk.
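What the two new APIs compute can be illustrated in plain Python. The sketch below mimics the results conceptually (value counts for a payload field, pairwise scores between vectors); the function names and the point layout are assumptions for the example and are not the Qdrant client API.

```python
from collections import Counter


def facet_counts(points, key, limit=10):
    """Count how often each value of a payload field occurs."""
    counts = Counter(
        p["payload"][key] for p in points if key in p.get("payload", {})
    )
    return counts.most_common(limit)


def pairwise_scores(vectors):
    """Dot-product score between every pair of vectors."""
    return [
        [sum(x * y for x, y in zip(a, b)) for b in vectors] for a in vectors
    ]


points = [
    {"vector": [1.0, 0.0], "payload": {"color": "red"}},
    {"vector": [0.0, 1.0], "payload": {"color": "red"}},
    {"vector": [1.0, 1.0], "payload": {"color": "blue"}},
    {"vector": [0.5, 0.5], "payload": {}},  # no color field, so not counted
]
facets = facet_counts(points, "color")
matrix = pairwise_scores([p["vector"] for p in points[:2]])
```

Facet counts show how values are distributed before filtering on them, and the pairwise matrix is the usual input for clustering or graph visualization.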
We did it!⭐️Fastembed has reached 1,000 GitHub stars and is being used in almost 700 projects! A huge thank you to our community for your support and contributions. Fastembed aims to make text embedding fast, easy, and accessible to all. Try it out: github.com/qdrant/fastembed
🚀 𝗔𝗿𝗰𝗵𝗔𝗜: 𝗬𝗼𝘂𝗿 𝗖𝗼𝗱𝗲𝗯𝗮𝘀𝗲, 𝗘𝘅𝗽𝗹𝗮𝗶𝗻𝗲𝗱 𝗯𝘆 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁𝘀 𝘄𝗶𝘁𝗵 @crewAIInc 𝗮𝗻𝗱 @qdrant_engine ArchAI automatically clones, analyzes, documents, and diagrams your codebase, backed by a crew of AI agents running on CrewAI, storing context in Qdrant, and integrating with SonarQube for inspection of code quality. 🧠 It understands code structure and function 📊 Generates PlantUML diagrams 📝 Produces clean, human-readable README-style documentation 📈 Pulls in SonarQube metrics 🔍 Uses Qdrant to remember what it learns Everything runs locally or with your preferred LLM (OpenAI, Gemini, @ollama, etc). Just point it to your repo and let it go. Want to test it or contribute? Here’s the repo: github.com/ArchAI-Labs/code_…
📂 Build a RAG pipeline with full observability and multi-format document support. This open-source project by @patrick_verol processes multiple file types using @langchain and @huggingface for chunking and embedding, with Qdrant handling vector storage and retrieval. ✅ Supports PDF, DOCX, PPTX, TXT ✅ Uses LLaMA 3 for generation ✅ @FastAPI serves the backend ✅ Real-time observability with @grafana ✅ Fully containerized with Docker
🧠 AI-Powered Log Analysis New open-source system for querying system logs using natural language, vector search, and LLMs. Combines Qdrant for semantic similarity, @langfuse for prompt observability, and @FastAPI to serve responses from ChatGPT or Claude.🚀 Logs are embedded using Sentence Transformers @huggingface, and the system supports feedback-driven improvement. 👉 Check it out: github.com/nthanhdo2610/ai-l…
Similarity learning surpasses classification when there is a large number of classes. Regular classifiers struggle with a thousand classes, but similarity models handle this case naturally.
New @icmlconf paper: We develop Siamese neural network architectures for extreme classification, allowing us to jointly and efficiently learn feature and label representations and achieve state-of-the-art results on tasks with up to 100 million labels. proceedings.mlr.press/v139/d…
Replying to @elonmusk
Powered by @qdrant_engine Qdrant Vector Database!🚀🚀🚀
🛠️ Using @apachekafka and Qdrant via MCP: Structured Tooling for Claude LLMs are very good at reasoning over static inputs. But real systems stream data, evolve state, and respond in real time @pavan_mantha1 connected Claude to Kafka, FastEmbed, and Qdrant, with each component running as its own MCP server. Claude decides what to use, and when to use it. ➡️ Kafka streams events ➡️ FastEmbed provides embeddings ➡️ Qdrant stores and retrieves the vectors ➡️ A Kafka to Qdrant connector keeps everything in sync, allowing memory to evolve in real time as new data arrives 📖 Full write-up with code and architecture: towardsdev.com/kafka-mcp-and…
🚀Real-time repo recommendations embedded in @github SimRepo is a browser extension that integrates Qdrant's ANN retrieval directly into the GitHub UI. It models repository similarity using a vector space trained on 300M+ GitHub star events with an SVC-based embedding. Repos are positioned based on co-star patterns, enabling fast discovery of semantically related projects. Install it on Chrome, Firefox, Edge, Brave, or even Tor. 🧪 Try it here → github.com/Mubelotix/SimRepo
🍾 New Year, New (Smarter) Agents! 🚀 Start 2025 by moving beyond static RAG pipelines. @Saboo_Shubham_ and @Gargi__Gupta share how to build a Corrective RAG Agent: a multi-stage pipeline designed to: ✅ Validate relevance with Claude 3.5 ✅ Refine queries dynamically ✅ Use @tavilyai for real-time data ✅ Orchestrate workflows with LangGraph
Building Smarter Agents with @llama_index and Qdrant's Hybrid Search 🦙🔍 @pavan_mantha1 explores an advanced RAG architecture combining LlamaIndex agents with Qdrant's hybrid search capabilities! The setup leverages both dense and sparse vector embeddings for precise data retrieval. Key components: 1️⃣ Orchestrator: Coordinates workflow via @RabbitMQ 2️⃣ Info_tool_agent: Retrieves data using Qdrant's hybrid search 3️⃣ Summary_tool: Compiles coherent responses 4️⃣ Hybrid search: Combines dense and sparse embeddings The implementation uses @SnowflakeDB /snowflake-arctic-embed-s for dense embeddings and prithivida/Splade_PP_en_v1 for sparse, with @MistralAI through @ollama for LLM tasks. Pavan gives us a detailed look at the architecture and implementation, showcasing how to build complex, efficient workflows for AI-driven data solutions. Check out the full article published in GoPenAI: blog.gopenai.com/building-sm…
📢 New in Qdrant: Visualize and expand data points, enabling controlled exploration of vector spaces in our Web UI.
Yes, it can. We made it work. With a few optimization tricks, TL;DR: - ONNX inference in Rust - Embeddings cache & lookup - Parallel & Batch requests - hybrid search with full-text filtering + vector re-scoring More details in the article: qdrant.tech/articles/search-…
What are vector embeddings? Every like, share, retweet, and even your interaction with this very post generates vector embeddings. 📱
This approach works especially well with the latest release of Qdrant. It supports hybrid search out of the box, and handles IDF computation for you. For an in-depth look at BM42, read the full article here: buff.ly/3XOZSrJ
🌐 Build a GraphRAG Pipeline with Qdrant and @neo4j Standard RAG still struggles with multi-hop reasoning and deeply connected knowledge. @ptdamiba shows you how you can fix that by combining Qdrant semantic search and symbolic reasoning from Neo4j. 🚀 In just 20 minutes, you'll learn to: → Design a dual-retrieval pipeline that adapts to query complexity → Connect dense embeddings with graph-based knowledge → Solve complex lookups that break traditional RAG → Keep everything in sync with a shared ID architecture 🎥 Watch the tutorial: piped.video/watch?v=o9pszzRu… 📘 Follow the step-by-step example: qdrant.tech/documentation/ex…
The secret to optimizing RAG pipelines? It all starts with choosing the right chunking strategy. Explore different chunking techniques and see how pairing them with Qdrant’s hybrid search and reranking leads to a more precise and accurate retrieval. In this article, @pavan_mantha1 takes you step-by-step through setting up a fully local RAG pipeline using @llama_index, @ollama, and Qdrant. 🚀 Start here: buff.ly/3NzfkC0
We're on @huggingface! We've been there for a while, but 2024 is a year of semantic search. And semantic search needs embeddings! 🔢 That's why we started publishing datasets with precomputed vectors, so you can build apps even faster. buff.ly/47DWr8l
Just some GitHub org with two repositories... @xai
FastEmbed now allows you to generate efficient and interpretable sparse vector embeddings using the SPLADE++ model. 🚀 Get started with @NirantK’s guide: buff.ly/43G45i8
6
50
14,873
.@JinaAI_ just released its binary embeddings, which are 96.875% smaller and faster. For 𝘫𝘪𝘯𝘢-𝘦𝘮𝘣𝘦𝘥𝘥𝘪𝘯𝘨𝘴-𝘷2-𝘣𝘢𝘴𝘦-𝘥𝘦, binary quantization reduces retrieval accuracy from 44.39% to 42.65%, a relative loss of only about 4%. You might not need all those 32 bits: jina.ai/news/binary-embeddin…
5
50
3,579
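The 96.875% figure follows from storing 1 bit per dimension instead of a 32-bit float (1 - 1/32). The sketch below checks that arithmetic and shows one simple way to binarize a vector by sign. The sign threshold, the 768-dimension size, and the synthetic data are assumptions for illustration, not details from the Jina AI post.

```python
import struct

def binarize(vector):
    """Keep only the sign of each component: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]

def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 components per byte."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def hamming_distance(a, b):
    """Count differing bits between two packed binary vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

dim = 768  # assumed dimensionality, for illustration only
vec = [((i * 37) % 11) - 5.0 for i in range(dim)]  # synthetic values

float_bytes = len(struct.pack(f"{dim}f", *vec))  # 4 bytes per component
binary_bytes = len(pack_bits(binarize(vec)))     # 1 bit per component
reduction = 1 - binary_bytes / float_bytes

# Relative accuracy loss using the numbers quoted in the post
relative_loss = (44.39 - 42.65) / 44.39

print(f"size reduction: {reduction:.3%}, relative loss: {relative_loss:.1%}")
```

Packed binary vectors are compared with Hamming distance, which reduces to XOR plus a bit count and is one reason binary search is faster as well as smaller.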
The team at @Quora just revamped their embedding search system and chose Qdrant to make it happen. Here's why according to them 🧵
4
7
50
7,775
🚀 High-Performance RAG Agent with @cohere ⌘R Learn to create a fast, scalable system using Cohere's Command R7B, Qdrant, @langchain, and LangGraph. 💡 What You’ll Build: - PDF processing with Cohere embeddings stored in Qdrant. - A retrieval engine with web search fallback. - Seamless orchestration using LangGraph. ✍ By: @Saboo_Shubham_ & @Gargi__Gupta, @unwind_ai_
2
8
50
4,457
Qdrant engine repository is #1 on the worldwide GitHub trending list today. 🦀🚀🌟 8K stars just passed. 🎉 github.com/trending
2
3
48
4,951
Qdrant engine v1.10 has been released with new powerful features. 🚀 ➡ 𝐔𝐧𝐢𝐯𝐞𝐫𝐬𝐚𝐥 𝐪𝐮𝐞𝐫𝐲 𝐀𝐏𝐈 with built-in Hybrid Search, a fusion merge of dense and sparse results, and multi-stage queries with re-scoring. ➡ 𝐌𝐮𝐥𝐭𝐢𝐯𝐞𝐜𝐭𝐨𝐫 𝐬𝐞𝐚𝐫𝐜𝐡 with late interaction models (e.g. ColBERT) ➡ 𝐈𝐧𝐯𝐞𝐫𝐬𝐞 𝐃𝐨𝐜𝐮𝐦𝐞𝐧𝐭 𝐅𝐫𝐞𝐪𝐮𝐞𝐧𝐜𝐲 (IDF) for stream updating of BM25 and 𝐁𝐌42 sparse embeddings. ➡ 𝘧𝘭𝘰𝘢𝘵16 and 𝘶𝘪𝘯𝘵8 datatype support, S3 compatible snapshots, issue reporting, and even more... 𝐑𝐞𝐥𝐞𝐚𝐬𝐞 𝐍𝐨𝐭𝐞𝐬: github.com/qdrant/qdrant/rel… 𝐀𝐧𝐧𝐨𝐮𝐧𝐜𝐞𝐦𝐞𝐧𝐭: qdrant.tech/blog/qdrant-1.10… 𝐉𝐨𝐢𝐧 𝐮𝐬 on July 11th in Berlin at the 𝐕𝐞𝐜𝐭𝐨𝐫 𝐒𝐩𝐚𝐜𝐞 𝐄𝐯𝐞𝐧𝐭 to discuss the new features and celebrate the release with us: lu.ma/88rdjnhg 🎉
1
9
44
10,636
A new wave of embedding models is here! Just today, Parallia showcased how versatile ModernBERT can be by releasing multilingual embedding models in English, French, Dutch, and German 🇬🇧 🇫🇷 🇳🇱 🇩🇪
1
6
49
2,135
🚀 Introducing Gridstore: Qdrant’s Custom Key-Value Store We built our own storage engine to replace RocksDB for payload and sparse vector storage. It eliminates compaction overhead, optimizes sequential key access, and provides direct control over data writes. Benchmarks show 2x faster ingestion with stable throughput. Now live in Qdrant 1.13!
1
10
46
3,077
🚀 Sparse Neural Retrievers in Sentence Transformers v5 @huggingface just released version 5 of Sentence Transformers, with full support for training and fine-tuning sparse neural retrievers, so you can bring your hybrid search to the next level. This release gives you everything you need for the full fine-tuning cycle, plus the ability to: ✅ Sparsify your favorite dense encoder; ✅ Train sparse encoders on multiple datasets simultaneously; ✅ Use static embeddings to optimize SPLADE query inference time. … and run retrieval on your fine-tuned model using the semantic_search_qdrant function! We’re proud to show that “𝘘𝘥𝘳𝘢𝘯𝘵 𝘰𝘧𝘧𝘦𝘳𝘴 𝘦𝘹𝘤𝘦𝘭𝘭𝘦𝘯𝘵 𝘴𝘶𝘱𝘱𝘰𝘳𝘵 𝘧𝘰𝘳 𝘴𝘱𝘢𝘳𝘴𝘦 𝘷𝘦𝘤𝘵𝘰𝘳𝘴 𝘸𝘪𝘵𝘩 𝘦𝘧𝘧𝘪𝘤𝘪𝘦𝘯𝘵 𝘴𝘵𝘰𝘳𝘢𝘨𝘦 𝘢𝘯𝘥 𝘧𝘢𝘴𝘵 𝘳𝘦𝘵𝘳𝘪𝘦𝘷𝘢𝘭 𝘤𝘢𝘱𝘢𝘣𝘪𝘭𝘪𝘵𝘪𝘦𝘴” & looking forward to seeing your lexical retrieval struggles resolved! 🔗 Blog post by Hugging Face: huggingface.co/blog/train-sp… 🔗 Qdrant integration: sbert.net/examples/sparse_en…
5
46
2,053
📊 𝗥𝗲𝗿𝗮𝗻𝗸𝗶𝗻𝗴 𝗶𝗻 𝗩𝗲𝗰𝘁𝗼𝗿 𝗦𝗲𝗮𝗿𝗰𝗵: 3 Pro Tips After building and tuning several vector search pipelines, these reranking strategies consistently deliver better relevance and performance: 🔹 Two-Stage Retrieval Pair binary quantization with reranking. Use fast approximate search to surface candidates, then apply a more precise model to rerank. Efficient and effective. 🔹 Use Better Models Integrate advanced reranking models like ColBERT or Late Interaction LLMs. These are especially valuable when precision matters (e.g., long-tail queries or nuanced contexts). 🔹 Oversample Strategically Retrieve more than you need, then let your reranker do the heavy lifting. It’s a simple change that often improves top-k quality without adding major infra costs. What’s your go-to reranking strategy?
9
45
2,903
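The two-stage and oversampling tips above can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions: sign-based binary codes stand in for binary quantization, a dot product on the original vectors stands in for the "more precise model", and all names and data are hypothetical.

```python
def binarize(vector):
    """Sign-based binary code, standing in for binary quantization."""
    return tuple(1 if x > 0 else 0 for x in vector)

def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def two_stage_search(query, vectors, top_k=2, oversampling=2.0):
    """Stage 1: cheap candidate selection on binary codes.
    Stage 2: rescore the candidates with the original vectors."""
    limit = max(top_k, int(top_k * oversampling))
    q_bits = binarize(query)
    candidates = sorted(
        vectors, key=lambda doc_id: hamming(q_bits, binarize(vectors[doc_id]))
    )[:limit]
    rescored = sorted(
        candidates, key=lambda doc_id: dot(query, vectors[doc_id]), reverse=True
    )
    return rescored[:top_k]

# Hypothetical toy collection
vectors = {
    "a": [0.9, 0.8, -0.1],
    "b": [0.1, 0.1, -0.9],
    "c": [0.7, -0.2, -0.3],
    "d": [-0.5, -0.6, 0.4],
}
query = [1.0, 0.5, -0.2]

without_oversampling = two_stage_search(query, vectors, top_k=2, oversampling=1.0)
with_oversampling = two_stage_search(query, vectors, top_k=2, oversampling=2.0)
print(without_oversampling, with_oversampling)
```

In this toy data the binary codes for "a" and "b" are identical, so the coarse stage alone cannot tell them apart from the better match "c"; retrieving extra candidates gives the rescoring step the chance to promote it.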
Architect and build a real-world LLM system 🌍 @iusztinpaul explains how to build and design high-performance RAG inference pipelines in lesson 9 of the free LLM Twin course. He walks us through: ☑ Microservice vs monolithic LLM architectures ☑ Building a production-grade RAG business module with Qdrant and @superlinked ☑ Deploying LLM microservices on @Qwak_ai ☑ Implementing prompt monitoring with @Cometml 👉 Read the article: medium.com/decodingml/archit… 👉 See all 11 lessons of the LLM Twin course: medium.com/decodingml/llm-tw…
11
44
5,679
🧐 Simplicity is the ultimate sophistication! Build a Retrieval-Augmented Generation (#RAG) app from scratch with @KrisOgrabek's step-by-step guide using @qdrant_engine, @llama_index, and @OpenAI's GPT-4o mini! 🔥👉 ai.gopubby.com/simple-rag-wi…
3
9
42
6,757
docker run -p 6333:6333 qdrant/qdrant
8
42
4,301
Speed up your RAG projects with semantic caching! Join @infoslack for an in-depth exploration of how semantic caching works and how it can minimize latency and boost your app performance. 🚀 Watch the video: buff.ly/3UtjE8Z
11
46
13,306
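As a rough illustration of the idea, a semantic cache returns a stored answer when a new query's embedding is close enough to a previous one, skipping the LLM call. The sketch below is a minimal in-memory version; the class name, the 0.9 cosine threshold, and the two-dimensional toy embeddings are all assumptions, and the video may implement it differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off, tune per application
        self.entries = []           # list of (embedding, answer) pairs

    def put(self, embedding, answer):
        self.entries.append((list(embedding), answer))

    def get(self, embedding):
        """Return the closest cached answer, or None on a cache miss."""
        best_score, best_answer = 0.0, None
        for stored, answer in self.entries:
            score = cosine(embedding, stored)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "Qdrant is a vector search engine.")

hit = cache.get([0.99, 0.05])   # nearly the same direction as the stored query
miss = cache.get([0.0, 1.0])    # orthogonal, so unrelated
print(hit, miss)
```

In a real deployment the linear scan would be replaced by a vector search over a dedicated collection, with the threshold trading hit rate against the risk of returning an answer to a different question.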
Look what is trending on GitHub. Qdrant is heading to 15K stargazers! 🤩 Want to receive our swag package with a cool 𝐕𝐞𝐜𝐭𝐨𝐫 𝐒𝐩𝐚𝐜𝐞 𝐓-𝐬𝐡𝐢𝐫𝐭 and stickers? Reshare this post and tell us about your experience working with #vectorsearch. 🙌 github.com/qdrant/qdrant
3
4
42
3,803
🔥 Qdrant MCP Server v0.7.1 is out! Now you can tailor tool descriptions to match your exact use case, whether it's code search, specialized knowledge bases, or other applications. This makes the server even more flexible and adaptable to different workflows. 🔗 Full changelog: github.com/qdrant/mcp-server…
2
5
43
3,434
A fresh dataset is available on @huggingface 🤗 1M pre-computed embeddings for DBpedia entities. Generated using OpenAI's text-embedding-3-large 1536 dimensions 🚀 Ready to use in your NLP tasks ⬇ huggingface.co/datasets/Qdra…
2
11
41
11,064
You need open-source software not only when you want to contribute to the project or see how it works under the hood. Here are the hidden costs of engaging with closed-source services:
3
8
37
5,045
RAG, but with Vision Language Models! 📄👀 It's no secret that traditional RAG systems can miss out on important visual context. By integrating vision models, we’re transforming retrieval to deliver more accurate, context-aware results. That’s what @pavan_mantha1 explores with a dual-stream RAG setup using Vision Language Models (VLMs). 🔧 Key Tools & Strategies: ▪ Dual Processing: PDFs are split into text (extracted with pypdf) and images (pdf2image). ▪ Qdrant Multi-Vector Storage: Stores both text and image embeddings, using CLIP for visuals and MiniLM for text. ▪ Smart Retrieval: Uses Reciprocal Rank Fusion (RRF) to fetch the most relevant content, analyzing visuals with OpenAI’s GPT-4o Vision model. Build a RAG setup that gets the full picture — literally: buff.ly/4hmrsnD
10
41
6,971
🔥Practical Guide to RAG-based Pipelines Evaluation and Improvement @atitaarora (Solution Architect, @qdrant_engine) and @DeannaLEmery's (Founding AI Researcher, @QuotientAI) talk from the @aiDotEngineer World's Fair 2024 is finally available! They address core challenges in #RAG-based pipelines and show how to improve desirable metrics like faithfulness, chunk and context relevance step-by-step. 👇
1
8
41
2,989
🚀 Multimodal RAG with ColPali + Qdrant A hands-on guide to building a multimodal document QA system that keeps full visual context using ColQwen 2.5, Qdrant, Claude Sonnet, @supabase, and @huggingface. Built with @FastAPI and no text extraction at all. Check it out 👉 decodingml.substack.com/p/th…
1
3
41
1,570
What is RAG? RAG improves the accuracy and depth of LLMs by enabling them to access external data 🎯
1
4
39
7,516
You can now perform Hybrid Search using a single method with our Python SDK! 🔥 To process your requests, the hybrid search class requires three key components: 1. Models to transform your query into a vector, 2. The Qdrant client for executing search queries, 3. A fusion function to re-rank both dense and sparse search results. Fastembed simplifies this by integrating query encoding, search, and fusion into a single method call and utilizes reciprocal rank fusion to combine the results. 🚀
3
2
38
4,841
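The fusion step mentioned above, reciprocal rank fusion (RRF), can be sketched independently of any SDK. Each document receives 1 / (k + rank) from every result list it appears in, and the sums decide the final order. The constant k = 60 is the value commonly used in the RRF literature; whether Fastembed or Qdrant uses the same constant is an assumption, and the document IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    rankings: list of lists, each ordered from best to worst.
    k: smoothing constant (60 is assumed here, not taken from the post).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from a dense and a sparse search
dense_results = ["doc1", "doc2", "doc3"]
sparse_results = ["doc3", "doc1", "doc4"]

fused = reciprocal_rank_fusion([dense_results, sparse_results])
print(fused)
```

RRF uses only ranks, never raw scores, so dense similarity values and sparse scores do not need to be normalized to a common scale before they are combined.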
📊 Multi-stream stock analysis with Qdrant + @GroqInc RAGfolio uses advanced vector search to power LLM-driven investment analysis across: ✅ 10-K SEC filings for long-term business health ✅ 10-Q quarterly reports for short-term trends ✅ Recent news for market perception It combines dense vectors (MiniLM), sparse BM25, and ColBERTv2 for late interaction. LLM inference runs on Groq for ultra-fast, structured reasoning over the retrieved documents. ➡️ Full project: github.com/infoslack/RAGfoli…
2
7
38
1,930
🚀 Advancing GraphRAG with Qdrant, @neo4j, and dynamic ontology Athos Georgiou brings GraphRAG closer to production with automatic triplet extraction, vector search via Qdrant, and semantic graphs via Neo4j. Explore the project here: athrael.net/blog/experiments…
1
9
40
2,280
FastEmbed now supports GPU inference! 🚀 Generate embeddings faster than ever. Check it out: 🔗qdrant.to/fastembed (Image embeddings are coming soon. 😉)
1
8
36
9,743
🚀 Learn how to build modular RAG pipelines that boost answer quality with smart re-ranking in the latest tutorial by @pavan_mantha1. ➡️ Re-rankers from @cohere, ColBERT, @JinaAI_, and @VoyageAI ➡️ Easy LLM switching with @litellm ➡️ Full observability and trace tracking using @langfuse Each response is scored with DeepEvals to measure answer relevancy and reranker impact. 👉 Read the full blog: medium.com/@manthapavankumar…
1
8
35
1,548
Presenting Qdrant support in Semantic-Router, a library for building a decision-making layer for your AI agents! Rather than waiting for unreliable LLM generations to make tool-use/safety decisions, Semantic-Router uses vector space magic to route requests by semantic meaning.
2
5
37
3,682