At India Today Group, we needed an intelligent search system that could understand editorial queries like "find articles about economic policy impact on rural India" — not just keyword matching, but semantic understanding. Here's how I built it using RAG (Retrieval-Augmented Generation).
What is RAG?
RAG combines a retrieval system (finding relevant documents) with a generation model (LLM like GPT-4) to produce accurate, context-aware answers grounded in your actual data — not hallucinated facts.
Architecture Overview
User Query → Embed Query (OpenAI) → Vector Search (MongoDB Atlas)
→ Top K Documents → LLM Prompt + Context → Generated Answer
Step 1: Setup Dependencies
npm install langchain @langchain/openai @langchain/community mongodb
Step 2: Document Ingestion Pipeline
First, we ingest articles from our CMS, split them into chunks, generate embeddings, and store them in a MongoDB Atlas collection backed by a vector search index:
import { OpenAIEmbeddings } from '@langchain/openai';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { MongoDBAtlasVectorSearch } from '@langchain/community/vectorstores/mongodb_atlas';
const embeddings = new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'text-embedding-3-small',
});
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
// Split articles into chunks, attaching metadata to each chunk
// (`articles` is the list fetched from the CMS)
const docs = await splitter.createDocuments(
articles.map(a => a.content),
articles.map(a => ({ title: a.title, id: a._id, date: a.publishedAt }))
);
// Store with embeddings in MongoDB
await MongoDBAtlasVectorSearch.fromDocuments(docs, embeddings, {
collection: mongoCollection,
indexName: 'article_vector_index',
});
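One caveat: `fromDocuments` writes the documents and embeddings, but it assumes the Atlas Search index named above already exists — you create it once in the Atlas UI or CLI. A definition along these lines should work for this setup (`embedding` is LangChain's default field name for the vector, and `text-embedding-3-small` produces 1536-dimensional vectors; adjust if you override either):

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "type": "knnVector",
        "dimensions": 1536,
        "similarity": "cosine"
      }
    }
  }
}
```

Cosine similarity is the usual choice for OpenAI embeddings; if the dimensions here don't match the embedding model's output, queries will silently return no results.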
Step 3: Query Pipeline with LangChain
import { ChatOpenAI } from '@langchain/openai';
import { RetrievalQAChain } from 'langchain/chains';
const llm = new ChatOpenAI({
modelName: 'gpt-4-turbo-preview',
temperature: 0.2,
});
const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
collection: mongoCollection,
indexName: 'article_vector_index',
});
const chain = RetrievalQAChain.fromLLM(llm, vectorStore.asRetriever(5), {
returnSourceDocuments: true, // include the retrieved articles for attribution
});
// Query
const result = await chain.call({
query: 'What are the latest developments in India education policy?',
});
console.log(result.text); // AI-generated answer
console.log(result.sourceDocuments); // retrieved articles backing the answer
Step 4: Production Considerations
In production at India Today, we added: caching (Redis for repeated queries), rate limiting, streaming responses for real-time UX, and source attribution so editors can verify AI-generated summaries.
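The caching layer is simple in principle: key on a normalized form of the query, store the generated answer with a TTL, and check the cache before invoking the chain. Here's a minimal in-memory sketch of that pattern — in production this lives in Redis (`SETEX`/`GET`), and `answerQuery` is a hypothetical stand-in for the chain call:

```typescript
// Minimal TTL cache for RAG answers. Production swaps the Map for Redis,
// but the lookup pattern is identical.
type CacheEntry = { answer: string; expiresAt: number };

class QueryCache {
  private store = new Map<string, CacheEntry>();
  constructor(private ttlMs: number) {}

  // Normalize so trivially different phrasings share an entry.
  private key(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, ' ');
  }

  get(query: string): string | undefined {
    const entry = this.store.get(this.key(query));
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(this.key(query)); // evict expired entry
      return undefined;
    }
    return entry.answer;
  }

  set(query: string, answer: string): void {
    this.store.set(this.key(query), {
      answer,
      expiresAt: Date.now() + this.ttlMs,
    });
  }
}

// Wrap any async answer function with the cache.
async function cachedAnswer(
  cache: QueryCache,
  query: string,
  answerQuery: (q: string) => Promise<string>, // e.g. the chain call above
): Promise<string> {
  const hit = cache.get(query);
  if (hit !== undefined) return hit;
  const answer = await answerQuery(query);
  cache.set(query, answer);
  return answer;
}
```

Normalizing the key matters more than it looks: editors phrase the same question with different casing and spacing, and without normalization the hit rate on repeated queries drops sharply.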
Results
The RAG system reduced editorial research time by 60% and powers the intelligent search across India Today's digital platform. Editors can now ask natural language questions and get accurate answers grounded in our 50,000+ article archive.