2.7 Chatbots with LangChain
This chapter is about building and optimizing conversational chatbots with LangChain, a toolkit that connects language models to retrieval systems for dynamic question answering. We take a practical route: set up the environment and load documents, build a vector store, choose advanced retrieval strategies, and add conversation memory so the bot maintains context and answers follow-ups confidently. Conversational bots change how users interact with data: instead of treating each turn independently, they track and remember the dialogue, and LangChain's modular architecture lets you plug in document loaders (80+ formats), chunking, embeddings, semantic search, self-query, and contextual compression step by step. One practical detail matters early: setting up the environment and API keys first, with observability and careful key handling in mind, speeds up debugging and operations later. We then assemble the core, the Conversational Retrieval Chain, combining the language model, retriever, and memory, and show how buffer memory preserves the sequence of messages and passes it along with each new question to keep the dialogue natural and coherent.
Start by initializing the environment and API keys to safely use cloud LLMs and prepare the interface:
# Import environment and API helpers
import os
from dotenv import load_dotenv, find_dotenv
# Ensure Panel is available for interactive apps
import panel as pn
pn.extension()
# Load environment variables (including the OpenAI API key)
_ = load_dotenv(find_dotenv())
# OPENAI_API_KEY is read by integrations automatically; no direct assignment required
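If you prefer explicit key handling (for example, in CI), you can validate the variable yourself rather than relying on implicit pickup; a minimal sketch:
# Fail fast if the key is missing, rather than at the first API call
openai_api_key = os.environ.get("OPENAI_API_KEY")
if not openai_api_key:
    raise RuntimeError("Set OPENAI_API_KEY in your environment or .env file")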
Pick a language model version and pin it for the demo (recording the run date helps when model snapshots are later deprecated):
# Record the run date and pin a model version
import datetime
current_date = datetime.datetime.now().date()
print(f"Run date: {current_date}")
language_model_version = "gpt-3.5-turbo"
print(language_model_version)
Now connect embeddings and a vector store for baseline QA: load/index documents, retrieve relevant fragments, and prepare the model for answers. Then define a prompt template and assemble a RetrievalQA chain that will use your retriever and craft contextual answers:
# Embeddings and vector store
from langchain.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
# Replace 'your_directory_path' with the directory where you will persist embeddings
persist_directory = 'your_directory_path/'
embedding_function = OpenAIEmbeddings()
vector_database = Chroma(persist_directory=persist_directory, embedding_function=embedding_function)
# Query the vector store
search_question = "What are the key subjects covered in this course?"
top_documents = vector_database.similarity_search(search_question, k=3)
print(f"Relevant documents found: {len(top_documents)}")
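The snippet above assumes persist_directory already contains a persisted index. If you are starting from scratch, load and split the documents first, then build the store; a minimal sketch (the PDF path matches the one used later in this chapter):
# Load a source document and split it into overlapping chunks
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
loader = PyPDFLoader("docs/cs229_lectures/MachineLearning-Lecture01.pdf")
pages = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(pages)
# Embed the chunks and persist the index for later sessions
vector_database = Chroma.from_documents(chunks, embedding_function, persist_directory=persist_directory)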
# Initialize a chat model and try a simple greeting
from langchain_openai import ChatOpenAI
language_model = ChatOpenAI(model=language_model_version, temperature=0)  # use the pinned version
greeting_response = language_model.invoke("Greetings, universe!")
print(greeting_response.content)  # .content holds the text of the returned AIMessage
# Prompt for concise, helpful answers
from langchain.prompts import PromptTemplate
prompt_template = """
Use the following pieces of context to answer the question at the end. If you're unsure about the answer, indicate so rather than speculating.
Try to keep your response within three sentences for clarity and conciseness.
End your answer with "thanks for asking!" to maintain a polite tone.
Context: {context}
Question: {question}
Helpful Answer:
"""
qa_prompt_template = PromptTemplate(input_variables=["context", "question"], template=prompt_template)
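To sanity-check the template before wiring it into a chain, render it with toy values:
# Render the prompt with placeholder values to verify the layout
print(qa_prompt_template.format(
    context="The course covers probability and linear algebra.",
    question="Is probability covered?"
))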
# RetrievalQA chain
default_question = "Does this course require understanding of probability?"
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
language_model,
retriever=vector_database.as_retriever(),
return_source_documents=True,
chain_type_kwargs={"prompt": qa_prompt_template}
)
qa_result = qa_chain({"query": default_question})
print("Result:", qa_result["result"])
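Because return_source_documents=True, the result also carries the retrieved chunks, which helps you spot-check that answers are grounded:
# Show where the answer came from
for source_document in qa_result["source_documents"]:
    print(source_document.metadata, source_document.page_content[:100])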
Implementing a Conversational Retrieval Chain with Memory for QA
This section targets ML engineers, data scientists, and developers building QA systems that understand and retain dialogue context. The focus is on integrating the Conversational Retrieval Chain with a memory component from LangChain.
Configure memory for dialogue history
Use ConversationBufferMemory so the system remembers context. It stores the message history and lets the chain refer to prior turns when answering follow-ups. Note that ConversationalRetrievalChain expects the history under the key chat_history, so set memory_key accordingly.
# Conversation memory
from langchain.memory import ConversationBufferMemory
conversation_history_memory = ConversationBufferMemory(
    memory_key="chat_history",  # ConversationalRetrievalChain looks up history under this key
    return_messages=True
)
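You can peek at the buffer at any point; it starts empty and fills as the chain runs:
# Inspect the stored history (empty until a turn has been processed)
print(conversation_history_memory.load_memory_variables({}))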
Assemble the Conversational Retrieval Chain
Combine the language model, document retriever, and dialogue memory to answer questions in conversational context.
from langchain.chains import ConversationalRetrievalChain
document_retriever = vector_database.as_retriever()
question_answering_chain = ConversationalRetrievalChain.from_llm(
llm=language_model,
retriever=document_retriever,
memory=conversation_history_memory
)
Handle questions and generate answers
After setup, the chain can process questions and generate answers using the saved conversation history for context.
initial_question = "Is probability a fundamental topic in this course?"
initial_result = question_answering_chain({"question": initial_question})
print("Answer:", initial_result['answer'])
follow_up_question = "Why are those topics considered prerequisites?"
follow_up_result = question_answering_chain({"question": follow_up_question})
print("Answer:", follow_up_result['answer'])
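After these two turns the buffer holds both exchanges, so you can confirm the follow-up was answered with the earlier context in scope:
# The memory now contains both question–answer pairs as messages
print(conversation_history_memory.load_memory_variables({})["chat_history"])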
Building a Document‑Grounded QA Chatbot
This part provides an end‑to‑end guide to a chatbot that answers questions based on document content. It covers loading documents, splitting text, embeddings, and assembling a conversational retrieval chain.
Initial setup and imports
Import LangChain components for embeddings, text splitting, in‑memory search, document loading, conversational chains, and the chat model.
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.document_loaders import TextLoader, PyPDFLoader
from langchain.chains import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI
Load and process documents
Load documents, split them into manageable chunks, generate embeddings, and prepare a vector store; then return a ready‑to‑use conversational retrieval chain.
def load_documents_and_prepare_database(file_path, chain_type, top_k_results):
"""
Load documents from a file, split into manageable chunks, generate embeddings,
and prepare a vector database for retrieval.
Args:
- file_path: Path to the document file (PDF, text, etc.).
- chain_type: Conversational chain type to use.
- top_k_results: Number of top results to retrieve.
Returns:
- A conversational retrieval chain ready to answer questions.
"""
    # Load documents (PyPDFLoader assumed here; swap in TextLoader or another loader for other file types)
    document_loader = PyPDFLoader(file_path)
    documents = document_loader.load()
# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
document_chunks = text_splitter.split_documents(documents)
# Embed chunks and build the vector store
embeddings_generator = OpenAIEmbeddings()
vector_database = DocArrayInMemorySearch.from_documents(document_chunks, embeddings_generator)
# Build the retriever
document_retriever = vector_database.as_retriever(search_type="similarity", search_kwargs={"k": top_k_results})
# Create the conversational retrieval chain
chatbot_chain = ConversationalRetrievalChain.from_llm(
llm=ChatOpenAI(model='gpt-4o-mini', temperature=0),
chain_type=chain_type,
retriever=document_retriever,
return_source_documents=True,
return_generated_question=True,
)
return chatbot_chain
For convenience, add a thin wrapper used by the UI code:
def load_db(document_path, retrieval_type, top_k_results):
return load_documents_and_prepare_database(document_path, retrieval_type, top_k_results)
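A quick smoke test of the wrapper, assuming the lecture PDF path used later in the UI code and the standard "stuff" chain type:
# Build a chain and ask one question with an empty history
chain = load_db("docs/cs229_lectures/MachineLearning-Lecture01.pdf", "stuff", 4)
result = chain({"question": "What topics does the first lecture cover?", "chat_history": []})
print(result["answer"])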
Proceed to create a chatbot and a Panel-based UI: import Panel (pn) and Param (param), then define a class that encapsulates document loading, query processing, and history.
import panel as pn
import param
Define a chatbot class that stores history, generates answers, and allows swapping the underlying document.
class DocumentBasedChatbot(param.Parameterized):
conversation_history = param.List([]) # (question, answer) pairs
current_answer = param.String("") # Latest answer
database_query = param.String("") # Query sent to the document DB
database_response = param.List([]) # Retrieved source documents
def __init__(self, **params):
super(DocumentBasedChatbot, self).__init__(**params)
self.interface_elements = [] # UI elements for the conversation
self.loaded_document = "docs/cs229_lectures/MachineLearning-Lecture01.pdf" # Default document
        self.chatbot_model = load_db(self.loaded_document, "stuff", 4)  # Initialize the bot model ("stuff" concatenates retrieved chunks into one prompt)
Add document loading: load_document checks for an uploaded file (otherwise it keeps the default document), reloads the knowledge base, and clears history when the source changes.
def load_document(self, upload_count):
if upload_count == 0 or not file_input.value:
return pn.pane.Markdown(f"Loaded document: {self.loaded_document}")
else:
file_input.save("temp.pdf")
self.loaded_document = file_input.filename
            self.chatbot_model = load_db("temp.pdf", "stuff", 4)
self.clear_conversation_history()
return pn.pane.Markdown(f"Loaded document: {self.loaded_document}")
Handle user turns: process_query sends the turn to the model, updates history and the UI, and records source snippets for display.
def process_query(self, user_query):
if not user_query:
return pn.WidgetBox(pn.Row('User:', pn.pane.Markdown("", width=600)), scroll=True)
result = self.chatbot_model({"question": user_query, "chat_history": self.conversation_history})
self.conversation_history.extend([(user_query, result["answer"])])
self.database_query = result["generated_question"]
self.database_response = result["source_documents"]
self.current_answer = result['answer']
self.interface_elements.extend([
pn.Row('User:', pn.pane.Markdown(user_query, width=600)),
pn.Row('Assistant:', pn.pane.Markdown(self.current_answer, width=600, style={'background-color': '#F6F6F6'}))
])
input_field.value = '' # Clear input
return pn.WidgetBox(*self.interface_elements, scroll=True)
For transparency, display the last DB query and the retrieved source documents.
def display_last_database_query(self):
if not self.database_query:
return pn.Column(
pn.Row(pn.pane.Markdown("Last database query:", style={'background-color': '#F6F6F6'})),
pn.Row(pn.pane.Str("No database queries yet"))
)
return pn.Column(
pn.Row(pn.pane.Markdown("Database query:", style={'background-color': '#F6F6F6'})),
pn.pane.Str(self.database_query)
)
def display_database_responses(self):
if not self.database_response:
return
response_list = [pn.Row(pn.pane.Markdown("Vector DB search result:", style={'background-color': '#F6F6F6'}))]
for doc in self.database_response:
response_list.append(pn.Row(pn.pane.Str(doc)))
return pn.WidgetBox(*response_list, width=600, scroll=True)
Optionally, display the current chat history for quick inspection.
def display_chat_history(self):
if not self.conversation_history:
return pn.WidgetBox(pn.Row('Chat:', pn.pane.Str('No messages yet.')), scroll=True)
items = []
for q, a in self.conversation_history:
items.append(pn.Row('User:', pn.pane.Markdown(q, width=600)))
items.append(pn.Row('Assistant:', pn.pane.Markdown(a, width=600, style={'background-color': "#FAFAFA"})))
return pn.WidgetBox(*items, width=650, scroll=True)
Don’t forget reset: clear_conversation_history clears the current dialogue context.
def clear_conversation_history(self, count=0):
self.conversation_history = []
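The methods above reference module-level Panel widgets (file_input and input_field) that are created outside the class; a minimal sketch of the assumed definitions, followed by a direct, UI-free test of the bot:
# Widgets referenced by load_document and process_query above
file_input = pn.widgets.FileInput(accept='.pdf')
input_field = pn.widgets.TextInput(placeholder='Ask a question about the document…')
# Instantiate the bot against the default document and ask a question
document_bot = DocumentBasedChatbot()
document_bot.process_query("What is this lecture about?")
print(document_bot.current_answer)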
The result is a cohesive workflow: you set up the environment and keys, load documents and assemble a vector store, add advanced retrieval (self-query, compression, semantic search), integrate dialogue memory, and build a Conversational Retrieval Chain in which the model, retriever, and memory work together. The examples and code show how these steps form a working bot, and thanks to LangChain's modularity the result is easy to extend and debug.
Theory Questions
- What components are needed to set up a LangChain chatbot development environment?
- How does keeping dialogue history improve a chatbot’s functionality?
- How are document chunks transformed into embeddings, and why?
- Why are self‑query, compression, and semantic search useful?
- How does the Conversational Retrieval Chain combine model, retriever, and memory?
- How does ConversationBufferMemory help maintain dialogue context?
- What are the steps to configure a vector store for semantic search in LangChain?
- Why manage environment variables and API keys?
- How does the modularity of LangChain’s retrieval methods increase development flexibility?
- Why is choosing an appropriate LLM version important?
Practical Tasks
- Create and populate a vector store (create_vector_store) from a list of strings, using a stub embedding function embed_document (see the sketch after this list).
- Implement semantic search (perform_semantic_search): embed a query, find the nearest document, and return its index.
- Add dialogue history to a Chatbot class with a respond_to_query method (generate answers via a stub generate_response).
- Assemble a simplified Conversational Retrieval Chain from stubs (LanguageModel / DocumentRetriever / ConversationMemory).
- In Chatbot, add methods to append to and reset the history; incorporate the history when generating.
- Document QA: load a string, split it, create embeddings, build a vector store, run semantic search, and generate an answer (stubs allowed).
- Integrate memory into the retrieval chain (use the extensions from tasks 5–6).
- Build a small CLI for chatting with the Chatbot: send queries, print answers, and view or reset history.
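A minimal sketch for the first two tasks, using a stub embedding so no API key is needed (all names follow the task statements; the embedding is illustrative only):
import math

def embed_document(text):
    # Stub embedding: normalized letter-frequency vector
    vector = [0.0] * 26
    for character in text.lower():
        if 'a' <= character <= 'z':
            vector[ord(character) - ord('a')] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]

def create_vector_store(documents):
    # Pair each document with its embedding
    return [(document, embed_document(document)) for document in documents]

def perform_semantic_search(query, store):
    # Return the index of the document whose embedding is closest to the query's
    query_vector = embed_document(query)
    scores = [sum(a * b for a, b in zip(query_vector, embedding)) for _, embedding in store]
    return max(range(len(scores)), key=scores.__getitem__)

store = create_vector_store(["Probability basics", "Linear algebra review"])
print(perform_semantic_search("probability theory", store))  # -> 0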
Alternate Panel UI Variant and Dashboard
Below is an alternative chatbot class and a ready-to-use Panel dashboard assembled from the same building blocks.
import panel as pn
import param
class ChatWithYourDataBot(param.Parameterized):
conversation_history = param.List([])
latest_answer = param.String("")
document_query = param.String("")
document_response = param.List([])
def __init__(self, **params):
super(ChatWithYourDataBot, self).__init__(**params)
self.interface_elements = []
self.default_document_path = "docs/cs229_lectures/MachineLearning-Lecture01.pdf"
        self.chatbot_model = load_db(self.default_document_path, "stuff", 4)
Add minimal method implementations to support the bound UI actions and panels:
def load_document(self, clicks):
if not getattr(document_upload, 'value', None):
return pn.pane.Markdown(f"Loaded document: {self.default_document_path}")
document_upload.save("temp.pdf")
self.default_document_path = document_upload.filename or self.default_document_path
        self.chatbot_model = load_db("temp.pdf", "stuff", 4)
self.clear_history()
return pn.pane.Markdown(f"Loaded document: {self.default_document_path}")
def process_query(self, user_query):
if not user_query:
return pn.WidgetBox(pn.Row('User:', pn.pane.Markdown("", width=600)), scroll=True)
result = self.chatbot_model({"question": user_query, "chat_history": self.conversation_history})
self.conversation_history.extend([(user_query, result.get("answer", ""))])
self.document_query = result.get("generated_question", "")
self.document_response = result.get("source_documents", [])
self.latest_answer = result.get('answer', "")
self.interface_elements.extend([
pn.Row('User:', pn.pane.Markdown(user_query, width=600)),
pn.Row('Assistant:', pn.pane.Markdown(self.latest_answer, width=600, style={'background-color': "#F6F6F6"}))
])
user_query_input.value = ""
return pn.WidgetBox(*self.interface_elements, scroll=True)
def display_last_database_query(self):
if not self.document_query:
return pn.Column(
pn.Row(pn.pane.Markdown("Last database query:", style={'background-color': "#F6F6F6"})),
pn.Row(pn.pane.Str("No database queries yet"))
)
return pn.Column(
pn.Row(pn.pane.Markdown("Database query:", style={'background-color': "#F6F6F6"})),
pn.pane.Str(self.document_query)
)
def display_database_responses(self):
if not self.document_response:
return
items = [pn.Row(pn.pane.Markdown("Vector DB search result:", style={'background-color': "#F6F6F6"}))]
for doc in self.document_response:
items.append(pn.Row(pn.pane.Str(doc)))
return pn.WidgetBox(*items, width=600, scroll=True)
def display_chat_history(self):
if not self.conversation_history:
return pn.WidgetBox(pn.Row('Chat:', pn.pane.Str('No messages yet.')), scroll=True)
items = []
for q, a in self.conversation_history:
items.append(pn.Row('User:', pn.pane.Markdown(q, width=600)))
items.append(pn.Row('Assistant:', pn.pane.Markdown(a, width=600, style={'background-color': "#FAFAFA"})))
return pn.WidgetBox(*items, width=650, scroll=True)
def clear_history(self, *_):
self.conversation_history = []
Create a bot instance and the widgets, then bind the UI actions to the instance methods (binding unbound class methods would pass widget values as self):
chat_bot = ChatWithYourDataBot()
document_upload = pn.widgets.FileInput(accept='.pdf')
load_database_button = pn.widgets.Button(name="Load document", button_type='primary')
clear_history_button = pn.widgets.Button(name="Clear history", button_type='warning')
clear_history_button.on_click(chat_bot.clear_history)
user_query_input = pn.widgets.TextInput(placeholder='Type your question here…')
load_document_action = pn.bind(chat_bot.load_document, load_database_button.param.clicks)
process_query = pn.bind(chat_bot.process_query, user_query_input)
Assemble tabs and the dashboard:
conversation_visual = pn.pane.Image('./img/conversation_flow.jpg')
conversation_tab = pn.Column(
pn.Row(user_query_input),
pn.layout.Divider(),
pn.panel(process_query, loading_indicator=True, height=300),
pn.layout.Divider(),
)
database_query_tab = pn.Column(
    pn.panel(chat_bot.display_last_database_query),
    pn.layout.Divider(),
    pn.panel(chat_bot.display_database_responses),
)
chat_history_tab = pn.Column(
    pn.panel(chat_bot.display_chat_history),
    pn.layout.Divider(),
)
configuration_tab = pn.Column(
pn.Row(document_upload, load_database_button, load_document_action),
pn.Row(clear_history_button, pn.pane.Markdown("Clears the conversation for a new topic.")),
pn.layout.Divider(),
pn.Row(conversation_visual.clone(width=400)),
)
chatbot_dashboard = pn.Column(
    pn.Row(pn.pane.Markdown('# ChatWithYourData-Bot')),
pn.Tabs(('Conversation', conversation_tab), ('DB queries', database_query_tab), ('Chat history', chat_history_tab), ('Setup', configuration_tab))
)
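To run the app, mark the dashboard as servable and launch it with the panel CLI (the script name here is hypothetical):
# Expose the dashboard; then run: panel serve chatbot_app.py --show
chatbot_dashboard.servable()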
With the tabs assembled into chatbot_dashboard, this variant offers the same capabilities as the first: live conversation, inspection of generated database queries and retrieved sources, chat history review, and document setup.