A voice-driven AI that mines the entire web in real time to answer any question — built as a Master's thesis at the University of Essex.
The year was 2018. ChatGPT didn't exist. Neither did GPT-3. The goal was audacious: build an AI that could answer any question — not from a pre-built database, but by mining the entire internet in real time.
Voice in, knowledge out. A Seq2Seq LSTM chatbot handled general conversation while a parallel information retrieval engine scraped and queried Wikipedia, Wolfram Alpha, Quora, Google, Bing, Yahoo, and WikiHow simultaneously — extracting, summarizing, and speaking the answers back.
"The pre-eminent objective is to fabricate a framework which can mimic artificial intelligence and give results to any kind of question asked by the user through voice." — from the thesis abstract
Every question flowed through a purpose-built pipeline — from voice input to spoken answer.
Browser-based voice interface. Speech recognition converts spoken questions to text. Speech synthesis reads answers aloud. Full hands-free loop.
Encoder-decoder architecture trained on the Cornell Movie Dialogues dataset. 10,000 conversational samples. Handles general conversation and small talk.
When the chatbot can't answer, the retrieval engine fires in parallel — scraping and querying multiple knowledge sources simultaneously.
Raw HTML scraped from the web is parsed, core information extracted, and summarized into concise, meaningful answers for the user.
The brain that routes queries, manages the chatbot, dispatches retrieval tasks, and serves responses back to the frontend.
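The routing logic described above can be sketched as a small dispatcher: try the conversational model first, and on a miss fan out to every knowledge source in parallel. The function names, stub sources, and small-talk table below are illustrative, not from the thesis:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub retrievers standing in for the real scrapers/APIs; each returns
# (source_name, answer_or_None). The real system queried Wikipedia,
# Wolfram Alpha, Quora, Google, Bing, Yahoo, and WikiHow.
def ask_wikipedia(q):      return ("wikipedia", f"Summary for '{q}'")
def ask_wolfram(q):        return ("wolfram", None)  # no computational answer
def ask_search_engines(q): return ("search", f"Top snippet for '{q}'")

SOURCES = [ask_wikipedia, ask_wolfram, ask_search_engines]

def chatbot_answer(q):
    """Stand-in for the Seq2Seq model: handles small talk only."""
    small_talk = {"hello": "Hi there!", "how are you": "Doing great!"}
    return small_talk.get(q.lower().rstrip("?"))

def route(question):
    # 1. Try the conversational model first.
    reply = chatbot_answer(question)
    if reply is not None:
        return {"from": "chatbot", "answer": reply}
    # 2. Otherwise fan out to every knowledge source in parallel
    #    and keep only the sources that produced an answer.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        results = list(pool.map(lambda fetch: fetch(question), SOURCES))
    answers = {name: ans for name, ans in results if ans is not None}
    return {"from": "retrieval", "answer": answers}
```

With these stubs, `route("hello")` returns a chatbot reply, while `route("Who is Alan Turing?")` falls through to the parallel retrieval path and returns one answer per responsive source.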
The conversational engine was built on an encoder-decoder LSTM architecture. The encoder reads the input question word by word, compressing it into a fixed-dimensional vector. The decoder then generates a response, one token at a time, from that compressed representation.
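To make the "compressing it into a fixed-dimensional vector" step concrete, here is a one-unit LSTM forward pass in pure Python — a toy with scalar weights, where the thesis model used full weight matrices and many hidden units. The gating equations are the standard LSTM update; the weight values are arbitrary:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One LSTM time step with scalar states and toy scalar weights
    for the input (i), forget (f), candidate (g), and output (o) gates."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])
    c = f * c + i * g          # new cell state
    h = o * math.tanh(c)       # new hidden state
    return h, c

def encode(token_values, w):
    """Read the input one token at a time; return the final
    fixed-size (here: one-number) hidden state."""
    h, c = 0.0, 0.0
    for x in token_values:
        h, c = lstm_step(x, h, c, w)
    return h

# Toy weights; a trained model learns these from data.
w = {k: 0.5 for k in ("wi", "ui", "bi", "wf", "uf", "bf",
                      "wg", "ug", "bg", "wo", "uo", "bo")}
vector = encode([0.1, 0.9, 0.4], w)  # e.g. embedded tokens of a question
```

The decoder runs the same cell in reverse roles: seeded with the encoder's final state, it emits one token per step until an end-of-sequence marker.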
Training began with a 100-sample experiment to validate the approach — the loss fell from 2.30 to 0.037 (reported in the thesis as 230% and 3.7%) over 100 epochs. The model was then scaled to 10,000 samples from the Cornell Movie Dialogues dataset and trained on a Paperspace cloud GPU.
"The number of samples is directly proportional to the number of epochs the model needs to train. The model can clearly answer any question from its training bucket — the goal is a proper dataset with good samples and a high-end GPU to train on."
When the chatbot couldn't answer a factual question, the system didn't give up. It fired off requests to multiple knowledge sources in parallel, extracted the relevant information, summarized it, and spoke the answer back — all in real time.
Encyclopedic knowledge via the Wikipedia API. Full article summaries, key facts, and structured data extracted on demand.
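The thesis doesn't publish its retrieval code; one way to fetch an article summary on demand, using only the standard library and Wikipedia's public REST page-summary endpoint, looks like this:

```python
import json
import urllib.parse
import urllib.request

def summary_request(title):
    """Build a GET request for Wikipedia's REST page-summary endpoint."""
    slug = urllib.parse.quote(title.replace(" ", "_"))
    url = f"https://en.wikipedia.org/api/rest_v1/page/summary/{slug}"
    # Wikipedia asks API clients to identify themselves via User-Agent.
    return urllib.request.Request(url, headers={"User-Agent": "pas-demo/0.1"})

def fetch_summary(title):
    """Fetch the plain-text extract for a title (performs a network call)."""
    with urllib.request.urlopen(summary_request(title), timeout=10) as resp:
        return json.load(resp).get("extract", "")
```

For example, `fetch_summary("Alan Turing")` returns the lead-section extract of that article as plain text.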
Computational and mathematical queries. Unit conversions, scientific calculations, and data-driven answers from the computational knowledge engine.
Opinion-based and subjective answers. Scraped using Beautiful Soup to extract top-voted community responses to questions.
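The thesis used Beautiful Soup for this; the same extraction idea with the standard library's `html.parser` is sketched below. The `class="answer"` markup is illustrative — real pages need the actual selectors observed in their HTML:

```python
from html.parser import HTMLParser

class AnswerExtractor(HTMLParser):
    """Collect the text content of every element carrying class="answer"."""
    def __init__(self):
        super().__init__()
        self.depth = 0          # >0 while inside an answer element
        self.answers = []
    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1     # nested tag inside an answer
        elif ("class", "answer") in attrs:
            self.depth = 1
            self.answers.append("")
    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
    def handle_data(self, data):
        if self.depth:
            self.answers[-1] += data

# Miniature stand-in for a scraped Q&A page.
page = """
<div class="question">What is an LSTM?</div>
<div class="answer">A recurrent network with gated memory.</div>
<div class="answer">A fix for vanishing gradients in RNNs.</div>
"""
parser = AnswerExtractor()
parser.feed(page)
top_answers = [a.strip() for a in parser.answers]
```

Beautiful Soup condenses this to a one-liner (`soup.find_all(class_="answer")`), which is why the thesis reached for it.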
General web search across three major engines. SERP scraping extracts snippets, URLs, and page content for any open-domain query.
Step-by-step how-to guides. HTML structure parsed to extract procedural knowledge for "how do I..." style questions.
All raw scraped content is filtered, cleaned, and condensed into concise answers — only meaningful insight reaches the user.
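The summarization step isn't spelled out in the sources above; a common baseline for this kind of condensing is frequency-based extractive scoring — rank sentences by how often their content words appear in the whole text and keep the top few. A standard-library sketch (stopword list abbreviated):

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "to", "in",
             "and", "it", "that", "for", "on", "with", "as", "by"}

def summarize(text, max_sentences=2):
    """Score sentences by the corpus frequency of their non-stopword
    terms and return the top scorers in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if w not in STOPWORDS)

    def score(s):
        terms = re.findall(r"[a-z']+", s.lower())
        return sum(freq[t] for t in terms if t not in STOPWORDS)

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return " ".join(s for s in sentences if s in top)
```

Sentences packed with the text's most repeated terms survive; filler sentences are dropped, so only the densest material reaches the speech synthesizer.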
In 2018, Siri, Google Assistant, and Alexa relied on pre-built knowledge bases and predefined workflows. When asked an unusual or complex question, they'd either redirect to a web search or fail silently. This system went to the source in real time — scraping, extracting, summarizing, and actually answering the question.
| Capability | Apple Siri | Google Assistant | Amazon Alexa | This System |
|---|---|---|---|---|
| General Conversation | Limited | Limited | Limited | Seq2Seq LSTM |
| Open-Domain Q&A | Web redirect | Partial | Web redirect | Real-time retrieval |
| Computational Queries | Basic | Yes | Basic | Wolfram Alpha |
| Multi-Source Answers | No | No | No | 5+ sources in parallel |
| Answer Summarization | No | No | No | NLP summarization |
| No Pre-Built Knowledge | Relies on it | Relies on it | Relies on it | 100% real-time |
| Voice-First Interface | Yes | Yes | Yes | Yes |
The thesis didn't just build a web app. It envisioned the endgame: a fully portable Personal Assistant System worn on the body. Cloud computing handles the AI. The user carries only a lightweight wearable. This was written years before Meta's smart glasses, Humane's AI Pin, or Apple Vision Pro existed.
A smartband with a pico projector that beams a display onto your forearm. Visual output without a screen. Predicted wearable AR before it went mainstream.
Voice input and audio output through a lightweight earpiece. Hands-free, eyes-free interaction. The only hardware the user actually carries.
All computation happens in the cloud. The chatbot, retrieval engine, summarization — everything runs on remote servers. The wearable is just an interface.
Low-energy Bluetooth + Wi-Fi keeps the system linked to the cloud at all times. Endless knowledge, zero local storage required.
"The only portable system carried with the user will be the small lightweight earpiece providing complete hands-free communication to the system — connecting the user to endless knowledge and information." — from the thesis conclusion
38 pages. Encoder-decoder models, web scraping pipelines, training logs, system comparisons, and the wearable PAS vision. The complete Master's thesis, exactly as submitted to the University of Essex in August 2018.