AI Journey
MSc Thesis · 2017 — 2018

Information Retrieval &
Conversational AI

A voice-driven AI that mines the entire web in real time to answer any question — built as a Master's thesis at the University of Essex.

University of Essex · MSc Artificial Intelligence · Supervisor: John C. Woods

Before LLMs went mainstream, I built a conversational AI from scratch.

The year was 2018. ChatGPT didn't exist. Neither did GPT-3. The goal was audacious: build an AI that could answer any question — not from a pre-built database, but by mining the entire internet in real time.

Voice in, knowledge out. A Seq2Seq LSTM chatbot handled general conversation while a parallel information retrieval engine scraped Wikipedia, Wolfram Alpha, Quora, Google, Bing, Yahoo, and WikiHow simultaneously — extracting, summarizing, and speaking the answers back.

"The pre-eminent objective is to fabricate a framework which can mimic artificial intelligence and give results to any kind of question asked by the user through voice." — from the thesis abstract


Five layers. Zero pre-built answers.

Every question flowed through a purpose-built pipeline — from voice input to spoken answer.

Interface

AngularJS + Web Speech API

Browser-based voice interface. Speech recognition converts spoken questions to text. Speech synthesis reads answers aloud. Full hands-free loop.

AngularJS · Web Speech API · JavaScript
Chatbot

Seq2Seq LSTM Neural Network

Encoder-decoder architecture trained on the Cornell Movie-Dialogs Corpus. 10,000 conversational samples. Handles general conversation and small talk.

Keras · TensorFlow · LSTM · Paperspace GPU
Retrieval

Multi-Source Web Mining Pipeline

When the chatbot can't answer, the retrieval engine fires in parallel — scraping and querying multiple knowledge sources simultaneously.

Wikipedia API · Wolfram Alpha · Quora · Google / Bing / Yahoo · WikiHow
Processing

Text Extraction & Summarization

Raw HTML scraped from the web is parsed, core information extracted, and summarized into concise, meaningful answers for the user.

Beautiful Soup · Text Summarization · NLP
Backend

Django + Python Orchestration

The brain that routes queries, manages the chatbot, dispatches retrieval tasks, and serves responses back to the frontend.

Django · Python · REST API
Voice Input → Speech-to-Text → Django Router → Chatbot / Retrieval → Summarize → Text-to-Speech
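The routing step can be sketched in a few lines of Python. Everything here is a hypothetical stand-in — `chatbot`, `retrieve`, `summarize`, and the confidence threshold are illustrative, not the thesis's actual interfaces:

```python
def answer(question, chatbot, retrieve, summarize, confidence_floor=0.5):
    """Route a query: try the Seq2Seq chatbot first, fall back to live retrieval."""
    reply, confidence = chatbot(question)        # hypothetical (text, score) interface
    if confidence >= confidence_floor:
        return reply                             # small talk: answered locally
    return summarize(retrieve(question))         # factual question: mine the web

# Toy stand-ins to show the control flow
bot = lambda q: ("Hi there!", 0.9) if q == "hello" else ("", 0.0)
web = lambda q: ["scraped passage about " + q]
digest = lambda passages: passages[0]

casual = answer("hello", bot, web, digest)                 # chatbot path
factual = answer("who was Alan Turing", bot, web, digest)  # retrieval path
```

The design choice mirrors the layer diagram above: the router never stores answers itself; it only decides which downstream component speaks.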

Training a neural network to talk.

The conversational engine was built on an encoder-decoder LSTM architecture. The encoder reads the input question word by word, compressing it into a fixed-dimensional vector. The decoder then generates a response, one token at a time, from that compressed representation.
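As a sketch of that training-time architecture in Keras — the vocabulary size, embedding width, and hidden dimension below are illustrative assumptions, not the thesis's actual hyperparameters:

```python
from tensorflow import keras
from tensorflow.keras import layers

VOCAB, EMB, HIDDEN = 8000, 128, 256   # illustrative sizes, not the thesis's values

# Encoder: read the question word by word, keep only the final LSTM state
enc_in = keras.Input(shape=(None,), name="question_tokens")
enc_x = layers.Embedding(VOCAB, EMB, mask_zero=True)(enc_in)
_, state_h, state_c = layers.LSTM(HIDDEN, return_state=True)(enc_x)

# Decoder: generate the reply token by token, seeded with the encoder state
dec_in = keras.Input(shape=(None,), name="reply_tokens")
dec_x = layers.Embedding(VOCAB, EMB, mask_zero=True)(dec_in)
dec_seq, _, _ = layers.LSTM(HIDDEN, return_sequences=True,
                            return_state=True)(dec_x, initial_state=[state_h, state_c])
probs = layers.Dense(VOCAB, activation="softmax")(dec_seq)

model = keras.Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

At inference time the decoder runs one step at a time, feeding each predicted token and the returned LSTM state back in until an end-of-sequence token appears.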

Training began with a 100-sample experiment to validate the approach — loss dropped from 230% to just 3.7% within 100 epochs. The model was then scaled to 10,000 samples from the Cornell Movie-Dialogs Corpus and trained on a Paperspace cloud GPU.

10k
Training Samples
3.7%
Final Loss (100-sample run)
LSTM
Architecture
GPU
Cloud Trained

"The number of samples is directly proportional to the number of epochs the model needs to train. The model can clearly answer any question from its training bucket — the goal is a proper dataset with good samples and a high-end GPU to train on."


The killer feature: mining the web live.

When the chatbot couldn't answer a factual question, the system didn't give up. It fired off requests to multiple knowledge sources in parallel, extracted the relevant information, summarized it, and spoke the answer back — all in real time.
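A minimal sketch of that fan-out using Python's standard `concurrent.futures` — the fetchers here are hypothetical stand-ins for the real Wikipedia, Wolfram Alpha, and SERP scrapers:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def mine_sources(question, fetchers, timeout=8.0):
    """Fire every knowledge-source fetcher at once; keep whatever returns in time."""
    answers = {}
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        pending = {pool.submit(fetch, question): name
                   for name, fetch in fetchers.items()}
        for future in as_completed(pending, timeout=timeout):
            name = pending[future]
            try:
                answers[name] = future.result()
            except Exception:
                pass  # one dead source must not sink the whole query
    return answers

# Hypothetical stand-ins for the real scrapers
fetchers = {
    "wikipedia": lambda q: "encyclopedic passage about " + q,
    "wolfram":   lambda q: "computed result for " + q,
}
results = mine_sources("speed of light", fetchers)
```

Swallowing per-source failures is the point of the design: a rate-limited or unreachable site degrades the answer set instead of blocking the whole response.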

📚

Wikipedia

Encyclopedic knowledge via the Wikipedia API. Full article summaries, key facts, and structured data extracted on demand.

🧮

Wolfram Alpha

Computational and mathematical queries. Unit conversions, scientific calculations, and data-driven answers from the computational knowledge engine.

💬

Quora

Opinion-based and subjective answers. Scraped using Beautiful Soup to extract top-voted community responses to questions.

🔍

Google / Bing / Yahoo

General web search across three major engines. SERP scraping extracts snippets, URLs, and page content for any open-domain query.

🛠

WikiHow

Step-by-step how-to guides. HTML structure parsed to extract procedural knowledge for "how do I..." style questions.


Text Summarization

All raw scraped content is filtered, cleaned, and condensed into concise answers — only meaningful insight reaches the user.
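As a toy illustration of that summarization step — a frequency-based extractive ranker, not the thesis's exact algorithm:

```python
import re
from collections import Counter

def summarize(text, max_sentences=2):
    """Toy extractive summarizer: rank sentences by the frequency of their words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3)
    ranked = sorted(sentences,
                    key=lambda s: -sum(freq[w]
                                       for w in re.findall(r"[a-z']+", s.lower())))
    keep = set(ranked[:max_sentences])
    return " ".join(s for s in sentences if s in keep)  # preserve original order

scraped = ("LSTM networks process sequences. "
           "LSTM networks remember long contexts. Pizza is tasty.")
summary = summarize(scraped)
```

The frequency heuristic keeps the two on-topic sentences and drops the off-topic one, which is the behaviour the pipeline needs: only meaningful content should reach the text-to-speech stage.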


Outperforming Siri, Alexa & Google on open-domain Q&A.

In 2018, Siri, Google Assistant, and Alexa relied on pre-built knowledge bases and predefined workflows. When asked an unusual or complex question, they'd either redirect to a web search or fail silently. This system went to the source in real time — scraping, extracting, summarizing, and actually answering the question.

| Capability | Apple Siri | Google Assistant | Amazon Alexa | This System |
| --- | --- | --- | --- | --- |
| General Conversation | Limited | Limited | Limited | Seq2Seq LSTM |
| Open-Domain Q&A | Web redirect | Partial | Web redirect | Real-time retrieval |
| Computational Queries | Basic | Yes | Basic | Wolfram Alpha |
| Multi-Source Answers | No | No | No | 5+ sources in parallel |
| Answer Summarization | No | No | No | NLP summarization |
| No Pre-Built Knowledge | Relies on it | Relies on it | Relies on it | 100% real-time |
| Voice-First Interface | Yes | Yes | Yes | Yes |

A wearable PAS — predicted in 2018.

The thesis didn't just build a web app. It envisioned the endgame: a fully portable Personal Assistant System worn on the body. Cloud computing handles the AI. The user carries only a lightweight wearable. This was written years before Meta's smart glasses, Humane's AI Pin, or Apple Vision Pro existed.

Wrist-Projector Display

A smartband with a pico projector that beams a display onto your forearm. Visual output without a screen. Predicted wearable AR before it went mainstream.

🎧

Bluetooth Earpiece

Voice input and audio output through a lightweight earpiece. Hands-free, eyes-free interaction. The only hardware the user actually carries.

Cloud Brain

All computation happens in the cloud. The chatbot, retrieval engine, summarization — everything runs on remote servers. The wearable is just an interface.

🌐

Always Connected

Low-energy Bluetooth + Wi-Fi keeps the system linked to the cloud at all times. Endless knowledge, zero local storage required.

"The only portable system carried with the user will be the small lightweight earpiece providing complete hands-free communication to the system — connecting the user to endless knowledge and information." — from the thesis conclusion


Read the original paper.

38 pages. Encoder-decoder models, web scraping pipelines, training logs, system comparisons, and the wearable PAS vision. The complete Master's thesis, exactly as submitted to the University of Essex in August 2018.

MSP Final Report — sg17402 · University of Essex · Download PDF