Topic Discovery
Surfacing topics within large text datasets
Synthesis · 2023

Previews of the Topic Discovery platform.
Synthesis is a consultancy that creatively layers open data to expose commercial opportunities.
In 12 weeks, I designed and shipped a web app that enabled the consultancy team to uncover topics in large text datasets, turning thousands of social media posts, blogs, and research papers into distinct, sized themes.
Role
I was the sole Product Designer on the project, fully responsible for research and end-to-end design. I worked in a team with a data scientist, a frontend developer and two backend engineers.
Timeline
12 weeks
Tools
Figma
Material UI
The Challenge
Existing approaches were slow and manual
Identifying and sizing topics in large datasets was foundational to every Synthesis project, but the existing approach, based on keyword sizing, was manual and time-consuming.
For the Strategists who sifted through thousands of rows of data to identify themes, and the Data Scientists who had to manually run scripts, the work was repetitive, unfulfilling, and prone to bias. It also left less time for analysis and creative output.
“I don’t know what I don’t know. I’m afraid I miss out on interesting themes.”
— Strategist
“I keep hitting Shift + Enter to run the same line of code.”
— Data Scientist

A dataset on Google Sheets with 77,000 rows of social media posts.
Research and Approach
User priorities defined project goals
Through workflow audits, interviews and surveys, I learned that Strategists approached topic discovery differently across projects, but all prioritised:
1. Getting a "lay of the land" of topics.
Strategists needed to capture a broad snapshot of topics in their reports.
2. Sizing topics accurately.
This would allow Strategists to prioritise findings and recommendations.
Additionally, Strategists wanted to be able to:
3. Find topics without bias.
Strategists felt least confident in their ability to meet this goal without support.
4. Get work done faster.
They hoped to halve the time spent on topic discovery, from 30% to 15% of project time.
5. Discover the unexpected.
Beyond the largest topics, they wanted to be surprised by something interesting.
These findings informed design direction and priorities. We set out to build a tool that would efficiently provide Strategists with an unbiased overview of topics in a dataset, with each topic accurately sized.
Together with the data science team, we decided on a topic modelling approach to surface themes and relevant keywords within a given text dataset.
Multiple iterations of both the interface and the model eventually led to the Topic Discovery platform, a web app that allowed Strategists to independently run a topic model and cluster large text datasets into distinct topics.
Final Designs
1. Overview cards summarise topics
After running their dataset through the topic model, users would see an overview of topics presented as cards. Each card contained the 10 most relevant keywords to the topic, the number of documents, and a line graph visualising the topic over time.
Users could click the topic name to rename it. Clicking the card would open a new page, where users could explore the individual documents within each topic.
The topic histogram on the left persisted across pages, allowing users to compare topic sizes and quickly jump between topics.

The topic overview page, summarising all topics through cards.
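As a rough illustration of what sat behind each card, the sketch below builds the card contents (top keywords, document count, and a per-month count for the line graph) from documents that already carry a topic label. The names `TopicCard` and `build_cards`, the input tuple shape, the stopword list, and the use of raw term frequency as a stand-in for keyword relevance are all assumptions made for illustration; the platform's actual relevance scoring is not described here.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "it"}


@dataclass
class TopicCard:
    topic_id: int
    keywords: list[str]                  # most frequent terms, standing in for "most relevant"
    document_count: int                  # size of the topic
    documents_per_month: dict[str, int]  # "YYYY-MM" -> count, the data behind the line graph


def build_cards(documents, top_n=10):
    """Summarise labelled documents into one card per topic.

    documents: iterable of (topic_id, iso_date, text) tuples, where
    iso_date looks like "2023-01-05". Cards are returned largest topic first.
    """
    terms = defaultdict(Counter)
    months = defaultdict(Counter)
    sizes = Counter()
    for topic_id, iso_date, text in documents:
        sizes[topic_id] += 1
        months[topic_id][iso_date[:7]] += 1  # keep only "YYYY-MM"
        words = (w.strip(".,!?").lower() for w in text.split())
        terms[topic_id].update(w for w in words if w and w not in STOPWORDS)
    return [
        TopicCard(
            topic_id=topic_id,
            keywords=[word for word, _ in terms[topic_id].most_common(top_n)],
            document_count=count,
            documents_per_month=dict(sorted(months[topic_id].items())),
        )
        for topic_id, count in sizes.most_common()
    ]
```

Sorting the cards by document count mirrors the role of the histogram: the largest topics come first, so relative size is visible before any card is opened.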
2. Tree view visualises topic relationships
The topic model defined relationships between topics based on their similarity to each other. A separate tree view allowed users to explore these relationships.
A granularity slider allowed users to adjust how topics fell into higher-order branches.

A tree view allowed users to explore relationships between topics.
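One way to picture the granularity slider is as a similarity threshold for merging topics into branches. The sketch below is a minimal, assumed version of that idea: it measures similarity as keyword overlap (Jaccard) and greedily merges the closest groups until none reach the threshold. The function names, the Jaccard measure, and the merge strategy are illustrative assumptions, not a description of how the platform's topic model defined similarity.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_topics(topic_keywords, granularity):
    """Greedily merge the most similar groups until none reach `granularity`.

    topic_keywords: dict mapping a topic name to its set of keywords.
    granularity: low values give few, broad branches; values near 1.0
                 keep almost every topic as its own branch.
    Returns a sorted list of branches, each a sorted list of topic names.
    """
    groups = [({name}, set(words)) for name, words in topic_keywords.items()]
    while len(groups) > 1:
        # Find the pair of groups with the highest keyword overlap.
        (i, j), best = max(
            (
                ((i, j), jaccard(groups[i][1], groups[j][1]))
                for i, j in combinations(range(len(groups)), 2)
            ),
            key=lambda pair: pair[1],
        )
        # Stop when the closest pair is below the threshold or shares nothing.
        if best < granularity or best == 0.0:
            break
        merged = (groups[i][0] | groups[j][0], groups[i][1] | groups[j][1])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return sorted(sorted(names) for names, _ in groups)
```

With three hypothetical topics where "oat milk" and "almond milk" share half their keywords and "cold brew" shares none, a granularity of 0.4 yields two branches, while 0.6 keeps all three separate.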
3. Flexible keyword search
A search box in the upper right enabled users to flexibly query documents. Because it supported boolean logic, users could also run complex queries.
The search results page highlighted topics containing the search term and listed the matching documents.

The search feature supported complex queries with boolean logic.
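To make the boolean search concrete, here is a small sketch of how such a query could be evaluated against a document. The syntax is an assumption, since the platform's exact query grammar is not documented here: uppercase `AND`, `OR`, and `NOT`, parentheses for grouping, double quotes for phrases, and an implicit `AND` between adjacent terms. Matching is a plain case-insensitive substring check, which is a simplification (for example, `oat` would also match "boat").

```python
import re

# A token is a quoted phrase, a parenthesis, or a run of other characters.
TOKEN = re.compile(r'"[^"]*"|\(|\)|[^\s()"]+')


def compile_query(query):
    """Turn a query like 'oat AND (milk OR "cold brew") NOT soy' into a predicate."""
    tokens = TOKEN.findall(query)
    if not tokens:
        return lambda document: True  # an empty query matches everything
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def either(left, right):
        return lambda text: left(text) or right(text)

    def both(left, right):
        return lambda text: left(text) and right(text)

    def parse_or():  # lowest precedence
        left = parse_and()
        while peek() == "OR":
            take()
            left = either(left, parse_and())
        return left

    def parse_and():  # explicit AND, or two terms side by side
        left = parse_not()
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":
                take()
            left = both(left, parse_not())
        return left

    def parse_not():
        if peek() == "NOT":
            take()
            inner = parse_not()
            return lambda text: not inner(text)
        return parse_term()

    def parse_term():
        if peek() in (None, ")"):
            raise ValueError("query is missing a search term")
        token = take()
        if token == "(":
            inner = parse_or()
            if peek() == ")":
                take()
            return inner
        phrase = token.strip('"').lower()
        return lambda text: phrase in text

    predicate = parse_or()
    if pos != len(tokens):
        raise ValueError("unbalanced parenthesis in query")
    return lambda document: predicate(document.lower())
```

A compiled query can then be used to filter documents, for example `[d for d in documents if matches(d)]` where `matches = compile_query('oat AND (milk OR "cold brew") NOT soy')`.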
4. Auto name topics with AI
An "Auto Name" button sped up analysis by leveraging AI to summarise the contents of each cluster into a concise topic name.
Clicking the "Auto Name" button used AI to generate a descriptive topic name.
Louise Sunico · 2025