Talking tagging taxonomies with Bec Sareff-Hibbert

Published

28 September 2021

My approach to building a tagging taxonomy goes back to my training as an anthropologist. A significant part of anthropology is understanding different cultures through direct observation and participation. An anthropologist must avoid imposing their cultural perspectives on a new culture or risk compromising the reliability of their results. This experience has taught me to look at what the data is showing me instead of forcing my perspective on it.

I’ve since learned to adapt my approach to taxonomies to applied research and UX. What I’ve realized is that it’s often necessary to build different taxonomies, depending on the outcomes of the research.

This article will touch on what a good taxonomy looks like, cover my strategies for building a taxonomy, and explain how to create different taxonomies depending on the research goals.

What makes a good taxonomy?

A good taxonomy breaks down your data into components that can be used to synthesize insights.

On the other hand, a less helpful taxonomy only labels or categorizes data (description) rather than looking beneath the hood to see what’s really going on (analysis). As a result, it doesn’t reveal anything about the data beyond what you knew before you started tagging.

Tags vs. fields

The difference between analysis and description also means that tags are fundamentally different from fields. While tags analyze the content of your data, fields describe the data source itself and provide context around it. The same distinction separates project-level tags from global tags in Dovetail: I use global tags to organize my data but not to analyze it.

Fields also supercharge your analysis, and they’re one of the things that make Dovetail so much more potent than an affinity sort. Using fields means that I can re-filter my data over time and get new perspectives on the same analysis.

Tags versus fields. Knowing the difference could save your life. Or at least, save you from bad research.

Here are some examples of fields I add to describe my participants:

The product they are using. This helps me organize who my insights are most relevant for in my team.
Their organization size. This might affect the needs that the participant has.
Their role. This helps me understand the participant’s context and perspective.
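
To make the split between tags and fields concrete, here is a minimal sketch in Python. It is not how Dovetail stores data; the `Note` class, the `filter_notes` helper, and all of the sample notes, tags, and field values are hypothetical and exist only for illustration. Tags hold the analysis, fields hold the context, and re-filtering by field gives a new view of the same tagged data.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """One highlighted piece of research data (hypothetical structure)."""
    text: str
    tags: set[str] = field(default_factory=set)           # analysis of the content
    fields: dict[str, str] = field(default_factory=dict)  # context about the source


def filter_notes(notes, **criteria):
    """Return the notes whose fields match every given criterion."""
    return [
        note for note in notes
        if all(note.fields.get(key) == value for key, value in criteria.items())
    ]


# Made-up sample data using the three example fields: product, org size, role.
notes = [
    Note("Exporting reports takes too many clicks",
         {"export friction"},
         {"product": "Reports", "org_size": "enterprise", "role": "analyst"}),
    Note("I never know who changed a setting",
         {"lack of visibility"},
         {"product": "Admin", "org_size": "small", "role": "manager"}),
    Note("I rebuild the same report every week",
         {"export friction", "repetitive work"},
         {"product": "Reports", "org_size": "small", "role": "manager"}),
]

# Same tagged data, two different perspectives.
managers = filter_notes(notes, role="manager")
reports_small = filter_notes(notes, product="Reports", org_size="small")
```

Because the fields live apart from the tags, adding a new filter later does not require re-tagging anything.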

Building your taxonomy

We can make taxonomies in many different ways, but building the wrong taxonomy is a quick and easy way to bake confirmation bias into your analysis.

Roughly, there are two kinds of taxonomies—bottom-up and top-down. These are defined by whether you create from the ground up or come with a pre-made taxonomy.

My strategy for building a taxonomy changes based on the research project’s goals and the level of ambiguity within the research area.

If there is more ambiguity in the research area, I focus on building my taxonomy from the bottom-up, solidly grounding it in the data. An example might be a project in a new area where we don’t currently understand the user needs.

Conversely, if I already have a solid understanding (for example, from a previous research project), I can apply a top-down method. I use this approach when analyzing customer feedback in an area where we have already conducted contextual research and identified user needs.

The distinction between bottom-up and top-down taxonomies roughly matches the stage of the design lifecycle I’m supporting with my research.

Generative research involves much more ambiguity, and my goals revolve around discovery and exploration. As such, when I start in a new area with generative research, I’ll adopt a bottom-up strategy.

Once we get to evaluating an existing feature or product, I can adapt a taxonomy I created earlier in the design process. I can do this confidently as there is less uncertainty, and we have already identified the user needs.

Understanding where in the product design lifecycle you are will help you identify the best approach for a taxonomy.

Caution—it’s easy to overestimate our knowledge. For this reason, when thinking about how much I already know, I consciously nudge my answer towards the side of being less confident.

Here are some questions I use to figure out what strategy I’m going to use to build my taxonomy:

Question: What part of the design process are you supporting?

Action: If I’m working in the early discovery phases, I’ll adopt a bottom-up approach. If we’re closer to evaluation, my approach will be more top-down or a mix of the two.

Question: How much ambiguity is there in the research area already, and how much do we already know?

Action: It’s easy to bake confirmation bias into your results by building a taxonomy based on assumptions and previous misconceptions. Ask yourself how much certainty you have in the topic area, then nudge that answer a bit further towards uncertainty. You can also do an assumption mapping exercise individually or with your product team, and rate your confidence in each assumption to make this explicit.

Question: What’s the depth of the project, and what’s the desired outcome?

Action: Building a bottom-up taxonomy is rewarding, but it’s also time-consuming. Not every project needs the depth or richness that this approach provides. I moderate my strategy based on the goals of the research.

Question: Are the research questions quantitative or qualitative?

Action: Using a taxonomy to quantify your research is a very different beast from using it as a qualitative analysis tool. If you’re planning on quantifying your tags, you should factor this into your strategy from the start. I go into more detail about quantifying tags below.

Building a bottom-up taxonomy

Strengths

Ideal for supporting exploration and discovery
Firmly grounded in the data, helping you find new insights and user needs
Helps to minimize confirmation bias by creating a new taxonomy suited to the data

Weaknesses

Needs to be adapted before reusing between projects
It’s time-consuming to create
It’s not quantifiable

Process

Start tagging from scratch. Go for a swim in your data. Really immerse yourself in it. You’ll start to notice patterns, which become tags organically. Your tags will start as relatively high-level, fuzzy, and broad. Keep separate tags for intriguing things you don’t understand yet (to break down later) and critical quotes to return to when writing your report.
Evolve and add. In between each transcript, go back and prune your board. As more tags emerge, you might realize that you can apply them to notes you have already analyzed. You might also merge redundant tags or split apart tags that contain more than one idea.
Regroup the tags. Go through the process of regrouping tags a few times. Consider whether you should move tags into other groups. These groups can become headings in your report later.
Optimize for insights. As you do multiple passes, your tags start getting more and more defined and turn into insights. That’s when you know it’s ready to come out of the oven.
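
The "evolve and add" step, where redundant tags get merged, can be pictured with a small Python sketch. The `merge_tags` function and the sample tags are hypothetical, and in practice this pruning happens by hand on your board; the sketch only shows what a merge does to the tags already applied to your notes.

```python
def merge_tags(tag_sets, redundant, merged):
    """Replace any tag listed in `redundant` with the single tag `merged`.

    `tag_sets` holds one set of tags per note; a new list is returned and
    the input is left unchanged.
    """
    redundant = set(redundant)
    result = []
    for tags in tag_sets:
        tags = set(tags)
        if tags & redundant:
            tags = (tags - redundant) | {merged}
        result.append(tags)
    return result


# Made-up board: two early, fuzzy tags turn out to describe the same idea.
board = [
    {"slow export", "workaround"},
    {"export takes ages"},
    {"pricing"},
]
pruned = merge_tags(board, ["slow export", "export takes ages"], "export friction")
```

Notes that never carried a redundant tag, like the "pricing" note here, come through untouched.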

This process creates a taxonomy that’s tailor-made to help you make meaning out of a previously chaotic bucket of data. Because these taxonomies are made-to-measure, reusing them carries some risks:

The taxonomy becomes static over time
Data is tagged too generally to fit the pre-existing taxonomy
New tags aren’t added when the opportunity arises
The taxonomy is applied to a project where the fit is less than ideal

Forcing new data into a system made for a different context or topic is like using a cookie cutter and throwing away the scraps of cookie dough. You’re forcing the data into a different shape, which gives you misleading insights and risks missing new ideas. That said, if you’re starting a project in the same area, you can use your existing taxonomy as a starting point as long as you stay flexible enough to spot new insights.

Building a top-down taxonomy

Strengths

Reusable between projects
Quicker to create
Can be quantified

Weaknesses

Isn’t ideal for discovering new needs
Less grounded in the specific data set being analyzed
More prone to confirmation bias

I’ll be honest, working with a top-down taxonomy always makes me nervous. But it probably should—you’re essentially applying a set of assumptions to unfamiliar data.

However, there are times when it’s handy to have this strategy in your toolkit because the richness of your taxonomy should match the richness of your data.

For example, when you’re tagging a large volume of data that’s not particularly rich (like short pieces of feedback), your goals might be to quantify the themes within the dataset rather than discovering new ones.

It’s best in this situation to take a randomized sample from your data set, build a bottom-up taxonomy from that sample, and then use it to tag the rest of your data. Alternatively, you might base your top-down taxonomy on existing knowledge, like your prior research or other research you can access.

When using a top-down taxonomy, it’s even more important to be open to revising your taxonomy. I nudge myself in this direction by adding a tag at the beginning of my analysis for data that doesn’t fit into my existing tags. I can use this as I go, and it removes some of the temptation to force my data into my existing taxonomy.
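
The sample-then-apply approach and the catch-all tag can be sketched roughly as follows in Python. Everything here is an assumption made for illustration: the `split_sample` and `apply_taxonomy` helpers, the catch-all tag name, the feedback items, and above all the keyword matching, which is only a crude stand-in for a researcher reading and tagging each item.

```python
import random

CATCH_ALL = "doesn't fit yet"  # hypothetical name for the catch-all tag


def split_sample(items, sample_size, seed=0):
    """Randomly set aside `sample_size` items to build the taxonomy from."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(items)), sample_size))
    sample = [item for i, item in enumerate(items) if i in chosen]
    rest = [item for i, item in enumerate(items) if i not in chosen]
    return sample, rest


def apply_taxonomy(text, taxonomy):
    """Tag one item; fall back to the catch-all tag when nothing matches.

    `taxonomy` maps a tag to a list of keywords. Keyword matching stands in
    for human judgement here and is not a recommended way to tag.
    """
    lowered = text.lower()
    matched = {
        tag for tag, keywords in taxonomy.items()
        if any(keyword in lowered for keyword in keywords)
    }
    return matched or {CATCH_ALL}


# Made-up short feedback items.
feedback = [
    "Too expensive for us",
    "Pages load really slow",
    "Love the new logo",
    "The price went up again",
    "Search is fast now",
    "Can't find the invite button",
]
sample, rest = split_sample(feedback, sample_size=2)

# In practice this taxonomy would come out of tagging `sample` bottom-up.
taxonomy = {"pricing": ["price", "expensive"], "speed": ["slow", "fast"]}
tagged = {text: apply_taxonomy(text, taxonomy) for text in rest}
```

Anything that lands under the catch-all tag is a prompt to revisit the taxonomy rather than a pile to ignore.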

Quantifying tags

Along with your research goals, the tagging strategy you’ve chosen will determine whether it makes sense to quantify your tags. If your research goals were generative, it’s not productive to quantify your tags.

Quantifying tags creates a summary by reducing your data to a number. Generative research aims to create depth and richness, which a summary does not serve. However, quantifying your tags is appropriate when your research goal is to measure a trend in your data (e.g., ‘how many’ or ‘how much’).

Quantifying your tags also requires that you’ve been deliberate and consistent about the definitions and boundaries between your tags. This means you can’t be as organic and flexible with them (the ideal when tagging qualitatively).

However, if you’ve been consistent and deliberate about defining your tags, and your data set is big enough, it can be useful to quantify your tags to create a summary of the broader trends within your data set.
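
As a rough illustration of what that summary might look like, here is a short Python sketch that counts how many notes carry each tag and what share of the data set that represents. The `tag_summary` function and the sample tags are hypothetical, and the numbers only mean something if the tags were applied consistently.

```python
from collections import Counter


def tag_summary(tagged_notes):
    """Map each tag to (number of notes carrying it, share of all notes)."""
    # set(tags) makes sure a tag is counted at most once per note.
    counts = Counter(tag for tags in tagged_notes for tag in set(tags))
    total = len(tagged_notes)
    return {tag: (count, count / total) for tag, count in counts.most_common()}


# Made-up data: one list of tags per piece of feedback.
tagged_notes = [
    ["pricing"],
    ["pricing", "speed"],
    ["speed"],
    ["pricing"],
]
summary = tag_summary(tagged_notes)
```

With this sample, "pricing" appears on three of the four notes and "speed" on two, which is the kind of broad trend a summary can show.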

Tagging as a team

Tagging your data as a team is a great way to create alignment and benefit from the perspectives of others in your team. However, it’s easy for two people to interpret a tag’s title differently. Once you involve other brains in your analysis, it becomes essential to understand and use your tags consistently. This is especially important if you’re quantifying your data.

Here’s the process I use when tagging with other researchers or with my product team:

Get together in a room or on a call
Take a sample of the data to build a bottom-up taxonomy
Complete a couple of transcripts together
Identify what we’re learning from it and brainstorm the themes
Create a shared tagging system
Collectively write the definitions of those tags
If you have a large quantity of data or limited time, split up the remaining transcripts across the team and tackle them individually
If you have time, review how you’ve each used the taxonomy as a team to make sure you’re staying consistent
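
One possible way to support the final review step is to have two people tag the same few notes and compare the results. The Python sketch below is an assumption rather than part of the process above: it scores the overlap between two taggers’ tags on each note (shared tags divided by all tags either person used) and lists the notes worth discussing. The function names, the threshold, and the sample data are all hypothetical.

```python
def tag_agreement(tags_a, tags_b):
    """Overlap between two sets of tags: 1.0 is identical, 0.0 is disjoint."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 1.0  # neither person tagged the note, so they agree
    return len(a & b) / len(a | b)


def notes_to_review(tagger_a, tagger_b, threshold=0.5):
    """Return the IDs of notes both people tagged where agreement is low."""
    shared = tagger_a.keys() & tagger_b.keys()
    return sorted(
        note_id for note_id in shared
        if tag_agreement(tagger_a[note_id], tagger_b[note_id]) < threshold
    )


# Made-up example: two teammates tagged the same three notes.
alex = {
    "note-1": {"pricing"},
    "note-2": {"speed", "onboarding"},
    "note-3": {"pricing"},
}
sam = {
    "note-1": {"pricing"},
    "note-2": {"speed"},
    "note-3": {"trust"},
}
flagged = notes_to_review(alex, sam)
```

A flagged note is usually a sign that a tag definition needs tightening, not that one person tagged it wrong.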

Variety is the spice of life

Having a variety of strategies ready for building a taxonomy is a great way to level up your analysis and gives you the flexibility to choose the best approach. This means your data can live its best life, and you can get the most out of your research.

Next time you’re starting a new research project, try thinking about what kind of taxonomy will best serve your goals—soon, you’ll have a whole toolkit ready to meet your next research challenge.

Want to learn more about tagging taxonomies? Join us in our community Slack channel, where 2000+ researchers hang out and discuss all things research-related.