
Tags and categories: what's the difference?

Arseni Mourzenko
Founder and lead developer, specializing in developer productivity and code quality
April 4, 2013

This article was originally published on Stack Exchange.

Is there a difference between tags and categories? Is this difference clear to people without a technical background, i.e. ordinary users?

Tags are inherently different from categories and, in particular, solve the uniqueness-of-attribution problem I've already illustrated in the black sleeping cat example. Users, on the other hand, are not always aware of this difference. When I asked people what the difference between categories and tags is, I got many different answers:

  1. “Categories are usually presented in a form of a tree; tags are never presented this way.”

    Wrong. Just as categories may be flat, tags may be nested inside larger tags, which are themselves nested in other tags, and so on. Adobe Lightroom uses a similar approach with its keywords: while a photo may have several keywords (tags), those keywords may be organized as a tree. Assigning a child keyword usually (depending on the settings) assigns the parent keywords as well.

  2. “Categories are more general than tags.”

    Wrong. Categories are exactly the same as tags; the only difference is that an element may have multiple tags, but cannot be in several categories at the same time.

  3. “Categories are used to group content, whereas tags are used to quickly find the content later.”

    Wrong. Both are used to group content. If I tag some of my photos “my cat”, it means that I have a group of photos with my cat in them.

    Both are used to quickly find the content later. When I put a file in G:\Development\<Project name> and another file in E:\Misc\Funny pictures, it's exactly for the purpose of finding the content more easily later.

  4. “Tags are used to indicate something about the tagged element; categories show that the element belongs to something.”

    Wrong, or at least difficult to understand. When I tag a photo as Niagara Falls, it means exactly the same thing as if I were putting this photo in the G:\Photos\Niagara Falls\ directory, i.e. that the photo shows Niagara Falls, and that it belongs to the set of photos of Niagara Falls.

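  5. “Categories are mutually exclusive. Tags are not.”

    Right. This is the only real difference, as already noted for the second answer: an element may have multiple tags, but only one category.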

Conclusion: So, what do we get? Users don't really understand the difference between categories and tags. Worst of all, they expect differences which don't exist, and the differences they imagine may turn them against using tags.
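
The one difference that does exist, exclusivity, can be sketched as a small data model. This is only an illustration under stated assumptions: the `Photo` class, the `PARENT` keyword tree, and the helper names are hypothetical, and the parent-assignment behavior merely imitates the Lightroom-style setting mentioned in the first answer.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    name: str
    category: str                                 # exactly one category, like one folder
    tags: set[str] = field(default_factory=set)   # any number of tags


# Hypothetical keyword tree: child keyword -> parent keyword.
PARENT = {
    "kitten": "cat",
    "cat": "animal",
}


def with_ancestors(tag: str) -> set[str]:
    """Return the tag together with every parent keyword above it."""
    result = {tag}
    while tag in PARENT:
        tag = PARENT[tag]
        result.add(tag)
    return result


def add_tag(photo: Photo, tag: str) -> None:
    """Assign a tag and, as an assumed setting, its parent keywords too."""
    photo.tags |= with_ancestors(tag)


photo = Photo(name="IMG_0001.jpg", category="Pets")
add_tag(photo, "kitten")
add_tag(photo, "black")
add_tag(photo, "sleeping")

# Re-categorizing replaces the previous category; tagging accumulates.
photo.category = "Funny pictures"
```

After this runs, the photo sits in a single category but carries the tags kitten, cat, animal, black and sleeping at once. Tags can form a tree and categories can be flat; only the cardinality differs.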

How does it apply to search and filtering?

Search and filtering

There are basically three ways to search for content:

  1. Category-based searching.

    This is the basic form of search where the user remains passive. Tree-oriented structures are the most direct illustration of category-based searching. When I want to buy a new Xeon E5-2620 CPU on my favorite website, I go to:

    Hardware › Components › CPU › Socket 2011

  2. Meta-based and/or assisted searching.

    This search still assists the user, but enables the user to be more active. For example, when I want to buy the AF-S NIKKOR 70-200mm f/2.8G ED VR II lens, I may filter the list of all lenses by specifying that I want to see only the lenses which are made by Nikon, cover 70 to 200mm, and have vibration reduction.

  3. Free text searching.

    This search is the most permissive, since the user writes whatever she wants. It is also the most powerful one when the user knows exactly what she's searching for and when search actually works (most of the time, it doesn't).

    For example, if I want to read the specs of Ford Fusion Hybrid SE, by typing Fusion Hybrid SE on a website which publishes vehicle specifications, I expect to see exactly the page corresponding to this model.

Many websites and applications support several of those three search models. Often, content organized in a category tree can also be found through a textual search, or there are tags and textual search at the same time, or categorized content can additionally be filtered with meta-based filters, etc.
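
The three models can be sketched side by side over one small catalog. This is a minimal illustration, not how any particular shop implements search: the `Product` class, the attribute names (`brand`, `vr`, `min_mm`, `max_mm`), the third catalog entry and the category path of the lenses are all made up, the meta filter only tests equality, and the text search is a naive substring match.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    path: tuple[str, ...]                           # position in the category tree
    meta: dict[str, object] = field(default_factory=dict)


# Hypothetical catalog; attribute names and the last entry are invented.
CATALOG = [
    Product("Xeon E5-2620", ("Hardware", "Components", "CPU", "Socket 2011")),
    Product(
        "AF-S NIKKOR 70-200mm f/2.8G ED VR II",
        ("Photo", "Lenses"),
        {"brand": "Nikon", "min_mm": 70, "max_mm": 200, "vr": True},
    ),
    Product(
        "Hypothetical 50mm prime",
        ("Photo", "Lenses"),
        {"brand": "Other", "min_mm": 50, "max_mm": 50, "vr": False},
    ),
]


def browse(path: tuple[str, ...]) -> list[Product]:
    """Category-based: everything located under the given tree path."""
    return [p for p in CATALOG if p.path[: len(path)] == path]


def filter_meta(products: list[Product], **wanted: object) -> list[Product]:
    """Meta-based: keep products whose attributes equal the wanted values."""
    return [
        p for p in products
        if all(p.meta.get(key) == value for key, value in wanted.items())
    ]


def free_text(query: str) -> list[Product]:
    """Free text: naive, case-insensitive match of every query word."""
    words = query.lower().split()
    return [p for p in CATALOG if all(w in p.name.lower() for w in words)]


cpus = browse(("Hardware", "Components", "CPU"))
nikon_vr = filter_meta(browse(("Photo", "Lenses")), brand="Nikon", vr=True)
by_text = free_text("nikkor 70-200mm")
```

Note that `nikon_vr` combines two models, browsing a category and then filtering it by metadata, which is the kind of mix described above.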

This is done because of a simple observation: people use the type of search they need in a given circumstance. Taking again the black sleeping cat example:

  • The user who wants to find photos of black cats will simply use the tag cat and filter the images to display only black objects.

  • The user who wants an exact photo of a little black kitty having fun with a mouse will do a text search for black kitten playing with a mouse.

  • The user who just wants to spend the next hour studying specifications and gazing at the sexy photos of Socket 2011 CPUs will probably use the category tree.

The place of tags in search and filtering

Tags are weird, since they replace categories, but also belong to the second type of search: the meta-based one.

For the sake of simplicity, we can assume that there might be a difference between tags and metadata:¹ metadata would be presented more as a purely filtering technique, whereas tags would be primarily used for search.

For example, the size of the photo would be pure meta, used to filter photos to show only the large ones.

In this case, tags would present themselves as a search element which is used when the user wants to remain passive. Nearly identical to categories, especially when placed in a form of a tree, tags would still remain different from categories because of their non-exclusivity.

When categories are replaced by tags, users may not realize that they should combine multiple tags in order to narrow their search to what they really need. Some people have exactly the same issue with textual search. For example, when searching for the price of the tickets for a trip to Switzerland, they may start by typing “trip” alone, or “Switzerland” alone.

Conclusion: categories, despite being terrible as a way to organize information, would be more intuitive for beginners who want an assisted, passive, tree-based search. On the other hand, a well-implemented tagging system should help the user understand both:

  • The fact that the search may contain multiple tags.

    Clickable tags, as implemented on Stack Overflow, are a poor way to show that tags may be used together in a search, since the user will simply click on one tag, then, later, on another tag, and always get the results for a single tag.

    To avoid this flaw, Stack Exchange uses an excellent technique: putting the clicked tag in a search box. A slightly curious user will try typing other tags and eventually discover that tags can be combined to refine the results further.

  • And the fact that same-level tags are not mutually exclusive.

    This can be done by showing multiple tags for tagged items, instead of simply grouping the items by tags. This is exactly what is done at Stack Exchange, where questions, even in their collapsed mode (on the home page or in the list of search results), show all their tags.
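
Both points can be sketched in a few lines. The example below is an assumption-laden illustration, not Stack Exchange's actual implementation: the article titles, their tags, and the function names are invented, and a multi-tag search is assumed to mean that an item must carry every requested tag.

```python
# Hypothetical tagged articles; titles and tags are made up.
ARTICLES = {
    "Ticket prices for a trip to Switzerland": {"trip", "switzerland", "tickets"},
    "Packing list for a weekend trip": {"trip", "packing"},
    "Swiss cheese guide": {"switzerland", "food"},
}


def search_by_tags(*tags: str) -> list[str]:
    """Return titles of articles carrying every requested tag (logical AND)."""
    wanted = set(tags)
    return sorted(
        title for title, item_tags in ARTICLES.items() if wanted <= item_tags
    )


def render(title: str) -> str:
    """Show an item with all of its tags, not only the one it was found by."""
    return f"{title} [{', '.join(sorted(ARTICLES[title]))}]"


one_tag = search_by_tags("trip")                   # two articles match
two_tags = search_by_tags("trip", "switzerland")   # narrowed to one article
```

Searching for “trip” alone returns both trip articles; adding “switzerland” narrows the result to the ticket prices. Rendering each result with its full tag list is what hints to the user that same-level tags coexist on one item.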

Ambiguous terminology

  1. While tags themselves are used more and more, the term tag may not be clearly understood. Gmail, for example, uses labels, which are exactly the same as tags.

  2. The word tag also has a different meaning, closer to identity, as in animal tagging.

  3. Finally, in web communities where tags are assigned by moderators, tags may be perceived more like an approval (example: tagging a message to appear on a home page) or a disapproval (example: tagging a post as off-topic).

Those three points make it more difficult for beginners to understand what tags are in an unambiguous way. When there is a risk of misunderstanding, designers should:

  • Either use a different term, such as label, used by Gmail,
  • Or redesign the interface to make it clear what tags are for, especially to disambiguate this term in a case where tags are assigned by moderators.

¹ Even if the assumption that there is a difference between pure metadata and tags makes things simpler, this assumption is highly questionable. For example, a date would be pure metadata, but still, it may be used as a first-class search element, like when somebody searches for photos of the 9/11 attacks. In the same way, tags are often used to actually filter the content, instead of searching.