Zaki Mirza’s Blog


… About software and beyond!

Categories, Tags or the hybrid approach

Developing software is not the same as it used to be. With new paradigms, techniques and tools coming up, it's hard to choose the best methodology and the best tool to deliver quality products. Hence my dilemma about organizing data. Would you rather categorize it (the way it has always been done) or tag it? There has been a lot of discussion on categories vs. tags on blogs and forums lately. Had tags not been invented, I wouldn't be writing this post or getting all confused about what to use.

So here is the scenario. You have a finance management tool where you write down your daily expenses. Just filing expenses in software doesn’t do much good (at least not enough to motivate me to build custom software for it; for that, Microsoft Excel suffices). So one of the major goals of this kind of tool is organizing your expenses. That is the narrow goal, though. The broader goal is planning and reporting, and that is where organization becomes crucial. One way to organize your expenses is with categories and n-level sub-categories. Another is with tags. Let's discuss both and see which one is better.

In a strictly category-based approach we make categories and sub-categories. For example, the top-level categories might be {house, car, loans, personal, kids, pets, traveling, communication}. Then you can have sub-categories for each category, say {utility bills, groceries, maintenance, renovations, miscellaneous} for the “house” category. You can then have sub-categories within these, and so on (electric bills, phone bills, gas bills and what not!). This is the traditional approach to organizing such data. The categories need to be set up beforehand so that when entering new data you can choose the appropriate one.

But what if you have a list of expenses on paper and just want to add the data quickly, with no predefined categories? So you add an on-the-go “add new category” feature: first you navigate to the appropriate parent category, then add the sub-category inside it. From the organization point of view, that's fine. From the usability point of view, that sucks! (Imagine you have random data on little slips of paper that you want to enter into your software.) Yes, you can say that after a while we’ll have enough categories that we won’t have to keep adding more, and we'll arrive at a stable category list. Again, that sucks! Why? Well, the example above has 8 top-level categories; with, say, at least 5 sub-categories under each and 3 more under each of those, we end up with 8 x 5 x 3 = 120 categories at the deepest level. (Okay, maybe not that many, but it would be a fairly big figure.) We can't really show all of them to the user on one screen. So maybe a tree view would be the most useful way to present this kind of categorized data. But then four levels of categories will require at least 4 clicks per expense. If we also restrict the user to adding only to leaf categories (i.e. you can't add an expense to just “car”, you have to say what kind of “car” expense it is), it's really a pain. Imagine how long it would take to add expenses, eventually driving the user away from the software, at least in the initial phases, and note that the real reporting/planning phase has not even arrived yet. We’re talking about non-tech-savvy people and regular PC users, not high-end data entry people.
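
To put a rough number on that, here is a minimal Python sketch of a strict category tree. It uses the eight top-level categories listed above and assumes, as in the estimate, 5 sub-categories under each and 3 more under each of those. The `Category` class, the helper functions and the placeholder sub-category names are all made up for illustration; they are not taken from any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a strict category tree (hypothetical model)."""
    name: str
    children: list["Category"] = field(default_factory=list)

    def add(self, name: str) -> "Category":
        child = Category(name)
        self.children.append(child)
        return child


def count_leaves(node: Category) -> int:
    """Number of leaf categories, i.e. the places an expense may be filed."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def depth(node: Category) -> int:
    """Levels below this node: the clicks needed to reach the deepest leaf."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


TOP_LEVEL = ["house", "car", "loans", "personal",
             "kids", "pets", "traveling", "communication"]

# Assumption: a uniform tree with 5 sub-categories per top-level category
# and 3 sub-sub-categories per sub-category. The names are placeholders.
root = Category("all expenses")
for top_name in TOP_LEVEL:
    top = root.add(top_name)
    for j in range(5):
        sub = top.add(f"{top_name}/sub-{j}")
        for k in range(3):
            sub.add(f"{top_name}/sub-{j}/leaf-{k}")

print(count_leaves(root))  # 120
print(depth(root))         # 3
```

Even this small tree has 120 places to file an expense and needs three clicks to reach any of them, and every extra level adds one more click.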

So now there is this concept of tags. Tags are words that describe the data. A datum can have multiple tags associated with it, which act as metadata for it. For example, I have a picture. It’s of my friends, at KFC in Islamabad. It's a birthday party picture. It was taken in the evening. So “friends”, “kfc”, “Islamabad”, “birthday party” and “evening” are all tags for this picture. There can be other pictures at other KFCs around the country. There can be other pictures of Islamabad. There can be loads of pictures of birthday parties. The point is, we’re just defining the context on the go for this picture. There are no predefined categories or folders to put it in. I don’t want to make 5 categories and put this picture in all 5. And with categories, if I realize it's not the Islamabad KFC but the Karachi KFC, I have to go to the Islamabad category, remove the picture from there, make a new Karachi category, and so on. With tags I just swap one tag. So when searching through these images I'll have a lot more freedom, since I just have to search for specific tags. (By the way, this is loosely how search engines organize information from around the World Wide Web: they index pages by the words associated with them, and a search looks up the matching entries. It's not that simple, but that's the 30,000 ft view.)

So the benefits of tagging are easier maintenance, easier and more precise searches, and no more redundant categories. (If you go back to the category examples, we had a “maintenance” category under “house”. It could just as easily sit under “car” or any other category at any level, so the same category gets duplicated even though the context is different.)

But what about usability? First we had to click, now we have to type! (Well, not really; we can just click an already defined tag.) We’ve removed the hierarchical structure, so that cuts down some milliseconds. We’ve removed redundant categories, so we’ve managed to save some room. We’re making tags on the go, so there is no setup grumbling before the fun starts. We can show a real-time tag cloud (like the ones you see on Flickr or WordPress). We can, if we like, see a real-time hierarchical structure of our expenses: pick one tag and it becomes the root, with the other tags that occur alongside it shown as sub-tags. (That would look really cool; by the way, it's on my “must have” feature list.) Some might say that as our database grows, so will the time needed to build real-time clouds and hierarchies (some have ranted about how slowly tag clouds can perform on the web), but I guess that can be sorted out, since we don’t really have to scan all the data every time we need the tags.

And finally, we also get a better way to visualize our data. We can pick “maintenance” and see how much we’re spending on maintenance in general (this will include car maintenance, house maintenance, etc.). Yes, this could be done with categories, but with tags we have a more logical structure for indexing: we don't have to search through categories and n-level-deep sub-categories to find each “maintenance” category and then go back up the tree to see where it belonged. (I hope I'm not being biased towards tags here.)
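
Here is a rough Python sketch of what tagged expenses could look like, just to make the “maintenance” example concrete. Everything in it is an assumption for illustration: the `Expense` class, the helper names and the sample amounts are invented, and a real tool would keep this in a database rather than a list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Expense:
    """A single expense with free-form tags (hypothetical model)."""
    description: str
    amount: float
    tags: frozenset[str]


def expense(description: str, amount: float, *tags: str) -> Expense:
    """Create an expense; tags are made up on the go and lowercased."""
    return Expense(description, amount, frozenset(t.lower() for t in tags))


def with_tags(expenses, *tags):
    """Expenses that carry every one of the given tags."""
    wanted = {t.lower() for t in tags}
    return [e for e in expenses if wanted <= e.tags]


def total(expenses) -> float:
    return sum(e.amount for e in expenses)


def tag_cloud(expenses) -> Counter:
    """How often each tag is used; a tag cloud would size words by this."""
    return Counter(tag for e in expenses for tag in e.tags)


# Invented sample data.
ledger = [
    expense("oil change", 40.0, "car", "maintenance"),
    expense("roof repair", 250.0, "house", "maintenance"),
    expense("electricity", 60.0, "house", "utility bills"),
    expense("fuel", 30.0, "car"),
]

print(total(with_tags(ledger, "maintenance")))         # 290.0
print(total(with_tags(ledger, "car", "maintenance")))  # 40.0
print(tag_cloud(ledger)["house"])                      # 2
```

Note that “maintenance” exists exactly once as a tag, yet it answers both the general question (all maintenance) and the specific one (car maintenance only) without any duplicated category.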

So far, tags look much more useful than categories. I think it's basically a matter of freedom. Do you want to restrict the user to certain categories? Are you giving any freedom? Sites like DeviantArt are strictly category-based and do not offer much freedom. Not that it's bad; they can't afford to. Everyone has their own way of defining their art, but art has certain fixed categories. Even though there are a million painters, they paint with mostly the usual tools and materials, for example “oil”, “water colors”, “pastels”, etc. Maybe these scale up to 100, but that's it. The domain of art is finite (as in, its categories). Similarly, the domain of music is finite. Yes, new genres are added every now and then, but it's not that dynamic. This lack of freedom is also because DeviantArt needs moderation, in terms of categorization. Categorization brings quality to the art available on the site, and people can search for their favourite art genre. (Maybe if they added a tagging system on top of this categorization system, it could be even cooler.)

In our problem domain, we do not have this kind of restriction. A narrow-minded view might say that, at most, only these few top-level categories can exist for a person. Let's take that as a valid argument. But what about sub-categories? How deep will you define them? Maybe one guy is the man of the house and is just managing household finances. Another guy manages his football team’s finances. Neither needs the other’s categories, and it's not a public service anyway. (We’re talking about personal finances, and only you can define and choose where you throw your money away :p) So categories are not that useful here. Tags are the way. If the software requirements specify that categories need to be hierarchical, we can use our tags to create hierarchy trees. If they require a strict hierarchy, then we’ll have to stick with on-the-go definition of categories. I’m still thinking of new ways to deal with this problem.
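
One possible way to get a tree out of tags, sketched in Python below: pick a root tag, keep only the expenses that carry it, and make every tag that occurs alongside it a child, recursively. This is only my guess at how such a view could work, not a finished design. The `tag_tree` function, the (amount, tags) pair layout and the sample data are all invented for illustration.

```python
def tag_tree(expenses, root, seen=frozenset()):
    """Nested view rooted at `root`: each co-occurring tag becomes a child.

    `expenses` is a list of (amount, set_of_tags) pairs.
    Returns (total_amount, {child_tag: subtree}).
    """
    seen = seen | {root}
    # Keep only the expenses that carry the root tag.
    matching = [(amount, tags) for amount, tags in expenses if root in tags]
    # Every other tag on those expenses becomes a child, unless it is
    # already an ancestor on this path (this also guarantees termination).
    child_tags = sorted({t for _, tags in matching for t in tags} - seen)
    children = {t: tag_tree(matching, t, seen) for t in child_tags}
    return sum(amount for amount, _ in matching), children


# Invented sample data.
ledger = [
    (40.0, {"car", "maintenance"}),
    (250.0, {"house", "maintenance"}),
    (60.0, {"house", "utility bills"}),
    (30.0, {"car", "fuel"}),
]

print(tag_tree(ledger, "maintenance"))
# (290.0, {'car': (40.0, {}), 'house': (250.0, {})})
print(tag_tree(ledger, "house"))
# (310.0, {'maintenance': (250.0, {}), 'utility bills': (60.0, {})})
```

The same data gives a different tree depending on which tag you pick as the root, which is something a fixed category tree cannot do.
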
Have your say on the issue of tags vs. categories. What do you propose for a personal expense management software/service? We’ll have a look at n-dimensional hierarchical views of tagged data as soon as I get time to work on it.


Filed under: design, software
