Interest Graphs to Choice Engine: 5 critical components to succeed: Data [Part 2]

Published January 15, 2014

When you build a Choice Engine, I believe there are 5 key factors to consider:
1. The basic ingredient: data. Is it collected implicitly or explicitly, and what is the quality of said data in its role of inferring interests?
2. Distilling ‘signals’ as reliable interest indicators among the ‘noise’ across the vast domain of information
3. Considering how the graph is created, and the patterns to look for
4. APIs: Enabling X-Commerce enterprises to connect to, and leverage the power of a Choice Engine
5. APIs to APPS: enabling consumers to interact with, and benefit from, a Choice Engine anytime, anywhere.
(The list above is excerpted from the previous post.)

1/ The Data
It all starts with the data…
There are 2 approaches to collecting data on one’s interests, influences, and tastes: implicit and explicit. Companies such as Hunch and Nara have taken the explicit route, asking their users to share their interests directly, with the incentive that users will get relevant and accurate choices in return as they use the application.
However, this approach has some inherent flaws. Why would I tell a choice engine that I like Chinese Vegetarian food, only for it to suggest the closest Chinese restaurant soon after? If I mention that I like Asian food in general, would I get suggestions for Korean, which is too exotic for my taste buds?
In order for me to get relevant choices, I would need to be more specific:
• I like Asian food – which is moderately spiced
• I prefer noodles over rice
• I’m vegetarian.
Based on these parameters, a good Choice Engine should be able to recommend restaurants I would like. If the amount of data that I have to volunteer in order to get a decent recommendation or relevant choice is too high, I might as well just search for the product directly!
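To make the trade-off concrete, here is a minimal sketch of explicit-preference matching. Everything in it is hypothetical: the `Restaurant` fields, the preference keys, and the sample restaurants are illustrative assumptions, not how any real Choice Engine stores or scores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    # Hypothetical attributes, chosen only to mirror the example above.
    name: str
    cuisine: str               # e.g. "asian"
    spice_level: str           # "mild", "moderate", or "hot"
    staples: frozenset         # e.g. frozenset({"noodles", "rice"})
    vegetarian_friendly: bool


def matches(restaurant, prefs):
    """Return True if the restaurant satisfies every preference the user stated.

    Preferences the user did not volunteer are simply not checked.
    """
    if "cuisine" in prefs and restaurant.cuisine != prefs["cuisine"]:
        return False
    if "spice_level" in prefs and restaurant.spice_level != prefs["spice_level"]:
        return False
    if "staple" in prefs and prefs["staple"] not in restaurant.staples:
        return False
    if prefs.get("vegetarian") and not restaurant.vegetarian_friendly:
        return False
    return True


def recommend(restaurants, prefs):
    """Return the names of all restaurants that match the stated preferences."""
    return [r.name for r in restaurants if matches(r, prefs)]


# Made-up sample data for illustration only.
RESTAURANTS = [
    Restaurant("Seoul Fire", "asian", "hot", frozenset({"rice"}), False),
    Restaurant("Noodle Garden", "asian", "moderate", frozenset({"noodles", "rice"}), True),
    Restaurant("Wok Express", "asian", "moderate", frozenset({"noodles"}), False),
    Restaurant("Trattoria Uno", "italian", "mild", frozenset({"pasta"}), True),
]

VAGUE_PREFS = {"cuisine": "asian"}
SPECIFIC_PREFS = {
    "cuisine": "asian",
    "spice_level": "moderate",
    "staple": "noodles",
    "vegetarian": True,
}

if __name__ == "__main__":
    print(recommend(RESTAURANTS, VAGUE_PREFS))
    print(recommend(RESTAURANTS, SPECIFIC_PREFS))
```

With only “I like Asian food”, the sketch returns every Asian restaurant in the sample, including the ones that are too spicy or not vegetarian. It takes all four stated preferences to narrow the list to a single match, which is exactly the burden on the user described above.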
Alternatively, explicit personalisation may result in a sort of ‘blank-box’ syndrome: people are not great at coming up with data about themselves. I am complex, so how do I represent myself in a list? It is not uncommon to see people completely freeze when presented with the question: “Describe yourself”. These sorts of open-ended questions tend to produce data sets that are quite incomplete (I can only come up with two things off the top of my head) or overly vague (I say “sports” when all I really like is cricket). How explicit data is collected is critical to the final quality of outcomes.
On the other hand, companies like Amazon are great at implicit data collection. Amazon has all the elements of the equation: your purchase history, product search history, product correlations (people who bought this also bought that) and even product reviews. How well an enterprise brings together the two parts of the equation will determine how cohesive its graph becomes:
1. Publicly available external data
2. Deep enterprise data assets
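The “people who bought this also bought that” correlation is one implicit signal that is easy to sketch. The example below counts how often two products appear in the same order. It is a simplified illustration under stated assumptions: the function names and the sample orders are hypothetical, and a production system such as Amazon’s is certainly more sophisticated than raw co-occurrence counts.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how many orders contain each unordered pair of products."""
    counts = Counter()
    for basket in orders:
        # De-duplicate and sort so each pair is counted once per order
        # and always appears in the same (a, b) orientation.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def also_bought(item, orders, top_n=3):
    """Return up to top_n products most often bought together with item."""
    related = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return [product for product, _ in related.most_common(top_n)]


# Made-up order history for illustration only.
ORDERS = [
    ["wok", "noodles", "soy sauce"],
    ["wok", "noodles"],
    ["wok", "chopsticks"],
    ["cricket bat", "cricket ball"],
]

if __name__ == "__main__":
    print(also_bought("wok", ORDERS))
    print(also_bought("cricket bat", ORDERS))
```

Note that nobody had to declare an interest in noodles here: the signal falls out of ordinary purchase behaviour, which is the appeal of the implicit approach.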
While general social networks such as Facebook and Google have the luxury of collecting vast data of various types, users may present inaccurate or artificial interest profiles depending on their context. One example is peacocking: the introduction of spurious data as “decoration” to create an idealized public persona, a byproduct of the socialization of the Internet.
Similarly, specialised social networks may only collect data for a single signal:
• Foursquare – Check-Ins
• Pinterest – liking photos that belong to a certain category
Ultimately, what all this means is that interests have to be inferred from normal web / app usage where users have opted in, and that the consumer’s normal usage of the application has to provide strong interest indicators through multiple signals.
Be sure to tune in next week, when I will explore the challenge of noise vs. signals in building a taste graph.
NEXT: Interest Graphs to Choice Engine: 5 critical components to succeed: Signals Vs. Noise [Part 3]