Generating predictive data is big but challenging business for Ratan Tata-backed Crayon Data

Published November 23, 2018

by Kristie Neo, Deal Street Asia
This article was originally published on Deal Street Asia on 23.11.2018.
If there’s one thing that’s true – data lies. Data lies a lot.
In “Everybody Lies: Big Data, New Data and What the Internet can tell us about who we really are”, Stephens-Davidowitz writes: “Big data allows us to finally see what people really want and really do, not what they say they want and say they do. Providing honest data is the second power of Big Data.”
But ‘honest data’ doesn’t come easy, and Crayon Data knows that well.
Founded in 2012, the Singapore-based Crayon Data has an AI algorithm which analyses consumer data to acquire customers for clients like Emirates Airlines, regional banks and hotels.
“One thing we’ve realised over the years, is that consumer behaviour cannot be so easily predicted from social media reactions like Facebook Likes and Twitter shares,” said Suresh Shankar, founder of Crayon Data.
“Let me give you an example. Let’s say I post something online about a Ferrari and you ‘Like’ it. Are you liking Ferrari or are you liking Suresh’s post? There’s no real way to know. There’s a lot of noise out there. There’s a lot of data available, but it’s not powerful in prediction.”
Crayon claims its algorithm is smart enough even to predict offline behaviour, a crucial aspect for Southeast Asia where a majority of transactions still take place off the grid.
It also promises customer data privacy – a great thing to have but challenging if the company needs to profit from the very same data it needs more of.
Add to that – an escalating climate around data protection and localisation, and suddenly the big data business doesn’t seem like a merry place to be in at all. Crayon Data has one bright spot: it is already profitable.
It has a solid set of backers to show for investor confidence, including Kris Gopalakrishnan, Ratan Tata and Mitsui & Co. Last month, Crayon announced it was raising $11 million in its Series B round to fuel its expansion into markets like the US, UK and Australia.
Moving forward, Shankar plans to strengthen Crayon’s legal team, and potentially spin its legal and security arm out of the core business.
“We’re working on making changes to our current model where we want to build more of an ecosystem play where our platform sits in the centre, between the data providers and our clients, who are users of the data… It wouldn’t just alter things from a cost perspective. It will also change the way our clients see their customers completely,” said Shankar.
According to Shankar, 15 to 20 per cent of Crayon’s total expenditure goes to data and infrastructure. The bulk, however, goes to talent, at 60 per cent. Shankar also foresees compliance costs rising in future.
***
Edited Excerpts:
What led to the idea of Crayon?
What we believe is that consumers in the world have too much choice. If I want to pick a coffee shop, there are 15 of them out there. We believe that what people don’t have is time. People don’t have time to navigate all the confusing options out there.
It takes over 45 minutes to make a choice. You read about 6 to 12 reviews, visit 10 sites to make one decision – where to get good coffee. If it’s a complex process like planning a holiday, this could be much longer. But it doesn’t have to be this hard. That was the problem that we wanted to solve.
Our algorithm can solve the problem for you: you are almost able to simply trust the smaller set of more relevant choices it presents you. You save time. In a way, you almost surrender to the algorithm. It helps simplify your life in more ways than you can imagine, though there’s a good and a bad side to such a surrender.
We want to build a platform that lets any consumer in the world, based on their personal tastes, find the things they want without spending that extra bit of time.
At Crayon Data, we put a very large emphasis on privacy. Our engines do not search individuals or track people on the Internet. We infer and decode your tastes and preferences based on what you do, as opposed to who you are.
But isn’t that where the paradox is? A company like Crayon will be so dependent on the amount of data that’s publicly available about people. In a way, you need a lack of awareness of privacy before you can even sell to businesses.
Yes, that’s true, but only to a limited extent. What we do is work behind the scenes with enterprises. In other words, banks or airlines will provide data about their customers and what they’ve done. Their data may be limited and not organised. But we take on the job of making sense of it and converting it into a set of personal tastes of each consumer. All the limitations and problems in their data set become problems that our algorithm takes on.
One thing we’ve realised over the years, is that consumer behaviour cannot be so easily predicted from social media reactions like Facebook Likes and Twitter shares.
Let me give you an example. Let’s say I post something online about a Ferrari and you “Like” it. Are you liking Ferrari or are you liking Suresh’s post? There’s no real way to know. There’s a lot of noise out there. There’s a lot of data available, but it’s not powerful in prediction.
We’ve worked to ensure that the predictive power of our AI algorithm takes into account actual physical behaviour – the actual transactions that an enterprise’s customers make. Traditional enterprises like banks, retailers and airlines have that behavioural data. On the other hand, banks and airlines don’t know you that well, beyond the transactions you’ve done with them.
It’s only when you mix internal enterprise data with external preference data, that consumer behaviour can be more easily predicted. This is where our Maya engine comes in.
So how do you circumvent this? Data tends to lie, a lot.
This is one of the things that we’re training our algorithm to do.
We’re basically saying that we’re living in an uncertain situation where at any given time, an enterprise may or may not have all data about you. Even if they gave us data about you, there may not be data on the internet that I can match it to. Crayon is making the algorithm work extra hard in a model where you’re uncertain about how much data you have on each individual.
Crayon spent three years being a big data analytics service and we realised that unless enterprises go digital, they’re not going to be able to leverage the power of our algorithm. We spent the last few years building the digital part of our Maya product and platform.
Today we tell our enterprise clients – give us your customers’ transaction data, and Maya will tell you all about your customers’ individual tastes. It will also give you an API that enables you to create a taste-led digital storefront – one for each customer.
Here’s a personal example: I’ve been a customer of a bank for 17 years, with four credit cards. This bank is still sending me offers to eat at steak restaurants (when I’m vegetarian!). You go to their portal and it’s got nothing to do with you – it’s not personalised at all. What we give them is an API that says, this is the page you have to give each customer according to the way he or she shops, dines and eats. The API can be used on the web, on mobile, in a bot or in a widget on a webpage. That’s the digital storefront.
But it comes across to me that the space that you’re occupying is getting pretty crowded.
There are different forms of competition. First there are the guys who sell the data. Then there are the guys selling tools, like SAP and Oracle. Finally, there are the digital guys who provide the front-end personalisation. The difference for us is that the moment you put the data into our platform, it does all of it for you. All it takes is three weeks. So, we’re a full-stack solution.
I think client mindsets are changing slowly but surely. Today, some of the company CEOs we meet say “We can hire ten people to do this, but we recognise that it will take a long time to build. We should just partner with a specialist and get it out faster.” Most of the resistance tends to come from middle management.
Which sectors do you think are more receptive to Crayon today?
Crayon believes that airlines, hotels and retailers will move faster. The reason is that their business model is getting destroyed faster than other industries. They’re under threat. They’re also living in a world where the internet has completely dominated their lives. They don’t know when someone like Google will take over everything. That’s where the big growth opportunity is for us.
We also see opportunities in the credit card, debit card business. Some players think everything is okay. But they’re beginning to realise that with Amazon Pay, Alipay, GoPay and GrabPay, everyone’s doing their own payment platform and wallet. The threat is not the payment revenues, but the invaluable data that each transaction carries about customer preferences. Lose that, and you lose ownership of the customer.
Would you consider partnering with e-wallets?
We’re working on that. One of the things that we’re doing is to go to these guys and say – you already have your own wallet and a set of services that you offer; we can help you quickly get into another vertical. Like a Go-Jek. Every month they’re adding a new service which you buy on the Go-Jek platform. We can say – why don’t you take over the travel business, I already do this for one of the world’s best airlines. Instead of building your own travel product, you can easily use my platform.
We also believe that the pure digital banks will begin to use more of us. We’re talking to a few in the US and one in the Middle East. We’re working with a completely new digital bank that just set up in the Middle East. We’re also working with one of the biggest banks in Yangon, Myanmar, which is interesting. They want to move from a cash economy to a WeChat economy, where they want to skip all the intermediate steps and go direct to a mobile, digital model.
But most of the transactions in Southeast Asia are still offline. How is your product going to bridge this offline-online divide?
That’s what our algorithm aims to do. One thing we’ve learned is that if you’re trying to do something innovative that requires you to move the market, the markets themselves need time to adopt what you’re trying to do. This is especially so in a B2B context. It’s not just about my knowing what’s right, it’s also about convincing the market that this is right. Gartner predicted in 2016 that it would take two years for AI-led personalisation to go mainstream, and it literally is happening as we speak, so we’re in the right place now.
You have a number of investors backing you, one of whom is Kris Gopalakrishnan. How do you plan to tap them?
Kris is one of the co-founders of Infosys, the Indian tech giant. It’s quite interesting because the first time we met, Kris said – you can create something similar to a Bloomberg terminal for the consumer bank business. In other words, you can take the data feeds and present the data insights to a consumer bank and say – this is where your customer is now, and this is what we think he will do next. It instantly becomes big value for them and a big opportunity for us, because we are converting the data into something enterprises can actually use.
How much public data do you really have? Don’t you have to acquire the data at some point from someone?
Yes, we acquire data, we partner with people, we also do manual curation. All of that goes into our graph.
Do you spend a lot on acquiring data?
Yes. But we spend more on talent. That’s the most expensive. Data and cloud infrastructure come second. Good engineers are hard to get. The Singapore government gives incentives to set up an AI team in Singapore but there’s only a very small pool of talent. The problem we face is that we train these engineers up and then they get poached by large enterprises.
How much of your annual expenditure do you set aside for people and data?
I would say we spend about 60 per cent of our annual expenditure on people, and 15 to 20 per cent on data and infrastructure. We spent a lot more in our first year on data and infrastructure learning data sets, but now we’re more efficient so we spend less.
Do you think acquiring data in the future will grow more expensive though? Especially if data becomes increasingly difficult to get.
Not if we change the business model. We’re working on making changes to our current model where we want to build more of an ecosystem play where our platform sits in the centre, between the data providers and our clients, who are users of the data.
But isn’t that what you’re already doing?
No, not really. Currently we’re working with clients across different industries where they may not be sharing data with one another. If we can create a model where all of these companies start to share their own datasets, while we bring in the external data, the entire economics of how we deliver data to people will change completely. It wouldn’t just alter things from a cost perspective. It will also completely change the way our clients see their consumers.
Imagine a world where Singtel, DBS, Singapore Airlines are able to share the preferences of their common customers, without actually sharing the data. Maya will be at the centre of such a “taste exchange”.
But why would these different companies want to share data with one another? That’s really difficult to imagine.
It’s happened before. There is a company called Quantium in Australia. It takes data from Woolworths, ADB and Foxtel. But they’re running it differently, where the companies are almost running it like a JV. They’re putting it together and saying they can look at consumers differently, whether it’s retail spending, television viewing or banking habits. So, it is possible, but what will make the difference will be blockchain and encryption. It will be an entirely new way of looking at data.
Do you pay a significant financial penalty when it comes to data regulation and privacy?
In terms of GDPR, I would say everyone else has been clamping down significantly on data privacy with the exception of the US. The definitions that regulators use for how you use data classify us as a data processor. This means we are not the enterprise which bears the financial penalty for the misuse of data. In other words, I’m not using the data for myself; I’m using it on behalf of my clients. There are still consequences, but the main burden is with the owner of the customer.
Do you foresee yourself beefing up your legal team moving forward?
We actually just hired a full-time legal person. One of the things we’re discussing is at what stage we take data, data privacy or security out of the mainstream business. It’s almost like having an internal auditor. This will probably be something we have to look into more closely in 2019. We will probably have a full-time data officer whose job is just to work on legal, data security and privacy and not be part of the main business. We will look into adding to the legal team next year. I can also foresee compliance costs going up next year.