States Title at the AI Summit in NY – Andy Mahdavi

Will Thompson: All right, so last panel of the day, I hope you’ve all had your coffee, you know, because it’s gonna help us get through this. But at the same time, we’ve got a huge amount of interesting things to discuss because when you look at the potential for artificial intelligence to make a real economic impact, look no further than the financial services industry. And we have here today representatives of a number of different facets of that. But what’s holding back the true potential of AI and finance?

One of the problems is, you know, skills and culture, and we’ll touch on that. But it’s also legacy technology. Just the same way that we can’t stop the subway in New York to overhaul it, you can’t really shut down all of your legacy systems and spend the next couple of years building things from scratch – nor would your CFO let you, even if you could manage to convince somebody else to do that.

So we have to work with what we’ve got here, right? I guess the first question is, when it comes to legacy technology in financial services, how do you think about actually adopting artificial intelligence? Is it something that you bolt on to your old system? Do you have to sort of run a parallel system where you start from scratch? I guess I’ll start with you, Andy, because you have a very unique perspective as a startup that actually acquired a legacy business.

Andy Mahdavi: That’s right. I’ll tell you a little bit about it. It’s been this amazing journey – we didn’t exist three years ago, and we started a title insurance carrier, which – by the way, don’t try this at home – is an extremely difficult and long regulatory process. But at the end of it, we had a machine learning instant underwriting algorithm, which is kind of normal in other areas of business but was a new thing in the title insurance business.

That allowed us to actually reverse acquire North American Title, the number eight player in our industry. And at that point we had kind of these two worlds, right? We had the Amazon-based ML [machine learning] system with all the fancy stacks that you’ve seen on screen today in this room, which has been so great to hear about. And then we had Microsoft – maybe it was running on a VM in Azure, but other than that, very much a click-through system.

The question was, what do we do about that? We actually spent a long time evaluating the bolt-on approach versus a different approach. The bolt-on approach says, ‘Hey, we’ve got this thing, it’s generating revenue in our legacy system. Let’s start adding pieces to it. Let’s start adding AI systems and kind of jury-rig ways of getting that ML into the legacy path.’ What’s the alternative? The alternative is: grow the thing that’s new until it’s able to do everything that the legacy system could do, with ML at its core.

Those are the two alternatives I’ve seen, not just at our company, but at others. I was previously at Capital One, where they faced a very similar problem with their credit card application system. In both cases – at Capital One and at States Title – rather than bolting the new onto the old, we decided we would grow the existing high-tech system until it could replicate all the functions of the legacy system, but with machine learning as a core part of it, and slowly manage the shifting of volume from one to the other.
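A minimal sketch of that gradual volume shift, assuming a hypothetical router in front of two underwriting paths – the class, the callables, and the starting percentage are illustrative, not States Title’s actual implementation:

```python
import random

class MigrationRouter:
    """Send a configurable share of orders to the new ML path while the rest
    continue through the legacy path (a strangler-style cut-over sketch)."""

    def __init__(self, underwrite_new, underwrite_legacy, new_share=0.05):
        self.underwrite_new = underwrite_new        # callable for the new ML system
        self.underwrite_legacy = underwrite_legacy  # callable for the legacy system
        self.new_share = new_share                  # fraction of volume on the new stack

    def route(self, order):
        # As the new system proves it can replicate legacy functionality,
        # new_share is ratcheted up until the legacy path can be retired.
        if random.random() < self.new_share:
            return self.underwrite_new(order)
        return self.underwrite_legacy(order)

# Illustrative use: start with 5% of volume on the new system.
router = MigrationRouter(lambda o: "new-ml-decision", lambda o: "legacy-decision")
print(router.route({"order_id": 123}))
```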

Will Thompson: One of the problems that we have in finance generally is just the explosion of unstructured data – which you were referring to – and the quality of it is pretty abysmal a lot of the time. So how do you go about managing that, especially when a lot of that historical data may not be, again, as clean or kept in as modern an environment as you might like? Maybe we’ll start with Andy.

Andy Mahdavi: Yeah, sure. Well, first of all, in my industry we have problems with the structured data – forget about the unstructured – and so that’s a whole thing.

Will Thompson: Well, so talk about that first.

Andy Mahdavi: So, when you close a mortgage, there’s somebody typing literally everything into the system. In many cases there is no standardized way to make sure people’s names and addresses are actually spelled correctly. That itself is a problem – say you’re tracking realtors, who are all the different realtors you’ve dealt with, and the same person will have their name misspelled in many different ways, right? It’s a common enough problem that right outside this door, to your left, there’s a booth of a company I spoke to that specializes in that structured data cleaning problem with ML – it’s a real issue.

Will Thompson: And then is there a way to address that just using legacy systems, or is there no way?

Andy Mahdavi: I think the legacy systems that I’ve seen have a lot of room for just legacy improvement, without ML. One example: legacy systems that are customer facing frequently lack validation – meaning, in this particular case, don’t have people enter the realtor’s name from scratch every single time. Use a validated address book to enter it.

Will Thompson: Right.

Andy Mahdavi: So I think on the legacy side, there’s still work you can do, if you want to maintain those systems, without trying to, you know, again, bolt on that ML approach.
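As an illustration of the validation Andy describes, here is a minimal sketch of resolving a hand-typed realtor name against a validated address book rather than accepting raw keystrokes – the address-book contents, the cutoff, and the function name are assumptions for the example, not a real system:

```python
from difflib import get_close_matches

# Hypothetical validated address book: canonical realtor names -> records.
ADDRESS_BOOK = {
    "Jane Smith": {"license": "CA-0123", "email": "jane@example.com"},
    "John Q. Public": {"license": "NY-4567", "email": "john@example.com"},
}

def resolve_realtor(typed_name: str, cutoff: float = 0.85):
    """Map a hand-typed name to a canonical entry, or flag it for review."""
    match = get_close_matches(typed_name, ADDRESS_BOOK.keys(), n=1, cutoff=cutoff)
    if match:
        return match[0], ADDRESS_BOOK[match[0]]
    # No confident match: make the user pick from the book instead of
    # creating yet another misspelled variant of the same person.
    raise ValueError(f"Unrecognized realtor name: {typed_name!r}")

print(resolve_realtor("Jane Smtih"))  # close misspelling resolves to "Jane Smith"
```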

Will Thompson: So maybe talk about the unstructured data problems that we’re facing.

Andy Mahdavi: Yeah, it was really interesting to hear you talk about the pains of using some of this data. You were talking about how sometimes in trading, PDFs might be used to feed into alpha-improvement models. We’ve seen a plethora of vendors – we saw a system on screen an hour ago, for example, the QUIQspread tool used by Moody’s – and if you go to a mortgage conference, you’ll see vendors taking on this unstructured data, and essentially the pitch is: ‘Why build your own document parser or image parser? Use our service, because, you know, we’ll automate that for you.’

And generally what we found at States Title is that, while standard forms could very well be processed by vendor systems and those have a place, the plethora of data that the consumer sends us – rotated driver’s licenses, images with parts blocked out, all kinds of stories I could tell about that quality – really still needs that human in the loop that we were talking about earlier, continually improving these models. So there’s not going to be a pure off-the-shelf, solve-it-all solution for unstructured data for a while.
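A sketch of that human-in-the-loop pattern, assuming a hypothetical parser that returns a confidence score – the threshold, the stand-in parser, and the reviewer hook are illustrative, not a description of States Title’s pipeline:

```python
REVIEW_THRESHOLD = 0.90  # below this confidence, a person checks the document

corrections_log = []     # human corrections kept as training data for the next model

def process_document(doc, parse_fn, review_fn):
    """Auto-accept confident extractions; route messy documents (rotated or
    partially blocked-out driver's licenses, etc.) to a human reviewer."""
    fields, confidence = parse_fn(doc)
    if confidence >= REVIEW_THRESHOLD:
        return fields
    corrected = review_fn(doc, fields)        # human fixes the extraction
    corrections_log.append((doc, corrected))  # feeds the next model version
    return corrected

# Illustrative use with stand-in parser and reviewer functions.
fields = process_document(
    b"<image bytes>",
    parse_fn=lambda d: ({"name": "JANE SM1TH"}, 0.62),
    review_fn=lambda d, f: {"name": "Jane Smith"},
)
print(fields)  # -> {'name': 'Jane Smith'}
```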

Will Thompson: Right, but from a compliance standpoint, there’s more risk in automating compliance operations than there is in automating ad serving, for instance – you get something wrong, you’re going to get smacked, right? So the question is: how do you balance that? Obviously human in the loop is one thing you can think about, but which parts of compliance can be automated or moved to new systems, and what needs to stay as it is for a reason?

Andy Mahdavi: So to me this is a really important and, I think, very interesting topic. I think there are two types of compliance involved. One is in a setting where data science systems have been making decisions about consumers for the past 10 years. There – you all know your companies have model risk offices, which are used to looking at models and putting them through tests – the question for them is: now that machine learning is here, actively learning, learning every day and essentially releasing a new model version every day, how do we do compliance?

And there, I think the solution is compliance as code, meaning that instead of vetting each model version one after the other, you vet the code that is actually making the decisions and then allow it to self-improve on its own. So that’s an evolving field in the mature areas. And then in the States Title area, where using any kind of ML in our product is really new to regulators, it’s about education, bringing the regulator along: ‘Here, look at this system. They use it in this other area of insurance, and we would like to use it.’ So really, making friends with the regulator, we found at States Title, is a really good thing for us to try to do.
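One way to read ‘compliance as code’ is as a set of automated checks that every retrained model version must pass before it is promoted. The checks, metric names, and thresholds below are assumptions for the sketch, not anyone’s actual controls:

```python
def run_compliance_checks(metrics: dict) -> list:
    """Automated gate run on every retrained model version.

    `metrics` is assumed to come from the team's evaluation pipeline,
    e.g. {"accuracy": 0.94, "approval_rate_gap": 0.03}.
    Returns a list of failures; an empty list means the version may ship.
    """
    failures = []
    if metrics["accuracy"] < 0.92:            # accuracy must not regress below the floor
        failures.append("accuracy below approved floor")
    if metrics["approval_rate_gap"] > 0.05:   # stand-in fairness check across groups
        failures.append("approval-rate gap exceeds tolerance")
    return failures

# Example: block promotion of a daily retrain that fails any check.
if run_compliance_checks({"accuracy": 0.91, "approval_rate_gap": 0.02}):
    raise SystemExit("model version rejected by automated compliance checks")
```

The vetting effort moves from each individual model version to the checks themselves, which is what lets a daily-retrained model stay inside an approved envelope.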

Will Thompson: That is very interesting. So let me ask a probably stupid question – I’m not an actual engineer or IT expert, I’m a journalist. It just seems patently crazy on its face that, for example, one of the clients we work with is still running mainframes all these years later, and we sort of accept this as a fact of life – that these things are just always going to be around, kind of like the subway. Is that just how it’s going to be, or are we ever going to get to a point where, as you were saying, we fully overhaul the systems?

Andy Mahdavi: So our company’s based in San Francisco and there’s a lot of software engineers, as you might have heard, out there. And to software engineers, what’s most fun is creating something new. They love that much more than working on something that’s existing. So you get this bias towards ‘Let’s build it from scratch because that other thing is legacy.’ And what you find is, when you look at quote unquote legacy, you actually find a lot of cool stuff in there from time to time, right?

And so you have to find that balance: where is it really not serving my use case and I need to build the new stuff, versus, ‘Oh, actually, they had some pretty great ideas 10 years ago that we didn’t even think of when we were thinking out of the box, and this system is really good to go.’ You’ve got to find that balance in your company.

Will Thompson: Well, with that, what a great way to put a bow tie on this. Thank you, and let’s get a hand for our panelists.