2018 NYU Stern FinTech Conference: Panel 2: Using Alternative Data for Credit Assessment and Payment

{Music} – Alright, so I’m Foster
Provost, as I guess you know. And we’re at the second
panel of the afternoon which is using alternative data for credit assessment
and payment. On the panel, we have, I guess from your left to right, we have Joe Breeden, who’s
the CEO of Prescient Models, Prescient Models, yes. We have Alberto Corvo, who
is the CEO of Motive Labs. And we have Eric Starr, who’s
a partner at CapX Partners. – Good afternoon. – Hello. – So good afternoon everyone. – So um – [Eric] Our moderator is the
lead singer of Mean Reversion. – Oh thank you, you
mean reversion, yes. You got to go check it out. Yes, so besides my main job
which is my singing career, I’m a machine learning guy,
and for 25 years or more, I’ve been looking at building
machine learning models on alternative data for a
wide variety of applications for some of the biggest
banks in the world and for a bunch of startups. But I’ve never actually worked
on using alternative data for credit assessment
and payment. So I’m not really
gonna tell you much about my opinions on things. I’ll leave it to our
panelists. One thing that is out there
that may be very helpful for some people, is we
did publish a case study that’s intended for
teaching purposes on using machine learning
on Lending Club data to be able to see if
you can build models to decide how you should
build out your portfolio of Lending Club loans. So you can find that if you
just sort of Google my name and Lending Club. But that was
more an academic case study than a real application. So let’s start talking
about real applications. And I think maybe it’s
important, and a good way to start is by saying what we’re really
talking about here. And so what are these
alternative data? How are they different
from traditional data? What applications are
we really talking about? And we’ve decided
that we’ll start with consumer, with
consumer credit. So, maybe Alberto, can you
tell us a little bit about, like different ways in
which alternative data can improve credit assessment and other things
as well if you like? – Yeah. So, hi everybody. So the story here is
working with banks. And having been in the
industry for a while, the challenge has always been, there are some markets where
there’s a lot of access to what we’ll call
traditional data, or data to do credit assessment
of corporates, of consumers,
or for lending goals. But the reality is that
there’s a lot of markets, and actually I would
argue it’s the majority of the world population, where there’s not
really the type of data that you would imagine, like payment data,
or borrowing data, or timeliness of payment. And so, more and more
banks and lenders at large have been looking for, okay, what are other ways
for us to be able to assess the
creditworthiness of someone that either arrived into
the country a week ago, or has just never been banked? So we need to, you know,
figure out other ways. And we’ve been
working with banks in the Emirates, in
Brazil, in India, to bank the unbanked and
to create credit for people based on social media, based on other
types of approaches, some of them are public,
some of them are less public, but it’s how do you grab data
that is not the classic data and then extrapolate: okay, is this a good payer? Is this a bad payer? And also, how do you
actually use data, non-traditional data, like the
FICO score in this country, to extend the credit and
actually understand and adjust on a constant basis. Because a lot of times, what
happens is that, you know,
they pull your credit data once. But then, unless they pull
it again, they don’t know that maybe a lot of things have happened, so there’s a constant
adjustment that needs to be done. So, it’s really two
tendencies: extend credit, and assess and continue
reassessing credit. That’s what we’ve been
working on for a while now. – And Joe, you’ve
been investigating
a bit the difference between non-standard data and what you called
near standard data. I wonder if you can
tell a little bit about, a little bit about that. – Sure. We do work with clients
around the world, but, you know, focus
on the US for a minute. And there are different
definitions of what
alternate means. Clearly, some people have
the sense that anything that isn’t from the credit
bureau, that isn’t LTV, well that’s some kind
of alternate data,
and that could be. Others would say, what’s social
media data, and you know, what’s your Airbnb rating,
and things like that. Mine’s pretty good. So, you know, that’s
interesting too. But when we work with lenders, we work with lenders
of all sizes, big ones, small
ones, cornershop. And, you know, on the one hand, they struggle with
things like the cost of refreshing a FICO score. Their FICO scores may be out
of date since origination on a four year old auto loan. So ironically, one opportunity
for using non-standard data is just anything that can
fill in for more costly data. And certainly in
the underbanked, that’s one thing
that’s happening is that FICO scores are
not really applicable for the thin-file
and the underbanked. So what does that mean? What do we look at? Well, one of my roles is I
sit on the board of directors for a company called Upgrade, which is a FinTech that
originates personal loans. I’m not gonna say anything
that they do specifically, but just sitting in that
universe of, you know, Lending Club and Upgrade,
and these others. Often what they’re doing is
what might be non-traditional, but it’s really kind
of near traditional: when you apply for
a loan, they’d like access to your checking account. And they’ve got an
algorithm, sounds fancy, but really they’re
just going through to figure out the
payment frequency and
the volumes monthly and can they figure out
what your paycheck is? Well, most of the time, yes. And so is that non-traditional
alternate data? Well, not really. Your regulators won’t
care, won’t mind that you’re doing that,
depending on the regulators that they have to talk to. So, some of those things
are not too surprising. Another example
is a credit union. It’s surprising how
many credit unions are actually subprime lenders. So this is a credit union who
is 80% subprime auto loans. And one of the things
they look at is, they have a contract with a
company that can tell them if you’re making your
insurance payments. And if you’re not making your
insurance payments on the car, there’s a good chance
you’re not gonna pay for the loan on the car. So, risk goes up, right? So those are kind of,
again, regulators won’t mind, but they’re looking at that. It’s not at all like
the more exotic type of information, we’ll talk– – We’ll talk some more
about that in a little bit. Okay, so, and I think actually
something that is, you know, we need to talk about
a little bit first before we talk, before we
dig into some of the details is just, it seems that
based on what you guys have said already, that
we have a difference between the opportunities
and the restrictions in the US and other
developed countries versus in the
developing countries. So I wonder if you guys could
talk a little bit about that. – Yeah, so, one thing
that you’ve got to think about when you talk about
alternative data is that there’s also a dimension
of trustworthiness
of the data source. So, you have, you know,
you get a FICO score, it’s a FICO score,
it is what it is. But then you have trusted
data, semi trusted data, trusted sources,
semi trusted sources, the timeliness of when you get
the data, and how you get it, that factors in. So, when you go to emerging
markets like we’re doing a lot, it’s really, most of the data,
it’s dirty to start with. So there is some signal,
but there’s a lot of noise. And then, you gotta get
a lot of data points about an individual and
about the population to be able to make
that inference. Whereas, I think
that when you go to more evolved markets in
some ways, you can have, okay, I have something
that I can hang my hat on. It might be old, but at
least it’s something, and then it becomes a
work of updating. Okay? This guy has started
slowing the payments, or I can tell from
his LinkedIn profile that he lost his job, or
that he had one more kid, and then all these things
start to factor in, so the spending
profile is different, so how do I adjust my credit? But there are some people, for one of our
partners in Dubai, it’s like, there’s laborers that
arrive the week before, and they gotta buy a
car, so now good luck. How are you gonna do that? And then the data you can
extract is dirty by nature, and sometimes people
try to game it. So there’s a lot of machine
learning at work here: how do you eliminate
a lot of the chaff and try to get the wheat that
is actually the information. – And what about, for instance, from the
regulatory perspective? I have had some colleagues
who did a study, looking at improving
credit scoring using social media data. And we’ll get a little bit
more into some of the details of what sort of
data in a minute. But, you know, if it’s
sort of fine grained data on what you’re doing
on social media, or what’s on your
profile, right? Which as far as I understand it, is being done in some
countries I mean, but it can’t be done everywhere, and so are we seeing
that kind of difference between what’s being done in
different parts of the world? Do we know? – Well I do know that
there are lenders, especially in what we would
call developing countries where there’s, as you were
saying, less access to data, and they necessarily have to
use that kind of information. And there’s a social
benefit to that, because there’s more lending to
people who wouldn’t otherwise get it. In the US, you know, people
think that finance companies can be more creative because
they don’t have bank regulators sitting on their doorstep. At the same time, I know some
of those finance companies. They want to resell those loans
to people who are regulated. And so they say, no,
we stay away from that ’cause we want to resell to banks and credit unions and others. So there’s, I feel like
there’s less actually going on than the marketing would
say, to tell you the truth, that some people who make
a lot of noise about it may not be using it quite
as much as they say. – Alberto, you had said
something to the effect that maybe the emerging banks,
the banks in emerging markets, may be further ahead than the banks in the more developed markets. – Yeah, I mean, they have to
make a virtue of a necessity. And so, like it or
not, and regulators, right. Regulators are a
little more lenient because they understand the
reality of what’s going on. And if they wanna inject
some credit into the system, that’s the only
way to go about it. It’s also a very
important point. A lot of these lenders
in emerging markets, they carry the risk;
they don’t package it, and there’s no secondary
market for the risk. So at the end of the day, it’s,
you know, I’ll get my data, I’ll try to figure out
a way to price it, like, standalone,
and then I’ll carry it, and I’ll suffer the
consequences or make the money. And yeah, it’s
really night and day. And it’s incredibly interesting
what these people are doing. And the accuracy
they are getting, and the type of data
they use, I mean, they were
looking at whether the first purchase that people make after having
gotten a line of credit is fuel for the car, and if so they know
that’s already a bad credit. They can’t tell you why,
they can’t explain it. But the numbers
say that the risk all of a sudden went up. And so again, it’s a
constant improvement. And yeah, it’s miles
and miles ahead. Also, there’s not
as much need here. – So we’re talking
about consumer credit. Let’s move on
to lending to businesses, because although
some of us I think, immediately start to
think of consumer credit when we think of credit. Of cause that’s just,
you know, only a portion of the credit market. So Eric, you’ve looked a good
a bit lending to businesses, can you tell us a bit
about the application of alternative data,
and maybe whether there’s a particular
segment of businesses where it
adds particular value? – Yes, so, at CapX, we
are a balance sheet lender, so we lend directly to
commercial customers. For 20 years, really we focused on what’s called
the middle market. And these are pretty
large businesses within the context of FinTech, two to $25 million deals. But that being said, we
do look at some of the, we’re starting to
look more and more at the small and medium
sized enterprises where there’s much more
alternative data sets available. I think of your
Yelp scores or Glassdoor, things of that nature. Not Glassdoor, but Yelp,
where you can get some lift in your credit modeling, your predictions
for credit quality. But that’s not
where we live a lot. Where we live is, again,
in the middle market. And
I’ll pick up on what Joe
touched upon earlier, and that is alternative versus kind-of-alternative data. The middle market, think of it as a
third of the US economy, those businesses. The amount of lending that goes in there is enormous. But what’s really challenging, and what makes it
really difficult, and why you do not hear about
it at FinTech conferences, is that nobody’s lending $25 million
on a FICO score. But you need the
traditional information. Now the problem is that, that
means financial statements, and these companies are private. So if I ask five companies
to send me their financials, I’m gonna get five
completely different sets of data in different forms. What do I do with them? Well, the model is, you
hire credit analysts that type the data
into your model, and then you start doing
your credit analysis. Well, what’s unlocking this,
what’s happening now, is we’re finding ways, through the use
of AI and machine learning, of capturing this
analog, unstructured data of traditional information and organizing it in a way
that can be crunched, that can be analyzed
using technology. So, it sounds like, oh my God,
this is like 10 years ago, but it’s one third of the US
economy’s lending marketplace, and it’s just now coming
into this FinTech space, so you’re gonna see a
lot of growth in this. And we’re starting with
the traditional data, so that’s the traditional stuff. Then, what’s also happening? Because it’s almost easier to capture alternative data than it is to capture
this traditional data, we’re seeing a lot of
growth in that area as well, simultaneously. And that information
is things like, some of it we’ve heard
about this afternoon, but what’s going on
in employee growth: the companies, are
they hiring more people? Are they releasing new products? Are they attending conferences? Are they speaking
at conferences? And all this data is now,
through the use of AI, being scraped and put
into separate datasets which then can be incorporated along with now this digitized
financial information to arrive at credit
scoring models that are far more automated
than they have ever been before. So, that’s what’s going
on in the middle market. And again, in the small and
medium sized enterprises, there’s still a lot
of the consumer-facing data. Again, the Yelp
scores and whatnot, that are being used
and have demonstrated lift in that space. But again, that’s not an area where we’re currently living, but there’s a lot of activity
going on in the middle market, and it’s really
untapped, and you’re about to see a lot
of growth in that space. – And so, do you see, I mean, I don’t know how many
years ago, 15 years ago when the hedge fund, you
know, sort of industry was kind of transformed
with the ability to go and gather different sorts of data, like, you know, the famous
story being, you know, looking at how full
the parking lots were from satellite images. There’s many examples
that are similar to that. Are we seeing that
kind of stuff start to, or maybe it’s already been
happening for a long time in issuing credit to businesses? – No, I think what
you’re seeing is, I’m not in a room with
middle market lenders, right? Nobody here is a middle
market lender? Okay good. – [Foster] Let’s
talk about them. – The vast majority of
middle market lenders are ostriches who are running
on the treadmill to oblivion. They have no idea that this
stuff is coming down the pike. It’s relationship lending, it’s
going out for more coffees and lunches and dinners. And the fact that data and
AI and machine learning can essentially replace a lot of the jobs and roles that are traditional is
just completely foreign. So, there are some. And by the way, you can
go outside of this country; China does a pretty
good job of this. But the US, or North
America, it’s really slow, it’s really, really slow. And we’ll see, I mean, that’s
why, I mean, I love it. I love this stuff, because the
opportunity is there. Remember, it’s a third
of the US economy that I’m talking
about, it’s gigantic. – So I thought we might
give something a try. So a lot of times when we
have, when we talk these days about AI and machine learning,
and data, and all that, we sort of talk at a rather, nobody here of cause, talk about, to talk in a
rather superficial level, I was wondering whether
we could sort of dig down and start to ask, what about
some very specific examples of specific sorts of data,
for specific applications, and why, why does the
alternative data actually allow us to do things better. Anybody wanna take a stab at it. – Sure. So, as I was talking about it, the challenge is people have
never been in a banking system, in a banking ecosystem, and you wanna bring them
into this ecosystem, and how do you
extend that credit. So there’s just two steps. First of all, it’s making sure you can connect to
them in some way, and make it interesting
for them to do it. And the second step is, okay, at what price this
becomes a good idea, and what’s their
ability to repay. And so it’s not only
alternative data in the scoring. And as we mentioned a bunch, and depending on
the type of client that you’re going after. It’s social media data
which can be from Facebook to LinkedIn to having an app that they gotta download
into their phone, because obviously, they
skipped a generation of phones, and everybody has smartphones. And you say, okay if you
wanna be in the system, you gotta download this
and the app spice to that. But that’s, but also is, okay, now you’re walking
into this kind of, you’re a small-time contractor, and you’re doing work, and then okay, you’re going
to a place where you can buy material for your construction. Okay, so instead of
trying to float it with the money that you’re
trying to get from the client that is paying you and paid you, how about we look
at your population, people similar to you: you’re
gonna need X amount of money, let us float it to you,
and this is how much you’re gonna pay, and this
is how you’re gonna pay, and then it becomes seamless. This is up and running,
actually, this is happening, but people are
pitching this actively, or like people that do It’s kind of like,
kind of services. Okay, you’ve been driving
X thousands of miles. From our analysis, you’re
gonna need to repair the car, and this is how much
you’re gonna need, and here’s, we’re
gonna extend you credit because we know that you
work X number of hours,
actually bringing back. So, those are real life
examples that are happening, and in the consumer
credit market, that here we’re not seeing
because there’s no need yet. But that’s a big
source of revenue because the margins
are actually good and the algorithms are
actually pretty accurate, time will tell. But I would like to see five-year
defaults on these things a few years from now. What was it? – So I’ll just chime
in there for… Please think about
alternative datasets with respect to lending, and let’s make it simple. There’s two types of
loans that are made. One type is because that
borrower is growing, and they need access to capital, whatever that
growing means, right? It could be personal,
it could be consumer, it
could be commercial. The other is, they
need liquidity. They need money because they
don’t have enough of it. Now, you would
wanna lend liquidity because they have capacity
to borrow more, right? They have the liquidity,
it’s just not tapped. So there’s untapped liquidity, you don’t want to lend if they don’t have
any more liquidity. So now you can use
alternative datasets, again, to get back to
answer your question. You can use alternative datasets to understand where
that borrower is in that continuum of
growth versus liquidity. Where it gets more
challenging is, well, do they have the availability
on the liquidity side? But you can use alternative data to figure out if you’re
on the growth side or the liquidity side
of that continuum. On a consumer side, you can
look at social media data: are they taking more vacations? Or, on the flip side, do they
have more children? And now it’s, you know,
it’s growth in one way but not necessarily in another. And so you can use social
media to help you arrive at whether or not this is
a growth type of customer or a liquidity type customer. – Thanks. And when you look
in other industries, and the sorts of data that are, where we consistently
see big improvements when you actually
apply, I’m gonna use the application of machine
learning in this case, when you apply machine
learning to data, very often it’s where
you kind of move to a different sort of data,
and that is very massive, very fine-grain data. So we saw this in
online advertising, when you went to things like every webpage that people visit. And you look at
that so fine grain, and it has a real, it describes at a
very fine granularity, exactly what people are
thinking about sort of, because that’s what
they’re clicking on, I mean up to a
certain extent, right? You look at when the
mobile-data started to become available,
and now you see where people are all
day long, you know? Except for those of you who have the ability to turn
your phones off. You think turning
your, like, GPS location off keeps people from finding
out where you are? Sorry, if you’re
on the Stern Wi-Fi here, all there needs to be
is one other of you who has their GPS
on, and then we know where everybody is
on this GPS, right? You look at banks and, you know, if you move away from
aggregate sociodemographics and so on about individuals
and you start to put in, what are the actual merchants that they’re
transacting with, right? In all of those cases, we
see big jumps in the ability of the models, right? And I guess what I don’t
know is whether these sorts of ultra-fine-grained data, for the purpose of getting
lift, whether we’re getting, I mean, I understand we
have these two different things that we were talking
about earlier. You just don’t have
enough data, a thin file, business or a thin
file individual, and then you have somebody who potentially has a
lot of traditional data but we may be able to
get a lot more lift. And so the, yeah. So I wonder if we could talk
a little bit maybe, Joe, about getting more
lift from models, whether it’s through
the kind of data that I, you know, I’ve seen
giving a lot of lift or through some other way. – Yeah, in fact, there was a, well there was a
talk this morning where we heard
about increased lift through some alternate data
compared to just FICO scores, and we think about this in
two dimensions, if you will. A lot of people, when they
are looking at alternate data, well that’s pretty recent stuff. And these, you tend to have
kind of point in time models. When we talk about lift, it’s
on a narrow slice of time, how much better can I
distinguish different consumers. One thing we like to look
at is change through time over long time periods. What can we say about how
credit worthiness has changed, and our ability to capture that? I’m a physicist by training,
and we like pictures. So I’m gonna grab
the clicker there, then let me have
a couple pictures. I like pictures. (chuckles) So this picture is a
couple different studies we did glued together. The first study was
done at the end of 2005 which was kind of
an ominous moment to do a credit cycle study. In fact, I at the
time in early 2006, I was showing that
to everybody I could, and people nodded
nicely, so forth. But what this slide shows you, this is specifically
mortgage data, and it’s a couple different
studies sutured together. But that blue line
is the credit cycle. Now it’s not FICO score, it’s taking mortgage loan
performance by vintage. The axis here is origination
date, the x-axis. And so it’s saying for loans
originated in a certain date, no matter how old they are, whether they’re
recent or old loans, you normalize for the
economic conditions, you normalize for
how old the loan is, and the life cycle effects. And you get down to a kind
of intrinsic credit quality, but we don’t know why. We only know what. So this is a good way to measure what was originated
industry-wide. And you see, you know,
a huge cycle here. The axis on the left, that’s actually change
in log-odds of default. And so if you do a little math, you’ll find that
that’s a huge range that these loans
are varying through. So what this says
is that in 1994-95, at this sharp drop, bad
quality loans were booked right before that kind of near
recession that we had in ’95. Then in 2000, 2001, end of 1999, another pretty dramatic
drop in credit quality. And when we started
working a lot with vintage models like
this, everybody was saying, what’s wrong with
our 2000 vintage? People didn’t understand because the FICO
scores hadn’t changed. And so, in a static
average FICO score environment, what’s going on? Where are the bad loans coming from? And again, there are
a lot of explanations about the mortgage crisis, and they’re all
correct, honestly. Was it the moral hazard
of securitization? Yes. Was it the collapse
in house prices? Yes. But you’ll notice the credit
quality deterioration began back in 2004-2005. We were well into bad
loans by end of 2005. And this is before those
other things happened, this was still a good
economy but with bad loans. And you’ll also notice the
volume being originated, that’s that red line. When we got good loans,
we’re getting high volume. When we get the bad loans,
we’re getting low volume, and that happens
all the way through. So why is that? And it’s important to note that these drops
in credit quality are always right
before the recession, they’re not during
the recession. In fact, we get good loans
during the recession. So this, you could call
this alternate data, but it won’t help you
in loan origination. It’s not distinguishing
consumers. This is the Senior Loan
Officer Opinion Survey that the Fed runs. And I love the survey, it
comes out every quarter. I need to check and see if
it’s out yet this month. But this is one of
the best correlations to time series that I
see in this industry, because every time
that red line shows, every time consumer demand
is dramatically falling, the quality is bad. And so, and when demand is
high, the quality is good, average quality. So what this means is, we’ve run this,
and we’ve got a paper that was published
this year jointly with a researcher
at the Philly Fed, looking at this through
the 2009 crisis. And what this is saying,
is that if you take a slice of FICO score,
take a certain product, a certain slice of FICO, you still have a
distribution of risk, right? Because FICO doesn’t
know everything, and that’s what
this talk is about. But it’s saying, as you go
through the economic cycle, that distribution
doesn’t just increase or decrease symmetrically,
it’s asymmetric, that when the
situation is not right, the risky consumers are left
while the good consumers stay home. And for mortgage, “not right”
means interest rates are rising and house prices are
going up too fast. And that’s exactly what scared off the conservative consumers
and left the other folks. So we get this. And I said they
publish this regularly, so where are we now? It’s worth noting that
we are at a stall speed right now in the economy. The economy is doing great,
but from a lending perspective, the demand for
loans is at neutral, and the arrow is pointed down, and that’s for all consumer
and all commercial. So this is exactly the
point at which we see an inflection and
we start to worry. But to the point of this panel, what I would like to use
alternative data for, is to find these people who are good when
conditions are bad, because maybe they had to move, so they’re still the right
people at the wrong time. I’d like to lend only to those,
but there aren’t very many. Or to find the ones who, they
never really pay attention, they’re only good if the
economy makes them good, because when the
economy changes, well, then I’ve got a problem. And so how do I filter them out? And it’s not ever gonna
come from the bureau, because the bureau is the past. And there’s something
psychological here: demand for
loans ties directly to the psychology of who borrows. So we do need alternate
data to find this. – I 100% agree. The other thing– – Babe, thank you. (laughs) – Well, I could also disagree. – [Male] But. – But, but, but, but, but, but. There is a point where
the type of data and the way we have
approached data, and the way banks and
lenders thought about it was very punctual. So, you know,
something would happen, they would make an update
in your credit record. You would buy something,
here’s a transaction, you bought something. But a lot of things
happen in the middle. And what alternative data
is trying to do is trying to catch
what you are thinking. So you decided to
go on a vacation in, you know, and
spend that money, what do you know that
the bank doesn’t know? Is it that you got a new job, and this job involves
a promotion, probably, you made more money,
and you’re celebrating? And that’s what I think is the
big role of alternate data that still is not happening here in this country:
how do you fill that in? And you can extend it to
other things like suitability, when you talk about
asset management, you talk about insurance, you know, how do
you get this data that doesn’t come with a
transaction at the end? Because the banks thought they knew everything
about everybody. No, you only know
when I buy something. You don’t know what made me buy,
or what things I didn’t buy, what other options
I considered. So you think I went
for an X-grand vacation? You think I’m overextending? Well, maybe I was looking
for a two-X-grand vacation, which is two X
the overextending. So you wanna know
that the guy is crazy. So, that’s the thing,
and that will help in what you just said. Because you will
have to start saying, okay, this is what
people are thinking, and now that’s how
we go get them. – I just don’t wanna… credit cycle adjustment
is more macro, and it’s not necessarily
alternative data. It’s alternative data, but
it’s not the alternative data that we’re talking about. – [Joe] Exactly, this is meant
to show the opportunity– – Right right right right right. I just wanna make, I just
want to make that clear that there’s a distinction here. There’s credit cycle data and
then there’s alternative data. And I think one is, obviously, one is micro, one is very macro. One is macro, one is very micro. – So data science people think about things like predicting
future life events, which seems very relevant. I think the
infamous case of course is Target’s predicting that
someone will have a baby. I mean, they’re really
predicting current pregnancy, but, you know, you
could see it that, but this is gonna be a very
big change in somebody’s, you know, in somebody’s life. Wharton had a, oh this was maybe
half a dozen years ago now, but Wharton had a competition where there was a one of
these data, data competitions, where it was predicting
life events, and you know, you could find that there, I forget what the actual events
that were very predictable, like an obvious one
might be retirement. People just do certain things
leading up to retirement. And I imagine, these
are the kind of things that could possibly
address this opportunity. – Well, there’s actually been some good psychological
studies about the difference between a risk-taking psychology and a conservative psychology. And there’s some
evidence that that might be what we’re looking at here, that at certain points
in the economic cycle, the conservative
consumer stays home. And I always put this in the
context of a dinner party. If you’re at a dinner party
with friends, and you ask, “Is it a good time
to buy a home?” And some people
will say, “You know, “interest rates are going up, “I don’t like the
long rates right now. “It makes it more
expensive to buy a home. “I live in San
Francisco, and jeez, “there, the prices are up again. “I’m gonna stay where I am. “My little house is fine.” And the other guy
across the table says, “I never really thought
about it, I don’t know.” They are clearly
different psychologies. And so that may be
what’s behind this, is conservative
versus risk taker. – And just
to clarify: earlier, you were saying that the
FICO scores remain the same. Does that also mean
the FICO scores of the loan applicants,
the distribution over the loan applicants? Because I guess
intuitively, I would think that, all
else being equal, people with higher FICO scores are generally gonna be
kind of more conservative in their
risk-taking behavior, but no? – FICO is just a
measure of past luck. You know, you can be a risk
taker and have been lucky, right? – And in fact– – [Foster] Fortune
favors the prepared. – Yeah, it’s true, it’s both. And in fact, if you look
at average FICO scores through the crisis, they won’t
move until people default. So going into the crisis until
you’re well into the crisis and defaults are happening, you’re not going to get
a shift in FICO scores. So, these risks that
we’re measuring, you know, five or six or seven,
aren’t gonna be– – Just to
clarify, I didn’t, what I was really asking was
a more technical question which was what, do we mean the
distribution of FICO scores across the population, or the
distribution of FICO scores among those who are
actually applying for loans? – Well, this is the… Across the population
isn’t changing? – Right. – And no, actually,
the loan applicants, if you do this analysis for
a slice of loan applicants, and this is what
we did, we said, let’s just take the
prime consumers, 30-year fixed-rate mortgage,
the standard mortgage stuff. It looks by origination criteria
to be static, but it’s not. Yeah. – Anything more on
this before we move on? Alright, so I guess
I wanna make sure we save time for questions. Actually, why don’t we see
if there’s some questions. There are mics. So please go to the mic, and
just whoever gets in line at the mic gets to
ask the question. No fighting. Let’s use the mic because
they’re recording it, and they want to have
the questions on– – Can I ask a question? Okay, so you know.
– [Foster] Yes please. – I think a lot of you have actually referred
to sort of the behavioral or psychological
aspect of lending, right? So the reason why
relationship lending lasted for so long is,
there’s this ability to repay, there’s this risk of default. But another way of actually assessing this
in the right way is the willingness to repay, as
well as the stigma of defaulting. And in many ways, the
alternative data right now, the social media, is a
replacement or digitized version of a lot of the chit-chat
through which the relationship bankers actually found out about
the borrowers. So, you know, if we were to
actually go three years forward, and look back on
the alternative data that we’re not
currently harvesting, to really understand
the behavior of the borrower
or to incentivize them for the credit assessment,
what would they be? – What’s the question? – You know, what
would you look at? – Not quite sure what
the question was. – [Female] (mumbles)
alternative data. What kind of data are we
not currently harvesting to get that impact
in terms of an– – So,
if I may try to
rephrase the question, in terms of the kind of desire, I forget your word, you had a better word than I did. A desire to repay, a
willingness to repay. Are there specific
alternative data that may give us
insight on that? And also, do you maybe
even see big changes? You would imagine that if
everybody’s defaulting, why shouldn’t I default as well? So I guess my point is, would you think there’s the
same credit cycle aspect to that as well? – All I can say is in terms of
what data lenders would like, they are trying to
collect it all, NSA-like, in the amount of what they are
trying to collect. So which ones will give signal, where are the signals?
For the time being, the ones who have
answers to that are not sharing them. – I think people
are still exploring, and especially on the
issue of what’s stable. Take a couple of different
lenders I work with. One said, “We will not do
any alternate data sources
because we’re worried
about stability.” And I passed this comment
along to another group that we’re working
with and they said, “Those guys are chickens.” And he went on to
explain the difference. This second person used to work at a major, pretty
sophisticated lender, one
whose name would make us all say,
yeah, they’re pretty good. He said that for them to go from
data set, to model built, to validated, to deployed, well, just the deployment
part is 18 months, and the stuff before
that is over a year. And so the stability
concern is about, if I have something
that’s based on what brand of cell
phone you use, by the time I deploy that, the world has changed
and it’s useless. And the other guy, who
said we use everything, he said, “We have built
from the ground up
our in-house tech stack,
so I can go from data set
to deployment in three months.” And if the data changes, we’re– – So this is one of the reasons why this actually
is a FinTech panel, and not just a fin
data panel, right? Because in certain
cases, you actually need the tech in order to be able to take advantage of the data. Why don’t we alternate and
go over to this side, and we’ll try
to get as many in as we can before we’re forced to quit. – So banks have been
penalized in the past because of bias in
lending decisions. With the use of AI,
could you give examples of how you control for bias
in lending decisions that the model might make? Because it’s driven
off the data set. If you could give some real
examples, that would be great. – Well, what I’ve seen is people post-processing the outputs. You know, you’ve got to
look at correlations, and I wanna avoid
using some names here, but I knew a guy who worked
for a famous AI modeler, and his company developed a
fraud solution and so forth. And this student told
me that, yeah, actually, that famous fraud
detection algorithm was basically
detecting zip code. If you looked at its outputs
it correlated really well, but zip code wasn’t an input, so they had no problem with it. Well I don’t think
it’s that easy today. I know that they
would be looking at correlations on the outputs. But there are some
fancier algorithms
for backing into that. So I don’t have the paper
references (mumbles) – So it’s super important
with respect to bias. And by the way, let me know, and I can give you a bunch of
references on related things. I don’t know if they’re gonna
directly answer your question. But it seems like one of
the most important things is to be able to
provide explanations of the decisions that are made. People always wave their hands
about, oh, it’s a black box. That doesn’t make
any difference. You can explain the decisions
that are made by AI systems, whether it’s a
black box or not. And if, you know, if people
are waving their hands about the fact it’s a black box and we can’t explain
what it’s doing, then go hire somebody else. You know, because
basically, it’s the input-output behavior
that makes a difference, and
we should be able to explain the difference in cases where it’s
going to be having a big impact, especially in consumer credit, where it might be having a big impact
on somebody’s life. – Hi, one of my observations from the crisis back in 2009 is that credit models
are traditionally made looking backwards, and I’d like to have your comments on the use of alternative data to help with, like,
forward-looking risk indicators. – I’m actually a firm believer that all crises are
pricing failures, that what I saw from
mortgage lenders in 2005, and six, and seven, is
that they were pricing for previous loans in
the previous economy, not for the loans they
were booking today. They were basically
using roll rate models. And so, if you get
the pricing right, or let’s say it differently:
just using ordinary data, if they had been forward-looking instead of backward-looking
using ordinary data at the time, pricing correctly instead
of pricing subprime loans like prime loans, much of
that wouldn’t have happened. Now, of course, I think you can do better than that, potentially, with some of these other
data sources as well. And as my co-panelists
here have said, in other parts
of the market, you really can’t do thin
file lending very well if you don’t have
alternate data. Those sorts of things, so you can’t price in the
beginning without that. – Let me take my
prerogative as moderator because we’re just
about out of time and Carlos has been
waiting patiently there. And so, why don’t we take the
last question from Carlos. – So, one of the things
that I’ve been thinking about is one of the challenges of actually acquiring
this alternative data: whether to purchase the
data or to build systems that are gonna go
out there and try to gather this information
from informal sources, right? And then, once
you get this data, the usefulness that you
would get out of it, it’s gonna depend on the
predictive task at hand, right? Like it could be super valuable to predict for certain people, but not really valuable
for something else. And at the end of the day,
it’s like an economic decision of whether I should
invest in this system to gather this data or should
I purchase this data or not? And I’ve been wondering, like, how would you put
a value to this data? How would you decide whether
to build this system, to gather this information? Thank you. – I can tell you about
the people I work with. It’s all still
regarded as innovation. And so, as I was kind of
tongue-in-cheek saying before, we’ll know a few years from now if, in fact, the loans that
were originated and the credit that was extended based on these
models actually work out. And to his point,
you know, it’s all a pricing error. So, were you pricing the right
things at the right point, and to the right people? I think we’re too
early in the cycle. So far, they seem very excited, but you know everybody’s
excited until something happens. So, – Inevitably we’re
going to be wrong. It’s just the nature of it. Every model works for a period of time, until it
doesn’t. It’s just the nature
of what we do. We’re trying to be perfect, and there’s no way
we’re gonna be perfect. So the question about
how do you decide? That’s a cultural issue, it’s
not necessarily a data issue. If your
culture is that of innovation, then you’re always
going to be looking for the next best thing. And that’s a cultural thing. We’re not gonna know. We don’t know, we don’t
know, that’s why we explore. And through that exploration,
we hope we learn something. Sometimes we don’t, and we
waste our time and money. But without the effort, we’re
never gonna learn anything. So we’re exploring. – Yeah, and two years from
now, it’ll be different. And three years from now,
it will be different, it will be a
constant adjustment. – Okay, we’re out of time. Join me in thanking Eric,
and Alberto, and Joe, for the panel, thanks. (audience applauding) – Thank you. – Thank you.
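
The bias exchange in the Q&A mentions checking whether a model's outputs correlate with something like zip code even when that field was never an input. A minimal sketch of that kind of output check follows, assuming a list of model scores and a 0/1 flag per applicant for a held-out attribute. The function names, the sample data, and the 0.3 cutoff are all hypothetical and chosen only for illustration; this is not the method any panelist described in detail.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def audit_scores(scores, group_flags, threshold=0.3):
    """Flag a model whose outputs track an attribute it never saw as an input.

    The threshold is an arbitrary illustrative cutoff, not a regulatory standard.
    """
    r = pearson(scores, group_flags)
    return {"correlation": r, "flagged": abs(r) >= threshold}


# Hypothetical data: model scores, and a 0/1 flag for living in a given zip code cluster.
scores = [0.91, 0.85, 0.80, 0.35, 0.30, 0.22]
in_zip_cluster = [0, 0, 0, 1, 1, 1]

report = audit_scores(scores, in_zip_cluster)
print(report)
```

A strong correlation here would not by itself prove unfair treatment, but it is the kind of signal that would prompt a closer look at which inputs are acting as proxies.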
