Imagine if, when the ICE (Internal Combustion Engine) first came out, there had been a law saying it can’t be used where it discriminates based on age or disability. So no cars, because they can’t be driven by children, blind people, etc.

Now let’s look at approving loan applications. We have 10 applications that are right on the edge of qualifying. The world’s most expert loan approver is given them and approves 5 and declines 5.

Now we give them to 10 different experienced loan approvers. They won’t match. They won’t match each other or the world’s expert. Not just on which, but how many. So what is the correct solution there? There isn’t one. Put all 10 people in a room to argue it out and you’re going to get 12 solutions.

A.I. evaluating loan application

So how do you even measure if the A.I. is performing correctly in this situation? You can’t. You can say it’s in the range of acceptable, but that range then means differential treatment is acceptable, to an extent.
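
To make "in the range of acceptable" concrete, here is a minimal sketch. It assumes you have the decisions of several human approvers on the same applications; the function names and the data are made up for illustration, not taken from any real lender.

```python
from itertools import combinations

def approval_range(human_decisions):
    """Smallest and largest number of approvals given by any one human."""
    counts = [sum(decisions) for decisions in human_decisions]
    return min(counts), max(counts)

def mean_pairwise_agreement(human_decisions):
    """Average share of applications on which two approvers made the same call."""
    scores = [
        sum(a == b for a, b in zip(x, y)) / len(x)
        for x, y in combinations(human_decisions, 2)
    ]
    return sum(scores) / len(scores)

# Hypothetical decisions (True = approve) from three approvers on four applications.
humans = [
    [True, True, False, False],
    [True, False, True, False],
    [True, True, True, False],
]
ai = [True, True, False, False]  # hypothetical A.I. decisions on the same four

low, high = approval_range(humans)
ai_in_range = low <= sum(ai) <= high          # is the A.I. count inside the human spread?
agreement = mean_pairwise_agreement(humans)   # how much the humans agree with each other
```

Note what this does and doesn’t tell you. It can say the A.I. approves about as many as the humans do. It can’t say which applications were the right ones to approve, because the humans don’t agree on that either.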

It gets even more complicated. When an A.I. is trained it ends up with millions to billions of parameters. It is then fed a prompt that guides the decision making. How on earth do you determine if bias was applied across those billions of parameters as applied to the text of the prompt?

And running the same set of applications the next day will likely get slightly different results. Same input, same A.I., slightly different approval results. Reword the prompt with the same requirements but different phrasing and, again, different results.
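
One thing you can at least measure is how often the answer changes from run to run. The sketch below is only an illustration: `noisy_decide` is a hypothetical stand-in for a real A.I. call, and the scores and the 650 cutoff are invented.

```python
import random

def flip_rate(decide, applications, runs=5):
    """Fraction of applications that did not get the same decision on every run."""
    flipped = 0
    for app in applications:
        decisions = {decide(app) for _ in range(runs)}
        if len(decisions) > 1:
            flipped += 1
    return flipped / len(applications)

# Hypothetical stand-in for an A.I.: applications near the cutoff get a noisy answer.
rng = random.Random(0)

def noisy_decide(app):
    return app["score"] + rng.uniform(-5, 5) >= 650

def fixed_decide(app):
    # A fully deterministic rule, for comparison.
    return app["score"] >= 650

applications = [{"score": s} for s in (600, 648, 650, 652, 700)]
noisy_rate = flip_rate(noisy_decide, applications, runs=20)
fixed_rate = flip_rate(fixed_decide, applications, runs=20)
```

The deterministic rule never flips. The noisy one can only flip on the applications right at the edge, which are exactly the ones this whole argument is about.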

Alchemy

Not to downplay the research and science behind A.I., but when it is all put together, using it is in some ways more alchemy than science. There’s the case where a team mistakenly trained an A.I. repeatedly on the same data over a weekend - and the result was the first A.I. that could handle math problems correctly. By being told the exact same thing multiple times.

There’s no way, with what we know today about A.I., to determine if it discriminates. Or better said - it almost certainly does discriminate. We just don’t know in what ways and to what degree. Anything you feed it, it will make use of.

Maybe borrowers whose last name starts with “Th” are better credit risks. Then it discriminates in favor of Mike Thomas which means it discriminates against Mike Smith. It will find every correlation it can and use each to the extent it makes sense - to the A.I.

And the larger the dataset and the more it encompasses, the more the human biases behind the decisions in that dataset will show up in the A.I. With that said, the A.I. dataset will include results, so if humans favored borrowers whose last name started with “Th” but those borrowers defaulted at a higher rate, the A.I. knows to downgrade Mike Thomas.

What is Feasible

First off, you absolutely can have companies list what decisions are made by A.I.

Second, you can require varied enough datasets for training. Deciding loan approval across Colorado based on loans in Boulder - no. Based on a random set of loans from across the state - yes.
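
As a rough sketch of the difference, the snippet below draws a random training sample from a statewide pool instead of taking one city’s loans. The records, cities, and sample size are invented for illustration.

```python
import random
from collections import Counter

def statewide_sample(loans, k, seed=0):
    """Draw k loans uniformly at random, without replacement, from the whole pool."""
    return random.Random(seed).sample(loans, k)

# Hypothetical pool: 50 loans from each of four Colorado cities.
cities = ["Boulder"] * 50 + ["Denver"] * 50 + ["Pueblo"] * 50 + ["Grand Junction"] * 50
loans = [{"id": i, "city": city} for i, city in enumerate(cities)]

boulder_only = [loan for loan in loans if loan["city"] == "Boulder"]  # the "no" case
sample = statewide_sample(loans, 40)                                   # the "yes" case
city_counts = Counter(loan["city"] for loan in sample)                 # where the sample came from
```

Checking `city_counts` after sampling is a cheap sanity test that the training data isn’t quietly coming from one place.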

Third, you can limit what demographic information can be entered. So loans cannot identify race, ethnicity, etc. But this is a two-edged sword. If loans cannot identify religion, then members of a religion who are very good credit risks do not get the higher approval rate that, with discrimination, they would get.

And keep in mind it’s not just approve/decline. A.I. can also determine the interest rate, what loan types are allowed, etc. It can build a custom loan that makes the most sense for that individual. But the more the A.I. knows, the better it can customize.

Fourth, for now require A.I. users to measure the treatment/impact of the system in operation. And come up with valid measurements. For example, the question is not whether it approves or declines loans to white people more so than others. The question is whether the company has the same late payment & default rates for white people as for others.
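
Here is a minimal sketch of that kind of measurement, assuming each loan record carries a group label, whether it was approved, and whether it later defaulted. The field names and records are hypothetical.

```python
from collections import defaultdict

def default_rate_by_group(loans):
    """Return {group: fraction of approved loans that defaulted}."""
    approved = defaultdict(int)
    defaulted = defaultdict(int)
    for loan in loans:
        if not loan["approved"]:
            continue  # declined loans have no repayment outcome to measure
        approved[loan["group"]] += 1
        if loan["defaulted"]:
            defaulted[loan["group"]] += 1
    return {group: defaulted[group] / approved[group] for group in approved}

# Hypothetical records, invented purely for illustration.
loans = [
    {"group": "A", "approved": True, "defaulted": False},
    {"group": "A", "approved": True, "defaulted": True},
    {"group": "A", "approved": False, "defaulted": False},
    {"group": "B", "approved": True, "defaulted": False},
    {"group": "B", "approved": True, "defaulted": False},
]
rates = default_rate_by_group(loans)
```

If the default rates come out roughly equal across groups, the system is holding everyone to about the same bar. If one group’s approved loans default much less often, that group is probably being held to a stricter one. Late payments could be tracked the same way.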

Applicants for loans, jobs, etc. will have disproportionate numbers that do/don’t qualify based on age, race, etc. It’s the nature of the beast that you don’t have proportional applicants with the same spread of income, skills, etc. across each demographic. So an accurate system will have differential treatment.

The question is: is it getting it right, with equitable treatment regardless of race, gender, etc.?