How PayPal beats the bad guys with machine learning

As big cloud players roll out machine learning tools to developers, Dr. Hui Wang of PayPal offers a peek at some of the most advanced work in the field

machine learning robot touch screen
Shutterstock

When Amazon Web Services announced a new machine learning service for its cloud last week, it was a sort of mini-milestone. Now all four of the top clouds -- Amazon, Microsoft, Google, and IBM -- will offer developers the means to build machine learning into their cloud applications.

What sort of applications? As InfoWorld’s Andrew Oliver has observed, both machine learning and big data will eventually disappear as separate technology categories and insinuate themselves into many, many different aspects of computing.

Nonetheless, right now certain uses of machine learning stand out for their immediate payback.

Fraud detection is first among them, because it addresses an urgent problem that would be impractical to solve if machine learning didn't exist. To get a sense of how machine learning is combating fraud, I interviewed Dr. Hui Wang, senior director of risk sciences for PayPal. Wang holds a Ph.D. in statistics from UC Berkeley, and prior to her 11 years at PayPal conducted credit scoring research at Fair Isaac.

You can easily imagine why PayPal would be concerned about fraud, given the innumerable scams that have targeted PayPal users. As it turns out, however, PayPal has already ventured beyond fraud detection to address other areas of risk management, including “modern machine learning in the credit decision world,” which Wang says is a lot more complex -- in part due to regulatory requirements.

According to Wang, PayPal is a pioneer in risk management, although some advanced efforts are just now emerging from the lab. PayPal uses three types of machine learning algorithms for risk management: linear, neural network, and deep learning. Experience has shown PayPal that in many cases, the most effective approach is to use all three at once.

Running linear algorithms to detect fraud is an established practice, Wang says, where “we can separate the good from the bad with one straight line.” When this either/or categorization fails, however, things get more interesting:

We soon realized linear doesn’t work because the world, of course, is not linear. So instead of saying one line can separate the world of good from bad, let’s use multiple lines or curve the lines. Within that category, I guess people might be familiar with the neural network, [which imitates] how neurons work in the human world. Also, a lot of algorithms are tree-based, mimicking a human being when we have to make a judgment -- for example, we would say if it’s raining, I’ll take an umbrella.

Neural net algorithms were developed decades ago, but today’s modern computing infrastructure -- along with the enormous quantity of data we can now throw at those algorithms -- has increased neural net effectiveness by a magnitude. Wang says these advances have been essential for risk management:

We take trust very seriously. It’s our brand. We have to decide in a couple of hundred milliseconds whether this is a good person, [in which case] we will give him or her the best and the fastest and the most convenient experience. Or is it a potentially bad guy and we have to insert some friction? The recent progress on the infrastructure side enables the application of a neural network in a practical payment risk management world possible.

Quickly determining trustworthy customers and putting them in the express lane to a transaction is a key objective, Wang explains, using caching mechanisms to run relational queries and linear algorithms, among other techniques. The more sophisticated algorithms apply to customers who may be problematic, which slows down the system a bit as it acquires more data to perform in-depth screening.

This downstream process extends all the way to deep learning, which today also powers computer vision, speech recognition, and other applications. When I asked Wang for a layman’s explanation of the difference between neural nets and deep learning, she offered this explanation:

A neural net tries to mimic a human’s way of processing information. We take ABC and try to create a relationship among them, and we take a CDE and create another relationship, and then on a higher level abstract the intermediate mini-model. So it’s kind of mimicking the human thought process. But in deep learning you’re basically taking it to many, many layers. It’s not just ABCDE, there are like 3,000 features out there and then within that 3,000 there are a lot of mini-classes of features. They have all kinds of relationships and we’re just adding layers and layers of these intermediary mini-models or mini-abstractions of the information -- and in the end come up with the top level.

Wang emphasizes that you need large quantities of data to support these complex neural network structures. PayPal itself collects gargantuan amounts of data about buyers and sellers, including their network information, machine information, and financial data. The deep learning beast is well fed.

But again, PayPal does not use deep learning in isolation. It applies all three together: linear, neural network, and deep learning algorithms. Wang explains why:

Let’s take a linear algorithm. You might think it’s outdated, but it still potentially catches something the nonlinear algorithm might not be able to. So in order to get the best out of all [three], we “ensemble” them together. We have a “voting committee.” One is linear and one is nonlinear and we just ask them: What is your opinion on this file? Then we take their vote and eventually ensemble them together for our final assessment … [It’s like] taking a lot of doctors and listening to all of them. With that kind of community-based voting, hopefully something better or more accurate will come out.

Wang says she is proud to be managing the data science team at PayPal, which is on the forefront of developing machine learning and data mining technology. Because her team is so advanced, particularly in the practical application of deep learning, I couldn’t resist asking her what she thought about widely publicized warnings from Stephen Hawking, Elon Musk, and Bill Gates regarding the potential dangers of artificial intelligence in the future:

I never worry that these machines will replace humans. Yes, we can add layers, but you can talk to any machine learning scientist and they will say that the algorithm is important, but at the end of the day what really makes the difference is that a machine cannot find data automatically … There is so much data, so much variety, but the flip side is: What is useful? We still rely on human oversight to decide what ingredients to pump into the machine.

There’s a practical lesson in that statement: Now and in the future, machine learning depends not only on big data, but on the right data. Cloud infrastructure and integration presents abundant compute power for deep learning -- as well as access to unthinkably large data sets across a potentially unlimited number of domains.

What we refer to as big data and machine learning today will tomorrow simply be integrated into the fabric of computing. As the major cloud providers open up those capabilities to all developers, the stage is set for a new wave of applications that will be much more intelligent than before.

Copyright © 2015 IDG Communications, Inc.