Cleansing International Addresses

A data cleansing problem I have come across several times is a set of name and address records where it is uncertain which country each address belongs to.

Many address-cleansing tools and services require a country code as the first parameter in order to utilize external reference data for address cleansing and verification. Most business cases for address cleansing are indeed about a large number of business-to-consumer (B2C) addresses within a particular country. But sometimes you have a batch of typical business-to-business (B2B) addresses with no clear country registration.

The problem is that many location names apply to many different places. That is true even within a given country, which was the main driver for having postal codes in the first place. If a non-interactive tool or service has to look for a location all over the world, it gets really difficult.

For example, I’m in Richmond today. That could be a lot of places all over the world, as seen on Wikipedia.

I am actually in the Richmond in the London, England, UK area. If I were in the capital of the US state of Virginia, I could have written I’m in “Richmond, VA”. If an international address-cleansing tool looked at that address, I guess it would first look for a country code, quickly find VA as a two-character country code at the end of the string and firmly conclude I’m at something called Richmond in the Vatican City State.
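The “Richmond, VA” trap can be shown in a few lines of Python. This is only a minimal sketch, not how any particular tool works: the function names and the two small lookup tables are made up for illustration, and a real process would load the full ISO 3166-1 alpha-2 list and the full set of US state abbreviations. It contrasts a naive guess that trusts a trailing two-letter token with a check that returns every plausible reading.

```python
# Illustrative subsets only (assumption): real reference data would be the
# complete ISO 3166-1 alpha-2 list and all US state abbreviations.
ISO_COUNTRY_CODES = {
    "VA": "Vatican City State",
    "GB": "United Kingdom",
    "US": "United States",
    "CA": "Canada",
    "DE": "Germany",
}
US_STATE_CODES = {"VA": "Virginia", "CA": "California", "DE": "Delaware"}


def trailing_token(address):
    """Return the last whitespace/comma separated token, upper-cased."""
    tokens = address.replace(",", " ").split()
    return tokens[-1].upper() if tokens else ""


def naive_country_guess(address):
    """Treat a trailing two-letter token as a country code, no questions asked."""
    return ISO_COUNTRY_CODES.get(trailing_token(address))


def country_candidates(address):
    """Return every (country, reason) reading of the trailing token."""
    token = trailing_token(address)
    candidates = []
    if token in ISO_COUNTRY_CODES:
        candidates.append((ISO_COUNTRY_CODES[token], "country code"))
    if token in US_STATE_CODES:
        reason = f"state code for {US_STATE_CODES[token]}"
        candidates.append(("United States", reason))
    return candidates


if __name__ == "__main__":
    print(naive_country_guess("Richmond, VA"))
    print(country_candidates("Richmond, VA"))
```

The naive guess lands in the Vatican City State, while the second function reports both readings and leaves the decision to a later step, for example a postal code check or a manual review.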

Have you tried using or constructing an international address cleansing process? Where did you end up?


3 thoughts on “Cleansing International Addresses”

  1. Gary Allemann 21st July 2014 / 13:41

    Hi Henrik.

    Of course this is also a challenge when trying to categorise client records to countries for legal reasons – for example in support of sanctions or for legislation such as FATCA.

    There are, of course, some data quality tools that can make decisions about the country of a record based on other data – but never a trivial problem to solve.

    G

  2. John Owens 21st July 2014 / 20:09

    Hi Henrik

    One of the major reasons why this seems difficult is that our perceptions of how an address should be entered are based on a practice for hand delivered messages that precedes the invention of the postage stamp by 200 years.

    When we adopt an international perspective, rather than a parochial one, we will automatically say, “Of course you should enter the country first, how else would you know what the format should be?”.

    What surprises me is that so few organisations doing international trade, even international couriers, have had this realisation as doing so makes effective addressing several orders of magnitude easier.

    Kind regards
    John

    I show the structures to enable quality international address entry at http://johnowensblog.com/data-quality-dynamic-data-entry/

  3. Henrik Liliendahl Sørensen 24th July 2014 / 08:56

    Thanks for adding in Gary and John. Indeed not trivial. Having a good data capture process in place is in my eyes a very good start.
