Origins in Detail

Origins is a segmentation system which classifies consumers according to the part of the world from which their forebears are most likely to have originated.

Each consumer on a customer file can be placed into one of 200 different ‘Origins’ types on the basis of their personal and family names. The segmentation could be used, for example, to identify people on a customer file whose ancestry is most likely to be from Ireland, Italy, Albania or Myanmar.

Origins uses the same information to code customers on the basis of their most likely language and religion. An age estimate and gender can also be appended to the records of most customers based on their names.

What information is used to build Origins

The Origins classification is built from a global file containing the personal and family names of some 527,000,000 adults from around the world. In addition, we have access to personal and family name frequencies covering another 539,000,000 adults. Together these billion or so adults are resident in 18 different countries.

Using this information we have been able to establish the likely Origins code for some 2,000,000 different family names and some 700,000 personal names.

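The sources of these names are shown below. An x indicates that a country contributes personal and family name combinations, family name frequencies or personal name frequencies: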

Country         Count          Combinations  Family names   Personal names
Australia       11,304,082     x             x              x
Brazil          146,303,353    x             x              x
Denmark         4,274,408      x             x              x
Germany         64,664,703     x             x              x
Ireland         3,077,097      x             x              x
Italy           16,795,661     x             x              x
Netherlands     8,257,541      x             x              x
Norway          2,889,297      x             x              x
Romania         4,032,320      x             x              x
Spain           11,820,566     x             x              x
Sweden          4,300,059      x             x              x
United Kingdom  47,695,794     x             x              x
USA             201,362,132    x             x              x
Belgium         9,089,379      -             x              x
France          48,019,781     -             x              x
India           434,286,400    -             x              -
Japan           47,179,928     -             x              -
New Zealand     599,381        -             x              -
Total           1,065,951,882  526,777,013   1,065,951,882  583,886,173

Who uses Origins and how is it used

Origins is used to profile customers and customer segments, citizens and service users, employees and even suppliers. By profiling customers you can identify which groups are under- or over-represented on your customer file. You can find out which groups prefer to use which products, channels and outlets, which ones you are good or poor at retaining and which are responsive to which types of promotion or reward.

Origins is used to code customers. By coding customers you can target campaigns to improve awareness and take-up of public services by members of specific minority groups. You can also target products, such as cosmetics, media channels and travel, at audiences for whom they have been especially developed.

Origins is used to classify postcodes. Using a table which identifies the dominant Origins type in each postcode you can identify and map the locations in which individual communities have established themselves right down to street level.
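The idea of a dominant Origins type per postcode can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the postcode table is built, so the function name, the input layout and the rule of taking the most frequent type among coded records are all hypothetical, and the sample data is invented.

```python
from collections import Counter, defaultdict


def dominant_type_by_postcode(records):
    """Map each postcode to the Origins type held by most of its coded records.

    `records` is an iterable of (postcode, origins_type) pairs. Ties are
    resolved in favour of the type seen first, which is an arbitrary choice.
    """
    counts = defaultdict(Counter)
    for postcode, origins_type in records:
        counts[postcode][origins_type] += 1
    return {
        postcode: type_counts.most_common(1)[0][0]
        for postcode, type_counts in counts.items()
    }


# Invented sample data: postcodes and Origins types are placeholders.
sample = [
    ("AB1 2CD", "Irish"),
    ("AB1 2CD", "Irish"),
    ("AB1 2CD", "Italian"),
    ("EF3 4GH", "Italian"),
]
dominant = dominant_type_by_postcode(sample)
```

A table of this shape can then be joined to postcode boundaries for mapping.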

How does Origins work

In order to code individual customers, Origins makes use of a table which contains information on over 700,000 personal names and over 2,000,000 family names. Each of these names has been examined to identify the Origins type to which it is most likely to belong. This evaluation makes use of a number of criteria, including the Origins codes of the surnames held by bearers of each personal name, and vice versa; the geographical concentration of the name both within and between countries; the Mosaic codes in which the names are mostly found; and the appearance of diagnostic letter sequences.

This evaluation also establishes the confidence with which we can say a particular name belongs to a particular Origins type.

Looking at the codes associated with both the personal name and the family name, and taking into account the confidence level of each, Origins identifies the Origins type to which each customer name is most likely to belong.
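As a rough illustration of this final step, the sketch below combines the candidate codes of a personal name and a family name. The lookup tables, the confidence values and the rule of summing confidence per type are all assumptions made for illustration; the source does not describe how Origins actually weights the two names.

```python
# Hypothetical lookup tables: name -> (Origins type, confidence between 0 and 1).
# The entries and confidence values are invented for illustration only.
PERSONAL_NAMES = {
    "SEAN": ("Irish", 0.8),
    "GIOVANNI": ("Italian", 0.9),
    "JOHN": ("English", 0.5),
}
FAMILY_NAMES = {
    "MURPHY": ("Irish", 0.9),
    "ROSSI": ("Italian", 0.9),
    "VIRTANEN": ("Finnish", 0.9),
}


def code_customer(personal_name, family_name):
    """Return (origins_type, score) for a name pair, or None if unrecognised.

    Assumed rule: add up the confidence each name contributes to its type
    and pick the type with the highest total.
    """
    totals = {}
    lookups = ((PERSONAL_NAMES, personal_name), (FAMILY_NAMES, family_name))
    for table, name in lookups:
        entry = table.get(name.strip().upper())
        if entry is not None:
            origins_type, confidence = entry
            totals[origins_type] = totals.get(origins_type, 0.0) + confidence
    if not totals:
        return None
    best_type = max(totals, key=totals.get)
    return best_type, totals[best_type]
```

Under these assumptions, a weakly diagnostic personal name such as the hypothetical "JOHN" entry is outweighed by a strongly diagnostic family name, and a pair of unrecognised names returns None, which corresponds to the uncoded residue.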

What is Origins' coverage rate

Provided your files are free of data capture errors, you should be able to code 99.5% of your customer records by Origins type. The remainder are either names which the system does not recognise, because they are rare, or ones which the system cannot allocate to any particular Origins type.

What is Origins' level of accuracy

The level of accuracy varies from one Origins type to another. Origins achieves accuracy rates in excess of 90% in identifying South Asians and Muslims, and 70% in identifying Black Africans, Greeks, Armenians and people from East and South East Europe. It achieves accuracy rates of 50% with Hispanics. Lower accuracy rates are achieved with people of Nordic or French origin, with Jews and Black Caribbeans.

As would be expected, the system is more accurate when coding names to general categories, such as South Asians or Greeks and Greek Cypriots combined, than to specific sub-categories, such as Sri Lankans or Greek Cypriots.

How does Origins handle persons of mixed ancestry

Origins can be used to identify persons whose names come from more than one tradition – for example a person with an English personal name and a Finnish family name.

The confidence score given to each name combination can also be used to select or deselect people who are most likely to be of mixed ancestry. Restricting a communication to names with high confidence scores is an effective way of avoiding communicating with individuals who are least likely to belong to the selected target group.
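A minimal sketch of both ideas follows, assuming each coded record carries the type suggested by each name separately, the type assigned to the combination and a confidence score between 0 and 1. The record layout, the function names, the 0.8 threshold and the sample records are hypothetical and are not taken from the Origins software.

```python
from typing import NamedTuple


class CodedRecord(NamedTuple):
    name: str
    personal_type: str  # type suggested by the personal name alone
    family_type: str    # type suggested by the family name alone
    origins_type: str   # type assigned to the name combination
    confidence: float   # hypothetical 0-1 score for the combination


def is_mixed_tradition(record):
    """True when the two names point to different Origins types."""
    return record.personal_type != record.family_type


def select_target(records, target_type, min_confidence=0.8):
    """Keep records coded to target_type with at least min_confidence."""
    return [
        r for r in records
        if r.origins_type == target_type and r.confidence >= min_confidence
    ]


# Invented sample records for illustration.
records = [
    CodedRecord("Sean Murphy", "Irish", "Irish", "Irish", 0.95),
    CodedRecord("John Virtanen", "English", "Finnish", "Finnish", 0.55),
    CodedRecord("Mary Murphy", "English", "Irish", "Irish", 0.60),
]
high_confidence_irish = select_target(records, "Irish")
mixed = [r.name for r in records if is_mixed_tradition(r)]
```

Raising the threshold trades reach for precision: fewer records are selected, but those that remain are less likely to fall outside the target group.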

Profiling using Origins

When Origins is used to profile customer, citizen or employee files, it is possible to compare the distribution of records by Origins on your file with the distribution of the population by Origins in the geographical region which you serve or from which your employees are drawn. For example, you can specify as your base comparison any administrative region, local authority district, postcode area, police, education or health area in Great Britain. The distribution of the population by Origins is also available for regions of the USA and of other European countries.
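One common way to express such a comparison is an index in which 100 means a group is represented on the file in line with the base population. The sketch below assumes that convention; the source does not state which measure Origins reports, and the function name and the figures used in the example are invented.

```python
from collections import Counter


def profile_index(file_types, base_shares):
    """Index each Origins type on a file against a base population.

    `file_types` is a list with one Origins type per record on the file.
    `base_shares` maps each Origins type to its share (0-1) of the base
    population. An index of 100 means parity, above 100 over-representation
    and below 100 under-representation. Types with a zero base share are
    skipped because the ratio is undefined.
    """
    counts = Counter(file_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    index = {}
    for origins_type, base_share in base_shares.items():
        if base_share > 0:
            file_share = counts.get(origins_type, 0) / total
            index[origins_type] = 100 * file_share / base_share
    return index


# Invented example: 100 customer records and a made-up regional base mix.
customer_file = ["English"] * 60 + ["Irish"] * 30 + ["Italian"] * 10
regional_base = {"English": 0.75, "Irish": 0.15, "Italian": 0.10}
indices = profile_index(customer_file, regional_base)
```

In this invented example the Irish group makes up 30% of the file against 15% of the base, so it is over-represented by a factor of two.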

Using Origins in different countries

Although Origins is a single application, it has facilities whereby it can be optimised for specific international markets. These international versions code certain names differently in different markets. For example, a ‘Roger’, who would be coded as ‘English’ in Britain, would be coded ‘French’ in France. Non-GB versions of Origins also allow the mix of names by Origins type to be compared with the Origins mix for the specific market in which the analysis is undertaken.

The product is particularly attractive to international organisations who need a consistent basis for analysing diversity in each of the national markets in which they operate.

Output can be configured for local languages and needs. For example, the best way to group the Origins categories in Australia will differ from the best way to group them in the Netherlands. The system provides complete flexibility over how categories are grouped.

How is Origins accessed

Origins types and groups can be appended to customer records using Origins software applications. These applications are licensed to clients by Experian or a local partner. The application is downloadable from the internet and makes use of files which are themselves updated on a regular basis as names from more countries are introduced. The licence fee depends upon the version of the application licensed. For example, it is possible to license a standard GB version designed to code names appearing on British or Irish customer lists. An enhanced version also appends gender and an estimate of lifestage. Alternatively, users can license versions of Origins optimised for different overseas markets.

If they prefer, users can have the codes appended to their files by Experian on a bureau basis or use a web-based coding system.

Other users are provided by Experian with Origins at the postcode level, either as a file containing the dominant Origins category in each postcode or as the mix of Origins types for different levels in the postcode geographic hierarchy.

The Origins postcode classification is typically accessed via Experian's Micromarketer Generation 3 geographical analysis software.