What is surrogate key ? Where we use it explain with examples?

Editorial / Best Answer

Answered by: saravanan

  • Dec 2nd, 2005


Surrogate key is a unique identification key, it is like an artificial or alternative key to production key, bz the production key may be alphanumeric or composite key but the surrogate key is always single numeric key. Assume the production key is an alphanumeric field if you create an index for this fields it will occupy more space, so it is not advisable to join/index, bz generally all the datawarehousing fact table are having historical data. These factable are linked with so many dimension table. if it's a numerical fields the performance is high

Showing Answers 1 - 32 of 32 Answers

rvishnudas

  • Oct 30th, 2005
 

Surrogate key is a substitution for the natural primary key.

It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.

Data warehouses typically use a surrogate, (also known as artificial or identity key), key for the dimension tables primary keys. They can use Infa sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.

It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.

Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users) but, not only can these change, indexing on a numerical value is probably better and you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system and as far as the client is concerned you may display only the AIRPORT_NAME.

Another benefit you can get from surrogate keys (SID) is :

Tracking the SCD - Slowly Changing Dimension.

Let me give you a simple, classical example:

On the 1st of January 2002, Employee 'E1' belongs to Business Unit 'BU1' (that's what would be in your Employee Dimension). This employee has a turnover allocated to him on the Business Unit 'BU1' But on the 2nd of June the Employee 'E1' is muted from Business Unit 'BU1' to Business Unit 'BU2.' The entire new turnovers have to belong to the new Business Unit 'BU2' but the old one should belong to the Business Unit 'BU1.'

If you used the natural business key 'E1' for your employee within your datawarehouse everything would be allocated to Business Unit 'BU2' even what actually belongs to 'BU1.'

If you use surrogate keys, you could create on the 2nd of June a new record for the Employee 'E1' in your Employee Dimension with a new surrogate key.

This way, in your fact table, you have your old data (before 2nd of June) with the SID of the Employee 'E1' + 'BU1.' All new data (after 2nd of June) would take the SID of the employee 'E1' + 'BU2.'

You could consider Slowly Changing Dimension as an enlargement of your natural key: natural key of the Employee was Employee Code 'E1' but for you it becomes

Employee Code + Business Unit - 'E1' + 'BU1' or 'E1' + 'BU2.' But the difference with the natural key enlargement process, is that you might not have all part of your new key within your fact table, so you might not be able to do the join on the new enlarge key -> so you need another id.

  Was this answer useful?  Yes

Dear User,

When creating a dimension table in a data warehouse, we generally create the tables with a system generated key to uniquely identify a row in the dimension. This key is also known as a surrogate key. The surrogate key is used as the primary key in the dimension table.

The surrogate key will also be placed in the fact table and a foreign key will be defined between the two tables. When you ultimately join the data it will join just as any other join within the database.

Venkat

girish

  • Nov 28th, 2005
 

Surrogate key is system generated artificial primary key values

eg: any candidate key can be considered as surrogate key.

  Was this answer useful?  Yes

saravanan

  • Dec 2nd, 2005
 

Surrogate key is a unique identification key, it is like an artificial or alternative key to production key, bz the production key may be alphanumeric or composite key but the surrogate key is always single numeric key.

Assume the production key is an alphanumeric field if you create an index for this fields it will occupy more space, so it is not advisable to join/index, bz generally all the datawarehousing fact table are having historical data. These factable are linked with so many dimension table. if it's a numerical fields the performance is high

sithusithu

  • Jan 3rd, 2006
 

Surrogate key in a data warehouse is more than just a substitute for a natural key. In a data warehouse, a surrogate key is a necessary generalization of the natural production key and is one of the basic elements of data warehouse design. Sithu

  Was this answer useful?  Yes

ravi kumar guturi

  • Mar 9th, 2006
 

Surrogate key is a system generated key, It is mainly used for critical in dwh,Here critical means nothing but it is a column which when we updated  in the them most dwh in to OLTP systems

sunil

  • Mar 17th, 2006
 

Surrogate keys are that which join dimension tables and fact table.

  Was this answer useful?  Yes

Giridhar

  • Mar 29th, 2006
 

          

       Surrogate Key is the solution for critical column problems.

               For example the customer purchases different items in different locations,for this situation we have to maintain historical data.

                By using surrogate key we can introduce the row in the data warehouse to maintain historical data.

  Was this answer useful?  Yes

Jyothsna Tallapally

  • Apr 27th, 2006
 

A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key. Sometimes there can be several natural keys that could be declared as the primary key, and these are all called candidate keys.So a surrogate is a candidate key. A table could actually have more than one surrogate key, although this would be unusual. The most common type of surrogate key is an incrementing integer, such as an auto_increment column in MySQL, or a sequence in Oracle, or an identity column in SQL Server.Use of surrogate key:Every join between dimension tables and fact tables in a data warehouse environment should be based on surrogate keys, not natural keys. It is up to the data extract logic to systematically look up and replace every incoming natural key with a data warehouse surrogate key each time either a dimension record or a fact record is brought into the data warehouse environment.

  Was this answer useful?  Yes

Surajit Das

  • Jun 13th, 2006
 

Nice example. One more point to add:

Surrogate key by nature is number , so for joining or while processing complex querry it will take lesser time for integer comparison comparative to character comparison.

Guest

  • Jul 15th, 2006
 

Hai Friends

Surrogate Key a simple concept.

Correct n exact answer for SURROGATE KEY IS BELOW:

Definition of Surrogate Key:

Alternate of Primary Key that allows duplication of datas/records.

Need, Where & Why we use Surrogate Key:

OLTP is of "Normalised Form" whereas OLAP (i.e.) Datawarehouse is of "De-normalised form".

Actually the DWH concept is to maintain the historic datas for analysing. So its should denormalized form.

To be denormalise form duplication should be allowed in DWH. When datas entering the DWH Surrogate key a new column named serial number is introduced to allow duplication in OLAP Systems to maintain historic datas.

You all know one thing a single mobile is used by other person if it is not in use for more than one year. how is it posssible just because of this Surrogate Key.

Thanks

Suresh

  Was this answer useful?  Yes

suresh kanth.C

  • Nov 20th, 2006
 

We can say "Surrogate key" is a User defined primary key..

  Was this answer useful?  Yes

I think there already enough statements of what a surrogate key is.  From experience, the huge advantage:-

Our warehouse captured data from an existing system which identified SALES by a unique number.  We implemented this as the "Natural" or "Business" Key but also generated a unique "Surrogate" key.  Seems pointless as the incoming business key is numeric already.  But we did it.

Six months into the project, the users announced the feeder system would be replaced, but the two SALES systems would run in parallel for a year.  Our existing system had a numeric natural key, the new system had a unique ID of a 20 character field.

We simply added the 20 character field to the existing dimension, and the ETL looked up using the appropriate key (existing system key or new replacement ID).  Either way it matched and was converted a source system independent surrogate key.

No fuss, no re-design.  It just worked.   Surrogate key importance proven.

So there's little room for confusion:-

Dimension:  SALESMAN

SALES_ID                              Number not null   /* Primary Key */
LEGACY_SALE_IDENTIFIER    Number               /* Legacy system business key */
NEW_SALE_IDENTIFIER         varchar(20)        /* New system business key */
...Other attributes...


The SALES_ID the the "Surrogate key" which is the Foreign Key link to the Fact Table(s).  This key is simply a generated sequence whenever a new SALESMAN entry is created. 

The design (as already described) can be extended to support type 2 Slowly changing dimensions.

  Was this answer useful?  Yes

sudhakarnbh

  • May 12th, 2008
 

Surrogate is mainly used in slowly changing dimensions,it maintaining the uniqueness in the table.it is used to track the old value with the new one.And it is derived from primary key.

  Was this answer useful?  Yes

Surrogate is mainly used in slowly changing dimensions

Is not strictly true.  While a surrogate key can be used to support a SCD, it's primary purpose is to insulate the warehouse from changes in the source systems. 

As a by-product it can also reduce disk space usage in fact tables as foreign keys to dimension tables are simple integers.

I have implemented a number of systems with Dimension Tables which did not use type 2 or 3 SCDs, but did implement surrogate keys.

  Was this answer useful?  Yes

Consider,we are analyzing a table with columns "NAME,AGE,DOB"suppose the key for this table be "NAME".
So,as the "NAME" column was keyed, it was kept in the virtual memory for quicker retrieval.
If the "name" containing some characters was in the memory it cannot retrieve  soon.
So,system is in need of some integers to store in the memory instead of storing characters.
Hence the arrival of surrogate key for that primary key"NAME" will be easier for the system to retrieve quicker than before.

  Was this answer useful?  Yes

somesh

  • Nov 28th, 2010
 

Actually your explanation is very useful. but still I am not clear. Can I use the surrogate key as a foreign key in another table and reference to the master table. In that case will it work ok. Is it a good practice to take surrogate key as primary key in each table except of some of like customer table, employee detalils table etc.

  Was this answer useful?  Yes

somesh

  • Nov 28th, 2010
 

Is it somewhere necessary for the table to have an additional key with surrogate key to uniquely identify each row or during join.

  Was this answer useful?  Yes

santosh kumar

  • Apr 2nd, 2012
 

Surrogate key and primary key both are same, but point of application is different.

Primary key is part of transactional system but surrogate key is part of analytical system.

Surrogate key is used uniquely identify each and every records in dimension tables in data warehouse, primary key is used to identify ,each and every records in transactional system in OLTP.

  Was this answer useful?  Yes

rj_dwbi

  • Apr 2nd, 2012
 

Surrogate key is a numeric or integer value in a table to identify each row uniquely. Primary key and surrogate key are same but surrogate key is a system generated value has a define incremental value for each row in a table. Surrogate key does not have any business importance for the value it holds but primary key has a significant business value.

  Was this answer useful?  Yes

Abdulhakim

  • Jun 26th, 2014
 

Surrogate keys are keys used in identifying other keys like

  Was this answer useful?  Yes

Manners Dev

  • Nov 21st, 2014
 

Please add please before asking a question. And once you get answer please show some gratitude.
Thank you

  Was this answer useful?  Yes

Venkat

  • Dec 2nd, 2014
 

Surrogate keys are sub system of natural Keys. it will track the changes of natural key by increasing the value and it always holds numeric

  Was this answer useful?  Yes

Anjaneyulu

  • May 4th, 2015
 

Surrogate is nothing but a primary key but it has no business meaning

  Was this answer useful?  Yes

eranjan2007

  • Aug 13th, 2015
 

Simple Example:
Suppose you have a table for user of bank:
user_name Rakesh Ranjan
Surrogate Key: user_type saving123
Primary Key: user_id 10001

  Was this answer useful?  Yes

srinivasan

  • Sep 16th, 2015
 

Could you please clarify on your below statement please?
You all know one thing a single mobile is used by other person if it is not in use for more than one year. How is it possible just because of this Surrogate Key.

  Was this answer useful?  Yes

sanyam

  • Feb 1st, 2016
 

Thank you for your advice

  Was this answer useful?  Yes

praveen

  • Oct 16th, 2016
 

A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key. Sometimes there can be several natural keys that could be declared as the primary key, and these are all called candidate keys.So a surrogate is a candidate key.

  Was this answer useful?  Yes

Anukul Thawarey

  • Aug 8th, 2017
 

Just to add on to this point, we can also have date stamp as in the format (yyyymmdd) which is actually an 4 Byte unsigned integer to keep track of the historical data. This date stamp column can also be considered as an surrogate key.

  Was this answer useful?  Yes

P.Manideep

  • Oct 29th, 2017
 

Surrogate key is system generated key. It is a column or set of columns declared as primary key instead of real of natural key.

  Was this answer useful?  Yes

Give your answer:

If you think the above answer is not correct, Please select a reason and add your answer below.

 

Related Answered Questions

 

Related Open Questions