Data Normalization Forms
Data normalization proceeds through a series of normal forms. The most widely used are 1NF, 2NF, 3NF, and BCNF. Let us work through these normal forms with the help of an example. Assume that a company has a database of all its employees and their key skills, as shown in the table below.
| Salutation | Full Name | Address | Skills |
| --- | --- | --- | --- |
| Mr. | John Denver | 12, Bates Brothers Road | Content writing, Social media marketing |
| Ms. | Mary Ann | 34, Shadowman Drive | Machine learning, Data science |
| Ms. | Nancy Drew | 4, First Plot Street | DBMS |
1NF – First Normal Form
The most basic form of data normalization is 1NF, which ensures that every column holds atomic values and that there are no repeating groups or duplicate rows. For a table to be in the first normal form, it should satisfy the following rules:
- Each cell should contain a single value
- Each record should be unique
The table in 1NF will look like this:
| Salutation | Full Name | Address | Skills |
| --- | --- | --- | --- |
| Mr. | John Denver | 12, Bates Brothers Road | Content writing |
| Mr. | John Denver | 12, Bates Brothers Road | Social media marketing |
| Ms. | Mary Ann | 34, Shadowman Drive | Machine learning |
| Ms. | Mary Ann | 34, Shadowman Drive | Data science |
| Ms. | Nancy Drew | 4, First Plot Street | DBMS |
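The step from the original table to 1NF can be sketched in a few lines of Python. This is only an illustration under some assumptions: the rows are held as plain tuples, the skills in a cell are separated by commas, and the function name `to_first_normal_form` is made up for this example.

```python
# Sample rows based on the employee example; the last field holds
# several comma-separated skills, which violates 1NF.
employees = [
    ("Mr.", "John Denver", "12, Bates Brothers Road",
     "Content writing, Social media marketing"),
    ("Ms.", "Mary Ann", "34, Shadowman Drive",
     "Machine learning, Data science"),
    ("Ms.", "Nancy Drew", "4, First Plot Street", "DBMS"),
]


def to_first_normal_form(rows):
    """Split the multi-valued skills field so every cell holds one value."""
    normalized = []
    for salutation, full_name, address, skills in rows:
        for skill in skills.split(","):
            record = (salutation, full_name, address, skill.strip())
            if record not in normalized:  # keep each record unique
                normalized.append(record)
    return normalized


rows_1nf = to_first_normal_form(employees)
for row in rows_1nf:
    print(row)
```

Each input row with several skills becomes one output row per skill, so the three original rows expand to five.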
2NF – Second Normal Form
In 2NF, data that would otherwise repeat across multiple rows is moved into separate tables. For a table to be in the second normal form, it should satisfy the following rules:
- It should be in 1NF
- No non-key column should be functionally dependent on only a part (a proper subset) of any candidate key
Let’s divide the 1NF table into two tables – Table 1 and Table 2. Table 1 contains all the employee information. Table 2 contains information on their key skills.
Table 1
| Employee ID | Salutation | Full Name | Address |
| --- | --- | --- | --- |
| 1 | Mr. | John Denver | 12, Bates Brothers Road |
| 2 | Ms. | Mary Ann | 34, Shadowman Drive |
| 3 | Ms. | Nancy Drew | 4, First Plot Street |
Table 2
| Employee ID | Key skills |
| --- | --- |
| 1 | Content writing |
| 1 | Social media marketing |
| 2 | Machine learning |
| 2 | Data science |
| 3 | DBMS |
We have introduced a new column called Employee ID which is the primary key for Table 1. The records can be uniquely identified using this primary key.
In Table 2, Employee ID is the foreign key.
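As a rough sketch, the two tables and the key relationship between them could be declared with Python's built-in `sqlite3` module. The table and column names (`employee`, `employee_skill`, `employee_id`, and so on) are assumptions chosen for this illustration, not names from the example itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    salutation  TEXT NOT NULL,
    full_name   TEXT NOT NULL,
    address     TEXT NOT NULL
);
CREATE TABLE employee_skill (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    skill       TEXT NOT NULL,
    PRIMARY KEY (employee_id, skill)
);
""")

conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Mr.", "John Denver", "12, Bates Brothers Road"),
        (2, "Ms.", "Mary Ann", "34, Shadowman Drive"),
        (3, "Ms.", "Nancy Drew", "4, First Plot Street"),
    ],
)
conn.executemany(
    "INSERT INTO employee_skill VALUES (?, ?)",
    [
        (1, "Content writing"),
        (1, "Social media marketing"),
        (2, "Machine learning"),
        (2, "Data science"),
        (3, "DBMS"),
    ],
)
conn.commit()

# A skill row for a non-existent employee is rejected by the foreign key.
try:
    conn.execute("INSERT INTO employee_skill VALUES (?, ?)", (99, "Cooking"))
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

john_skills = [
    row[0]
    for row in conn.execute(
        "SELECT skill FROM employee_skill WHERE employee_id = 1 ORDER BY skill"
    )
]
print(fk_enforced, john_skills)
```

Employee details are now stored once per employee, and the skills table only carries the Employee ID alongside each skill.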
3NF – Third Normal Form
For a table to be in the third normal form, it should satisfy the following rules:
- It should be in 2NF
- It should not have any transitive functional dependencies
A transitive functional dependency exists when a non-key column depends on another non-key column, which in turn depends on the primary key. In practice, a change in one non-key column may then force a change in another.
In our example, a change to an employee's name (for instance, from a male to a female name) may require a change in the salutation (Mr., Ms., Mrs., etc.). Hence, we introduce a new table that stores the salutations:
Table 1
| Employee ID | Full Name | Address | Salutation ID |
| --- | --- | --- | --- |
| 1 | John Denver | 12, Bates Brothers Road | 1 |
| 2 | Mary Ann | 34, Shadowman Drive | 2 |
| 3 | Nancy Drew | 4, First Plot Street | 2 |
Table 2
| Employee ID | Key skills |
| --- | --- |
| 1 | Content writing |
| 1 | Social media marketing |
| 2 | Machine learning |
| 2 | Data science |
| 3 | DBMS |
Table 3
| Salutation ID | Salutation |
| --- | --- |
| 1 | Mr. |
| 2 | Ms. |
| 3 | Mrs. |
Now there are no transitive functional dependencies, and our tables are in 3NF. Salutation ID is the primary key in Table 3. Salutation ID in Table 1 is a foreign key that references the primary key of Table 3.
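The salutation lookup can be sketched with `sqlite3` as well. The schema below is an assumed rendering of Table 1 and Table 3 (the names `salutation` and `salutation_id` are illustrative), and the update simply shows how a salutation change touches a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salutation (
    salutation_id INTEGER PRIMARY KEY,
    salutation    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    full_name     TEXT NOT NULL,
    address       TEXT NOT NULL,
    salutation_id INTEGER NOT NULL REFERENCES salutation(salutation_id)
);
INSERT INTO salutation VALUES (1, 'Mr.'), (2, 'Ms.'), (3, 'Mrs.');
INSERT INTO employee VALUES
    (1, 'John Denver', '12, Bates Brothers Road', 1),
    (2, 'Mary Ann',    '34, Shadowman Drive',     2),
    (3, 'Nancy Drew',  '4, First Plot Street',    2);
""")

# Changing an employee's salutation means pointing at a different lookup row.
conn.execute("UPDATE employee SET salutation_id = 3 WHERE employee_id = 2")

# A join rebuilds the readable view from the normalized tables.
rows = conn.execute("""
    SELECT s.salutation, e.full_name
    FROM employee AS e
    JOIN salutation AS s ON s.salutation_id = e.salutation_id
    ORDER BY e.employee_id
""").fetchall()

for row in rows:
    print(row)
```

The salutation text itself lives in exactly one place, so correcting or extending the list of salutations never requires touching the employee rows.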
BCNF – Boyce-Codd Normal Form
Boyce-Codd Normal Form is a stricter version of 3NF and is also known as 3.5NF. A 3NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF. For a table to be in BCNF, it should satisfy the following rules:
- It should be in 3NF
- For every non-trivial functional dependency ( X → Y ), X should be a superkey
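The superkey rule can be checked mechanically by computing attribute closures. The sketch below is a minimal illustration, assuming functional dependencies are written as pairs of attribute tuples; the helper names `closure` and `bcnf_violations` and the enrollment relation are hypothetical and serve only as a counter-example.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial dependencies X -> Y where X is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(relation)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Table 1 from the 3NF step: Employee ID determines every other column.
employee = ("employee_id", "full_name", "address", "salutation_id")
employee_fds = [
    (("employee_id",), ("full_name", "address", "salutation_id")),
]

# Hypothetical counter-example with overlapping candidate keys:
# an instructor teaches one course, but instructor alone is not a superkey.
enrollment = ("student", "course", "instructor")
enrollment_fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]

print(bcnf_violations(employee, employee_fds))
print(bcnf_violations(enrollment, enrollment_fds))
```

The function only inspects the dependencies it is given, which is enough to test a single relation; checking the tables produced by a further decomposition would also require projecting the dependencies onto each of them.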