In relational database design, the process of organizing data to minimize redundancy is called normalization. It usually involves dividing a database into 2 or more tables and defining relationships between tables. Objective is to isolate data so that additions, deletions, and modifications can be made in just one table.
Benefits:-
- Eliminate data redundancy
- Improve performance
- Query optimization
- Faster update due to less number of columns in one table
- Index improvement
There are multiple forms of Normalizations in a database.
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Fourth Normal Form (4NF -BCNF NF)
First normal form (1NF)
Eliminate duplicative columns from the same table.
· Create separate tables for each group of related data and identify each row with a unique column or set of columns.
· Remove repetitive groups
· Create Primary Key
Name |
State |
Country |
Phone1 |
Phone2 |
Phone3 |
John |
101 |
1 |
488-511-3258 |
781-896-9897 |
425-983-9812 |
Bob |
102 |
1 |
861-856-6987 |
|
|
Rob |
201 |
2 |
587-963-8425 |
425-698-9684 |
|
PK |
|
|
[ Phone Nos ] |
|
|
? |
|
|
|
? |
|
ID |
Name |
State |
Country |
Phone |
|
1 |
John |
101 |
1 |
488-511-3258 |
|
2 |
John |
101 |
1 |
781-896-9897 |
|
3 |
John |
101 |
1 |
425-983-9812 |
|
4 |
Bob |
102 |
1 |
861-856-6987 |
|
5 |
Rob |
201 |
2 |
587-963-8425 |
|
6 |
Rob |
201 |
2 |
425-698-9684 |
|
Second Normal Form (2NF)
Second normal form (2NF) further addresses the concept of removing duplicative data:
- · Meet all the requirements of the first normal form.
- · Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- · Create relationships between these new tables and their predecessors through the use of foreign keys.
- Remove columns which create duplicate data in a table and related a new table with Primary Key – Foreign Key relationship.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
- · Meet all the requirements of the second normal form.
- · Remove columns that are not dependent upon the primary key.
Country can be derived from State also… so removing country
ID |
Name |
State |
Country |
1 |
John |
101 |
1 |
2 |
Bob |
102 |
1 |
3 |
Rob |
201 |
2 |
Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional requirement:
- Meet all the requirements of the third normal form.
- A relation is in 4NF if it has no multi-valued dependencies.
If PK is composed of multiple columns then all non-key attributes should be derived from FULL PK only. If some non-key attribute can be derived from partial PK then remove it
The 4NF also known as BCNF NF
TeacherID |
StudentID |
SubjectID |
StudentName |
101 |
1001 |
1 |
John |
101 |
1002 |
2 |
Rob |
201 |
1002 |
3 |
Bob |
201 |
1001 |
2 |
Rob |
|
|
|
|
|
|
|
|
TeacherID |
StudentID |
SubjectID |
StudentName |
101 |
1001 |
1 |
X |
101 |
1002 |
2 |
X |
201 |
1001 |
3 |
X |
201 |
1002 |
2 |
X |