Introduction

Definition

  • Normalization is the process of organizing data efficiently to minimize redundancy and inconsistency in a database. It divides large, complex, or redundant tables into smaller, simpler ones and links them with appropriate relationships as per requirements.

Characteristics

  • Normalization proceeds through a series of stages called normal forms.
  • Normal forms apply to individual relations (tables).
  • Each normal form defines a set of rules that must be satisfied in order to eliminate redundancy and inconsistency from the database.

Advantages

  • Normalization is used to reduce or eliminate redundancy, undesirable dependencies, and insertion, update, and deletion anomalies in database tables.
  • It maintains and enhances the data integrity and efficiency of the database and makes the design more flexible.

Types of Normalization

  • A relation or table is said to be in a particular normal form if it satisfies all the constraints of that normal form.
  • There are several normal forms, each addressing a different aspect of data organization and each with its own set of rules and constraints that must be satisfied.
  • The most commonly used normal forms are as follows:
    • First Normal Form (1NF):
      • The First Normal Form ensures that each column in a table contains atomic (indivisible) values and that there are no repeating groups of columns (see the first sketch after this list).
    • Second Normal Form (2NF):
      • It requires that the table already be in 1NF.
      • It additionally ensures that every non-key attribute is fully functionally dependent on the entire primary key.
      • It thereby eliminates partial dependencies (see the second sketch after this list).
    • Third Normal Form (3NF):
      • It requires that the table already be in 2NF.
      • It ensures that there are no transitive dependencies, i.e., no non-key attribute depends on another non-key attribute (see the third sketch after this list).
    • Boyce-Codd Normal Form (BCNF):
      • A stricter version of 3NF: it ensures that every determinant (an attribute, or set of attributes, that functionally determines another attribute) in a table is a candidate key (see the fourth sketch after this list).
    • Fourth Normal Form (4NF):
      • It deals with multi-valued dependencies, a more complex kind of dependency that is less commonly encountered in typical database design.
    • Fifth Normal Form (5NF):
      • It deals with join dependencies and is likewise rarely needed in typical database design.
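First sketch (1NF): a minimal illustration in Python using the standard-library sqlite3 module. All table and column names (student_bad, student_phone, and so on) are invented for this example. It contrasts a column holding a comma-separated repeating group with its 1NF equivalent, where each value gets its own row.

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Not in 1NF: "phones" stores a comma-separated list, so values are not atomic.
    conn.execute("CREATE TABLE student_bad (roll_no INTEGER PRIMARY KEY, name TEXT, phones TEXT)")
    conn.execute("INSERT INTO student_bad VALUES (1, 'Asha', '98001, 98002')")

    # 1NF: one atomic phone value per row; the repeating group becomes extra rows.
    conn.execute("""CREATE TABLE student_phone (
        roll_no INTEGER,
        phone   TEXT,
        PRIMARY KEY (roll_no, phone)
    )""")
    conn.executemany("INSERT INTO student_phone VALUES (?, ?)",
                     [(1, "98001"), (1, "98002")])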
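Second sketch (2NF), with the same sqlite3 setup and hypothetical names: in order_item_bad, item_name depends only on item_id, which is just part of the composite key (order_id, item_id); splitting the item attributes into their own table removes the partial dependency.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- In 1NF but not 2NF: item_name depends on item_id alone,
    -- not on the whole key (order_id, item_id): a partial dependency.
    CREATE TABLE order_item_bad (
        order_id  INTEGER,
        item_id   INTEGER,
        item_name TEXT,
        quantity  INTEGER,
        PRIMARY KEY (order_id, item_id)
    );

    -- 2NF decomposition: attributes of the item move to their own table.
    CREATE TABLE item (
        item_id   INTEGER PRIMARY KEY,
        item_name TEXT
    );
    CREATE TABLE order_item (
        order_id INTEGER,
        item_id  INTEGER REFERENCES item(item_id),
        quantity INTEGER,
        PRIMARY KEY (order_id, item_id)
    );
    """)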
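Third sketch (3NF), again with hypothetical names: in employee_bad, dept_name depends on dept_id, which in turn depends on the key emp_id, so one non-key attribute determines another. The decomposition removes this transitive dependency.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- In 2NF but not 3NF: emp_id -> dept_id -> dept_name is a
    -- transitive dependency on the key.
    CREATE TABLE employee_bad (
        emp_id    INTEGER PRIMARY KEY,
        emp_name  TEXT,
        dept_id   INTEGER,
        dept_name TEXT
    );

    -- 3NF decomposition: dept_name now depends only on its own key.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES department(dept_id)
    );
    """)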
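Fourth sketch (BCNF), a classic textbook-style case with invented names, assuming each instructor teaches exactly one course: instructor -> course then holds, but instructor is not a candidate key of the original relation, so the table violates BCNF until it is split.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Violates BCNF: instructor -> course holds (each instructor teaches
    -- one course), but instructor is not a candidate key here.
    CREATE TABLE enrolment_bad (
        student    TEXT,
        course     TEXT,
        instructor TEXT,
        PRIMARY KEY (student, course)
    );

    -- BCNF decomposition: every determinant is now a candidate key.
    CREATE TABLE instructor_course (
        instructor TEXT PRIMARY KEY,
        course     TEXT
    );
    CREATE TABLE enrolment (
        student    TEXT,
        instructor TEXT REFERENCES instructor_course(instructor),
        PRIMARY KEY (student, instructor)
    );
    """)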

De-normalization

  • Denormalization is the reverse process of Normalization.
  • Denormalization is the process of intentionally introducing redundancy into a database design to improve performance, usually at the cost of some data integrity guarantees and an increased risk of inconsistency.
  • It involves adding redundant data or regrouping data that was previously separated through normalization.
  • It is often employed where read performance is critical and where the cost of joining multiple tables in a normalized schema might hurt query execution time.
  • By reducing the number of joins required and simplifying the structure of the database, it can improve query performance, which is especially beneficial in read-heavy applications and reporting systems.
  • Denormalization can be achieved in several ways; some common ones are:
    • Flattening Tables: Combining multiple related tables into a single table to reduce the need for joins and simplify queries. This often involves duplicating some data across rows (see the first sketch at the end of this section).
    • Introducing Redundant Data: Storing a copy of information in a table where it did not originally belong, for instance duplicating a frequently accessed column in multiple tables to avoid joins (also shown in the first sketch at the end of this section).
    • Using Precomputed Aggregates: Storing precomputed aggregations (such as sums, averages, and counts) in the database rather than calculating them on the fly. This can speed up queries involving aggregate functions (see the second sketch at the end of this section).
  • Denormalization also comes with some potential downsides:
    • It can lead to data inconsistency if redundant data is not properly maintained, as updates may need to be propagated across multiple places.
    • It can also increase storage requirements and potentially impact write performance due to increased data duplication and complexity. 
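First sketch (flattening and redundant data), with hypothetical names: customer names are copied into a read-optimized order_flat table so that frequent lookups by order need no join; the duplicated column must then be kept in sync whenever a customer is renamed.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    -- Normalized source tables.
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                           cust_id  INTEGER REFERENCES customer(cust_id),
                           amount   REAL);

    -- Denormalized, read-optimized table: cust_name is duplicated here
    -- so queries by order avoid the join entirely.
    CREATE TABLE order_flat (order_id  INTEGER PRIMARY KEY,
                             cust_id   INTEGER,
                             cust_name TEXT,
                             amount    REAL);
    """)
    # Populating order_flat copies cust_name into every matching row.
    conn.execute("""INSERT INTO order_flat
                    SELECT o.order_id, o.cust_id, c.cust_name, o.amount
                    FROM orders o JOIN customer c USING (cust_id)""")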
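Second sketch (precomputed aggregates), again with invented names: a daily_sales summary table stores per-day totals so reports need not recompute SUM over the whole orders table. The trade-off is that the summary must be refreshed (here with a naive full rebuild) whenever the base data changes.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE orders (order_id  INTEGER PRIMARY KEY,
                         order_day TEXT,
                         amount    REAL);
    -- Precomputed aggregate: one row per day instead of a SUM at query time.
    CREATE TABLE daily_sales (order_day TEXT PRIMARY KEY,
                              total     REAL,
                              n_orders  INTEGER);
    """)

    def refresh_daily_sales(conn):
        """Naive full rebuild; a real system would update incrementally."""
        conn.execute("DELETE FROM daily_sales")
        conn.execute("""INSERT INTO daily_sales
                        SELECT order_day, SUM(amount), COUNT(*)
                        FROM orders GROUP BY order_day""")

    # Example: load some orders, then refresh the summary.
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(1, "2024-01-01", 10.0), (2, "2024-01-01", 5.0)])
    refresh_daily_sales(conn)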
