- What is normalization?
It is a process of analysing the given relation schemas based on their Functional Dependencies (FDs) and primary key to achieve the properties
(1).Minimizing redundancy, (2). Minimizing insertion, deletion and update anomalies.
Normalization is the process of reducing data redundancy and maintains data integrity. This is performed by creating relationships among tables through primary and foreign keys. Normalization procedure includes 1NF, 2NF, 3NF, BCNF, and then the data is normalized.
It is set of rules that have been established to aid in the design of tables that are meant to be connected through relationships. This set of rules is known as Normalization.
- Benefits of Normalizing your database include:
Reducing required storage space
Preventing the need to restructure existing tables to accommodate new data
Increased speed and flexibility of queries, sorts, and summaries
- First Normal Form
- Second Normal Form
- Third Normal Form
References:
http://www.indiabix.com/technical/dbms-basics/7
http://www.techpreparation.com/mysql-interview-questions-answers1.htm
http://www.dotnetfunda.com/interview/exclusive/showcatquestion.aspx?category=38
- Normalisation Demonstration
http://www.youtube.com/watch?v=fg7r3DgS3rA&feature=related
no data is unnecessarily duplicated
- SQL Normalization - The Basics - 1st, 2nd, 3rd Normal Form Software Engineering Tutorial
http://www.youtube.com/watch?v=ygfikznRjpw&feature=related
a process eliminating redundancy for a database
reduces data anamolies
making OLTP updates efficient by allowing concurrency
1NF
no repeating groups
every row should be unique
eg you can't create a product column and then put nuts cake coak all together in this product column
or
you can't create product1,product2,product3 columns and the seperately put products above in these fields
2NF
first of all must be in 1NF
no partial dependencies
all non-key fields(attributes) must depend on primary key.
but non-key fields don't have to depend ONLY on primary key.
3NF
non-key fields have to depend ONLY on primary key.
connection table(3rd table between tables)
OLAP normalizes up to 2nf means OLAP tables must comply with 1NF and 2NF but not 3NF
- Advantages And Disadvantages Of Database Normalization
Advantages of normalization
1. Smaller database: By eliminating duplicate data, you will be able to reduce the overall size of the database.
2. Better performance:
a. Narrow tables: Having more fine-tuned tables allows your tables to have less columns and allows you to fit more records per data page.
b. Fewer indexes per table mean faster maintenance tasks such as index rebuilds.
c. Only join tables that you need.
Disadvantages of normalization
1. More tables to join: By spreading out your data into more tables, you increase the need to join tables.
2. Tables contain codes instead of real data: Repeated data is stored as codes rather than meaningful data. Therefore, there is always a need to go to the lookup table for the value.
3. Data model is difficult to query against: The data model is optimized for applications, not for ad hoc querying.
http://softwaretestinginterviewfaqs.wordpress.com/category/disadvantages-of-normalization/
Advantages
1) Avoids data modification (INSERT/DELETE/UPDATE) anomalies as each data item lives in One place
2) Greater flexibility in getting the expected data in atomic granular
3) Normalization is conceptually cleaner and easier to maintain and change as your needs change
4) Fewer null values and less opportunity for inconsistency
5) A better handle on database security
6) Increased storage efficiency
7) The normalization process helps maximize the use of clustered indexes, which is the most powerful and useful type of index available. As more data is separated into multiple
tables because of normalization, the more clustered indexes become available to help
speed up data access.
Disadvantages
1) Requires much more CPU, memory, and I/O to process thus normalized data gives reduced database performance
2) Requires more joins to get the desired result. A poorly-written query can bring the database down
3) Maintenance overhead. The higher the level of normalization, the greater the number of tables in the database.
http://sql-databases.blurtit.com/q362724.html
- 1st Normal Form Requirements
The requirements to satisfy the 1st NF:
Each table has a primary key: minimal set of
attributes which can uniquely identify a record
The values in each column of a table are
atomic (No multi-value attributes allowed).
There are no repeating groups: two columns
do not store similar information in the same
table.
The requirements to satisfy the 2nd NF:
All requirements for 1st NF must be met.
Redundant data across multiple rows of a table must be moved to a separate table.
The resulting tables must be related to each other by use of foreign key.
The requirements to satisfy the 3rd NF:
All requirements for 2nd NF must be met.
Eliminate fields that do not depend on the primary key;
That is, any field that is dependent not only on the primary key but also on another field must be moved to another table.
https://docs.google.com/viewer?a=v&q=cache:CKxApoGZTDgJ:www.cs.sjsu.edu/~lee/cs157/26FCS157-Normal_Forms_Carlos_Alvarado.ppt+&hl=en&pid=bl&srcid=ADGEEShIjKxn2PSxvC83qyRjKj3OIOhRHbwR6SrbT9bpSJQfOpVbtlMrK6oz27NEmGApjgDHIbEk3Z2D6dbhHVL8HMotwWeMXw-r0OZ000ZJtxpWa8r3d3hgpjdABya6__jtpR9ldxwT&sig=AHIEtbSNTQYm3z30HRVzmKmURIVFUi4jhQ
- First Normal Form (1st NF)
The table cells must be of single value.
Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.
Do not use multiple fields in a single table to store similar data.
For example, to track an inventory item that may come from two
possible sources, an inventory record may contain fields for
Vendor Code 1 and Vendor Code 2. But what happens when you
a dd a third vendor? Adding a field is not the answer; it requires
program and table modifications and does not smoothly
accommodate a dynamic number of vendors. Instead, place all
vendor information in a separate table called Vendors, then link
inventory t o vendors with an item number key, or vendors to
inventory with a vendor code key.
Second Normal Form (2nd NF)
If it's in 1st NF and if the Primary Key is composite (multiple
columns) then any fact in that table must be a fact about the
entire composite Primary Key not just part of the Primary Key.
For example, if an inventory table has a primary key made
up of two attributes PartId and WarehouseId. Suppose the
inventory table has the warehouse address in it, since
warehouse address is a fact about WarehoseId and not
about the PartId the w arehouse address is in the wrong
table. This is in violation of the 2nd Normal Form.
Third Normal Form (3rd NF)
If it's in the 2nd NF and there are no non-key fields that
depend on attributes in the table other than the Primary Key.
Suppose in the Student table you had student birth date as an
attribute and you also had student's age. Students age depends
on the student's birth date (a fact about his/her birth date) so
3rd Normal Form is violated.
Also, a student table that contains the address of the Department of his/her major.
That address is in the wrong table, because that address should be in the
Department table. That address is not a fact about the Student
Primary Key. A violation of 3rd Normal Form.
A (non-key) fact in a table should be about the key, the whole key,
and nothing but the key.
BCNF
A table is in BCNF if it's in 3rd NF and every determinant can be used as a Primary Key.
Fourth Normal Form A table is in the 4NF if it's in BCNF and has no attribute with mutivalued dependencies.
Note that StudentId can be associated with many major as well as many activities (multivalued dependency).
Multivalued dependency lead to modification anomalies.
https://docs.google.com/viewer?a=v&q=cache:2NuHEWsb7NoJ:myweb.brooklyn.liu.edu/gnarra/database/downloads/Normalization.pdf+&hl=en&pid=bl&srcid=ADGEESjEt-GQQAGsVAqLfHJcw42ot8iruhuv9B-opTx0N7nYC8Q8c6tzeN8QH-v2XuWtpCPc5BYJ9qZolgOIMCzXQjo1TJyRwEBJLhmYaP0OtiNX84rZkqhYd0jH_Nvjooz1_J09ns1m&sig=AHIEtbR3XZpPxlh8ZGzJCUh-8Jg6IzLEig
- Objectives of Normalization
Develop a good description of the data, its relationships and constraints
Produce a stable set of relations that
Is a faithful model of the enterprise
Is highly flexible
Reduces redundancy-saves space and reduces inconsistency in data
Is free of update, insertion and deletion anomalies
Normal Forms
First normal form -1NF
Second normal form-2NF
Third normal form-3NF
Boyce-Codd normal form-BCNF
Fourth normal form-4NF
Fifth normal form-5NF
Domain/Key normal form-DKNF
Each is contained within the previous form – each has stricter rules than the previous form Objectives of Normalization
Develop a good description of the data, its relationships and constraints
Produce a stable set of relations that
Is a faithful model of the enterprise
Is highly flexible
Reduces redundancy-saves space and reduces inconsistency in data
Is free of update, insertion and deletion anomalies
Normal Forms
First normal form -1NF
Second normal form-2NF
Third normal form-3NF
Boyce-Codd normal form-BCNF
Fourth normal form-4NF
Fifth normal form-5NF
Domain/Key normal form-DKNF
Each is contained within the previous form – each has stricter rules than the previous form Objectives of Normalization
http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html