Sunday, October 30, 2011

Normalization

  • What is normalization?

It is a process of analysing the given relation schemas based on their Functional Dependencies (FDs) and primary key to achieve the properties
(1).Minimizing redundancy, (2). Minimizing insertion, deletion and update anomalies.

Normalization is the process of reducing data redundancy and maintains data integrity. This is performed by creating relationships among tables through primary and foreign keys. Normalization procedure includes 1NF, 2NF, 3NF, BCNF, and then the data is normalized.

It is set of rules that have been established to aid in the design of tables that are meant to be connected through relationships. This set of rules is known as Normalization.

  • Benefits of Normalizing your database include:
Avoiding repetitive entries
Reducing required storage space
Preventing the need to restructure existing tables to accommodate new data
Increased speed and flexibility of queries, sorts, and summaries


  • First Normal Form
In addition to breaking data up into the smallest meaningful values, tables in first normal form should not contain repetitions groups of fields

  • Second Normal Form
The second normal form states that each field in a multiple field primary key table must be directly related to the entire primary key. In other words, each non-key field should be a fact about all the fields in the primary key.


  • Third Normal Form
A non-key field should not depend on another Non-key field. The field Total is dependent on Unit price and qty.


References:
http://www.indiabix.com/technical/dbms-basics/7
http://www.techpreparation.com/mysql-interview-questions-answers1.htm
http://www.dotnetfunda.com/interview/exclusive/showcatquestion.aspx?category=38







  • Normalisation Demonstration

http://www.youtube.com/watch?v=fg7r3DgS3rA&feature=related
no data is unnecessarily duplicated







  • SQL Normalization - The Basics - 1st, 2nd, 3rd Normal Form Software Engineering Tutorial

http://www.youtube.com/watch?v=ygfikznRjpw&feature=related

a process eliminating redundancy for a database
reduces data anamolies
making OLTP updates efficient by allowing concurrency

1NF
no repeating groups
every row should be unique

eg you can't create a product column and then put nuts cake coak all together in this product column
or
you can't create product1,product2,product3 columns and the seperately put products above in these fields

2NF
first of all must be in 1NF
no partial dependencies

all non-key fields(attributes) must depend on primary key.
but non-key fields don't have to depend ONLY on primary key.

3NF
non-key fields  have to depend ONLY on primary key.
connection table(3rd table between tables)

OLAP normalizes up to 2nf means OLAP tables must comply with 1NF and 2NF but not 3NF





  • Advantages And Disadvantages Of Database Normalization



Advantages of normalization

1. Smaller database: By eliminating duplicate data, you will be able to reduce the overall size of the database.
2. Better performance:

a. Narrow tables: Having more fine-tuned tables allows your tables to have less columns and allows you to fit more records per data page.
b. Fewer indexes per table mean faster maintenance tasks such as index rebuilds.
c. Only join tables that you need.


Disadvantages of normalization

1.                  More tables to join: By spreading out your data into more tables, you increase the need to join tables.
2. Tables contain codes instead of real data: Repeated data is stored as codes rather than meaningful data. Therefore, there is always a need to go to the lookup table for the value.
3. Data model is difficult to query against: The data model is optimized for applications, not for ad hoc querying.

http://softwaretestinginterviewfaqs.wordpress.com/category/disadvantages-of-normalization/



Advantages

1) Avoids data modification (INSERT/DELETE/UPDATE) anomalies as each data item lives in One place

2) Greater flexibility in getting the expected data in atomic granular

3) Normalization is conceptually cleaner and easier to maintain and change as your needs change

4) Fewer null values and less opportunity for inconsistency

5) A better handle on database security

6) Increased storage efficiency

7) The normalization process helps maximize the use of clustered indexes, which is the most powerful and useful type of index available. As more data is separated into multiple
tables because of normalization, the more clustered indexes become available to help
speed up data access.


Disadvantages

1) Requires much more CPU, memory, and I/O to process thus normalized data gives reduced database performance

2) Requires more joins to get the desired result. A poorly-written query can bring the database down

3) Maintenance overhead. The higher the level of normalization, the greater the number of tables in the database.

http://sql-databases.blurtit.com/q362724.html





  • 1st Normal Form Requirements

The requirements to satisfy the 1st NF:

Each table has a primary key: minimal set of
attributes which can uniquely identify a record

The values in each column of a table are
atomic (No multi-value attributes allowed).

There are no repeating groups: two columns
do not store similar information in the same
table.



The requirements to satisfy the 2nd NF:

All requirements for 1st NF must be met.
Redundant data across multiple rows of a table must be moved to a separate table.
The resulting tables must be related to each other by use of foreign key.


The requirements to satisfy the 3rd NF:

All requirements for 2nd NF must be met.
Eliminate fields that do not depend on the primary key;
That is, any field that is dependent not only on the  primary key but also on another field must be moved to another table.

https://docs.google.com/viewer?a=v&q=cache:CKxApoGZTDgJ:www.cs.sjsu.edu/~lee/cs157/26FCS157-Normal_Forms_Carlos_Alvarado.ppt+&hl=en&pid=bl&srcid=ADGEEShIjKxn2PSxvC83qyRjKj3OIOhRHbwR6SrbT9bpSJQfOpVbtlMrK6oz27NEmGApjgDHIbEk3Z2D6dbhHVL8HMotwWeMXw-r0OZ000ZJtxpWa8r3d3hgpjdABya6__jtpR9ldxwT&sig=AHIEtbSNTQYm3z30HRVzmKmURIVFUi4jhQ




  • First Normal Form (1st NF)

The table cells must be of single value.
Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.


Do not use multiple fields in a single table to store similar data.
For example, to track an inventory item that may come from two
possible sources, an inventory record may contain fields for
Vendor Code 1 and Vendor Code 2. But what happens when you
a dd a third vendor? Adding a field is not the answer; it requires
program and table modifications and does not smoothly
accommodate a dynamic number of vendors. Instead, place all
vendor information in a separate table called Vendors, then link
inventory t o vendors with an item number key, or vendors to
inventory with a vendor code key.



Second Normal Form (2nd NF)
If it's in 1st NF and if the Primary Key is composite (multiple
columns) then any fact in that table must be a fact about the
entire composite Primary Key not just part of the Primary Key.

For example, if an inventory table has a primary key made
up of two attributes PartId and WarehouseId. Suppose the
inventory table has the warehouse address in it, since
warehouse address is a fact about WarehoseId and not
about the PartId the w arehouse address is in the wrong
table. This is in violation of the 2nd Normal Form.


Third Normal Form (3rd NF)
If it's in the 2nd NF and there are no non-key fields that
depend on attributes in the table other than the Primary Key.

Suppose in the Student table you had student birth date as an
attribute and you also had student's age. Students age depends
on the student's birth date (a fact about his/her birth date) so
3rd Normal Form is violated.

Also, a student table that contains the address of the Department of his/her major.
That address is in the wrong table, because that address should be in the
Department table. That address is not a fact about the Student
Primary Key. A violation of 3rd Normal Form.

A (non-key) fact in a table should be about the key, the whole key,
and nothing but the key.




BCNF
A table is in BCNF if it's in 3rd NF and every determinant can be used as a Primary Key.

Fourth Normal Form A table is in the 4NF if it's in BCNF and has no attribute with mutivalued dependencies.

Note that StudentId can be associated with many major as well as many activities (multivalued dependency).
Multivalued dependency lead to modification anomalies.


https://docs.google.com/viewer?a=v&q=cache:2NuHEWsb7NoJ:myweb.brooklyn.liu.edu/gnarra/database/downloads/Normalization.pdf+&hl=en&pid=bl&srcid=ADGEESjEt-GQQAGsVAqLfHJcw42ot8iruhuv9B-opTx0N7nYC8Q8c6tzeN8QH-v2XuWtpCPc5BYJ9qZolgOIMCzXQjo1TJyRwEBJLhmYaP0OtiNX84rZkqhYd0jH_Nvjooz1_J09ns1m&sig=AHIEtbR3XZpPxlh8ZGzJCUh-8Jg6IzLEig


  • Objectives of Normalization

Develop a good description of the data, its relationships and constraints

Produce a stable set of relations that

    Is a faithful model of the enterprise
    Is highly flexible
    Reduces redundancy-saves space and reduces inconsistency in data
    Is free of update, insertion and deletion anomalies

Normal Forms

    First normal form -1NF
    Second normal form-2NF
    Third normal form-3NF
    Boyce-Codd normal form-BCNF
    Fourth normal form-4NF
    Fifth normal form-5NF
    Domain/Key normal form-DKNF

Each is contained within the previous form – each has stricter rules than the previous form Objectives of Normalization

Develop a good description of the data, its relationships and constraints

Produce a stable set of relations that

    Is a faithful model of the enterprise
    Is highly flexible
    Reduces redundancy-saves space and reduces inconsistency in data
    Is free of update, insertion and deletion anomalies

Normal Forms

    First normal form -1NF
    Second normal form-2NF
    Third normal form-3NF
    Boyce-Codd normal form-BCNF
    Fourth normal form-4NF
    Fifth normal form-5NF
    Domain/Key normal form-DKNF

Each is contained within the previous form – each has stricter rules than the previous form Objectives of Normalization

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html


No comments:

Post a Comment