Tuesday, January 28, 2014

Functional Dependency

  • Example Functional Dependencies

Let R be
NewStudent(stuId, lastName, major, credits, status, socSecNo)

FDs in R include

    {stuId}→{lastName}, but not the reverse
    {stuId} →{lastName, major, credits, status, socSecNo, stuId}
    {socSecNo} →{stuId, lastName, major, credits, status, socSecNo}
    {credits}→{status}, but not {status}→{credits}

ZipCode→AddressCity

    16652 is Huntingdon’s ZIP

ArtistName→BirthYear

    Picasso was born in 1881

Autobrand→Manufacturer, Engine type

    Pontiac is built by General Motors with gasoline engine

Author, Title→PublDate

    Shakespeare’s Hamlet was published in 1600

Trivial Functional Dependency

The FD X→Y is trivial if set {Y} is a subset of set {X}

Examples: If A and B are attributes of R,

    {A}→{A}
    {A,B} →{A}
    {A,B} →{B}
    {A,B} →{A,B}

are all trivial FDs and will not contribute to the evaluation of normalization.

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html



  •  (Rel. DB Design) Consider the EMP_PROJ relation schema with the attributes SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC (project location). The following set of functional dependencies hold on this schema:

SSN->ENAME
PNUMBER -> PNAME,PLOC
SSN,PNUMBER -> HOURS.

(5) Compute the closure of {SSN,PNUMBER} from these functional dependencies.
SSN,PNUMBER->ENAME, PNAME, PLOC, HOURS.

(1) What is a candidate key for this relation schema?
   Candidate key is {SSN, PNUMBER}.

(4) Is this schema in BCNF? State one key problem with this schema.
It is not in BCNF because SSN->ENAME is not a superkey dependency.
     Therefore, all tuples containing the same SSN will also have the same ENAME, causing redundancy.
This results in waste of space and problems with updating the data consistently.

(10) Decompose this schema into BCNF or 3NF (your choice).
3NF decomposition: {SSN, ENAME},  {PNUMBER, PNAME, PLOC},  {SSN, PNUMBER, HOURS}
     BCNF decomposition: same.

http://cs.nyu.edu/courses/spring00/G22.2433-001/answers.html





  • 5. For parts a-c, assume we have a relation with the scheme

Book (Title, Author, Publisher, PubAddress, PubZip, CopyrightYear, ISBN )
//ISBN = International Standard Book Numbers

a.  What would be the likely primary key attribute(s)? ________ISBN____________________[2]

b. List all non-trivial functional dependencies [7]?

ISBN -> {Title, Author, Publisher, PubAddress, PubZip, CopyrightYear }

{Title, Author} ->{ Publisher, PubAddress, PubZip, CopyrightYear, ISBN}

Publisher -> PubAddress, PubZip


c. If this relation were used as defined (not normalized), describe the insertion and deletion anomalies that could arise. [3]
A publisher’s address cannot be stored additionally without at least a book
A publisher’s address is replicated and if changed, would have to update many records




http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html


  • FD Axioms
Understanding: Functional Dependencies are recognized by analysis of the real world; no automation or algorithm.
Finding or recognizing them are the database designer's task.

Axiom Name      Axiom  Example
Reflexivity    if a is set of attributes, b ⊆ a, then a →b    SSN,Name → SSN
Augmentation    if a→ b holds and c is a set of attributes, then ca→cb  SSN → Name then
SSN,Phone → Name, Phone
Transitivity    if a →b holds and b→c holds, then a→ c holds    SSN →Zip and Zip → City then SSN →City
Union or Additivity *  if a → b and a → c holds then a→ bc holds      SSN→Name and SSN→Zip then SSN→Name,Zip
Decomposition or Projectivity*  if a → bc holds then a → b and a → c holds      SSN→Name,Zip then SSN→Name and SSN→Zip
Pseudotransitivity*    if a → b and cb → d hold then ac → d holds      Address → Project and Project,Date →Amount then Address,Date → Amount
(NOTE)  ab→ c does NOT imply a → b and b → c

*Armstrong's Axioms (basic axioms)


http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html



  • Consider relation R = (A, B, C, D) and the following statements for functional dependencies.
For each statement, if it is true, prove it. Otherwise, show a counter-example to disprove it.
(4 points)
(a) If A → B and A → C, then A → BC (2 points)
⇒ Answer: True.
For A → B, then A → AB (augmentation rule);
For A → C, then AB → BC (augmentation rule);
Then A → BC (transitivity rule)
(b) If A → B and C → D, then AC → BD (2 points)
⇒ Answer: True.
AC → A (reflexivity rule);
For A → B, s.t. AC → B (transitivity rule);
AC → C(reflexivity rule);
For C → D, s.t. AC → D (transitivity rule);
So AC → BD
Rubric: Each correct proof gets two points. If the answer is FALSE, the student gets zero
points

webcache.googleusercontent.com/search?q=cache:8vRupC7L9OYJ:https://wiki.engr.illinois.edu/download/attachments/227743489/CS411-F2011-Final-Sol.pdf%3Fversion%3D1%26modificationDate%3D1380470739000+&cd=3&hl=tr&ct=clnk&gl=tr&client=firefox-a




  • Schema Design Example

Problem 2 (10 points) Schema Design Using the following database description,
create a relational schema. Remember to identify primary keys and foriegn keys correctly. Select
approaches that yield the fewest number of relations. Remember to state any assumptions you use
while creating the schema.
Let’s create a movie database, similiar to what IMDB would use.
• A movie has an ID number, a release date, a title, and a running time.
• A movie has a number of people that work on it, including 1 director and 1 producer.
• A movie has many actors. Each actor has a character name for a particular movie.
• Each person has an ID Number, a real name, a birthday, and an address
• Our website will have users that log on. These users are different from the people associated
with the movies.
• Each user had a unique logon name and a password.
• A user can leave reviews of movies. They can only leave one review per movie, but can review
as many movies as they like.
Solution
Movie(MovieID,ReleaseDate,Title,RunningTime,DirectorID,ProducerID)
Person(PersonID,Name,Birthday,Address)
Actors(MovieID,PersonID,CharacterName)
Users(UserName,Password)
Reviews(UserName,MovieID,Review)
Foriegn Keys: Movie.DirectorID is a foriegn key to Person. Movie.ProducerID is a foriegn key
to Person. Actors.MovieID is a foriegn key to Movie. Actors.PersonID is a foriegn key to Person.
Reviews.UserName is a foriegn key to Users. Reviews.MovieID is a foriegn key to Movie.

webcache.googleusercontent.com/search?q=cache:8vRupC7L9OYJ:https://wiki.engr.illinois.edu/download/attachments/227743489/CS411-F2011-Final-Sol.pdf%3Fversion%3D1%26modificationDate%3D1380470739000+&cd=3&hl=tr&ct=clnk&gl=tr&client=firefox-a



  • For the remaining questions, use the following relational schema for a music albums database.  Keys are (mostly) underlined.  The attributes should be self-evident.  If not, please ask for clarification.  For a given music track, we code the title, its play length in time (minutes:seconds), its genre (pop, metal, jazz, etc.) and a 5 star maximum rating.  The musicians, singers and instrumentalists are all listed in on their contribution to the track.  A person may have 1 or more listing for a track. For example someone may both sing and play the piano.  The album is a collection of tracks.  An album is distributed and owned by a company called the label and has a producer and an engineer.

PEOPLE (PID, name, address, zip, phone)
CSZ (zip, city, state)
TRACKS (trID, title, length, genre, rating, albID) //trID is unique across all albums
ALBUMS (albID, albumTitle, year, label, prodPID, engPID, length, price)
CONTRIBS (trID, PID, role)


a) List all names and phone numbers of people from zip 90210. [5]


SELECT P.name, P.phone
        FROM PEOPLE P
        WHERE zip = ‘90210’;  --may or may not be quoted, alias not necessary


        b) List album titles and labels and producer names with a list price of more than $18. [5]


SELECT A.albumTitle, A.label, A.price, P.name
FROM Albums A, People P
WHERE A.price > 18 AND A,prodPID=P.PID

c) List all the musicians by name and what they played or contributed to on all jazz type tracks. [5]

SELECT P.name, C.role
FROM TRACKS T NATURAL JOIN CONTRIBS C
      NATURAL JOIN PEOPLE P
WHERE T.genre = ‘JAZZ’


d) Get a list of names and addresses of people who produced OR engineered an album, but did not perform on any track.  (Hint: subselect and set operations are very helpful). [6]



SELECT P.name, P.address, Z.city, Z.state, Z.zip
FROM PEOPLE P
WHERE P.PID IN
  (SELECT A.prodPID
    FROM ALBUMS A
      UNION
  SELECT B.engPID
    FROM ALBUMS B
    EXCEPT
  SELECT C.PID
    FROM CONTRIBS
)


e) List names of musicians who have contributed in at least two different roles on the same tracks with ratings 4 or higher. (Use group by… having and not a self-join). [6]



SELECT P.name
FROM PEOPLE P NATURAL JOIN CONTRIBS C NATURAL JOIN TRACKS T
WHERE T.rating>4
GROUP BY C.trID, C.PID
HAVING COUNT(DISTINCT C.role)>=2




f) What is the average price of albums for each year of release (show years), but only for albums with 6 or more tracks and length of 30 or more minutes. (Need a subselect and group by having.) [8]

SELECT AVG(A.price), A.year
FROM ALBUMS A
WHERE A.length >=30 and A.albID IN
    (SELECT T.albID
      FROM TRACKS T
      GROUP BY T.albID
      HAVING COUNT (*)>=6
)
GROUP BY A.years


jcsites.juniata.edu/faculty/rhodes/dbms/exams/mid2f13key.docx

No comments:

Post a Comment