Thursday, April 12, 2012

Relational Algebra - Select and Project Operators


  • Relational Algebra - Select and Project Operators

http://www.youtube.com/watch?v=yVh_LcOcQdg



  • Relational Algebra


Relational SELECT

SELECT is used to obtain a subset of the tuples of a relation that satisfy a select condition.

For example, find all employees born after 1st Jan 1950:

SELECT dob > '01/JAN/1950' (employee)


Relational PROJECT

The PROJECT operation is used to select a subset of the attributes of a relation by specifying the names of the required attributes.

For example, to get a list of all employees' surnames and employee numbers:

PROJECT surname, empno (employee)
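
As a rough illustration of the two operators, the sketch below models a relation as a list of dictionaries. The sample rows and the helper names `select` and `project` are invented for this example and do not come from the linked notes.

```python
from datetime import date

# Hypothetical sample data; the rows and values are invented for illustration.
employee = [
    {"empno": 1, "surname": "Smith", "dob": date(1945, 3, 2)},
    {"empno": 2, "surname": "Jones", "dob": date(1962, 7, 19)},
    {"empno": 3, "surname": "Brown", "dob": date(1951, 1, 5)},
]

def select(relation, predicate):
    """SELECT: keep the tuples (rows) that satisfy the condition."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """PROJECT: keep only the named attributes, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# SELECT dob > '01/JAN/1950' (employee)
born_after_1950 = select(employee, lambda r: r["dob"] > date(1950, 1, 1))
# PROJECT surname, empno (employee)
names_and_numbers = project(employee, ["surname", "empno"])
```

SELECT narrows the rows and leaves the columns alone; PROJECT does the opposite, and may shrink the row count only because duplicates collapse.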

http://db.grussell.org/section010.html#_Toc67114472




  • Relational Algebra: 5 Basic Operations


• Selection (σ) Selects a subset of rows from a relation (horizontal).
• Projection (π) Retains only wanted columns from a relation (vertical).
• Cross-product (×) Allows us to combine two relations.
• Set-difference (−) Tuples in r1, but not in r2.
• Union (∪) Tuples in r1 and/or in r2.
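
The five operations map closely onto Python set operations when a relation is modeled as a set of tuples. This is only a sketch; the relations `r1` and `r2` and their values are made up for the example.

```python
from itertools import product

# Relations modeled as sets of tuples; the sample values are invented.
r1 = {(1, "a"), (2, "b"), (3, "c")}
r2 = {(2, "b"), (4, "d")}

selection = {t for t in r1 if t[0] >= 2}      # rows whose first attribute is >= 2
projection = {(t[1],) for t in r1}            # keep only the second column
cross = {a + b for a, b in product(r1, r2)}   # every pairing of an r1 tuple with an r2 tuple
difference = r1 - r2                          # tuples in r1 but not in r2
union = r1 | r2                               # tuples in r1 and/or in r2
```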

https://docs.google.com/viewer?a=v&q=cache:VDokuEkCX5wJ:inst.eecs.berkeley.edu/~cs186/sp06/lecs/lecture8Alg.ppt+&hl=en&pid=bl&srcid=ADGEESgiCeJZcOiv5iPRotaxu6pomoztERrMYuEVScwpi1kqlrF3ep4OJFlHIAWi4oJY0lFzFdq_eN73o0g7LQQo0Hvq34G_A9_pPIHPycr-NpyCL8B4brQhGmZwtReFMTuvHpynj-w5&sig=AHIEtbS3ZRzsDc-Udg4I5DzR4P1muwA-VA



  • Relational Algebra


An algebra is a formal structure consisting of sets and operations on those sets.
Relational algebra is a formal system for manipulating relations.

Operands of this algebra are relations.
Operations of this algebra include the usual set operations (since relations are sets of tuples) and special operations defined for relations:
selection
projection
join
http://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html



  • relational schema for a music albums database.
Keys are (mostly) underlined.
The attributes should be self-evident.
For a given music track, we code the title, its play length in time (minutes:seconds), its genre (pop, metal, jazz, etc.) and a 5 star maximum rating.
The musicians, singers and instrumentalists are all listed by their contribution to the track.
A person may have 1 or more listing for a track. For example someone may both sing and play the piano.
The album is a collection of tracks.  An album is distributed and owned by a company called the label and has a producer and an engineer.


PEOPLE (PID, name, address, zip, phone)
CSZ (zip, city, state)
TRACKS (trID, title, length, genre, rating, albID) //trID is unique across all albums
ALBUMS (albID, albumTitle, year, label, prodPID, engPID, length, price)
CONTRIBS (trID, PID, role)


Use the R.A. notation below.
BE EXPLICIT in the join condition which attributes make the join where necessary.

Syntax reminder for Relational Algebra expressions:
SELECT :  σ condition (relation)
PROJECT : π attribute-list (relation)
SET Operations and JOIN:  relation1 OP relation2, where OP is ∪, ∩, −, ×, ÷, and |X| condition
RENAME:  relation[new attribute names]
ASSIGN:    new-relation(attrs) ← R.A. expression


a) List all names and phone numbers of people from zip 90210.
name, phone(zip=90210(PEOPLE))

b) List album titles and labels with a list price of more than $18.
albumTitle, label(price>18(ALBUMS))


c) List all the musicians and what they played or contributed to on all jazz type tracks.
name, role(genre= ‘jazz’(TRACKS |X| trID=trID CONTRIBS |X| PID= PID PEOPLE))


d) Get a list of names of people who produced OR engineered an album, but did not perform on any track.  (Hint: set operations are very helpful)
d) name(((prodPID ALBUMS)[PID]  (engrPID ALBUMS)[PID]) - PID CONTRIB)

          |X| PID= PID PEOPLE)


e) List names of musicians who have contributed in at least two different roles on the same tracks with ratings 4 or higher. (Hint: self-join)
name, role(rating>= 4 and role <>role2
(CONTRIBS |X| trID=trID and PID=PID CONTRIBS[trID, PID, role2] )
|X| PID= PID PEOPLE))


http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html



  • relational algebra

Consider the following relational database schema of people who place book orders.
Book(BookID,title,price)
Person(PersonID,Name,Zip)
Orders(PersonID,BookID,quantity,BillingID)
Billing(BillingID,PersonID,CreditCardNum)
Answer the following questions based on this schema. Pay particular attention to the language we ask for the query in.
(a) Write a query in Relational Algebra to find the title of the book(s) with the lowest price. (3 points)
(b) Write a query in Relational Algebra to find the Zip of every person who ordered the book with the title 'Database Systems'. (4 points)
(c) Write a query in SQL to create the table Billing. Remember to specify the primary key and the foreign key constraints. All columns are of type Varchar(255). No column is allowed to be NULL. (4 points)
(d) Write a query in SQL to find how much money has been spent on the books. (4 points)

Solution:
(a) πtitle(Book) − πB1.title(ρB1(Book) ⋈c ρB2(Book))
where c is: B1.price > B2.price
(b) πZip(σtitle='Database Systems'((Book ⋈c1 Orders) ⋈c2 Person))
where c1 is: Book.BookID = Orders.BookID
and c2 is: Orders.PersonID = Person.PersonID
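
The idea behind solution (a) is that any book appearing on the left side of the self-join is more expensive than some other book, so removing those titles leaves only the cheapest. A small sketch of that reasoning, with invented `Book` rows:

```python
# Hypothetical Book rows: (BookID, title, price). Values are invented.
book = {
    (1, "Database Systems", 80),
    (2, "Operating Systems", 65),
    (3, "Networks", 65),
}

# Theta-join of Book with itself under c: B1.price > B2.price.
# Every B1 that appears in the join is more expensive than some other book.
not_cheapest = {b1[1] for b1 in book for b2 in book if b1[2] > b2[2]}

all_titles = {b[1] for b in book}       # pi_title(Book)
cheapest = all_titles - not_cheapest    # set difference leaves the lowest-priced titles
```

Ties are handled naturally: both books priced at 65 survive the difference.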
(c) CREATE TABLE Billing (
BillingID Varchar(255) PRIMARY KEY,
PersonID Varchar(255) NOT NULL,
CreditCardNum Varchar(255) NOT NULL,
FOREIGN KEY (PersonID) REFERENCES Person(PersonID)
)
(d) SELECT SUM(booksales) FROM (SELECT (price*quantity) AS booksales FROM Orders o
JOIN Book b ON o.BookID = b.BookID) AS sales
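
To try parts (c) and (d) without a database server, the statements can be run against an in-memory SQLite database from Python. This is a sketch under assumptions: the sample rows are invented, the column types of the other tables are guesses, and SQLite needs `PRAGMA foreign_keys = ON` plus an explicit `NOT NULL` on a non-integer primary key to enforce what the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
conn.executescript("""
CREATE TABLE Person (PersonID VARCHAR(255) PRIMARY KEY, Name VARCHAR(255), Zip VARCHAR(255));
CREATE TABLE Book (BookID VARCHAR(255) PRIMARY KEY, title VARCHAR(255), price REAL);
CREATE TABLE Billing (
    BillingID VARCHAR(255) PRIMARY KEY NOT NULL,
    PersonID VARCHAR(255) NOT NULL,
    CreditCardNum VARCHAR(255) NOT NULL,
    FOREIGN KEY (PersonID) REFERENCES Person(PersonID)
);
CREATE TABLE Orders (PersonID VARCHAR(255), BookID VARCHAR(255),
                     quantity INTEGER, BillingID VARCHAR(255));
-- Invented sample rows
INSERT INTO Person VALUES ('p1', 'Ann', '61801');
INSERT INTO Book VALUES ('b1', 'Database Systems', 50.0), ('b2', 'Networks', 20.0);
INSERT INTO Billing VALUES ('bill1', 'p1', '0000');
INSERT INTO Orders VALUES ('p1', 'b1', 2, 'bill1'), ('p1', 'b2', 3, 'bill1');
""")

# Part (d): total money spent = sum of price * quantity over all orders.
total_spent = conn.execute(
    "SELECT SUM(b.price * o.quantity) FROM Orders o JOIN Book b ON o.BookID = b.BookID"
).fetchone()[0]
```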

webcache.googleusercontent.com/search?q=cache:8vRupC7L9OYJ:https://wiki.engr.illinois.edu/download/attachments/227743489/CS411-F2011-Final-Sol.pdf%3Fversion%3D1%26modificationDate%3D1380470739000+&cd=3&hl=tr&ct=clnk&gl=tr&client=firefox-a

Why is paging used?


Paging is a solution to the external fragmentation problem: it permits the physical memory backing a process's logical address space to be noncontiguous, so a process can be allocated physical memory wherever it is available.

http://www.techinterviews.com/operating-system-questions

Disk-scheduling algorithms


  • Disk Scheduling

The seek time is the time for the disk arm to move the heads to the cylinder containing the desired sector.
The rotational latency is the additional time for the disk to rotate the desired sector to the disk head.




  • FCFS Scheduling

The simplest form of disk scheduling is the first-come, first-served (FCFS) algorithm. This algorithm is intrinsically fair, but it generally does not provide the fastest service.



  • SSTF Scheduling

It seems reasonable to service all the requests close to the current head position before moving the head far away to service other requests. This assumption is the basis for the shortest-seek-time-first (SSTF) algorithm.



  • SCAN Scheduling

The SCAN algorithm is sometimes called the elevator algorithm, since the disk arm behaves just like an elevator in a building, first servicing all the requests going up and then reversing to service requests the other way.



  • C-SCAN Scheduling

Circular SCAN (C-SCAN) scheduling is a variant of SCAN designed to provide a more uniform wait time. Like SCAN, it moves the head from one end of the disk to the other, servicing requests along the way.
When the head reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip.




  • LOOK Scheduling

LOOK scheduling improves upon SCAN by looking ahead at the queue of pending requests, and not moving the heads any farther towards the end of the disk than is necessary.


http://siber.cankaya.edu.tr/ozdogan/OperatingSystems/ceng328/node263.html



  • TYPES OF DISK SCHEDULING ALGORITHMS 


Given the queue 95, 180, 34, 119, 11, 123, 62, 64, with the read-write head initially at track 50 and the last track being 199, let us now discuss the different algorithms.

1. First Come -First Serve (FCFS)
All incoming requests are placed at the end of the queue. Whatever number that is next in the queue will be the next number served.
To determine the number of head movements, you simply find the number of tracks it took to move from one request to the next. For this case the head went from 50 to 95 to 180 and so on. From 50 to 95 it moved 45 tracks. Tallying all the moves (45 + 85 + 146 + 85 + 108 + 112 + 61 + 2) gives a total head movement of 644 tracks. The disadvantage of this algorithm shows in the oscillation from track 50 up to track 180, back down to track 11, up to 123, and then down to 64. As you will soon see, this is the worst of the algorithms compared here for this queue.
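
The per-request arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `fcfs_head_movement` is made up for the example.

```python
def fcfs_head_movement(start, requests):
    """Total tracks traversed when requests are served in arrival order."""
    total, position = 0, start
    for track in requests:
        total += abs(track - position)  # distance from the current position to the next request
        position = track
    return total

queue = [95, 180, 34, 119, 11, 123, 62, 64]
fcfs_total = fcfs_head_movement(50, queue)
# 45 + 85 + 146 + 85 + 108 + 112 + 61 + 2 = 644
```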


2. Shortest Seek Time First (SSTF)
In this case requests are serviced according to the next shortest seek distance.
Starting at 50, the next request served is 62 rather than 34, since 62 is only 12 tracks away while 34 is 16 tracks away.
From 62 the head then moves to 64 rather than 34, since 64 is only 2 tracks away while 34 is 28 tracks away.
Although this seems to be better service, with a total of 236 tracks, it is not optimal, and it can starve requests.
The reason is that if many requests keep arriving close to the current head position, requests farther away may never be handled, since their distance will always be greater.
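
The greedy choice SSTF makes at each step can be sketched as follows. The helper names are invented for the example, and ties (which do not occur for this queue) are broken by arrival order.

```python
def sstf_order(start, requests):
    """Repeatedly serve the pending request closest to the current head position."""
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order

def head_movement(start, order):
    """Sum of distances between consecutive head positions."""
    return sum(abs(b - a) for a, b in zip([start] + order, order))

queue = [95, 180, 34, 119, 11, 123, 62, 64]
order = sstf_order(50, queue)
sstf_total = head_movement(50, order)
```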

3. Elevator (SCAN)
This approach works like an elevator does. It scans down towards the nearest end and then, when it hits the bottom, it scans up, servicing the requests that it didn't get going down. If a request comes in after its track has been scanned, it will not be serviced until the head comes back down or moves back up. For the queue above, the head moves 50 tracks down to track 0 and then 180 tracks up, a total of 230 tracks.
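
A small sketch of this variant of SCAN, assuming the head first sweeps down to track 0 and then reverses, as in the example. The function name and the `low` parameter are illustrative.

```python
def scan_down_then_up(start, requests, low=0):
    """SCAN: sweep down to the lowest track, then reverse and sweep up."""
    below = sorted((t for t in requests if t <= start), reverse=True)
    above = sorted(t for t in requests if t > start)
    total = start - low              # the head travels all the way to the end
    if above:
        total += above[-1] - low     # then back up to the farthest pending request
    return below + above, total

queue = [95, 180, 34, 119, 11, 123, 62, 64]
scan_order, scan_total = scan_down_then_up(50, queue)
```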

4. Circular Scan (C-SCAN)
Circular scanning works just like the elevator to some extent. It begins its scan toward the nearest end and works its way all the way to the end of the disk. Once it hits the bottom or top it jumps to the other end and continues moving in the same direction. Keep in mind that, in this example, the jump is not counted as head movement. The total head movement for this algorithm is only 187 tracks, but this still isn't the most efficient.


5. C-LOOK
This is just an enhanced version of C-SCAN. Here the scan doesn't go past the last request in the direction that it is moving. It too jumps to the other end, but only as far as the furthest request rather than all the way to the end of the disk. C-SCAN had a total movement of 187 tracks, but C-LOOK reduces it to 157 tracks.
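
The two totals can be reproduced with the sketch below. It follows this example's conventions, which are assumptions rather than universal rules: the head initially moves toward track 0, and the jump to the other end is not counted as head movement. The function names are invented.

```python
def c_scan_total(start, requests, low=0, high=199):
    """Sweep down to `low`, jump to `high` (jump not counted), keep sweeping down."""
    remaining = [t for t in requests if t > start]
    total = start - low
    if remaining:
        total += high - min(remaining)   # from the top end down to the last pending request
    return total

def c_look_total(start, requests):
    """Like C-SCAN, but only travel as far as the outermost requests."""
    below = [t for t in requests if t <= start]
    above = [t for t in requests if t > start]
    total = (start - min(below)) if below else 0
    if above:
        total += max(above) - min(above)  # jump to the farthest request, then sweep down
    return total

queue = [95, 180, 34, 119, 11, 123, 62, 64]
c_scan = c_scan_total(50, queue)
c_look = c_look_total(50, queue)
```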

http://www.cs.iit.edu/~cs561/cs450/disksched/disksched.html



  • https://www.cs.washington.edu/education/courses/451/04au/section/section7.pdf


  • The set of requests is 98 183 37 122 14 124 65 67

and the disk head starts at cylinder 53. 

Where direction is important (LOOK and SCAN), the disk head is moving outward.

Order of Service

algorithm  request order
fcfs        98  183   37  122   14  124   65   67
pickup      65   67   98  122  124  183   37   14
sstf        65   67   37   14   98  122  124  183
scan        37   14   65   67   98  122  124  183
look        37   14   65   67   98  122  124  183
c-scan      65   67   98  122  124  183   14   37
c-look      65   67   98  122  124  183   14   37

Head Motion

This chart shows how far the disk heads move to service each request, and the mean and standard deviation of the head motion.

algorithm  cylinders moved per request             total    avg  stdev
fcfs        45   85  146   85  108  110   59    2    640  80.00  44.47
pickup      12    2   31   24    2   59  146   23    299  37.38  47.57
sstf        12    2   30   23   84   24    2   59    236  29.50  28.62
scan        16   23   79    2   31   24    2   59    236  29.50  26.97
look        16   23   51    2   31   24    2   59    208  26.00  20.72
c-scan      12    2   31   24    2   59  231   23    384  48.00  76.18
c-look      12    2   31   24    2   59  169   23    322  40.25  55.16
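
The fcfs row can be recomputed from the request order. The sketch assumes the stdev column is the sample standard deviation, which is what reproduces the published figure.

```python
from statistics import mean, stdev

start = 53
fcfs_order = [98, 183, 37, 122, 14, 124, 65, 67]

# Distance moved for each request, starting from the initial head position.
moves = [abs(b - a) for a, b in zip([start] + fcfs_order, fcfs_order)]

total = sum(moves)
average = mean(moves)
spread = round(stdev(moves), 2)  # sample standard deviation (n - 1 in the denominator)
```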


http://nob.cs.ucdavis.edu/classes/ecs150-2008-02/handouts/io/io-example.html

Swapping

A process must be in memory to be executed. A process, however, can be swapped temporarily out of memory to a backing store (disk) and then brought back into memory for continued execution.

http://siber.cankaya.edu.tr/ozdogan/OperatingSystems/ceng328/node179.html

memory-mapping

Rather than accessing data files directly via the file system with every file access, data files can be paged into memory the same as process files, resulting in much faster accesses (except of course when page faults occur). This is known as memory-mapping a file.
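
Python's standard `mmap` module exposes this mechanism directly. The sketch below maps a scratch file, reads and modifies it through memory, and then confirms the change reached the file; the file contents are arbitrary.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (the contents are arbitrary).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello memory-mapped world")
    path = f.name

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mapped:  # length 0 maps the whole file
        first_word = mapped[:5]               # read through memory instead of read()
        mapped[:5] = b"HELLO"                 # write by modifying the mapped pages
        mapped.flush()                        # push the modified pages back to the file

with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
```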

9.7.2 Shared Memory in the Win32 API
Windows implements shared memory using shared memory-mapped files


http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/9_VirtualMemory.html

Page Replacement Algorithms


9.4.2 FIFO Page Replacement
A simple and obvious page replacement strategy is FIFO, i.e. first-in-first-out.

An interesting effect that can occur with FIFO is Belady's anomaly, in which increasing the number of frames available can actually increase the number of page faults that occur!
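
Belady's anomaly is easy to reproduce with a short simulation. The reference string below is the classic textbook example, not one taken from these notes, and `fifo_faults` is an illustrative name.

```python
from collections import deque

def fifo_faults(reference_string, frame_count):
    """Count page faults when the oldest loaded page is always the victim."""
    frames, load_order, faults = set(), deque(), 0
    for page in reference_string:
        if page in frames:
            continue                                 # hit: FIFO ignores use time
        faults += 1
        if len(frames) == frame_count:
            frames.discard(load_order.popleft())     # evict the page loaded earliest
        frames.add(page)
        load_order.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults_with_3 = fifo_faults(refs, 3)
faults_with_4 = fifo_faults(refs, 4)
```

With this string, three frames give 9 faults while four frames give 10.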


9.4.3 Optimal Page Replacement
The discovery of Belady's anomaly led to the search for an optimal page-replacement algorithm, which is simply the one that yields the lowest possible number of page faults and does not suffer from Belady's anomaly.

This algorithm is simply "Replace the page that will not be used for the longest time in the future."
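
Because it needs the future reference string, this policy can only be simulated after the fact. A minimal sketch, using the classic textbook reference string rather than one from these notes:

```python
def optimal_faults(reference_string, frame_count):
    """Count faults when the victim is the page whose next use is farthest away."""
    frames, faults = [], 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)
            continue
        future = reference_string[i + 1:]

        def next_use(resident):
            # Pages never used again sort as farthest away.
            return future.index(resident) if resident in future else len(future)

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
opt_with_3 = optimal_faults(refs, 3)
opt_with_4 = optimal_faults(refs, 4)
```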


9.4.4 LRU Page Replacement

The prediction behind LRU, the Least Recently Used algorithm, is that the page that has not been used in the longest time is the one that will not be used again in the near future. (Note the distinction between FIFO and LRU: the former looks at the oldest load time, and the latter looks at the oldest use time.)
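
The difference from FIFO shows up in one line of a simulation: a hit refreshes the page's position. This sketch uses an `OrderedDict` as the recency list and the classic textbook reference string; neither comes from the linked notes.

```python
from collections import OrderedDict

def lru_faults(reference_string, frame_count):
    """Count faults when the least recently used page is always the victim."""
    frames, faults = OrderedDict(), 0   # ordered from least to most recently used
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)    # a hit refreshes the use time
            continue
        faults += 1
        if len(frames) == frame_count:
            frames.popitem(last=False)  # evict the least recently used page
        frames[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
lru_with_3 = lru_faults(refs, 3)
lru_with_4 = lru_faults(refs, 4)
```

Unlike FIFO on the same string, adding a frame here reduces the fault count (10 with three frames, 8 with four).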


9.4.5 LRU-Approximation Page Replacement

http://www.youtube.com/watch?v=pGWbb7QIapQ
http://www.youtube.com/watch?v=GUL2txPndHs
http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/9_VirtualMemory.html





  • Page Replacement Algorithms


Fixed Number of Frames
First In/First Out (FIFO)
Optimal (OPT, MIN)
Least Recently Used (LRU)
Not-Recently-Used or Not Used Recently (NRU, NUR)
Clock
Second-chance Cyclic

Variable Number of Frames
Working Set (WS)
Page Fault Frequency (PFF)
http://nob.cs.ucdavis.edu/classes/ecs150-2008-02/handouts/memory/mm-pagexample.html




  • Paging


• If a page is not in physical memory

– find the page on disk
– find a free frame
– bring the page into memory


• What if there is no free frame in memory?
Page Replacement

Basic idea
if there is a free frame in memory, use it
if not, select a victim frame
write the victim out to disk
read the desired page into the now free frame
update page tables
restart the process
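
The steps above can be sketched as a toy handler. Everything here is hypothetical: the page table is a plain dictionary from page to frame, `pick_victim` stands in for whichever replacement algorithm is used, and the `dirty_pages` set is an added assumption so that only modified victims are written out.

```python
def handle_page_fault(page, page_table, free_frames, dirty_pages, pick_victim):
    """Make `page` resident and return the list of steps taken."""
    steps = []
    if free_frames:
        frame = free_frames.pop()               # a free frame exists: use it
    else:
        victim = pick_victim(page_table)        # otherwise select a victim
        frame = page_table.pop(victim)
        if victim in dirty_pages:               # assumption: clean pages need no write
            steps.append(f"write page {victim} out to disk")
            dirty_pages.discard(victim)
    steps.append(f"read page {page} into frame {frame}")
    page_table[page] = frame                    # update the page table
    steps.append("restart the faulting process")
    return steps

page_table = {"A": 0, "B": 1}
steps = handle_page_fault("C", page_table, free_frames=[], dirty_pages={"A"},
                          pick_victim=lambda table: next(iter(table)))
```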

Page Replacement

• Main objective of a good replacement algorithm is to achieve a low page fault rate
– ensure that heavily used pages stay in memory
– the replaced page should not be needed for some time

• Secondary objective is to reduce latency of a page fault
– efficient code
– replace pages that do not need to be written out


https://docs.google.com/viewer?a=v&q=cache:Z21kbpmJIh4J:pages.cs.wisc.edu/~mattmcc/cs537/notes/Replacement.ppt+&hl=en&pid=bl&srcid=ADGEESj7uIgxkdOGy1nM__CGFctASur2ho79p326r_SPmkGuw_5G0sBrMh9curCHZ-yMfReLi5KTfYDWT-26pMp-ua-0JlYqejMr0MuE5K_g7otBipb33lvXDxWBWmql18tZhNjMFp1J&sig=AHIEtbRh8mSv8F9CqmQBImgSfm0uSEen-g





  • Page Replacement Algorithms


Want lowest page-fault rate
Evaluate algorithm by running it on a particular string of memory references (reference string)
and computing the number of page faults on that string

https://docs.google.com/viewer?a=v&q=cache:1EsrAb83UboJ:www.cs.gsu.edu/~sguo/slides/4320/ch9.ppt+&hl=en&pid=bl&srcid=ADGEESg9YCt0XM4M5Kbx0qMipuZRja3jt3YNkX9El58HVxjyc5YUBhRQnDNWjEb-g82gxVAZSnZku0E0YPDcStMOlsGvxNfpe_D9KVxt-Hj-sl2E2H8al0Cf5ovtlio0vk7empKIc3eg&sig=AHIEtbRmJmBULv6rfF0J5nUwxXhs6qZKKA



  • Page Replacement Algorithms

http://www.sal.ksu.edu/faculty/tim/ossg/Memory/virt_mem/page_replace.html



  • Simulation

http://www.cs.uiuc.edu/class/sp06/cs241/Animations/PageReplace/replacement.html



  • Page replacement algorithms


The same replacement problem also arises in other caches:
CPU cache
Web server cache of web pages
Buffered I/O (file) caches

https://docs.google.com/viewer?a=v&q=cache:87q4IkXXN4wJ:www.sju.edu/~ggrevera/csc4035/csc4035-4-4.ppt+&hl=en&pid=bl&srcid=ADGEESiss2DUzrHjwkVrMB7ryPxqMJcbpPULKDK2dgWPwLRBZeaGOjjtIwvnJ4Gub_0Niz5e9C5QF7pA1plk2Wi_6AWcZuAmGgZ5RL0JyYLSc6CjWrBGH0u3qvVgPk1vkbOvesVwLywL&sig=AHIEtbSiMSED0PP13zzQIvs5KiXkHgbjCg



  • Page-Replacement Algorithms

A page replacement algorithm picks a page to be paged out in order to free up a frame.

https://docs.google.com/viewer?a=v&q=cache:s6l_kPSedtYJ:www.cs.utah.edu/~mflatt/past-courses/cs5460/lecture10.pdf+&hl=en&pid=bl&srcid=ADGEEShMd71xSLfqZl12xDeMg6Ttawvbf94YiUSaGHHDw4vYKerSTntGWld6kN2iJEt2fdIzlJTrJw1Pp7M1rJaTXQaYSKUpIKMOtSttnmXTdjqgoFks2wwE_1n1SEKXZsDzwPRBN2d8&sig=AHIEtbQa2NhT70uDN_TLxB2fuPQ5dUnOwQ

Page Replacement


In order to make the most use of virtual memory, we load several processes into memory at the same time. Since we only load the pages that are actually needed by each process at any given time, there is room to load many more processes than if we had to load in the entire process.

What happens if some process suddenly decides it needs more pages and there aren't any free frames available?

One option is to find a page in memory that is not being used right now and swap only that page out to disk, freeing a frame for the requesting process. This is known as page replacement, and is the most common solution. There are many different algorithms for page replacement.


9.4.1 Basic Page Replacement

The earlier description of page-fault processing assumed that there would be free frames available on the free-frame list.
Now the page-fault handling must be modified to free up a frame if necessary.

Find a free frame:

If there is a free frame, use it.
If there is no free frame, use a page-replacement algorithm to select an existing frame to be replaced, known as the victim frame.

http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/9_VirtualMemory.html

Virtual Memory


Virtual memory is a technique that allows the execution of processes that are not completely in memory.
One major advantage of this scheme is that programs can be larger than physical memory.
Further, virtual memory abstracts main memory into an extremely large, uniform array of storage, separating logical memory as viewed by the user from physical memory.

http://siber.cankaya.edu.tr/ozdogan/OperatingSystems/ceng328/node191.html

Segmentation


An important aspect of memory management that became unavoidable with paging is the separation of the user's view of memory and the actual physical memory.
The user's view of memory is not the same as the actual physical memory. The user's view is mapped onto physical memory.
This mapping allows differentiation between logical memory and physical memory.

http://siber.cankaya.edu.tr/ozdogan/OperatingSystems/ceng328/node188.html

Paging


  • Paging


Paging is a memory-management scheme that permits the physical address space of a process to be non-contiguous.
It eliminates external fragmentation by allocating memory in fixed, equal-sized blocks.

The basic idea behind paging is to divide physical memory into a number of equal-sized blocks called frames, and to divide a program's logical memory space into blocks of the same size called pages.
Any page (from any process) can be placed into any available frame.
The page table is used to look up which frame a particular page is stored in at the moment.
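
The page-table lookup amounts to splitting a logical address into a page number and an offset. In the sketch below, the 4096-byte page size and the page table contents are assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the notes

def translate(logical_address, page_table, page_size=PAGE_SIZE):
    """Map a logical address to a physical address through the page table."""
    page_number, offset = divmod(logical_address, page_size)
    frame_number = page_table[page_number]   # a KeyError here would model a page fault
    return frame_number * page_size + offset

page_table = {0: 5, 1: 2, 2: 7}   # hypothetical mapping: page number -> frame number
physical = translate(4100, page_table)
# 4100 is page 1, offset 4; page 1 lives in frame 2, so the result is 2 * 4096 + 4
```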

http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/8_MainMemory.html



  • Paging

http://www.youtube.com/watch?v=-Ypa8Uwf5YA&feature=related

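Paging also allows memory to be shared among multiple user-space processes, because the page tables of different processes can map pages to the same physical frame.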