fakecineaste

Sunday, February 23, 2014

The Joel Test

1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

2. Can you make a build in one step?
If it takes 20 steps to compile the code, run the installation builder, etc., you're going to go crazy and you're going to make silly mistakes.
For this very reason, the last company I worked at switched from WISE to InstallShield: we required that the installation process be able to run, from a script, automatically, overnight, using the NT scheduler, and WISE couldn't run from the scheduler overnight, so we threw it out. (The kind folks at WISE assure me that their latest version does support nightly builds.)

3. Do you make daily builds?
Breaking the build is so bad (and so common) that it helps to make daily builds, to insure that no breakage goes unnoticed. On large teams, one good way to insure that breakages are fixed right away is to do the daily build every afternoon at, say, lunchtime. Everyone does as many checkins as possible before lunch. When they come back, the build is done. If it worked, great! Everybody checks out the latest version of the source and goes on working. If the build failed, you fix it, but everybody can keep on working with the pre-build, unbroken version of the source.

4. Do you have a bug database?
A minimal useful bug database must include the following data for every bug:

complete steps to reproduce the bug
expected behavior
observed (buggy) behavior
who it's assigned to
whether it has been fixed or not

If the complexity of bug tracking software is the only thing stopping you from tracking your bugs, just make a simple 5 column table with these crucial fields and start using it.

http://www.joelonsoftware.com/articles/fog0000000043.html

Monday, February 17, 2014

Spaghetti code

Spaghetti code is a pejorative term for source code that has a complex and tangled control structure, especially one using many GOTOs, exceptions, threads, or other "unstructured" branching constructs. It is named such because program flow is conceptually like a bowl of spaghetti, i.e. twisted and tangled.

pejorative
a word or expression that is pejorative is used to show disapproval or to insult someone
http://en.wikipedia.org/wiki/Spaghetti_code

Monday, February 3, 2014

MySQL High Availability (HA) tools

Tungsten

Tungsten Replicator is a high performance, open source, data replication engine for MySQL
https://code.google.com/p/tungsten-replicator/

Multi-Master Replication Manager for MySQL

MMM (Multi-Master Replication Manager for MySQL) is a set of flexible scripts to perform monitoring/failover and management of MySQL master-master replication configurations
http://mysql-mmm.org/

DRBD

DRBD refers to block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be understood as network-based RAID-1.
http://www.drbd.org/

Liquibase

Liquibase: Database Refactoring
Supports code branching and merging
Powerful refactoring commands
Command Line, Ant, Maven, Spring, and Servlet integrations
www.liquibase.org

I work on a Groovy/Grails project, and Grails uses Hibernate underneath for all its ORM (called "GORM")

We use Liquibase to manage all SQL schema changes, which we do fairly often as our app evolves with new features.

The awesome thing is that I can take a totally blank slate MySQL database on my laptop, fire up the app, and right away the schema is set up for me. It also makes it easy to test schema changes by applying these to a local-dev or staging DB first.

The easiest way to get started with it would probably be to take your existing DB and then use Liquibase to generate an initial baseline.xml file. Then in the future, you can just append to it and let liquibase take over managing schema changes.

http://stackoverflow.com/questions/221379/hibernate-hbm2ddl-auto-update-in-production

Galera Cluster for MySQL is a true Multimaster Cluster based on synchronous replication. It is an easy-to-use, high-availability solution, which provides high system up-time, no data loss and scalability for future growth.

http://galeracluster.com/

Percona, a leader in open source database software and services, today announced Percona Server for MySQL 8.0, the latest version of the company’s free, enhanced, drop-in replacement for MySQL Community Edition. Percona Server for MySQL 8.0 includes all the features of MySQL Community Edition 8.0, along with enterprise-class features from Percona that make it ideal for enterprise production environments. The latest release offers increased reliability, performance and security.

https://www.percona.com/about-percona/newsroom/press-releases/percona-server-mysql-80-delivers-increased-reliability

Get MySQL Replication up and running in 5 minutes

MySQL allows you to build up complex replication hierarchies, such as multi-master, chains of read slaves, backup databases at a remote site or any combination of these.
The first step in setting up replication involves editing the “my.cnf” file on the servers that will serve as the master and slave
http://www.clusterdb.com/mysql-cluster/get-mysql-replication-up-and-running-in-5-minutes/

How To Set Up Database Replication In MySQL

First we have to edit /etc/mysql/my.cnf. We have to enable networking for MySQL, and MySQL should listen on all IP addresses.

Furthermore we have to tell MySQL for which database it should write logs (these logs are used by the slave to see what has changed on the master), which log file it should use, and we have to specify that this MySQL server is the master. We want to replicate the database exampledb, so we put the following lines into /etc/mysql/my.cnf:

There are two possibilities to get the existing tables and data from exampledb from the master to the slave. The first one is to make a database dump, the second one is to use the LOAD DATA FROM MASTER; command on the slave. The latter has the disadvantage that the database on the master will be locked during this operation, so if you have a large database on a high-traffic production system, this is not what you want, and I recommend following the first method in this case. However, the latter method is very fast, so I will describe both here.

http://www.howtoforge.com/mysql_database_replication

Tuesday, January 28, 2014

One’s complement representation

In one's complement, positive numbers are represented as usual in regular binary.
To negate a number, replace all zeros with ones, and ones with zeros - flip the bits.
12 would be 00001100, and -12 would be 11110011
As in signed magnitude, the leftmost bit indicates the sign (1 is negative, 0 is positive).
To compute the value of a negative number, flip the bits and translate as before.

http://www.math.grin.edu/~rebelsky/Courses/152/97F/Readings/student-binary#sign

1101₂ = 13₁₀ (a 4-bit unsigned number)

0 1101 = +13₁₀ (a positive number in 5-bit one’s complement)
1 0010 = -13₁₀ (a negative number in 5-bit one’s complement)

0100₂ = 4₁₀ (a 4-bit unsigned number)
0 0100 = +4₁₀ (a positive number in 5-bit one’s complement)
1 1011 = -4₁₀ (a negative number in 5-bit one’s complement)

http://webcache.googleusercontent.com/search?q=cache:OC8tJ0Dj968J:https://wiki.engr.illinois.edu/download/attachments/183861726/10-Subtractions-sol.ppt%3Fversion%3D1%26modificationDate%3D1317908161000+&cd=1&hl=tr&ct=clnk&gl=tr&client=firefox-a
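
The bit flipping described above can be sketched in Java. This is only an illustration: the class and method names are made up for this note, and the bit width is passed in so that both the 8-bit and the 5-bit examples can be reproduced.

```java
// Hypothetical helper illustrating one's complement; not taken from the quoted sources.
final class OnesComplement {
    // Encode a signed value as a 'bits'-wide one's complement pattern.
    // Assumes -(2^(bits-1) - 1) <= value <= 2^(bits-1) - 1.
    static int encode(int value, int bits) {
        int mask = (1 << bits) - 1;
        // Positive numbers are plain binary; negatives flip every bit of the magnitude.
        return value >= 0 ? value & mask : ~(-value) & mask;
    }

    // Decode a 'bits'-wide one's complement pattern back to a signed value.
    static int decode(int pattern, int bits) {
        int mask = (1 << bits) - 1;
        boolean negative = ((pattern >> (bits - 1)) & 1) == 1;
        return negative ? -(~pattern & mask) : pattern;
    }

    // Render a pattern as a zero-padded binary string.
    static String toBinary(int pattern, int bits) {
        StringBuilder sb = new StringBuilder(Integer.toBinaryString(pattern));
        while (sb.length() < bits) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(encode(-12, 8), 8)); // 11110011
        System.out.println(toBinary(encode(-13, 5), 5)); // 10010
        System.out.println(decode(0b11011, 5));          // -4
    }
}
```

Note that flipping all bits of zero gives a second, "negative" zero (all ones), which is one reason two's complement is usually preferred.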

ASCII vs Unicode

ASCII

On both Windows/DOS and Unix systems, the 128 most commonly-used characters are each represented by a sequence of 7 bits known as the character’s ASCII code.
They are traditionally stored as bytes (8 bits),
i.e. the 7-bit ASCII code plus a leading zero.
http://www.itk.ilstu.edu/staff/drathke/277web/WebContent/reading/asciiprint.html

Unicode
Java uses Unicode; a Java char is a 16-bit (2-byte) UTF-16 code unit.
16 bits allow a total of 65,536 (2^16) different values, which makes a truly international character set possible (characters outside this range are encoded as a pair of 16-bit units).
The first 128 Unicode characters are the same as the ASCII characters, but in this 16-bit form each has an extra leading zero byte in front of it.

Unicode Test:
The file is called "testing1.txt" and was created in Notepad on Win2K
The 15 indicates the file is 15 bytes long.

Now I have saved the file as a "Unicode" file.
The file is called testing2.txt.

http://www.itk.ilstu.edu/staff/drathke/277web/WebContent/reading/AsciiandUnicode.html
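
The size difference between the two encodings can be checked with a few lines of Java. This is a sketch under assumptions: the class name and the 15-character sample text are invented here and are not the contents of the test files mentioned above. A file saved as "Unicode" by Notepad would normally also start with a 2-byte byte-order mark, which this sketch leaves out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the sample text is an assumption, not the original test file.
final class EncodingSizes {
    // Number of bytes when every character is stored as one ASCII byte.
    static int asciiBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }

    // Number of bytes when every character is stored as one 16-bit unit (no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "Testing 1, 2, 3"; // 15 ASCII characters
        System.out.println(asciiBytes(text)); // 15
        System.out.println(utf16Bytes(text)); // 30

        // 'A' in big-endian 16-bit form: a leading zero byte, then the ASCII code 65.
        byte[] a = "A".getBytes(StandardCharsets.UTF_16BE);
        System.out.println(a[0] + " " + a[1]); // 0 65
    }
}
```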

signed magnitude

3. Represent the decimal number 107 in binary using 8-bit signed magnitude, one's complement and two's complement form.

c) Signed magnitude  01101011

One’s complement  01101011

Two’s complement  01101011

2. Convert the fractional decimal number 190.03125 to binary with a maximum of six places to the right of the binary point.

a) 10111110.00001
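
The conversion can be reproduced with the usual "multiply the fraction by two" method. The following Java sketch uses hypothetical names and assumes a non-negative input.

```java
// Hypothetical helper; converts a non-negative decimal value to binary text.
final class FractionToBinary {
    // Emits at most maxPlaces digits to the right of the binary point.
    static String toBinary(double value, int maxPlaces) {
        long whole = (long) value;
        double frac = value - whole;
        StringBuilder sb = new StringBuilder(Long.toBinaryString(whole));
        if (frac > 0 && maxPlaces > 0) {
            sb.append('.');
        }
        for (int i = 0; i < maxPlaces && frac > 0; i++) {
            frac *= 2; // shift the next binary digit in front of the point
            if (frac >= 1) {
                sb.append('1');
                frac -= 1;
            } else {
                sb.append('0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(190.03125, 6)); // 10111110.00001
    }
}
```

Here 190 is 10111110 and 0.03125 is 1/32, i.e. 2^-5, so the fraction needs exactly five places.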

1. What are the values of X, Y and Z.

d) X=120101, Y=4266, Z=832

4. If the maximum positive number that can be represented in two's complement form is y, how many bits are used in this representation?

e) 1 + log2(y+1)

5. Given a (very) tiny computer that has a word size of 6 bits, what are the smallest negative number and the largest positive number that this computer can represent in two’s complement form?

d) Smallest Negative: (100000)₂, Largest Positive: (011111)₂
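
Questions 4 and 5 both follow from the range of an n-bit two's complement number, -2^(n-1) to 2^(n-1) - 1. A small Java sketch (hypothetical names; bitsForMax assumes y + 1 is a power of two):

```java
// Hypothetical helper for the range of n-bit two's complement numbers.
final class TwosComplementRange {
    // Smallest (most negative) value: bit pattern 100...0.
    static long min(int bits) {
        return -(1L << (bits - 1));
    }

    // Largest positive value: bit pattern 011...1.
    static long max(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    // Bits needed when the largest positive value is y, i.e. 1 + log2(y + 1).
    // Assumes y + 1 is a power of two.
    static int bitsForMax(long y) {
        return 1 + Long.numberOfTrailingZeros(y + 1);
    }

    public static void main(String[] args) {
        System.out.println(min(6));         // -32
        System.out.println(max(6));         // 31
        System.out.println(bitsForMax(31)); // 6
    }
}
```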

6. A 10-bit floating point number has 1 bit for the sign of the number, 3 bits for the exponent and 6 bits for the mantissa (which is normalized). Numbers in the exponent are in two’s complement representation. No bias is used and there are no implied bits. Show the representation for the smallest positive number this machine can represent.
e) 0100100000

7. Given that the ASCII code for the character "A" is 1000001, the ASCII code for "F" would be?

c) 1000110

Here is an ASCII table with hex values:
http://core.ecu.edu/csci/wirthj/Basen/asciiCode-t.html

char  hex  decimal  binary
A     41   65       1000001
B     42   66       1000010
C     43   67       1000011
D     44   68       1000100
E     45   69       1000101
F     46   70       1000110

8. A text file that is stored by using Unicode character coding system occupies 150 Kbytes.
How much space is required for another text file that contains exactly the same characters but uses ASCII character coding system?

a) 75 Kbytes

9. Given the 8-bit binary number: 1 0 0 1 1 1 0 1

What decimal number does this represent if the computer uses signed magnitude, one's complement and two's complement form?

a) -29(signed magnitude), -98(one's complement), -99(two's complement)
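
The three interpretations of the same bit pattern can be compared side by side. This Java sketch uses hypothetical names; each method takes the raw pattern and the word size in bits.

```java
// Hypothetical helper that decodes one bit pattern under three signed representations.
final class SignedDecoders {
    // Top bit is the sign, remaining bits are the magnitude.
    static int signMagnitude(int pattern, int bits) {
        int magnitude = pattern & ((1 << (bits - 1)) - 1);
        return isNegative(pattern, bits) ? -magnitude : magnitude;
    }

    // Negative values are the bitwise inverse of the magnitude.
    static int onesComplement(int pattern, int bits) {
        int mask = (1 << bits) - 1;
        return isNegative(pattern, bits) ? -(~pattern & mask) : pattern;
    }

    // Negative values are the pattern minus 2^bits.
    static int twosComplement(int pattern, int bits) {
        return isNegative(pattern, bits) ? pattern - (1 << bits) : pattern;
    }

    private static boolean isNegative(int pattern, int bits) {
        return ((pattern >> (bits - 1)) & 1) == 1;
    }

    public static void main(String[] args) {
        int p = 0b10011101;
        System.out.println(signMagnitude(p, 8));  // -29
        System.out.println(onesComplement(p, 8)); // -98
        System.out.println(twosComplement(p, 8)); // -99
    }
}
```

For a positive value such as 107 (01101011) all three methods return the same number, which is why the three answers to question 3 are identical.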

http://webcache.googleusercontent.com/search?q=cache:xLGumDrB-pIJ:www.fatih.edu.tr/~emanetn/courses/spring2010/ceng252/ceng252_2009_midterm1.doc+&cd=1&hl=tr&ct=clnk&gl=tr&client=firefox-a

Signed magnitude representation

Humans use a signed-magnitude system: we add + or - in front of a magnitude to indicate the sign.
We could do this in binary as well, by adding an extra sign bit to the front of our numbers.

A 0 sign bit represents a positive number.
A 1 sign bit represents a negative number.

1101₂ = 13₁₀ (a 4-bit unsigned number)

0 1101₂ = +13₁₀ (a positive number in 5-bit signed magnitude)

1 1101₂ = -13₁₀ (a negative number in 5-bit signed magnitude)

http://webcache.googleusercontent.com/search?q=cache:OC8tJ0Dj968J:https://wiki.engr.illinois.edu/download/attachments/183861726/10-Subtractions-sol.ppt%3Fversion%3D1%26modificationDate%3D1317908161000+&cd=1&hl=tr&ct=clnk&gl=tr&client=firefox-a

Signed Magnitude:

In signed magnitude, the left-most bit is not actually part of the number, but is just the equivalent of a +/- sign.
"0" indicates that the number is positive, "1" indicates negative.
In 8 bits, 00001100 would be 12 (break this down into (1*2^3) + (1*2^2) ).
To indicate -12, we would simply put a "1" rather than a "0" as the first bit: 10001100.

http://www.math.grin.edu/~rebelsky/Courses/152/97F/Readings/student-binary#signed

ER diagram

6. For the following Presidential ER diagram develop the relational schema using the pattern TableName(attribute-list).

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html

2. Principles of mapping ER diagrams to relational schemas. Fill in the blanks with relation and attribute names.

a. If you have an entity E with attributes A, B and C in your ER diagram, with A as the primary key, what is its corresponding relation in relational schema? [3]
a. ____E____ ( ____A, B, C_____________ )

b. If entity F has attributes G, H and I with G as the primary key, but is related to entity E in a 1-many relationship, what is its corresponding relation? [4]
b. ___F____ ( _____G, H, I, A________________ )

c. If entity L with attributes M, N and P with M as the primary key, and it is related to entity E in a many-many relationship called R, what is the corresponding relation that properly establishes the relationship? [4]

___R____ ( _______A,M__________________ )

jcsites.juniata.edu/faculty/rhodes/dbms/exams/mid2f13key.docx

Limitations of E-R Designs

E-R modeling provides a set of guidelines, but does not result in a unique database schema.
Normalization theory provides a mechanism for analyzing and refining the schema produced by an E-R design, or any other design.
http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html

Medium access protocols

Multiple network nodes often share the same medium.
For example, several computers might connect to a wireless access point or plug into an Ethernet hub.
We need a protocol to decide which one can access the medium if more than one has information to send at the same time.
We need a medium access control (MAC) protocol.

Some MAC protocols

CSMA/CA, Carrier Sense Multiple Access/Collision Avoidance
CSMA/CD, Carrier Sense Multiple Access/Collision Detection
Polling
Token ring
RTS/CTS, Request to Send/Clear to Send

http://bpastudio.csudh.edu/fac/lpress/471/hout/netech/mac.htm

Functional Dependency

Example Functional Dependencies

Let R be
NewStudent(stuId, lastName, major, credits, status, socSecNo)

FDs in R include

{stuId}→{lastName}, but not the reverse
{stuId} →{lastName, major, credits, status, socSecNo, stuId}
{socSecNo} →{stuId, lastName, major, credits, status, socSecNo}
{credits}→{status}, but not {status}→{credits}

ZipCode→AddressCity

16652 is Huntingdon’s ZIP

ArtistName→BirthYear

Picasso was born in 1881

Autobrand→Manufacturer, Engine type

Pontiac is built by General Motors with gasoline engine

Author, Title→PublDate

Shakespeare’s Hamlet was first published in 1603

Trivial Functional Dependency

The FD X→Y is trivial if set {Y} is a subset of set {X}

Examples: If A and B are attributes of R,

{A}→{A}
{A,B} →{A}
{A,B} →{B}
{A,B} →{A,B}

are all trivial FDs and will not contribute to the evaluation of normalization.

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html

(Rel. DB Design) Consider the EMP_PROJ relation schema with the attributes SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC (project location). The following set of functional dependencies hold on this schema:

SSN->ENAME
PNUMBER -> PNAME,PLOC
SSN,PNUMBER -> HOURS.

(5) Compute the closure of {SSN,PNUMBER} from these functional dependencies.
{SSN, PNUMBER}+ = {SSN, PNUMBER, ENAME, PNAME, PLOC, HOURS}, i.e. SSN,PNUMBER -> ENAME, PNAME, PLOC, HOURS.

(1) What is a candidate key for this relation schema?
Candidate key is {SSN, PNUMBER}.
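
The closure above can be computed mechanically by repeatedly applying every dependency whose left side is already covered. Below is a minimal Java sketch of that standard algorithm; the class, the FD helper type and the method names are hypothetical and chosen only for this note.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the standard attribute-closure algorithm.
final class AttributeClosure {
    // A functional dependency lhs -> rhs over attribute names.
    static final class FD {
        final Set<String> lhs;
        final Set<String> rhs;

        FD(Set<String> lhs, Set<String> rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    static Set<String> set(String... attrs) {
        return new TreeSet<>(Arrays.asList(attrs));
    }

    // Keep adding right-hand sides until nothing changes any more.
    static Set<String> closure(Set<String> attrs, List<FD> fds) {
        Set<String> result = new TreeSet<>(attrs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (FD fd : fds) {
                if (result.containsAll(fd.lhs) && result.addAll(fd.rhs)) {
                    changed = true;
                }
            }
        }
        return result;
    }

    // The three dependencies of the EMP_PROJ schema from the question.
    static List<FD> empProjFds() {
        return Arrays.asList(
            new FD(set("SSN"), set("ENAME")),
            new FD(set("PNUMBER"), set("PNAME", "PLOC")),
            new FD(set("SSN", "PNUMBER"), set("HOURS")));
    }

    public static void main(String[] args) {
        // [ENAME, HOURS, PLOC, PNAME, PNUMBER, SSN]
        System.out.println(closure(set("SSN", "PNUMBER"), empProjFds()));
        // [ENAME, SSN]
        System.out.println(closure(set("SSN"), empProjFds()));
    }
}
```

Because the closure of {SSN, PNUMBER} contains every attribute while the closures of {SSN} and {PNUMBER} alone do not, {SSN, PNUMBER} is a candidate key.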

(4) Is this schema in BCNF? State one key problem with this schema.
It is not in BCNF because SSN->ENAME holds but its left-hand side, SSN, is not a superkey (the same applies to PNUMBER -> PNAME,PLOC).
Therefore, all tuples containing the same SSN will also have the same ENAME, causing redundancy.
This results in waste of space and problems with updating the data consistently.

(10) Decompose this schema into BCNF or 3NF (your choice).
3NF decomposition: {SSN, ENAME}, {PNUMBER, PNAME, PLOC}, {SSN, PNUMBER, HOURS}
BCNF decomposition: same.

http://cs.nyu.edu/courses/spring00/G22.2433-001/answers.html

5. For parts a-c, assume we have a relation with the scheme

Book (Title, Author, Publisher, PubAddress, PubZip, CopyrightYear, ISBN )
//ISBN = International Standard Book Numbers

a. What would be the likely primary key attribute(s)? ________ISBN____________________[2]

b. List all non-trivial functional dependencies [7]?

ISBN -> {Title, Author, Publisher, PubAddress, PubZip, CopyrightYear }

{Title, Author} ->{ Publisher, PubAddress, PubZip, CopyrightYear, ISBN}

Publisher -> PubAddress, PubZip

c. If this relation were used as defined (not normalized), describe the insertion and deletion anomalies that could arise. [3]
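Insertion: a publisher’s address cannot be stored unless at least one book by that publisher is stored.
Deletion: deleting a publisher’s last book also deletes the publisher’s address.
Update: a publisher’s address is replicated in every one of its books, so a change would require updating many records.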

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html

FD Axioms

Understanding: Functional dependencies are recognized by analysis of the real world; there is no automation or algorithm for finding them.
Finding or recognizing them is the database designer's task.

Axiom Name | Axiom | Example
Reflexivity | if a is a set of attributes and b ⊆ a, then a → b | SSN,Name → SSN
Augmentation | if a → b holds and c is a set of attributes, then ca → cb | SSN → Name, then SSN,Phone → Name,Phone
Transitivity | if a → b holds and b → c holds, then a → c holds | SSN → Zip and Zip → City, then SSN → City
Union or Additivity* | if a → b and a → c hold, then a → bc holds | SSN → Name and SSN → Zip, then SSN → Name,Zip
Decomposition or Projectivity* | if a → bc holds, then a → b and a → c hold | SSN → Name,Zip, then SSN → Name and SSN → Zip
Pseudotransitivity* | if a → b and cb → d hold, then ac → d holds | Address → Project and Project,Date → Amount, then Address,Date → Amount
(NOTE) ab → c does NOT imply a → b and b → c

Reflexivity, augmentation and transitivity are Armstrong's Axioms (the basic axioms); the rules marked * can be derived from them.

http://jcsites.juniata.edu/faculty/rhodes/dbms/funcdep.html

Consider relation R = (A, B, C, D) and the following statements for functional dependencies.

For each statement, if it is true, prove it. Otherwise, show a counter-example to disprove it.
(4 points)
(a) If A → B and A → C, then A → BC (2 points)
⇒ Answer: True.
For A → B, then A → AB (augmentation rule);
For A → C, then AB → BC (augmentation rule);
Then A → BC (transitivity rule)
(b) If A → B and C → D, then AC → BD (2 points)
⇒ Answer: True.
AC → A (reflexivity rule);
For A → B, s.t. AC → B (transitivity rule);
AC → C(reflexivity rule);
For C → D, s.t. AC → D (transitivity rule);
So AC → BD (union rule)
Rubric: Each correct proof gets two points. If the answer is FALSE, the student gets zero
points

webcache.googleusercontent.com/search?q=cache:8vRupC7L9OYJ:https://wiki.engr.illinois.edu/download/attachments/227743489/CS411-F2011-Final-Sol.pdf%3Fversion%3D1%26modificationDate%3D1380470739000+&cd=3&hl=tr&ct=clnk&gl=tr&client=firefox-a

Schema Design Example

Problem 2 (10 points) Schema Design Using the following database description,
create a relational schema. Remember to identify primary keys and foreign keys correctly. Select
approaches that yield the fewest number of relations. Remember to state any assumptions you use
while creating the schema.
Let’s create a movie database, similar to what IMDB would use.
• A movie has an ID number, a release date, a title, and a running time.
• A movie has a number of people that work on it, including 1 director and 1 producer.
• A movie has many actors. Each actor has a character name for a particular movie.
• Each person has an ID Number, a real name, a birthday, and an address
• Our website will have users that log on. These users are different from the people associated
with the movies.
• Each user has a unique logon name and a password.
• A user can leave reviews of movies. They can only leave one review per movie, but can review
as many movies as they like.
Solution
Movie(MovieID,ReleaseDate,Title,RunningTime,DirectorID,ProducerID)
Person(PersonID,Name,Birthday,Address)
Actors(MovieID,PersonID,CharacterName)
Users(UserName,Password)
Reviews(UserName,MovieID,Review)
Foreign Keys: Movie.DirectorID is a foreign key to Person. Movie.ProducerID is a foreign key
to Person. Actors.MovieID is a foreign key to Movie. Actors.PersonID is a foreign key to Person.
Reviews.UserName is a foreign key to Users. Reviews.MovieID is a foreign key to Movie.

webcache.googleusercontent.com/search?q=cache:8vRupC7L9OYJ:https://wiki.engr.illinois.edu/download/attachments/227743489/CS411-F2011-Final-Sol.pdf%3Fversion%3D1%26modificationDate%3D1380470739000+&cd=3&hl=tr&ct=clnk&gl=tr&client=firefox-a

For the remaining questions, use the following relational schema for a music albums database. Keys are (mostly) underlined. The attributes should be self-evident. If not, please ask for clarification. For a given music track, we code the title, its play length in time (minutes:seconds), its genre (pop, metal, jazz, etc.) and a 5 star maximum rating. The musicians, singers and instrumentalists are all listed with their contribution to the track. A person may have 1 or more listings for a track. For example, someone may both sing and play the piano. The album is a collection of tracks. An album is distributed and owned by a company called the label and has a producer and an engineer.

PEOPLE (PID, name, address, zip, phone)
CSZ (zip, city, state)
TRACKS (trID, title, length, genre, rating, albID) //trID is unique across all albums
ALBUMS (albID, albumTitle, year, label, prodPID, engPID, length, price)
CONTRIBS (trID, PID, role)

a) List all names and phone numbers of people from zip 90210. [5]

SELECT P.name, P.phone
FROM PEOPLE P
WHERE P.zip = '90210'; -- may or may not be quoted, alias not necessary

b) List album titles and labels and producer names with a list price of more than $18. [5]

SELECT A.albumTitle, A.label, A.price, P.name
FROM ALBUMS A, PEOPLE P
WHERE A.price > 18 AND A.prodPID = P.PID;

c) List all the musicians by name and what they played or contributed to on all jazz type tracks. [5]

SELECT P.name, C.role
FROM TRACKS T NATURAL JOIN CONTRIBS C
NATURAL JOIN PEOPLE P
WHERE T.genre = 'JAZZ';

d) Get a list of names and addresses of people who produced OR engineered an album, but did not perform on any track. (Hint: subselect and set operations are very helpful). [6]

SELECT P.name, P.address, Z.city, Z.state, Z.zip
FROM PEOPLE P JOIN CSZ Z ON P.zip = Z.zip
WHERE P.PID IN
(SELECT A.prodPID
FROM ALBUMS A
UNION
SELECT B.engPID
FROM ALBUMS B
EXCEPT -- UNION and EXCEPT are applied left to right
SELECT C.PID
FROM CONTRIBS C
);

e) List names of musicians who have contributed in at least two different roles on the same tracks with ratings 4 or higher. (Use group by… having and not a self-join). [6]

SELECT P.name
FROM PEOPLE P NATURAL JOIN CONTRIBS C NATURAL JOIN TRACKS T
WHERE T.rating >= 4
GROUP BY C.trID, C.PID, P.name
HAVING COUNT(DISTINCT C.role) >= 2;

f) What is the average price of albums for each year of release (show years), but only for albums with 6 or more tracks and length of 30 or more minutes. (Need a subselect and group by having.) [8]

SELECT AVG(A.price), A.year
FROM ALBUMS A
WHERE A.length >= 30 AND A.albID IN
(SELECT T.albID
FROM TRACKS T
GROUP BY T.albID
HAVING COUNT(*) >= 6
)
GROUP BY A.year;

jcsites.juniata.edu/faculty/rhodes/dbms/exams/mid2f13key.docx

data independence in relational model

According to Codd's 12 rules, there are two kinds of data independence:

Physical Data Independence requires that changes at the physical level (like data structures) have no impact on the applications that consume the database. For example, let's say you decide to stop using a Hash Index in your table and decide to use a B-Tree Index instead: Your application that executes queries against this table doesn't have to change at all.

Logical Data Independence states that changes at the logical level (tables, columns, rows) will have no impact on the applications that access the database. As you already noticed, this feature is harder to implement than Physical Data Independence, but there are still cases where it works. For example, if you add tables, columns or rows to your current schema, the already working queries aren't affected at all.
http://stackoverflow.com/questions/10861501/data-independence-in-relational-database

Logical Data Independence

Logical data independence is a mechanism that decouples the logical schema from the actual data stored on disk. If we change the table format, the data residing on disk should not change.

Physical Data Independence
All schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the ability to change the physical data without impacting the schema or logical data.
For example, changing or upgrading the storage system itself, such as using SSDs instead of hard disks, should not have any impact on the logical data or schemas.
http://www.tutorialspoint.com/dbms/dbms_data_independence.htm

Binary Exponential Backoff Algorithm

When there is a collision, the stations involved in the collision will execute the binary exponential backoff algorithm to reduce the possibility of further collisions.

When a collision is detected, the sender generates a noise burst to ensure that all stations recognize the condition, and aborts the transmission.
Wait 0 or 1 contention periods (one contention period is 2τ, i.e. twice the end-to-end propagation time τ) before attempting transmission again.
If another collision is detected, wait 0, 1, 2, or 3 contention periods, and repeat the protocol.
In general, wait between 0 and 2^r - 1 contention periods, where r is the number of collisions encountered.
Finally, freeze the interval at 1023 contention periods after 10 attempts, and give up (report failure) after 16 attempts.

http://kevscode.com/csnotes/utpa/6345meng/notes/chpt-3/node18.html
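
The waiting rule above can be sketched in a few lines of Java. The class, constant and method names are hypothetical; the limits of 10 and 16 are the ones quoted in the notes, and the return value -1 is just this sketch's way of signalling "give up".

```java
import java.util.Random;

// Hypothetical sketch of truncated binary exponential backoff.
final class BinaryExponentialBackoff {
    static final int MAX_EXPONENT = 10; // interval frozen at 0..1023 after 10 collisions
    static final int MAX_ATTEMPTS = 16; // report failure after 16 attempts

    // Largest possible wait, in contention periods, after the given number of collisions.
    static int maxSlots(int collisions) {
        return (1 << Math.min(collisions, MAX_EXPONENT)) - 1;
    }

    // Number of contention periods to wait, chosen uniformly from 0..maxSlots,
    // or -1 when the sender should give up.
    static int backoffSlots(int collisions, Random rng) {
        if (collisions >= MAX_ATTEMPTS) {
            return -1;
        }
        return rng.nextInt(maxSlots(collisions) + 1);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int r = 1; r <= MAX_ATTEMPTS; r++) {
            System.out.println("collision " + r + ": wait up to " + maxSlots(r)
                + " periods, picked " + backoffSlots(r, rng));
        }
    }
}
```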

In a variety of computer networks, binary exponential backoff or truncated binary exponential backoff refers to an algorithm used to space out repeated retransmissions of the same block of data, often as part of network congestion avoidance.

Binary exponential backoff refers to a collision resolution mechanism used in random access MAC protocols.
This algorithm is used in Ethernet (IEEE 802.3) wired LANs. In Ethernet networks, this algorithm is commonly used to schedule retransmissions after collisions.

http://en.wikipedia.org/wiki/Exponential_backoff

securerandom vs random

Instances of java.util.Random are not cryptographically secure. Consider instead using SecureRandom to get a cryptographically secure pseudo-random number generator for use by security-sensitive applications.

Java offers a few ways to generate random numbers, the default being java.util.Random. java.security.SecureRandom offers a more-secure extension of java.util.Random which “provides a cryptographically strong random number generator”.

If you call java.util.Random.nextLong() on two instances created with the same seed, both produce the same number. For security-sensitive code you should stick with java.security.SecureRandom because it is far less predictable.

The Random Class
Java specifies that the Random class and its subclasses must produce predictable results when seeded with the same data.
This, however, is not what makes it insecure, and it is useful when testing.
The reason this class is predictable is the way in which it is seeded.
In the absence of a seed in its constructor, the Random class seeds its random number generator from the system clock (the current time in milliseconds in early JDKs; later ones mix in System.nanoTime()).
This means that if somebody knows the time that the Random object was seeded and has several consecutive bytes of output, they can reasonably predict the next numbers.
Once somebody has discovered the seed for the generator, all numbers produced from it can be seen as compromised.
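
The repeatability is easy to observe. In this small Java sketch (hypothetical class and method names) two generators built from the same seed return exactly the same values.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo of the determinism of java.util.Random.
final class SeededRandomDemo {
    // Returns the first 'count' values produced by a Random created with 'seed'.
    static long[] firstValues(long seed, int count) {
        Random rng = new Random(seed);
        long[] out = new long[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextLong();
        }
        return out;
    }

    public static void main(String[] args) {
        long[] first = firstValues(42L, 3);
        long[] second = firstValues(42L, 3);
        // Same seed, same sequence.
        System.out.println(Arrays.equals(first, second)); // true
    }
}
```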

The SecureRandom Class
The SecureRandom class is different. It also uses algorithms that produce predictable results when seeded, but the algorithm is much more complex.
It applies a digest algorithm such as SHA-1 to the seed and a counter to generate random data.
Its true strength, however, lies in the way it is seeded.
The SecureRandom class is seeded using true random data gathered by the operating system.
This is data gathered by the OS from sources of true randomness, such as mouse movements, network packet arrival times, I/O statistics and interrupts.
On Linux the data is gathered from /dev/random, and on Windows via the CryptGenRandom() call.

When using SecureRandom
The more random numbers someone can get hold of, the more likely they are to figure out the seed. You should either discard the SecureRandom object every now and then or reseed it, keeping the next point in mind.
Seeding the generator takes entropy out of the system; if it cannot get any entropy, it will block until the system has some. This means that if you reseed the generator too often, your program will hang, along with anything else on the system that requires entropy.
Don’t seed the SecureRandom class yourself unless you are absolutely sure you are seeding it with purely random data, or you are testing and need repeatable results.

If what you are generating is a security token of some sort, then you will need a secure generator.
Examples are a session ID, a one-time password or an encryption key.

http://www.danielhall.me/2009/09/cryptographically-secure-random-numbers-in-java/
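
As an illustration of the session-ID case, here is a minimal Java sketch that lets SecureRandom seed itself and encodes the random bytes as URL-safe text. The class name, the method name and the choice of Base64 are assumptions made for this example, not something prescribed by the article.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical token generator; SecureRandom is left to seed itself.
final class TokenGenerator {
    private static final SecureRandom RNG = new SecureRandom();

    // Returns numBytes of random data encoded as URL-safe Base64 without padding.
    static String newToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        RNG.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // 32 random bytes (256 bits) become a 43-character token.
        System.out.println(newToken(32));
    }
}
```

A single shared SecureRandom instance is used here because it is thread-safe; whether and how often to replace it is a judgment call along the lines discussed above.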

Monday, January 27, 2014

ArtifactTransferException: Failure to transfer ...

If you get an error message starting with this text, you can trace the failing JAR or dependency in the local Maven repository folder shown below. This example path is the one for Hibernate:

C:\Users\user1\.m2\repository\org\hibernate\hibernate-core

Remove the folder (or back it up in case you need it), then update the project in Eclipse with m2eclipse.

Sign-Magnitude

Sign-Magnitude Representation
There are many schemes for representing negative integers with patterns of bits
One scheme is sign-magnitude.
It uses one bit (usually the leftmost) to indicate the sign. "0" indicates a positive integer, and "1" indicates a negative integer.
The rest of the bits are used for the magnitude of the number.
So -24₁₀ is represented as:

1001 1000

The sign "1" means negative
The magnitude is 24 (in 7-bit binary)

http://chortle.ccsu.edu/AssemblyTutorial/Chapter-08/ass08_12.html

BigInteger

BigInteger, What Are They?

(From sun.com) "Immutable arbitrary-precision integers."
BigInteger should be used whenever you need to handle very large numbers, anything larger than a 'long' variable can hold. A long has a maximum value of 9223372036854775807. BigInteger also provides useful functions for bit manipulation, GCD calculation, random number generation, and primality testing and prime generation.
http://compsci.ca/v3/viewtopic.php?t=13193
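
A short Java sketch of these points follows. The class and its helper methods are hypothetical, while the BigInteger calls themselves (add, multiply, gcd, isProbablePrime) are part of the standard library.

```java
import java.math.BigInteger;

// Hypothetical demo of values beyond the range of long.
final class BigIntegerDemo {
    // n! grows past Long.MAX_VALUE at n = 21, so the result is a BigInteger.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // One more than the largest long, which a long itself cannot hold.
    static BigInteger beyondLong() {
        return BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(beyondLong()); // 9223372036854775808
        System.out.println(factorial(25));
        System.out.println(BigInteger.valueOf(48).gcd(BigInteger.valueOf(18))); // 6
        System.out.println(BigInteger.valueOf(97).isProbablePrime(20));         // true
    }
}
```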

Multitenancy

Multitenancy refers to a principle in software architecture where a single instance of the software runs on a server, serving multiple client organizations (tenants). Multitenancy is contrasted with a multi-instance architecture where separate software instances (or hardware systems) are set up for different client organizations.
With a multitenant architecture, a software application is designed to virtually partition its data and configuration, and each client organization works with a customized virtual application instance.

Friday, January 24, 2014

SecureRandom

import java.security.SecureRandom;

// Create a SecureRandom instance that uses the default PRNG algorithm
SecureRandom secureRandom = new SecureRandom();

// Alternatively, use SecureRandom.getInstance() to create a SecureRandom object
// for a specific algorithm name.
// SecureRandom secureRandom = SecureRandom.getInstance("SHA1PRNG");

// You can also specify the algorithm provider in getInstance()
// SecureRandom secureRandom = SecureRandom.getInstance("SHA1PRNG", "SUN");

// A call to setSeed() seeds the SecureRandom object.
// If setSeed() is not called,
// the first call to nextBytes() forces the SecureRandom object to seed itself.

http://javadigest.wordpress.com/tag/securerandom-example/

I think it is best to let the SecureRandom seed itself.

This is done by calling nextBytes immediately after its creation (calling setSeed first will prevent this).

final byte[] dummy = new byte[512];

SecureRandom sr = SecureRandom.getInstance("SHA1PRNG");

sr.nextBytes(dummy);

http://stackoverflow.com/questions/12249235/securerandom-safe-seed-in-java

Instances of java.util.Random are not cryptographically secure. Consider instead using SecureRandom to get a cryptographically secure pseudo-random number generator for use by security-sensitive applications.

http://docs.oracle.com/javase/7/docs/api/java/util/Random.html

When generating random numbers in Java for cryptographic purposes, many developers use the java.security.SecureRandom class. However, if it is used improperly, the output can become predictable.
The java.security.SecureRandom class does not actually implement a pseudorandom number generator (PRNG) itself.

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers that approximates the properties of random numbers. The sequence is not truly random in that it is completely determined by a relatively small set of initial values, called the PRNG's state, which includes a truly random seed.
Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom numbers are important in practice for their speed in number generation and their reproducibility.
Cryptographic applications require the output to also be unpredictable, and more elaborate algorithms, which do not inherit the linearity of simpler solutions, are needed.
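
The reproducibility mentioned above is easy to observe with java.util.Random, whose output is fully determined by its seed. The class name, helper method, and seed value below are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedDeterminism {
    // Draw `count` ints from a generator created with the given seed.
    static int[] draw(long seed, int count) {
        Random prng = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = prng.nextInt();
        }
        return out;
    }

    public static void main(String[] args) {
        // Two generators with the same seed produce the same sequence.
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5))); // prints true
    }
}
```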
Because the java.security.SecureRandom class does not implement a PRNG itself, it uses PRNG implementations in other classes to generate random numbers.
The PRNGs are part of Java cryptographic service providers (CSPs). In Sun’s Java implementation, the SUN CSP is used by default.
On Windows, the SUN CSP uses the SHA1PRNG implemented in sun.security.provider.SecureRandom by default.

SecureRandom sr1 = new SecureRandom();
// The following will create SUN SHA1PRNG if the highest priority CSP is SUN
SecureRandom sr2 = SecureRandom.getInstance("SHA1PRNG");
// The following will always create SUN SHA1PRNG
SecureRandom sr3 = SecureRandom.getInstance("SHA1PRNG", "SUN");
According to Sun’s documentation, the returned java.security.SecureRandom instance is not seeded by any of these calls.
If java.security.SecureRandom.nextBytes(byte[]) is then called, the PRNG is seeded using a secure mechanism provided by the underlying operating system (starting with JRE 1.4.1 on Windows and JRE 1.4.2 on Linux and Solaris).

If java.security.SecureRandom.setSeed(long) or java.security.SecureRandom.setSeed(byte[]) is called before a call to java.security.SecureRandom.nextBytes(byte[]), then the internal seeding mechanism is bypassed, and only the provided seed is used to generate random numbers.
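
A small sketch of why this matters. It assumes a JDK where SHA1PRNG is supplied by the SUN provider, as described above; other platforms may behave differently. The class and method names are hypothetical. Two instances given the same seed before their first nextBytes call produce identical output:

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;

public class SeedBypassDemo {
    // Seed a fresh SHA1PRNG BEFORE its first nextBytes() call, then draw 16 bytes.
    static byte[] firstBytes(byte[] seed) {
        SecureRandom sr;
        try {
            sr = SecureRandom.getInstance("SHA1PRNG");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA1PRNG is not available", e);
        }
        sr.setSeed(seed); // bypasses the internal self-seeding
        byte[] out = new byte[16];
        sr.nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] seed = {1, 2, 3, 4};
        // Same hard-coded seed, same "random" bytes: the output is predictable.
        System.out.println(Arrays.equals(firstBytes(seed), firstBytes(seed)));
    }
}
```

This is the behavior you want for repeatable tests and exactly what you do not want for keys or tokens.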

Always specify the exact PRNG and provider that you wish to use. If you just use the default PRNG, you may end up with different PRNGs on different installations of your application that may need to be called differently in order to work properly. Using the following code to get a PRNG instance is appropriate:
SecureRandom sr = SecureRandom.getInstance("SHA1PRNG", "SUN");

When using the SHA1PRNG, always call java.security.SecureRandom.nextBytes(byte[]) immediately after creating a new instance of the PRNG. This will force the PRNG to seed itself securely. If, for testing purposes, you need predictable output, ignoring this rule and seeding the PRNG with hard-coded/predictable values may be appropriate.
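
The two recommendations above can be combined in one helper. This is a sketch rather than the article's own code: the class name is hypothetical, and failing fast when the SUN provider is missing is an assumed design choice (getInstance with a provider name throws checked exceptions that the one-line snippet above leaves unhandled).

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;

public class SeededSecureRandomFactory {
    // Request the exact PRNG and provider, then force self-seeding.
    static SecureRandom create() {
        SecureRandom sr;
        try {
            sr = SecureRandom.getInstance("SHA1PRNG", "SUN");
        } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
            // Fail fast instead of silently falling back to a different PRNG.
            throw new IllegalStateException("SHA1PRNG from provider SUN is not available", e);
        }
        // The first nextBytes() call makes the PRNG seed itself securely.
        sr.nextBytes(new byte[1]);
        return sr;
    }

    public static void main(String[] args) {
        SecureRandom sr = create();
        byte[] sessionKey = new byte[16];
        sr.nextBytes(sessionKey);
        System.out.println(sr.getAlgorithm() + " / " + sr.getProvider().getName());
    }
}
```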

http://www.cigital.com/justice-league-blog/2009/08/14/proper-use-of-javas-securerandom

For general statistics, Random is fine. It's a typical linear congruential generator.

SecureRandom is more random. Specifically, it aims to make it impossible to predict the next "random" number from a sequence, which is trivial to do with most linear congruential algorithms.

Consider a Monte Carlo simulation. You call the nextRan() function and are happy as long as the function's pseudo-random numbers pass the usual randomness tests.

Consider a cryptographic message protocol, where you generate random session keys. Once a few sequential keys are known, you do not want the bad guy (traditionally labelled Mallet or Eve) to be able to predict the next key generated from the "random" function.

So the use of a traditional linear congruential algorithm is not at all suitable in a crypto application.
http://www.coderanch.com/t/410832/java/java/Java-Random-SecureRandom
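
To make the predictability concrete, here is a sketch that recovers the state of a java.util.Random from two consecutive nextInt() outputs and predicts the third. The constants are the ones documented in the java.util.Random Javadoc (48-bit state, multiplier 0x5DEECE66D, addend 0xB); the class and method names are hypothetical. Each nextInt() reveals the top 32 bits of the state, so only the 16 hidden low bits need to be guessed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RandomPredictor {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    // Given two consecutive nextInt() outputs, return every candidate for the
    // third output (almost always exactly one candidate).
    static List<Integer> predictNext(int first, int second) {
        List<Integer> candidates = new ArrayList<>();
        long high = ((long) first & 0xFFFFFFFFL) << 16; // known top 32 bits of the state
        for (long low = 0; low < (1L << 16); low++) {    // brute-force the hidden 16 bits
            long state = high | low;
            long next = (state * MULTIPLIER + ADDEND) & MASK;
            if ((int) (next >>> 16) == second) {
                long afterNext = (next * MULTIPLIER + ADDEND) & MASK;
                candidates.add((int) (afterNext >>> 16));
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Random victim = new Random(); // the seed is unknown to the attacker
        int a = victim.nextInt();
        int b = victim.nextInt();
        int actual = victim.nextInt();
        System.out.println(predictNext(a, b).contains(actual)); // prints true
    }
}
```

At most 65,536 guesses are needed, which is why Random is fine for simulations but not for session keys.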