Difference between revisions of "Advanced Databases"

From Sinfronteras
Jump to: navigation, search
(Part II b - Database Design)
(Transactions)
 
Line 566: Line 566:
 
A transaction is a collection of one or more database operations that must all either complete successfully or fail entirely.
 
A transaction is a collection of one or more database operations that must all either complete successfully or fail entirely.
  
These typically constitute a single real-world transaction, such as he transfer of money between bank accounts, where if one account is debited, the other must be simultaneously credited.
+
These typically constitute a single real-world transaction, such as the transfer of money between bank accounts, where if one account is debited, the other must be simultaneously credited.
  
 
A transaction is an all or nothing proposition. There are only to possible outcomes for a transaction. Either it is committed and all the requested changes are made to the database or it aborts and no changes are made. This is what is referred to as '''atomicity'''.
 
A transaction is an all or nothing proposition. There are only to possible outcomes for a transaction. Either it is committed and all the requested changes are made to the database or it aborts and no changes are made. This is what is referred to as '''atomicity'''.

Latest revision as of 21:41, 5 January 2019

We will be using a number of sample databases throughout this module:

Contents

Workbench

https://dev.mysql.com/downloads/workbench/

Para isntalarlo se puede hacer desde los repositorios de Ubuntu:

sudo apt-get install mysql-workbench

También se puede descargar workbench de la página oficial de MySQL and double click the downloaded package; the software center takes over the installation.

La última vez que lo intalé, la versión de los repositorios estaba bastante desactualizada y presentó errores. Recomiendo instalar la última versión descargada de la página oficial.

SQL PSM - Persistent Stored Module

  • SQL PSM is an ISO standard defining procedural language extensions to SQL to allow the creation of stored procedures and has been part of the SQL standard since SQL:1999.
  • It provides syntatical elements to defne:
    • Control fow;
    • Exception handling;
    • Local variables;
    • Assignment of expressions to variables and parameters; and
    • Procedural use of cursors.
  • DB2 and MySQL’s procedure languages are the closest to the SQL/PSM standard, although the language was heavily infuenced by PL/pgSQL.

Stored Procedures

  • A stored procedure (SP) is a set of program statements compiled and stored in the RDBMS.
  • Given the required permissions, they are accessible by end users and applications.
  • Unlike regular SQL, which is a declarative language, they permit imperative programming styles, using control-flow constructs such as IF and FOR.
  • SPs can be nested, where one procedure can call another.
  • They support a limited degree of recursion, where a procedure can call itself.
  • Each RDBMS supports a different range of programming languages for creating SPs:
    • MySQL: SQL/PSM;
    • Oracle: PL/SQL, Java;
    • ...


A complex Query:

Which employee has the highest sales?

SELECT employeeNumber, CONCAT(lastName,,,firstName) AS employeeName, SUM(customerTotal) AS totalSales
FROM employees JOIN
     (SELECT salesRepEmployeeNumber, c.customerNumber, SUM(orderTotal) AS customerTotal
      FROM customers c
      JOIN orders o
      ON c.customerNumber = o.customerNumber
      JOIN
            (SELECT orderNumber, SUM(quantityOrdered * priceEach) AS orderTotal
            FROM orderdetails
            GROUP BY orderNumber) AS details
            ON o.orderNumber = details.orderNumber
            GROUP BY customerNumber) AS salesByCustomer
      ON employeeNumber = salesRepEmployeeNumber
      GROUP BY employeeNumber
      ORDER BY totalSales DESC;
  • In order to run this query without a stored procedure, the user would need permissions to access a number of different tables.
  • Standard RDBMS permissions are not suficiently fine-grained to allow only these queries.
  • In most scenarios we would want specific users to be able to execute only particular queries.
  • Using stored procedures, access permissions can be limited to only certain queries.

Generic Syntax

NorthwindDB
DELIMITER //
CREATE PROCEDURE getAllProducts()
BEGIN 
	SELECT *
    FROM   Products;
END //
DELIMITER ;

Calling this procedure is simple:

CALL getAllProducts();

Parameters

  • A stored procedure can take zero or more parameters and each can be one of three types of parameter:
    • IN parameters. Used to pass values from the caller to the procedure. These need to be passed in with the CALL statement. Any changes made to the parameter as part of the script are not passed back to the caller. This is the default if no type is specifed.
    • OUT parameters. Used to pass values from the procedure to the caller. The caller will need to declare and provide a suitable variable.
    • INOUT parameters. Used to pass values from the caller to the procedure and vice-versa. Any changes made to the parameter during the procedure are available to the caller when the procedure finishes executing.

IN Parameter Example

NorthwindDB
DELIMITER //
CREATE PROCEDURE getEmployeeByID(IN employee_id int)
BEGIN
	SELECT EmployeeID, LastName
    FROM   Employees
    WHERE  EmployeeID = employee_id;
END //
DELIMITER ;
CALL getEmployeeByID(7);

Como siempre, en este tipo de operaciones (EmployeeID = employee_id) el nombre del parameter tiene que ser distinto al nombre del Attribute. Es muy importante notar que SQL no hace distinción entre Mayúsculas y Minúsculas. Por tanto, definir la operación de arriba de la siguiente forma: EmployeeID = employeeid sería incorrecto.


Ejemplo con otra DB
DELIMITER //
CREATE PROCEDURE GetSalesByCountry (IN countryid int)
BEGIN
     SELECT transaction_id,country_id, trans_date_time,product_id,quantity,amount
     FROM sales
     WHERE country_id = countryid;
END//
DELIMITER ;
CALL GetSalesByCountry(353); 10/30

OUT Parameter Example

NorthwindDB
DELIMITER //
CREATE PROCEDURE GetTotalPrice(OUT totalprice decimal(19,4))
BEGIN
     SELECT SUM(UnitPrice) INTO totalprice
     FROM Products;
END //
DELIMITER ;
«decimal(19,4)» is the Type of variable. You can check it with DESCRIBE Products;
CALL GetTotalPrice(@totalprice);
SELECT @totalprice AS Total_price;


NorthwindDB
DELIMITER //
CREATE PROCEDURE GetTotalPriceUntilID(IN product_id int, OUT totalprice decimal(19,4))
BEGIN
     SELECT SUM(UnitPrice) INTO totalprice
     FROM Products
     WHERE ProductID < product_id;
END //
DELIMITER ;
CALL GetTotalPriceUntilID(6, @totalprice);
SELECT @totalprice AS Total_price;

INOUT Parameter Example

No DB involved in this operation
DROP PROCEDURE AddSalesTax;
DELIMITER //
CREATE PROCEDURE AddSalesTax(INOUT sales_amount DECIMAL(8,2), IN tax_percentage double)
BEGIN
     SET sales_amount = sales_amount + (sales_amount * tax_percentage);
END//
DELIMITER ;
SET @salesamount = 100.11;
CALL AddSalesTax(@salesamount, 50.20);
SELECT @salesamount;

Variables

  • Variables are used to temporarily hold values for processing during the execution of a stored procedure.
  • They are created using the DECLARE keyword and can be assigned a default value using the DEFAULT keyword.
  • With the exception of user-defined variables (those starting with a @), variables must be assigned a type, such as VARCHAR(10), INT.
  • Values can be assigned to variables either using:
    • SET: for assigning a variable a literal value or the value of another variable (or itself)
    • SELECT . . . INTO: to assign the results of a SELECT statement to the variable.


An example of the use of variables:

DELIMITER //
CREATE PROCEDURE GetTotalProductCount()
BEGIN
     DECLARE totalcount INT DEFAULT 0 ;
     SELECT COUNT (DISTINCT productCode)
     FROM current_products
     INTO totalcount;
     SELECT COUNT (DISTINCT productCode) + totalcount AS totalproductcount
     FROM archived_products;
END //
DELIMITER;

Control Flow

The IF statement

IF condition THEN
     ...statements...
ELSEIF condition THEN
     ...statements...
ELSE
     ...statements...
END IF;

The IF ELSE statement

IF condition THEN
     ...statements...
ELSE
     ...statements...
END IF;

The IF ELSEIF ELSE statement

IF condition THEN
     ...statements...
ELSEIF condition THEN
     ...statements...
ELSE
     ...statements...
END IF;

The CASE statement

  • The CASE statement is an alternative to IF statments with multiple ELSEIF clauses.
  • Code written with the CASE statement is more readble and also gives a slight increase in performance over the IF. . . ELSEIF alternative.
  • The CASE statement comes in two forms, "simple" and "searched"


The Simple CASE statement:

The simple CASE statement checks for a match against one or more non-conditional expressions. The ELSE clause is optional but will raise an error if no match is found

CASE expression
     WHEN expression_a THEN statements;
     WHEN expression_b THEN statements;
     ...
     ELSE statements;
END CASE;

CASE shippingType
     WHEN 'Express' THEN SET shipping_time = 1;
     WHEN 'Normal' THEN SET shipping_time = 3;
     ELSE SET shipping_time = 5;
END CASE;


The Searched CASE statement: The searched CASE statement allows the use of checks for a match against one or more conditional expressions. The ELSE clause is optional but will raise an error if no match is found.

CASE
     WHEN condition_a THEN statements;
     WHEN condition_b THEN statements;
     ...
     ELSE statements;
END CASE;

CASE
     WHEN (num_orders > 100 AND total_value > 100000) THEN SET cust_category = 'Premium';
     WHEN (num_orders > 50 OR total_value > 50000) THEN SET cust_category = 'Advantage';
     ELSE SET customer_category = 'Standard';
END CASE;

The WHILE statement

WHILE condition DO
     ...statements...
END WHILE;
WHILE a < 20 DO
     a = a + 1;
END WHILE;

The REPEAT statement

REPEAT
     ...statements...
UNTIL condition;

REPEAT
     a = a + 1;
UNTIL a = 20;

The LOOP statement

label : LOOP
...statements...
IF condition THEN LEAVE label;
END IF;
...statements...
END LOOP label;

Stored Functions

  • A stored function is stored routine that returns a single value.
  • It is not called like a stored procedure, but is used as part of an expression inside SQL queries or stored procedures.
  • Stored functions take only IN parameters and must specify a return data type (INT, VARCHAR, etc.)
  • The function must include at least one RETURN statement
  • If the function is deterministic (produces identical output for any given input) then this must be declared.
CREATE FUNCTION function_name(param1,param2,...) RETURNS datatype [NOT] DETERMINISTIC
BEGIN
     statements
END


DELIMITER //
CREATE FUNCTION inverse(value DOUBLE) RETURNS DOUBLE DETERMINISTIC
BEGIN
     RETURN 1/value;
END //
DELIMITER ;


DELIMITER //
CREATE FUNCTION customerLevel(creditLimit DOUBLE) RETURNS VARCHAR(10)
DETERMINISTIC
BEGIN
     DECLARE level VARCHAR(10);
     IF creditLimit > 50000 THEN SET level='PLATINUM';
     ELSEIF (creditLimit <=50000 AND creditLimit >=10000) THEN SET level='GOLD';
     ELSEIF creditLimit <10000 THEN SET level=SILVER;
END IF;
RETURN(lvl);
END //
DELIMITER ;

Triggers

  • Triggers are a special kind of stored procedure that executes automatically when certain operations are carried out on table.
  • Depending on the RDBMS in use they can be defined to run before, after or instead of DML (Data Manipulation Language) operations such as INSERT, UPDATE, and DELETE.
  • MySQL does not have an "instead of" option
  • They are useful for implementing complex integrity checks as well as auditing.
  • Although useful, they add some overhead to the RDBMS


Generic syntax for creating a trigger in MySQL:

CREATE TRIGGER «triggername» {BEFORE|AFTER} {INSERT|UPDATE|DELETE }
ON tablename
FOR EACH ROW {FOLLOWS|PRECEDES} «othertriggername»
BEGIN
END;
  • On creation, the trigger becomes associated with the table named tablename, which must refer to a permanent table.
  • Triggers can not be created for a TEMPORARY table or a view.
  • The optional FOLLOWS or PRECEDES «othertriggername» allows triggers to be fired in a specific sequence.


Creating triggers:

Although not required, it is good practice to follow a naming convention such as:

time_tablename_operation (for example: before_customer_update)
tablename_time_operation (for example: customer_before_update)
  • Inside the body of the trigger, the keyword OLD is used to refer to the row (tuple) before it is updated or deleted and NEW is used to refer to the row that will be inserted or an existing row after it is updated.
  • Triggers on INSERT operations can only use the NEW values, DELETE only to the OLD values, and UPDATE can use both.


Trigger Example:

DELIMITER \\
CREATE TRIGGER after_payroll_update AFTER INSERT ON payroll FOR EACH ROW
BEGIN
INSERT INTO payroll_audit SET
     operation       = 'INSERT',
     employeeNumber  = OLD.employeeNumber,
     gross_pay       = OLD.gross_pay,
     tax             = OLD.tax,
     prsi            = OLD.prsi,
     net_pay         = OLD.net_pay,
     user            = CURRENT_USER(),
     audit_timestamp = NOW();
END \\
DELIMITER ;


Triggers Exercise:

Using the Sakila database, create a trigger that sets the "Active" column of the customer record to True when they rent a video, if and only if the record is no longer false.

USE sakila;
DELIMITER \\
CREATE TRIGGER SetCustomerActive AFTER INSERT ON rental
FOR EACH ROW 
BEGIN
  UPDATE customer 
  SET active = TRUE 
  WHERE customer_id = NEW.customer_id;
END \\
DELIMITER ;

Events

An event is a special kind of trigger that executes according to a schedule rather than when a particular operation is carried out on a table. As such they can be considered time-based triggers.

Events require that the event scheduler thread is enabled. This can be done using the following code:

SET GLOBAL event_scheduler = ON;


To verify that the event scheduler is running, use the following command:

SHOW PROCESSLIST;


Generic syntax for creating an event in MySQL:

CREATE EVENT eventname
ON SCHEDULE eventschedule DO
eventbody;
CREATE EVENT IF NOT EXISTS test_event
ON SCHEDULE AT CURRENT_TIMESTAMP DO
INSERT INTO messages(message,created at)
VALUES('Test MySQL Event 1',NOW());

The eventschedule clause takes one of two forms, depending whether the event is a 'one-shot' or repeated. See the following slides for details.

Event Schedules

One shot

These examples assume the existence of a table created as follows:

CREATE TABLE IF NOT EXISTS messages (
    id INT PRIMARY KEY AUTO_INCREMENT,
    message VARCHAR(255) NOT NULL,
    created_at DATETIME NOT NULL
);


The AT timestamp syntax is used for a one-time event. In this case the event will execute once at the date and time given by timestamp:

CREATE EVENT IF NOT EXISTS test_event2 ON SCHEDULE
AT CURRENT_TIMESTAMP + INTERVAL 1 HOUR DO
INSERT INTO messages(message,created at)
VALUES(Test MySQL Event 2,NOW());

Repeated

To have an event repeat at a regular interval, the EVERY interval syntax is used. The generic form is:

EVERY interval
STARTS timestamp [+ interval] ENDS timestamp [+ interval]


Example:

CREATE EVENT IF NOT EXISTS test_event ON SCHEDULE
EVERY 1 HOUR STARTS CURRENT_TIMESTAMP
ENDS CURRENT_TIMESTAMP + 1 DAY DO
INSERT INTO messages(message,created at)
VALUES(Test MySQL Event 3,NOW());


Events Exercise:

  • Using the Sakila database, create an event that runs every night at midnight and returns a list of email addresses for customers who are overdue.

Cursors

A database cursor is a movable pointer to an individual row within a query result. This allows rows to be operated on individually, using the imperative programming style, instead of as a set, using the declarative programming style of standard SQL.

A cursor can be used inside any stored program, such as a stored procedure. Any given stored program may use more than one cursor provided that each is given a unique name.

Cursors in MySQL have the following properties:

  • They are read only, in other words, the result set is not updatable
  • Cursor read the entire results of their select statement into memory, so should be used with caution
  • They are non-scrollable. Cursors can only be traversed in one direction and rows can not be skipped.

Declaring and using cursors

Cursors are declared for a SELECT statement using the syntax below:

DECLARE cursor_name CURSOR
FOR SELECT ...;


In order to use the cursor, it must be opened. This is when the SELECT statement is executed:

OPEN cursor_name;


Rows in the cursor are accessed using the FETCH keyword:

FETCH cursor_name INTO variables;


The FETCH statement will raise a NOT FOUND condition when no more rows are available. In order to handle this, a NOT FOUND handler must be declared as part of the stored program. This usually sets a boolean variable that is used as part of a WHILE loop to a state where the loop terminates:

DECLARE CONTINUE HANDLER FOR NOT FOUND SET no_more_rows = 1;


When they are no longer needed, cursors need to be closed to release any associated memory used on the server:

CLOSE cursor_name;

Cursor Example

DELIMITER //
CREATE PROCEDURE build_email_list (INOUT email_list varchar(4000))
BEGIN
    DECLARE v_done INTEGER DEFAULT 0;
    DECLARE v_email varchar(100) DEFAULT "";
    DECLARE email_cursor CURSOR FOR SELECT email FROM employees; -- declare cursor
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = 1; -- declare NOT FOUND handler
    OPEN email_cursor; -- open the cursor
    FETCH email_cursor INTO v_email;
    WHILE NOT v_done DO -- loop over result set
        SET email_list = CONCAT(v_email,";",email_list);
        FETCH email_cursor INTO v_email;
    END WHILE;
END//
DELIMITER ;

Cursors Exercise

Using the Sakila database, create a stored procedure that uses a cursor to iterate through a query on the the rentals table giving the last rental for each customer and sets the active column in the customer table to false if this date is more than a year ago.

You will need to back up the database before doing this and restore it afterwards, as the procedure changes the data. The query to return the most recent rental for each customer is shown below. Use this as the select statement for your cursor.

SELECT r1.
FROM rental r1
WHERE r1.rental_date = (SELECT MAX(r2.rental_date)
                        FROM rental r2
                        WHERE r2.customer_id = r1.customer_id)

Transactions

A transaction is a collection of one or more database operations that must all either complete successfully or fail entirely.

These typically constitute a single real-world transaction, such as the transfer of money between bank accounts, where if one account is debited, the other must be simultaneously credited.

A transaction is an all or nothing proposition. There are only to possible outcomes for a transaction. Either it is committed and all the requested changes are made to the database or it aborts and no changes are made. This is what is referred to as atomicity.

ACID Properties

In a relational database, all transactions have four fundamental properties:

  • Atomicity, all parts of the transaction must either succeed or fail.
  • Consistency, the database must be transformed from one consistent state to another.
  • Isolation, the effects of incomplete transactions should not be visible to other transactions.
  • Durability, the effects of a committed transaction are permanent and should not be lost due to later failure.

Transactions in MySQL

There are four statements in MySQL that provide transaction functionality. The SET TRANSACTION statement is used to specify the properties of transactions.

SET [GLOBAL | SESSION] TRANSACTION
   [REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED | SERIALIZABLE]
   [READ WRITE | READ ONLY]

When the GLOBAL option is specified, the statement applies globally for all subsequent sessions. Existing sessions are unaffected. If the SESSION keyword is used, the statement applies to all subsequent transactions performed within the current session.

If neither is specified, the statement applies to the next transaction performed in the current session. Any subsequent transactions will use the SESSION isolation level.


Transaction isolation:

The SQL standard specifies four transaction isolation levels, the first two of which are described below:

  • Read Uncommitted - Any transaction may read changes made by other transaction that have not yet been committed, allowing "dirty reads" to take place. Transactions with this level are not isolated from each other.
  • Read Committed - A transaction may only read is changes that have been committed by other transactions, disallowing "dirty reads." In order to achieve this, a transaction will place a read or write lock on the row it is currently updating, preventing other transactions from reading, updating or deleting it.
  • Repeatable Read - A transaction will place read locks on all rows it references and write locks on all rows it will modify. Other transactions cannot read, update or delete these rows, ensuring all reads are repeatable.
  • Serializable - This is the most restrictive level and guarantees that a transaction will retrieve exactly the same data when it repeats a read operation. However, the locking strategy associated with this level can cause problems for other users in multi-user systems.

All four are supported by MySQL when the InnoDB engine is used. The default isolation level is REPEATABLE READ.


Starting a transaction:

Transactions are initiated with the START TRANSACTION command:

START TRANSACTION [WITH CONSISTENT SNAPSHOT | READ WRITE | READ ONLY]

The WITH CONSISTENT SNAPSHOT option applies only to InnoDB databases and is effectively the same as issuing a START TRANSACTION followed by a SELECT from any InnoDB table.

The READ WRITE and READ ONLY options are used to allow or disallow changes to the tables used in the transaction. The READ ONLY option prevents the transaction from modifying or locking both transactional and nontransactional tables that are visible to other transactions.


Committing and rolling back a transaction:

Assuming a transaction has been started with START TRANSACTION:

  • The COMMIT statement will commit all the changes made by the transaction, making its changes permanent.
  • The ROLLBACK statement will “roll back”" all of the transaction, cancelling its changes.

Rollbacks can happen for a variety of reasons, either under the control of the database programmer or the system:

  • Account balance is not sufficient to complete transfer
  • Order quantity is less than the quantity in stock
  • Timeout of a transaction or recovery from a system crash

Exception Handling

When an error occurs in a stored procedure, stored function or trigger, it is important that it is handled appropriately, rather than allowing the database engine to return a (cryptic) error message.

MySQL provides straightforward functionality to define handlers that deal with a range of error conditions from the more general, such as warnings or exceptions, to more specific error codes.

Declaring Exception Handlers

Handlers must be declared after variables and cursors, and use the following syntax:

DECLARE <handler action> HANDLER FOR <condition value>
<statement>


Where handler action can be one of:

  • CONTINUE - Execution of the current program continues.
  • EXIT - Execution terminates for the BEGIN . . . END compound statement in which the handler is declared. This is true even if the condition occurs in an inner block.

A further value "UNDO" is currently not supported in MySQL but is intended to execute a rollback.


Where condition value can be one of:

  • An integer literal indicating a MySQL error code;
  • SQLSTATE value other than those that begin with ‘00’, as these indicate success rather than an error condition;
  • SQLWARNING condition - shorthand for any SQLSTATE value that begins with ‘01’; or
  • SQLEXCEPTION condition - shorthand any SQLSTATE value that do not begin with '00', '01', or '02'.
  • NOT FOUND condition - shorthand for SQLSTATE values that begin with ‘02’; or
  • Named condition associated with a MySQL error code or SQLSTATE value.

Examples

If SQLSTATE 23000 (duplicate key value) is encountered, issue an error message for the calling program (using the SELECT statement) and then continue execution:

DECLARE CONTINUE HANDLER FOR SQLSTATE '23000'
     SELECT 'Duplicate key in index';


If MySQL error value is 1216 (FK constraint violated) issue an error message to the calling program and exit.

BEGIN
     DECLARE CONTINUE HANDLER FOR 1216 SELECT 'Foreign key constraint violated';
     INSERT INTO employees(first_name,surname, date_of_birth, department_id)
          VALUES ('Jane','Smith','1997-03-12',12);
END;


Example using Sakila Film table

If the title of a film being inserted into the film table are identical to those that already exist, then return a message to calling program.

DELIMITER $$
CREATE PROCEDURE insert_film(IN title VARCHAR(255) ... <other columns>)
BEGIN
     -- declare a continue handler
     DECLARE CONTINUE HANDLER FOR 1062
     SELECT CONCAT('Warning: Film with title ',title,' already found') AS msg;
     -- insert a new record into film
     INSERT INTO film(title, <other columns>) VALUES(title, <other values>);
END


Exercise:

  • Using the Sakila database, add a unique index for film title and year, as shown below. We will revisit indexing in the coming weeks.
ALTER TABLE film ADD CONSTRAINT uk_film_year
UNIQUE KEY (title,release_year);
  • Create a stored procedure to add a film to the film table, taking IN parameters for all columns except film_id and last_update.
  • In the stored procedure, output a message if the combination of the title and year are not unique.

Optimisation

Optimisation Through Denormalisation and Partitioning

  • Normalisation is a technique of group attributes together in a relation due to a functional dependency between them.
  • Normalisation results in a logical database design that has minimal redundancy.
  • However, a normalised database design does not necessarily provide optimum eficiency.
  • In some circumstances it may be preferable to forego of some of the benefits of a normalised design to achieve performance.


  • Denormalisation should only be considered when the performance requirements of a database are unlikely to be met by other means.
  • Denormalisation makes the implementation of the database considerably more complex.
  • Denormalisation frequently makes the design less flexible.
  • Denormalisation is generally used to speed up reads but can negatively impact the speed of writes..
  • If two or more relations have a low insert and update rate and a very high query rate, denormalisation may be a viable option.


  • Increased query speed often comes at the cost of reduced data integrity.
  • Denormalisation is a human intensive process.
  • It introduces data redundancy, so not only can updates be expensive but anomalies can be introduced.
  • There isn’t much by way of academic literature to suggest which denormalisation strategies are best.


  • El Prof. explicó que normalisation requiere muchos «JOINS», operación que es muy costosa. Ésta es una de las razones principales por las que normalisation podría no ser la mejor opción.

Database System Resources

  • Optimising the performance of an RDBMS involves:
    • Maximising throughput;
    • Minimising response time; and optionally
    • Minimising storage requirements
  • In order to optimise an operational RDBMS, the database designer needs to develop an awareness of how the hardware components of the system impact on the performance of the RDBMS. These are:
    • The main memory (RAM)
    • The processor (CPU)
    • Secondary storage (disks)
    • Network

Main Memory

  • Many RDBMS operations require the use of main memory (RAM).
  • Main memory access is up to a thousand times faster than accessing secondary storage.
  • When the amount of memory required exceeds available memory, the operating system will transfer the least frequently-used pages of memory to secondary storage (virtual memory) to make more physical memory in a process known as swapping or paging. (esto por supuesto hará que el sistema funcione lento si requiere acceder a la swap memory.)
    • The next time one of the swapped pages is required, the OS moves it back from disk into physical memory.
  • Serious performance problems will occur if the OS becomes heavily reliant on paging to allocate memory to running processes.
  • In order to optimise RAM usage by the RDBMS, we need to know how it uses memory, and how users are likely to use the database.
  • If the installed memory is deemed insuficient, adding memory is the only solution, provided that the maximum installable memory has not already been reached. This relatively inexpensive and will significantly improve performance.

The processor

Secondary storage

RAID

Network

Profiling Server Performance

  • In the context of a relational database, performance is response time.
  • The response time is a combination of the time spent doing useful work and that spent waiting to do useful work.
  • Optimising performance is primarily about measuring and only when sufficient data are gathered, making careful, informed changes.
  • Knowing what to measure and when to measure it is of paramount importance.
  • Performance problems can either be due to a single query (or a number of queries) or can be system-wide.

Optimising Server Performance

Performance Tuning

The following MySQL settings have a major impact on system performance:

innodb_buffer_pool_size

innodb_log_file_size

max_connections

innodb_file_per_table

innodb_flush_method

innodb_log_buffer_size

Monitoring Server Performance

Query Optimisation

First a look at how MySQL executes queries

The EXPLAIN keyword

The EXPLAIN keyword will show you the details of the query plan so that you can see where bottlenecks may be occuring. It is an essential tool in query debugging, but only works on SELECT statements.

The Enhanced Entity-Relation Model

  • The basic E-R model, while appropriate for many situations, has its limitations.
  • A number of extensions to the E-R model allow for more advanced modeling:
    • Subclasses, Superclasses and Inheritance
    • Generalisation and specialisation
    • Aggregation
    • Modelling of UNION types using categories
  • These extensions can also be converted to a relational model.

Subclasses - Superclasses - Inheritance

  • Inheritance is when subclass entity shares all the attributes and relationships of its superclass
  • Single inheritance is where a subclass inherits from one and only one superclass. Some modeling tools are limited to single inheritance.
  • Mutiple inheritance is where a subclass inherits from more than one superclass.

Generalisation and specialisation

Generalisation

Generalization is a bottom-up approach in which two lower level entities combine to form a higher level entity. In generalization, the higher level entity can also combine with other lower level entities to make further higher level entity.

It's more like Superclass and Subclass system, but the only difference is the approach, which is bottom-up. Hence, entities are combined to form a more generalised entity, in other words, sub-classes are combined to form a super-class.

  • Example 1: a car (with attributes colour, engine size and price) and truck (with attributes colour, engine size, tonnage and price) can be generalised into vehicle (with attributes colour, engine size and price).
  • Example2: Saving and Current account types entities can be generalised and an entity with name Account can be created, which covers both.
Generalization

Specialisation

Specialization is opposite to Generalization. It is a top-down approach in which one higher level entity can be broken down into two lower level entity. In specialization, a higher level entity may not have any lower-level entity sets, it's possible.

  • Example: Employees. They have attributes in common but also distinct attributes:
    • Secretaries
    • Managers
    • Engineers
Specialization

Constraints on Specialisation and Generalisation

Disjointness Constraint

This specifies whether a supertype can be a member of two or more subtypes. It has two rules:

  • Disjoint Rule: an instance of a supertype may not simultaneously be a member of two (or more) subtypes.
    • This specifies that the subclasses of a specialisation must be disjointed. This means that an entity must be a member of at most one of the specialisation's subclasses.
    • The disjoint rule is specified by adding a 'd' at the junction between the superclass and the subclasses.
An EER diagram with disjoint subclasses. In a company, an employee can either be (for instance) a secretary, a programmer or a manager, but not more than one.


  • Overlap Rule: an instance of a supertype may simultaneously be a member of two (or more) subtypes.
    • If the specialisation is not disjointed, then it is overlapped.
    • An overlapped specialisation allows an entity to be a member of more than one subclass of the specialisation.
    • The overlapped constraint is specified by adding a 'o' at the junction between the superclass and the subclasses.
An EER diagram with overlapping subclasses. In a university, a person can be a student, a member of faculty. They can also be both, for instance, when a student is also a teaching assistant.
Completeness Constraint
  • A total completeness constraint specifies that every entity in the superclass must be a member of some subclass in the specialisation or generalisation.
  • A total completeness constraint is denoted in the EER diagram using a double line.
  • When an entity does not have to belong to any of the subclasses, this is known as partial completeness constraint.
  • A partial completeness constraint is shown in EER diagrams by a single line

Aggregration

Aggregration is a process when relation between two entities is treated as a single entity.

Aggregration

In the diagram above, the relationship between Center and Course together, is acting as an Entity, which is in relationship with another entity Visitor. Now in real world, if a Visitor or a Student visits a Coaching Center, he/she will never enquire about the center only or just about the course, rather he/she will ask enquire about both.

Union type

Partial union: Not every member of the sets of superclasses must participate. In the example on the previous slide, not every social fund or employer has to be a sponsor. This is shown as a single line between the union circle and the subclass (sponsor, in this example).

Total union: Every member of the sets of superclasses must participate. This is shown as a double line between the union circle and the subclass. In the example on the previous slide, if every employer and social fund had to be a sponsor, the line between the union circle and the sponsor entity would be doubled.


An EER diagram with a union subclass. A sponsor paying a student’s fees is either an employer, a government social fund, but not both.

Strong - Weak - Associative Entities

A strong entity can exist independent of any other entity in the schema. It has a primary key and does not have to have a total participation in a relationship. It is represented by a single rectangle.

A weak entity can not exist without a related strong entity. It is dependent on the strong entity for its existence and always has a total participation constraint. It cannot be uniquely identified by its attributes alone and must use a foreign key in conjunction with its attributes to create a primary key. It is represented by a double rectangle.

An associative entity is used to relate two entities as part of a many-to-many relationship, by referencing the primary key of each relation. It is denoted by a dotted rectangle.

Strong, Weak and Associative Entities - Left: A loan payment can not exist without a loan. Right: A student can take many courses. A course can have many students.

Idenfifying and Non-Idenfifying Relationships

In an identifying relationship, a row in a child table depends on a row in a parent table and can not exist without it. For example, a Person can have one or more phone numbers. In order to facilitate this, a second table named PhoneNumbers is needed. The primary key includes the PersonID referencing the primary key in the Person table. An identifying relationship is shown with a double diamond

Idenfifying Relationships


A non-identifying relationship is when the primary key of the parent does not form part of the primary key of the child. For example, in a lookup table for cities, with a foreign key on the Person entity referencing the primary key of the City lookup table. Person is a child table with respect to City. However, a row in Person can not be considered to be identified by its city attribute.

A non-identifying relationship can be either optional or mandatory, where the foreign key column in the child table permits NULLs or not. A non-identifying relationship is shown with a single diamond

Non-Idenfifying Relationships

Mapping an EERD to a Relational Schema

  • For each entity type without subclasses, create a relation and assign it a primary key.
  • For each weak entity type, create a relation. The primary key will be a combination of the identifier of the parent entity and the identifier of the weak entity. In other words, it will be a composite primary key.
  • For a single optional subclass, create a relation for the superclass and give it a primary key, then create a relation for the single subclass, giving it the same primary key as the superclass, and add any attributes specific to the subclass.
  • For multiple disjoint optional subclasses, create a relation for the superclass and give it a primary key. Create a relation for each subclass, giving each one the same primary key as the superclass. Then add any attributes specific to each subclass.
  • For multiple disjoint mandatory subclasses, create a relation for each of the subclasses using the same primary key you would have chosen if you had created a superclass, then add the all the attributes specific to each subclass.
  • For each overlapping optional subclass, create one single relation to represent both the superclass and all its subclasses. Give it a primary key and an additional attribute to specify class membership. This is not recommended if each subclass has many specific attributes, as it will result in numerous null values for specific subclass attributes.
  • For each overlapping mandatory subclass, create a single relation to represent the superclass and all its subclasses. Provide a primary key and logical attributes for each subclass. At least one of these logical attributes should be set to TRUE.

Exercises

Exercise 1

Translate the following EERD to a relational schema. Are there any elements of the design that do not appear in the schema?

Exercise1-Translate the following EERD to a relational schema.png

Exercise 2

Translate the following EERD to a relational schema. Are there any elements of the design that do not appear in the schema?

Exercise2-Translate the following EERD to a relational schema.png

Assessment

Part I - Procedural SQL - Individual Assessment

Submit your code for this assessment as a single SQL file, naming it with your student number xxxxxxxx.sql, where xxxxxxxx is your student number. This should be submitted to the Individual Assessment link on Moodle.

Due date: 18th November 2018, 23:55 (at the very latest).

Question 1 - Load the NorthWind database using the supplied SQL format backup file

http://perso.sinfronteras.ws/index.php/File:NorthwindDB.zip

mysql -u root -p < /home/adelo/1-system/desktop/it_cct/3-Advanced_Databases/0-1-databases_usadas_en_este_curso/NorthwindDB/NorthwindDB.sql

Question 2 - Create a stored procedure

Using the NorthWind database, create a stored procedure named sp_LowStock to return the following columns (in order) for all products where the stock level is below the reorder level:

  • Supplier Name;
  • Product Name;
  • Stock Deficit (units in stock minus reorder level); and
  • Reorder Level

Create SQL statements to execute this procedure and display the results.

NorthwindDB
DELIMITER //
DROP PROCEDURE IF EXISTS sp_LowStock()//
CREATE PROCEDURE sp_LowStock()
BEGIN
	SELECT CompanyName  as "Supplier Name", 
		   ProductName  as "Product Name",
		   UnitsInStock - ReorderLevel as "Stock Deficit",
		   ReorderLevel as "Reorder Level"
	FROM   Products p INNER JOIN Suppliers s
	ON     p.ProductID = s.SupplierID and UnitsInStock < ReorderLevel;
END //
DELIMITER ;

CALL sp_LowStock();

Question 3 - Create a stored procedure

Create a stored procedure called sp_EmployeeTerritories that takes one IN parameter (the Employee ID) and returns the following columns using two OUT parameters:

  • Employee Name (joining Title, FirstName and LastName as one column); and
  • Territories (a comma-separated list of territories assigned to the employee).

Create SQL statements to execute this procedure for an employee id of your choosing and display the results.

NorthwindDB
DELIMITER //
DROP PROCEDURE IF EXISTS sp_EmployeeTerritories//
CREATE PROCEDURE sp_EmployeeTerritories(
     IN  employee_id      int,
     OUT employee_details char(200),
     OUT terr_assigned    char(200)
)
BEGIN
     SELECT DISTINCT
            CONCAT(Title,", ",FirstName," ", LastName)        AS "Employee",
            GROUP_CONCAT(TerritoryDescription SEPARATOR ", ") AS "Territories assigned"
     INTO   employee_details, terr_assigned
     FROM   Employees e INNER JOIN EmployeeTerritories et INNER JOIN Territories t
     ON     e.EmployeeID  = et.EmployeeID  AND
            t.TerritoryID = et.TerritoryID AND
            e.EmployeeID  = employee_id;
END //
DELIMITER ;

CALL sp_EmployeeTerritories(4, @employee_details, @terr_assigned);
SELECT @employee_details AS "Employee", @terr_assigned AS "Territories assigned";

Question 4 - Create a stored procedure

Create a stored procedure called sp_AddOrderItem that takes four IN parameters:

  • The order id;
  • The product id;
  • The quantity; and
  • A discount amount;

and returns the quantity added to the order.

This procedure will need to:

  • check the quantity of the product in stock and add the requested quantity if sufficient stock is available, otherwise add the available quantity;
  • fetch the current price from the products table and use that as the unit price;
  • deduct the quantity ordered from the units in stock and adjust the units in stock accordingly; and
  • Contain an exception hander to handle duplicated order items. You will need to decide if this should be a CONTINUE or EXIT handler.
DELIMITER //
DROP PROCEDURE IF EXISTS sp_AddOrderItem//
CREATE PROCEDURE sp_AddOrderItem(
    IN  in_order_id        int,
    IN  in_product_id      int,
    IN  in_quantity_asked  int,
    IN  in_disc_amount     float,
    OUT out_quantity_added int
)

BEGIN
    DECLARE v_units_in_stock int;
    DECLARE v_unit_price     decimal(19,4);
    DECLARE v_in_order_id    int;
    DECLARE v_in_product_id  int;
    DECLARE v_quantity_order_details int;

	/*
    DECLARE EXIT HANDLER FOR NOT FOUND
		BEGIN
			CASE
				WHEN v_in_order_id    IS NULL THEN SELECT CONCAT('OrderID: ', in_order_id,   ' Not found')                AS WARNING;
				WHEN v_in_product_id  IS NULL THEN SELECT CONCAT('ProductID: ', in_product_id, ' Not found')                AS WARNING;
				WHEN v_units_in_stock IS NULL THEN SELECT CONCAT('UnitsInStock Not found for ProductID: ', in_product_id) AS WARNING;
				WHEN v_unit_price     IS NULL THEN SELECT CONCAT('UnitPrice not found for ProductID: ', in_product_id)    AS WARNING;
			END CASE; 
		END;
	*/
    
    SELECT  OrderID INTO v_in_order_id FROM Orders WHERE OrderID = in_order_id;

    SELECT  ProductID,       UnitPrice,    UnitsInStock
    INTO    v_in_product_id, v_unit_price, v_units_in_stock
    FROM    Products
    WHERE   ProductID = in_product_id;
    
    SELECT Quantity
    INTO   v_quantity_order_details
    FROM   Order_Details
    WHERE  OrderID = in_order_id AND ProductID = in_product_id;
	
    
	CASE
		WHEN v_in_order_id IS NULL THEN
		SELECT CONCAT('OrderID: ', in_order_id, ' Not found') AS WARNING;
		SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'OrderID Not found';
		
		WHEN v_in_product_id IS NULL THEN
		SELECT CONCAT('ProductID: ', in_product_id, ' Not found') AS WARNING;
		SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'ProductID Not found';
		
		WHEN v_units_in_stock IS NULL THEN
		SELECT CONCAT('UnitsInStock Not found for ProductID: ', in_product_id) AS WARNING;
		SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'UnitsInStock Not found (NULL) for ProductID entered';
		
		WHEN v_unit_price IS NULL THEN
		SELECT CONCAT('UnitPrice not found for ProductID: ', in_product_id) AS WARNING;
		SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'UnitPrice not found (NULL) for ProductID entered';
        
		ELSE BEGIN END;
	END CASE;
    
    
    IF  v_quantity_order_details IS NOT NULL THEN
		SET v_units_in_stock = v_units_in_stock + v_quantity_order_details;
    END IF;
    
    
	IF  (v_units_in_stock = 0) THEN
		SELECT CONCAT("No units in stock for ProductID: ", v_in_product_id) AS WARNING;
		SIGNAL SQLSTATE '45000'
		SET MESSAGE_TEXT = 'No units in stock for selected product';
	END IF;
	
    
    IF  (v_units_in_stock  >= in_quantity_asked) THEN
		SET out_quantity_added = in_quantity_asked;
	ELSE
		SET out_quantity_added = v_units_in_stock;
        SELECT CONCAT("Units in stock (",v_units_in_stock,") less than units asked (",in_quantity_asked,"). Only ", v_units_in_stock, " units were added to to order") AS WARNING;
	END IF;


    START TRANSACTION;
               
        UPDATE  Products
        SET     UnitsInStock = v_units_in_stock - out_quantity_added
        WHERE   ProductID = in_product_id;
        
        
        IF v_quantity_order_details IS NOT NULL THEN
			UPDATE  Order_Details
			SET     UnitPrice = v_unit_price,
					Quantity  = out_quantity_added,
                    Discount  = in_disc_amount
			WHERE   OrderID   = in_order_id  AND  ProductID = in_product_id;
		ELSE
			INSERT  INTO  Order_Details  values(in_order_id,
												in_product_id,
											    v_unit_price,
											    out_quantity_added,
											    in_disc_amount);
		END IF;
        
    COMMIT;

END //
DELIMITER ;
CALL sp_AddOrderItem(10248,9,9,0,@out_quantity_added);
SELECT @out_quantity_added AS "Quantity added to the order";

Question 5 - Create a stored function

Create a stored function called OrderFulfilmentCycleTime to return the average difference between the date the order was placed and the date it was shipped.

DELIMITER //
DROP FUNCTION IF EXISTS OrderFulfilmentCycleTime//
CREATE FUNCTION OrderFulfilmentCycleTime() RETURNS decimal(19,1) DETERMINISTIC
BEGIN
    RETURN(SELECT AVG(DATEDIFF(ShippedDate, OrderDate)) FROM Orders);
END //
DELIMITER ;

select OrderFulfilmentCycleTime();


-- -- Pruebas:
-- SELECT AVG(DATEDIFF(ShippedDate, OrderDate))
-- FROM Orders;

-- SELECT OrderID, ShippedDate, OrderDate, (DATEDIFF(ShippedDate, OrderDate))
-- FROM Orders;

-- --

-- SELECT AVG(DATE(ShippedDate)-DATE(OrderDate))
-- FROM Orders;
 
-- SELECT OrderID, ShippedDate, OrderDate, (DATE(ShippedDate)-DATE(OrderDate))
-- FROM Orders;

Question 6 - Create a trigger

Create a trigger for the orders table to prevent more than one order being deleted at a given time and returning an error message to the client. You will need to determine when this trigger should be fired.

DELIMITER //
DROP TRIGGER IF EXISTS before_Orders_delete//
CREATE TRIGGER before_Orders_delete BEFORE DELETE ON Orders FOR EACH ROW
BEGIN
-- DECLARE contar1 int;
-- DECLARE contar2 int;
-- SELECT COUNT(*) INTO contar1 FROM Orders where Orders.OrderID=old.OrderID;
-- SELECT COUNT(*) INTO contar2 FROM Orders;
-- set @a = ROW_COUNT();

    -- IF ROW_COUNT() > 0 THEN
    -- IF contar2 < (contar1 - 1) THEN
    
    IF @contar IS NULL THEN
        SET @contar = 1;
    ELSE
        SET @contar = NULL;
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'You are trying to delete more than one row at the same time and that is not allowed. No column has been deleted.';
    END IF;
END //
DELIMITER ;

DELETE FROM Orders WHERE OrderID=10469 OR OrderID=10473;
-- DELETE FROM Orders WHERE OrderID > 10374;
SELECT ROW_COUNT();

Question 7 - Create a Trigger

Create a table SupplierOrders with the following structure.

  • ProductID int(11)
  • SupplierID int(11)
  • Date datetime
  • Quantity smallint(6)

You will need to choose an appropriate primary key for the table.

Once you have created the table, then create a trigger that automatically inserts an order in this table when the UnitsInStock value for the product goes below ReorderLevel. You will need to determine when this trigger will be fired.


DROP TABLE IF EXISTS SupplierOrders;

CREATE TABLE SupplierOrders(
    ProductID  INT(11)      NOT NULL,
    SupplierID INT(11)      NOT NULL,
    Date       DATETIME     NOT NULL,
    Quantity   SMALLINT(6),
    CONSTRAINT SupplierOrders_PK           PRIMARY KEY (ProductID, SupplierID, Date),
    CONSTRAINT SupplierOrders_Products_FK  FOREIGN KEY (ProductID)  REFERENCES Products  (ProductID),
    CONSTRAINT SupplierOrders_Suppliers_FK FOREIGN KEY (SupplierID) REFERENCES Suppliers (SupplierID)
);

DELIMITER //
DROP TRIGGER IF EXISTS after_Products_update//
CREATE TRIGGER after_Products_update AFTER UPDATE ON Products FOR EACH ROW
BEGIN
    IF  (NEW.UnitsInStock < NEW.ReorderLevel) THEN
        INSERT INTO SupplierOrders values(NEW.ProductID,
                                          NEW.SupplierID,
                                          CURRENT_TIMESTAMP,
                                          100 + NEW.ReorderLevel);
    END IF;
END //
DELIMITER ;

UPDATE  Products
SET     UnitsInStock = 24
WHERE   ProductID    = 22;

Question 8 - Create a Event

Add a column to the Products table to indicate whether or not the product has not been sold during the last 12 months. Call this column LegacyProduct giving it a default value of FALSE. Create an event called LegacyProducts that checks if the product has been ordered within the last twelve months and sets the value of the LegacyProduct to TRUE if it has not. This event should fire once per week, at 2330 on Sundays.

ALTER TABLE Products DROP LegacyProduct;

ALTER TABLE Products ADD LegacyProduct boolean NOT NULL DEFAULT FALSE;

SET GLOBAL event_scheduler = ON;

DELIMITER //
CREATE EVENT IF NOT EXISTS LegacyProducts ON SCHEDULE
-- EVERY 2 MINUTE STARTS "2018-11-25 01:54:00" DO
EVERY 1 WEEK STARTS "2018-11-25 23:30:00" DO
BEGIN
-- SELECT * FROM Products WHERE ProductID IN (
UPDATE Products SET LegacyProduct = TRUE WHERE ProductID NOT IN (
    SELECT ProductID
    FROM   Orders o INNER JOIN Order_Details od
    ON     o.OrderID = od.OrderID
    WHERE  OrderDate > DATE_SUB(NOW(), INTERVAL 1 YEAR)
);
END //
DELIMITER ;

Part II - Group Assessment

Part II a - Query Optimisation

Question 1 - Propose an example query to demonstrate something you have studied in the recommended reading or found online

Propose an example query to demonstrate something you have studied in the recommended reading or found online. Use either the Employees Database or the Northwind database for this task and reference the pages of the recommended reading or provide the URL of the web site where you got the idea to test this type of SQL query.

Note: A summary of your proposal should be posted to Moodle in the All Questions Forum under the topic Assessment Part IIa. The query you intend to test cannot be already posted by another group.

Question 2 - State what you expect to happen for the chosen database

State what you expect to happen for the chosen database. Give a detailed explanation, including calculations of rows you expect to be accessed etc., clearly outlining what you think should happen.

Question 3 - Write imaginary usage statistics for your chosen query

Write imaginary usage statistics for your chosen query: i.e. "we will suppose that this query runs 60 times per hour on average & 150 times per hour at peak". Provide a transaction analysis as per the Connolly and Begg book, figure 18.4 in 5th and 6th Editions, figure 17.4 in 4th Edition.

Question 4 - Give a detailed account of what you did

Give a detailed account of what you did, including:

a) a full SQL audit trail;
b) an explanation of each step you carried out;
c) the state of the database before the query execution;
d) the output of the EXPLAIN command;
e) the state of the database after the query execution; and
f) the time taken for query execution
Question 5 - Comment on the actual outcome versus the expected outcome

Comment on the actual outcome versus the expected outcome. State whether or not there is a difference between what you expected and what you got – give an explanation.

Question 6 - Propose an optimisation of for an update, delete or insert operation

Propose an optimisation of for an update, delete or insert operation. Outline the effect your optimisation can be expected to have.


Part II b - Database Design

A university's central library wishes to create a database to store details of its titles and borrowers. Details include the following:

  • Each book has a unique ISBN number, a title, an edition and one or more authors.
  • Each serial has a unique ISSN number, a title, an issue date, an editor and one or more authors
  • The library often stocks several copies of any book or serial publication, each of which is given a unique accession number. The price that was paid for each copy and its location is also recorded.
  • The central library has one or more branch libraries in individual faculty buildings.
  • Each borrower has a name and student number. A borrower can have more than one book or serial publication on loan, but each physical copy can only be on loan to one borrower.
  • Some books are designated "reference only" and can not be borrowed, but only read in the library.
  • A borrower may borrow the same book on several separate occasions.


  • Using an appropriate tool, create an Extended-Entity-Relationship model to comply with the above specification.


Extended Entity-Relationship model

We are going to use the Chen's ERD style.

Regular entities and attributes
Book Serial Publication Borrower
accessionNumber accessionNumber accessionNumber studentNumber
ISBNnumber ISBNnumber ISBNnumber name
title title title
authors authors authors
location location location
pricePaid pricePaid pricePaid
edition editor
referenceOnly issueDate
  • Each book has a unique ISBN number. The library often stocks several copies of any book or serial publication, each of which is given a:
    • Unique accession number: So, this would be the primary key because is unique for each copy
    • The price that was paid for each copy and
    • Its location is also recorded
  • Some books are designated "reference only" and can not be borrowed, but only read in the library.


  • authors: Multi-value attribute. A publication can have more than one author.
    • In this case, a second table named «Authors» is needed. The primary key will includes the ISBNnumber referencing the primary key in the Publication table. There will be an identifying relationship between «Publication» and «Author».
Relationship between entities
  • Cardinality ratios:
    • A borrower can have more than one book or serial publication on loan, but each physical copy can only be on loan to one borrower:
      • One-to-many (1:M) relationship.


  • Partial/Total participation:
    • Logical Assumptions:
      • An item (Book or Serial) can not be on loan at a given time: Partial participation
      • A Borrower can have no books on loan at a given time: Partial participation


  • Generalization:
    • In this case «Books» and «Serials» can be generalized into «Publications»
    • This generalization must be disjointed. A Publication can either be a Book or a Serial, but not both.
    • No me queda completamente clara la diferencia entre Disjoint/Overlap con Union

  • «Borrower»Partial 1<borrows>N «Publication»Partial


  • A borrower may borrow the same book on several separate occasions.

Part III - Individual Report on Group Work

  • What parts did you participate in for Parts IIa and IIb above?
  • What did you learn from working within a team?
  • What would you do differently if you had to build it again?
  • What did you find most difficult to implement or understand.