Difference between revisions of "Data Structures & Algorithms"

From Sinfronteras
Jump to: navigation, search
(Stack Abstract Data Type)
(Array-based Stack)
Line 296: Line 296:
 
* '''boolean isEmpty()''': indicates whether no elements are stored
 
* '''boolean isEmpty()''': indicates whether no elements are stored
  
===Array-based Stack===
+
===Implementing an Array-based Stack in Java===
  
 
==The Queue ADT==
 
==The Queue ADT==

Revision as of 15:29, 22 October 2018

What is Data Structure?

Diferentes definiciones:

  1. A data structure is the logical or mathematical model of a particular organization of data.
  2. A data structure is a way to logically organize data that specifies:
    • A set of data elements i.e., a data object and,
    • A set of operations which may legally be applied to elements of this data object.
  3. A data structure is a method of organizing information so that the information can be stored and retrieved efficiently.


  • A data structure not only stores data, but also supports the operations for manipulating data in the structure.


  • For example, an array is a data structure that holds a collection of data in a sequential order. You can find the size of the array, store, retrieve, and modify data in the array.
    • Array is simple and easy to use, but it has two limitations:
      • Once an array is created, its size cannot be altered.
      • Array provides inadequate support for inserting, deleting, sorting, and searching operations.


Applications of Data Structure

  • Graphics
  • Compiler Construction
  • Database Management Systems
  • Numerical Analysis
  • Simulations
  • Statistical analysis package
  • Financial Modelling
  • Artificial Intelligence


Two categories of data structures

  • Linear data structures:
ArrayList, Queue and Stack are linear collections, each of them serves as a repository in which entries may be added or removed at will.
  • ArrayList is a linear collection of entries in which entries may be added, removed, and searched for without restrictions.
  • Queue:
  • Entries may only be removed in the order in which they are added.
  • First out (FIFO) data structures
  • No search for an entry in the Queue
  • Stack:
  • Entries may only be removed in the reverse order in which they are added.
  • Last In, First Out (LIFO)
  • No search for an entry in the Stack.
  • Heap as a Priority Queue:
  • A priority queue is a specialization of the FIFO Queue.
  • Entries are assigned priorities.
  • The entry with the highest priority is the one to leave first.


  • Nonlinear data structures:
    • Trees
    • Binary Tree
      • Consists of entries each of which contributes to the tree as a whole based on its position in the tree.
      • Moving an entry from one position to another changes the meaning of the Binary Tree.
    • General Tree
      • Models a hierarchy such as the organizational structure of a company, or a family tree.
      • A non-linear arrangement of entries, it is a generalization of the binary tree structure, hence the name.
    • Hash Table
    • Graphs

What is an algorithm?

An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.

The algorithms are independent of programming languages.


Classes of algorithms on data structures

  • Search
  • Sort
  • Insert
  • Update
  • Delete


Five Features of Algorithm

  • Finiteness: An algorithm must always terminate after a finite number of steps. Similar procedures which differ only in that they do not terminate can be described as computational methods.
  • Definiteness: Each step of an algorithm must be precisely defined; the actions to be carried out must be rigorously and unambiguously specified for each case.
  • Input: An algorithm has zero or more inputs: quantities that are given to it initially before the algorithm begins, or dynamically as the algorithm runs.
  • Output: An algorithm has one or more outputs: quantities that have a specified relation to the inputs.
  • Effectiveness: An algorithm is also generally expected to be effective, in the sense that its operations must all be sufficiently basic that they can in principle be done.


Correctness and Efficiency

What do we need?: Correctness and Efficiency

Correctness can be described as:

  • We can be more confident in the logic behind our programs if we can be sure of and agree on the logic of the data structures used
  • A given data structure has defined access and modification methods

Efficiency can be considered in terms of multiple resources:

  • These include time and space considerations
  • We will mostly consider time but at times we may have to compromise with regard to space

Maintainable Code

We need to take the following into account:

  • Long functions:
    • Long functions can complicate the code
    • It becomes unclear what the purpose of the function is
    • Additionally, it is difficult to debug longer functions
    • Distribute the load of programming logic into small parts to obtain better efficiency
  • Multiple responsibilities:
    • Across the board, ensure that the methods, classes and modules are only responsible for what they need to be responsible for.
    • For example, the same class should not process data, display data, and handle I/O.
    • High Cohesion = clearly defined responsibility.
  • Don't repeat yourself
    • If the same code is in multiple places, then updates need to be applied to all locations.
    • Calling a suitable function rather than inserting a block of code increases legibility.
  • Naming conventions
    • Consider names as a form of documentation.
    • Selecting good names and following a specific style makes everyone's life easier.
    • For both naming conventions and code style we will be following the Google Style guides throughout this course.
Naming conventions in java
  • Use assertions
    • Assertions are very useful for both:
      • Debugging and
      • In-code documentation
public int divide(int i, int j){
    ASERT (J>0);
    return i/j;
}


  • Refactoring
  • Testing

Analysis of algorithms

  • How good is the algorithm?
    • Time efficiency
    • Space efficiency
  • Does there exist a better algorithm?
    • Lower bounds
    • Optimality

Analysis of algorithms

  • The area of computer science that provides tools for contrasting efficiency of different algorithms
  • We consider comparisons of algorithms, not programs
Algorithm analysis should be independent of Specific implementations, computers, and data.
  • Difficulties with comparing programs (instead of algorithms):
  • How are the algorithms coded
  • What computer will be used
  • What data should the program use


Methods to analyze an algorithm:

  • Transform the algorithm into some programming language and measure the execution time for a set of input values.
  • Use the Big O notation (mathematical notation) to analyse the behaviour of the algorithms. (Independent of software and hardware).

Big-Oh Notation

  • Big-O notation expresses the performance of an algorithm as a function of the number of items to be processed
  • This permits algorithms to be compared for efficiency


Analyzing Loop Execution:

  • First determine the order of the body of the loop, then multiply that by the number of times the loop will execute
  • N loop executions times O(1) operations results in a O(n) efficiency
count = 1;                            // O(1)
while (count < n)                     // n times  
{
    count *= 2;                       // O(2)
    // some sequence of O(1) steps
}
  • O(1) * n * O(2)
  • n * O(2) because it is constant
  • n * O(1)
  • O(n)
  • The loop is executed n times, so the loop is O(n)
for (int count = 0; count < n; count++) {
    for (int count2 = 0; count2 < n; count2++) {
        // Some sequence of O(1) steps
    }
}
  • When the loops are nested, we multiply the complexity of the outer loop by the complexity of the inner loop
  • Both the inner and outer loops have complexity of O(n)
  • The overall efficiency is O(n^2)


  • The body of a loop may contain a call to a method
  • To determine the order of the loop body, the order of the method must be taken into account
  • The overhead of the method call itself is generally ignored


for (int i = 0; i < n; i++) {        //------
    for (int j = 0; j < n; j++) {    //
        Simple Statement             //  This nested loop executes a Simple Statement  n2  times
    }                                //
}                                    //------
for (int i = 0; i < n; i++) {        //------
    Simple Statement  1              //
    Simple Statement 2               //
    Simple Statement 3               // This loop executes 5 Simple Statements  n times (5n)
    Simple Statement 4               //
    Simple Statement 5               //
}                                    //------
Simple Statement 6                   //------ 
Simple Statement 7                   // Finally, 25 Simple Statements  are executed
...                                  //
Simple Statement 30                  //------ 

// We can conclude that the relationship between processing time and n (the number of date items processed)  is:
// T(n) = n2 + 5n + 25


An O(n) algorithm
for i = 1 to n
    sum = sum + i
An O(n^2) algorithm
for i = 1 to n
    for i = 1 to n
        sum = sum + i

Data Structures in Java

Several data structures provided by the Java utility packages and the major DS are:

  • Enumeration: It allows retrieval of successive elements from other data structures (acts as an interface).
  • BitSet: Collection of bits (used as flags) with the ability to set or clear as appropriate.
  • Vector: Variable length array (dynamic array).
  • Stack: Collection of objects with a defined order.
  • Dictionary: Dictionary (map, association list) is a data structure, which is generally an association of unique keys with some values. We can bind a value to a key, delete a key (and naturally an associated value) and lookup for a value by the key.
  • Hashtable: Hashtable stores key/value pairs in a hash table. Keys are used to specify the objects along with their link to values. The keys are hashed and resulting hash code is used as the index at which the value is stored within the table.
  • Properties: Properties is a subclass of Hashtable. It is used to maintain the list of values in which the key is a String and the value is also a String.

Stack Abstract Data Type

  • A stack is one of the most commonly used data structures in computer science
  • A stack can be compared to a Pez dispenser
    • Only the top item can be accessed
    • You can extract only one item at a time


  • A stack is a linear collection whose elements are added in a last in, first out (LIFO) manner:
    • This means that the last element to be put on a stack is the first one to be removed.


  • Push: Add an entry to the top of the stack
  • Pop: Delete the entry at the top of the stack
Stack1.png

Example and applications of Stack

  • Surfing the Web on a browser:
    • The sequence of back clicks loads the browser with Web pages in reverse order of visit.
    • The last visited page is the first loaded when going back.

Direct applications:

  • Undo sequence in a text editor
  • Chain of method calls in the Java Virtual Machine

Indirect applications:

  • Auxiliary data structure for algorithms
  • Component of other data structures

Stack operations

Main operations:

  • object push(): inserts an element
  • object pop(): removes and returns the last inserted element

Auxiliary operations:

  • object peek() or top(): returns the last inserted element without removing it
  • integer size(): returns the number of elements stored
  • boolean isEmpty(): indicates whether no elements are stored

Implementing an Array-based Stack in Java

The Queue ADT

Properties:

  • The Queue ADT stores arbitrary objects.
  • Queues are FIFO data structures. Insertions and deletions follow the first-in first-out scheme. All insertions are done at the rear and all deletions at the front.
  • The queue has a fixed upper bound (capacity) on the number of elements it can store.

Attributes:

  • capacity: The maximum number of elements that can be in the queue at one time.
  • size: The number of elements in the queue: 0 ≤ size ≤ capacity at all times.
  • front: The first element of the queue or a null value not part of the queue when the queue is empty.
  • rear: The position in the queue where the next element is to be inserted or a null value not part of the queue if the queue is full.

Queue operations

Main queue operations:

  • enqueue(object): inserts an element at the end of the queue
  • object dequeue(): removes and returns the element at the front of the queue

Auxiliary queue operations:

  • object firstElement(): returns the element at the front without removing it
  • integer size(): returns the number of elements stored
  • boolean isEmpty(): indicates whether no elements are stored

Boundary cases:

  • Attempting the execution of dequeue or first on an empty queue returns null

Applications of Queues

Direct applications:

  • Waiting lists, bureaucracy
  • Access to shared resources (e.g., printer)
  • Multiprogramming

Indirect applications:

  • Auxiliary data structure for algorithms
  • Component of other data structures

Array-based Queue

  • Using an array to store elements of a queue, such that the first element inserted, “A”, is at cell 0, the second element inserted, “B”, at cell 1, and so on.
Queue1.png

Array

An array is a sequenced collection of variables all of the same type. Each variable, or cell, in an array has an index, which uniquely refers to the value stored in that cell. The cells of an array, A, are numbered 0, 1, 2, and so on. Each value stored in an array is often called an element of that array.

  • An array can store primitive elements, such as characters.
  • An array can also store references to objects.
Array.png


  • Adding an Entry:
    • To add an entry e into array board at index i, we need to make room for it by shifting forward the n - i entries board[i],...,board[n - 1]
Adding an entry to an array.png


  • Removing an Entry:
    • To remove the entry e at index i, we need to fill the hole left by e by shifting backward the n - i - 1 elements board[i + 1],...,board[n - 1]
Removing an entry from an array.png

List ADT

https://www.youtube.com/watch?v=HdFG8L1sajw

List as Abstract Data Type. Cuando hablamos de ADT, queremos simplemente definir el comportamiento de la estructura de datos que queremmos; y no la definición estricta de una implementación en programación. Por tanto, en este sentido, a List ADT sería:

  • A way to organize data
    • Examples
      • To-do list
      • Gift lists
      • Grocery Lists
  • Items in list have position
  • Items may be added anywhere
    • Write/modify items at a position.
Lista ADT.png

Array Lists

An obvious choice for implementing the list ADT is to use an array, A, where A[i] stores (a reference to) the element with index i. With a representation based on an array A, the get(i) and set(i, e) methods are easy to implement by accessing A[i] (assuming i is a legitimate index).

  • Adding an Entry:
    • As we already saw on Arrays, in an operation add(i,o), we need to make room for the new element by shifting forward the n - i elements A[i],..., A[n - 1]
    • In the worst case (i = 0), this takes Order n time [O(n) times]
    • In an add operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one.


  • Removing an Entry:
    • In an operation remove(i), we need to fill the hole left by the removed element by shifting backward the n - i - 1 elements A[i + 1],..., A[n - 1]
    • In the worst case (i = 0), this takes O(n) time


  • Performance:
    • In an array-based implementation of a dynamic list:
      • The space used by the data structure is O(n)
      • Indexing the element at i takes O(1) time
      • add and remove run in O(n) time
Performance ArrayList.png

Single-Linked Lists

  • A singly linked list is a concrete data structure consisting of a sequence of nodes, starting from a head pointer.
  • Each node stores
    • Element or data
    • Link (reference or address) to the next node
Single linked lists.png
  • The ArrayList is limited because its add and remove methods operate in linear (O(n)) time—requiring a loop to shift elements.
  • A linked list is useful for inserting and removing at arbitrary locations
    • In a linked list, instead of an index, each element is linked to the following element
    • A linked list can add and remove elements at a known location in O(1) time
Single linked lists3.png
Single linked lists2.jpg

Implementing a Single-Linked List in Java

NodeInterface.java
// Data Structures and Algorithms
// CCT College Dublin
// Dr. Muhammad Iqbal


public interface NodeInterface <E> {
    public void addFirst(E item);
    public void addAfter(Node<E> node, E item);
    
    public E removeFirst();
    public E removeAfter(Node<E> node);
    
    public Node<E> getNode(int index);
    
    public E get(int index);
    public E set(int index, E newValue);
    
    
    public void add(int index, E item);
    public boolean add(E item);
    
    public E remove(int index);
    public boolean remove(E item);
    
    public int size();
    
}
Node.java
// Data Structures and Algorithms
// CCT College Dublin
// Dr. Muhammad Iqbal


public class Node<E> {

    /** The data value. */
    public E data;
    /** The link */
    public Node<E> next = null;

    /**
     * Construct a node with the given data value and link
     * @param data - The data value 
     * @param next - The link
     */
    public Node(E data, Node<E> next) {
        this.data = data;
        this.next = next;
    }

    /**
     * Construct a node with the given data value
     * @param data - The data value 
     */
    public Node(E data) {
        this(data, null);
    }
}
SingleLinkedList.java
// Data Structures and Algorithms
// CCT College Dublin
// Dr. Muhammad Iqbal

public class SingleLinkedList<E> implements NodeInterface<E> {

    public Node<E> head = null;
    /** The size of the list */
    private int size = 0;

    // Helper Methods
    /** Insert an item as the first item of the list.
     *	@param item The item to be inserted
     */
    public void addFirst(E item) {
        head = new Node<E>(item, head);
        size++;
    }

    /**
     * Add a node after a given node
     * @param node The node which the new item is inserted after
     * @param item The item to insert
     */
    public void addAfter(Node<E> node, E item) {
        node.next = new Node<E>(item, node.next);
        size++;
    }

    /**
     * Remove the first node from the list
     * @returns The removed node's data or null if the list is empty
     */
    public E removeFirst() {
        Node<E> temp = head;
        if (head != null) {
            head = head.next;
        }
        if (temp != null) {
            size--;
            return temp.data;
        } else {
            return null;
        }
    }

    /**
     * Remove the node after a given node
     * @param node The node before the one to be removed
     * @returns The data from the removed node, or null
     *          if there is no node to remove
     */
    public E removeAfter(Node<E> node) {
        Node<E> temp = node.next;
        if (temp != null) {
            node.next = temp.next;
            size--;
            return temp.data;
        } else {
            return null;
        }
    }

    /**
     * Find the node at a specified index
     * @param index The index of the node sought
     * @returns The node at index or null if it does not exist
     */
    public Node<E> getNode(int index) {
        Node<E> node = head;
        for (int i = 0; i < index && node != null; i++) {
            node = node.next;
        }
        return node;
    }

    // Public Methods
    /**
     * Get the data value at index
     * @param index The index of the element to return
     * @returns The data at index
     * @throws IndexOutOfBoundsException if the index is out of range
     */
    public E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(Integer.toString(index));
        }
        Node<E> node = getNode(index);
        return node.data;
    }

    /**
     * Set the data value at index
     * @param index The index of the item to change
     * @param newValue The new value
     * @returns The data value priviously at index
     * @throws IndexOutOfBoundsException if the index is out of range
     */
    public E set(int index, E newValue) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(Integer.toString(index));
        }
        Node<E> node = getNode(index);
        E result = node.data;
        node.data = newValue;
        return result;
    }

    /**
     * Insert the specified item at the specified position in the list.
     * Shifts the element currently at that position (if any) and any
     * subsequent elements to the right (adds one to their indicies)
     * @param index Index at which the specified item is to be inserted
     * @param item The item to be inserted
     * @throws IndexOutOfBoundsException if the index is out of range
     */
    public void add(int index, E item) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException(Integer.toString(index));
        }
        if (index == 0) {
            addFirst(item);
        } else {
            Node<E> node = getNode(index - 1);
            addAfter(node, item);
        }
    }

    /**
     * Append the specified item to the end of the list
     * @param item The item to be appended
     * @returns true (as specified by the Collection interface)
     */
    public boolean add(E item) {
        add(size, item);
        return true;
    }

    /*<exercise chapter="2" section="5" type="programming" number="1">*/
    /**
     * Remove the item at the specified position in the list. Shifts
     * any squsequent items to the left (subtracts one from their
     * indicies). Returns the tiem that was removed.
     * @param index The index of the item to be removed
     * @returns The item that was at the specified position
     * @throws IndexOutOfBoundsException if the index is out of range
     */
    public E remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(Integer.toString(index));
        }
        Node<E> removedNode = null;
        if (index == 0) {
            return removeFirst();
        } else {
            Node<E> node = getNode(index - 1);
            return removeAfter(node);
        }
    }
    /*</exercise>*/

    /**
     * Query the size of the list
     * @return The number of objects in the list
     */
    public int size() {
        return size;
    }

    /**
     * Obtain a string representation of the list
     * @return A String representation of the list 
     */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        Node p = head;
        if (p != null) {
            while (p.next != null) {
                sb.append(p.data.toString());
                sb.append(" ==> ");
                p = p.next;
            }
            sb.append(p.data.toString());
        }
        sb.append("]");
        return sb.toString();
    }

    /*<exercise chapter="2" section="5" type="programming" number="3">*/
    /**
     * Remove the first occurence of element item.
     * @param item The item to be removed
     * @return true if item is found and removed; otherwise, return false.
     */
    public boolean remove(E item) {
        if (head == null) {
            return false;
        }
        Node<E> current = head;
        if (item.equals(current.data)) {
            removeFirst();
            return true;
        }
        while (current.next != null) {
            if (item.equals(current.next.data)) {
                removeAfter(current);
                return true;
            }
            current = current.next;
        }
        return false;
    }
}
SingleLinkedList.addFirst
SingleLinkedList.addAfter
SingleLinkedList.removeAftert
SingleLinkedList.removeFirst


SLLTest.java
// Data Structures and Algorithms
// CCT College Dublin
// Dr. Muhammad Iqbal


public class SLLTest {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        Node<String> tom = new Node<String>("Tom");
        Node<String> dick = new Node<String>("Dick");
        Node<String> harry = new Node<String>("Harry");
        Node<String> sam = new Node<String>("Sam");

        tom.next = dick;
        dick.next = harry;
        harry.next = sam;
        
        SingleLinkedList<String> testInstance = new SingleLinkedList<String>();
        testInstance.add("Tom");
        testInstance.add("Dick");
        testInstance.add("Harry");
        testInstance.add("Berry");
        testInstance.add(4, "Peter");
        
        for (int i = 0; i < testInstance.size(); i++)
           {
             System.out.println("Linked List Values are " + testInstance.get(i));
           }
        
        testInstance.remove("Harry");
        testInstance.remove(1);
        testInstance.removeFirst();
        testInstance.toString();
        System.out.println("Linked List Values are " + testInstance.toString());
        
    for (int i = 0; i < testInstance.size(); i++)
        {
            System.out.println("Value are " + testInstance.get(i));
        }
        
    }
    
}

Double-Linked Lists