Tuesday, July 26, 2011

Merge sorting in JavaScript

The latest addition to my data structures and algorithms in JavaScript is the merge sort algorithm.

There are four main steps in the merge sort algorithm (from Wikipedia):
  • If the list is of length 0 or 1, then it is already sorted. Otherwise:
  • Divide the unsorted list into two sublists of about half the size.
  • Sort each sublist recursively by re-applying the merge sort.
  • Merge the two sublists back into one sorted list.
I found it a lot easier to understand the algorithm by just looking at this diagram (also from Wikipedia).



I added a public mergeSort function to the sortArray object, which I showed in the first post in these series.

this.mergeSort = function() {                                              
    elements = internalMergeSort(elements, this.onSort);                      
    
    return elements;
};     

This function calls the internalMergeSort function passing in the internal elements array and the onSort callback.

var internalMergeSort = function(elements, onSort){            
    if (elements.length < 2){                               
        return elements;  
    }           
             
    // Calculate the middle of the elements
    var middle = Math.floor(elements.length / 2);           
               
    // Divide 
    var leftRange = elements.slice(0, middle);
    var rightRange = elements.slice(middle, elements.length);                                                      
   
    // Conquer                                                                                                        
    var mergingResult = merge(internalMergeSort(leftRange, onSort), 
                              internalMergeSort(rightRange, onSort));                                   
             
    onSort(mergingResult);           
               
    return mergingResult;
};


The function recursively divides the elements in two parts, sorting them and finally merging them back together.

The merge function is implemented like this.

function merge(left, right){                      
    var res = [];           
    
    while (left.length > 0 && right.length > 0) {                
        if (left[0] <= right[0]) {
            res.push(left.shift());
        } else {
            res.push(right.shift());
        }                                              
    }           
    
    while (left.length > 0) {                
        res.push(left.shift());
    }            
    while (right.length > 0) {            
        res.push(right.shift());
    }
    
    return res;
};  

I peeked at this implementation, which is very similar to mine and which helped me write the first while loop more elegantly.

Testing sorting algorithms is pretty easy. So far, I have only defined one QUnit test for this algorithm.

test("MergeSort sorts.", function() {
    var sortArray = new algorithms.sortArray();
    
    sortArray.push(15);
    ...
    sortArray.push(130); 
  
    var actual = sortArray.mergeSort();                               
    var expected = [1, 10, 12, 15, 17, 22, 25, 50, 60, 90, 130];
    
    deepEqual(actual, expected);

});  

A small remark here is that you should use deepEqual instead of equal for array assertions. We want to compare the contents of the array, not their references.

See the algorithm in action here.

Saturday, July 23, 2011

My js data structures and algorithms now on GitHub

If you have been reading my blog lately, you know that I'm implementing some data structures and algorithms in JavaScript. So far, I have blogged about simple sorting algorithms, the stack data structure and the queue data structure. This week I have also implemented a doubly linked list. I started writing a post about that last implementation, but I didn't like where it was going, so instead of writing about it, I have pushed everything to a public Git repository.

You can find the public repository here.

I structured the repository like this:
  • Scripts: The JavaScript files which contain the various algorithms and data structures.
  • Implementations: HTML pages used to demonstrate the usage of some of these algorithms and data structures.
  • Tests: QUnit tests.
Feel free to watch or fork this project. I will still be writing about most implemenations, but probably not all of them.

Thursday, July 21, 2011

Recursively spawning Web Workers

I like to think of HTML5 Web Workers simply as 'threading for the Web'.

Wikipedia describes it a bit more in detail.
Web Workers define an API for running scripts, basically JavaScript, in the background independently of any user interface scripts. This allows for long-running scripts that are not interrupted by scripts that respond to clicks or other user interactions, and allows long tasks to be executed without yielding to keep the page responsive.
You can start a Web Worker like this.

if (typeof(Worker) === "undefined") {
    alert("Web Workers not supported in your browser.");
} else {
    var worker = new Worker("worker.js");
}

The Web Worker API allows workers to create subworkers. The fun quirk here is that you can start a worker that recursively spawns another worker with the same JavaScript file.

So in worker.js, you can do this.

var worker = new Worker("worker.js");

And as you can guess, this has some interesting effects.


Try it yourself here.

Monday, July 18, 2011

Overoptimizing my JavaScript stack implementation for fun

Davy Brion made a comment on my JavaScript stack/queue implementation on Twitter last night: Any reason why you don't immediately set elements to [] at declaration in your stack/queue example?

var elements;
 
this.push = function(element) {
    if (typeof(elements) === 'undefined') {
        elements = [];   
    }                            
    elements.push(element);
}

Yes, I made an overoptimization, and a bad one. In this implementation, you save a few bytes in memory if you initialize the stack, but don't push elements. This might have made some sense 15 years ago, but today a few bytes are very negligible compared to the cost of evaluating the elements reference on every push call.

Anyway, thinking about this driving home, I thought of another optimization, which meets both arguments and is far more fun.

this.push = function(element) {                        
    elements = [];   
    elements.push(element);                            
    this.push = function(element) {
        //Rewriting self. Overoptimization ftw!
        elements.push(element);                                   
    }                                                 
}

The first time the push function is executed, the elements array is initialized. In the same function, the function rewrites itself to no longer initialize the elements array. So the second time the push function is called, it will no longer initalize the array, but only push a new item to the existing elements array.

Dynamilicious!

Sunday, July 17, 2011

Internet Explorer, how could you miss this?

Modern browsers keep making efforts to make the web a better place. One of those small, but so thoughtful features, is a way to tame unwanted dialogs.

If I run following JavaScript, Chrome and Firefox both help me disabling it.

for (var i = 0; i < 1000; i++) {
    alert("Annoying alert " + i);
}


Internet Explorer does not. Even if I kill the process of the tab, the tab will try to restore itself, forcing me to kill Internet Explorer as a whole.


IE team, I don't dislike your browser, but this might be something worth implementing in IE10?

Tuesday, July 12, 2011

Stacks and queues in JavaScript

The second assignment in my 'implementing data structures and algorithms in JavaScript' quest consists of two popular data structures: the stack and the queue.

The stack
A stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top.
Implementing this turned out to be pretty easy. A native JavaScript array already exposes methods to push and pop elements. In the dataStructures namespace, I defined a stack object. The stack object contains a private array which is initialized as soon as the first element is pushed into the stack. The public push and pop functions expose the corresponding functions of the private array. The stackTop function returns the last element added to the stack, but doesn't remove it from the internal array.

var dataStructures = {
    stack : function() {                  
        var elements;
        
        this.push = function(element) {
            if (typeof(elements) === 'undefined') {
                elements = [];   
            }                            
            elements.push(element);
        }
        this.pop = function() {
            return elements.pop();
        }
        this.stackTop = function(element) {
            return elements[elements.length - 1];
        }
    }
}

Although a native array contains most stack operations, the stack object we made is still pretty useful. We end up with a clean encapsulated stack which only exposes the core stack operations.

var someStack = new dataStructures.stack();
 
someStack.push(1);
someStack.push(2);
someStack.push(3);
 
var stackTopResult = someStack.stackTop();                                                 
 
stackTopResults.html(stackTopResult);
 
var popResult = "";
 
popResult += someStack.pop();
popResult += someStack.pop();
popResult += someStack.pop();

The queue
A queue is a particular kind of collection in which the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position and removal of entities from the front terminal position. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed.
As with the stack, the native array object already contains a few functions which help us implement a queue. The terminology doesn't completely match though. Enqueue maps with pop and dequeue maps with shift.

I added the queue object to the dataStructures namespace. The queue also holds a private array of elements. The enqueue function pushes a new element on the queue, and the dequeue function removes the first element from the queue. The peek function returns the first element in the array, but does not remove it.

queue : function() {
    var elements;
    
    this.enqueue = function(element) {
        if (typeof(elements) === 'undefined') {
            elements = [];   
        }
        elements.push(element);                       
    }
    this.dequeue = function() {
        return elements.shift();                                                
    }
    this.peek = function(){
        return elements[0];                  
    }
}

The queue can be used like this.

someQueue.enqueue(1);
someQueue.enqueue(2);
someQueue.enqueue(3);               
                
var queuePeekResult = someQueue.peek();
 
queuePeekResults.html(queuePeekResult);
 
var dequeueResult = "";                   
                    
dequeueResult += someQueue.dequeue();
dequeueResult += someQueue.dequeue();
dequeueResult += someQueue.dequeue(); 

Find the final result here. Also check out the previous post on simple sorting in JavaScript.

Thursday, July 7, 2011

Simple sorting in JavaScript

About three years ago I graduated and got my degree in Applied Computer Science. Although it says Computer Science, we hardly ever focused on data structures and algorithms. Obviously, I now see that as a shortcoming. So I plan to make up for that by reading up on some of the basics. While at it, I might be blogging on some of the topics.

I am going to start by implementing some of the simple sorting algorithms in JavaScript.

Find the final result here.

Bubble sort

The first sorting algorithm I'm going to implement is probably also the easiest, and slowest in most scenario's: Bubble sort.
Bubble sort is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted.
Let's start by defining a namespace.

var algorithms = { };

We could extend the native array object, but to keep things simple, let's create our own sortArray object.

var algorithms = {         
    sortArray : function(){
                
    } 
};

The sortArray object contains a private array of elements. To add elements to that array, we will expose a push function on our sortArray object.

var algorithms = {         
    sortArray : function(){
        var elements = [];                                                                           
                     
        this.push = function(val){
            elements.push(val);                    
        };                                       
    } 
};

So far we can do something like this.

var sortArr = new algorithms.sortArray();               
             
sortArr.push(80);
sortArr.push(20);
sortArr.push(...);

Now we have to add a public sort function. Before we do that we need to define a private swap function, which can swap elements.

var swap = function(one, two) {
    var tmp = elements[one];
    elements[one] = elements[two];
    elements[two] = tmp;                    
};

The bubbleSort function will take a callback argument which will be called every sort iteration.

this.bubbleSort = function(callback) {   
    callback(elements);
    //Loop over all the elements
    for (var out = elements.length - 1; out > 0; out--){                            
        for (var inn = 0; inn < out; inn++) {
            //Are they out of order?
            if (elements[inn] > elements[inn+1]){
                swap(inn, inn+1);                                
            } 
        }
        callback(elements);
    }                    
};

That should be it. Now let's try to push some elements in the array, define a callback and call the bubbleSort function. In the callback function I'm using the Google Charts API to visualize the sorting process.

$(document).ready(function() {             
    ...                                 
                
    var results = $("#results");
    
    var drawElements = function(elements) {                            
        var chd = "";                    
        
        for (var e in elements) {
            chd += elements[e].toString() + ",";
        }
        
        //Remove that last ","
        chd = chd.substring(0, chd.length - 1);                        
        
        //Build the Google Charts url
        var chartUrlBase = "http://chart.apis.google.com/chart?chxt=y&chbh=a&chs=300x225&cht=bvg&chco=A2C180,3D7930&chtt=Sorting";
        var chartUrlComplete = chartUrlBase + "&chd=t:" + chd;                                    
        
        $("#results").append("<img src='" + chartUrlComplete + "'/>").append("<br/>");                                                                                                                                         
    }                              
                         
    sortArr.bubbleSort(drawElements);                               
});

Selection sort

The second algorithm is Selection sort. I found this one the easiest to understand so far: find the mimimum value, swap it with the value in the first position, advance one position and repeat.
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n2) time complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and also has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.
Let's start by giving the BubbleSort callback function another place. We will make this a public function on our sortArray. I will name it onSort and set it to an empty function by default.

this.onSort = function() {};      

Once that is done we can define our public selectionSort function. This function calls the onSort function every sort iteration.

this.selectionSort = function(){
    this.onSort(elements);
    for (var out = 0; out < elements.length - 1; out++) {
        var min = out;
        for (var inn = out; inn < elements.length; inn++) {
            //If minium greater => new minimum
            if (elements[inn] < elements[min]){
                min = inn;
            }                            
        }
        swap(out, min);
        this.onSort(elements);
    }
};

Insertion sort

The last simple sorting algorithm I'm going to implement in this post is Insertion sort.
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array (or list) is built one entry at a time.

Every repetition of insertion sort removes an element from the input data, inserting it into the correct position in the already-sorted list, until no input elements remain. The choice of which element to remove from the input is arbitrary, and can be made using almost any choice algorithm.
The insertionSort is just another public function on our sortArray object.

this.insertionSort = function(){
    this.onSort(elements);
    for (var out = 1; out < elements.length; out++){                        
        var temp = elements[out];
        var inn = out;
        
        //Until one is smaller
        while (inn > 0 && elements[inn-1] >= temp){
            elements[inn] = elements[inn-1];

            inn--;
        }
        elements[inn] = temp;
        this.onSort(elements);
    }
}

You can find the final result here. If you have any remarks on these implementations, please let me know!