I changed my job a month ago. The reason was simple: my previous company was no
more, had ceased to be, expired and gone to meet its maker. Sure, it happens
quite often in the startup world, but this experience was new to me, and frankly,
a very instructive one. As a note to myself, I'm going to analyse it and distil
a set of valuable "life lessons" that I've learned from it.
How Did I Get There
One ordinary November day, almost a year ago, I got a message from an unknown
person. She said she worked for a startup that helps companies find their
ideal employees. She was interested in me as a potential candidate because of
my GitHub profile and was eager to invite me to their system. To get into it
I had to fill in my profile and pass a basic Skype interview with a recruiter.
After that I got a couple of weeks of "being visible"; during that time companies
were able to see my profile and send me requests for their own interviews. Thinking
that I wasn't losing anything, I went through the steps and forgot about it.
A couple of days later I got a message there from a small startup. They had been
doing telemedicine for a couple of years and looked pretty good. At the end
of a Skype interview, which was mostly chatting about experience and so on, they
asked me whether I liked them and offered me a job. Being upset after a rejection
from Google a month earlier, I was very glad to get an offer so soon and
carelessly accepted it without any bargaining. In addition to a modest salary, I
also got a pack of stock options. Dreaming of luxury yachts and expensive cars,
I started collecting the papers necessary for a German visa.
Lesson: Take your time to make a decision. Get a handful of other
options to consider before saying "yes" or "no".
Lesson: Investigate the cost of living and the salaries in your field in
the country you're planning to work in before accepting any offers, and even
before applying.
Lesson: Pay attention to the interview process in the company you're
applying to: while hard whiteboard coding interviews won't guarantee that 100%
of the company's employees are and will be competent, the absence of any
technical tasks raises the risk of black sheep passing through.
Lesson: Take a stock option offer with a grain of salt: in addition to
the slim chances of your new company becoming a second Google, there are
a lot of tricky clauses and rules which can leave you without the desired wealth
at a snap of the fingers. Options are a nice addition to a decent salary;
accepting them in exchange for a part of your salary is not a good idea though.
The Best Days
Fast-forward to April 2016, when I finally arrived in Berlin. I had been working
remotely since January, and finally had an opportunity to meet
my new colleagues face to face. On the very first evening I went with some of
them to a restaurant near my place and had a good dinner there. The next day I met
the rest of the team; they were kind and friendly. So far, so good.
The real issue I soon discovered was the absence of anybody responsible for the
office and the on-boarding process. It was unclear whom I should ask about necessary
documents and the promised phone and gym subscriptions. It was always someone's side
responsibility, and sometimes people didn't pay much attention to it. An
example of this was how I tried to get paid for the time I had worked for the company.
It turned out to be a side responsibility of the CFO. He gave me a form which
I had to fill in so that later he could pass it to the accountants (the company
outsourced them). The form was not the most descriptive one, but I managed to
fill it in with help from my fellow developers who had been in the same boat
before. I sent it to the CFO right before the end of April and started waiting for
my salary. When it didn't come on time, I asked him about it. He replied that he
had forgotten to pass the form to the accountants because he was too busy.
Lesson: Pay attention to the processes in your company. It's
management's job to make sure that the company works like clockwork,
both externally and internally. Neither the quality of the product, nor
the user base, nor any other thing can save the company when it is managed
poorly.
Apart from this nuisance with the salary, working there at that time was good.
Every week we had a company meeting and saw how the product grew and how people
used it. Though some time ago I had decided not to get personally attached to
the projects I work on, I did have a good feeling knowing that the thing
I was working on was useful to people.
So, by the end of June we had a successful launch in Norway and a big feature
done and ready to be rolled out. We were also going to move to another office
on the last day of June. When I came to the office that day, I thought that we
were on the road to success, but half of the team getting fired proved that I was
wrong.
Lesson: Life is volatile. Even when it seems that everything is better
than ever, things can get much worse, and it's good to be prepared for this.
Struggling and The End
After those events, work did not progress well: partly because everybody was
uncertain about the future, partly because there wasn't a new road map yet. We
were basically slacking off the whole of July, drinking beer and hanging out. Things
started getting a bit better in August, at least on the product side, but the
situation was still very volatile. Almost every week we discovered new unpleasant
issues on the management and financial sides; they didn't make us feel any better either.
The pleasant and somewhat funny consequence of this situation was an increased
number of team events. I think that almost any time we had the founders in
Berlin, we went somewhere with them to eat and drink. In the end it turned out
that the company fell apart earlier than the team did.
On the day we discovered that the company was no more, we went for the last team
hangout. I got pretty drunk and returned home at 4 a.m. Two days later I flew to
Italy on a vacation, and after that, started looking for a new job. But that is
another story.
Usually when programming in Haskell, I rely a lot on the compiler. It catches
many errors that are caused by inattention, which occur especially often
when refactoring large portions of code. In some sense the famous "it compiles,
therefore it works" mantra is true (assuming that you don't have logical
errors). The problem with this approach, for me, was that for a long time
I ran the compiler manually and then read its output. The output was usually
quite long, and I had to read it and start fixing errors from the top, otherwise
I might end up fixing the wrong error. It was a tedious routine, even with the help
of bash scripts and file watchers, but thanks to ghcid, I am free from it now.
ghcid can be interpreted as "GHCi daemon" or "GHC with a bit of IDE". It is a
rather small program which starts GHCi, imports your modules there, executes
some Haskell expression in it, watches your source code, and fires :reload and
re-evaluates the expression once something has changed. It also captures warnings
and errors produced by the compiler and presents them in a way where you can see the
topmost ones right away, without scrolling the whole output to the top. Last but
not least, it plays really well with stack, which is the de facto
standard tool for building and testing Haskell projects nowadays.
How to use it
First, install it:
$ stack install ghcid
Now you can run it with stack exec. At this point I usually make a bash script
and tune ghcid the way I need it for my application.
The first thing we need to set up is the expression which will be evaluated in the
GHCi session. You can do it with the -T flag. If you want to run something different
from GHCi, you can pass the command to the -c flag. By default ghcid detects
whether you're using stack or plain cabal and runs either stack repl
or cabal repl respectively.
Another thing I usually turn on for my projects is the -Wall -Werror pair of
compiler flags. It catches such issues as redundant imports, orphan instances and
unused variables, and helps you keep the codebase clean. However, when working in a
REPL mode, you usually concentrate on the big picture and leave those warnings to
be fixed later, e.g. before committing the code. When the -W option is passed, ghcid
doesn't take those warnings-turned-errors into account and evaluates the given
expression anyway.
By default ghcid watches only those source files that were loaded by GHCi.
Sometimes that might not be enough. For example, right now I'm working on an
application which has Hamlet templates in the templates directory, and I'd like to
reload the app each time one of the templates changes. Luckily, it's easy to make
ghcid do so with the --restart=<PATH> or --reload=<PATH> options. The first one will
rebuild the project and restart the whole GHCi process, while the second one will
just fire the :reload command in the existing GHCi process whenever any of the files
and directories in <PATH> changes.
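Putting the pieces together, such a script of mine usually boils down to a single
command along these lines (the test expression and the templates path here are
examples for a hypothetical project, not universal defaults):
$ ghcid --command "stack repl --ghci-options=-Wall" --test ":main" -W --reload=templates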
Issues and ways to improve
One thing that would be nice to have is error highlighting. Sure, errors are
printed in a bold font now, but having line numbers and error messages in
different colors would help even more. Right now this is possible to achieve by
running ghcid inside a terminal emulator in Neovim or Emacs.
Currently highlighting is the only improvement that comes to my mind. Overall, ghcid
is a great development tool, and even minor bugs (usually caused by GHCi) can't
really outweigh its advantages. Definitely a must-have.
This article will cover several sorting algorithms: those working in quadratic time
(bubble, insertion and selection sort), in loglinear time (merge sort, quicksort and
heap sort), and in linear time, but only for special data types (counting and
radix sort). Source code of the examples used in this article is available on
Quadratic time algorithms
There is not much to say about quadratic time algorithms. Usually they are not
used on their own, but rather inside some loglinear algorithm. For example,
insertion sort is sometimes used inside merge sort or quicksort to sort small
subarrays. As for bubble sort, it is usually considerably slower than the other
quadratic time sorting algorithms and is not used in practice.
Bubble sort
Idea: repeatedly iterate over an array comparing items pairwise and
exchanging them if they are out of order. It is easy to notice that on each
pass the biggest item will be placed at the end of the array, thus at
most n iterations are needed to put all items in their places. This algorithm is
stable.
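In Java the idea might look like this (a minimal sketch):
// Bubble sort: repeatedly swap adjacent out-of-order items. After pass i,
// the last i items are already in their final places.
static void bubbleSort(int[] a) {
    for (int i = 0; i < a.length; i++) {
        for (int j = 0; j + 1 < a.length - i; j++) {
            if (a[j] > a[j + 1]) {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
        }
    }
}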
Insertion sort
Idea: iterate over an array keeping its left side sorted. When the next item is
encountered, push it to the left until it is less than or equal to the other
items. There are two loops, each of at most n iterations, thus quadratic time.
This algorithm is stable.
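A possible Java version (again, just a sketch):
// Insertion sort: keep a[0..i) sorted and sink each next item into place.
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int item = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > item) { // strict ">" keeps the sort stable
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = item;
    }
}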
Selection sort
Idea: iterate over all positions in the array. For each position i, search for
the minimal item in the range [i, n] and put it at position i. For each
iteration, searching for the minimal item takes O(n) time, giving
quadratic time overall. This algorithm can be implemented as stable;
however, the implementation below is not stable.
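A Java sketch matching that description:
// Selection sort: put the minimum of a[i..n) at position i. The swap can
// jump an item over equal ones, which is why this version is not stable.
static void selectionSort(int[] a) {
    for (int i = 0; i < a.length; i++) {
        int min = i;
        for (int j = i + 1; j < a.length; j++) {
            if (a[j] < a[min]) min = j;
        }
        int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
    }
}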
Loglinear time algorithms
Loglinear algorithms are the ones that are widely used in practice. It is proven
that it is not possible to sort arbitrary data using comparisons faster than
in O(nlog(n)) time, so these algorithms are the fastest ones. However, which
algorithm to use in a program depends on its goals and constraints.
Merge sort
Idea: (top-down merge sort) split the array in halves, sort each half
recursively and then merge them. By the master theorem, T(n) = 2T(n/2) + O(n)
= O(nlog(n)). This algorithm is stable.
Bottom-up merge sort treats an n-element array as n singleton arrays. It merges
them pairwise, producing n/2 two-element arrays, merging them again and again
until it merges two n/2-element arrays into one. It can be implemented without
recursion.
The implementation below uses O(n) space for the buffer array which is used
for merging two subarrays. It is the standard recursive top-down merge sort.
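A sketch of it in Java:
// Top-down merge sort with a single shared buffer for merging.
static void mergeSort(int[] a) {
    sort(a, new int[a.length], 0, a.length);
}

private static void sort(int[] a, int[] buf, int lo, int hi) {
    if (hi - lo <= 1) return;
    int mid = lo + (hi - lo) / 2;
    sort(a, buf, lo, mid);
    sort(a, buf, mid, hi);
    // Merge a[lo..mid) and a[mid..hi) into buf, then copy the result back.
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        // "<=" keeps equal items in their original order, so the sort is stable.
        buf[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid) buf[k++] = a[i++];
    while (j < hi) buf[k++] = a[j++];
    System.arraycopy(buf, lo, a, lo, hi - lo);
}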
Merge sort is an easily parallelizable algorithm. It is also more efficient than
quicksort on linked lists, where it can be implemented using only constant extra
space.
Quicksort
Idea: split the array in two parts by comparing each element to a pivot, then
recursively sort these parts. Because quicksort depends on the contents of the
array, it is possible to construct an array that will force this algorithm to
work in quadratic time. Though this problem is easily solved by choosing a random
element as the pivot, to be precise it is necessary to say that this algorithm has
O(nlog(n)) time complexity in the average case and O(n^2) time complexity
in the worst case. Space complexity is O(log(n)) in the average case and O(n)
in the worst case, since this algorithm is recursive.
Another delicate part of quicksort is how to partition the array in constant space.
In order to do that, create a hiStart pointer which will store the index of the
beginning of the second part of the array. Then iterate over the array comparing
items to the pivot: if the current item is less than the pivot, swap it with the
item at hiStart and increment hiStart.
To speed up quicksort when there are repeated items, one could implement
splitting into three parts: less than, equal to and greater than the pivot. It
requires one more pointer, loEnd, which will store the index of the end of the
part with items less than the pivot.
The implementation below chooses a random pivot and uses two-way partitioning. It
is not stable; it is possible to make quicksort stable, but that
requires additional O(n) space. This implementation uses O(1) extra space.
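A sketch matching that description (the pivot is moved to the end of the range
so that the partition loop described above can run over the rest):
// Quicksort with a random pivot and the two-way partition from above.
private static final java.util.Random RND = new java.util.Random();

static void quickSort(int[] a) {
    quickSort(a, 0, a.length);
}

private static void quickSort(int[] a, int lo, int hi) {
    if (hi - lo <= 1) return;
    swap(a, lo + RND.nextInt(hi - lo), hi - 1); // random pivot goes last
    int pivot = a[hi - 1];
    int hiStart = lo; // beginning of the "not less than pivot" part
    for (int i = lo; i < hi - 1; i++) {
        if (a[i] < pivot) swap(a, i, hiStart++);
    }
    swap(a, hiStart, hi - 1); // put the pivot between the two parts
    quickSort(a, lo, hiStart);
    quickSort(a, hiStart + 1, hi);
}

private static void swap(int[] a, int i, int j) {
    int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
}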
Quicksort has a better constant factor than merge sort and heap sort. When
implemented well, it can outperform its competitors drastically. It is possible
to parallelize quicksort, but it is a bit harder to do than in the case of merge
sort.
Heap sort
Idea: build a heap on the input array, then iteratively take the minimal element
out of it until there are no items left. Since building a heap has O(n) time
complexity and retrieving the minimal element from it requires O(log(n)) time,
the overall time complexity of this sorting algorithm is O(nlog(n)). This
algorithm is not stable.
This idea can be used with any data structure which supports building in
O(nlog(n)) or faster and retrieving the minimum/maximum in O(log(n)). A heap
is preferred because it is possible to build the heap on the input array itself,
thus spending only O(1) extra space.
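A Java sketch; note that it uses the mirror-image max heap, so that repeatedly
moving the maximum to the back leaves the array sorted ascending in place:
// Heapsort: build a max heap in place, then repeatedly move the maximum
// to the end of the shrinking unsorted prefix.
static void heapSort(int[] a) {
    // Build the heap in O(n) by sifting down from the last parent.
    for (int i = a.length / 2 - 1; i >= 0; i--) siftDown(a, i, a.length);
    for (int end = a.length - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp;
        siftDown(a, 0, end);
    }
}

private static void siftDown(int[] a, int i, int size) {
    while (true) {
        int left = 2 * i + 1, right = 2 * i + 2, largest = i;
        if (left < size && a[left] > a[largest]) largest = left;
        if (right < size && a[right] > a[largest]) largest = right;
        if (largest == i) return;
        int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
        i = largest;
    }
}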
Linear time algorithms
Linear time sorting algorithms do not compare elements, or they would work in
O(nlog(n)). Instead, they either impose some constraints on the input or use
some knowledge about the items of an array. They are usually used inside complex
algorithms to cut some time here and there, e.g. when building suffix arrays or
sorting data on parallel clusters.
Counting sort
Assumption: all items of the input array belong to some finite set of items
or can be uniquely associated with a key from this finite set. Denote the
size of this set as k.
Idea: iterate over the input array counting how many times each distinct item
occurs in it, then use this information to put the items in their places. To store
the occurrences we need an array of integers of length k and an output array of
length n, thus we need O(n+k) space. Both of these arrays are traversed during the
execution of the algorithm, so its time complexity is O(n+k) too.
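A Java sketch, assuming for simplicity that the items themselves are integers
in the range [0, k):
// Counting sort for integers in [0, k).
static int[] countingSort(int[] a, int k) {
    int[] counts = new int[k];
    for (int item : a) counts[item]++;
    // Prefix sums turn counts into one-past-the-last output index per key.
    for (int key = 1; key < k; key++) counts[key] += counts[key - 1];
    int[] result = new int[a.length];
    // Iterate backwards so that equal items keep their order (stability).
    for (int i = a.length - 1; i >= 0; i--) result[--counts[a[i]]] = a[i];
    return result;
}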
Radix sort
Assumption: each item of the input array can be represented in positional
notation, i.e. as a collection of characters from a finite alphabet of size k,
where the weight of each character depends on its position within the
collection. Examples of positional notation are binary and decimal numbers.
Idea: sort the input items by the single character at a fixed position, for all
possible positions: e.g. sort by the 1st character, then by the 2nd, and continue
until all m positions have been used. Sorting at a specified position
can be performed using counting sort, since the alphabet is finite; therefore
it takes O(m(n+k)) time and O(mn) space to sort the whole array.
Depending on the starting position, radix sort can be LSD (least significant
digits first) or MSD (most significant digits first). The first one is used to
sort in representation order (correct for decimal numbers), the second one for
sorting in lexicographic order. The implementation below is LSD radix sort.
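A Java sketch for non-negative integers, using base-10 digits as the alphabet
and a stable counting sort at every digit position:
// LSD radix sort: sort by ones, then tens, then hundreds, and so on.
static void radixSort(int[] a) {
    int max = 0;
    for (int item : a) max = Math.max(max, item);
    for (int exp = 1; max / exp > 0; exp *= 10) {
        countingSortByDigit(a, exp);
    }
}

private static void countingSortByDigit(int[] a, int exp) {
    int[] counts = new int[10];
    for (int item : a) counts[(item / exp) % 10]++;
    for (int d = 1; d < 10; d++) counts[d] += counts[d - 1];
    int[] result = new int[a.length];
    // Backwards pass keeps the sort stable, which LSD radix sort relies on.
    for (int i = a.length - 1; i >= 0; i--) {
        result[--counts[(a[i] / exp) % 10]] = a[i];
    }
    System.arraycopy(result, 0, a, 0, a.length);
}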
Algorithms comparison chart

Algorithm \ Property   Time                           Space            Stable
Bubble sort            O(n^2)                         O(1)             yes
Insertion sort         O(n^2)                         O(1)             yes
Selection sort         O(n^2)                         O(1)             no
Merge sort             O(nlog(n))                     O(n)             yes
Quicksort              O(nlog(n)) avg., O(n^2) worst  O(log(n)) avg.   no
Heap sort              O(nlog(n))                     O(1)             no
Counting sort          O(n+k)                         O(n+k)           yes
Radix sort             O(m(n+k))                      O(mn)            yes
Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms
This article will cover another bunch of must-know data structures: heaps,
priority queues, associative arrays and hash tables. Source code of the examples
used in this article is available on
Heap
Heap is a tree-like data structure satisfying the following property: the
key of a parent node is ordered with respect to the key of its child nodes.
According to the order, heaps are divided into max heaps (the parent key is
greater than the children's) and min heaps (the parent key is less than the
children's). According to the structure of the underlying tree, heaps can be
binary, d-ary, binomial, Fibonacci and many others. The following interface is
used to describe a heap in this article:
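A minimal version of such an interface might look like this (the method names
are the ones the text below refers to; the int-only signature is a
simplification for the sketch):
// A minimal interface for a min heap of ints.
interface MinHeap {
    boolean isEmpty();
    int getMinimum();     // peek at the smallest key
    int popMinimum();     // remove and return the smallest key
    void insert(int key);
}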
Heaps are usually used to implement priority queues, which in turn are used in
some graph algorithms (Prim's, Dijkstra's). The heapsort algorithm is another
example of heap usage.
AFAIK there is no stand-alone implementation of a heap as a data structure in the
Java and C++ standard libraries. However, there is the std::make_heap function in
the C++ standard library which rearranges the elements of a given container in
such a way that they form a heap. As for Java, there is the PriorityQueue class
which uses a heap under the hood.
Binary min heap implementation
Idea: this implementation does not have an explicit tree structure. Instead, an
array is used to store the nodes, and the index of a given node is used to
identify its parent and children (the getParent, getLeftChild and getRightChild
methods): for a node at index i, the parent sits at (i - 1) / 2 and the children
at 2i + 1 and 2i + 2.
The most important part of the heap implementation is the pair of methods for
maintaining the heap property. In the case of a binary heap there are two such
methods: bubbleUp and bubbleDown. bubbleUp compares a given element with its
parent and, if they violate the heap property, exchanges them and moves up the
tree. bubbleDown does the same thing, only for the children of the given element,
and moves down the tree. The first method is used to implement insert: we put the
new item at the end of the array and run bubbleUp on it in order to put it in its
place. The second method is used to implement popMinimum: we exchange the top item
with the last one and run bubbleDown on the top item in order to put it in its
place.
Now the last thing to figure out is how to heapify a given array efficiently. The
naive approach is to create an empty heap and perform n insertions, which gives us
O(nlog(n)) time and O(n) space. However, it is possible to do it in O(n) time
and O(1) space using the bubbleDown method: just iterate over the array in reverse
order and run bubbleDown on each item.
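Putting the pieces together, a sketch of the whole class (the initial capacity
and growth factor are arbitrary choices):
// Binary min heap stored in an array.
class Heap implements MinHeap {
    private int[] items;
    private int size;

    Heap() {
        items = new int[16];
    }

    // Heapify an existing array in O(n): run bubbleDown in reverse order.
    Heap(int[] input) {
        items = input;
        size = input.length;
        for (int i = size - 1; i >= 0; i--) bubbleDown(i);
    }

    public boolean isEmpty() { return size == 0; }

    public int getMinimum() { return items[0]; }

    public int popMinimum() {
        int min = items[0];
        items[0] = items[--size]; // move the last item to the top...
        bubbleDown(0);            // ...and sink it into place
        return min;
    }

    public void insert(int key) {
        if (size == items.length) {
            items = java.util.Arrays.copyOf(items, Math.max(1, size * 2));
        }
        items[size] = key;
        bubbleUp(size++);
    }

    private int getParent(int i) { return (i - 1) / 2; }
    private int getLeftChild(int i) { return 2 * i + 1; }
    private int getRightChild(int i) { return 2 * i + 2; }

    private void bubbleUp(int i) {
        while (i > 0 && items[i] < items[getParent(i)]) {
            swap(i, getParent(i));
            i = getParent(i);
        }
    }

    private void bubbleDown(int i) {
        while (true) {
            int smallest = i;
            int l = getLeftChild(i), r = getRightChild(i);
            if (l < size && items[l] < items[smallest]) smallest = l;
            if (r < size && items[r] < items[smallest]) smallest = r;
            if (smallest == i) return;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int i, int j) {
        int tmp = items[i]; items[i] = items[j]; items[j] = tmp;
    }
}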
isEmpty and getMinimum have O(1) time complexity. Because both bubbleUp
and bubbleDown take O(log(n)) time, the insert and popMinimum methods have
the same time complexity. As for the Heap(int[]) constructor: though it is just
n runs of bubbleDown, which should give us O(nlog(n)) time, it actually works
faster than that, because most of the heapification takes place in the lower
levels of the tree. A complete proof of its time complexity can be found on
Priority queue
Priority queue is an abstract data structure similar to a stack or a queue, but
the order in which items are retrieved from it depends on the items' priorities.
Usually the item with the highest priority is served first.
Priority queues are applicable in many popular algorithms: Dijkstra's, Prim's,
Huffman encoding, etc. There are implementations of priority queues in C++ and
Java: the std::priority_queue and PriorityQueue classes respectively.
Implementation of priority queue using binary min heap
No additional comments, as the implementation is quite straightforward.
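A sketch of such an adapter; here the key itself serves as the priority, and,
matching the min heap underneath, a smaller key means a higher priority:
// Priority queue as a thin adapter over the binary min heap from above.
class PriorityQueue {
    private final Heap heap = new Heap();

    boolean isEmpty() { return heap.isEmpty(); }
    void enqueue(int priority) { heap.insert(priority); }
    int peek() { return heap.getMinimum(); }
    int dequeue() { return heap.popMinimum(); }
}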
Associative array
Associative array, or map, is an abstract data structure. It stores key-value
pairs in such a way that each possible key appears at most once. To describe
associative arrays, the following interface is used:
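A minimal sketch of it (the exact method set is illustrative):
// A minimal associative array interface, generic in key and value types.
interface Map<K, V> {
    void put(K key, V value); // insert, or overwrite the pair for this key
    V get(K key);             // null when the key is absent
    void remove(K key);
    int size();
}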
Associative arrays are applicable in a great variety of computing tasks: removing
duplicates, counting items, etc. There are even special databases called
"key-value storages" which store key-value pairs either in memory or on disk
(Redis, Memcached, etc.).
Hash table
Hash table is one of the data structures implementing an associative array. It
uses an array to store the key-value pairs and a special hash function to compute
the index of a particular pair from its key. Usually it performs better as a map
implementation than a binary search tree (both put and get work in O(1) amortized
time compared to O(log(n)) in a BST), but this is highly dependent on choosing the
right hash function, load factor and collision resolution method.
Hash tables are used to implement maps, sets, database indexes and caches. They
are available in almost any modern programming language. Some languages provide
them as a syntactic construct (Python, Ruby), others have specific classes in
their standard libraries (std::unordered_map in C++ and HashMap in Java).
Implementation of hash table
Idea: a hash table is always implemented as an underlying array containing
entries. Depending on the collision resolution method, the entries are either
key-value pairs or chains of them. There are several methods of collision
resolution; the most popular ones are open addressing and chaining.
The open addressing method implements the following strategy: when a collision
occurs during insertion, try to find a different place in the array for the
particular key by sequentially probing other places until a free one is
found. When searching for a value, pairs are traversed in exactly the same
way. There are several ways of choosing which places to look at when searching
for a free one; the easiest method, called "linear probing", just looks at the
place following the current one.
The chaining method solves the collision problem differently. It stores
buckets of pairs in the underlying array; when a collision occurs, the new pair is
dropped into the associated bucket. When searching for a value, the whole bucket
is searched. Usually buckets are implemented as linked lists of pairs. The
implementation below uses chaining.
Another important part of a hash table implementation is choosing the right hash
function. The main requirement for a hash function is uniform distribution of its
values. Uniformity is crucial; otherwise the number of collisions rises, degrading
the performance of the hash table. For a particular implementation, though, this
condition can be loosened to uniformity at specific sizes. The implementation
below relies on the hashCode function of Java's Object for simplicity and
performs no additional hashing. It may affect performance, as programmers writing
classes are not obliged to guarantee uniformity of hashCode.
As the number of items in a hash table grows, the chance of collision rises too;
that's why it is necessary to resize the underlying array and re-insert all the
key-value pairs from time to time. To determine when to perform this action, the
load factor indicator is used. It is calculated as the number of key-value
pairs inserted divided by the size of the underlying array. In the
implementation below, a load factor of 0.75 indicates that resizing is needed; the
number was chosen manually, though in the real world you may want to perform some
experiments to choose it more precisely.
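A sketch of a chaining hash table along these lines (the bucket lists are
hand-rolled singly-linked chains; keys are assumed to be non-null):
// Hash table with chaining; resizing kicks in at load factor 0.75.
class HashTable<K, V> implements Map<K, V> {
    private static class Entry<K, V> {
        final K key; V value; Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    private Entry<K, V>[] buckets = newBuckets(16);
    private int size;

    public void put(K key, V value) {
        if (size + 1 > buckets.length * 3 / 4) resize();
        int i = indexFor(key, buckets.length);
        for (Entry<K, V> e = buckets[i]; e != null; e = e.next) {
            if (e.key.equals(key)) { e.value = value; return; } // overwrite
        }
        buckets[i] = new Entry<>(key, value, buckets[i]);
        size++;
    }

    public V get(K key) {
        int i = indexFor(key, buckets.length);
        for (Entry<K, V> e = buckets[i]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    public void remove(K key) {
        int i = indexFor(key, buckets.length);
        Entry<K, V> prev = null;
        for (Entry<K, V> e = buckets[i]; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev == null) buckets[i] = e.next; else prev.next = e.next;
                size--;
                return;
            }
        }
    }

    public int size() { return size; }

    // Relies on hashCode directly, as discussed above.
    private int indexFor(K key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    private void resize() {
        Entry<K, V>[] old = buckets;
        buckets = newBuckets(old.length * 2);
        for (Entry<K, V> head : old) {
            for (Entry<K, V> e = head; e != null; e = e.next) {
                int i = indexFor(e.key, buckets.length);
                buckets[i] = new Entry<>(e.key, e.value, buckets[i]);
            }
        }
    }

    @SuppressWarnings("unchecked")
    private Entry<K, V>[] newBuckets(int capacity) {
        return (Entry<K, V>[]) new Entry[capacity];
    }
}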
Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms
This article is intended to cover some basic data structures: arrays, linked
lists, stacks and queues; their applications and implementations in Java. Source
code of the implementations used in this article is available on
Abstract and concrete data structures
Abstract data structure is a mathematical model, an interface declaring
desired behaviour. Concrete data structure is an implementation of such an
interface. For example, a collection or container is an abstract data structure:
it says only that some items of a particular type are stored inside it and
declares methods for accessing and updating them. Array and list, on the
other hand, are concrete data structures: they describe the way items are
stored (in a contiguous memory chunk versus as a chain of pointers) and implement
the methods declared in the collection's interface.
Collection
Collection (Container) is an abstract data structure. It
holds several items of some specified type T and offers methods for accessing
and updating them. The following interface will be used throughout this article
to describe containers:
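Something along these lines (the exact set of methods is illustrative):
// A minimal container interface used for the structures in this article.
interface Collection<T> {
    int size();
    boolean isEmpty();
    T get(int index);      // random access
    void pushBack(T item); // append to the end
    void pushFront(T item);// prepend to the beginning
    T popBack();           // remove from the end
    T popFront();          // remove from the beginning
}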
Array
Array is a container that holds a number of items in a contiguous area of
memory. In its simplest form, the static array, the capacity of the container is
set at the moment of its creation and may not be changed. Dynamic arrays,
however, can grow automatically when the current capacity is exceeded.
Arrays are used both on their own and as the underlying data structure for more
sophisticated structures, e.g. heaps, associative arrays and adjacency matrices.
Arrays are supported out of the box in the overwhelming majority of modern
programming languages. Static arrays are usually present as a syntactic feature.
Here is an example of creating a static array of 10 ints in Java:
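int[] numbers = new int[10]; // ten ints, zero-initialized, fixed capacity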
Dynamic arrays are often shipped as utility classes of the standard library
(std::vector in C++ and ArrayList in Java).
Dynamic array implementation
Idea: let's start with a small static array. If there is no free space left
to insert a new item, allocate a new static array with extended capacity and copy
all the items from the old array into the new one. It is proven that allocating a
new array twice as large as the old one gives O(1) amortized complexity for
inserting a new item at the end of the array.
Here is a simple implementation of a dynamic array in Java. The sketch below
keeps only the Collection methods from above and skips bounds checking for
brevity:
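// Dynamic array: doubles its capacity when full, so pushBack is O(1) amortized.
class DynamicArray<T> implements Collection<T> {
    private Object[] items = new Object[8];
    private int size;

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }

    @SuppressWarnings("unchecked")
    public T get(int index) { return (T) items[index]; }

    public void pushBack(T item) {
        growIfFull();
        items[size++] = item;
    }

    public void pushFront(T item) {
        growIfFull();
        System.arraycopy(items, 0, items, 1, size); // shift everything: O(n)
        items[0] = item;
        size++;
    }

    public T popBack() {
        T item = get(--size);
        items[size] = null;
        return item;
    }

    public T popFront() {
        T item = get(0);
        System.arraycopy(items, 1, items, 0, --size); // shift back: O(n)
        items[size] = null;
        return item;
    }

    private void growIfFull() {
        if (size == items.length) {
            items = java.util.Arrays.copyOf(items, size * 2); // double capacity
        }
    }
}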
An advanced implementation can also shrink the array when the number of items
drops below some point, say, half of the capacity. However, it should be
implemented with caution, because behaviour that is too eager can cause a drastic
change in complexity. Imagine an array which shrinks itself by half as soon as the
number of items drops below half of the capacity: when there are n/2 items and you
remove one, then insert one, and repeat these actions m times, the overall
complexity will be O(mn). It's better to shrink the array by a lesser amount, to a
third or a fourth of the capacity, for example.
Linked list
Linked list is also a container. It stores its items in nodes. Each node
consists of at least two fields: a value to store and a pointer to another node.
Nodes are chained one to another and thus can be located in arbitrary places
in memory (contrary to arrays). Linked lists are divided into several groups
according to the number of pointers in their nodes: singly-linked,
doubly-linked and multiply-linked.
Linked lists can be used either on their own or as an underlying data structure
for such ADTs as adjacency lists, stacks and queues, and for collision resolution
in hash tables.
Linked lists are present in many programming languages, for example, std::list
in C++ and LinkedList in Java. Usually they are implemented as doubly-linked
lists.
Doubly-linked list implementation
Idea: the key idea of this implementation is to use head and tail sentinel
nodes. They are constant for a particular linked list object; they point to each
other when the list is empty and to the first/last node otherwise. The head
sentinel's prev is always null, just as the tail sentinel's next. The main
purpose of such sentinel nodes is to save us the time and energy of playing
around with null checks: this way we can be sure that the list always has head
and tail nodes, though they're merely decorative.
Here is a simple implementation of a doubly-linked list in Java, again sketching
only the Collection methods:
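// Doubly-linked list with head and tail sentinel nodes.
class LinkedList<T> implements Collection<T> {
    private static class Node<T> {
        T value; Node<T> prev, next;
    }

    private final Node<T> head = new Node<>(); // sentinel before the first node
    private final Node<T> tail = new Node<>(); // sentinel after the last node
    private int size;

    LinkedList() { head.next = tail; tail.prev = head; }

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }

    public T get(int index) { // random access is O(n): walk from the head
        Node<T> node = head.next;
        for (int i = 0; i < index; i++) node = node.next;
        return node.value;
    }

    public void pushFront(T item) { insertAfter(head, item); }
    public void pushBack(T item) { insertAfter(tail.prev, item); }
    public T popFront() { return unlink(head.next); }
    public T popBack() { return unlink(tail.prev); }

    private void insertAfter(Node<T> node, T item) {
        Node<T> fresh = new Node<>();
        fresh.value = item;
        fresh.prev = node;
        fresh.next = node.next;
        node.next.prev = fresh;
        node.next = fresh;
        size++;
    }

    private T unlink(Node<T> node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
        size--;
        return node.value;
    }
}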
Here is a complexity comparison of the dynamic array and the doubly-linked list
for the operations in the Collection interface. As you can see, the linked list
wins on prepending, while the dynamic array wins on random access.

Operation             Dynamic array    Doubly-linked list
Random access         O(1)             O(n)
Prepend               O(n)             O(1)
Append                O(1) amortized   O(1)
Additional space used O(n)             O(n)
Stacks and queues
Stacks and queues are abstract data structures that support limited access to
their elements: they allow only appending and retrieval. They are distinguished
by the particular retrieval order they support: stack is a LIFO container
("last-in-first-out") and queue is a FIFO container ("first-in-first-out").
The interfaces of stack and queue are presented below:
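// Minimal LIFO and FIFO interfaces (method names are illustrative).
interface Stack<T> {
    void push(T item);
    T pop();          // remove the most recently pushed item
    boolean isEmpty();
}

interface Queue<T> {
    void enqueue(T item);
    T dequeue();      // remove the least recently enqueued item
    boolean isEmpty();
}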
Both stacks and queues are widely applicable in modern software engineering.
Examples of stack usage are:
Function calls are stored on a stack while a program is executed
Many parsing and expression evaluation algorithms use stacks to hold values or
currently processed lexemes
The JVM is a stack-based virtual machine: it does not use registers; instead it
stores all variables on a stack
Examples of queue usage are:
Concurrent applications: for example, concurrent access to some resource means
that there is a queue of requests and they are processed in order
Buffering and data transfer
Usually stacks and queues are implemented as adapters over other containers,
for example, linked lists or dynamic arrays. In Java, Queue is an interface
implemented by the ArrayDeque, LinkedList and other classes. Stack is
a concrete class, but it is obsolete; it is recommended to use the Deque
interface and its implementations instead, as it has the methods of both stacks
and queues.
Stack implementation using dynamic array
Explanations are skipped, as the implementation is quite straightforward:
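// Stack as a thin adapter over the dynamic array from above (a sketch).
class ArrayStack<T> implements Stack<T> {
    private final DynamicArray<T> items = new DynamicArray<>();

    public void push(T item) { items.pushBack(item); }
    public T pop() { return items.popBack(); }
    public boolean isEmpty() { return items.isEmpty(); }
}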
The implementation of a stack using the doubly-linked list would be exactly the
same. The complexity of the operations is O(1) amortized for the array-backed
version and O(1) for the list-backed one.
Queue implementation using dynamic array
Idea: this is the standard two-pointer implementation of a queue.
Because a queue retrieves elements from its beginning, it
is inefficient to shift the whole array back each time dequeue is called.
Instead, let's create a pointer to the location of the real beginning of the
queue and shift it forward on dequeue. On enqueue, however, let's insert the new
item not just at the end of the array but, if possible, into the free space
between the array's beginning and the queue's beginning. To achieve this, let's
create another pointer to store the location of the real end of the queue. When
the two pointers meet, let's relocate the whole array so that the queue's
beginning and end coincide with the underlying array's.
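A sketch of this idea, with the two pointers folded into a head index plus a
size; wrapping the indices around the array's end replaces the explicit check
for reuse of the free space, and relocation happens exactly when the end pointer
catches up with the beginning:
// Queue over a plain array, relocating (and growing) only when full.
class ArrayQueue<T> implements Queue<T> {
    private Object[] items = new Object[8];
    private int head, size; // head: index of the queue's first item

    public boolean isEmpty() { return size == 0; }

    public void enqueue(T item) {
        if (size == items.length) relocate();
        items[(head + size) % items.length] = item; // may wrap into the
        size++;                                     // free space before head
    }

    @SuppressWarnings("unchecked")
    public T dequeue() {
        T item = (T) items[head];
        items[head] = null;
        head = (head + 1) % items.length; // just shift the pointer forward
        size--;
        return item;
    }

    // Copy the items into a bigger array so the queue starts at index 0 again.
    private void relocate() {
        Object[] bigger = new Object[items.length * 2];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[(head + i) % items.length];
        }
        items = bigger;
        head = 0;
    }
}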
The complexity of this solution is O(1) amortized for both operations.
Queue implementation using doubly-linked list
This implementation is much simpler, because a doubly-linked list can
remove an item from both the beginning and the end in O(1):
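// Queue as a thin adapter over the doubly-linked list from above (a sketch).
class ListQueue<T> implements Queue<T> {
    private final LinkedList<T> items = new LinkedList<>();

    public void enqueue(T item) { items.pushBack(item); }
    public T dequeue() { return items.popFront(); }
    public boolean isEmpty() { return items.isEmpty(); }
}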
The complexity of this solution is O(1) for both operations.
Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms