Sunday, May 17, 2015

Getting an IT job in Sri Lanka : An anecdote!

A lot of us who are involved in a field of study in IT do, at some point, want to get a job at a renowned IT company in Sri Lanka. I am also one who went through the process of finding a company, applying to it, going through the selection process and finally setting foot in it as an employee. There are a lot of things that I did right, and probably equally as many that I did wrong and learned from. So I'll just post this humble account for anyone who may be interested. This 'meta-post' will, hopefully, be helpful for someone in the same situation at some point.

There are a few staples in getting a job in the IT sector. I will try to generalize to anyone who is looking for employment in IT, regardless of the course they follow, whether it is in Software Engineering, Computer Science or some combination of the two, whether they are from a state or private institute, and whether they are looking for permanent employment or an internship.

Rule #1 : Know what you want!

A lot of people I have known, including myself at one point, are simply looking 'for a place to work'. At the end of the day, this is of course what it boils down to. But it is important to know what you want to do, at least to some extent. For instance, saying 'I want to be a software engineer, I'll settle for anything in that area' is the same as saying 'I want to be a sportsman, I'll take any sport that will take me into a team'.

IT has a very broad spectrum, all parts of which are more or less equally important. There are software development and process management aspects, pure and theoretical computer science thingies, management, academia, research and development and a whole lot of other things... So it helps to narrow down where you want to go in all this.

The next thing is to narrow down to a specific area. There is networking, there is the web, there is data analytics, and there is project management, quality assurance, human-computer interaction, mechatronics, hardware stuff etc. At the very least, have an idea of what all of these are to some extent, and the ability to tell which is which.

So before you apply to an organization, you must be able to understand which of these interests you, drives you, and makes you 'want' to work there.

Rule #2 : Research research research!!!

A very popular, and often confusing, question in selection interviews, as I have experienced and heard from others, is "so, why do you want to work with us?". To answer this, you must know at least something about what the company is doing. Basic things to look for are:

  1. The products / services delivered by the company 
  2. The technologies they use 
  3. Clients of that company and why they prefer this company's products/services 
  4. Bonus points for : the corporate culture of the organization... 
You don't need a full understanding, and I don't think it's fair if the selectors expect you to fully understand all of the company before hiring you, but you must have the good sense to know what you're getting into. 

Rule #3 : Practice for interviews ! 

This is a very straightforward thing that we all seem to miss. We try to walk into an interview and hope for the best. But you must know that there are millions of sample interview questions, and collections of them, out there, and the chances are your questions are coming out of a pretty limited lot. There is a collection of about 100-200 questions out of which about 20 will be asked from you in an interview (the longest ones that I have faced...).

One word for those trying to find these questions. GOOGLE! 

Rule #4 : Communicate, properly! 

The common conception in Sri Lanka is that to face an interview well, you must know English well. What I know to be true is that having a good command of English does not ensure your answers are good, and broken English doesn't mean the answers are bad! The only thing you may need to work on is the phrasing. There may be questions out there that are quite simple, but that you cannot put into words, which causes you to panic in an interview. To avoid this, following Rule #3 is a pretty good approach!

Practice with a friend. I did this with multiple friends, and helped out multiple friends, by asking each other sample questions. Work on your answers later. For the sake of your employment, practice!

Rule #5 : Get your shit straight! 

This is one point that I messed up, big time! And hence I know the importance of it. Please have a good understanding of what you're saying, or don't say it at all.

For instance, always have an in-depth understanding of any project that you refer to in your applications. And if you think that the question they ask is not something you encountered in that project, don't give a half-assed answer for the sake of completeness.

Rule #6 : Don't panic! 

Interviews are not trying to measure all of your capabilities in that limited time. One of the interviewers I faced told me: "You've been an intern here, and you've gone through four years of college, so I'm going to assume you know how to write a simple piece of code...". Likewise, the interviews are not the only thing that goes into your selection. Some companies that I applied to took serious referrals from the referees in the CV, and some looked into previous projects and work. So the chances are, if you do not completely mess up the interview, you will get what's coming to you.

Rule #7 : " Sometimes you eat the bar, and sometimes the bar eats you... "

Rejection is natural, and dealing with it is one of the biggest ego-busters I've faced, and I am thankful for that. What is important is that you give your maximum in the selection process and don't back down midway. What will happen, will happen.

Monday, March 23, 2015

An intro into bioinformatics - Cells and Molecules and other stuff...

So... Keeping up with the inconsistent, incoherent articles that I have posted on this blog for years, I have decided to venture into bioinformatics. So expect a series which will span quite a long time (hopefully).

Bioinformatics, as far as I can see, is the discipline where we study the information-related phenomena in the biological world. Our main area of concern is how information is propagated through chemical sequences to create functional life forms. This series of articles will give an introduction to the domain knowledge of bioinformatics and computational biology, from what little knowledge I myself have. However, I hope to impart new knowledge as I find it.

Now then, in this article, let me give a basic understanding of the domain by giving you a rough idea about molecular biology. Please feel free to point out any inconsistencies.

Everything we call 'life', from the trees to the bees, from the lemurs to the baiji, is the product of millions of years of evolution. Though there have been numerous doubts about the validity of the theory of evolution, it is without doubt a fact which has stood the burden of proof, and the test of time and scrutiny. The means and methods of evolution, though, are not as crystal clear as the phenomenon of evolution itself, and they remain one of the greatest puzzles of all time.

To understand this, one needs to look into the basic building blocks of life. At a whim, the uninitiated will undoubtedly point to the animals (or species, for the more enlightened ones). The species, while close to a valid answer to the question of the unit of evolution, was the working assumption of the naturalists even up until a couple of centuries ago. It stands to reason to assume that an individual of a species, a living, breathing, moving, reproducing agent of life, whether plant or animal, insect or mammal, would serve as a useful entity isolated from its surroundings and environment. Not until the invention of the microscope did this idea change.

With the advent of the microscope, new organisms were discovered. These were autonomous organisms which displayed the characteristics of life on their own; they had their own metabolism, movement and reproduction. These 'cells' of life were then considered the unit of living beings, and hence we call them the building blocks of life.

Cells are found in nature in a large variety. They differ from each other based on species and function. However, it is reasonable to think of a complex organism, such as a human being, as a colony or a civilization of single-celled organisms which coexist to achieve a single objective: survival. Though there are variations, cells typically have the same basic components.

A typical cell contains various parts. A cell gets its shape from the cytoskeleton, a fibrous scaffolding that runs through the cell, and is enclosed by the cell membrane. Within the membrane is the cytoplasm, a gel-like substance which contains molecules vital to the existence of the cell. In the middle of a cell (barring the exception of prokaryotes) lies the nucleus, which controls the existence of the cell and all other aspects of it.

A complex organism is made up of millions and billions of cells. Consider yourself, sitting and reading this essay, and scrolling your mouse-wheel with your right index finger. The muscles that move your fingers are made of separate cells. They have their own autonomy: they can reproduce accordingly, they can take in food, they can dispose of waste, and they have their own mitochondria to produce energy from that food. And how do you know your finger is on the wheel? The nerve endings in your fingertips are extensions of neural cells whose cell bodies sit far away, near your spinal cord. They too have the said autonomy, bar the capacity to readily replicate. The red blood cells in your blood give it its red, and your lungs are made of tissue whose cells can effectively filter oxygen from the air you breathe. Your head houses some of the most effective cells ever to have existed, capable of kinds of computation no computer has yet matched (alas, with all the self-awareness and narcissism that comes along with the package). But a question remains: how does an organism, this colony of cells, propagate its form and function and characteristics to a new generation through reproduction?

To look at this, we will first address the issue of cellular reproduction. There are two forms of reproduction in cells: mitosis, in which a single cell divides into two identical cells, and meiosis, in which two cells contribute their gene... (let this wait...) characteristics to the offspring. An example: suppose you scrape your arm on a sharp edge. The damaged muscle, subcutaneous tissue, epidermis and skin tissue will eventually grow back. But if they contributed to each other's division, you'd have a very weird-looking new layer of skin that would make you actually love the acne scars on your face. This doesn't naturally happen (although in the case of a wound with a lot of blood, scar tissue will form, which is a separate phenomenon) because the cells do not contribute to the reproduction of each other. The dermis, epidermis and hypodermis simply grow back independently, and this serves as a visible example of mitosis. But, as I am sure you are already aware (I hope I won't have to go into the awkward details of how babies are made), a child cannot be born in the same manner; you need two cells from the mother and the father (the egg and the sperm) to contribute to the new organism, which is an example of meiosis.

Now comes an even more intriguing question: just how exactly is the new cell aware of what it is supposed to do, and what goes where? For this, we come to an even smaller unit of life, the gene! The gene as the unit of selection in evolution was originally proposed by a number of biologists, but was popularized mainly by Prof. Richard Dawkins in his book “The Selfish Gene”. In it, he gives a good explanation, which I will roughly paraphrase for better insight into the matter.

Imagine a fluid in which some molecules, which can combine and recombine to achieve some basic tasks (though not necessarily with purpose), are floating freely around. We will call these amino acids. The combined components we will call proteins. Needless to say, with a limited number of amino acids we can create a myriad of different proteins. Now imagine a molecule which can influence the formation of proteins from the original amino acids. This second molecule could encode some structural information on what the proteins should look like (or the converse of it, actually). Imagine if this molecule were a large one, which could encode a significant amount of this structural and functional information. All it needs to do is encode what kind of a protein needs to be created. Such a molecule was created, beyond the shadow of a doubt naturally, given the right conditions, and may have undergone a lot of the trial and error of natural selection on itself; the result is DNA. Deoxyribonucleic acid owes its success to its structure. It is built of four nucleotides (adenine, guanine, cytosine and thymine) and these can come together to form very long strands of the molecule, hence enormous amounts of information on the formation of proteins can be stored in them. However, if you think of it as a blueprint of the organism, the 'walls' and 'rooms' and 'hallways' of the blueprint must also be clearly identifiable (not necessarily to an observer, but to the protein builders, the ribosomes). These 'sections' are what we call 'genes'. One cannot take a strand of DNA and go about saying exactly where one gene ends and another begins, but sections of the strand do have definite pieces of information encoded in them.

Now, let's see how the whole thing is done in practice. In a cell, or more specifically in the nucleus, the DNA strands are packaged into units called chromosomes. A chromosome is nothing more than a tightly wound strand of DNA plus special proteins called histones which keep it in shape. Inside the nucleus there may be several chromosomes. You typically have 46 of them, and a potato has 48. The number doesn't really mean anything by itself; however, for a healthy specimen of a species the count is typically fixed. For example, a person with Down's syndrome has an additional copy of the 21st chromosome, which causes the anomaly. Inside the nucleus, smaller single-stranded copies of sections of the DNA are made, which we call RNA. These sections can pass through the membrane of the nucleus. The RNAs are like small prints of particular sections of a blueprint, which you would give your landscaper for the garden, or your painter to finish the paint job in the first-floor bedroom. The RNA then goes through a special unit called the ribosome, which creates the proteins as encoded by the RNA. These proteins then go to the required places (via the Golgi apparatus) and fulfil their destiny. And some of these new proteins will create new cells. It is pretty straightforward to imagine how mitosis occurs with this model. But meiosis becomes more interesting.
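
Since this is supposed to be a bioinformatics series, here is a tiny Python sketch of that flow, just to make the blueprint analogy concrete. The sequences are made up and the codon table is heavily truncated for illustration; a real table has 64 codons.

# A toy illustration of the central dogma: DNA -> mRNA -> protein.
COMPLEMENT = {'A': 'U', 'T': 'A', 'G': 'C', 'C': 'G'}   # DNA base -> mRNA base

# A handful of real codon-to-amino-acid mappings (the full table has 64 entries)
CODON_TABLE = {
    'AUG': 'Methionine (start)',
    'UUU': 'Phenylalanine',
    'GGC': 'Glycine',
    'UAA': 'STOP',
}

def transcribe(dna_template):
    """Build the mRNA strand complementary to a DNA template strand."""
    return ''.join(COMPLEMENT[base] for base in dna_template)

def translate(mrna):
    """Read the mRNA one codon (three bases) at a time and emit amino acids."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], 'unknown')
        if amino_acid == 'STOP':
            break
        protein.append(amino_acid)
    return protein

dna = 'TACAAACCGATT'           # a made-up template strand
mrna = transcribe(dna)          # 'AUGUUUGGCUAA'
print(mrna, translate(mrna))    # ['Methionine (start)', 'Phenylalanine', 'Glycine']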

In meiosis, two cells come together and share (now we can use this term) their genetic material. The chromosomes are combined and split (typically at random places), and this new, recombined DNA will be different from both parents', providing a new blueprint for a whole new specimen (imagine taking a ninth of the blueprint of your house and eight ninths of your sister's house, which are pretty similar to begin with, and creating a whole new blueprint for a new house). This is how sexual reproduction operates in a significant number of animals and plants. At the creation of this new blueprint (the crossover), there may also be some random changes which cause the new specimen to have characteristics not seen in either parent. This is called a mutation (legend has it it could give you laser eyes and Wolverine's claws, but all I got was a slightly heightened sense of smell and voluntary movement of my ears!). However, if a mutation puts the specimen in an awkward position, it will not prevail. Imagine a herd of gazelle in a savanna, standing slightly below the height of the dried grass, which makes them less conspicuous to predators. Suppose there is one offspring gifted with height. It will most probably be the first ambushed by a crouching lioness, and will not propagate its 'tall' gene any further. Similar phenomena have caused polar bears to be white while grizzly bears remain brown, tropical humans to have more sun-resistant dark skin, giraffes to have long necks and deer to have short ones. This is the beauty of natural selection.
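
And again, purely as a computational toy (the 'genomes' here are made-up strings, and real crossover and mutation are far messier), here is roughly what crossover plus the occasional point mutation looks like in code:

import random

BASES = 'ACGT'

def crossover(parent_a, parent_b):
    """Split both parents at the same random point and splice the halves."""
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genome, rate=0.05):
    """With a small probability, replace a base with a random one."""
    return ''.join(random.choice(BASES) if random.random() < rate else base
                   for base in genome)

mother = 'ACGTACGTACGT'
father = 'TTGACCGTAGCA'
child = mutate(crossover(mother, father))
print(child)    # a new 'blueprint': mostly its parents, but not exactly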


 Life on earth, as we know it, is mysterious and quite wonderful. However, our understanding of its origin and path is no great mystery any more. There are minutiae to be worked out, granted, but I shall leave you with a small allegory from the late Douglas Adams. “Imagine two hundred years ago you tell a private detective living in New York 'a man who was spotted in Manhattan at 2 pm last Wednesday was spotted 4 hours later at Charing Cross, London'. He will be simply stupefied as to how the man crossed the Atlantic in just 4 hours. Now imagine if you told a modern detective the same story. He will still be puzzled by the details: which flight did he board? Why did he take the train? And so on... But there is no great mystery for him about how the man crossed the Atlantic. Our perception of how life works is in the same situation”.

Friday, February 27, 2015

On group digital signatures....

So there is this thing that I've been trying to figure out: what is the difference between a group signature and a normal signature? To give you a bit of context, let me give a brief intro to digital signatures. I promise, no equations till they're absolutely needed.

Encryption : 

Encryption is a technique we use to ensure many aspects of information security. Mainly, it is used to ensure the 'confidentiality' of a message. This basically means that the message can be read by the intended party and the intended party alone. This matters because there are many snoopy, malicious 'Eve's around, who will always jump at a chance to have a sneak peek at somebody else's affairs.

Integrity and Authenticity : 

Apart from confidentiality, these two are of vital importance for a message. Suppose there is some person who, while not being able to see what is in the card you're sending to your mother, is secretly attaching a confession of chronic drug abuse to it. How does your mother know whether the message is from you, or whether somebody else altered it?

Signing It : 

Of course, your mother would know your signature, so it's a simple matter of signing at the end of your letter to ensure that it's from you and that no one has altered it. I'll add two slight modifications and assumptions: your signature is unforgeable (your mother will immediately notice any tampering), and you include the number of words in your message, along with the date, in your signature, so that she can make sure that you actually signed this particular message and nothing else.

Digital Signatures : 

The principle of digital signatures is not unlike this scenario. Assuming that confidentiality is already taken care of, to ensure the authenticity and integrity of a message we do the same thing as with your letter to your mother.


  1. Integrity : In order to ensure that your message has not been altered or tampered with, you use a cryptographically secure hash function (the input of which is your message, and the output another, shorter value, chosen so that it is quite improbable that two messages will produce the same value). You take the hash of your message and send it along. When somebody gets the message, they can feed the message into the same hash function and check whether the result matches the hash value they received. A match ensures that the message is intact. This is much like the number of words at the bottom of the letter (which, I must admit, is a horrible hash function). A small sketch follows this list. 
  2. Authenticity : Now, you must have your own signature under the document. The way you sign this document is by signing the hash value itself. This way, assuming that the message is already confidential, the receiver will know, given that they can identify your signature, that: a) YOU signed that message and b) you signed THAT message.
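
Here is the promised sketch for the integrity part, using Python's standard hashlib module. The message is made up, and note that on its own the hash only detects changes; it is the signature from the next step that stops someone from swapping out both the message and the hash.

import hashlib

# Sender: compute the digest and ship it along with the message
message = b"Dear mother, everything is fine."
sent_digest = hashlib.sha256(message).hexdigest()

# ... message and digest travel to the receiver ...

# Receiver: recompute the digest and compare
received_message = b"Dear mother, everything is fine."
if hashlib.sha256(received_message).hexdigest() == sent_digest:
    print("Message intact")
else:
    print("Message was tampered with!")
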
Public Key Signatures :

A pretty straightforward way to implement this is to use your 'private key' to encrypt the hash value, quite the reverse of how we typically use the key pair. When the receiver gets your message, they can decrypt the signature using your 'public key' and carry on with the verification process.
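As a rough sketch of how this looks in practice, here's sign-and-verify using the third-party 'cryptography' package (pip install cryptography), which is not something this post depends on. Note that modern libraries expose this as sign/verify operations rather than literally 'encrypting the hash', but the idea is the same: the private key produces a signature over the message digest, and the public key checks it.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Generate a key pair (in real life you would load an existing one)
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"Dear mother, everything is fine."

# Sign: roughly, "process the hash of the message with the private key"
signature = private_key.sign(message, padding.PKCS1v15(), hashes.SHA256())

# Verify: raises InvalidSignature if the message or signature was altered
public_key.verify(signature, message, padding.PKCS1v15(), hashes.SHA256())
print("Signature checks out: YOU signed THAT message")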

Why Group Signatures :

As the saying goes, necessity is the mother of invention. Group signatures were introduced to meet a very basic need: protecting the signer's privacy against potential verifiers. It's like you're signing a petition against the chief of police of your area, a corrupt prick, and the local legislative officials are capable of verifying that the signature actually belongs to a real resident. The chief of police is convinced if the legislative officer says so; however, the chief cannot know 'who' actually signed it (because, you know... white vans and stuff...).

So this local legislative officer acts as the group manager of the group we may call the jurisdictional area of the police station. All residents in this area can sign the petition, and the legislative officer can verify that a signature is genuine. In some special cases, such as attempted forgery or tampering with the message content, the group manager is also capable of revoking this anonymity. But the basic idea is that "the identity of the signer doesn't leak to the potential verifiers through the publicly available verification scheme".

I'll sum up WHY this is called a 'group signature scheme' in one sentence: because that way, it's easier for a group manager to manage the signature verification of a specifically defined group than to manage a set of global signatures. After all, your village or town officials may not be able to verify the signatures of all the people in the whole wide world.

Feel free to leave any comments, or request clarifications. This is a quick note, so I hope it's clear enough...

Saturday, December 20, 2014

The Kernel Trick for the Layman

Kernel methods, for those who have been engaged in machine learning, statistics, data mining and the like, is a term that gets thrown around without a lot of understanding going on. Since I have some experience dealing with them, I thought I should do a write-up.

Kernel methods in machine learning are employed through a methodology dubbed the "kernel trick". This is just fancy talk for saying that we use a roundabout method to exploit kernel metrics on non-Euclidean data manifolds, which leads to the obvious question...

What the hell did I just read? 

To answer that question, my good sir/madam, we need to look at three main things: what is a non-Euclidean data manifold, what is a kernel, and what is the kernel trick? Let's look at each of these very briefly.

Imagine you have collected some data on, say, the traffic behaviour of your nearby (sub)urban area. You have collected two fields: the day of the week and the amount of traffic transferred (ingress or egress) to and from the area. Now, imagine trying to find a simple correlation. Assuming that the week starts on Sunday, you will see that on Sunday evenings a lot of traffic flows into the city, while on Friday evenings a lot of traffic flows out of it. The weekends and the rest of the weekdays will have their own localized traffic behaviour. Now, suppose you are asked to find a model that describes the correlation between the day of the week and the traffic flow.

You will start by encoding the days as Sunday = 1, ..., Saturday = 7. And suppose you take the 'Euclidean distance' between the pairs of days that you consider (in this case, simply the numerical difference) to describe how similar or dissimilar the two days are.

Now, you will notice that the traffic on days {1, 7} is actually very similar: Saturday and Sunday behave much more like each other than either does like Wednesday. Yet the numerical distances for the pairs {1, 4} and {4, 7} are both 3, and their sum, 6, is exactly the distance between 1 and 7. If the data behaved in a Euclidean manner, or simply, if it could be adequately described by a Euclidean space, the pair with the largest distance should be the least similar; here the most distant pair under this encoding is among the most similar in behaviour. Therefore we can see that this data is not behaving in a very Euclidean manner.
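
To see the problem (and one fix) in code, here's a small numpy sketch. The numbers are made up, and mapping the days onto a circle is just one hand-made mapping that happens to respect the weekly cycle; this idea of 'mapping into a nicer space' is exactly where kernels come in.

import numpy as np

days = {'Sun': 1, 'Mon': 2, 'Tue': 3, 'Wed': 4, 'Thu': 5, 'Fri': 6, 'Sat': 7}

def naive_distance(a, b):
    """Plain Euclidean distance on the 1..7 encoding."""
    return abs(days[a] - days[b])

def circle_embedding(day):
    """Map a day onto the unit circle: a hand-made embedding function."""
    angle = 2 * np.pi * (days[day] - 1) / 7
    return np.array([np.cos(angle), np.sin(angle)])

def embedded_distance(a, b):
    return np.linalg.norm(circle_embedding(a) - circle_embedding(b))

print(naive_distance('Sun', 'Sat'))       # 6     -> looks maximally different
print(embedded_distance('Sun', 'Sat'))    # ~0.87 -> adjacent on the circle
print(embedded_distance('Sun', 'Wed'))    # ~1.95 -> genuinely far apart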

Next, let's talk about kernels. The kernels I am talking about here are kernel functions. For this, an initial description of an 'embedding function' is needed. Suppose you have a vector space, say X. Now, imagine a function φ(x) which maps the data from X into some Hilbert space H,

                                                   φ : X → H 

 Note that this Hilbert space can have any dimensionality. Now suppose that you define a function K such that
                                                   K : X × X → ℝ 
         
and K(x, y) = φ(x) · φ(y), i.e. the inner product. 

A function that satisfies these properties is called a kernel function. Now, those who are familiar with basic vectors will also know that, roughly speaking, the larger the inner product between two vectors, the more similar they are.
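
Here's a small numpy check of that definition, with an embedding I've picked purely for illustration: for 2-D inputs, the polynomial kernel K(x, y) = (x·y)² gives exactly the inner product in a 3-D embedded space, without ever computing the embedding.

import numpy as np

def phi(x):
    """An explicit embedding of 2-D inputs into a 3-D space."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def kernel(x, y):
    """Polynomial kernel, computed entirely in the original space."""
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

print(np.dot(phi(x), phi(y)))   # inner product in the embedded space: 16.0
print(kernel(x, y))             # same value, no embedding needed:     16.0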

Keeping these in mind, let's move on to the kernel trick.

What we do in the kernel trick is this: if we cannot find a good similarity measure in the initial data space, we use the embedding function to map the data into a Hilbert space where it may show Euclidean properties, and then find the similarities in that space. So how do we do this?


  1. We have a function that maps the data space into a Hilbert space (we have φ).
  2. We have a way to find the similarity (the inner product).
  3. Best of all, we have a way to find the similarities in the mapped Hilbert space without ever computing the embedding explicitly (the kernel function does both at the same time!).

Now all that remains is to use this in a model that we can train for machine learning. Prepare for the simplest and most elegant algorithm you've ever seen in your life!

Suppose you have two classes that you want to classify the data into. Let's encode these two classes as +1 and -1, you know, just because...

Now, all you need is the sign from the algorithm. So the predicted class will be,

           class = sign[ Σᵢ K(x, xᵢ) · sᵢ wᵢ ]

Let's see what this means. Here x is the data point or entry (vector) whose class you want to predict. K is the kernel function defined in the model, and the kernel term gives the 'similarity' between a point xᵢ from the dataset and the point we need to check. sᵢ is the sign (+1 or -1) of the training point we're checking against, and wᵢ is the weight of that instance. This simply means that the more relevant and valid the training point (by the weight), and the larger its similarity to the query point (by the kernel), the more likely it is that the sign of that training point will be the predicted sign!
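
Here's that prediction rule as a few lines of numpy. The 'trained' points, signs and weights are all made up, and I've picked an RBF kernel as one common choice (nothing above commits to any particular kernel):

import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """One common kernel choice: similarity decays with squared distance."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

# A tiny made-up 'trained' model: stored points, their signs and weights
points  = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.1], [2.2, 1.9]])
signs   = np.array([+1, +1, -1, -1])
weights = np.array([0.7, 0.5, 0.6, 0.8])

def predict(x):
    """class = sign[ sum_i K(x, x_i) * s_i * w_i ]"""
    total = sum(rbf_kernel(x, p) * s * w
                for p, s, w in zip(points, signs, weights))
    return int(np.sign(total))

print(predict(np.array([0.1, 0.1])))   # +1 : close to the positive cluster
print(predict(np.array([2.0, 2.0])))   # -1 : close to the negative cluster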


This is the simplest idea behind kernel methods. There are many variations of this and how this is used. But for the sake of simplicity, I will stop at this.

Please feel free to leave a comment or a question if you need any clarifications... Adios!

Thursday, December 18, 2014

Using IPython in your Python Codes

After a while, I thought I should jot this down for anyone who might want to write a certain kind of Python script: one where the runtime state needs to stay alive for a significant duration. One of the best tools for this is IPython.

For those familiar with Python programming, IPython needs no introduction. I will forgo the niceties and just cut to the chase for the uninitiated. IPython is a Python framework which provides an interactive, real-time programming experience for its users. It is complete with an interactive shell, a Qt graphical user interface and a web interface: the famous IPython Notebook. Installation guides are all around the web, so a simple Google search will do.

However, imagine a situation where you have a certain piece of code running, say, a server (take Tornado or Django for instance), and you want to run your Python code in a dynamic manner: you want to create a variable during breakfast and use it as a count of your website's visitors, and during lunch you want to check how many visitors have come to your site. One approach would be to go full caveman and keep a thread running in the background, but is there anything we hate more than implementing a long-running thread with an interactive interface?

IPython provides a simple, elegant solution for this. You can initiate a 'kernel' of IPython, and the kernel will act as a single, independent and contiguous runtime instance for Python. Imagine the possibilities with this. However, harnessing the power of this extremely useful tool is not easy for a beginner, given the lack of detailed documentation, for understandable reasons.

Now for the fun part. There is only one class that you have to import in your code to do this (note that I assume you already have the IPython framework installed on your computer): the "KernelManager" class provided by IPython. I'll boil it down to the simple steps you have to follow.


  1. Create a Kernel Manager
  2. Use the Kernel Manager to create a kernel instance and run it
  3. Use the Kernel Manager to create a client which communicates with the said kernel
  4. Open a channel between the client and the kernel using the client (four kinds of channels, go through the documentation for detailed information on them)
  5. Feed your code through the ShellChannel of the client. 

The sample code looks something like this. 

from IPython.kernel.manager import KernelManager

# Create a kernel manager and start an independent, long-lived kernel
km = KernelManager()
km.start_kernel()

# Create a client that talks to that kernel and open its channels
cl = km.client()
cl.start_channels()

# The shell channel is the one that accepts code for execution
chan = cl.shell_channel()

# Note the explicit newlines: each statement must stay on its own line
code_str = ("f = open('/home/yourname/yourfile.txt', 'w')\n"
            "f.write('the sample works')\n"
            "f.close()")

chan.execute(code_str)

et voila.... You have working code!

If you need any clarifications, please ask below. Have an interview. Gotta go...

Monday, June 23, 2014

A quick intro to neural networks

I first heard about Neural Networks, as many of those in our generation, from "The Terminator"...

The Terminators had a 'neural net processor' as their brains, that allowed them to learn and reason as humans do. I was just fourteen at the time, and I had barely heard of what a neuron was. And well, it wasn't really catchy back then. 

Oddly enough, the damaged and recovered chip from the first Terminator movie looked rather delicious


About five years later, I found out that it's actually a thing... 

After spending a couple of years trying to understand what a neural network is, following multiple online courses and reading a big book, I decided it would be better studied once I was better versed in statistics and intelligent systems. Now that this is partly true, I thought I should help anyone who might be looking for a simple explanation of what a neural network is to get a quick helicopter view of what all the fuss is about. 

The Human Brain

Probably one of the most remarkable outcomes of natural selection and evolution, the human brain is a system of co-existing cells called 'neurons' that form the ultimate problem-solving system. It comprises a very large number of neurons (somewhere around twenty billion in the cerebral cortex alone of an average human being, 20,000,000,000 in digits...). Various animals have various numbers of neurons in their central information-processing area, which we dub the brain. However, there are animals with no "brain" and yet a system that acts on and reacts to the environment (SpongeBob, for instance). 

Probably a gift from Plankton


The brain, as I said, is made of basic cells known as neurons. Neurons are specialized in dealing in electrochemical signals, which are passed from one neuron to another through junctions called synapses. A neuron has two categories of .... arm-like thingies that allow the reception and transmission of those electrical signals: one long arm called the axon, which is analogous to an 'output', and a bunch of input tentacles known as dendrites. The following diagram, which I took from the internet, is a simple enough representation of this (thank the owner... :) )


The Synapses are like the meeting points of dendrites and axons
While it is open to scientific debate, as I hear, the common belief, at least as far as computer scientists are concerned, of how a neuron functions is this: when it receives input signals through its dendrites, each input signal is multiplied by a 'weight' given to each dendrite (actually, I'm guessing some kind of natural amplification of the signal happens according to the size and composition of each dendrite) and the results are summed. Only if the resulting charge/current exceeds a certain threshold value is the axon triggered to give an output. We can see how this supports learning from the fact that brain cells create and break connections, something that happened constantly when we were children and were learning really, really fast. So, basically, the 'information' in our brains is carried by these electrical signals, and by making and breaking connections the 'learning' of information happens, which is basically an adjustment of those connections to fit the information against some standard or metric. 

An Artificial Neuron - Trying to mimic the brain

Given the advanced information-processing power shown by natural neural networks (even a dust mite is more 'intelligent' than our smartest cognitive systems), it was only a matter of time before we tried to exploit the same idea in our artificial systems. For this, a basic abstraction of the neuron is of great importance. Here's how it goes. 

An artificial neuron (a.k.a. a perceptron), not unlike its natural counterpart, consists of a set of inputs (dendrites) and an output (axon). Each input has a weight attached to it. We can define a vector for these weights (let's be a bit imaginative and call it the weight vector) - w. Now suppose we gather all the inputs (let's say we have k inputs) and put them in another vector i. I will try to use a diagram to make it a bit more clear... 


Here, i  stands for the input vector of the form i = a u + b v + c x + d y  where u,v,x,y are unit vectors and each w stands for a component of the weight vector...
Basically, what happens in the given perceptron is that along each of the arms, scalar inputs of size a, b, c and d come into the perceptron. We can form an input vector using these inputs, as captioned in the diagram. Next, we also have the weight vector w, made from the weight attached to each input.

Now we see that the output is a function of the input. This is called, in neural network terminology, an activation function. In a linear perceptron (we'll only talk about this particular kind of simple perceptron for the moment...), this is typically defined as follows.



                   f( i ) = 1    for w · i > t, for some scalar threshold value t
                   f( i ) = 0    otherwise




As you can see, we fire an output of a binary 1 if the scalar product of the input vector and the weight vector is larger than some threshold value; this is a step activation function (also please note that there are continuous functions which can operate as an activation function, such as a sigmoid or logistic function, without a hard threshold). So this is how we mimic a neuron in a program.
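
In fact, that is pretty much a one-liner in Python. Here's a sketch with numpy, with made-up weights and a made-up threshold:

import numpy as np

def perceptron(inputs, weights, threshold):
    """Fire 1 if the weighted sum of the inputs exceeds the threshold."""
    return 1 if np.dot(weights, inputs) > threshold else 0

# Four input 'dendrites' with made-up weights, and a made-up threshold
weights = np.array([0.4, -0.2, 0.9, 0.1])
threshold = 0.5

print(perceptron(np.array([1.0, 0.0, 1.0, 0.0]), weights, threshold))   # 1.3 > 0.5  -> 1
print(perceptron(np.array([0.0, 1.0, 0.0, 1.0]), weights, threshold))   # -0.1 < 0.5 -> 0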


The Network

Now it comes down to networking these functional neurons. As you can see, there are several parameters that we have to decide for a neuron, namely the weight vector (the value of each component, i.e. the weight of each input), the activation function, the threshold value and the dimensionality of the weight vector (the number of inputs per neuron). In fact, you can see that the output depends on these parameters, and by changing these values we can change how a neuron behaves for a given input. Next comes the point where these neurons are connected to form a network.

There are many intricate types of neural networks. The simplest of them is called a 'feed-forward network'. As the name suggests, what it does is keep pushing the signals forward through the network until they come out as an output. (There are much more complicated and dynamic neural networks, and some pretty interesting derivations of them, such as Self-Organizing Maps and Growing Self-Organizing Maps. Refer to Kohonen's neural networks and the work of Prof. Damminda Alahakoon if you are interested.)

Let's take a simple feed forward network. It is going to look something like this.

This simple feed-forward net has three neural layers. The output from each layer is fed as input into the layer in front. 

Suppose you have two separate inputs and you want to classify them as 0 or 1 according to the input. Then the simplest neural network for this learning purpose is shown above. From the left, two inputs come into the green neurons. The outputs of that layer (actually a two-dimensional vector) are fed into each neuron in the magenta layer. Lastly, the outputs of the magenta layer are fed to the black layer, which gives the output. We usually tend to use the same activation function and threshold in each of the neurons; the weights, however, are what get adjusted to produce the desired output.

First, we feed what we call a training data set into the network. The inputs are fed in, and the outputs are taken, at the initial configuration of the network (with random input-arm weights and a user-defined number of layers and inputs per neuron). Then we get the output. In the training data, we also have the expected output. Now we calculate, using some metric, the accuracy of the predictions. Typically, we will have a high error between the expected and the observed values. So we use some kind of state-space search to find a configuration of the network (typically, we only change the weights of the input arms, not the activation function or the number of neurons) that lowers the error. Then we use this new configuration to get the new errors, and change the configuration again! This goes on for a while, until we find some configuration that we're happy with.
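
To make that loop concrete, here's a toy sketch of a 2-2-1 feed-forward net with step activations, 'trained' by plain random hill-climbing over the weights (the simplest state-space search I can think of; a genetic algorithm or simulated annealing would slot into the same place). The task, the architecture and all the numbers are made up purely for illustration.

import numpy as np

def step(x, threshold=0.5):
    return (x > threshold).astype(float)

def forward(weights, x):
    w_hidden, w_output = weights
    hidden = step(w_hidden @ x)                 # two hidden neurons
    return float(step(w_output @ hidden)[0])    # one output neuron

def error(weights, inputs, targets):
    return sum(abs(forward(weights, x) - t) for x, t in zip(inputs, targets))

# Training data: the logical AND of the two inputs
inputs  = [np.array(v, dtype=float) for v in [(0, 0), (0, 1), (1, 0), (1, 1)]]
targets = [0.0, 0.0, 0.0, 1.0]

rng = np.random.default_rng(0)
weights = [rng.normal(size=(2, 2)), rng.normal(size=(1, 2))]
best = error(weights, inputs, targets)

for _ in range(2000):
    # Perturb the weights slightly; keep the change only if the error drops
    candidate = [w + rng.normal(scale=0.3, size=w.shape) for w in weights]
    e = error(candidate, inputs, targets)
    if e < best:
        weights, best = candidate, e

print("remaining error:", best)    # usually reaches 0 for a task this easy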

Here, there are some things that we ought to be mindful of. First of all, we have to figure out a way to get some kind of feel for how many neurons we need in the network, the number of layers, or even the number of inputs per neuron. I honestly am quite new to the field, so I have yet to find any rules of thumb defining these. Also, there's the omnipresent issue of overfitting the training data! We must also have some kind of idea about just how much data we feed it to train it!

The state-space search that I mentioned earlier can take many forms. One of my favourites is using a genetic algorithm, but we can use simulated annealing as well. Usually it's good not to get stuck at a local optimum, which is why I prefer the above two. And if we use genetic algorithms, we can manipulate not only the input weights but also the other parameters, like the number of neurons. One thing is for sure, though: it will take a very, very, very long time to train a neural network for a specific problem.

Usually, neural networks are resilient to noisy data, and outliers rarely screw them up. However, given the complexity of things, Skynet and the Terminators are, as of yet, a mere dream!

Please leave your comments below, and have a merry time with trying to code a neural network.. :D 


Sunday, September 1, 2013

OpenShift and JBoss

After trying and failing to port a complex system written in C# to Python (mainly due to time constraints), I find myself again at the mercy of Java, albeit at a rather advanced level.

In this article, I will give a small introduction to getting your environment configured for the development tasks. 

Objective

A game server written in C#.NET must be ported into a 'hostable' application. I will not start with my usual hippie ranting about how C# should be banished from the face of the earth just for not being free and open source (discounting Mono, of course!), but I should mention that if you are using C# for a server application, your options on the cloud are limited! With this in mind, I wanted to exploit the free tiers of a Platform-as-a-Service for a 'custom' program rather than a website.

Technologies Used

OpenShift

The Platform-as-a-Service I have used is OpenShift. I chose it for many advantages, including but not limited to the ease of deployment, the wide range of supported technologies, good community support, as well as a substantial free tier. Adding a special note about the wide range of technologies: in the worst case, if you don't find something useful among the offered 'cartridges', you can start up a DIY (Do-It-Yourself) cartridge, begin with a fresh Red Hat Linux installation and use SSH to get your freak on! I tried this out by installing Python (along with pip) and several modules for my purpose, and it turns out it works like a charm. I have yet to try to install a command-line torrent client (I doubt it will work). But I have managed to overcome one of the commonly faced difficulties with servers that do not support range retrieval of large files. All in all, OpenShift is an interesting platform for app deployment.

But of course there are some drawbacks. One of them is the confusing load-balancing, DMZ and proxy mechanisms. This is one of the reasons I am being forced into working with websockets (which may or may not work! I'll leave that for later!!). It takes away the simplicity that you can have in a simple client-server app with bidirectional data transfer. I have yet to conclude whether it was a worthy trade-off!

JBOSS

JBoss is, simply put, an application server written in Java. Without claiming to be much of an expert on the subject, I'll just encourage you to google it. Our interest lies in the fact that it is one of the best-supported cartridges in OpenShift.

For instance, you can download an Eclipse plugin and deploy your application to the remote server from end to end (if you are lucky). However, there are some common issues with authentication and credential validation of OpenShift accounts in the Eclipse plugin for OpenShift + JBoss support. You can skim through the forums and discussions only to be let down. But despair not! Because we have the JBoss Developer Studio by Red Hat.

Deployment


The JBoss Developer Studio is based on the Eclipse SDK, so it's not a completely novel experience for a n00b. You just have to
1. Code in the IDE
2. Commit to the local repo
3. Push to the remote origin
4. Witness the Magic!

I will add more details on installation and setup as I find time (that was the idea behind the article). Meanwhile, feel free to ask for any clarification if you can't wait for the next edit!