Monday, March 23, 2015

An intro to bioinformatics - Cells and Molecules and other stuff...

So... keeping up with the inconsistent, incoherent articles that I have posted on this blog for years, I have decided to venture into bioinformatics. Expect a series which will run for quite a long time (hopefully).

Bioinformatics, as far as I can see, is the discipline in which we study information-related phenomena in the biological world. Our main area of concern is how information is propagated through chemical sequences to create functional life forms. This series of articles will give an introduction to the domain knowledge of bioinformatics and computational biology, from what little knowledge I myself have. However, I hope to impart new knowledge as I find it.

Now then, in this article, let me give a basic understanding of the domain by giving you a rough idea about molecular biology. Please feel free to point out any inconsistencies.

Everything we call 'life', from the trees to the bees, from the lemurs to the baiji, is the product of billions of years of evolution. Though there have been numerous doubts about the validity of the theory of evolution, it has withstood the burden of proof and the test of time and scrutiny. The phenomenon of evolution itself may be crystal clear, but its means and methods are far less so, and they remain, arguably, the greatest puzzle of all time.

To understand this, one needs to look into the basic building blocks of life. Asked on a whim, the uninitiated will undoubtedly point to the animals (or species, for the more enlightened ones). The species, or rather the individual organism, while being close to a valid answer to the problem of the unit of evolution, was the source of inspiration for naturalists up until a couple of centuries ago. It stands to reason that an organism, a living, breathing, moving, reproducing agent of life, whether it be a plant or an animal, an insect or a mammal, would serve as a useful entity isolated from its surroundings and environment. Not until the invention of the microscope did this idea change.

With the advent of the microscope, new organisms were discovered. These were autonomous organisms which displayed the characteristics of life on their own; they had their own metabolism, movement and reproduction. These 'cells' of life were then considered the unit of living beings, hence we call them the building blocks of life.

Cells are found in nature in a large variety. They differ from each other based on species and function. However, it is reasonable to think of a complex organism such as a human being as a colony or a civilization of single-celled organisms which coexist to achieve a single objective: survival. Though there are variations, cells typically have the same basic components.

A typical cell contains various parts. It is enclosed by the cell membrane, and it gets its shape from the cytoskeleton, a fibrous scaffold which runs through the inside of the cell. Filling the cell is the cytoplasm, a gel-like substance which contains molecules vital for the existence of the cell. Inside the cell (bar the exception of prokaryotes, which have none) lies the nucleus, which holds the genetic material and controls most aspects of the cell's existence.

An organism is made up of millions and billions of cells. Consider yourself, sitting and reading this essay, and scrolling your mouse-wheel with your right index finger. The muscles that help you move your fingers (most of which actually sit in your hand and forearm) are made up of separate cells. They have their own autonomy, they can reproduce accordingly, they can take in food, they can dispose of waste, and they have their own mitochondria to produce energy from the food. And how do you know your finger is on the wheel? The nerve endings in the fingertips are extensions of nerve cells whose cell bodies sit near your spinal cord and reach your hand through the nerves of your arm. They also have the said autonomy, bar the voracious capacity to replicate. The red blood cells in your blood give it its red colour, and your lungs are built of tissue made from cells that can effectively take up oxygen from the air you breathe. Your head houses some of the most effective cells ever to have existed, which together can still outperform a computer at many tasks (alas, all the self-awareness and narcissism come along with the package). But a question remains: how does an organism, a colony of cells, propagate its form and function and characteristics to a new generation through reproduction?

To look at this, we will first address the issue of cellular reproduction. There are two forms of cell division. They are mitosis, in which a single cell divides into two identical cells, and meiosis, in which a cell divides into sex cells that each carry only half of the parent's gene... (let this wait...) characteristics, ready to be combined with the half from another parent. As an example, suppose you scrape your arm on a sharp edge. The damaged muscle, subcutaneous tissue and skin will eventually grow back. But if the different cells contributed to each other's division, you'd have a very weird looking new layer of skin, one that would make you actually love the acne scars on your face. This doesn't naturally happen (although a deep wound will form scar tissue, which is a separate phenomenon), as the cells do not contribute to the reproduction of each other. The epidermis, dermis and hypodermis simply grow back independently, which serves as a visible example of mitosis. But, as I am sure you are already aware (I hope I won't have to go into the awkward details of how babies are made), a child cannot be born in the same manner. You need two cells, one from the mother and one from the father (the egg and the sperm), to contribute to the new organism, and these two cells are the products of meiosis.

Now comes an even more intriguing question: just how exactly is the new cell aware of what it is supposed to do, and what goes where? For this, we come to an even smaller unit of life, the gene! The gene as the unit of selection in evolution was proposed by several biologists, but was popularized mainly by Prof. Richard Dawkins in his book “The Selfish Gene”. In it, he gives a good explanation, which I will roughly paraphrase for better insight into the matter.

Imagine a fluid in which some molecules, which can combine and recombine to achieve some basic tasks (though not necessarily with purpose), are floating freely around. We will call these amino acids. The combined components we will call proteins. Needless to say, with a limited number of amino acids, we can create a myriad of different proteins. Now imagine a second molecule, which has some parts that can influence the formation of proteins from the original amino acids. This second molecule could encode some structural information on what the proteins should look like (or the converse of it, actually). Imagine if this molecule was a large one, which could encode a significant amount of this structural and functional information. All it needs to do is to encode what kind of protein needs to be created. Such a molecule came about, beyond the shadow of a doubt naturally, given the right conditions, and may have undergone a lot of trial and error through natural selection, and so we have DNA. Deoxyribonucleic acid owes its success to its structure. It is built from four nucleotides (Adenine, Guanine, Cytosine and Thymine), and these can come together to form very long strands of the molecule, hence enormous amounts of information on the formation of proteins can be stored in them. However, if you think of it as a blueprint of the organism, the 'walls' and 'rooms' and 'hallways' of the blueprint must also be clearly identifiable (not necessarily for an observer, but for the cell's protein-building machinery, such as the ribosomes). These 'sections' are what we call 'genes'. One cannot simply take a strand of DNA and point out at a glance where one gene ends and another begins. But sections of the strand will have definite pieces of information encoded in them.
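To get a feel for just how much information such a strand can hold, here is a minimal Python sketch. It treats a DNA strand as nothing more than a string over the four-letter alphabet A, C, G and T, and counts how many distinct strands of a given length exist. The function names (`possible_sequences`, `information_bits`, `is_valid_dna`) are my own, made up for illustration, and the "two bits per nucleotide" figure is only the raw capacity of a four-letter alphabet, not a claim about how much of a real genome is meaningful.

```python
import math

# The four nucleotides named in the text: Adenine, Cytosine, Guanine, Thymine.
NUCLEOTIDES = "ACGT"


def possible_sequences(length: int) -> int:
    """Number of distinct strands of the given length (4 ** length)."""
    return len(NUCLEOTIDES) ** length


def information_bits(length: int) -> float:
    """Raw information capacity of a strand, at log2(4) = 2 bits per nucleotide."""
    return length * math.log2(len(NUCLEOTIDES))


def is_valid_dna(strand: str) -> bool:
    """True if the string only uses the four-letter DNA alphabet."""
    return all(base in NUCLEOTIDES for base in strand.upper())


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, possible_sequences(n), information_bits(n))
```

Even a strand of only ten nucleotides can take over a million different forms, so it is not hard to see how strands millions of nucleotides long could hold the 'blueprint' for a whole organism.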

Now, let's see how the whole thing is done in practice. In a cell, or more specifically in the nucleus, the DNA strands are packaged into units called chromosomes. A chromosome is nothing more than a tightly wound strand of DNA together with special proteins called histones, which keep it in shape. Inside the nucleus, there may be several chromosomes. You typically have 46 of them and a potato has 48. The number itself doesn't really mean anything; however, for a healthy specimen of a species, the count is typically fixed. A person with Down's Syndrome, for instance, has an additional copy of the 21st chromosome, which causes the anomaly. Inside the nucleus, smaller single-stranded copies of the DNA are made, which we call RNA. They are made in sections, and these sections can pass out through pores in the membrane of the nucleus. The RNAs are like small prints of particular sections of a blueprint, which you would give your gaffer for the landscaping, or your painter to finish the paint job in the first-floor bedroom. The RNA then goes through special units called ribosomes, which create the proteins as encoded by the RNA. These proteins then go to the required places (many of them through the Golgi apparatus) and fulfill their destiny. And some of these new proteins will go on to help build new cells. It is pretty straightforward to imagine how mitosis will occur with this model. But meiosis becomes more interesting.
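The DNA to RNA to protein pipeline described above can be sketched in a few lines of Python. This is a toy model under some loud assumptions: `transcribe` simply copies the coding strand and swaps T for U (the real machinery reads the opposite strand, but the resulting RNA looks like this), and `translate` starts at the first AUG and stops at the first stop codon, ignoring everything else a real cell does. The function names are hypothetical; the codon table is the standard genetic code written with one-letter amino acid codes, where `*` marks a stop.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Toy transcription: copy the coding strand, writing U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Toy translation: read codons from the first AUG up to the first stop codon.

    Assumes the RNA contains only A, C, G and U.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein is built
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the ribosome lets go
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCATTGTAATGGGCCGCTGA"  # made-up sequence, for illustration only
    rna = transcribe(gene)
    print(rna)             # AUGGCCAUUGUAAUGGGCCGCUGA
    print(translate(rna))  # MAIVMGR
```

Each group of three RNA letters picks out one amino acid, which is how a four-letter alphabet manages to spell out proteins built from twenty different building blocks.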

In meiosis, and in the fertilization that follows it, two parents get to share (now we can use this term) their genetic material. While the sex cells are being made, matching chromosomes are lined up, split (typically at random places) and recombined, and when an egg and a sperm then fuse, the resulting DNA is different from that of both parents, which provides a new blueprint for a whole new specimen (imagine getting a ninth of the blueprint of your house, and eight ninths of your sister's house, which are pretty similar to begin with, and creating a whole new blueprint for a new house). This is how sexual reproduction operates in a significant number of animals and plants. However, at the creation of this new blueprint (or the crossover), there may be some random copying errors which will cause the new specimen to have some characteristics which are not seen in the parents. This is called a mutation (legend has it that it could give you laser eyes and Wolverine's claws, but all I got was a slightly heightened sense of smell and voluntary movement of my ears!). However, if this mutation puts the specimen in an awkward position, it will not prevail. Imagine a herd of gazelles in a savanna, standing at around the height of the dried grass, which makes them less conspicuous to predators. But suppose there is one offspring gifted with height. It will most probably be the first to be ambushed by a lurking lioness, and will not propagate its 'tall' gene any further. Similar phenomena have caused the polar bears to be white while the grizzly bears remain brown, the tropical humans to have more sun-resistant dark skin, and the giraffes to have long necks and the deer to have short ones. This is the beauty of natural selection.
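The splitting, recombining and occasional miscopying of the blueprint can be mimicked with a small Python sketch. It is a deliberately crude model, closer to what a genetic algorithm does than to a real cell: `crossover` cuts two equally long strands at one random point and joins the pieces, and `mutate` flips each nucleotide to a different one with a given probability. Both function names, the single cut point and the mutation rate are my own assumptions, chosen for illustration.

```python
import random

NUCLEOTIDES = "ACGT"


def crossover(parent_a: str, parent_b: str, rng: random.Random) -> str:
    """Single-point crossover: the start of one parent joined to the end of the other."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must be equally long, with at least 2 nucleotides")
    point = rng.randrange(1, len(parent_a))  # random place to split
    return parent_a[:point] + parent_b[point:]


def mutate(strand: str, rate: float, rng: random.Random) -> str:
    """Replace each nucleotide with a different one, with probability `rate`."""
    result = []
    for base in strand:
        if rng.random() < rate:
            base = rng.choice([b for b in NUCLEOTIDES if b != base])
        result.append(base)
    return "".join(result)


if __name__ == "__main__":
    rng = random.Random(2015)  # fixed seed so that a run can be repeated
    mother = "AAAAAAAAAAAAAAAA"  # made-up strands, easy to tell apart
    father = "CCCCCCCCCCCCCCCC"
    child = mutate(crossover(mother, father, rng), rate=0.05, rng=rng)
    print(child)
```

Run it a few times with different seeds and every 'child' comes out as a slightly different mix of the two parents, now and then carrying a letter that neither parent had. Natural selection then gets to decide which of those children stay in the gene pool.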


Life on earth, as we know it, is mysterious and quite wonderful. However, it is likely that its origin and path are no great mystery any more. There are minutiae to be worked out, granted, but I shall leave you with a small allegory by the late Douglas Adams. “Imagine that two hundred years ago you tell a private detective living in New York, 'a man who was spotted in Manhattan at 2 pm last Wednesday was spotted 4 hours later at Charing Cross, London'. He will be simply stupefied as to how the man crossed the Atlantic in just 4 hours. Now imagine if you told a modern detective the same story. He will still be baffled by the details: which flight did he board? Why did he take the train? And so on... But there is no great mystery for him about how the man crossed the Atlantic. Our perception of how life works is in the same situation”.