So... In keeping with the inconsistent, incoherent articles that I have posted on this blog for years, I have decided to venture into bioinformatics. Expect a series that will (hopefully) run for quite a long time.
Bioinformatics, as far as I can see, is the discipline that studies information-related phenomena in the biological world. Our main concern is how information is propagated through chemical sequences to create functional life forms. This series of articles will give an introduction to the domain knowledge of bioinformatics and computational biology, from what little knowledge I myself have. I do hope to pass on new knowledge as I find it.
Now then, in this article, let me lay the groundwork by giving you a rough idea of molecular biology. Please feel free to point out any inconsistencies.
Everything we call 'life', from the trees to the bees, from the
lemurs to the baiji, is the product of billions of years of
evolution. Though there have been numerous doubts about the validity
of the theory of evolution, it is no doubt a fact which has borne the
burden of proof and stood the test of time and scrutiny. The means
and methods of evolution, however, are not as crystal clear as the
phenomenon itself, and they remain one of the greatest puzzles of all
time.
To understand this, one needs to look into the basic building blocks
of life. On a whim, the uninitiated will undoubtedly point to the
animals (or species, for the more enlightened ones). The species,
while being close to a valid answer to the question of the unit of
evolution, was the source of inspiration for the naturalists up
until a couple of centuries ago. It stands to reason that a member
of a species, a living, breathing, moving, reproducing agent of life,
whether it be a plant or an animal, an insect or a mammal, would
serve as a useful entity isolated from its surroundings and
environment. Not until the invention of the microscope did this idea
change.
With the advent of the microscope, new organisms were discovered.
These were autonomous organisms which displayed the characteristics
of life on their own; they had their own metabolism, movement and
reproduction. These 'cells' were then considered the unit of living
beings, hence we call them the building blocks of life.
Cells are found in nature in a large variety. They differ from each
other by species and by function. However, it is reasonable to think
of a complex organism such as a human being as a colony or a
civilization of single-celled organisms which coexist to achieve a
single objective: survival. Though there are variations, cells
typically have the same basic components.
A typical cell contains various parts. The cell is enclosed by the
cell membrane and gets its shape from the cytoskeleton, a network of
fibres running through its interior. Filling the cell is the
cytoplasm, a gel-like substance which contains molecules vital to the
existence of the cell. Inside most cells (prokaryotes being the
exception) lies the nucleus, which holds the genetic material and
directs the activities of the cell.
A complex organism is made up of billions, even trillions, of cells.
Consider yourself, sitting and reading this essay, and scrolling your
mouse-wheel with your right index finger. The muscles in your hand
and forearm that help you move your fingers are made of separate
cells. They have their own autonomy: they can reproduce accordingly,
they can take in food, they can dispose of waste, and they have
their own mitochondria to produce energy from the food. And how do
you know your finger is on the wheel? The nerve endings in the
fingertips are extensions of nerve cells whose cell bodies sit near
your spinal cord. They also have the said autonomy, bar the capacity
to replicate readily. The red blood cells in your blood give it its
red colour, and your lungs are built of tissue whose cells can
effectively take up oxygen from the air you breathe. Your head houses
some of the most effective cells ever to have existed, which together
manage feats of computation that computers still struggle to match
(alas, all the self-awareness and narcissism come along with the
package). But a question remains: how does an organism, a colony of
cells, propagate its form and function and characteristics to a new
generation through reproduction?
To look at this, we will first address the issue of cellular
reproduction. Cells divide in two ways. They are mitosis, in which a
single cell divides into two identical cells, and meiosis, in which
a cell divides into sex cells that each carry half of its gene...
(let this wait...) characteristics, so that two of them, one from
each parent, can combine into an offspring. As an example, suppose
you scrape your arm on a sharp edge. The damaged muscle, subcutaneous
tissue and skin will eventually grow back. But if the cells
contributed to each other's division, you'd have a very weird
looking new layer of skin, one that would make you actually love the
acne scars on your face. This doesn't naturally happen (although in
the case of a deep wound, scar tissue will form, and that is a
separate phenomenon) as the cells do not contribute to the
reproduction of each other. Therefore the epidermis, dermis and
hypodermis simply grow back independently, and it serves as a
visible example of mitosis. But, as I am sure you are already aware
(I hope I won't have to go into the awkward details of how babies
are made), a child cannot be born in the same manner; you need two
cells, one from the mother and one from the father (the egg and the
sperm), to contribute to the new organism. These two cells are the
product of meiosis.
Now comes an even more intriguing question: just how exactly is the
new cell aware of what it is supposed to do, and what goes where? For
this, we come to an even smaller unit of life, the gene! The gene
as the unit of selection in evolution was proposed by several
biologists, but was popularized mainly by Prof. Richard Dawkins in
his book “The Selfish Gene”. In it, he gives a good
explanation, which I will roughly paraphrase for better insight into
the matter.
Imagine a fluid in which some molecules, which can combine and
recombine to achieve some basic tasks (though not necessarily with
purpose), are floating freely around. We will call these amino acids.
The combined components we will call proteins. Needless to say, with
a limited number of amino acids, we can create a myriad of different
proteins. Now imagine a second molecule, itself built from smaller
units, that can influence the formation of proteins from the original
amino acids. This second molecule could encode some structural
information on what the proteins should look like (or the converse of
it actually). Imagine if this molecule was a large one, which could
encode a significant amount of this structural and functional
information. All it needs to do is to encode what kind of protein
needs to be created. Such a molecule arose, beyond the shadow of a
doubt naturally, given the right conditions, and may have undergone a
lot of trial and error under natural selection, and so we have DNA.
Deoxyribonucleic acid owes its success to its own structure. It is
built from four kinds of nucleotides (adenine, guanine, cytosine and
thymine), and these can be chained together into very long strands,
so enormous amounts of information on the formation of proteins can
be stored in them. However, if you think of it as a blueprint of the
organism, the 'walls' and 'rooms' and 'hallways' of the blueprint
must also be clearly identifiable (not necessarily for an observer,
but for the protein builders, the ribosomes). These 'sections' are
what we call 'genes'. One cannot simply look at a strand of DNA and
say where one gene ends and another begins, but sections of the
strand do have definite pieces of information encoded in them.
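To make the 'storage' idea a little more concrete, here is a minimal Python sketch that treats a DNA strand as nothing more than a string over the four letters A, C, G and T. The function names and the example strand are made up for illustration, and this is only one simple way to model it.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"  # adenine, cytosine, guanine, thymine


def is_valid_dna(strand: str) -> bool:
    """Return True if the strand uses only the four nucleotide letters."""
    return all(base in NUCLEOTIDES for base in strand)


def base_counts(strand: str) -> dict:
    """Count how many times each nucleotide appears in the strand."""
    counts = Counter(strand)
    return {base: counts[base] for base in NUCLEOTIDES}


def possible_sequences(length: int) -> int:
    """Number of distinct strands of a given length: 4 choices per position."""
    return len(NUCLEOTIDES) ** length


strand = "ATGGCCATTGTAATGGGCCGC"  # made-up example, not a real gene
print(is_valid_dna(strand))
print(base_counts(strand))
print(possible_sequences(10))  # 4**10 = 1048576
```

With four choices at every position, even a strand of only ten nucleotides can take over a million different forms, which gives a feel for how much a strand millions of nucleotides long could hold.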
Now, let's see how the whole thing is done in practice. In a cell, or
more specifically in the nucleus, the DNA strands are packaged into
units called chromosomes. A chromosome is essentially a tightly wound
strand of DNA together with special proteins called histones, which
keep it in shape. Inside the nucleus there may be several
chromosomes. You typically have 46 of them and a potato has 48. The
number itself doesn't really mean anything; however, for a healthy
specimen of a species, the count is typically fixed. A person with
Down syndrome, for instance, has an additional copy of chromosome 21,
which causes the anomaly. Inside the nucleus, smaller single-stranded
copies of the DNA are made, which we call RNA. They are made in
sections, and these sections can pass through pores in the membrane
of the nucleus. The RNAs are like small prints of particular sections
of a blueprint, which you would give your gaffer for landscaping, or
your painter to finish the paint job in the first-floor bedroom. The
RNA is then read by special units called ribosomes, which create the
proteins as encoded by the RNA. These proteins then go to the
required places (many of them through the Golgi apparatus) and
fulfill their destiny. And some of these new proteins go on to build
new cells. It is pretty straightforward to imagine how mitosis will
occur with this model. But meiosis becomes more interesting.
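The DNA to RNA to protein pipeline described above can be sketched in a few lines of Python. This is a deliberately simplified model, with assumptions worth stating: the RNA is written as a straight copy of the DNA with U in place of T, reading starts at the first letter, and each protein building block is shown by its one-letter amino acid code. The table is the standard genetic code; the function names and the example strand are made up.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# One letter per amino acid; '*' marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map every RNA triplet (codon) to its amino acid, e.g. "AUG" -> "M".
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Copy a section of DNA into RNA: same letters, with U replacing T."""
    return dna.replace("T", "U")


def translate(rna: str) -> str:
    """Read the RNA three letters at a time until a stop signal appears."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


section = "ATGGCCATTGTAATGGGCCGCTGA"  # made-up example, not a real gene
rna = transcribe(section)
print(rna)
print(translate(rna))
```

The point of the sketch is the three-letter reading: four nucleotides taken three at a time give 64 combinations, more than enough to name the twenty amino acids and a stop signal.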
In meiosis, and in the fertilization that follows it, two parents
share (now we can use this term) their genetic material. Matching
chromosomes are paired up, cut (typically at random places) and
spliced back together, so the new DNA is different from that of both
parents and provides a new blueprint for a whole new specimen
(imagine taking a ninth of the blueprint of your house and eight
ninths of your sister's house, which are pretty similar to begin
with, and creating a whole new blueprint for a new house). This is
how sexual reproduction operates in a significant number of animals
and plants. However, at the creation of this new blueprint (the
crossover) there may be some random copying errors which cause the
new specimen to have characteristics which are not seen in the
parents. Such a change is called a mutation (legend has it it could
give you laser eyes and Wolverine's claws, but all I got was a
slightly heightened sense of smell and voluntary movement of my
ears!). However, if a mutation puts the specimen at a disadvantage,
it will not prevail. Imagine a herd of gazelles in a savanna,
standing about as tall as the dried grass, which makes them less
conspicuous to predators. But suppose there is one offspring gifted
with height. It will most probably be the first ambushed by a
crouching lioness, and will not propagate its 'tall' gene any
further. Similar pressures have caused the polar bears to be white
while the grizzly bears remain brown, the tropical humans to have
more sun-resistant dark skins, and the giraffes to have long necks
and the deer to have short ones. This is the beauty of natural
selection.
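The crossover, the mutation and the gazelle story can all be played out in a small Python sketch. Everything here is a toy model under stated assumptions: the strands are cut at a single random point, a mutation swaps one letter for another, and the odds of a gazelle being caught (80% for tall ones, 10% for short ones) are invented purely for illustration.

```python
import random

NUCLEOTIDES = "ACGT"


def crossover(parent_a: str, parent_b: str, rng: random.Random) -> str:
    """Splice two equal-length strands together at one random cut point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have the same length (at least 2)")
    cut = rng.randrange(1, len(parent_a))  # never 0, so both parents contribute
    return parent_a[:cut] + parent_b[cut:]


def mutate(strand: str, rate: float, rng: random.Random) -> str:
    """Swap each base for a different one with probability `rate`."""
    result = []
    for base in strand:
        if rng.random() < rate:
            result.append(rng.choice([b for b in NUCLEOTIDES if b != base]))
        else:
            result.append(base)
    return "".join(result)


def survivors(heights, grass_height, rng, p_tall=0.8, p_short=0.1):
    """Return the heights of the gazelles that were not caught.

    Assumed, made-up odds: a gazelle taller than the grass is caught with
    probability p_tall, a shorter one with probability p_short.
    """
    kept = []
    for height in heights:
        p_caught = p_tall if height > grass_height else p_short
        if rng.random() >= p_caught:
            kept.append(height)
    return kept


rng = random.Random(42)  # fixed seed so repeated runs give the same result
child = mutate(crossover("AAAAAAAAAA", "CCCCCCCCCC", rng), rate=0.1, rng=rng)
print(child)

herd = [90] * 50 + [120] * 50  # heights in cm; the grass is 100 cm tall
left = survivors(herd, grass_height=100, rng=rng)
tall = sum(h > 100 for h in left)
print(tall, "tall and", len(left) - tall, "short gazelles survive")
```

Run the survival step over a few generations, letting only the survivors breed, and the tall gazelles thin out without anyone having planned it.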
Life on earth, as we know it, is mysterious and quite wonderful.
However, it is likely that its origin and path are no great mystery
any more. There are minutiae to be worked out, granted, but I shall
leave you with a small allegory by the late Douglas Adams, roughly
paraphrased. “Imagine two hundred years ago you tell a private
detective living in New York 'a man who was spotted in Manhattan at 2
pm last Wednesday was spotted 4 hours later in Charing Cross,
London'. He will be simply stupefied as to how the man crossed the
Atlantic in just 4 hours. Now imagine if you told a modern detective
the same story. He will still be baffled by the details: which flight
did he board? Why did he take the train? And so on... But there is no
great mystery for him about how the man crossed the Atlantic. Our
understanding of how life works is in the same situation”.