Laboratory of Macromolecular Crystallography
This is a review of the work carried out in the LMC of the IMPB RAS. Information on other work in this field
may be found in the original papers listed below.
Computer processing of nucleotide sequences. First steps.
(N.L.Lunina)
In the early 70s a new direction began to emerge in computational biology: the
processing of nucleotide sequences. Data from the Heidelberg database became
readily available, and numerous publications were devoted to computer methods
for working with nucleotide sequences.
In the early 80s A.S.Kondrashov, a researcher at the Research Computing Center
(RCC), proposed developing such methods on the computers available at the RCC.
He formulated a list of requirements that a sequence processing program had to
meet, and at his suggestion the HEID system was developed. It enabled searching
the database for sequences and processing the selected sequences. In the course
of this work numerous discussions were held with researchers of the Institute
of Biochemistry and Physiology of Microorganisms (IBPM), V.V.Vel'kov and
V.V.Kryukov, and researchers of the Institute of Protein Research, L.A.Voronin
and A.V.Finkelstein. The HEID software package was a result of these
discussions.
The researchers from the IBPM used it both for working with sequences in the
database and for processing their own newly obtained sequences.
The system was initially designed for the ES 1040 and was then ported to the
SM-4 when that computer became available at the RCC. The system description was
published by ONTI NCBI in 1984. The system enabled one to find statistical
regularities in the distribution of nucleotides; to search for a site of
interest, repeats of various types, and open reading frames; to determine the
proteins encoded by a sequence; etc.
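Two of the functions listed above, the search for a site of interest and the
search for open reading frames, can be illustrated with a short Python sketch.
This is not the HEID code, which is not reproduced in this review. The function
names, the use of ATG as the only start codon, the standard stop codons
TAA/TAG/TGA, and the restriction to the given strand are all assumptions made
for illustration.

```python
# Illustrative sketch only; not the HEID implementation.
# Assumptions: DNA alphabet, ATG as the only start codon, standard stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_sites(seq, site):
    """Return 0-based start positions of every match, overlapping ones included."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def find_orfs(seq, min_codons=1):
    """Return sorted (start, end) pairs for open reading frames on the given strand.

    An ORF here runs from the first ATG in a frame to the next in-frame stop
    codon; `end` is the index just past the stop codon. `min_codons` counts
    codons from the start codon up to, but not including, the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

For example, find_orfs("CCATGAAATAGCC") reports one reading frame covering
ATG AAA TAG. A fuller treatment would also scan the reverse complement.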
To reveal statistical regularities in the distribution of nucleotides and
nucleotide pairs (purine/pyrimidine nucleotides), the system compiled tables of
nearest neighbors for combinations of two and of three nucleotides. Both the
observed and the expected neighbors were considered, to see where the
deviations from expectation were the strongest.
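A nearest-neighbor table of this kind might be sketched in Python as follows.
The review does not say how HEID computed the expected values; the sketch
assumes the simplest model, in which positions are independent and the expected
count of a combination is the product of its base frequencies times the number
of windows. The function names are hypothetical.

```python
# Illustrative sketch only; the expectation model is an assumption.
from collections import Counter


def to_purine_pyrimidine(seq):
    """Reduce a DNA sequence to purines (R: A, G) and pyrimidines (Y: C, T)."""
    return seq.translate(str.maketrans("AGCT", "RRYY"))


def neighbor_table(seq, k=2):
    """Map each observed k-letter combination to (observed, expected, ratio).

    `expected` assumes independent positions: the number of windows times the
    product of the single-letter frequencies of the combination.
    """
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return {}
    base_freq = {b: c / len(seq) for b, c in Counter(seq).items()}
    observed = Counter(seq[i:i + k] for i in range(n_windows))
    table = {}
    for kmer, obs in observed.items():
        expected = float(n_windows)
        for base in kmer:
            expected *= base_freq[base]
        table[kmer] = (obs, expected, obs / expected)
    return table
```

Calling neighbor_table(seq, k=3) gives the table for combinations of three, and
neighbor_table(to_purine_pyrimidine(seq)) gives the purine/pyrimidine version.
Combinations whose ratio is far from 1 are those that deviate most from the
expectation under this model.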
Thanks to close collaboration with biologists, the HEID system offered a set of
functions that was reasonably complete and convenient to use. Researchers from
the IBPM later acknowledged that, although new programs for sequence search in
databanks appeared in subsequent years and offered more functions, none of them
offered all the capabilities that HEID did.
In the ensuing years computer processing of nucleotide sequences became one of
the main research directions of the RCC (since renamed the Institute of
Mathematical Problems of Biology).
March 24, 2003
Publications
- Lunina, N.L. (1984). "The computer-based system HEID for the treatment of
nucleotide sequences". Software, NCBI AN SSSR, Pushchino, Russia. (In Russian)