Bioinformatics Sites

Biosciences & BioInformatics Companies & Websites
The new scientific discipline of BIOINFORMATICS (BI) is the melding of computer science and software technologies with the Human Genome Programs and proteomics. Bioinformatics covers the fields of computational molecular biology, biological databases, and genome bioinformatics. The goals of bioinformatics are to create public databases, conduct research in computational biology, develop software tools for analyzing genome data, and disseminate biomedical information for a better understanding of the molecular processes affecting human health and disease.
Description of Bioinformatics

BIOINFORMATICS...
a merger of computer technology, software algorithms, and software analysis of large biological databases.

The Human Genome Project, begun in 1990, announced in June of 2000 that a complete (yet rough) draft of the human genome had been mapped.

The human genome, estimated to contain about 100,000 genes, holds about 3 billion bases of nucleotide sequence (A, T, G, and C), or roughly 3 gigabytes of data at one byte per base. That's enough data to fill about 2,080 floppies (1.44 MB each), 30 Zip disks (100 MB each), or 1 average-size hard drive. At the rate of 1 nucleotide base per second, how long would it take you to read the entire human genome?
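The storage comparison can be checked with a few lines of Python. This is only a sketch: it assumes one byte per base and nominal decimal capacities (1,440,000 bytes per floppy, 100,000,000 bytes per Zip disk), and the names `GENOME_BYTES`, `MEDIA_BYTES`, and `disks_needed` are made up for the illustration.

```python
# Assumption: 3 x 10^9 bases, each stored as one byte (one ASCII letter).
GENOME_BYTES = 3_000_000_000

# Assumption: nominal decimal capacities, in bytes.
MEDIA_BYTES = {
    "1.44 MB floppy": 1_440_000,
    "100 MB Zip disk": 100_000_000,
}


def disks_needed(total_bytes: int, disk_bytes: int) -> int:
    """Return the number of whole disks needed to hold total_bytes."""
    # Integer ceiling division avoids floating-point rounding.
    return (total_bytes + disk_bytes - 1) // disk_bytes


if __name__ == "__main__":
    for name, capacity in MEDIA_BYTES.items():
        print(f"{name}: {disks_needed(GENOME_BYTES, capacity)}")
```

Under these assumptions the floppy count comes out at 2,084, which the text rounds to about 2,080, and the Zip disk count is exactly 30.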

	Working with genomic databases involves:
		collecting & storing data
		searching existing databases
		interpreting databases: finding drug target sequences; when genes are turned on and in which tissues

	Genomic databases contain:
		3 GB of nucleotide sequence (A, T, G, C)
		single nucleotide polymorphisms (SNPs): single-base differences from individual to individual
		proteomics data: the shapes of the proteins that genes code for, and how proteins interact
		comparisons of sequences from species to species
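To make the idea of a single-base difference concrete, here is a minimal Python sketch. It assumes the two sequences are already aligned and of equal length, which real SNP discovery cannot take for granted; the function name `single_base_differences` and the example sequences are invented for the illustration.

```python
from typing import List, Tuple


def single_base_differences(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ.

    Assumption: the sequences are pre-aligned and of equal length.
    Positions are 0-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


if __name__ == "__main__":
    # Hypothetical sequences from two individuals, differing at one base.
    individual_1 = "ATGCCGTA"
    individual_2 = "ATGTCGTA"
    print(single_base_differences(individual_1, individual_2))  # [(3, 'C', 'T')]
```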

The first bioinformatics database was GenBank (1980s; initially under the DOE, then at NIH's NCBI). When the HGP began in the 1990s, it held the initial sequence data;
today, GenBank holds some 7 billion units (bases) of DNA sequence. The data are so vast and are arriving at such a fast pace that supercomputers will be required to analyze them. Incyte Genomics, a St. Louis pharmaceutical company, can sequence 20 million bp/day, and Celera Genomics at this time (Jul 2000) has 50 terabytes (5 x 10¹³ bytes) of DNA sequence data.

Bioinformatics work often includes searching for similarities (homologies) between sequenced pieces. Software tools such as
BLAST (Basic Local Alignment Search Tool) and Entrez act as search engines (à la Yahoo) that can sift through vast amounts of sequence data rapidly.
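As an illustration of what searching for similarity between two sequences means, the sketch below scores the best local alignment using the classic Smith-Waterman dynamic-programming recurrence. This is not how BLAST is implemented (BLAST uses faster heuristics to approximate this kind of search), and the scoring values (match +2, mismatch -1, gap -2) and the function name `local_alignment_score` are assumptions chosen for the example.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best local alignment score between a and b (Smith-Waterman recurrence).

    Only the score is returned, not the alignment itself. The scoring
    parameters are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared stretch "ACGT" gives 4 matches x 2 points each.
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8
```

A higher score means a longer or cleaner shared stretch; two sequences with nothing in common score 0.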

An example:
Cathepsin K. Osteoclasts (the cells that degrade bone so that it can be replenished) are overactive in patients with osteoporosis. HGS (Human Genome Sciences) sequenced, ran homology searches, and found unique nucleotide sequences that were over-expressed in osteoclasts. These sequences coded for the enzymes called cathepsins. SmithKline Beecham is looking for a drug that will block cathepsin K's binding sites.

For the Bio-Investor - 3 classes of bioinformatics companies to invest in:
      1. Big Pharma: super-pharmaceuticals using in-house BI (Bayer, Glaxo Wellcome)
      2. Straddler companies: they sequence and data-mine (HGS, Celera, & Incyte)
      3. Specialist companies: they do unique data-mining & integration.

	3 gigabytes = 3,000 MB
	at 1 ASCII character (1 byte) per base, that is 3 x 10⁹ bases in the human genome
	3 x 10⁹ seconds / 60 (sec per min) / 60 (min per hr) / 24 (hr per day) / 365 (days per year) =
	about 95 years
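The reading-time arithmetic can be reproduced in Python. This sketch assumes a genome of 3 x 10⁹ bases, a steady rate of 1 base per second with no breaks, and 365-day years; the names `SECONDS_PER_YEAR` and `reading_time_years` are made up for the illustration.

```python
# 60 sec/min x 60 min/hr x 24 hr/day x 365 days/year (leap years ignored).
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def reading_time_years(bases: float, bases_per_second: float = 1.0) -> float:
    """Years needed to read `bases` nucleotides at a constant rate, nonstop."""
    return bases / bases_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    genome_bases = 3_000_000_000  # assumption: 3 x 10^9 bases
    print(round(reading_time_years(genome_bases), 1))  # 95.1
```

Changing `bases_per_second` shows how quickly the answer shrinks: at 10 bases per second the same genome takes about 9.5 years.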