NIG Japan Big Data Genomics

March 7, 2012

By Dan Gatti, Big Data I/O Forum

As a leading international genetics research laboratory and inter-university research institution in Japan, the National Institute of Genetics builds an international DNA database, develops and provides various search and analysis services, and provides supercomputing resources to researchers throughout Japan and the world. The newly installed SGI UV 1000 will form the backbone of these operations and serve a crucial role in next-generation sequencing data analysis.

Japan’s National Institute of Genetics, an information and systems research organization located in Mishima, Shizuoka, under the leadership of Director-General Yuji Kohara, has selected an SGI® UV™ 1000, the top model in the SGI UV series, for a new supercomputer system. Featuring 768 processor cores and 10TB of memory, the system will function as a server for next-generation sequencing data analysis.

The amount of data created by next-generation sequencers is growing exponentially. As the number of sequences that can be read multiplies, and with it the amount and size of data created at one time by next-generation sequencers, increasingly powerful computing resources are needed to handle the analytical processing of that data. This processing includes sequence assembly and mapping. Sequence assembly is the method of aligning and piecing together numerous reads (DNA fragments) to determine a genome sequence. Because it is used when sequencing an unknown genome, it is also called de novo assembly. Mapping refers to the method of determining a genome sequence by aligning reads against a reference genome whose sequence is already known.
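The difference between the two methods described above can be illustrated with a toy sketch. This is not how production assemblers or mappers work, and it is not software used at the Institute. It assumes error-free reads, exact matching, and a simple greedy overlap merge, and the function names (`overlap`, `greedy_assemble`, `map_reads`) are hypothetical names chosen for illustration.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that equals a
    prefix of `b`, considering only overlaps of at least `min_len`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy de novo assembly: repeatedly merge the pair of reads with the
    largest suffix/prefix overlap until no overlaps remain.
    Assumes error-free reads; no reference genome is used."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # remaining reads do not overlap; leave them as contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


def map_reads(reference, reads):
    """Toy mapping: locate each read in a known reference by exact match.
    Returns the 0-based start position, or -1 if the read is not found."""
    return {read: reference.find(read) for read in reads}
```

Calling `greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"])` rebuilds a single contig from the overlapping reads alone, while `map_reads("ATGCGTACCTGA", ["CGTACC"])` only needs to look up where each read falls in the known sequence. The all-against-all comparison in the assembly step hints at why de novo assembly demands far more memory and compute than mapping as the number of reads grows.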

The SGI UV 1000 adopted by the National Institute of Genetics as a pipeline server for next-generation sequencing analysis is a large-scale coherent shared memory server with 768 processor cores, powered by the Intel® Xeon® processor E7 family, and 10TB of memory. Certain sequencing data analysis processes, particularly de novo assembly programs, require vast amounts of memory, more than distributed parallel clusters can typically offer today. Anticipation at the Institute is growing around the SGI UV 1000, which, as an analysis server for de novo assembly programs, is the world’s only server to date that includes a massive 10TB of shared memory (scalable to 16TB) in a single system.
