UCSD: Gordon Tackling Data-Intensive Applications

May 28, 2011

Data-intensive applications are creating a data tsunami that demands new architectures and new ideas. The good folks down at UCSD have received a $20 million grant from the NSF to attack the data-intensive application challenge. Gordon, a.k.a. “Flash Gordon,” is a supercomputer built on SSD flash memory and virtual shared memory.

Gordon will have 250 trillion bytes (250 TB) of flash memory and 64 I/O nodes, more than enough to handle massive databases while providing speeds up to 100 times faster than hard disk drive systems.
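That "up to 100 times" figure is about random-access latency rather than raw bandwidth. As a back-of-the-envelope illustration (the latency numbers below are typical published figures assumed for this sketch, not Gordon's measured specs), a few lines of Python show why flash wins so decisively on small random reads:

    # Rough model: a latency-bound workload is limited by access time per read.
    # Latencies are typical ballpark figures, assumed here for illustration.
    hdd_latency_s = 7e-3      # ~7 ms average seek + rotational delay for a hard disk
    ssd_latency_s = 0.1e-3    # ~0.1 ms random-read latency for early-2010s flash

    hdd_iops = 1 / hdd_latency_s   # ~143 random reads per second
    ssd_iops = 1 / ssd_latency_s   # ~10,000 random reads per second

    print(f"HDD: ~{hdd_iops:.0f} IOPS, SSD: ~{ssd_iops:.0f} IOPS")
    print(f"Speedup on latency-bound reads: ~{ssd_iops / hdd_iops:.0f}x")

With these assumed latencies the model gives roughly a 70x advantage, in the same ballpark as the "up to 100 times" claim.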

The San Diego Supercomputer Center (SDSC) at the University of California, San Diego, today announced plans to establish a new partnership to bring together industry and university research to investigate a pressing business and information technology challenge: how to address both the management and technical aspects of “big data,” or the accumulation of massive data sets requiring new approaches for large-scale data storage and business analytics.

The project, called the Center for Large-scale Data Systems Research (CLDS), formally begins operations this fall and will also be home to the ongoing How Much Information? (HMI?) research program, which released a new report this week at the Storage Networking World (SNW) Spring 2011 conference in Santa Clara, Calif.

The latest report by HMI?, a consortium led by UC San Diego and previously based at the university’s School of International Relations and Pacific Studies, analyzes the growth of “big data” in companies. The authors found that the world’s installed base of computer servers processed almost ten million million gigabytes of information in 2008, which works out to nearly 10 to the 22nd power bytes. Full details are available in the report.
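For readers keeping track of the units, the arithmetic behind that figure is straightforward (using decimal gigabytes):

    # "Ten million million gigabytes" = 10^13 GB; a decimal GB is 10^9 bytes.
    gigabytes_processed = 10**13
    bytes_per_gigabyte = 10**9
    total_bytes = gigabytes_processed * bytes_per_gigabyte
    print(f"{total_bytes:.1e} bytes")   # 1.0e+22, i.e. 10 to the 22nd power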

“We are entering an era of data-intensive computing, where all of us – academia, industry, and government – will be faced with organizing, analyzing, and drawing meaningful conclusions from unprecedented amounts of data, and doing so in a cost- and performance-effective manner,” said Michael Norman, SDSC’s director.

SDSC recently announced the startup of two data-intensive computing systems, Dash and Trestles. Those systems will be followed later this year by a significantly larger system called Gordon, which will be the first supercomputer to employ large amounts of flash memory to help speed solutions to computing problems now limited by higher-latency spinning disk technology. When deployed, Gordon should rank among the top 100 supercomputers in the world, capable of performing latency-bound file reads 10 times faster and more efficiently than any high-performance computing system today.
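To get a feel for the latency-bound reads described here, the sketch below is a minimal random-read microbenchmark in Python. It is illustrative only: the file paths and the helper are hypothetical, and on a real system the file must be far larger than RAM (or opened with direct I/O) so reads actually hit the device rather than the OS page cache.

    import os
    import random
    import time

    def random_read_iops(path, block_size=4096, num_reads=1000):
        """Time random block-sized reads from `path` and return reads/second.

        Hypothetical helper for illustration; results are only meaningful
        if reads bypass or overwhelm the OS page cache.
        """
        size = os.path.getsize(path)
        with open(path, "rb", buffering=0) as f:   # unbuffered binary reads
            start = time.perf_counter()
            for _ in range(num_reads):
                f.seek(random.randrange(0, size - block_size))
                f.read(block_size)
            elapsed = time.perf_counter() - start
        return num_reads / elapsed

    # Hypothetical mount points for a spinning disk and a flash device:
    # print(f"HDD: {random_read_iops('/mnt/hdd/testfile'):.0f} IOPS")
    # print(f"SSD: {random_read_iops('/mnt/ssd/testfile'):.0f} IOPS")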

“It is new technology such as SDSC’s flash memory-based systems that is changing how science and research will be done in the Information Age,” added Norman. “CLDS will serve as a laboratory that will put us on the leading edge of adaptation and integration of technologies such as this, and explore the multi-faceted challenge of working with big data in collaboration with academic and industry partners.”
 
In addition to serving as the host site for ongoing HMI? research, CLDS will test and evaluate new trends in cloud-based storage systems, examining the cloud computing principles of “on-demand, elasticity, and scalability” in the context of large-scale storage requirements. Research will include exploration of new storage architectures and benchmark development.

“Establishing CLDS at SDSC is a natural fit,” said Chaitan Baru, an SDSC distinguished scientist and director of the new project, adding that the center will be structured as an industry-university consortium. “SDSC is recognized for its expertise in the development of systems for storing, managing, and analyzing ‘big data.’ Our goal here is to understand how new technologies will change the way we work in this data-rich age.”

CLDS will also serve as a key resource for strengthening analytical and research relationships, fostering industry partnerships and exchanges through individual and group research projects, and supporting industry forums and other professional education programs.

“Integrating management, economic and technical analysis is what all companies will need in the world of ‘big data’ and even bigger analytics,” said James Short, research director of the HMI? program and lead scientist for the CLDS project. “SDSC offers a rich environment for integrating management analysis with both applied and theoretical computer science for research in large-scale data systems.”

Funding for the new center will come from a combination of industry, foundation, and government grants. Industry inquiries may be directed to Ron Hawkins, SDSC’s director of industry relations, at (858) 534-5045 or rhawkins@sdsc.edu.
