LexisNexis HPCC Systems 10 years old
June 15, 2011 – New York – LexisNexis Risk Solutions today announced that it will offer its data intensive supercomputing platform under a dual license, open source model, as HPCC Systems. HPCC Systems is designed for the enterprise to solve big data problems. The platform is built on top of high performance computing technology, and has been proven with customers for the past decade. HPCC Systems provides a high performance computing cluster (HPCC) technology with a single architecture and a consistent data centric programming language. HPCC Systems is an alternative to Hadoop.
“We feel the time is right to offer our HPCC technology platform as a dual license, open source solution. We believe that HPCC Systems will take big data computing to the next level,” said James M. Peck, chief executive officer, LexisNexis Risk Solutions. “We’ve been doing this quietly for years for our customers with great success. We are now excited to present it to the community to spur greater adoption. We look forward to leveraging the innovation of the open source community to further the development of the platform for the benefit of our customers and the community,” said Mr. Peck.
To manage, sort, link, and analyze billions of records within seconds, LexisNexis developed a data intensive supercomputer that has been proven for the past ten years with customers who need to process large volumes of data. Customers such as leading banks, insurance companies, utilities, law enforcement and federal government leverage the HPCC platform technology through various LexisNexis® products and services. The HPCC platform specializes in the analysis of structured and unstructured data for enterprise class organizations.
“In the next few years, open source solutions will be leveraged in increasingly mission-critical deployments. In addition, we will start to see a trend where more and more companies open source some of their proprietary intellectual property (IP) in efforts to accelerate development of their technology platforms and applications,” said Mark Driver, vice president and research director at Gartner. “For companies that own their IP and move to an open source model, properly managed open-source assets drive positive return on investment through flexibility, innovation and cost optimization. These three work in confluence to increase value, strengthen competitive opportunities and reduce costs,” said Mr. Driver.
HPCC Systems (www.hpccsystems.com) will be managed by Armando Escalante, who will also continue in his role as the senior vice president and chief technology officer of LexisNexis Risk Solutions. Mr. Escalante is responsible for technology development, R&D, information systems, security, and operations and has been leading the LexisNexis development team that built the HPCC platform for the last ten years.
HPCC Systems will initially release a virtual machine for the community to test, in addition to documentation and training. Full binaries will be released in several weeks and the source code will be released in a few more weeks after the binaries. HPCC Systems will have two offerings: the Community Edition which includes free platform software with community support, and the Enterprise Edition, which includes platform software with enterprise class support. Enterprise Edition customers will also have the option to acquire advanced modules and features.
LexisNexis Risk Solutions will not release its data sources, data products, the unique data linking technology, or any of the linking applications that are built into its products. These assets will remain proprietary and will not be released as open source.
HPCC Systems can process, analyze, and find links and associations in high volumes of complex data significantly faster and more accurately than current technology systems. The platform scales linearly from tens to thousands of nodes, handling petabytes of data and supporting millions of transactions per minute. HPCC Systems comprises a single architecture, a consistent data-centric programming language, and two processing platforms: the Thor Data Refinery Cluster and the Roxie Rapid Data Delivery Cluster.
The core of the technology platform is the Enterprise Control Language (ECL), which is a declarative, data-centric programming language optimized for large-scale data management and query processing. The expressiveness of the language provides for increased productivity by enabling data analysts and developers to define “what” they want to do with their data instead of giving the system step-by-step instructions. As a result, developers can express complex queries and transformations with less programming time and fewer lines of code than other conventional programming languages. ECL specifications will be released under a Creative Commons license, which makes it easy for third parties to use, implement and contribute to the language.
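The contrast between stating "what" is wanted and spelling out step-by-step instructions can be illustrated with a small sketch. The example below is written in Python, not ECL, and is only an analogy: the records, field names, and function names are hypothetical and do not come from any LexisNexis product or from the ECL specification.

```python
from collections import Counter

# Hypothetical sample records, invented for this illustration.
RECORDS = [
    {"name": "Ann", "state": "NY"},
    {"name": "Bob", "state": "FL"},
    {"name": "Cy", "state": "NY"},
]


def count_by_state_stepwise(records):
    """Step-by-step style: spell out how to loop and accumulate."""
    counts = {}
    for rec in records:
        state = rec["state"]
        if state not in counts:
            counts[state] = 0
        counts[state] += 1
    return counts


def count_by_state_declarative(records):
    """Declarative style: state what is wanted (a count per state)."""
    return dict(Counter(rec["state"] for rec in records))
```

Both functions return the same per-state counts; the second leaves the looping and bookkeeping to the library, which is the kind of shift in emphasis the paragraph above attributes to ECL.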
The Thor Data Refinery Cluster is responsible for ingesting vast amounts of data, transforming, linking and indexing that data, with parallel processing power spread across the nodes. The Roxie Rapid Data Delivery Cluster provides highly scalable, high-performance online query processing and data warehouse capabilities.
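The division of labor described above, with a batch stage that prepares and indexes data and an online stage that answers queries from the prepared result, can be sketched in a few lines. This is a single-process Python illustration of the general pattern only, under the assumption that an in-memory dictionary stands in for an index; it is not how Thor or Roxie are implemented, and all names are hypothetical.

```python
from collections import defaultdict


def build_index(records, key):
    """Batch step (the role the text assigns to Thor): group records by a key."""
    index = defaultdict(list)
    for rec in records:
        index[rec[key]].append(rec)
    return dict(index)


def lookup(index, value):
    """Online step (the role the text assigns to Roxie): answer from the prebuilt index."""
    return index.get(value, [])


# Hypothetical usage: index once in batch, then serve many lookups.
people = [
    {"id": 1, "city": "NY"},
    {"id": 2, "city": "LA"},
    {"id": 3, "city": "NY"},
]
city_index = build_index(people, "city")
matches = lookup(city_index, "NY")
```

The point of the split is that the expensive work of scanning and grouping happens once, so each query afterwards is a cheap lookup.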