EMC Greenplum Introduces Free Community Edition of 'Big Data' Tools for Developers and Data Scientists
Industry-Wide Data Collaboration and Innovation is Enabled by Free EMC Greenplum Database Edition, Open Source Analytic Algorithms from MADlib, and Alpine Miner Visual Modeler
SANTA CLARA, Calif., Feb. 1, 2011 /PRNewswire/ -- Strata 2011 -- EMC Corporation (NYSE: EMC), the world leader in information infrastructure solutions, today introduced a free Community Edition of the EMC® Greenplum® Database, the industry-leading, high-performance massively parallel processing (MPP) database product, along with free analytic algorithms and data mining tools. The announcement is being made today at the 2011 O'Reilly Strata Conference (Feb 1-3, 2011) in Santa Clara, CA, where Scott Yara, vice president, EMC Data Computing Products Division, will speak. Free downloads are immediately available at http://community.greenplum.com.
Building on earlier Greenplum "Big Data" breakthroughs such as the EMC Greenplum Data Computing Appliance, the new EMC Greenplum Community Edition removes the cost barrier to entry for big data power tools, empowering large numbers of developers, data scientists, and other data professionals. This free set of tools enables the community not only to better understand their data, gain deeper insights and better visualize those insights, but also to contribute to and participate in the development of next-generation tools and solutions. With the Community Edition stack, developers can build complex applications to collect, analyze and operationalize big data, leveraging best-of-breed big data tools including the Greenplum Database with its in-database analytic processing capabilities.
"Our new Community Edition provides a parallel-everything "Big Data" stack with unequaled speed that enables analysts to perform next-generation data analytics and experiment with real-world data, and most importantly -- innovate," explained Luke Lonergan, CTO and vice president, EMC Data Computing Products Division and co-founder of Greenplum. "This project is about empowering developers – they can program using the most popular tools and they have a place to contribute open source extensions to the stack."
The free EMC Greenplum Community Edition includes:
- Greenplum Database CE, an industry-leading massively parallel processing (MPP) database product for large-scale analytics and next-gen data warehousing.
- MADlib, the open source analytic algorithms library, providing data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data.
- Alpine Miner, an up-and-coming third-party analytics tool: an intuitive visual data mining modeler that delivers rapid "modeling to scoring" capabilities, leverages in-database analytics, and is purpose-built for "big data" applications.
Community Benefits
The initial release of the EMC Greenplum Community Edition is designed for both first-time users and experienced Greenplum customers. First-time users gain access to a comprehensive, purpose-built business analytics environment in which they can view, modify and enhance the included demo data files and experiment with "Big Data" analytical tools within the Greenplum Database. Existing users can download an upgraded version of Greenplum Database CE and analytic tools for integration into their development and research environments.
The Community Edition can be downloaded as a pre-configured VMware virtual appliance for use on laptops and desktops, or as a set of packages for deployment on user machines. All users are free to participate in the new Greenplum Community Forums to get support, collaborate, post ideas, and test enhancements developed independently by other users.
Availability
Starting February 1, 2011, the EMC Greenplum Community Edition can be downloaded free of charge from http://community.greenplum.com. Regular Community Edition updates will be made available online. The Community Edition is intended for experimentation, development and research purposes only. Current Single-Node Edition users can deploy the new Community Edition in their single-node production environments. Greenplum commercial licenses must be purchased prior to using code for internal data processing or for any commercial or production purpose.
About the MADlib Environment
MADlib (magnetic, agile and deep) is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data. MADlib is designed to foster widespread development of scalable analytic skills, by harnessing efforts from commercial practice, academic research and open-source development.
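As a rough illustration of what a "data-parallel implementation" of a statistical method can look like, the Python sketch below fits a simple linear regression by computing small summary statistics on each data partition and then merging them. This is not MADlib code or the MADlib API; the names `Stats`, `fit_partition` and `solve` are hypothetical, and the partitions are plain Python lists standing in for portions of a table.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Stats:
    """Sufficient statistics for simple linear regression y = a*x + b."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "Stats") -> "Stats":
        # Merging is a plain sum, so partitions can be processed in any order.
        return Stats(self.n + other.n, self.sx + other.sx, self.sy + other.sy,
                     self.sxx + other.sxx, self.sxy + other.sxy)


def fit_partition(points) -> Stats:
    """Summarize one partition; this step needs no data from other partitions."""
    stats = Stats()
    for x, y in points:
        stats = stats.merge(Stats(1, x, y, x * x, x * y))
    return stats


def solve(stats: Stats):
    """Turn the merged statistics into (slope, intercept)."""
    denom = stats.n * stats.sxx - stats.sx ** 2
    slope = (stats.n * stats.sxy - stats.sx * stats.sy) / denom
    intercept = (stats.sy - slope * stats.sx) / stats.n
    return slope, intercept


# Hypothetical data: y = 2x + 1, split across three partitions.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
partitions = [points[0:3], points[3:7], points[7:10]]

merged = reduce(Stats.merge, (fit_partition(p) for p in partitions))
slope, intercept = solve(merged)
```

Because each partition is reduced to five numbers before anything is combined, the amount of data exchanged between workers stays constant no matter how many rows each partition holds.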
About EMC Greenplum Database
EMC Greenplum Database utilizes a shared-nothing massively parallel processing (MPP) architecture designed from the ground up for business intelligence and analytical processing on commodity hardware. Data is automatically partitioned across multiple segment servers, and each segment owns and manages a distinct portion of the overall data. This 'shared-nothing' architecture means that all communication is done through a network interconnect and there are no disk-level sharing or contention issues to address. More information about Greenplum Database is available at www.greenplum.com/products/greenplum-database.
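The partitioning idea described above can be sketched in a few lines of Python. This is a conceptual illustration only, not Greenplum's actual distribution algorithm: the CRC32 hash, the segment count of four, and the names `segment_for`, `distribute` and `parallel_sum` are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

NUM_SEGMENTS = 4  # hypothetical number of segments


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution key to a segment index (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, key_field, num_segments=NUM_SEGMENTS):
    """Assign every row to exactly one segment based on its key."""
    segments = defaultdict(list)
    for row in rows:
        seg = segment_for(str(row[key_field]), num_segments)
        segments[seg].append(row)
    return segments


def parallel_sum(segments, value_field):
    """Each segment sums only its own rows; only the partial sums are combined."""
    partials = {seg: sum(r[value_field] for r in seg_rows)
                for seg, seg_rows in segments.items()}
    return sum(partials.values())


# Hypothetical table of 100 rows keyed by customer_id.
rows = [{"customer_id": i, "amount": i * 10} for i in range(1, 101)]
segments = distribute(rows, "customer_id")
total = parallel_sum(segments, "amount")
```

Each row lives in exactly one segment, so no segment needs to read another segment's storage; in a real cluster the partial results would travel over the network interconnect rather than through a shared dictionary.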
About EMC
EMC Corporation (NYSE: EMC) is the world's leading developer and provider of information infrastructure technology and solutions that enable organizations of all sizes to transform the way they compete and create value from their information. Information about EMC's products and services can be found at www.EMC.com.
EMC and Greenplum are trademarks or registered trademarks of EMC Corporation in the U.S. and other countries. All other trademarks are the property of their respective owners.
This release contains "forward-looking statements" as defined under the Federal Securities Laws. Actual results could differ materially from those projected in the forward-looking statements as a result of certain risk factors, including but not limited to: (i) adverse changes in general economic or market conditions; (ii) delays or reductions in information technology spending; (iii) our ability to protect our proprietary technology; (iv) risks associated with managing the growth of our business, including risks associated with acquisitions and investments and the challenges and costs of integration, restructuring and achieving anticipated synergies; (v) fluctuations in VMware, Inc.'s operating results and risks associated with trading of VMware stock; (vi) competitive factors, including but not limited to pricing pressures and new product introductions; (vii) the relative and varying rates of product price and component cost declines and the volume and mixture of product and services revenues; (viii) component and product quality and availability; (ix) the transition to new products, the uncertainty of customer acceptance of new product offerings and rapid technological and market change; (x) insufficient, excess or obsolete inventory; (xi) war or acts of terrorism; (xii) the ability to attract and retain highly qualified employees; (xiii) fluctuating currency exchange rates; and (xiv) other one-time events and other important factors disclosed previously and from time to time in EMC's filings with the U.S. Securities and Exchange Commission. EMC disclaims any obligation to update any such forward-looking statements after the date of this release.
SOURCE EMC Corporation