CTOlabs.com

White Paper: Datameer's User-Focused Big Data Solutions


May 2012

A White Paper providing context and guidance you can use

Inside:

Overview of the Big Data Framework
Datameer's Approach
Consideration for Deployment

Datameer: Bringing Big Data To All


This paper, produced by the analysts and researchers of CTOlabs.com, provides an overview of Datameer, one of the pioneers of the Big Data movement. Datameer provides end-user-focused capabilities that enable self-service and real-time interaction with data.

Executive Summary
Datameer provides a complete analytics platform built around its users. Users see familiar interfaces and interactive visualizations that are easy to manipulate. They are supported by a back end that integrates all of the enterprise's data resources. The result: powerful Big Data solutions are provided to users in a way that lets them interact directly with data instead of being forced to work through development teams.

Background
Apache Hadoop is the leading open-source Big Data platform, with an ecosystem of software to inexpensively store, process, and analyze almost any type of information from any source. Hadoop is renowned for its ability to work on commodity hardware, and it enables fast, distributed analysis running in parallel on multiple servers in a cluster. Hadoop is reliable, managing and healing itself. It scales linearly, working as well with one terabyte of data across three nodes as it does with petabytes of data across thousands. It is affordable, costing much less per terabyte to store and process data than traditional alternatives. And it is agile, allowing users to load raw data into the system and apply a schema-on-read approach, in which structure is imposed on the data when it is queried rather than when it is stored.
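The schema-on-read idea described above can be illustrated with a minimal sketch. This is plain Python rather than Hadoop code, and the record fields, the RAW_EVENTS sample, and the read_with_schema helper are all hypothetical names invented for illustration. The point it shows is that raw data is stored untouched and each query decides which fields and types it cares about.

```python
import json

# Raw records are kept exactly as they arrived; nothing is validated at load time.
# (Hypothetical sample data for illustration only.)
RAW_EVENTS = [
    '{"user": "alice", "bytes": "1200", "ts": "2012-05-01T10:00:00"}',
    '{"user": "bob", "bytes": "300"}',
    '{"user": "alice", "bytes": "500", "agent": "curl"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) only when the data is read."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields missing from a record become None instead of failing the read.
        yield {
            field: convert(record[field]) if field in record else None
            for field, convert in schema.items()
        }


# Two different questions impose two different schemas on the same raw data.
usage_view = list(read_with_schema(RAW_EVENTS, {"user": str, "bytes": int}))
agent_view = list(read_with_schema(RAW_EVENTS, {"user": str, "agent": str}))
```

Because the stored records never change, a new question only needs a new schema at read time, not a reload of the data.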

The Challenges of Hadoop


Enterprises face many common challenges when implementing Hadoop. Until the arrival of end-user-focused solutions like Datameer, Hadoop clusters had to be integrated, programmed, and queried by programming specialists or data scientists. This works for firms like Google, LinkedIn, Facebook, Twitter and others that can hire scores of computer scientists and data engineers, but in most firms the analyst needs the help of additional programmers to build a query. Datameer's approach has changed this by bringing powerful graphical user interface tools and easy-to-implement Big Data solutions directly to the user. Businesses and government agencies can rely on Datameer to tap into all data sources while presenting users who are unfamiliar with Hadoop programming paradigms with familiar tools. The analysts who can benefit the most from Big Data now have a tool purpose-built for their needs.

A White Paper for the Government IT Community

The Datameer Approach


Datameer was designed to let analysts and other Big Data end users benefit from Hadoop. Datameer is the first business intelligence and analytics platform built natively on Hadoop to allow for end-user analysis and correlation of structured, semi-structured and unstructured data of any size. Datameer runs on all major Hadoop distributions and integrates easily into existing IT infrastructure with point-and-click deployment. Datameer can be easily deployed over any Hadoop cluster, whether in-house or in public cloud environments like those at Amazon or Rackspace. Datameer easily integrates with legacy technologies and data streams, including existing business intelligence data warehouses, transactional databases and other analytic stores. It also works with newer NoSQL technologies.

With Datameer, you can integrate, analyze, and visualize data of any volume, variety, and velocity, enabling numerous Big Data use cases. It works well for large-scale data mining and text analysis because it can import massive amounts of data in parallel. Datameer can find correlations across structured and unstructured data such as phone records, social media, and text for pattern detection by joining any type of data at any size. Datameer can also be helpful in cyber security monitoring, for example by importing large numbers of log files from disparate servers and analyzing them together for anomalous behavior.
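To make the log-analysis use case concrete, here is a minimal sketch in plain Python. It is not Datameer functionality: the log format, the server names, the SERVER_LOGS sample, and the threshold are all assumptions made for illustration. It shows why combining logs matters: an address that looks harmless on each server alone stands out once the counts are merged.

```python
import re
from collections import Counter

# Hypothetical log lines from two servers, in an assumed format.
SERVER_LOGS = {
    "web01": [
        "2012-05-01 10:00:01 LOGIN_FAILED user=root ip=10.0.0.7",
        "2012-05-01 10:00:05 LOGIN_OK user=alice ip=10.0.0.3",
    ],
    "web02": [
        "2012-05-01 10:00:02 LOGIN_FAILED user=admin ip=10.0.0.7",
        "2012-05-01 10:00:09 LOGIN_FAILED user=root ip=10.0.0.7",
        "2012-05-01 10:01:00 LOGIN_FAILED user=bob ip=10.0.0.9",
    ],
}

# Captures the source address of a failed login in the assumed format.
FAILED_LOGIN = re.compile(r"LOGIN_FAILED .*ip=(\S+)")


def failed_logins_by_ip(logs_by_server):
    """Count failed logins per source address across all servers combined."""
    counts = Counter()
    for lines in logs_by_server.values():
        for line in lines:
            match = FAILED_LOGIN.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts


def suspicious_ips(counts, threshold):
    """Return addresses whose combined failure count meets the threshold."""
    return sorted(ip for ip, n in counts.items() if n >= threshold)


counts = failed_logins_by_ip(SERVER_LOGS)
flagged = suspicious_ips(counts, threshold=3)
```

In this sample, no single server sees three failures from one address, so only the merged view crosses the threshold.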

Datameer and Zvents


Zvents is the leading online platform for the discovery and promotion of local entertainment, including concerts, movies, restaurants, theaters, and more. We connect 35 million monthly uniques with over 140,000 local promoters, via a network of over 300 branded media partners, creating the largest local entertainment audience on the Web. Data is critical to any web business, and gaining rapid insights in the fast-moving world of live events is even more critical. Datameer has enabled us to leverage our considerable investments in Big Data technology, including Hadoop and Hypertable, to rapidly discover actionable business insights that enable us to better serve our users and our advertisers. Datameer has given us a scalable, flexible, cost-effective way to structure and analyze terabytes of behavioral click data, driving new product initiatives like Top 40, our new trending list of hot events at top40.zvents.com.

Ethan Stock, CEO and Founder, Zvents Inc.

Datameer can do all of this because it is a complete analytics platform that supports data integration, analysis, visualization, and security while focusing on the data analysts and scientists who turn raw information into intelligence. To bring data of all sorts together, Datameer provides wizard-based data integration with over 20 prebuilt connectors. These provide immediate access to common data sources, including relational databases such as Teradata, Greenplum, Vertica, Oracle, DB2, Microsoft SQL Server, and MySQL, along with file formats and feeds such as CSV, fixed-length files, JSON, XML, Mbox, Apache log files and Twitter. Datameer also has connectors for the Hadoop Distributed File System and the Hadoop data stores Hive and HBase.

Datameer and Nurago


As a result of using Datameer, nurago is better able to help our clients identify and analyze patterns in behavioral data of panel members. Datameer helps us as a market research vendor to scale for our most granular data requirements and greatly simplifies the integration of multiple sources. In addition, Datameer makes reporting on big data analytics directly accessible to our analysts so that they don't need to turn to developers for their requirements.

Nikolaus Pohle, CTO of nurago

For The Analyst


For analysis, Datameer provides a familiar spreadsheet user interface that requires no programming to design end-to-end data processing pipelines. Datameer provides over 200 pre-built functions for exploring and discovering complex relationships. These range from basics such as aggregation to advanced capabilities for text analysis, mathematics, bioinformatics, engineering and statistics.

Once users integrate and analyze their data, they can visualize the results using simple drag-and-drop wizards for creating visualizations and dashboards. An extensive library of widgets, including tables, charts, graphs, and maps, lets users choose the visualization that will best help them understand the results.

With Datameer, analysts and data scientists can focus on what they do best, getting insights from data, instead of writing code. Datameer automatically compiles a workbook of spreadsheets into efficient Hadoop MapReduce execution plans; it then monitors their progress, status, and throughput to detect problems. For users who want to go deeper, it offers open APIs for custom data integration, analytics and visualizations.
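To give a sense of what a MapReduce execution plan does with a spreadsheet-style operation, the sketch below expresses a "group by one column, sum another" aggregation as map, shuffle, and reduce phases. It is an illustration of the general MapReduce pattern in plain Python on one machine, not Datameer's compiler or its generated code; the ROWS sample and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical spreadsheet rows for illustration only.
ROWS = [
    {"region": "east", "sales": 100},
    {"region": "west", "sales": 250},
    {"region": "east", "sales": 50},
]


def map_phase(rows, key_col, value_col):
    """Emit one (key, value) pair per row, as a mapper would."""
    for row in rows:
        yield row[key_col], row[value_col]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's values into a single total, as a reducer would."""
    return {key: sum(values) for key, values in groups.items()}


def group_sum(rows, key_col, value_col):
    """Chain the three phases into one aggregation."""
    return reduce_phase(shuffle_phase(map_phase(rows, key_col, value_col)))


totals = group_sum(ROWS, "region", "sales")
```

On a real cluster the map and reduce phases run in parallel across many nodes, which is the part a user of a spreadsheet front end never has to write.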

A White Paper For The Federal IT Community

As a result, Datameer provides powerful, agile analytics to support your organization's mission. Adding new data sources is quick and easy. Because it is built on Hadoop, Datameer inherits Hadoop's scalable storage and computation and does not require pre-defined data models, so usage is not constrained by up-front system design. By focusing on the end user, Datameer also eliminates the deep technical expertise that Hadoop otherwise demands of its users. This lets any analyst, from any domain, at any site, and with any skill set, contribute through Big Data analytics in spreadsheets that can be accessed and updated instantly worldwide. Lastly, by simplifying the process and removing the IT bottleneck, Datameer removes limitations in time-to-trigger, letting users develop and run Hadoop-based analytics jobs in minutes.

Datameer and Attributor


Attributor's selection of Datameer was driven by our need to quickly provide analytics to our clients. Datameer's ease of use, seamless integration with Cloudera's CDH, HBase and MySQL, and ability to correlate structured and unstructured data on day one have already saved us both time and money in running thousands of analytics jobs for our users.

Matt Robinson, President and COO, Attributor

Datameer in Government
Government agencies have been leveraging the Big Data movement to directly support many government missions. Agencies have been using Big Data approaches in missions supporting healthcare, education, environmental research, law enforcement, defense, intelligence and numerous other activities. Early adopters in these communities have been leading the way in open source solutions and contributions back to the broader community. Solutions have been deployed throughout the federal space, including on most public-facing government web properties. The initial government foray into Big Data has in many ways mirrored the Big Data movement in industry. Even today, most government missions can be served only by teams of data scientists and engineers. Few if any user-centered Big Data approaches are in use in the government. We believe that is about to change. With Datameer available to every government knowledge worker through a browser, citizen service and mission support will be delivered in new, highly efficient and effective ways.

Concluding Thoughts
Government agencies have already established visions and goals for Big Data approaches to serve their missions, and Datameer has a proven, user-focused approach to leveraging all organizational data for analysis. For these reasons we believe Datameer is poised for rapid growth in the federal sector. A logical step for most government agencies, and for the systems integrators, architects and engineers that support them, is to begin a proof of concept to see firsthand how Datameer can work in their environment.

More Reading
For more on federal Big Data technology and policy issues, visit:
CTOvision.com - A blog for enterprise technologists with a special focus on Big Data.
CTOlabs.com - A reference for research and reporting on all IT issues.
Carahsoft.com - Offering Big Data solutions for Government.
GovernmentBigDataForum.com - Join the Government Big Data Forum.
J.mp/ctonews - Sign up for the Government Big Data Newsletter.
Datameer.com - Visit for more on how Datameer works and to arrange a proof of concept.

About the Authors


Bob Gourley is CTO and founder of Crucial Point LLC and editor in chief of CTOvision.com. He is a former federal CTO. His career included service in operational intelligence centers around the globe, where his focus was operational all-source intelligence analysis. He was the first director of intelligence at DoD's Joint Task Force for Computer Network Defense, served as director of technology for a division of Northrop Grumman and spent three years as the CTO of the Defense Intelligence Agency. Bob serves on numerous government and industry advisory boards. Contact Bob at bob@crucialpointllc.com

Alexander Olesker is a technology research analyst at Crucial Point LLC, focusing on disruptive technologies of interest to enterprise technologists. He writes at http://ctovision.com. Alex is a graduate of the Edmund A. Walsh School of Foreign Service at Georgetown University with a degree in Science, Technology, and International Affairs. He researches and writes on developments in technology and government best practices for CTOvision.com and CTOlabs.com, and has written numerous white papers on these subjects. Alex has worked or interned in early childhood education, private intelligence, law enforcement, and academia. He has contributed to numerous publications on technology, international affairs, and security, and has lectured at Georgetown and in the Netherlands. Alex is also the founder and primary contributor of an international security blog that has been quoted and featured by numerous pundits and by the War Studies blog of King's College London. Alex is a fluent Russian speaker and proficient in French. Contact Alex at AOlesker@crucialpointllc.com

For More Information


If you have questions or would like to discuss this report, please contact me. As an advocate for better IT in government, I am committed to keeping the dialogue open on technologies, processes and best practices that will keep us moving forward.

Contact: Bob Gourley, bob@crucialpointllc.com, 703-994-0549

All information/data 2011 CTOLabs.com.
