Wednesday, November 26, 2014

Top tools for taming big data

Top-flight reporting, analysis, visualization, integration, and development tools that help you harness Hadoop



Big data tools: Pentaho Business Analytics

Pentaho is another software platform that began as a report-generating engine; like JasperSoft, it is branching into big data by making it easier to absorb information from the new sources. You can hook up Pentaho's tool to many of the most popular NoSQL databases, such as MongoDB and Cassandra. Once the databases are connected, you can drag and drop the columns into views and reports as if the information came from SQL databases.
I found the classic sorting and sifting tables to be extremely useful for understanding just who was spending the most time on my website. Simply sorting by IP address in the log files revealed what the heavy users were doing.
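The same idea can be tried outside any reporting tool. What follows is only a minimal sketch in plain Python, not anything Pentaho ships: it assumes the log is in Common Log Format with the client IP as the first field, it uses request count as a rough stand-in for time spent, and the function name `top_visitors` and the sample lines are made up for illustration.

```python
from collections import Counter


def top_visitors(log_lines, n=3):
    """Count requests per client IP and return the n busiest addresses."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            # Assumption: the first whitespace-separated field is the client IP
            hits[fields[0]] += 1
    return hits.most_common(n)


# Hypothetical sample data using documentation-reserved IP ranges
sample_log = [
    '203.0.113.7 - - [26/Nov/2014:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.4 - - [26/Nov/2014:10:00:03 +0000] "GET /about.html HTTP/1.1" 200 2048',
    '203.0.113.7 - - [26/Nov/2014:10:00:09 +0000] "GET /reports HTTP/1.1" 200 9216',
    '192.0.2.15 - - [26/Nov/2014:10:01:12 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.7 - - [26/Nov/2014:10:02:30 +0000] "GET /reports/q3 HTTP/1.1" 200 7168',
    '198.51.100.4 - - [26/Nov/2014:10:03:44 +0000] "GET /contact HTTP/1.1" 404 512',
]

for ip, count in top_visitors(sample_log):
    print(ip, count)
```

Grouping the matching lines by address afterward shows which pages each heavy user requested, which is the kind of sifting the sortable tables make easy.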
Pentaho also provides software for drawing HDFS file data and HBase data from Hadoop clusters. One of the more intriguing tools is the graphical programming interface known as either Kettle or Pentaho Data Integration. It has a bunch of built-in modules that you can drag and drop onto a canvas and then connect. Pentaho has thoroughly integrated Hadoop and the other sources into this tool, so you can write your code and send it out to execute on the cluster.
