Coverage includes:

  • Running distributed queries over massive datasets with Hadoop, Hive, and Shark

  • Hosting and sharing multi-terabyte datasets efficiently and economically

  • Developing a NoSQL Web app with Redis to collect crowd-sourced data

  • Running distributed queries over massive datasets with Hadoop and Hive

  • Building a data dashboard with Google BigQuery

  • Exploring large datasets with advanced visualization

  • MapReduce pipelines for transforming immense amounts of data

  • Automating complex processing with Apache Pig and the Cascading Java library

  • Applying machine learning to classify, recommend, and predict incoming information

  • Using R to perform statistical analysis on massive datasets

  • Building highly efficient analytics workflows with Python and Pandas

  • Previewing emerging trends and convergences in scalable data technologies and the evolving role of the

Once again, “Data Just Right” is the InformIT eBook deal of the day!

Once again, “Data Just Right” is the InformIT eBook deal of the day. That means it is only $14.99 – over 50% off the normal price. Click here to check out the deal!

Loading Data Into Hive: From Data Just Right Live Lessons Video Training

Righteous Data: The Best Data Posts of the Week

This week, it’s all about the practical applications of data technologies. Big (Bad) Data, by Andrew Ross Sorkin: a great article. Just because one thinks one sees a trend in a large amount of data doesn’t mean there’s automatically a correct, or even a compelling, narrative. Sorkin writes: “A study by the Pew Research […]

Writing a Multistep MapReduce Job Using the mrjob Python Library: From the “Data Just Right” LiveLessons

Visual Storytelling with D3: An Introduction to Data Visualization in JavaScript

The D3.js library is an excellent, expressive, and sometimes daunting way to create interactive data visualizations for the web. I give a brief introduction to D3.js in “Data Just Right,” but for a very in-depth look, check out the latest book in The Addison-Wesley Data and Analytics Series, “Visual Storytelling with D3: An Introduction to […]

Righteous Data Posts of the (past) Week

Ok, the Data Just Right blogging team has been a bit busy, and these posts are all a bit old. But… I am guessing that you are just now coming out of the daze that was the start of 2014. In any case, welcome to the first edition of “Righteous Data” – where we at […]

Hive, Python MapReduce, Pandas – video clips from Data Just Right LiveLessons

InformIT.com has just posted some clips from the Data Just Right LiveLessons series – a collection of video training sessions that provide practical examples, discussion, and screencasts using the technologies featured in the Data Just Right book! You can watch the following clips right now online: Loading data into Hive; Writing a multistep MapReduce […]

“Data Just Right” makes an appearance on the UCB ISchool Website

My alma mater, the UC Berkeley School of Information, recently blogged about my new publication, Data Just Right. Back at you, UC ISchool – check the dedication of the book.

Welcome to the Data Just Right Blog!

Hello everyone, I’d like to welcome you to the Data Just Right blog. This blog, along with the rest of the DataJustRight.com website, complements the book Data Just Right: Introduction to Large-Scale Data & Analytics. The site will feature regular posts about how to create practical solutions to complex data challenges. Don’t forget to join […]

Michael Manoochehri is an entrepreneur, writer, and optimist. Drawing on many years of experience working with enterprise, research, and nonprofit organizations, he aims to make scalable data analytics more affordable and accessible. Michael has been a member of Google’s Cloud Platform Developer Relations team, focusing on cloud computing and data developer products such as Google BigQuery. In addition, Michael has written for the tech blog ProgrammableWeb.com, has spent time in rural Uganda researching mobile phone use, and holds an M.A. in Information Management and Systems from UC Berkeley’s School of Information.