Data Algorithms: Recipes for Scaling Up with Hadoop and Spark
Author | : | Mahmoud Parsian |
Rating | : | 4.92 (941 Votes) |
Asin | : | 1491906189 |
Format Type | : | paperback |
Number of Pages | : | 778 Pages |
Publish Date | : | 2014-10-27 |
Language | : | English |
DESCRIPTION:
About the Author: Mahmoud Parsian, Ph.D. in Computer Science, is a practicing software professional with 30 years of experience as a developer, designer, architect, and author. For the past 15 years, he has been involved in Java server-side, databases, MapReduce, and distributed computing. Dr. Parsian currently leads Illumina's Big Data team, which is focused on large-scale genome analytics and distributed computing. He leads and develops scalable regression algorithms; DNA sequencing and RNA sequencing pipelines using Java, MapReduce, Hadoop, HBase, and Spark; and open source tools. He is also the author of JDBC Recipes and JDBC Metadata (both from Apress).
Good stuff. It's a pretty good book covering all the day-to-day use cases we come across in real-world Hadoop projects. I totally recommend this book for someone who is looking for real-world solutions.
Dimitri K said: Lots of code snippets, not too good in concepts. The book provides examples of how to implement some simple data analysis jobs using Map-reduce and to some extent Spark. Hadoop is probably 3/4 of the content. As for data analysis, the author does not show any deep knowledge or interest in explaining the methods. They are however pretty basic. The author concentrates on the code, which is in Java. Some examples are in Ruby, though, don't know why. There are lots of code snippets, which mostly fill in 700 pages of the book. The map-reduce patterns themselves are pre.
Great book to understand MapReduce. The book focuses on MapReduce programming, with a lot of examples of distributed computing implemented for Spark and Hadoop. It provides tangible problems as well as their solutions. The book is a great read to upgrade your Spark skills. Also, I assume the book is good if you are trying to learn Hadoop with no prior knowledge of MapReduce.
If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. Each chapter provides a recipe for solving a massive computational problem, such as building