Hadoop 2.x Administration Cookbook by Gurmukh Singh

By Gurmukh Singh

Key Features

  • Become an expert Hadoop administrator and perform tasks to optimize your Hadoop cluster
  • Import and export data into Hive and use Oozie to manage workflow
  • Practical recipes will help you plan and secure your Hadoop cluster, and make it highly available

Book Description

Hadoop enables the distributed storage and processing of large datasets across clusters of computers. Learning how to administer Hadoop is crucial to exploit its unique features. With this book, you will be able to overcome common problems encountered in Hadoop administration.

The book begins by laying the foundation, showing you the steps needed to set up a Hadoop cluster and its various nodes. You will get a better understanding of how to maintain a Hadoop cluster, especially on the HDFS layer and using YARN and MapReduce. Further on, you will explore durability and high availability of a Hadoop cluster.

You'll get a better understanding of the schedulers in Hadoop and how to configure and use them for your tasks. You will also get hands-on experience with the backup and recovery options and the performance tuning aspects of Hadoop. Finally, you will get a better understanding of troubleshooting, diagnostics, and best practices in Hadoop administration.

By the end of this book, you will have a proper understanding of working with Hadoop clusters and will also be able to secure and encrypt them and configure auditing for them.

What you'll learn

  • Set up the Hadoop architecture to run a Hadoop cluster smoothly
  • Maintain a Hadoop cluster on HDFS, YARN, and MapReduce
  • Understand high availability with ZooKeeper and Journal Node
  • Configure Flume for data ingestion and Oozie to run various workflows
  • Tune the Hadoop cluster for optimal performance
  • Schedule jobs on a Hadoop cluster using the Fair and Capacity schedulers
  • Secure your cluster and troubleshoot it for various common pain points

About the Author

Gurmukh Singh is a seasoned technology professional with 14+ years of experience in infrastructure design, distributed systems, performance optimization, and networks. He has worked in the big data domain for the last five years and provides consultancy and training on various technologies.

He has worked with companies such as HP, JP Morgan, and Yahoo.

He has authored Monitoring Hadoop, published by Packt Publishing.

Table of Contents

  1. Hadoop Architecture and Deployment
  2. Maintain Hadoop Cluster - HDFS
  3. Maintain Hadoop Cluster - YARN and MapReduce
  4. High Availability
  5. Schedulers
  6. Backup and Recovery
  7. Data Ingestion and Workflow
  8. Performance Tuning
  9. HBase and RDBMS
  10. Cluster Planning
  11. Troubleshooting, Diagnostics, and Best Practices
  12. Security


Best data mining books

Data Mining and Statistics for Decision Making (Wiley Series in Computational Statistics)

Data mining is the process of automatically searching large volumes of data for models and patterns using computational techniques from statistics, machine learning, and information theory; it is the ideal tool for such an extraction of knowledge. Data mining is usually associated with a business or an organization's need to identify trends and profiles, allowing, for example, retailers to discover patterns on which to base marketing objectives.

Measuring the Digital World: Using Digital Analytics to Drive Better Digital Experiences (FT Press Analytics)

This is the eBook of the printed book and may not include any media, website access codes, or print supplements that may come packaged with the bound book. The definitive guide to next-generation digital measurement: indispensable insight for building high-value digital experiences. It helps you capture the knowledge you need to deliver deep personalization at scale, and reflects today's latest insights into digital behavior and consumer psychology. It is for every digital marketer, analyst, and executive who wants to improve performance. To win at digital, you must capture the right data, quickly transform it into the right knowledge, and use both to deliver deep personalization at scale.

Spatial Data Mining: Theory and Application

This book is an updated version of a well-received book previously published in Chinese by Science Press of China (the first edition in 2006 and the second in 2013). It offers a systematic and practical overview of spatial data mining, which combines computer science and geo-spatial information science, allowing each field to profit from the knowledge and techniques of the other.

Knowledge and Systems Sciences: 17th International Symposium, KSS 2016, Kobe, Japan, November 4-6, 2016, Proceedings (Communications in Computer and Information Science)

This book constitutes the refereed proceedings of the 17th International Symposium, KSS 2016, held in Kobe, Japan, in November 2016. The 21 revised full papers presented were carefully reviewed and selected from 48 submissions. The papers cover topics such as: algorithms for big data; big data and education; big data and healthcare; big data and tourism; big data and social media oriented knowledge discovery and data mining, text mining, recommendation systems, etc.; big data, social media and societal management; introduction of agent-based social systems sciences; collective intelligence; complex system modeling and complexity; decision analysis and decision support systems; internet+ and agriculture; internet+ and open innovation; knowledge creation, creativity support, awareness support, and so forth.
