What Is YARN?

Written by Indicative Team

YARN Defined

YARN, short for Yet Another Resource Negotiator, is a large-scale, distributed operating system for big data applications. YARN was designed for cluster management and is one of the key features of the second generation of Hadoop, the Apache Software Foundation’s open source distributed processing framework. YARN allows developers to create applications that work with huge amounts of data and manipulate that data efficiently.

In a cluster architecture, Apache Hadoop YARN sits between HDFS and the processing engines being used to run applications. It combines a central resource manager (the ResourceManager) with containers, per-application coordinators (ApplicationMasters), and node-level agents (NodeManagers) that monitor processing operations on individual cluster nodes.
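To make the division of roles concrete, here is a minimal toy sketch in Python. It is not YARN's actual API or scheduling algorithm: the class names (NodeAgent, ResourceManager, AppCoordinator), the memory-only resource model, and the "pick the node with the most free memory" rule are all assumptions made for illustration. It only shows the general flow, in which an application coordinator asks a central resource manager for containers and the manager places them on nodes that have room.

```python
from dataclasses import dataclass, field


@dataclass
class NodeAgent:
    """Hypothetical stand-in for a node-level agent tracking one node's free memory."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, mb: int) -> str:
        # Reserve memory on this node and record the new container.
        self.free_mb -= mb
        container_id = f"{app_id}@{self.name}#{len(self.containers)}"
        self.containers.append(container_id)
        return container_id


class ResourceManager:
    """Hypothetical central manager: places each container on the node with the most free memory."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, app_id: str, mb: int):
        candidates = [n for n in self.nodes if n.free_mb >= mb]
        if not candidates:
            return None  # no node can satisfy the request right now
        node = max(candidates, key=lambda n: n.free_mb)
        return node.launch(app_id, mb)


class AppCoordinator:
    """Hypothetical per-application coordinator that requests containers for its tasks."""

    def __init__(self, app_id: str, rm: ResourceManager):
        self.app_id = app_id
        self.rm = rm

    def request(self, count: int, mb: int):
        granted = []
        for _ in range(count):
            container_id = self.rm.allocate(self.app_id, mb)
            if container_id is None:
                break  # stop once the cluster is out of capacity
            granted.append(container_id)
        return granted


nodes = [NodeAgent("node-a", 4096), NodeAgent("node-b", 2048)]
rm = ResourceManager(nodes)
app = AppCoordinator("app-1", rm)

granted = app.request(count=4, mb=1536)
free_after = {n.name: n.free_mb for n in nodes}
print(granted)
print(free_after)
```

In this toy run the application asks for four containers of 1536 MB, but the two nodes can only fit three between them, so the fourth request is refused. A real cluster tracks more than memory and uses configurable schedulers, so treat this purely as a mental model.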

The architecture of YARN ensures that the Hadoop cluster can be enhanced in the following ways:

  • Multi-tenancy – it lets users run various proprietary and open-source engines on the same Hadoop cluster, making Hadoop a standard platform for real-time, interactive, and batch processing tasks.
  • Cluster Utilization – it allocates cluster resources dynamically, so the Hadoop cluster is used more fully and efficiently.
  • Scalability – it lets the Hadoop cluster grow to more nodes and workloads, because resource management is handled separately from application processing.
  • Compatibility – existing Hadoop MapReduce applications can run on YARN with little or no change.

In Data Defined, we help make the complex world of data more accessible by explaining some of the most complex aspects of the field.
