What Is Apache Mahout?

Written by Caitlin Davidson

Apache Mahout Defined

Apache Mahout is an open-source, scalable machine-learning library whose classic algorithms run on top of Hadoop MapReduce.

Mahout is built on three pillars:

  • Recommender engines: Recommenders suggest products or content to users by mining their past behaviour. They are usually classified as user-based (find users with similar tastes) or item-based (find items similar to those a user already liked).
  • Clustering: Clustering groups similar objects together without needing predefined labels.
  • Classification: Classification assigns an item to one of a set of known categories, based on examples that have already been labelled.
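
To make the recommender pillar concrete, here is a minimal sketch of a user-based recommender in plain Python. It does not use Mahout or its API; the ratings, the function names, and the choice of cosine similarity are all assumptions made for illustration only.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}. Not from any real data set.
RATINGS = {
    "alice": {"book": 5.0, "lamp": 3.0, "mug": 4.0},
    "bob":   {"book": 4.0, "lamp": 2.0, "mug": 5.0, "pen": 4.0},
    "carol": {"book": 1.0, "lamp": 5.0, "pen": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=1):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap or no positive similarity
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only suggest items the user has not rated yet
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    print(recommend("alice", RATINGS))
```

An item-based recommender would follow the same pattern but compare items by the users who rated them, rather than comparing users by the items they rated.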

Key features of Mahout include:

  • Proven Algorithms – Mahout implements a set of established algorithms for problems that are common to many industries.
  • Scalable to Large Data Sets – Designed to distribute across large data center clusters that run Apache Hadoop and apply the map/reduce paradigm.
  • Active & Open Community – Mahout has its own community forum where users discuss and resolve issues.
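
The map/reduce paradigm mentioned above can be illustrated with a toy, single-process sketch in Python. It counts how often pairs of items appear together across users, a kind of statistic that item-based recommenders can build on. This is not Mahout or Hadoop code: the input records and all function names are hypothetical, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: one record per user listing the items they interacted with.
BASKETS = [
    ("alice", ["book", "lamp", "mug"]),
    ("bob", ["book", "mug", "pen"]),
    ("carol", ["book", "lamp"]),
]


def map_phase(record):
    """Emit ((item_a, item_b), 1) for every pair of items in one user's record."""
    _user, items = record
    for pair in combinations(sorted(set(items)), 2):
        yield pair, 1


def shuffle(mapped):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts emitted for one item pair."""
    return key, sum(values)


def cooccurrence(records):
    """Run map, shuffle, and reduce in sequence on a list of records."""
    mapped = (kv for record in records for kv in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    for pair, count in sorted(cooccurrence(BASKETS).items()):
        print(pair, count)
```

Because each record is mapped independently and each key is reduced independently, both phases can be split across machines, which is what lets this style of computation scale to large data sets.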

Companies using Mahout include:

  • Yahoo Mail
  • LinkedIn
  • NewsCred

In Data Defined, we help make the complex world of data more accessible by explaining some of the most complex aspects of the field.