What Is Data Anomaly Detection?

Data Anomaly Detection Defined

Data anomaly detection is the process within data mining, that identifies unexpected events, data points, and observations that deviate from a dataset’s normal behavior.

Anomaly detection has two basic assumptions:

  1. Anomalies only occur very rarely in the data.
  2. Their features differ from the normal instances significantly.

There are three different types of anomalies.

  • Point anomalies: A single instance of data which has deviated too far off from the rest of the data.
  • Contextual anomalies: The abnormality is context specific.
  • Collective anomalies: A set of data instances which collectively help in detecting anomalies.

Some anomaly detection techniques include:

  • Density-based Anomaly Detection
  • Clustering Anomaly Detection
  • Time Series Data Anomaly Detection –  Depending on a users business model and use case, time series data anomaly detection can be used for valuable metrics such as:
    • Web page views
    • Daily active users
    • Cost per click
    • Bounce rate
    • Churn rate
    • Revenue per click
    • Average order value

In Data Defined, we help make the complex world of data more accessible by explaining some of the most complex aspects of the field.

Click Here for more Data Defined.