Concept Drift

Concept drift is the tendency of a machine-learning model trained on past data to become less accurate over time because the world it models has changed. More formally, the 2020 review “Learning under Concept Drift: A Review” by Jie Lu, João Gama, and colleagues defines it as unforeseeable changes in the underlying distribution of streaming data over time. A model assumes that the relationship between its inputs and the outcome it predicts is stable; when that relationship shifts, the model’s accuracy decays quietly even though nothing in the code has changed.
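The definition can be written compactly. The notation here is an illustrative assumption and does not appear in the text above: let $P_t(X, y)$ denote the joint distribution of inputs $X$ and outcome $y$ at time $t$. Drift is then a change in that joint distribution from one time step to the next, and the product rule shows which part of it has moved:

```latex
\exists\, t:\quad P_t(X, y) \neq P_{t+1}(X, y),
\qquad\text{where}\qquad
P_t(X, y) = P_t(X)\, P_t(y \mid X).
```

A change in $P_t(y \mid X)$ corresponds to the shift in the input-outcome relationship described above. A change confined to $P_t(X)$ leaves that relationship intact but alters which inputs the model encounters.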

The survey organizes the field around three tasks: detecting that drift has happened, understanding when and how it occurred, and adapting the model in response, usually by retraining on fresh data. Drift can be sudden, as after a market shock or a policy change, or gradual, as when customer behavior changes slowly. A related but distinct phenomenon is data drift, in which the distribution of the inputs shifts, whether or not the relationship between inputs and outcome has changed. Because drift is invisible without measurement, monitoring a production model’s performance and input statistics is a core part of operating machine learning. It is also one reason the paper “Hidden Technical Debt in Machine Learning Systems” (Sculley et al., 2015) warned that much of the real cost of an ML system comes from maintaining it after deployment.
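The detection task can be illustrated with a minimal sketch of performance monitoring. It is not a method from the survey. It assumes that true labels eventually arrive for each prediction, and it raises a flag when the error rate over a recent window exceeds a known baseline by a fixed margin. The names `ErrorRateMonitor`, `frozen_model`, and `simulate`, along with the window size, margin, and noise level, are hypothetical choices made for illustration.

```python
import random
from collections import deque


class ErrorRateMonitor:
    """Flags possible drift when the recent error rate exceeds a baseline.

    Hypothetical helper: baseline_error is the error rate measured at launch,
    and margin is how far above it the windowed rate may rise before we flag.
    """

    def __init__(self, baseline_error, window_size=200, margin=0.10):
        self.baseline_error = baseline_error
        self.margin = margin
        # Sliding window of booleans: True means the prediction was wrong.
        self.recent_errors = deque(maxlen=window_size)

    def update(self, prediction, actual):
        """Record one labelled outcome; return True if drift is suspected."""
        self.recent_errors.append(prediction != actual)
        if len(self.recent_errors) < self.recent_errors.maxlen:
            return False  # not enough evidence until the window is full
        error_rate = sum(self.recent_errors) / len(self.recent_errors)
        return error_rate > self.baseline_error + self.margin


def frozen_model(x):
    """Stand-in for a deployed model that learned the rule 'x > 0.5'."""
    return x > 0.5


def simulate(drift_step=1000, total_steps=2000, noise=0.05, seed=0):
    """Stream synthetic data with a sudden drift; return the first flagged step."""
    rng = random.Random(seed)
    monitor = ErrorRateMonitor(baseline_error=noise)
    for step in range(total_steps):
        x = rng.random()
        # Before drift_step the true rule matches the model; afterwards it flips.
        label = (x > 0.5) if step < drift_step else (x < 0.5)
        if rng.random() < noise:
            label = not label  # label noise, so the baseline error is nonzero
        if monitor.update(frozen_model(x), label):
            return step
    return None


detected_at = simulate()
print(f"possible drift flagged at step {detected_at}")
```

In this simulation the input distribution never changes; only the rule linking input to label flips, so a monitor that watched input statistics alone would see nothing. A fixed margin is the simplest possible decision rule. Gradual drift, or settings where labels arrive late, would likely need a statistical test or a comparison of input distributions instead.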

Why a business reader should care: a model that worked at launch can degrade silently as conditions change, so budgeting for monitoring and retraining is not optional but a basic operating cost of any AI system.
