Slot filling and intent classification are the two core natural-language-understanding tasks at the front end of a task-oriented assistant. A 2020 survey by Samuel Louvan and Bernardo Magnini reviews how neural models came to handle both and treats them as the central pair of tasks for understanding a user’s needs in a dialogue system.
Intent classification (also called intent detection) is the job of deciding what the user is trying to do. When someone says “set an alarm for seven,” the intent is something like SetAlarm; “what’s the weather in Paris” maps to GetWeather. Slot filling is the complementary job of pulling out the specific pieces of information the intent needs, the “slots.” For the alarm example, the time slot is “seven”; for the weather example, the location slot is “Paris.” The assistant knows enough to act only once it has both the intent and the filled slots.
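To make the two outputs concrete, here is a minimal sketch of how an understood utterance might be represented. It assumes the common formulation of slot filling as per-token labeling with BIO tags (B- begins a slot, I- continues it, O is outside any slot). The names Frame and slots_from_bio are hypothetical, and the tags are written by hand here rather than predicted by a model.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical container for one understood utterance."""
    intent: str
    slots: dict = field(default_factory=dict)


def slots_from_bio(tokens, tags):
    """Collect slot values from per-token BIO tags.

    B-x starts slot x, I-x continues the slot that is currently open,
    and O (or a mismatched I- tag) closes it.
    """
    slots = {}
    current = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return slots


# Hand-written tags for the two examples in the text (assumed, not model output).
alarm = Frame(
    intent="SetAlarm",
    slots=slots_from_bio(
        ["set", "an", "alarm", "for", "seven"],
        ["O", "O", "O", "O", "B-time"],
    ),
)
weather = Frame(
    intent="GetWeather",
    slots=slots_from_bio(
        ["what's", "the", "weather", "in", "Paris"],
        ["O", "O", "O", "O", "B-location"],
    ),
)
```

In a real system the intent label and the tag sequence would both come from trained models; the sketch only shows the shape of the result the rest of the assistant consumes.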
Historically, the two tasks were built as separate components and later, increasingly, as joint models. The shift came because the tasks inform each other: knowing the intent helps predict which slots to look for, and the slots present help disambiguate the intent. The survey describes this evolution from independent to joint to transfer-learning approaches as neural networks took over.
For a general reader, this pair explains what is really happening when you talk to Siri or Alexa. Behind the friendly voice, the system first classifies your request into one of a fixed set of intents and then extracts the parameters that intent needs. That is also why such assistants handle the commands they were designed for smoothly but stumble on anything outside that menu.
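The classify-then-extract flow, and the way it fails outside a fixed menu, can be illustrated with a deliberately crude toy. This is only a sketch: the INTENTS table, the understand function, and the OutOfScope label are all hypothetical, and keyword patterns stand in for the neural classifiers and taggers the survey actually covers.

```python
import re

# Hypothetical fixed intent inventory. Each intent has a trigger pattern
# (a stand-in for a learned classifier) and patterns for its own slots.
INTENTS = {
    "SetAlarm": {
        "trigger": re.compile(r"\balarm\b", re.IGNORECASE),
        "slots": {"time": re.compile(r"\bfor (\w+)", re.IGNORECASE)},
    },
    "GetWeather": {
        "trigger": re.compile(r"\bweather\b", re.IGNORECASE),
        "slots": {"location": re.compile(r"\bin (\w+)", re.IGNORECASE)},
    },
}


def understand(utterance):
    """Pick the first matching intent, then extract only that intent's slots."""
    for intent, spec in INTENTS.items():
        if spec["trigger"].search(utterance):
            slots = {}
            for name, pattern in spec["slots"].items():
                match = pattern.search(utterance)
                if match:
                    slots[name] = match.group(1)
            return intent, slots
    # Nothing in the fixed menu matched, so there is nothing to act on.
    return "OutOfScope", {}
```

Calling understand("set an alarm for seven") yields the SetAlarm intent with its time slot filled, while a request such as "tell me a joke" falls through to the hypothetical OutOfScope label because no intent in the table covers it.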