A monad is a way of structuring computations so that each step is sequenced with the next, while threading along some extra context. Philip Wadler’s writing on the subject, hosted on his University of Edinburgh page, describes how “the use of monads to structure functional programs” lets a pure language simulate “effects found in other languages, such as global state, exception handling, output, or non-determinism.”
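As a rough illustration of sequencing with threaded context, the sketch below uses Haskell's built-in `Maybe` monad, where the extra context is "a previous step may have failed." The names `safeDiv`, `compute`, and `compute'` are made up for this example and do not come from Wadler's text.

```haskell
-- Illustrative sketch; safeDiv, compute, and compute' are hypothetical names.

-- Division that signals failure with Nothing instead of crashing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Each step runs only if the previous one succeeded; the Maybe monad
-- threads the "may have failed" context from one step to the next.
compute :: Int -> Int -> Int -> Maybe Int
compute a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  return (r + 1)

-- The same function written with explicit bind (>>=), which is what
-- the do-notation above desugars to.
compute' :: Int -> Int -> Int -> Maybe Int
compute' a b c =
  safeDiv a b >>= \q ->
  safeDiv q c >>= \r ->
  return (r + 1)

main :: IO ()
main = do
  print (compute 100 5 2)  -- Just 11
  print (compute 100 0 2)  -- Nothing
```

Note that `compute` never checks for failure itself: the short-circuiting lives entirely in the monad's bind operation, which is one way to read "simulating exception handling" in a pure language.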
The appeal for functional programming is that monads let a language keep its functions pure, returning the same output for the same input, while still expressing things like mutable state, failure, or input/output. In Haskell, input/output is handled through a monad: rather than performing side effects directly, a program builds up a description of effects that the runtime then carries out.
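One way to see the "description of effects" idea is that `IO` actions are ordinary values: they can be stored in a list and rearranged by pure code without anything being printed. The following is a minimal sketch; `script` and `reversedScript` are hypothetical names chosen for illustration.

```haskell
-- Illustrative sketch; script and reversedScript are hypothetical names.

-- A list of IO actions is an ordinary value; building it prints nothing.
script :: [IO ()]
script = [putStrLn "first", putStrLn "second", putStrLn "third"]

-- Pure manipulation of the descriptions: reversing the list still
-- produces no output.
reversedScript :: [IO ()]
reversedScript = reverse script

main :: IO ()
main = do
  -- Inspecting the list does not run the actions inside it.
  print (length script)     -- 3
  -- Effects happen only once the actions are sequenced into main.
  sequence_ reversedScript  -- prints third, second, first (one per line)
```

Only the actions that end up as part of `main` are carried out by the runtime, which is why the functions that build and rearrange them can remain pure.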
Wadler also argued that monads “increase the ease with which programs may be modified” and that they “can mimic the effect of impure features such as exceptions, state, and continuations.” The same paper notes the deep connection to category theory, the branch of mathematics where the notion of a monad originates, which is why monads are often described in abstract terms that newcomers find hard to follow.
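To make "mimicking state" concrete, here is a small hand-rolled state monad built using only the Haskell Prelude. It is a sketch under assumptions, not Wadler's own code: the type `State` is defined locally (similar types exist in common libraries), and `tick` and `labelThree` are hypothetical names.

```haskell
-- Illustrative sketch; State is defined locally, and tick and
-- labelThree are hypothetical names.

-- A stateful computation is a pure function from an input state
-- to a result paired with an output state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s ->
    let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  -- Bind runs the first computation, then feeds its output state
  -- into the computation chosen by its result.
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

-- Return the current counter value and increment the counter.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- Reads like imperative code with a mutable counter, yet every
-- function involved is pure.
labelThree :: State Int [Int]
labelThree = do
  a <- tick
  b <- tick
  c <- tick
  return [a, b, c]

main :: IO ()
main = print (runState labelThree 10)  -- ([10,11,12],13)
```

The body of `labelThree` never mentions the counter explicitly. Changing how the state is threaded would mean editing the `Monad` instance rather than every function that uses it, which is one reading of the modifiability claim.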