Abstract
In this paper, we propose a novel modular architecture for self-supervised multi-sensor anomaly detection and localization. The framework consists of a spatio-temporal encoder for representation learning, a decoder for latent reconstruction, a predictive memory network for sub-sequence pattern identification, and a denoiser for false-positive reduction. It combines a reconstruction network with a latent prediction network and trains all modules end to end by minimizing a weighted combination of their losses. We demonstrate the flexibility and efficiency of the architecture by instantiating each module with different components, showing that it adapts to different design choices and improves performance in both anomaly detection and localization.
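As an illustration of the weighted objective described above, the following minimal sketch combines a reconstruction loss and a latent prediction loss into a single scalar. The abstract does not specify the form of either loss, so the mean squared error, the weight names `w_recon` and `w_pred`, and the function names are assumptions chosen for illustration, not the method used in this paper. The per-sensor error at the end is likewise only one plausible way to localize an anomaly.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def combined_loss(x, x_recon, z_next, z_pred, w_recon=0.5, w_pred=0.5):
    """Weighted sum of a reconstruction loss and a latent prediction loss.

    x, x_recon : input window and its reconstruction, shape (time, sensors)
    z_next, z_pred : target and predicted latents, shape (time, latent_dim)
    w_recon, w_pred : hypothetical weights (assumed, not from the paper)
    """
    return w_recon * mse(x, x_recon) + w_pred * mse(z_next, z_pred)


def per_sensor_error(x, x_recon):
    """Average squared reconstruction error per sensor (assumed localization score)."""
    return np.mean((x - x_recon) ** 2, axis=0)


if __name__ == "__main__":
    x = np.zeros((4, 3))
    x_recon = np.zeros((4, 3))
    x_recon[:, 1] = 1.0  # only the second sensor is reconstructed poorly
    z_next = np.zeros((4, 2))
    z_pred = np.zeros((4, 2))

    print(combined_loss(x, x_recon, z_next, z_pred))
    print(per_sensor_error(x, x_recon))
```

In this toy case the sensor with the largest reconstruction error is the one flagged for localization. In an end-to-end setup the same scalar would be backpropagated through all modules at once.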