Mutual information optimization for mitigating catastrophic forgetting in continual learning: An information-theoretic approach
DOI: https://doi.org/10.54939/1859-1043.j.mst.106.2025.129-136

Keywords: Continual learning; Catastrophic forgetting; Mutual information; Information theory; Neural networks; Memory replay.

Abstract
Continual learning systems encounter the critical challenge of catastrophic forgetting, where neural networks lose previously acquired knowledge when adapting to new tasks. In this paper, we propose Continual Mutual Information Preservation (CMIP), an information-theoretic approach that leverages Mutual Information (MI) optimization and entropy regularization to retain prior knowledge while learning compact and informative latent representations. CMIP uses an auxiliary network to estimate MI and a replay memory, in which each mini-batch comprises 50% current-task samples and 50% samples replayed from previous tasks. Experiments are conducted on the MNIST-Split and CIFAR-100-Split datasets for the class-incremental learning (Class-IL) setting. On MNIST-Split, CMIP achieves 90.97% accuracy with an 8.81% forgetting rate, outperforming EWC (20.64% accuracy, ~77% forgetting) and GEM (65.1% accuracy, ~33% forgetting). This method is applicable to real-world scenarios, such as robotic perception and real-time data streams.
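The abstract outlines two concrete mechanisms: an auxiliary network that estimates MI between representations of old and new tasks, and mini-batches composed half of current-task data and half of replayed samples. The PyTorch sketch below illustrates how these pieces could fit together under stated assumptions; StatisticsNet, mixed_batch, cmip_style_loss, and the weights mi_weight and ent_weight are hypothetical names and settings chosen for illustration, not the paper's actual implementation, and the entropy term here acts on the predictive distribution as a simplification of the latent-space regularizer described in the abstract.

import math
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class StatisticsNet(nn.Module):
    # Auxiliary network T(z_old, z_new) for a Donsker-Varadhan (MINE-style)
    # lower bound on the mutual information between old and new latent codes.
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, z_a, z_b):
        return self.net(torch.cat([z_a, z_b], dim=1))

def mi_lower_bound(t_net, z_old, z_new):
    # I(Z_old; Z_new) >= E_joint[T] - log E_marginal[exp(T)]
    joint = t_net(z_old, z_new).mean()
    perm = torch.randperm(z_new.size(0))   # shuffle one side to mimic the product of marginals
    marg = t_net(z_old, z_new[perm])
    return joint - (torch.logsumexp(marg, dim=0) - math.log(marg.size(0))).squeeze()

def mixed_batch(cur_x, cur_y, replay_buffer, batch_size):
    # 50% current-task samples, 50% samples drawn uniformly from the replay memory;
    # the buffer is assumed to store (input tensor, integer label) pairs.
    half = batch_size // 2
    rep = random.sample(replay_buffer, min(half, len(replay_buffer)))
    rep_x = torch.stack([x for x, _ in rep])
    rep_y = torch.tensor([y for _, y in rep])
    return torch.cat([cur_x[:half], rep_x]), torch.cat([cur_y[:half], rep_y])

def cmip_style_loss(logits, targets, mi_estimate, mi_weight=1.0, ent_weight=0.1):
    # Task loss minus an MI-preservation bonus, plus an entropy penalty that keeps
    # predictions compact; the weighting scheme is an illustrative assumption.
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=1)
    pred_entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    return ce - mi_weight * mi_estimate + ent_weight * pred_entropy

In a training loop, z_old would plausibly come from a frozen copy of the encoder (or cached features of replayed samples) and z_new from the current encoder, so that maximizing the estimated bound encourages the network to retain information shared with earlier tasks while the cross-entropy term fits the new one.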
References
[1]. R. M. French, “Catastrophic forgetting in connectionist networks,” Trends in Cognitive Sciences, Vol. 3, No. 4, pp. 128–135, (1999). https://doi.org/10.1016/S1364-6613(99)01294-2
[2]. J. Kirkpatrick et al., “Overcoming catastrophic forgetting in neural networks,” Proceedings of the National Academy of Sciences, Vol. 114, No. 13, pp. 3521–3526, (2017).
[3]. D. Lopez-Paz and M. Ranzato, “Gradient episodic memory for continual learning,” Advances in Neural Information Processing Systems, Vol. 30, pp. 6467–6476, (2017).
[4]. N. Tishby, F. C. Pereira, and W. Bialek, “The information bottleneck method,” arXiv preprint, physics/0004057, (2000). https://arxiv.org/abs/physics/0004057
[5]. M. I. Belghazi et al., “Mutual Information Neural Estimation,” Proceedings of the 35th International Conference on Machine Learning (ICML), PMLR 80, pp. 531–540, (2018). https://arxiv.org/abs/1801.04062
[6]. T. Chen et al., “A simple framework for contrastive learning of visual representations,” Proceedings of the 37th International Conference on Machine Learning, Vol. 119, pp. 1597–1607, (2020).
[7]. Y. Polyanskiy and Y. Wu, “Information theory and deep learning: A modern perspective,” Annual Review of Statistics and Its Application, Vol. 11, pp. 101–125, (2024).
[8]. T. Hospedales et al., “Meta-learning in neural networks: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 9, pp. 5149–5169, (2022).
[9]. Z. Mai et al., “Online Continual Learning in Image Classification: An Empirical Survey,” Neurocomputing, Vol. 469, pp. 28–51, (2022). https://doi.org/10.1016/j.neucom.2021.10.021
[10]. G. M. van de Ven, T. Tuytelaars, and A. S. Tolias, “Three types of incremental learning,” Nature Machine Intelligence, Vol. 4, pp. 1185–1197, (2022). https://doi.org/10.1038/s42256-022-00568-3