Reversible Linguistic Steganography With Bayesian Masked Language Modeling

Abstract:

Text authentication plays a vital role in the defense of digital identity and content against various types of cybercrime. The use of a digital signature is a common cryptographic technique for text authentication. Linguistic steganography can be applied to further conceal a digital signature within the corresponding text to facilitate data management. However, steganographic distortion lurking in the text, albeit almost imperceptible, has the potential to cause automatic computing machinery to make biased decisions. This has led to an interest in the pursuit of reversibility, the ability to reverse a steganographic process and remove distortion. In this article, we propose a reversible steganographic system for natural language text. We use a pre-trained transformer neural network for masked language modeling and embed messages in a reversible manner via predictive word substitution. Furthermore, we derive an adaptive steganographic route by taking into account predictive uncertainty, which is quantified based on a theoretical framework of Bayesian deep learning. Experimental results show that the proposed steganographic system can attain a proper balance between capacity, imperceptibility, and reversibility, with close semantic and sentiment similarity between cover and stego texts.