Automatic speech de-identification on Singapore English speech

Rong TONG    
Assistant Professor

Daniel, Zhengkui WANG    
Associate Professor

CHNG Eng Siong (NTU)    
Researcher

This project develops a speech de-identification system that anonymises spoken data by removing or modifying PII while preserving content and utility.

Project description:

Voice applications, from chatbots to assistants, transmit large amounts of personally identifiable information (PII), raising risks of misuse, privacy violations, and identity theft.

Solution and Notable Contribution:

The team explored three key aspects:

  1. LLM-based data augmentation to address the scarcity of PII-rich data
  2. PII identification modelling using both a pipeline approach (automatic speech recognition, ASR, followed by named entity recognition, NER) and end-to-end methods that perform ASR and PII tagging simultaneously
  3. Speech anonymisation techniques that balance data utility against the degree of anonymisation
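The pipeline approach in item 2 can be illustrated with a minimal sketch. It assumes the ASR stage has already produced a text transcript, and it uses a few regular expressions as a stand-in for a trained NER model. The names `PiiSpan`, `PATTERNS` and `tag_pii`, the category labels and the example transcript are all hypothetical and are not taken from the project.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PiiSpan:
    start: int      # character offset where the PII begins
    end: int        # character offset just past the PII
    category: str   # PII category label, e.g. "NRIC" or "PHONE"
    text: str       # the matched surface form


# Assumption: rule-based patterns standing in for a trained NER model.
# A real system would tag names, addresses, etc. with a learned tagger.
PATTERNS = {
    "NRIC": re.compile(r"\b[STFG]\d{7}[A-Z]\b"),
    "PHONE": re.compile(r"\b[689]\d{7}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def tag_pii(transcript: str) -> list[PiiSpan]:
    """Return PII spans found in an ASR transcript, ordered by position."""
    spans = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(transcript):
            spans.append(
                PiiSpan(match.start(), match.end(), category, match.group())
            )
    return sorted(spans, key=lambda span: span.start)


# Hypothetical transcript; in the pipeline this would come from the ASR stage.
transcript = "my nric is S1234567D and you can call me at 91234567"
spans = tag_pii(transcript)
for span in spans:
    print(span.category, span.text)
```

An end-to-end method would instead emit the PII tags together with the recognised words, so there would be no separate transcript-tagging step like `tag_pii`.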

 

Publications

  • Yaodi Liu, Kun Zhang, Dianying Chen, Chenxi Cai, Xiaohe Wu, Rong Tong, “A discontinuous NER model based on token prediction and contrastive learning to enhance span”, The Journal of Supercomputing, vol. 81, no. 956, 2025
  • Priyanshu Dhingra, Satyam Agrawal, Chandra Sekar Veerappan, Eng Siong Chng, Rong Tong, “Leveraging Large Language Models for Speech De-Identification”, International Journal of Asian Language Processing, 2025
  • CS Veerappan, P Dhingra, Zhengkui Wang, Rong Tong, “SpeeDF - A Speech De-identification Framework”, IEEE TENCON 2024
  • Priyanshu Dhingra, Satyam Satyam, Chandra Sekar Veerappan, Chng Eng Siong, Rong Tong, “Enhancing Speech De-identification with LLM-based Data augmentation”, ICAICTA 2024

 

Figure: A workflow diagram for the speech anonymisation system. It progresses through three main stages: speech recognition, PII (personally identifiable information) identification, and speech anonymisation. Beneath these stages, the diagram details the processes for text augmentation, ASR/NER identification, and anonymisation techniques such as simple replacement and category-preserving substitution.
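The two anonymisation techniques named in the diagram can be contrasted with a small text-level sketch. It assumes PII spans have already been identified as `(start, end, category)` character offsets. The function names, the `[PII]` placeholder and the surrogate values are hypothetical. The project works on speech, so the real system would presumably also alter the corresponding audio, which is not shown here.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, category); end is exclusive

# Assumption: one fixed surrogate per category, chosen only for illustration.
SURROGATES: Dict[str, str] = {"NAME": "Alex Tan", "PHONE": "80000000"}


def _substitute(text: str, spans: List[Span], pick: Callable[[str], str]) -> str:
    # Replace from the rightmost span first so earlier offsets stay valid.
    for start, end, category in sorted(spans, reverse=True):
        text = text[:start] + pick(category) + text[end:]
    return text


def simple_replacement(text: str, spans: List[Span], placeholder: str = "[PII]") -> str:
    """Replace every PII span with the same placeholder."""
    return _substitute(text, spans, lambda _category: placeholder)


def category_preserving_substitution(
    text: str, spans: List[Span], surrogates: Dict[str, str] = SURROGATES
) -> str:
    """Replace each PII span with a surrogate value of the same category."""
    return _substitute(text, spans, lambda c: surrogates.get(c, f"[{c}]"))


# Hypothetical transcript with offsets computed from the known substrings.
text = "hello this is Mei Ling my number is 91234567"
name, phone = "Mei Ling", "91234567"
spans = [
    (text.index(name), text.index(name) + len(name), "NAME"),
    (text.index(phone), text.index(phone) + len(phone), "PHONE"),
]

print(simple_replacement(text, spans))
# hello this is [PII] my number is [PII]
print(category_preserving_substitution(text, spans))
# hello this is Alex Tan my number is 80000000
```

In this sketch, simple replacement removes the most information, while category-preserving substitution keeps the sentence natural and retains the entity type, which illustrates the trade-off between utility and anonymisation that the project describes.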