Sub-band Vector Quantized Variational AutoEncoder for Spectral Envelope Quantization

Tanasan Srikotr, Kazunori Mano

Research output: Conference contribution

Abstract

Recently, many deep learning models have successfully taken over from conventional methods in speech processing. Vector quantization is a popular technique for reducing the amount of speech data before transmission. Conventional vector quantization methods are based on mathematical models. In the last few years, the Vector Quantized Variational AutoEncoder has been proposed for end-to-end vector quantization based on deep learning techniques. In this paper, we investigate sub-band quantization in the Vector Quantized Variational AutoEncoder. This model can concentrate on specific frequency bands by assigning them more bits while leaving the unnecessary bands with few bits. Experimental results show the efficiency of the proposed quantization method for the spectral envelope parameters of WORLD, a high-quality vocoder that operates at a 48 kHz sampling frequency. At the same four target bit rates, the sub-band Vector Quantized Variational AutoEncoder reduces the Log Spectral Distortion by around 0.93 dB on average.
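
The two ideas the abstract relies on, per-band bit allocation and Log Spectral Distortion, can be illustrated with a small NumPy sketch. This is not the authors' VQ-VAE: the learned encoder and codebooks are replaced by a plain nearest-neighbour quantizer whose codewords are sampled from the data, and the band edges, bit allocation, toy data, and the particular LSD definition (per-frame RMS difference of power envelopes in dB, averaged over frames) are all assumptions made for illustration.

```python
import numpy as np


def log_spectral_distortion(ref, est, eps=1e-12):
    """Mean over frames of the RMS dB difference between two power envelopes.

    Assumed definition; the paper may normalise differently.
    """
    ref_db = 10.0 * np.log10(np.maximum(ref, eps))
    est_db = 10.0 * np.log10(np.maximum(est, eps))
    per_frame = np.sqrt(np.mean((ref_db - est_db) ** 2, axis=-1))
    return float(np.mean(per_frame))


def quantize_band(x, codebook):
    """Replace each row of x with its nearest codebook row (Euclidean)."""
    dist = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return codebook[np.argmin(dist, axis=1)]


def subband_quantize(x, edges, codebooks):
    """Quantize each frequency band of x with its own codebook."""
    out = np.empty_like(x)
    for lo, hi, codebook in zip(edges[:-1], edges[1:], codebooks):
        out[:, lo:hi] = quantize_band(x[:, lo:hi], codebook)
    return out


# Toy data: 200 frames of a 16-bin power envelope (hypothetical, not WORLD output).
rng = np.random.default_rng(0)
env = np.exp(rng.normal(size=(200, 16)))
log_env = np.log(env)

# Hypothetical split into a low and a high band, with more bits for the low band.
edges = [0, 8, 16]
bits = [6, 2]
total_bits = sum(bits)  # bits spent per frame

# Stand-in codebooks: 2**b frames sampled from the data for each band.
codebooks = [
    log_env[rng.choice(len(log_env), size=2 ** b, replace=False), lo:hi]
    for b, lo, hi in zip(bits, edges[:-1], edges[1:])
]

quantized = subband_quantize(log_env, edges, codebooks)
lsd = log_spectral_distortion(env, np.exp(quantized))
print(f"{total_bits} bits/frame, LSD = {lsd:.2f} dB")
```

Changing `bits` while keeping `total_bits` fixed shows how moving bits between bands shifts the distortion, which is the trade-off the sub-band model is described as exploiting.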

Original language: English
Title of host publication: Proceedings of the TENCON 2019
Subtitle of host publication: Technology, Knowledge, and Society
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 296-300
Number of pages: 5
ISBN (Electronic): 9781728118956
DOI
Publication status: Published - Oct 2019
Event: 2019 IEEE Region 10 Conference: Technology, Knowledge, and Society, TENCON 2019 - Kerala, India
Duration: 17 Oct 2019 to 20 Oct 2019

Publication series

Name: IEEE Region 10 Annual International Conference, Proceedings/TENCON
2019-October
ISSN (Print): 2159-3442
ISSN (Electronic): 2159-3450

Conference

Conference: 2019 IEEE Region 10 Conference: Technology, Knowledge, and Society, TENCON 2019
Country/Territory: India
City: Kerala
Period: 17 Oct 2019 to 20 Oct 2019

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering
