Sound-To-Sound Translation Using Generative Adversarial Network and Sound U-Net

Yugo Kunisada, Chinthaka Premachandra

Research output: Conference contribution

Abstract

In this paper, we propose a generic learning method for training conditional generative adversarial networks on audio data. This makes it possible to apply the same generic approach described in this study to problems that previously required completely different loss formulations when learning from audio data. The method can be useful for labeling noises that share a certain number of identical frequencies, generating speech labels corresponding to each frequency, and generating audio data for noise cancellation. To achieve this, we propose a sound restoration process based on U-Net, called Sound U-Net. Our system is widely applicable, owing to its ease of implementation without parameter adjustment, as well as a reduced training time for audio data. During the experiments, reasonable results were obtained without manually adjusting the loss function.
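The abstract does not specify the Sound U-Net architecture, so the following is only a hedged toy sketch of the generic U-Net idea it builds on: an encoder/decoder over a 1-D audio frame with a skip connection between matching resolutions. All function names, kernel sizes, and activations here are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def conv1d(x, w):
    # 'same'-padded 1-D convolution (direct correlation with an odd-length kernel)
    pad = len(w) // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + len(w)], w) for i in range(len(x))])

def down(x):
    # downsample by 2 (average pooling), as in a U-Net encoder stage
    return x.reshape(-1, 2).mean(axis=1)

def up(x):
    # upsample by 2 (nearest-neighbour repeat), as in a U-Net decoder stage
    return np.repeat(x, 2)

def sound_unet_forward(x, w_enc, w_dec):
    # encoder: conv + ReLU; keep the feature map for the skip connection
    e = np.maximum(conv1d(x, w_enc), 0.0)
    skip = e
    bottleneck = down(e)
    # decoder: upsample, merge the skip connection, conv + tanh output
    d = up(bottleneck) + skip
    return np.tanh(conv1d(d, w_dec))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)  # a 64-sample audio frame (illustrative size)
y = sound_unet_forward(x, rng.standard_normal(5), rng.standard_normal(5))
print(y.shape)  # same length as the input frame
```

In a GAN setting such as the one the abstract describes, a network of this shape would serve as the generator, with a separate discriminator judging real versus translated audio; that adversarial part is omitted here for brevity.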

Original language: English
Title of host publication: 2022 2nd International Conference on Image Processing and Robotics, ICIPRob 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781665407717
DOI
Publication status: Published - 2022
Event: 2nd International Conference on Image Processing and Robotics, ICIPRob 2022 - Colombo, Sri Lanka
Duration: 12 Mar 2022 → 13 Mar 2022

Publication series

Name: 2022 2nd International Conference on Image Processing and Robotics, ICIPRob 2022

Conference

Conference: 2nd International Conference on Image Processing and Robotics, ICIPRob 2022
Country/Territory: Sri Lanka
City: Colombo
Period: 22/3/12 → 22/3/13

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Control and Optimization
  • Modeling and Simulation

