Sound-To-Sound Translation Using Generative Adversarial Network and Sound U-Net

Yugo Kunisada, Chinthaka Premachandra

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

In this paper, we propose a generic method for training conditional generative adversarial networks on audio data. The same approach can therefore be applied to audio learning problems that previously required completely different loss formulations. The method can be useful for labeling noises with a certain number of identical frequencies, generating speech labels corresponding to each frequency, and generating audio data for noise cancellation. To achieve this, we propose a sound restoration process based on U-Net, called Sound U-Net. In this study, the system achieved wide applicability, owing to its ease of implementation without parameter adjustment, as well as a reduction in the training time for audio data. In the experiments, reasonable results were obtained without manually adjusting the loss function.

Original language: English
Title of host publication: 2022 2nd International Conference on Image Processing and Robotics, ICIPRob 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665407717
Publication status: Published - 2022
Event: 2nd International Conference on Image Processing and Robotics, ICIPRob 2022 - Colombo, Sri Lanka
Duration: 2022 Mar 12 to 2022 Mar 13

Publication series

Name: 2022 2nd International Conference on Image Processing and Robotics, ICIPRob 2022

Conference

Conference: 2nd International Conference on Image Processing and Robotics, ICIPRob 2022
Country/Territory: Sri Lanka
City: Colombo
Period: 22/3/12 to 22/3/13

Keywords

  • Audio Processing
  • Conditional GAN
  • Generative Adversarial Networks
  • Machine Learning
  • Sound U-Net

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Control and Optimization
  • Modelling and Simulation
