Deep Learning-based voice conversion

Voice conversion using Deep Learning

  • Implemented Bidirectional LSTMs and Encoder-Decoder Recurrent Neural Networks for voice conversion
  • Applied Dynamic Time Warping to align the mel-cepstral coefficients of the source and target speakers
  • Improved conversion fidelity by 12.5% using an Attention mechanism to handle long-range dependencies
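
The alignment step can be illustrated with a minimal Dynamic Time Warping sketch in Python. This is not the project's code: it assumes each frame is a short list of mel-cepstral coefficients, uses Euclidean distance between frames, and the name dtw_align and the toy sequences are hypothetical.

```python
import math


def dtw_align(source, target):
    """Align two sequences of feature frames with Dynamic Time Warping.

    Returns (total_cost, path), where path is a list of (source_index,
    target_index) pairs. Assumes both sequences are non-empty and all
    frames have the same dimension.
    """
    n, m = len(source), len(target)
    # cost[i][j]: minimal accumulated distance aligning source[:i] with target[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(source[i - 1], target[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # source advances alone
                                 cost[i][j - 1],      # target advances alone
                                 cost[i - 1][j - 1])  # both advance together

    # Backtrack from the end to recover the frame-to-frame alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional "frames"; real mel-cepstral frames would have more coefficients.
src = [[0.0], [1.0], [2.0]]
tgt = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw_align(src, tgt)
print(total, path)
```

The returned index pairs can be used to build time-aligned source/target frame pairs for training a conversion model; in this toy case the second source frame is matched to both repeated target frames.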