Emulating & Reproducing Accurate Human Voice with Machine Learning / by Julian Kramer

Just watch this mind-blowing demo of Adobe's Project VoCo. All you need is about 20 minutes of recordings of one particular voice.

Just to summarize: this "Audio Photoshop" learns a voice from about 20 minutes of recorded speech. It then transcribes the waveform into text, which you can edit word by word. You can even add your own text, spoken in the emulated voice of that person.
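To make the "edit on a word basis" idea a bit more concrete, here is a toy Python sketch of the simplest case: deleting a word from the transcript and cutting the matching samples out of the waveform. This is not how Adobe does it. The `Word` class, the `delete_word` function and the sample positions are all made up for illustration, and the hard part (synthesizing new words in the learned voice) is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A transcript word aligned to the waveform (hypothetical structure)."""
    text: str
    start: int  # index of the word's first sample
    end: int    # index one past the word's last sample


def delete_word(samples, words, index):
    """Remove words[index] from the transcript and cut its samples from the audio."""
    target = words[index]
    removed = target.end - target.start

    # Cut the word's span out of the waveform.
    new_samples = samples[:target.start] + samples[target.end:]

    # Drop the word and shift every later word left by the removed length.
    new_words = []
    for i, w in enumerate(words):
        if i == index:
            continue
        if w.start >= target.end:
            new_words.append(Word(w.text, w.start - removed, w.end - removed))
        else:
            new_words.append(w)
    return new_samples, new_words


# Tiny fake "waveform" of 10 samples with three aligned words (assumed data).
samples = list(range(10))
words = [Word("mind", 0, 3), Word("not", 3, 6), Word("blown", 6, 10)]

new_samples, new_words = delete_word(samples, words, 1)
print(" ".join(w.text for w in new_words))
print(new_samples)
```

Editing the text and letting the audio follow is the whole trick of the interface. Inserting words that were never recorded is where the machine learning comes in.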

Mind blown!

Glad they are already adding audio watermarking. 

