I want to record my voice and turn it into an AI clone of my own voice that I can use for video editing at scale. I should be able to write text, then select fragments of that text and change the emotion or inflection of the highlighted words.
The output should be an audio file.
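
One possible way to represent the "select a fragment and change its inflection" step is to store each highlighted fragment as a character range and render the text to SSML, the W3C speech markup standard. This is only a sketch under assumptions: `StyledSpan` and `to_ssml` are hypothetical names, it assumes the chosen voice engine accepts SSML `<emphasis>` tags, and it covers emphasis only, since emotion is not part of standard SSML and would need engine-specific markup. It does not do the voice cloning or audio export itself.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

# Emphasis levels defined by the SSML specification.
VALID_LEVELS = {"strong", "moderate", "none", "reduced"}


@dataclass
class StyledSpan:
    """Hypothetical record of one highlighted fragment of the script."""
    start: int                 # inclusive character offset into the text
    end: int                   # exclusive character offset into the text
    emphasis: str = "moderate" # SSML emphasis level for this fragment


def to_ssml(text: str, spans: list[StyledSpan]) -> str:
    """Wrap each highlighted fragment in an SSML <emphasis> element."""
    parts = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.emphasis not in VALID_LEVELS:
            raise ValueError(f"unknown emphasis level: {span.emphasis}")
        if span.start < cursor or span.end > len(text) or span.start >= span.end:
            raise ValueError("spans must be non-empty, in range, and non-overlapping")
        # Plain text before the highlighted fragment, XML-escaped.
        parts.append(escape(text[cursor:span.start]))
        fragment = escape(text[span.start:span.end])
        parts.append(f'<emphasis level="{span.emphasis}">{fragment}</emphasis>')
        cursor = span.end
    # Remaining plain text after the last fragment.
    parts.append(escape(text[cursor:]))
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    script = "I really love this."
    # Highlight the word "really" (characters 2 to 7) with strong emphasis.
    print(to_ssml(script, [StyledSpan(start=2, end=8, emphasis="strong")]))
```

The resulting SSML string would then be passed to whichever text-to-speech engine hosts the cloned voice, and that engine would produce the audio output.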