I recorded someone speaking 100 different short sentences with an iPhone and saved the recording as an MP3 file.
I loaded the MP3 file into Premiere Pro and cut the timeline into 100 pieces, one per sentence.
I saved the 100 pieces as 100 WAV files.
Then I read each of the 100 WAV files in Python with a module such as wave, soundfile, or librosa, which converts a WAV file into a 1-dimensional array.
When I printed the length of each 1-dimensional array, I found that the lengths are weird, as follows:
wav_file_name : new_custom_training_data/sentence001.wav, array.shape : (548548,)
wav_file_name : new_custom_training_data/sentence002.wav, array.shape : (492492,)
wav_file_name : new_custom_training_data/sentence003.wav, array.shape : (840840,)
...
...
...
wav_file_name : new_custom_training_data/sentence097.wav, audio.shape : (272272,)
wav_file_name : new_custom_training_data/sentence098.wav, audio.shape : (616616,)
wav_file_name : new_custom_training_data/sentence099.wav, audio.shape : (600600,)
As you can see, the lengths of the 1-D arrays are all of the form 'xyzxyz'.
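A six-digit number of the form 'xyzxyz' is always xyz × 1001, so each length listed above should be an exact multiple of 1001 samples. Here is a minimal sketch to check that and to convert each length to a duration, assuming the 48000 Hz sample rate of the recording. The helper name `describe` is just something I made up for illustration, and the list only contains the six lengths shown above.

```python
# Lengths copied from the printed output above (only the six shown there).
SAMPLE_RATE = 48000  # Hz, the sample rate of the recording
lengths = [548548, 492492, 840840, 272272, 616616, 600600]


def describe(n, sample_rate=SAMPLE_RATE):
    """Return (quotient by 1001, remainder by 1001, duration in seconds)."""
    quotient, remainder = divmod(n, 1001)
    return quotient, remainder, n / sample_rate


for n in lengths:
    q, r, seconds = describe(n)
    # A remainder of 0 means the length is an exact multiple of 1001 samples.
    print(f"{n} = {q} * 1001 + {r}, duration = {seconds:.4f} s")
```

If every remainder is 0, the clips were presumably cut on some fixed grid whose step is a multiple of 1001 samples, rather than at arbitrary sample positions.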
I have no idea what causes this strange pattern.
Does anyone have any idea?
The sample rate was 48000 Hz. The reason I used Premiere Pro is that I also recorded the speaker's face as video and had to sync the audio and video.
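In case it helps anyone reproduce this, here is a rough sketch of how the frame count, sample rate, and channel count can be read straight from a WAV header with the standard-library `wave` module. It assumes the exported files are plain PCM WAV (`wave` may refuse other encodings such as 32-bit float), and `wav_info` and `duration_seconds` are hypothetical helper names, not part of any library.

```python
import wave


def wav_info(path):
    """Return (frames per channel, sample rate in Hz, channel count) from the WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes(), wf.getframerate(), wf.getnchannels()


def duration_seconds(path):
    """Return the clip duration in seconds, computed from the header values."""
    n_frames, sample_rate, _ = wav_info(path)
    return n_frames / sample_rate
```

For example, `wav_info("new_custom_training_data/sentence001.wav")` would return the header values for the first clip, which can be compared with the array length that soundfile or librosa reports.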
Thanks.
Are your WAV files all exactly the same duration?