Introduction
Let's try text_to_speech_demo, one of the demos included in Open Model Zoo.
Environment
This time I run it on macOS. (It works the same way on other operating systems, of course.)
MacBook Pro (13-inch, 2018, Four Thunderbolt 3 Ports)
2.7 GHz quad-core Intel Core i7, 16 GB memory
macOS Big Sur 11.1
Python 3.7.7
openvino 2021.2.185
Checking the models
Open models.lst to check which models the demo uses. Four models are required. If you do not have them yet, get them with the Model Downloader.
# This file can be used with the --list option of the model downloader.
forward-tacotron-duration-prediction
forward-tacotron-regression
wavernn-rnn
wavernn-upsampler
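As a rough illustration of how models.lst can be consumed, here is a minimal Python sketch. It parses the list (skipping the comment line) and assembles a Model Downloader command that uses the `--list` option mentioned in the file's own comment. The function names, the `downloader.py` location, and the `models` output directory are placeholders chosen for this example, not something the demo prescribes.

```python
import sys


def parse_model_list(text):
    """Return the model names found in models.lst-style text."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        names.append(line)
    return names


def downloader_command(downloader_py, lst_path, output_dir):
    """Build (but do not run) a Model Downloader command line.

    downloader_py is a placeholder: point it at the downloader script
    of your own Open Model Zoo checkout.
    """
    return [
        sys.executable, str(downloader_py),
        "--list", str(lst_path),   # read model names from models.lst
        "-o", str(output_dir),     # directory to save the models in
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would start the download. Building the command as a list avoids shell quoting problems when paths contain spaces.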
Checking the help
% python3 text_to_speech_demo.py -h
usage: text_to_speech_demo.py [-h] -m_duration MODEL_DURATION -m_forward
                              MODEL_FORWARD -m_upsample MODEL_UPSAMPLE -m_rnn
                              MODEL_RNN -i INPUT [-o OUT]
                              [--upsampler_width UPSAMPLER_WIDTH] [-d DEVICE]

Options:
  -h, --help            Show this help message and exit.
  -m_duration MODEL_DURATION, --model_duration MODEL_DURATION
                        Required. Path to ForwardTacotron`s duration
                        prediction part (*.xml format).
  -m_forward MODEL_FORWARD, --model_forward MODEL_FORWARD
                        Required. Path to ForwardTacotron`s mel-spectrogram
                        regression part (*.xml format).
  -m_upsample MODEL_UPSAMPLE, --model_upsample MODEL_UPSAMPLE
                        Required. Path to WaveRNN`s part for mel-spectrogram
                        upsampling by time axis (*.xml format).
  -m_rnn MODEL_RNN, --model_rnn MODEL_RNN
                        Required. Path to WaveRNN`s part for waveform
                        autoregression (*.xml format).
  -i INPUT, --input INPUT
                        Text file with text.
  -o OUT, --out OUT     Required. Path to an output .wav file
  --upsampler_width UPSAMPLER_WIDTH
                        Width for reshaping of the model_upsample. If -1 then
                        no reshape. Do not use with FP16 model.
  -d DEVICE, --device DEVICE
                        Optional. Specify the target device to infer on; CPU,
                        GPU, FPGA, HDDL, MYRIAD or HETERO is acceptable. The
                        sample will look for a suitable plugin for device
                        specified. Default value is CPU
Running the demo
Give the demo a text file and it writes the synthesized speech to a WAV file. First, let's have it read a quote from Vincent van Gogh.
Your life would be very empty if you had nothing to regret.
% python3 text_to_speech_demo.py -m_duration ./models/forward-tacotron-duration-prediction/FP16/forward-tacotron-duration-prediction.xml -m_forward ./models/forward-tacotron-regression/FP16/forward-tacotron-regression.xml -m_upsample ./models/wavernn-upsampler/FP16/wavernn-upsampler.xml -m_rnn ./models/wavernn-rnn/FP16/wavernn-rnn.xml -i test.txt -o test.wav
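The command above is long because each of the four models needs its own flag and path. As a sketch only, the same argument list could be assembled in Python, assuming the models sit under `<models_dir>/<model name>/<precision>/<model name>.xml` as in the command above. The helper name `build_demo_command` and the `MODEL_FLAGS` table are made up for this example; only flags listed in the demo's help are used.

```python
from pathlib import Path

# Flag -> model name, following the demo's help text and models.lst.
MODEL_FLAGS = {
    "-m_duration": "forward-tacotron-duration-prediction",
    "-m_forward": "forward-tacotron-regression",
    "-m_upsample": "wavernn-upsampler",
    "-m_rnn": "wavernn-rnn",
}


def build_demo_command(models_dir, precision, input_txt, output_wav):
    """Build (but do not run) the text_to_speech_demo.py command line.

    Assumes the layout <models_dir>/<name>/<precision>/<name>.xml.
    """
    cmd = ["python3", "text_to_speech_demo.py"]
    for flag, name in MODEL_FLAGS.items():
        xml = Path(models_dir) / name / precision / f"{name}.xml"
        cmd += [flag, xml.as_posix()]
    cmd += ["-i", str(input_txt), "-o", str(output_wav)]
    return cmd
```

For example, `build_demo_command("./models", "FP16", "test.txt", "test.wav")` yields the same arguments as the command above (with the leading `./` normalized away), and the list can be handed to `subprocess.run`.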
OK, it spoke. Next, as a longer, multi-line example, let's have it read a quote from Steve Jobs. The parameters are the same, so I omit them here.
Your time is limited, so don't waste it living someone else's life. Don't be trapped by dogma — which is living with the results of other people's thinking. Don't let the noise of others' opinions drown out your own inner voice. And most important, have the courage to follow your heart and intuition. They somehow already know what you truly want to become. Everything else is secondary.
It speaks smoothly.
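Besides listening to the result, the generated file can be checked programmatically. The following is a minimal sketch using Python's standard `wave` module, assuming the demo's output is an uncompressed PCM WAV file that this module can read. The helper name `wav_summary` is chosen for this example.

```python
import wave


def wav_summary(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            # Duration in seconds = number of frames / frames per second.
            "duration_sec": frames / rate,
        }
```

Calling `wav_summary("test.wav")` after a run gives a quick way to confirm that a longer input text really produced a longer recording.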
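Freelance IT engineer (a jack-of-all-trades). Hobbies include mountain-stream fishing, watching soccer, inline hockey, ice hockey, and building things with the Raspberry Pi. Interested in practical uses of AI and currently experimenting through trial and error.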