OpenVINO 2022.1: Running OpenModelZoo C++ Demos on Ubuntu

This time we install OpenModelZoo, a familiar companion to OpenVINO.
OpenModelZoo is a collection of demos that run on OpenVINO.
There are many demos, so it is fun to look for one that matches the kind of AI you are after.

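Downloading OpenModelZoo

In 2022.1, OpenModelZoo is bundled with neither the Runtime nor the Dev Tools, so get it with git.

The demos are documented at the URL below. The repository itself is the GitHub URL used in the git clone command that follows.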

https://docs.openvino.ai/nightly/omz_demos.html#doxid-omz-demos

$ git clone https://github.com/openvinotoolkit/open_model_zoo.git

Command 'git' not found, but can be installed with:

sudo apt install git

This was a clean install of Ubuntu, so git had not been installed yet… Install git first, then run the clone again.

$ sudo apt-get install git

$ git clone https://github.com/openvinotoolkit/open_model_zoo.git

$ cd open_model_zoo

$ git submodule update --init --recursive
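The clone steps above can be wrapped in a small helper so that rerunning them is harmless. This is only a sketch: `have_cmd` and `fetch_omz` are hypothetical names introduced here, not part of OpenModelZoo, and the default directory name open_model_zoo is assumed from the commands above.

```shell
# Sketch only: have_cmd and fetch_omz are hypothetical helper names.

# Succeeds when the given command is available in this shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Clone OpenModelZoo into $1 (default: open_model_zoo) unless it is
# already there, then fetch the submodules.
fetch_omz() {
    dest="${1:-open_model_zoo}"
    if ! have_cmd git; then
        echo "git not found; install it with: sudo apt-get install git" >&2
        return 1
    fi
    if [ ! -d "$dest/.git" ]; then
        git clone https://github.com/openvinotoolkit/open_model_zoo.git "$dest" || return 1
    fi
    git -C "$dest" submodule update --init --recursive
}
```

Calling `fetch_omz` with no argument clones into ./open_model_zoo, the same location the manual steps use.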

demo list

The available demos are listed below.

  • 3D Human Pose Estimation Python Demo
  • 3D Segmentation Python Demo
  • Action Recognition Python Demo
  • Background Subtraction Python Demo
  • Background Subtraction C++ G-API Demo
  • BERT Named Entity Recognition Python Demo
  • BERT Question Answering Python Demo
  • BERT Question Answering Embedding Python Demo
  • Classification Python Demo
  • Classification Benchmark C++ Demo
  • Colorization Python Demo
  • Crossroad Camera C++ Demo
  • Deblurring Python Demo
  • Face Detection MTCNN Python Demo
  • Face Detection MTCNN C++ G-API Demo
  • Face Recognition Python Demo
  • Formula Recognition Python Demo
  • Gaze Estimation C++ Demo
  • Gaze Estimation C++ G-API Demo
  • Gesture Recognition Python Demo
  • Gesture Recognition C++ G-API Demo
  • GPT-2 Text Prediction Python Demo
  • Handwritten Text Recognition Python Demo
  • Human Pose Estimation C++ Demo
  • Human Pose Estimation Python Demo
  • Image Inpainting Python Demo
  • Image Processing C++ Demo
  • Image Retrieval Python Demo
  • Image Segmentation C++ Demo
  • Image Segmentation Python Demo
  • Image Translation Python Demo
  • Instance Segmentation Python Demo
  • Interactive Face Detection C++ Demo
  • Interactive Face Detection G-API Demo
  • Machine Translation Python Demo
  • Mask R-CNN C++ Demo for TensorFlow Object Detection API
  • Monodepth Python Demo
  • MRI Reconstruction C++ Demo
  • MRI Reconstruction Python Demo
  • Multi-Camera Multi-Target Tracking Python Demo
  • Multi-Channel Face Detection C++ Demo
  • Multi-Channel Human Pose Estimation C++ Demo
  • Multi-Channel Object Detection Yolov3 C++ Demo
  • Noise Suppression Python Demo
  • Noise Suppression C++ Demo
  • Object Detection Python Demo
  • Object Detection C++ Demo
  • Pedestrian Tracker C++ Demo
  • Place Recognition Python Demo
  • Security Barrier Camera C++ Demo
  • Speech Recognition DeepSpeech Python Demo
  • Speech Recognition QuartzNet Python Demo
  • Speech Recognition Wav2Vec Python Demo
  • Single Human Pose Estimation Python Demo
  • Smart Classroom C++ Demo
  • Smart Classroom C++ G-API Demo
  • Smartlab Python Demo
  • Social Distance C++ Demo
  • Sound Classification Python Demo
  • Text Detection C++ Demo
  • Text Spotting Python Demo
  • Text-to-speech Python Demo
  • Time Series Forecasting Python Demo
  • Whiteboard Inpainting Python Demo

demo build

$ cd demos/

$ source ~/intel/openvino_2022/setupvars.sh

$ ./build_demos.sh

That is all there is to the build.
It finished more quickly than expected…

The built executables are placed in
~/omz_demos_build/intel64/Release
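To confirm what the build produced, the executables in that directory can be listed. A minimal sketch, assuming the default output directory mentioned above. `list_built_demos` is a hypothetical helper name, and any other executable files in the directory would be listed as well.

```shell
# Sketch only: list_built_demos is a hypothetical helper name.
# Prints the names of executable regular files in the build directory.
list_built_demos() {
    dir="${1:-$HOME/omz_demos_build/intel64/Release}"
    if [ ! -d "$dir" ]; then
        echo "build directory not found: $dir" >&2
        return 1
    fi
    for f in "$dir"/*; do
        # Skip subdirectories and non-executable files.
        if [ -f "$f" ] && [ -x "$f" ]; then
            basename "$f"
        fi
    done
}
```

Running `list_built_demos` with no argument checks the default location. A different directory can be passed as the first argument.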

demo run

Move to ~/omz_demos_build/intel64/Release.

This time we run text_detection_demo.

$ source ~/intel/openvino_2022/setupvars.sh

$ ./text_detection_demo -h

text_detection_demo [OPTION]
Options:
-h                             Print a usage message.
-i                             Required. An input to process. The input must be a single image, a folder of images, video file or camera id.
-loop                          Optional. Enable reading the input in a loop.
-o "<path>"                    Optional. Name of the output file(s) to save.
-limit "<num>"                 Optional. Number of frames to store in output. If 0 is set, all frames are stored.
-m_td "<path>"                 Required. Path to the Text Detection model (.xml) file.
-m_tr "<path>"                 Required. Path to the Text Recognition model (.xml) file.
-dt "<type>"                   Optional. Type of the decoder, either 'simple' for SimpleDecoder or 'ctc' for CTC greedy and CTC beam search decoders. Default is 'ctc'
-m_tr_ss "<value>" or "<path>" Optional. String or vocabulary file with symbol set for the Text Recognition model.
-tr_pt_first                   Optional. Specifies if pad token is the first symbol in the alphabet. Default is false
-lower                         Optional. Set this flag to convert recognized text to lowercase
-out_enc_hidden_name "<value>" Optional. Name of the text recognition model encoder output hidden blob
-out_dec_hidden_name "<value>" Optional. Name of the text recognition model decoder output hidden blob
-in_dec_hidden_name "<value>"  Optional. Name of the text recognition model decoder input hidden blob
-features_name "<value>"       Optional. Name of the text recognition model features blob
-in_dec_symbol_name "<value>"  Optional. Name of the text recognition model decoder input blob (prev. decoded symbol)
-out_dec_symbol_name "<value>" Optional. Name of the text recognition model decoder output blob (probability distribution over tokens)
-tr_o_blb_nm "<value>"         Optional. Name of the output blob of the model which would be used as model output. If not stated, first blob of the model would be used.
-cc                            Optional. If it is set, then in case of absence of the Text Detector, the Text Recognition model takes a central image crop as an input, but not full frame.
-w_td "<value>"                Optional. Input image width for Text Detection model.
-h_td "<value>"                Optional. Input image height for Text Detection model.
-thr "<value>"                 Optional. Specify a recognition confidence threshold. Text detection candidates with text recognition confidence below specified threshold are rejected.
-cls_pixel_thr "<value>"       Optional. Specify a confidence threshold for pixel classification. Pixels with classification confidence below specified threshold are rejected.
-link_pixel_thr "<value>"      Optional. Specify a confidence threshold for pixel linkage. Pixels with linkage confidence below specified threshold are not linked.
-max_rect_num "<value>"        Optional. Maximum number of rectangles to recognize. If it is negative, number of rectangles to recognize is not limited.
-d_td "<device>"               Optional. Specify the target device for the Text Detection model to infer on (the list of available devices is shown below). The demo will look for a suitable plugin for a specified device. By default, it is CPU.
-d_tr "<device>"               Optional. Specify the target device for the Text Recognition model to infer on (the list of available devices is shown below). The demo will look for a suitable plugin for a specified device. By default, it is CPU.
-auto_resize                   Optional. Enables resizable input with support of ROI crop & auto resize.
-no_show                       Optional. If it is true, then detected text will not be shown on image frame. By default, it is false.
-r                             Optional. Output Inference results as raw values.
-u                             Optional. List of monitors to show initially.
-b                             Optional. Bandwidth for CTC beam search decoder. Default value is 0, in this case CTC greedy decoder will be used.
-start_index                   Optional. Start index for Simple decoder. Default value is 0.
-pad                           Optional. Pad symbol. Default value is '#'.
Available devices: CPU GNA

Two model files are required:
-m_td  Required. Path to the Text Detection model (.xml) file.
-m_tr  Required. Path to the Text Recognition model (.xml) file.

First, make the downloader available.
In a separate terminal:

$ source openvino_env/bin/activate

$ cd open_model_zoo/demos/text_detection_demo/cpp/

$ omz_downloader --list models.lst

The models are large, so this takes a while.
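The run command in this article only uses the FP16 files, so one possible way to shorten the download is to restrict the precision. The sketch below assumes that omz_downloader's `--precisions` and `-o` options behave as described in its help output. `download_fp16` is a hypothetical wrapper name.

```shell
# Sketch only: download_fp16 is a hypothetical wrapper name.
# $1 = model list file (e.g. models.lst), $2 = output directory (default: .)
download_fp16() {
    list="$1"
    out="${2:-.}"
    if ! command -v omz_downloader >/dev/null 2>&1; then
        echo "omz_downloader not found; activate openvino_env first" >&2
        return 1
    fi
    # Download only the FP16 variant of every model named in the list.
    omz_downloader --list "$list" --precisions FP16 -o "$out"
}
```

From the demo's cpp directory, `download_fp16 models.lst` would place the files under ./intel, the same layout as the plain `--list` command.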

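Once the models are downloaded, prepare an image containing some captured text and run the demo.

The -loop option at the end is there because the window would otherwise close immediately.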

$ ./text_detection_demo -m_td ~/open_model_zoo/demos/text_detection_demo/cpp/intel/text-detection-0004/FP16/text-detection-0004.xml -m_tr ~/open_model_zoo/demos/text_detection_demo/cpp/intel/text-recognition-0012/FP16/text-recognition-0012.xml -i ~/images/screenshot2.png -loop

Here is the result.

The demo detects where the text is located and reads it.
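The long run command is easier to read and edit when the model paths are built from a few variables. This is a sketch under the assumption that the models sit in the layout used above (<root>/intel/<name>/<precision>/<name>.xml). `model_xml` and the variable names are hypothetical and introduced only for illustration.

```shell
# Sketch only: model_xml and the variable names are hypothetical.
# $1 = download root, $2 = model name, $3 = precision (e.g. FP16)
model_xml() {
    printf '%s/intel/%s/%s/%s.xml\n' "$1" "$2" "$3" "$2"
}

MODEL_ROOT="$HOME/open_model_zoo/demos/text_detection_demo/cpp"
TD_XML=$(model_xml "$MODEL_ROOT" text-detection-0004 FP16)
TR_XML=$(model_xml "$MODEL_ROOT" text-recognition-0012 FP16)

# Run only when the demo binary and both model files are present.
if [ -x ./text_detection_demo ] && [ -f "$TD_XML" ] && [ -f "$TR_XML" ]; then
    ./text_detection_demo -m_td "$TD_XML" -m_tr "$TR_XML" \
        -i "$HOME/images/screenshot2.png" -loop
else
    echo "demo binary or model files not found; check the paths" >&2
fi
```

Run it from ~/omz_demos_build/intel64/Release after sourcing setupvars.sh. Switching to another model or precision then only means changing the arguments to `model_xml`.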