Commit 4d980a8c authored by Guo, Yejun's avatar Guo, Yejun Committed by Pedro Arthur

avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks

This filter accepts all the dnn networks which do image processing.
Currently, frame with formats rgb24 and bgr24 are supported. Other
formats such as gray and YUV will be supported next. The dnn network
can accept data in float32 or uint8 format. And the dnn network can
change frame size.

The following is a python script to halve the value of the first
channel of the pixel. It demos how to setup and execute dnn model
with python+tensorflow. It also generates .pb file which will be
used by ffmpeg.

import tensorflow as tf
import numpy as np
import imageio
in_img = imageio.imread('in.bmp')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]
filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
output = sess.run(y, feed_dict={x: in_data})
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
output = output * 255.0
output = output.astype(np.uint8)
imageio.imsave("out.bmp", np.squeeze(output))

To do the same thing with ffmpeg:
- generate halve_first_channel.pb with the above script
- generate halve_first_channel.model with tools/python/convert.py
- try with following commands
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
Signed-off-by: 's avatarGuo, Yejun <yejun.guo@intel.com>
Signed-off-by: 's avatarPedro Arthur <bygrandao@gmail.com>
parent fc7b6d55
......@@ -3473,6 +3473,7 @@ derain_filter_select="dnn"
deshake_filter_select="pixelutils"
deshake_opencl_filter_deps="opencl"
dilation_opencl_filter_deps="opencl"
dnn_processing_filter_select="dnn"
drawtext_filter_deps="libfreetype"
drawtext_filter_suggest="libfontconfig libfribidi"
elbg_filter_deps="avcodec"
......
......@@ -8928,6 +8928,50 @@ ffmpeg -i INPUT -f lavfi -i nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
@end example
@end itemize
@section dnn_processing
Do image processing with deep neural networks. Currently only AVFrame with RGB24
and BGR24 are supported, more formats will be added later.
The filter accepts the following options:
@table @option
@item dnn_backend
Specify which DNN backend to use for model loading and execution. This option accepts
the following values:
@table @samp
@item native
Native implementation of DNN loading and execution.
@item tensorflow
TensorFlow backend. To enable this backend you
need to install the TensorFlow for C library (see
@url{https://www.tensorflow.org/install/install_c}) and configure FFmpeg with
@code{--enable-libtensorflow}
@end table
Default value is @samp{native}.
@item model
Set path to model file specifying network architecture and its parameters.
Note that different backends use different file formats. TensorFlow and native
backend can load files for only its format.
Native model file (.model) can be generated from TensorFlow model file (.pb) by using tools/python/convert.py
@item input
Set the input name of the dnn network.
@item output
Set the output name of the dnn network.
@item fmt
Set the pixel format for the Frame. Allowed values are @code{AV_PIX_FMT_RGB24}, and @code{AV_PIX_FMT_BGR24}.
Default value is @code{AV_PIX_FMT_RGB24}.
@end table
@section drawbox
Draw a colored box on the input image.
......
......@@ -223,6 +223,7 @@ OBJS-$(CONFIG_DILATION_FILTER) += vf_neighbor.o
OBJS-$(CONFIG_DILATION_OPENCL_FILTER) += vf_neighbor_opencl.o opencl.o \
opencl/neighbor.o
OBJS-$(CONFIG_DISPLACE_FILTER) += vf_displace.o framesync.o
OBJS-$(CONFIG_DNN_PROCESSING_FILTER) += vf_dnn_processing.o
OBJS-$(CONFIG_DOUBLEWEAVE_FILTER) += vf_weave.o
OBJS-$(CONFIG_DRAWBOX_FILTER) += vf_drawbox.o
OBJS-$(CONFIG_DRAWGRAPH_FILTER) += f_drawgraph.o
......
......@@ -209,6 +209,7 @@ extern AVFilter ff_vf_detelecine;
extern AVFilter ff_vf_dilation;
extern AVFilter ff_vf_dilation_opencl;
extern AVFilter ff_vf_displace;
extern AVFilter ff_vf_dnn_processing;
extern AVFilter ff_vf_doubleweave;
extern AVFilter ff_vf_drawbox;
extern AVFilter ff_vf_drawgraph;
......
This diff is collapsed.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment