Time: June 25, 2022, 10:00 a.m.-12:00 p.m.
Venue: Tencent Meeting
Speaker: Asad Khan
Title:
Research on Real Time Dense Face Reconstruction Based on Synthesizing Photo Realistic Image and Computational and Topological Properties of Neural Networks
Abstract:
In recent years, cross-modal retrieval (i.e., image-to-text or text-to-image retrieval) has attracted great attention owing to the growing demand for handling tremendous amounts of multimodal data. To address the problem of inappropriate information being included between images and texts, we propose two cross-modal retrieval methods based on a dual-branch neural network defined on a common subspace and on hashing learning. First, we present a cross-modal retrieval method based on a multilabel information deep ranking model (MIDRM). In this method, we introduce a triplet loss function into the dual-branch neural network model. This function exploits the semantic information of both modalities, focusing not only on the similarities between matching image and text features but also on the distances between dissimilar images and texts. Second, we develop a new cross-modal hashing (CMH) method called the deep regularized hashing constraint (DRHC). In this method, a regularization function replaces the binary constraint, and the discrete values are constrained to a certain numerical range so that the network can be trained end to end. Overall, the time complexity is greatly reduced, as is the storage space required. Experiments on two widely used datasets demonstrate that the proposed MIDRM and DRHC models outperform state-of-the-art methods. The experimental results also show that our approach increases the mean average precision (mAP) of cross-modal retrieval.
In this work, the efficiency of convolutional neural networks (CNNs) facilitates 3D face reconstruction, which takes a single image as input and achieves strong performance in generating detailed face geometry. The success of CNN-based techniques depends critically on large-scale labelled data. However, no publicly available dataset provides a sufficient quantity of face images with corresponding 3D face geometry annotations.
State-of-the-art learning-based 3D face reconstruction methods synthesize their training data using a coarse morphable face model, which yields non-photo-realistic face images. In this work, using learning-based inverse face rendering, we propose a novel data-generation technique that renders a large number of photo-realistic face images with distinct properties. With these well-constructed datasets, we train two cascaded CNNs in a coarse-to-fine manner for real-time, fine-scale textured 3D face reconstruction.
The networks are trained for detailed 3D face reconstruction from a single image. Experimental results demonstrate that our method can efficiently reconstruct 3D face shapes with geometric details from only one input image. Furthermore, the results demonstrate the effectiveness of our technique under pose, expression and lighting dynamics.
In short, we have proposed novel techniques in all the directions mentioned above. Our techniques considerably reduce the limitations of previous techniques of a similar nature and significantly broaden the diversity of both research directions within the wide spectrum of image processing.
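For readers unfamiliar with the triplet loss mentioned in the cross-modal retrieval part of the abstract, the following is a minimal sketch of a generic bidirectional hinge-based triplet ranking loss over paired image and text embeddings. It is not the MIDRM loss itself, whose exact form the abstract does not give; the function name, the cosine similarity, and the margin value of 0.2 are all assumptions made for illustration.

```python
import numpy as np

def bidirectional_triplet_loss(img, txt, margin=0.2):
    """Hypothetical helper: generic triplet ranking loss, not the MIDRM loss.

    img, txt: (n, d) arrays; row i of img is the match of row i of txt.
    margin:   assumed hinge margin (illustrative value).
    """
    # L2-normalise rows so the dot product equals cosine similarity.
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)

    sim = img @ txt.T        # sim[i, j] = similarity(image i, text j)
    pos = np.diag(sim)       # similarities of the matching pairs

    # Image as anchor: non-matching texts should score at least
    # `margin` below the matching text.
    cost_i2t = np.maximum(0.0, margin + sim - pos[:, None])
    # Text as anchor: non-matching images should score at least
    # `margin` below the matching image.
    cost_t2i = np.maximum(0.0, margin + sim - pos[None, :])

    # Ignore the diagonal, which holds the positive pairs themselves.
    mask = ~np.eye(len(sim), dtype=bool)
    return (cost_i2t[mask].sum() + cost_t2i[mask].sum()) / len(sim)


# Perfectly aligned, mutually orthogonal embeddings incur no loss.
aligned = np.eye(3)
loss_aligned = bidirectional_triplet_loss(aligned, aligned)

# Mismatched pairs (texts rotated by one position) are penalised.
loss_mismatched = bidirectional_triplet_loss(aligned, np.roll(aligned, 1, axis=0))
```

The loss is zero only when every matching pair is more similar than every non-matching pair by at least the margin, which is the "pull similar pairs together, push dissimilar pairs apart" behaviour the abstract describes.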