Georgia Tech Computer Vision Reading Group

Fall 2019
Date & Time: Wednesdays, 2-3pm
Location: CODA C1215 Midtown

To subscribe to the mailing list for presentation announcements, join the Google group. You can also add or import the Google calendar, which will be updated with the schedule.


Date Topic(s)/Paper(s) Presenter(s)
Aug 28, 2019 Kick-off meeting Cusuh
Sep 4, 2019 CVPR recap:

Vision for self-driving

- Panoptic Segmentation. Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, Piotr Dollár. CVPR 2019. [paper]
- Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving. Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger. CVPR 2019. [paper]
- Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression. Hamid Rezatofighi, Nathan Tsoi, JunYoung Gwak, Amir Sadeghian, Ian Reid, Silvio Savarese. CVPR 2019. [paper]
- End-to-end Interpretable Neural Motion Planner. Wenyuan Zeng, Wenjie Luo, Simon Suo, Abbas Sadat, Bin Yang, Sergio Casas, Raquel Urtasun. CVPR 2019. [paper]
- Deep Rigid Instance Scene Flow. Wei-Chiu Ma, Shenlong Wang, Rui Hu, Yuwen Xiong, Raquel Urtasun. CVPR 2019. [paper]
- Argoverse: 3D Tracking and Forecasting with Rich Maps. Ming-Fang Chang*, John Lambert*, Patsorn Sangkloy*, Jagjeet Singh*, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, James Hays. CVPR 2019. [paper]
- D2-Net: A Trainable CNN for Joint Description and Detection of Local Features. Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, Torsten Sattler. CVPR 2019. [paper]

Incorporating 3D information in networks

- 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. Christopher Choy, JunYoung Gwak, Silvio Savarese. CVPR 2019. [paper]
- DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, Steven Lovegrove. CVPR 2019. [arXiv]
- Occupancy Networks: Learning 3D Reconstruction in Function Space. Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, Andreas Geiger. CVPR 2019. [arXiv]
- Pushing the Boundaries of View Extrapolation with Multiplane Images. Pratul P. Srinivasan, Richard Tucker, Jonathan T. Barron. CVPR 2019. [arXiv]
- DeepVoxels: Learning Persistent 3D Feature Embeddings. Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Nießner, Gordon Wetzstein, Michael Zollhöfer. CVPR 2019. [arXiv]
- Strand-accurate Multi-view Hair Capture. Giljoo Nam, Chenglei Wu, Min H. Kim, Yaser Sheikh. CVPR 2019. [paper]

Generative models

- A Style-Based Generator Architecture for Generative Adversarial Networks. Tero Karras, Samuli Laine, Timo Aila. CVPR 2019. [arXiv]
- Semantic Image Synthesis with Spatially-Adaptive Normalization. Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu. CVPR 2019. [arXiv]
- Shapes and Context: In-the-Wild Image Synthesis & Manipulation. Aayush Bansal, Yaser Sheikh, Deva Ramanan. CVPR 2019. [arXiv]
- Non-Adversarial Image Synthesis with Generative Latent Nearest Neighbors. Yedid Hoshen, Jitendra Malik. CVPR 2019. [arXiv]
- Conditional Adversarial Generative Flow for Controllable Image Synthesis. Rui Liu, Yu Liu, Xinyu Gong, Xiaogang Wang, Hongsheng Li. CVPR 2019. [arXiv]
- Unsupervised Primitive Discovery for Improved 3D Generative Modeling. Salman H. Khan, Yulan Guo, Munawar Hayat, Nick Barnes. CVPR 2019. [paper]

Amit & John
Sep 11, 2019


- Tracking without bells and whistles. Philipp Bergmann*, Tim Meinhardt*, Laura Leal-Taixe. arXiv pre-print 2019. [arXiv]
- Heterogeneous Association Graph Fusion for Target Association in Multiple Object Tracking. Hao Sheng, Yang Zhang, Jiahui Chen, Zhang Xiong, Jun Zhang. TCSVT 2018. [paper]
- Improvements to Frank-Wolfe optimization for multi-detector multi-object tracking. Roberto Henschel, Laura Leal-Taixe, Daniel Cremers, Bodo Rosenhahn. arXiv pre-print 2017. [arXiv]
- Motion Segmentation & Multiple Object Tracking by Correlation Co-Clustering. Margret Keuper, Siyu Tang, Bjoern Andres, Thomas Brox, Bernt Schiele. PAMI 2018. [paper]
- Real-Time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-identification. Long Chen, Haizhou Ai, Zijie Zhuang, Chong Shang. ICME 2018. [arXiv]
- Multiple Hypothesis Tracking Revisited. Chanho Kim, Fuxin Li, Arridhana Ciptadi, James M. Rehg. ICCV 2015. [paper]

Sep 18, 2019


- Modeling Uncertainty with Hedged Instance Embedding. Seong Joon Oh, Kevin Murphy, Jiyan Pan, Joseph Roth, Florian Schroff, Andrew Gallagher. ICLR 2019. [arXiv]
- What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? Alex Kendall, Yarin Gal. NeurIPS 2017. [arXiv]
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. Yarin Gal, Zoubin Ghahramani. ICML 2016. [arXiv]

Sep 25, 2019

Object detection

- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. NeurIPS 2015. [arXiv]
- Feature Pyramid Networks for Object Detection. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie. CVPR 2017. [arXiv]
- Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. ICCV 2017. [arXiv]

Real-time object detection

- SSD: Single Shot MultiBox Detector. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. ECCV 2016. [arXiv]
- YOLO9000: Better, Faster, Stronger. Joseph Redmon, Ali Farhadi. CVPR 2017. [arXiv]
- YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi. arXiv pre-print. [arXiv]

Instance segmentation

- Fully Convolutional Instance-aware Semantic Segmentation. Yi Li*, Haozhi Qi*, Jifeng Dai, Xiangyang Ji, Yichen Wei. CVPR 2017. [arXiv]
- Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick. ICCV 2017. [arXiv]
- Mask Scoring R-CNN. Zhaojin Huang, Lichao Huang, Yongchao Gong, Chang Huang, Xinggang Wang. CVPR 2019. [arXiv]

Real-time instance segmentation

- YOLACT: Real-Time Instance Segmentation. Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee. ICCV 2019. [arXiv]

Oct 2, 2019 PU-GAN: a Point Cloud Upsampling Adversarial Network. Ruihui Li, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, Pheng-Ann Heng. ICCV 2019. [arXiv] Patsorn
Oct 9, 2019 Invariant Information Clustering for Unsupervised Image Classification and Segmentation. Xu Ji, João F. Henriques, Andrea Vedaldi. ICCV 2019. [arXiv] James
Oct 16, 2019

Learning textures for meshes

- Texture Fields: Learning Texture Representations in Function Space. Michael Oechsle, Lars Mescheder, Michael Niemeyer, Thilo Strauss, Andreas Geiger. ICCV 2019. [paper]
- PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization. Shunsuke Saito*, Zeng Huang*, Ryota Natsume*, Shigeo Morishima, Angjoo Kanazawa, Hao Li. ICCV 2019. [arXiv]

Temporal deformation on meshes

- Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics. Michael Niemeyer, Lars Mescheder, Michael Oechsle, Andreas Geiger. ICCV 2019. [paper]

Function-based 3D representations

- Occupancy Networks: Learning 3D Reconstruction in Function Space. Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, Andreas Geiger. CVPR 2019. [arXiv]
- DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, Steven Lovegrove. CVPR 2019. [arXiv]

Oct 23, 2019 Title: Learning to Learn More with Less


Understanding how humans and machines learn from few examples remains a fundamental challenge. Humans are remarkably good at grasping a new concept from just a few examples, or learning a new skill from just a few trials. By contrast, state-of-the-art machine learning techniques typically require thousands of training examples and often break down if the training set is too small. In this talk, I will discuss our efforts towards endowing visual learning systems with few-shot learning ability. Our key insight is that the visual world is well structured and highly predictable not only in feature spaces but also in under-explored model and data spaces. Such structures and regularities enable the systems to learn how to learn new tasks rapidly by reusing previous experiences. I will focus on a few topics to demonstrate how to leverage this idea of learning to learn, or meta-learning, to address a broad range of few-shot learning tasks: meta-learning in model space and task-oriented generative modeling. I will also discuss some ongoing work towards building machines that are able to operate in highly dynamic and open environments, making intelligent and independent decisions based on insufficient information.


Yuxiong Wang is a postdoctoral fellow in the Robotics Institute at Carnegie Mellon University. He received a Ph.D. in robotics in 2018 from Carnegie Mellon University. His research interests lie in the intersection of computer vision, machine learning, and robotics, with a particular focus on few-shot learning and meta-learning. He has spent time at Facebook AI Research (FAIR).

Guest speaker: Yuxiong Wang
Oct 30, 2019 Learning Language Games through Interaction. Sida I. Wang, Percy Liang, Christopher D. Manning. ACL 2016. [arXiv] Arjun C.
Nov 6, 2019 CVPR deadline approaching --
Nov 13, 2019 CVPR deadline approaching --
Nov 20, 2019 Fashion++: Minimal Edits for Outfit Improvement. Wei-Lin Hsiao, Isay Katsman, Chao-Yuan Wu, Devi Parikh, Kristen Grauman. ICCV 2019. [arXiv] Meera
Nov 27, 2019 Thanksgiving break --
Dec 4, 2019
Dec 11, 2019 Final exams --

Previous semesters

Spring 2019

Date Paper(s) Presenter
Jan 14, 2019 Kick-off meeting Cusuh
Jan 21, 2019 MLK Jr. Day --
Jan 28, 2019 First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations. Guillermo Garcia-Hernando, Shanxin Yuan, Seungryul Baek, Tae-Kyun Kim. CVPR 2018. [paper] Samarth B.
Synthesis of Detailed Hand Manipulations Using Contact Sampling. Yuting Ye, C. Karen Liu. SIGGRAPH 2012. [paper]
V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map. Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee. CVPR 2018. [paper]
Hand Pose Estimation via Latent 2.5D Heatmap Regression. Umar Iqbal, Pavlo Molchanov, Thomas Breuel. ECCV 2018. [arXiv]
Feb 4, 2019 Realistic Evaluation of Deep Semi-Supervised Learning Algorithms. Avital Oliver*, Augustus Odena*, Colin Raffel*, Ekin D. Cubuk, Ian J. Goodfellow. NeurIPS 2018. [arXiv] Cusuh
Feb 11, 2019 Memory Aware Synapses: Learning what (not) to forget. Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach, Tinne Tuytelaars. ECCV 2018. [arXiv] Stefan
An Empirical Study of Example Forgetting during Deep Neural Network Learning. Mariya Toneva*, Alessandro Sordoni*, Remi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon. ICLR 2019. [arXiv]
Feb 18, 2019 CodeSLAM -- Learning a Compact, Optimisable Representation for Dense Visual SLAM. Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison. CVPR 2018. [paper] John
Object-Centric Photometric Bundle Adjustment with Deep Shape Prior. Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, Simon Lucey. WACV 2018. [paper]
Feb 25, 2019 End-to-end Recovery of Human Shape and Pose. Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik. CVPR 2018. [paper] Amit
SFV: Reinforcement Learning of Physical Skills from Video. Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine. SIGGRAPH Asia 2018. [paper]
Mar 4, 2019 Semi-Parametric Topological Memory for Navigation. Nikolay Savinov*, Alexey Dosovitskiy*, Vladlen Koltun. ICLR 2018. [arXiv] Apoorva
Taking a Deeper Look at the Inverse Compositional Algorithm. Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger. CVPR 2019. [arXiv]
Mar 11, 2019 ICCV deadline approaching --
Mar 18, 2019 ICCV deadline approaching --
Mar 25, 2019 ICCV supplementary deadline approaching --
Apr 1, 2019 LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving. Gregory P. Meyer*, Ankit Laddha*, Eric Kee, Carlos Vallespi-Gonzalez, Carl K. Wellington. CVPR 2019. [arXiv] Patsorn
Apr 8, 2019 SlowFast Networks for Video Recognition. Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He. ICCV 2019. [arXiv] Sean
Apr 15, 2019 Integrating New Knowledge into a Neural Network without Catastrophic Interference: Computational and Theoretical Investigations in a Hierarchically Structured Environment. 11:15am, EBB 1005. [GT Neuro Seminar calendar] Dr. James L. McClelland, Stanford University
Apr 22, 2019 DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, Steven Lovegrove. CVPR 2019. [arXiv] Samarth M.
Apr 29, 2019 Final exams --

Fall 2018

Date Paper(s) Presenter
Aug 29, 2018 Kick-off meeting Cusuh
Sep 6, 2018 Low-Shot Learning with Imprinted Weights. Hang Qi, Matthew Brown, David G. Lowe. CVPR 2018. [paper] Jon
Sep 13, 2018 ECCV --
Sep 20, 2018 CornerNet: Detecting Objects as Paired Keypoints. Hei Law, Jia Deng. ECCV 2018. [arXiv] Cusuh
Sep 27, 2018 Implicit 3D Orientation Learning for 6D Object Detection from RGB Images. Martin Sundermeyer, Zoltan-Csaba Marton, Maximilian Durner, Manuel Brucker, Rudolph Triebel. ECCV 2018. [paper] Ren
DeepIM: Deep Iterative Matching for 6D Pose Estimation. Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox. ECCV 2018. [arXiv]
Oct 4, 2018 PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz. CVPR 2018. [arXiv] Sean
Oct 11, 2018 Eigendecomposition-free Training of Deep Networks with Zero Eigenvalue-based Losses. Zheng Dang, Kwang Moo Yi, Yinlin Hu, Fei Wang, Pascal Fua, Mathieu Salzmann. ECCV 2018. [arXiv] Samarth
Learning to Find Good Correspondences. Kwang Moo Yi*, Eduard Trulls*, Yuki Ono, Vincent Lepetit, Mathieu Salzmann, Pascal Fua. CVPR 2018. [arXiv]
Oct 18, 2018 Progressive Neural Architecture Search. Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy. ECCV 2018. [arXiv] Nam
Oct 25, 2018 Neural 3D Mesh Renderer. Hiroharu Kato, Yoshitaka Ushiku, Tatsuya Harada. CVPR 2018. [arXiv] Amit
Learning Category-Specific Mesh Reconstruction from Image Collections. Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik. ECCV 2018. [arXiv]
Nov 1, 2018 Informative Features for Model Comparison. Wittawat Jitkrittum, Heishiro Kanagawa, Patsorn Sangkloy, James Hays, Bernhard Schölkopf, Arthur Gretton. NeurIPS 2018. [arXiv] Patsorn
Nov 8, 2018 CVPR deadline approaching --
Nov 15, 2018 CVPR deadline approaching --
Nov 22, 2018 Thanksgiving --
Nov 29, 2018 No meeting --
Dec 6, 2018 Final exams --