(Contributors: Tian Chunyu, Zhang Yue) On November 3, 2024, the 8th Excellent Doctoral Forum of the Chinese Society of Image and Graphics (CSIG) was successfully held at Huazhong University of Science and Technology (HUST). The forum, organized by CSIG, was co-hosted by the School of Software Engineering at HUST, CSIG Wuhan Members’ Center, CSIG Youth Committee, CSIG Document Image Analysis and Recognition Committee, CSIG Imaging Detection and Perception Committee, CSIG Youth Promotion Club, and CSIG Excellent Doctoral Club. Professors Bai Xiang and Liu Yuliang from HUST, along with Professor Dong Yanni from Wuhan University, led the organization of the event.

The forum featured 10 speakers, including leading scholars and outstanding young researchers: Professors Du Bo, Jin Lianwen, Xie Hongtao, and Su Housheng, as well as Dr. Wang Wenhai, Associate Professor Liu Jing, Associate Professor Xu Tianyang, Dr. Liao Minghui, Dr. Yu Changqian, and Dr. Jiang Xingyu. The presentations were delivered in a hybrid format, with both in-person and live-streamed sessions, and drew a large number of registrants and widespread attention. Professors Liu Yuliang and Dong Yanni chaired the forum, with Liu Yuliang hosting the meeting.


The event began with an enthusiastic opening speech by Professor Ma Huimin, Vice President and Secretary General of CSIG. She encouraged participants to seize this opportunity for in-depth exchanges and mutual learning to collectively advance technological progress in image and graphics research.

Following this, Wang Shixian, Party Secretary of the School of Software Engineering at HUST, extended a warm welcome to all attending experts, faculty, and students.

Professor Ma Chao, Rotating Chair of the CSIG Excellent Doctoral Club, then gave an overview of the forum's history, highlighting its role in providing cutting-edge learning and exchange opportunities for young scholars.

Highlights of the Reports by Leading Scholars

The first session was led by Professor Du Bo, who presented on "Large Model Methods and Their Domain Applications." He explored the application of large models in artificial intelligence, particularly in remote sensing and other vertical domains. His team’s research included model architecture, dataset construction, and performance evaluation. He also discussed challenges and future directions for large model applications in specific fields.

Professor Jin Lianwen delivered a talk titled "OCR in the Era of General Artificial Intelligence." He reviewed the evolution of OCR technology, from traditional template-matching methods to modern deep learning approaches. He discussed new challenges and opportunities under the paradigm of general artificial intelligence, including multimodal data fusion and text recognition in complex scenarios. His team’s work on ancient text recognition and unified multi-task learning strategies was also shared.

Professor Xie Hongtao presented on "Active Defense and Passive Detection Techniques for Deepfake Technology." He introduced the fundamental principles and application scenarios of deepfake technologies, such as fake news generation and malicious attacks. He then explored approaches to actively defend against and passively detect deepfakes, sharing his team’s cutting-edge research.

Professor Su Housheng delivered a talk titled "Distributed State Estimation in Multi-Agent Systems." He introduced the basic concepts and applications of multi-agent systems, including drone swarms and intelligent transportation. His report covered distributed filter design, communication protocol optimization, and robustness analysis, concluding with future directions in distributed state estimation.

Reports from Excellent Doctoral Scholars

Dr. Wang Wenhai presented on "Technical Evolution and Application Exploration of the Shusheng-VL Multimodal Large Model." He discussed the fusion of visual, textual, and speech modalities, shared his team’s new InternVL2 model design, and analyzed future trends in multimodal large models.

Associate Professor Liu Jing delivered a talk on "Research on Cross-Modal Video Action Localization," discussing challenges in heterogeneous multimodal data and introducing her team’s spatiotemporal modeling framework and experimental results.

Associate Professor Xu Tianyang presented on "Visual Multimodal Fusion and Perception," discussing techniques such as feature extraction, alignment, and multimodal network design, and exploring future development trends in the field.

Dr. Liao Minghui from Huawei Terminal BG gave a talk on "Visual Multimodal Large Models and Their Applications in UI Understanding," sharing insights into interface element recognition and user intent analysis and highlighting the UI-Hawk model.

Dr. Yu Changqian from Kunlun Wanwei introduced "Research and Applications of Multimodal Understanding and Generation." He shared his team’s DiT-MoE model and discussed advancements in multimodal alignment and generation methods.

Dr. Jiang Xingyu explored "Adaptive Image Feature Matching in Multi-Scene Contexts," presenting methods for feature extraction and adaptive adjustment in diverse scenarios and outlining future trends in this research.

Panel Discussion and Closing Remarks

Professors Ma Chao, Zhang Dingwen, Zheng Bolong, and others participated in a panel discussion titled "How Advisors Can Guide Students to High-Quality Research in a Fast-Paced Era." Panelists shared insights from both faculty and student perspectives, emphasizing the importance of aligning individual interests with lab strengths, fostering proper research mindsets, and encouraging collaboration.

Professor Dong Yanni concluded the forum with a summary of the day’s reports and discussions, expressing gratitude to all participants and encouraging continued engagement with CSIG activities to collectively drive advancements in image and graphics research.