Keynote Speech 3, 4 and 5

Saturday, Aug. 24, 2024

Location: Weizhen Building, Ziyou Campus, Northeast Normal University (东北师范大学自由校区惟真楼)

Time: 9:00 a.m. — 9:50 a.m.

Host: Wenguang Sun

Xin Liu

Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Title: Decentralized Stochastic Subgradient Methods for Nonsmooth Nonconvex Optimization

Abstract: This work focuses on decentralized optimization problems with nonconvex, nonsmooth objective functions, in particular the decentralized training of nonsmooth neural networks. We introduce a unified framework for analyzing the global convergence of decentralized stochastic subgradient methods. We prove global convergence of the framework under mild conditions by showing that the generated sequence asymptotically approximates the trajectories of its associated differential inclusion. We further verify that the framework covers a wide range of existing efficient decentralized subgradient methods, including decentralized stochastic subgradient descent (DSGD), DSGD with the gradient-tracking technique, and DSGD with momentum (DSGDM). Within the framework, we also propose a new approach that employs the sign map to regularize the update directions in DSGDM. Consequently, we establish, for the first time, global convergence of these methods when applied to nonsmooth nonconvex objectives. Preliminary numerical experiments demonstrate that the framework yields highly efficient decentralized subgradient methods with convergence guarantees for training nonsmooth neural networks.

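CV: Xin Liu (刘歆) is a "Feng Kang Chief Researcher" and doctoral supervisor at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and deputy director of its Institute of Computational Mathematics and Scientific/Engineering Computing. His main research interests include manifold optimization, distributed optimization, and their applications in materials computation, big data analysis, and machine learning. In 2016, 2021, and 2023 he received, respectively, the Excellent Young Scientists Fund and the Distinguished Young Scholars Fund of the National Natural Science Foundation of China, and funding from a key special project of the Ministry of Science and Technology. He serves on the editorial boards of journals including MPC, JCM, JIMO, and APJOR, is a young editorial board member of Science China Mathematics (Chinese and English editions), and is an associate editor of Mathematica Numerica Sinica (《计算数学》).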

Time: 10:00 a.m. — 10:50 a.m.

Host: Xiangfei Wang

Deyu Meng

Xi'an Jiaotong University

Title: Deep Learning Theory and Algorithms from an Infinite-Dimensional Perspective

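Abstract: Most existing deep learning methods design basic elements such as data representations and network architectures in a finite-dimensional manner, yet the true intrinsic representation of these elements should be infinite-dimensional. Simplified finite-dimensional designs often overlook the essentially infinite-dimensional nature of each algorithmic element, which limits both theoretical exploration and the extension of applications. To address this issue, this talk discusses infinite-dimensional representations of images, convolution kernels, and gradient fields. It presents the research team's preliminary results on fundamental deep learning theory and algorithms, including parameterized convolution kernels, infinite-dimensional neural representations, and a quantum-like uncertainty principle for deep networks, and introduces some typical example applications derived from them.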

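CV: Deyu Meng (孟德宇) is a professor and doctoral supervisor at Xi'an Jiaotong University, and deputy director of the Statistics and Big Data Center of the National Engineering Laboratory for Big Data Algorithms and Analysis Technology. He has published more than one hundred papers, with over 27,000 citations on Google Scholar. He currently serves on the editorial boards of seven journals, including TPAMI and NSR. His research currently focuses on the fundamental theory and algorithms of machine learning.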

Time: 11:00 a.m. — 11:50 a.m.

Host: Wenguang Sun

Zhiqin Xu

Institute of Natural Sciences, Shanghai Jiao Tong University

Title: Phenomenon-Driven Understanding of Deep Learning

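Abstract: Starting from several stable and widespread phenomena, such as the frequency principle and condensation, this talk introduces some recent theoretical frontiers in deep learning. It discusses how phenomenon-driven theoretical research can advance deep learning and guide the design of practical algorithms; for example, the low-frequency bias principle has inspired the design of a variety of multi-scale neural network algorithms. Furthermore, we design a class of anchor functions to study the mechanisms of large language models in a low-cost, phenomenon-driven way, and introduce some of the underlying mechanisms of large language models.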

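CV: Zhiqin Xu (许志钦) is a tenure-track associate professor at the Institute of Natural Sciences and the School of Mathematical Sciences, Shanghai Jiao Tong University. He received his bachelor's degree from Zhiyuan College, Shanghai Jiao Tong University, in 2012 and his Ph.D. in applied mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2019 he was a postdoctoral researcher at New York University Abu Dhabi and the Courant Institute. Together with collaborators, he discovered the frequency principle, parameter condensation, and the energy landscape embedding principle in deep learning, developed multi-scale neural networks, and proposed anchor functions for studying language models. His papers have appeared in journals and conferences including TPAMI, JMLR, AAAI, NeurIPS, SIMODS, CiCP, and CSIAM-AM. He is a founding managing editor of the Journal of Machine Learning.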