Deep learning is disruptive in many applications, mainly due to its superior performance. At the same time, many fundamental questions about deep learning remain unanswered, and the model complexity of deep neural networks is one of them. Model complexity concerns how complicated a problem a deep model can express, and how nonlinear and complex the function represented by a model with given parameters can be.
In machine learning, data mining, and deep learning, model complexity has always been a fundamental problem. It affects the learnability of models on specific problems and data, as well as their generalization ability on unseen data. Moreover, the complexity of a learned model is affected not only by the model architecture itself, but also by the data distribution, data complexity, and information volume. In recent years, model complexity has become an increasingly active research direction and provides theoretical guidance in many areas, such as model architecture search, graph representation learning, generalization study, and model compression.
In this tutorial, we give an overview of the state-of-the-art research on deep learning model complexity. We organize the studies into two directions, model expressive capacity and effective model complexity, and review the latest progress in each. In addition, we present some application examples of deep learning model complexity to demonstrate its utility.
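To make the notion of expressive capacity concrete, one measure commonly used for piecewise-linear networks is the number of linear regions a ReLU network induces on its input space. The following minimal sketch (our own illustration, not material from the tutorial; the architecture size and sampling scheme are arbitrary choices) estimates this quantity for a small, randomly initialized network by counting distinct ReLU activation patterns over sampled inputs.

```python
# Illustrative sketch: estimate how many linear regions of a small ReLU
# network are hit by random sampling. Each distinct on/off pattern of the
# ReLU units corresponds to a different linear region of the network.
import numpy as np

rng = np.random.default_rng(0)

# A 2-hidden-layer ReLU network on R^2 with randomly drawn weights
# (sizes chosen arbitrarily for illustration).
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 16)), rng.normal(size=16)

def activation_pattern(x):
    """Return the on/off pattern of all ReLU units for input x."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

# Sample inputs from a bounded region and count distinct patterns.
samples = rng.uniform(-1, 1, size=(50_000, 2))
patterns = {activation_pattern(x) for x in samples}
print(f"Estimated number of linear regions hit by sampling: {len(patterns)}")
```

Sampling only gives a lower bound on the true number of linear regions in the sampled domain, but it illustrates how expressive-capacity measures can be computed for a concrete model.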
Keywords: Model Complexity, Deep Neural Network, Deep Learning
Xia Hu is currently a Ph.D. candidate at the School of Computing Science, Simon Fraser University, Canada. Her research interests lie in deep learning, machine learning, and data mining, with an emphasis on interpretability, explainability, and model complexity. She is currently focusing on explanatory representations of complex models, as well as the interpretation and analysis of the internal mechanisms of network models (e.g., model complexity). She is also interested in various applications of deep learning to data mining problems.
Lingyang Chu is an Assistant Professor in the Department of Computing and Software at McMaster University. Before joining McMaster University, he was a principal researcher at Huawei Technologies Canada, where he led a research team focused on deep neural network interpretation, AI fairness, federated learning, and big data analytics. From 2015 to 2018, he worked as a postdoctoral fellow at Simon Fraser University. He received his doctoral degree from the University of Chinese Academy of Sciences in 2015. His research on large-scale graph mining and interpretable AI has been published in top-tier venues. One of his works on interpretable AI was covered by a mainstream Chinese web portal on AI research in 2018.
Jian Pei is a Professor at the School of Computing Science and an associate member of the Department of Statistics and Actuarial Science at Simon Fraser University, Canada. His general research areas include data science, big data, data mining, and database systems. His expertise is in developing effective and efficient data analysis techniques for novel data-intensive applications. He is recognized as a Fellow of the Royal Society of Canada (the national academy of Canada), the Canadian Academy of Engineering, ACM, and IEEE.
Jian Pei is a productive and influential author in data mining, database systems, and information retrieval. Since 2000, he has published one textbook, two monographs, and over 200 research papers in refereed journals and conferences, which have been cited over 100,000 times in the literature, including over 41,000 times in the last five years. His research has generated remarkable impact substantially beyond academia. His algorithms have been adopted by industry in production systems and popular open-source software suites, and he is responsible for several commercial systems of unprecedentedly large scale. He has received many prestigious awards, including the 2017 ACM SIGKDD Innovation Award, the 2015 ACM SIGKDD Service Award, the 2014 IEEE ICDM Research Contributions Award, a KDD Best Application Paper Award (2008), and an IEEE ICDE Influential Paper Award (2019).
Jiang Bian is a Principal Researcher and Research Manager at Microsoft Research, with research interests in AI for finance, AI for logistics, business AI, deep learning, multi-agent reinforcement learning, computational advertising, and a variety of machine learning applications. Prior to that, he was a senior scientist leading recommendation and search modeling at Yidian Inc., a startup building a content-oriented content delivery platform. Earlier, he worked at Yahoo! Labs as a Scientist, studying content optimization and personalization for Yahoo!'s key content modules. He has authored tens of research papers, cited thousands of times, at well-recognized AI conferences such as KDD, ICDE, AAAI, NIPS, and ICML. He has served as a program committee member and peer reviewer for many influential academic conferences and journals.
Weiqing Liu is a Senior Researcher at Microsoft Research. He holds a Ph.D. in Computer Science from the University of Science and Technology of China. His research interests focus on data mining and machine learning, and he actively transfers research into significant real-world applications, especially in finance. Currently, one of his research focuses is the critical challenges of applying AI in finance, especially the interpretability of machine learning models in application scenarios. His work has led to tens of research papers in prestigious conferences, such as KDD, WWW, WSDM, AAAI, and IJCAI.