Yong Qin (秦勇)

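Distinguished Professor and doctoral supervisor, Nankai University
Research areas: speech technology, human-computer interaction, natural language processing
Email: qinyong@nankai.edu.cn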

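In 1996, Yong Qin received his Ph.D. (Doctor of Science) in speech communication from the Institute of Acoustics, Chinese Academy of Sciences, and joined the IBM China Research Laboratory the same year. During nearly 25 years there, he took part in and led the research and development of many IBM speech technologies, products, and solutions, including the world's first Chinese continuous speech recognition product, a telephony speech server, embedded speech recognition systems, speech transcription systems, and real-time speech translation systems. These products and solutions were widely adopted by industry customers and greatly improved enterprise customers' ability to draw business insight from massive volumes of audio, speech, and video data.

In the management and analysis of structured, semi-structured, and unstructured data, the knowledge management department he led at the lab developed a series of core technologies for Chinese natural language understanding, including named entity recognition, text information visualization, sentiment analysis, and psycholinguistics-based user profiling, helping enterprises make better use of their data in the big-data era. In line with IBM's global strategy for cognitive computing, he organized the lab's R&D teams to contribute directly to building the IBM Watson artificial intelligence platform, including the development of core technologies for the Chinese-language IBM Watson speech cloud services and IBM Watson question-answering services.

Building on the lab's more than ten years of work in healthcare, he headed the cognitive healthcare team's R&D on disease management, developing clinical decision support technologies and solutions for chronic disease management that help primary-care physicians diagnose, treat, and manage chronic diseases more effectively. Beyond day-to-day research, he took on management responsibilities for the lab's external technical exchange, internal innovation incubation, and talent recruitment and development, fostering a strong academic atmosphere and helping establish the lab's reputation as a world-class research institution.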

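In July 2021, Yong Qin formally joined Nankai University. He is committed to passing on to students the advanced approaches to technological innovation practiced in corporate research labs, together with his extensive experience in bringing innovations into real-world use, and to contributing to education and research, in particular to building new academic programs in human language technology.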

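Research areas:

  • Intelligent speech technology, including speech recognition, speech synthesis, voice conversion, speech emotion recognition, audio pattern recognition, music intelligence, and dysarthria research;
  • Multimodal interaction technology, including digital humans, lip reading, lip-motion synthesis, and pose generation;
  • Applied artificial intelligence, with a focus on healthcare, education, military, security, and intelligent transportation.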

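Research and teaching:

    • Ongoing research projects:
      Research on Evaluation Methods and Tools for Speech and Related Multimodal Foundation Models (Ministry of Science and Technology, Science and Technology Innovation 2030 Major Project on New Generation Artificial Intelligence, topic: AI foundation model support platform and evaluation technology, 2022-2025)
      Research on Pre-trained Models and Domain Adaptation for Mandarin Speech Recognition of Elderly Speakers (national foundation project (国基金), 2022-2025)
      Nankai University-Lingxi Technology (零犀科技) Joint Research Center for Artificial Intelligence Technology (Nankai-Lingxi, industry-sponsored project, 2023-2025)
      Collaboration on Low-Resource Speech Recognition Technology (industry-sponsored project, 2022-2023)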
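    • Publications: more than 60 papers in major academic conferences and journals, including SIGCHI, IJCAI, ICASSP, Interspeech, ICPR, and ICME.
      Co-authored textbook: Speech Information Processing (《语音信息处理》), an undergraduate textbook at Beijing Institute of Technology, selected as a Huawei "Golden Course" (华为金课)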
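    • Courses taught:
      Natural Language Processing (64 class hours, for second- and third-year undergraduates)
      Speech Information Processing (64 class hours, for second- and third-year undergraduates)
      Management Lecture Series (16 class hours, for master's and doctoral students)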
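    • Graduation projects: actively supervises undergraduate graduation projects. Over two years he has supervised nearly 20 undergraduates through their projects, on topics spanning speech recognition and synthesis, natural language processing, computer vision, blockchain, affective computing, music intelligence, language identification, and multimodal interaction. He currently supervises 8 graduate students (5 master's and 3 doctoral), working on intelligent speech technology and affective computing.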

Recent publications co-authored by Professor Yong Qin:

  1. Evaluation of artificial intelligence systems for assisting neurologists with fast and accurate annotations of scalp electroencephalography data, Subhrajit Roy, Isabell Kiral, Mahtab Mirmomeni, Todd Mummert, Alan Braz, Jason Tsay, Jianbin Tang, Umar Asif, Thomas Schaffter, Mehmet Eren Ahsen, Toshiya Iwamori, Hiroki Yanagisawa, Hasan Poonawala, Piyush Madan, Yong Qin, Joseph Picone, Iyad Obeid, Bruno De Assis Marques, Stefan Maetschke, Rania Khalaf, Michal Rosen-Zvi, Gustavo Stolovitzky, Stefan Harrer, IBM Epilepsy Consortium, EBioMedicine 2021
  2. Identification and external validation of IgA nephropathy patients benefiting from immunosuppression therapy, Tingyu Chen, Eryu Xia, Tiange Chen, Caihong Zeng, Shaoshan Liang, Feng Xu, Yong Qin, Xiang Li, Yuan Zhang, Dandan Liang, Guotong Xie, Zhihong Liu, EBioMedicine 2020
  3. Few-Shot Audio Classification with Attentional Graph Neural Networks, Shilei Zhang, Yong Qin, Kewei Sun, Yonghua Lin, Interspeech 2019
  4. Pairwise-ranking based collaborative recurrent neural networks for clinical event prediction, Zhi Qiao, Shiwan Zhao, Cao Xiao, Xiang Li, Yong Qin, Fei Wang, IJCAI 2018
  5. Using Machine Learning Approaches for Emergency Room Visit Prediction based on Electronic Health Record Data, Zhi Qiao, Ning Sun, Xiang Li, Eryu Xia, Shiwan Zhao, Yong Qin, MIE 2018
  6. Using Model-Based Recursive Partitioning for Treatment-Subgroup Interactions Detection in Real-World Data: A Myocardial Infarction Case Study, Tiange Chen, Xiang Li, Jingang Yang, Jingyi Hu, Meilin Xu, Yong Qin, Yuejin Yang, MIE 2018
  7. Clinical Similarity based Framework for Hospital Medical Supplies Utilization Anomaly Detection: A Case Study, Ning Sun, Meilin Xu, Mingzhi Cai, Xudong Ma, Yong Qin, MIE 2018
  8. Fine-Tuning Neural Patient Question Retrieval Model with Generative Adversarial Networks, Guoyu Tang, Yuan Ni, Keqiang Wang, Yong Qin, MIE 2018
  9. Emotion recognition with multimodal features and temporal models, Shuai Wang, Wenxuan Wang, Jinming Zhao, Shizhe Chen, Qin Jin, Shilei Zhang, Yong Qin, ICMI 2017
  10. Video emotion recognition in the wild based on fusion of multimodal features, Shizhe Chen, Xinrui Li, Qin Jin, Shilei Zhang, Yong Qin, ICMI 2016
  11. Wake-up-word spotting using end-to-end deep neural network system, Shilei Zhang, Wen Liu, Yong Qin, ICPR 2016
  12. Rapid feature space MLLR speaker adaptation for deep neural network acoustic modeling, Shilei Zhang, Yong Qin, ICPR 2016
  13. Semi-supervised accent detection and modeling, Shilei Zhang, Yong Qin, ICASSP 2013
  14. Investigating Performance of the Discriminative Methods for Long-Term Speaker Adaptation, Danning Jiang, Dimitri Kanevsky, Vaibhava Goel, Yong Qin, Interspeech 2012
  15. Model dimensionality selection in bilinear transformation for feature space MLLR rapid speaker adaptation, Shilei Zhang, Yong Qin, ICASSP 2012
  16. Effects of automated transcription quality on non-native speakers’ comprehension in real-time computer-mediated communication, Yingxin Pan, Danning Jiang, Lin Yao, Michael Picheny, Yong Qin, CHI 2010
  17. The 2009 IBM GALE Mandarin Broadcast Transcription System, Stephen M. Chu, Daniel Povey, Hong-Kwang Kuo, Lidia Mangu, Shilei Zhang, Qin Shi, Yong Qin, ICASSP 2010
  18. Main Vowel Domain Tone Modeling with Lexical and Prosodic Analysis for Mandarin ASR, Shilei Zhang, Qin Shi, Stephen M. Chu, Yong Qin, ICASSP 2010
  19. Modeling Syllable-Based Pronunciation Variation for Accented Mandarin Speech Recognition, Shilei Zhang, Qin Shi, Yong Qin, ICPR 2010
  20. Automatic Pronunciation Transliteration for Chinese-English Mixed Language Keyword Spotting, Shilei Zhang, Zhiwei Shuang, Yong Qin, ICPR 2010

Recent patents by Professor Yong Qin:

  1. CLINICAL DECISION SUPPORT, USA, 16/405047, 2019
  2. AUGMENTING RELATIONAL DATABASE ENGINES WITH GRAPH QUERY CAPABILITY, USA, 16/691907, 2019
  3. DECENTRALIZED PRIVACY-PRESERVING CLINICAL DATA EVALUATION, USA, 16/238216, 2019
  4. EXTRACTING ENTITY RELATIONS FROM SEMI-STRUCTURED INFORMATION, USA, 16/241000, 2019
  5. UTILIZING EXTERNAL KNOWLEDGE AND MEMORY NETWORKS IN A QUESTION-ANSWERING SYSTEM, USA, 16/199923, 2018
  6. CLASSIFIER TRAINED WITH DATA OF DIFFERENT GRANULARITY, USA, 16/106320, 2018
  7. METHOD AND APPARATUS FOR AUTISM SPECTRUM DISORDER ASSESSMENT AND INTERVENTION, USA, 16/156113, 2018
  8. EMOTION CLASSIFICATION BASED ON EXPRESSION VARIATIONS ASSOCIATED WITH SAME OR SIMILAR EMOTIONS, USA, 15/791821, 2017
  9. ATTENTION BASED SEQUENTIAL IMAGE PROCESSING, USA, 15/792051, 2017
  10. SPEECH RECOGNITION BY SELECTING AND REFINING HOT WORDS, USA, 15/592773, 2017
  11. SPECIALIST KEYWORDS RECOMMENDATIONS IN SEMANTIC SPACE, USA, 15/352842, 2017
  12. VISUAL LIVENESS DETECTION, USA, 14/821258, 2016