The Evolution and Applications of Large Language Models

Topics: large language models, ChatGPT, efficiency, NLP

*Reposted from the AISP Seminar WeChat official account

"AI Security and Privacy" Forum Series, Session 19

ChatGPT in Action: Effectiveness, Security and Privacy

►►

Forum Time

March 28, 2023 (Tuesday), 10:30 AM-12:15 PM

►►

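Agenda

#1 Opening Remarks

Host:

Prof. Baoyuan Wu

Associate Professor, School of Data Science, The Chinese University of Hong Kong, Shenzhen; Director, Secure Computing Lab of Big Data, Shenzhen Research Institute of Big Data

Time: 10:30-10:35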

#2 Keynote Talk

Speaker:

Prof. Xia Hu

Associate Professor, Department of Computer Science, Rice University; Director of the Data Science Center; Co-founder of AIPOW

Talk title:

ChatGPT in Action: An Experimental Investigation of Its Effectiveness in NLP Tasks

Time: 10:35-11:25

#3 Q&A

Time: 11:25-11:35

#4 Panel Discussion

Panelists:

Prof. Xia Hu

Prof. Mai Xu

Prof. Ran He

Prof. Baoyuan Wu

Discussion topics:

① The impact of the emergence of ChatGPT on the development of human society and AI

② New security challenges and ethical issues posed by ChatGPT

Time: 11:35-12:15

►►

How to Watch

1. Bilibili live stream

http://live.bilibili.com/22947067

2. AISP Forum WeChat Channel

3. AI科技评论 (AI Technology Review) WeChat Channel

Talk Abstract

#1 Keynote Talk

Recent progress in large language models (LLMs) has produced highly effective models such as OpenAI's ChatGPT, which have demonstrated exceptional performance in tasks including question answering, essay writing, and code generation. This presentation will cover the evolution of LLMs from BERT to ChatGPT and showcase their use cases. Although LLMs are useful for many NLP tasks, one significant concern is the inadvertent disclosure of sensitive information, especially in the healthcare industry, where patient privacy is crucial. To address this concern, we developed a novel framework that generates high-quality synthetic data using ChatGPT and fine-tunes a local offline model for downstream tasks. The use of synthetic data improved the performance of downstream tasks, reduced the time and resources required for data collection and labeling, and addressed privacy concerns. Finally, we will discuss the regulation of LLMs, whose use has raised concerns about cheating in education. We will introduce our recent survey on LLM-generated text detection and discuss the opportunities and challenges it presents.

Speaker Bios

Dr. Xia "Ben" Hu is an Associate Professor in the Department of Computer Science at Rice University and director of the Center for Transforming Data to Knowledge (D2K Lab). Dr. Hu has published over 100 papers in major academic venues, including NeurIPS, ICLR, KDD, WWW, IJCAI, and AAAI. AutoKeras, an open-source package developed by his group, has become the most used automated deep learning system on GitHub (with over 8,000 stars and 1,000 forks). His work on deep collaborative filtering, anomaly detection, and knowledge graphs has been included in the TensorFlow package, the Apple production system, and the Bing production system, respectively. His papers have received several Best Paper (Candidate) awards from venues such as ICML, WWW, WSDM, ICDM, AMIA, and INFORMS. He is the recipient of the NSF CAREER Award and the ACM SIGKDD Rising Star Award. His work has been cited more than 18,000 times, with an h-index of 51. He is the General Co-Chair for WSDM 2020 and ICHI 2023. He is also the founder of AI POW LLC.

Professor Mai Xu is a doctoral supervisor at the School of Electronic Information Engineering, Beihang University. He is also a Chang Jiang Scholar appointed by the Ministry of Education and serves as the deputy director of the Youth Committee of the China Society of Image and Graphics. He has been the principal investigator for several research projects, including the National Natural Science Foundation of China's First Exploration, Key, Outstanding Youth, and Beijing Science and Technology Star programs. He has been honored with the first prize of the Ministry of Education's Technical Invention Award (as the first author), the first prize of the Technical Invention Award of the Chinese Association for Artificial Intelligence (as the second author), and the Outstanding Youth Achievements Transformation Award of the China Association for Science and Technology. Professor Xu's research interests include video compression and image processing. In the past five years, he has published more than 100 papers in prestigious journals such as IJCV, IEEE TPAMI, TIP, JSAC, and TMM, as well as important conferences such as IEEE CVPR, ICCV, ECCV, ACM MM, AAAI, and DCC. Several of his papers have been selected as ESI Highly Cited Papers or Hot Papers. Professor Xu has taught the undergraduate course "Image Processing" for 10 consecutive years and has received the Outstanding Teacher Award for Computer Science majors. He has also taught the graduate course "Machine Learning" for six consecutive years and has received the Beihang Graduate Course Excellence Teaching Award. He has been selected as one of the top ten teachers in the "I Love My Teacher" program at Beihang University.

Professor Ran He is a researcher at the State Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, and an IAPR Fellow. His research focuses on the foundational theories of pattern recognition and their applications in computer vision, biometric recognition, and artificial intelligence security. He has undertaken various research projects, including the National Science Fund for Excellent Young Scholars, the Beijing Outstanding Young Scientist Fund, and the Excellent Member program of the Youth Innovation Promotion Association of the Chinese Academy of Sciences. He has published one monograph on information-theoretic learning and 23 papers in the field's leading international journals, IEEE T-PAMI and IJCV. His contributions have been recognized through numerous awards, including the IEEE Signal Processing Society's Best Young Paper Award, the ICPR Best Scientific Paper Award, the CSIG Natural Science First Prize, and the Beijing Science and Technology Progress Second Prize. He serves as an editorial board member for journals such as IEEE T-IP, IEEE T-BIOM, Pattern Recognition, and Acta Automatica Sinica, and as an area chair for conferences such as NeurIPS, ICML, CVPR, and ECCV.

►►

Hosts

Shenzhen Research Institute of Big Data

China Society of Image and Graphics (CSIG)

►►

Organizers

CSIG Technical Committee on Visual Big Data

School of Data Science, The Chinese University of Hong Kong, Shenzhen

►►

Co-organizer

SCLBD

►►

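Media Support

AI科技评论 (AI Technology Review)

We Are Hiring

We are actively looking for full-time research scientists, data engineers, and visiting students in AI security and privacy, as well as postdocs and PhD students enrolling in fall 2023 (in AI security and privacy, computer vision, machine learning, and related areas). For more information about these positions, please follow the recruitment link.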
