KDD2017 Themes
Foreword to the Applied Data Science: Invited Talks Track at KDD-2017
The Applied Data Science (ADS) Invited Talks Track at KDD-2017 continues what has now become a "7-year tradition" at KDD conferences. This is the second year the track has operated under the ADS name, an evolution from its origins at KDD-2011 as the "Industry Practice Expo". The Conference on Knowledge Discovery and Data Mining (KDD) is the world's first, largest and best conference on Data Science, Data Mining, and Knowledge Discovery. It brings together a healthy mix of academic researchers, industry and government researchers, and practitioners from a wide range of institutions and fields. The primary focus of KDD is on peer-reviewed research contributions and the academic advancement of the field. This is an important goal, and in fact the KDD conference is now recognized as the most competitive and prestigious forum for presenting high-quality research results.
KDD, being fundamentally an applied field, also needs strong representation of high-impact applied work. Over the years of running the conference, we observed that our initial speaker-selection approach needed to be rethought because of the important contributions being made to the field outside traditional academic, industrial and government research laboratories. The result of this rethinking was a forum that showcases important contributions to Data Science through Big Data applications that address strategic problems. We wanted to capture the rising importance of Data Science and Machine Learning, especially in the Big Data environment, where structured and unstructured data create special challenges and, of course, present new opportunities. The goal of the Invited Talks Track is to curate contributions from leaders in our field who have made important contributions through the development of a system, the creation of a new and important business, or the development and market introduction of a product. Some of these contributions may never be the subject of an academic or detailed peer-reviewed paper, yet they are of critical importance to our very applied field.
To give you an idea of how rapidly this area is growing, and how this sector of our industry promises to be highly disruptive across many industries, we cite two articles out of a plethora of such coverage. According to IDC, global revenues from Big Data and business analytics will grow from $130.1 billion in 2016 to more than $203 billion in 2020, at a compound annual growth rate (CAGR) of 11.7% [1]. Furthermore, to quote from a Forbes article: "'Data monetization' will become a major source of revenues, as the world will create 180 zettabytes of data (or 180 trillion gigabytes) in 2025, up from less than 10 zettabytes in 2015." [2]
The Invited Talks are clustered around the following themes:
Data Science in Sensor Data:
David Potere, CEO of Tellus Labs, will speak on how common the analysis of spaceborne data has become in his talk: “Spaceborne data enters the mainstream.”
A related topic, the analysis of climate data, will be covered by Professor Vipin Kumar of the University of Minnesota.
Professor Jonathan How of MIT will cover the issues of uncertainty in learning and planning when it comes to dealing with this type of data.
Benchmarks & Process Management in Data Science:
Eduardo Ariño de la Rubia, Chief Data Scientist of Domino Data Lab, will speak on experiences with large enterprises in deploying process management tools for data scientists. Szilard Pafka, Chief Scientist of Epoch, will share experiences and lessons learned from attempting to build large-scale benchmark metrics and data sets in the field.
Understanding Behavior with Data Science:
Professor Andy Berglund of the University of Florida will speak on “Mining Big Data in Neuro Genetics to Understand Muscular Dystrophy.”
Mainak Mazumdar, EVP and Chief Research Officer of Nielsen, will speak on challenges in media measurement with Big Data. Paritosh Desai, SVP of Enterprise Data and Analytics at Target, addresses the real considerations of successfully delivering results at a large retailer in his talk: “It takes more than math and science to hit the bullseye with data.”
Applied Machine Learning:
Joshua Bloom, VP of Data and Analytics at GE, speaks on machine learning in practical industrial and manufacturing settings.
Rajesh Parekh, Director of Analytics at Facebook, delivers a talk on “Designing AI at Scale to Power Everyday Life.”
Professor Longbing Cao of UTS speaks on the dramatic results achieved by applying data science to tax enforcement in work with the Australian Taxation Office.
These talks present a rare opportunity to hear from the very best in the field about the most exciting topics in building highly scalable platforms for Data Science and deploying practical, real-world applications. Our invited speakers will share key insights from their experiences and present valuable lessons learned.