Name: Prof. Ernesto Damiani

Institute: Center on Cyber-Physical Systems, Khalifa University, Abu Dhabi, UAE; Dipartimento di Informatica "Giovanni degli Antoni", Università degli Studi di Milano, Milano, Italy

Title: Modeling Artificial Intelligence Big Data Pipelines

In the era of the Internet of Things, huge volumes of high-dimensional data points are made available to applications at an unprecedented velocity. Computing Artificial Intelligence analytics on such data requires: (i) improving data quality (interpolation, sparsity reduction) to make the data suitable for feeding Machine Learning (ML) predictors and classifiers; (ii) efficiently training and tuning ML predictors and classifiers; and (iii) operating ML predictors and classifiers in production. In this talk, we discuss how conceptual models can be used to represent the key features of such computations, including data ingestion, center-periphery distribution and parallelization, dependencies, and potential interferences. Based on such conceptual models, we discuss a methodology and a toolkit for the automatic synthesis and deployment of Artificial Intelligence analytics.
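The three stages listed above can be sketched as a toy pipeline. This is only an illustrative single-machine sketch with made-up function names (it is not the toolkit presented in the talk), using linear interpolation for stage (i) and a trivial threshold classifier as a stand-in for a real ML model:

```python
# Illustrative sketch of the three pipeline stages: (i) data quality,
# (ii) training/tuning, (iii) serving in production. All names are
# hypothetical, not part of any toolkit discussed in the talk.

def improve_quality(series):
    """Stage (i): fill missing points (None) by linear interpolation."""
    out = list(series)
    for i, v in enumerate(out):
        if v is None:
            prev = next(x for x in reversed(out[:i]) if x is not None)
            nxt = next(x for x in out[i + 1:] if x is not None)
            out[i] = (prev + nxt) / 2
    return out

def train(points, labels):
    """Stage (ii): fit a trivial threshold classifier (a stand-in
    for training and tuning a real ML predictor)."""
    pos = [p for p, y in zip(points, labels) if y == 1]
    neg = [p for p, y in zip(points, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def serve(threshold, point):
    """Stage (iii): score a new point in production."""
    return 1 if point >= threshold else 0

raw = [1.0, None, 3.0, 10.0, None, 12.0]   # sparse IoT readings
clean = improve_quality(raw)               # [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
model = train(clean, [0, 0, 0, 1, 1, 1])
print(serve(model, 9.5))                   # -> 1
```

In a real deployment each stage would be a separately distributed component (e.g., ingestion at the periphery, training at the center), which is precisely the kind of structure the conceptual models in the talk aim to capture.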
Detailed introduction:
Ernesto Damiani is a Professor of Computer Science at the University of Milan, where he leads the SEcure Service-oriented Architectures Research (SESAR) Lab. He is the Founding Director of the Center on Cyber-Physical Systems at Khalifa University in the UAE. He received an honorary doctorate from the Institut National des Sciences Appliquées de Lyon, France (2017) for his contributions to research and teaching on Big Data analytics. He is the Principal Investigator of the H2020 TOREADOR project on Big Data as a Service. His research spans cyber-security, Big Data, and cloud/edge processing, areas in which he has published over 600 peer-reviewed articles and books. He is a Distinguished Scientist of the ACM and a recipient of the 2017 Stephen Yau Award.

Name: Anqun Pan

Institute: Tencent

Title: Conceptual Modeling on Tencent's Distributed Database Systems

Tencent is the largest Internet service provider in China. Typical services include WeChat, games, payment, and cloud storage and computing. Tencent serves billions of users and millions of enterprises, and some services, such as WeChat, must be able to handle more than 200,000 transactions per second at peak time. To do this, Tencent has built an elastic and scalable database service system, namely TDSQL, which can efficiently support its ever-growing service requests. TDSQL is deployed and runs on more than ten thousand compute nodes. In this talk, we present the main challenges we have encountered and describe our practice of conceptual modeling on TDSQL. First, failures of compute nodes occur frequently in a large-scale X86-based distributed system architecture. To address this issue, we introduce a fault tolerance model to guarantee the high availability of the services. Second, Tencent serves a huge number of requests, and different types of requests require different storage and compute resources. To improve resource utilization, we propose a resource scheduling model that enables TDSQL to serve requests elastically. Third, TDSQL provides hybrid data modeling to support various data models, and offers DBaaS services that serve 10,000+ DB instances. Finally, we present how to rapidly develop applications through conceptual modeling on top of TDSQL.
Detailed introduction:
Anqun Pan is currently an escalation engineer and the technical director of TEG, Tencent. Tencent is the largest Internet service provider in China, covering social networking, finance, entertainment, cloud computing, instant messaging, tools, AI, etc. Pan received his bachelor's and master's degrees from HUST in 2004 and 2007, respectively. He has 10+ years of experience in distributed system development and leads the team developing Tencent's distributed DBMS, namely TDSQL. TDSQL serves many businesses and users inside and outside Tencent. The number of digital financial accounts guaranteed by TDSQL is over 28 billion, and the daily trading volume processed by TDSQL exceeds 10 billion.

Name: Kyu-Young Whang

Institute: School of Computing, KAIST

Title: Recent Trends of Big Data Platforms and Applications

Big data refers to large amounts of data beyond the acceptable limits of commonly used data collection, storage, management, and analysis software. Big data has become a new trend and culture in academia and industry since the beginning of this decade. The importance of big data technology is widely recognized and keeps growing with recent technological developments. In particular, popular social media services, as well as devices connected via the Internet of Things, are accelerating the generation of big data. Cloud services, in turn, improve the accessibility of such big data by allowing us to access it from anywhere. Furthermore, computing power has improved rapidly with the introduction of new CPU and GPU hardware technologies. On the basis of these environmental changes, MapReduce and Hadoop have significantly contributed to making big data processing prevalent these days. Hadoop, an open-source implementation of MapReduce, enables us to achieve high-performance computing with only commodity machines, without requiring expensive mainframe computers.

This keynote consists of two parts. The first part introduces the recent trends of big data platforms originating from Hadoop. The second part addresses a few interesting big data applications enabled by such platforms.

In the first part, I would like to present the concept of the MapReduce paradigm and its significance in the history of big data processing. Then, I will discuss the advantages as well as the limitations of Hadoop and systematically review research efforts to overcome those limitations in three categories: support for iterative processing, stream processing, and the SQL language. As a solution for the third category, NewSQL systems, such as Google's Spanner and F1, have emerged as a new paradigm. I will also elaborate on ODYS, a massively parallel search engine developed at KAIST, as an example of a NewSQL system.

In the second part, I will address the effort of combining AI with big data, since big data technology serves as an enabler of artificial intelligence (AI). For example, IBM Watson learned from 200 million pages, including Wikipedia and news articles. With this rich knowledge base, IBM Watson surprisingly beat human quiz-show champions. IBM is expanding its application to medical science, where Watson for Oncology collaborates with human doctors to diagnose cancers. As another example, smartphone vendors are developing intelligent personal assistants, such as Google Now, Apple Siri, and Amazon Alexa. These assistants benefit from big data because they get smarter by learning from huge amounts of user feedback and queries. I will overview some of these data-driven services.

In summary, the keynote will address the characteristics of big data, recent trends of big data platforms, and emerging applications for big data intelligence.
Detailed introduction:
Professor Kyu-Young Whang graduated (Summa Cum Laude) from Seoul National University in 1973 and received M.S. degrees from the Korea Advanced Institute of Science and Technology (KAIST) in 1975 and from Stanford University in 1982. He earned his Ph.D. from Stanford University in 1984. From 1983 to 1991, he was a Research Staff Member at the IBM T. J. Watson Research Center, Yorktown Heights. In 1990 he joined KAIST, where he currently is a KAIST Distinguished Professor and Emeritus Professor at the School of Computing (formerly the Computer Science Department).

His research interests encompass database systems/storage systems, object-oriented databases, geographic information systems (GIS), data mining, XML databases, search engines, and big data management platforms and analytics.

Dr. Whang, an ACM Fellow and an IEEE Fellow, served as Editor-in-Chief of the VLDB Journal from 2003 to 2009 and as General Chair of VLDB 2006. He served as a Trustee of the VLDB Endowment from 1998 to 2004 and from 2010 to 2016. He also served the data engineering community as Chair of the IEEE Technical Committee on Data Engineering from 2013 to 2015.

He is the recipient of numerous awards, including the Best Paper Award at the 6th IEEE ICDE in 1990, the Best Demonstration Award at the 21st IEEE ICDE in 2005, the prestigious Korea (presidential) Engineering Award in 2012, the Korea (presidential) Supreme Scientist/Engineer Award in 2017, and the ACM SIGMOD Contributions Award in 2014. He is a member of The National Academy of Sciences, Republic of Korea.