
HPC 2018 - Accelerating Big Data Processing and Associated Deep Learning on Datacenters and HPC Clouds with Modern Architectures

Date: 2018-02-24

Deadline: 2017-12-31

Venue: Vienna, Austria


Website: https://web.cse.ohio-state.edu/~panda.2/...

Topics/Call for Papers

The convergence of HPC, Big Data, and Deep Learning is becoming the next game-changing business opportunity. Apache Hadoop, Spark, gRPC/TensorFlow, and Memcached are becoming standard building blocks for Big Data processing and mining. Over the last decade, modern HPC bare-metal systems and Cloud Computing platforms have been fueled by advances in multi-/many-core architectures, RDMA-enabled networking, NVRAMs, and NVMe-SSDs. However, Big Data and Deep Learning middleware (such as Hadoop, Spark, Flink, and gRPC) have not fully embraced these technologies. Recent studies have shown that the default designs of these components cannot efficiently leverage the features of modern HPC clusters, such as Remote Direct Memory Access (RDMA)-enabled high-performance interconnects, high-throughput parallel storage systems (e.g., Lustre), and Non-Volatile Memory (NVM).

In this tutorial, we will provide an in-depth overview of the architecture of Hadoop, Spark, gRPC/TensorFlow, and Memcached. We will examine the challenges in re-designing the networking and I/O components of these middleware with modern interconnects, protocols (such as InfiniBand and RoCE), and storage architectures. Using the publicly available software packages in the High-Performance Big Data project (HiBD, http://hibd.cse.ohio-state.edu), we will provide case studies of the new designs for several Hadoop/Spark/gRPC/TensorFlow/Memcached components and their associated benefits. Through these, we will also examine the interplay between high-performance interconnects, storage (HDD, NVM, and SSD), and multi-core platforms (e.g., Xeon x86, OpenPOWER) to achieve the best solutions for these components and applications on modern HPC clusters and clouds. We will also present in-depth case studies with modern Deep Learning tools (e.g., Caffe, TensorFlow, DL4J, BigDL) running over RDMA-enabled Hadoop, Spark, and gRPC.
Targeted Audience and Scope
This tutorial is targeted at people working in the areas of Big Data processing, Deep Learning, Cloud Computing, and HPC on modern datacenters and HPC Clouds with high-performance networking and storage architectures. Specific audiences this tutorial is aimed at include:
Scientists, engineers, researchers, and students engaged in designing next-generation Big Data and Deep Learning systems and applications over high-performance networking and storage architectures
Designers and developers of Big Data, Deep Learning, Cloud Computing, Hadoop, Spark, Memcached, gRPC, and TensorFlow middleware
Newcomers to the field of Big Data processing and Deep Learning on modern datacenters and HPC Clouds who are interested in familiarizing themselves with Hadoop, Spark, Memcached, gRPC, TensorFlow, RDMA, SR-IOV, Virtualization, high-performance networking and storage
Managers and administrators responsible for setting up next-generation Big Data and Deep Learning environments and modern high-end systems/facilities in their organizations/laboratories
The content level will be as follows: 30% beginner, 40% intermediate, and 30% advanced. There are no fixed prerequisites: attendees with a general knowledge of Big Data, Deep Learning, Hadoop, Spark, Memcached, gRPC, TensorFlow, high-performance computing, Cloud Computing, networking, and storage architectures will be able to understand and appreciate the material. The tutorial is designed so that attendees are exposed to the topics in a smooth and progressive manner, and it is organized as a coherent talk covering multiple topics.

Last modified: 2017-12-21 15:52:45