HPC, Big Data and Data Science devroom
Room:
AW1.126
Calendar:
iCal, xCal
High Performance Computing (HPC) and Big Data are two important approaches to scientific computing. HPC typically deals with smaller, highly structured data sets and huge amounts of computation while Big Data, not surprisingly, deals with gigantic, unstructured data sets and focuses on the I/O bottlenecks. With the Big Data trend unlocking access to an unprecedented amount of data, Data Science has emerged to tackle the problem of creating processes and approaches to extracting knowledge or insights from these data sets. Machine learning and predictive analytics algorithms have joined the family of more traditional HPC algorithms and are pushing the requirements of cluster and data scalability.
Free and Open Source communities have been the foundation of the HPC and Big Data communities for some time. In the HPC community, it should be no surprise that 488 of the Top500 supercomputers in the world run Linux. On the Big Data side, the Hadoop ecosystem has had a tremendous amount of Open Source contributions from a wide range of organizations coming together under the Apache Software Foundation.
Our goal is to bring the communities together, share expertise, learn how we can benefit from each other's work and foster further joint research and collaboration. We welcome talks about Free and Open Source solutions to the challenges presented by large scale computing, data management and data analysis.