BDDC 2014 - 1st International Workshop on Big Data Discovery & Curation (BDDC)
Topics/Call for Papers
Traditionally, data warehouses have given business users a way to consolidate information from different sources for analysis and reporting. To get data ready for analysis, an ETL (extract, transform, load) process is used: it reads data from different sources, cleans it, converts the input format so that it conforms to the target database, and writes it to the target database. The big data paradigm is changing this problem because of the three V’s: volume, velocity, and variety. Thus, one needs to discover, integrate, and analyze large volumes of diverse data quickly.
In the big data paradigm, a potentially large number of data sources and data assets are considered for analytics. Finding relevant data for analytics is an important data discovery problem, and data diversity makes it difficult. The diversity can stem from the data model; the type of data (structured, semi-structured, or unstructured); enterprise data vs. open public data; the integration of social media data; etc. One also needs to handle data quality and data governance issues. For this workshop we invite papers, from various industry domains, describing techniques for identifying relevant sets of data; finding different kinds of relationships between structured, semi-structured, and unstructured data; integrating those data using various join, union, and merge techniques; validating the integrated data; and analyzing it.
Topics of interest include (but are not limited to):
Cleaning big data
Integration of big heterogeneous data
Metadata extraction
Automated rule generation
Curating scientific data
Data discovery
This workshop will be useful for big data researchers and practitioners who need to combine multiple data sources, both enterprise data and external or social media data, and present an integrated view and analytics to the user.
Other CFPs
- 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications - BigMine-14
- International Workshop on Data Mining for Brain Science
- The 3rd ACM SIGKDD International Workshop on Urban Computing
- 2nd International Workshop on Multimodal Crowd Sensing (CrowdSens 2014)
- 2014 Workshop on Outlier Detection & Description under Data Diversity
Last modified: 2014-04-26 23:05:13