
SWEET 2012 - 1st international workshop on Scalable Workflow Enactment Engines and Technologies (SWEET'12)

Date: 2012-05-20

Deadline: 2012-02-19

Venue: Scottsdale, Arizona, USA

Website: https://sites.google.com/site/sweetworkshop2012

Topics/Call for Papers

The 1st international workshop on
Scalable Workflow Enactment Engines and Technologies (SWEET'12)

Held in conjunction with the 2012 SIGMOD conference in Scottsdale, Arizona, USA, on May 20-25, 2012.

Motivation

One of the goals of computer system engineering has always been to develop systems that are easy to use and understand, yet put great computational power at the fingertips of end users. The cloud computing model has the potential to make this a realistic goal in the area of business and scientific data processing, by enabling simple access to large pools of data storage and computational resources. More specifically, we observe that cloud computing is facilitating the convergence of workflow-based processing with traditional data management, thereby providing users with the best of both worlds.

Workflows are used extensively, both for business applications and in computational science. Common to the broad range of workflow systems currently in use are their relatively simple programming models, which are usually exposed through a visual programming style and are backed by a well-defined model of computation. While the flexibility of workflows for rapid prototyping of science pipelines makes them appealing to computational scientists, recent applications of workflow technology to data-intensive science show the need for a robust underlying data management infrastructure. At the same time, on the data management side of science and business, workflow-like models and languages are beginning to emerge, making it possible for users who are close to the data domain but have no application development resources to assemble complex data processing pipelines.

Workshop Focus and Goals

The goal of the workshop is to bring together researchers and practitioners to explore the potential of cloud-based computing in facilitating the convergence between workflows and large-scale data processing. Concretely, the workshop is expected to provide insight into:
- performance issues: efficient data processing using cloud-based workflows,
- modelling issues: best practices in data-intensive workflow modelling and enactment,
- support technology issues: how the potential synergy between large-scale data processing and workflow technology can be exploited in a principled way.
Indeed, while it appears that workflow technology is well positioned to benefit from the scalability of computing resources offered by a cloud infrastructure, at present only a few examples of cloud-based workflow systems exist (Pegasus, eScience Central), along with experimental prototypes that show how MapReduce implementations can be exposed as workflow patterns. Conversely, the database-to-workflow trajectory would benefit from existing formal workflow models, as well as from the componentization of data processing functions, the visual programming paradigm, and the formal validation and analysis tools that are currently available in the workflow space. However, some hybrid database-cloud solutions have already been developed, such as HadoopDB, an architectural hybrid of MapReduce and DBMS technologies for analytical workloads. To further illustrate this convergence, below are some recent developments that workshop contributors may want to be aware of.
In business workflow management systems, progress has been made on developing expressive graphical languages to represent complex data-intensive workflows, e.g., in BPMN and YAWL. At the same time, the scalability of the enactment engines for large datasets continues to be an issue for some of the most popular scientific workflow systems, including Taverna, Kepler, and Galaxy.
In the area of data integration, a number of data mashup frameworks and toolkits are available for ad hoc, often temporary, integration across multiple heterogeneous data sources. These include Yahoo! Pipes and the W3C's XProc for pipelines of XML transformations, amongst others. The problem of optimizing such transformations is closely related to that of optimizing database queries that contain complex user-defined functions.
In the area of cloud computing and data management, new data storage and data transformation techniques have been developed to store data and execute complex computational tasks over it in a distributed and scalable fashion, e.g., the Google File System and the MapReduce framework. This also includes the development of related, easy-to-use data processing languages such as Yahoo's Pig Latin.
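As a simplified illustration of this convergence, the Python sketch below shows how a single MapReduce round can be exposed as an ordinary step in a linear data processing pipeline, in the spirit of the workflow-pattern prototypes and dataflow languages such as Pig Latin mentioned above. It is a minimal, in-memory sketch only; the Task and Pipeline classes and the map_reduce helper are hypothetical names introduced here for illustration and do not correspond to the API of any system cited in this call.

    # Illustrative sketch only: a toy, in-memory MapReduce step wrapped as a
    # composable "workflow task". Not the API of any particular workflow engine.
    from collections import defaultdict
    from typing import Any, Callable, Iterable, List, Tuple

    def map_reduce(records: Iterable[Any],
                   mapper: Callable[[Any], Iterable[Tuple[Any, Any]]],
                   reducer: Callable[[Any, List[Any]], Any]) -> dict:
        """Run a single, sequential MapReduce round over in-memory records."""
        groups = defaultdict(list)
        for record in records:                 # map phase: emit (key, value) pairs
            for key, value in mapper(record):
                groups[key].append(value)
        # reduce phase: fold each key's values into one result
        return {key: reducer(key, values) for key, values in groups.items()}

    class Task:
        """A named processing step that a workflow engine could schedule."""
        def __init__(self, name: str, func: Callable[[Any], Any]):
            self.name = name
            self.func = func
        def run(self, data: Any) -> Any:
            return self.func(data)

    class Pipeline:
        """A linear workflow: each task consumes the previous task's output."""
        def __init__(self, tasks: List[Task]):
            self.tasks = tasks
        def run(self, data: Any) -> Any:
            for task in self.tasks:
                data = task.run(data)
            return data

    # Example: word counting expressed as two workflow tasks, where the second
    # task is a MapReduce round exposed as an ordinary pipeline step.
    tokenize = Task("tokenize", lambda docs: [line.lower().split() for line in docs])
    count = Task("count", lambda token_lists: map_reduce(
        token_lists,
        mapper=lambda tokens: [(t, 1) for t in tokens],
        reducer=lambda key, values: sum(values)))

    if __name__ == "__main__":
        docs = ["workflows in the cloud", "data in the cloud"]
        print(Pipeline([tokenize, count]).run(docs))
        # {'workflows': 1, 'in': 2, 'the': 2, 'cloud': 2, 'data': 1}

In a cloud setting, the same pattern would delegate the map and reduce phases to a distributed runtime rather than running them in memory; the point of the sketch is only that such a step can be packaged as one component of a larger workflow.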

Topics

The topics of the workshop include, but are not limited to:

Architectures:
- cloud-based, scalable workflow enactment architectures,
- efficient data storage for data-intensive workflows,
- optimizing execution of data-intensive workflows,
- workflow scheduling in cloud computing.

Models, Languages:
- languages for data-intensive workflows, data processing pipelines and data mashups,
- verification and validation of data-intensive workflows,
- programming models for cloud computing,
- access control and authorisation models, privacy, security, risk and trust issues,
- workflow patterns for data-intensive workflows.

Applications of cloud-based workflow:
- bioinformatics,
- data mashups,
- semantic web data management,
- big data analytics.
