Authors: Ofer Biran, Oshrit Feder, Yosef Moatti, Athanasios Kiourtis, Dimosthenis Kyriazis, George Manias, Argyro Mavrogiorgou, Nikitas M. Sgouros, Martim T. Barata, Isabella Oldani, María A. Sanguino, Pavlos Kranas, Samuele Baroni, Miquel Mila Prat, Sergio Salmer
Solving practical policy problems demands data of various types. Policy makers therefore have to manage many kinds of data sources, together with the scientific methods and methodologies needed to clean, filter, analyze, validate, and possibly augment datasets as they are ingested. Such processing is mandatory if the data is to provide value.
PolicyCLOUD is an ongoing EU-funded research project that delivers an innovative data-centric approach to policy practice. This goal is being reached through a cloud-based ecosystem that supports data-driven policy management efficiently and in a manner that is both legally and ethically sound [2]. The ecosystem is a unique, integrated cloud-based environment aimed at easy and efficient ingestion and use of data for policy creation, monitoring, and assessment.
In our research article, which is open access in Data & Policy [5], we describe the types of data sources used by the ecosystem, some of the built-in analytic capabilities of this environment, and the initial uses of PolicyCLOUD for solving real policy problems.
PolicyCLOUD offers data scientists a toolbox for ingesting and preparing datasets for policy analysis. Specifically, PolicyCLOUD offers efficient ways to:
- register datasets and analytic functions;
- apply an on-the-fly pipeline of analytic functions to datasets upon ingestion, either to transform the data (e.g., removing irrelevant information) or to extract initial insights (e.g., enriching the dataset with sentiment analysis); a minimal sketch of such a pipeline follows this list;
- apply analytic functions to datasets after ingestion to extract and/or visualize information from the data stored within the PolicyCLOUD datastore.
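To make the pipeline idea concrete, here is a minimal Python sketch of how an on-the-fly ingestion pipeline might look: a cleaning step that drops irrelevant fields, followed by an enrichment step that attaches a sentiment label. All function and field names are hypothetical illustrations, not part of PolicyCLOUD's actual API, and the sentiment logic is a deliberate placeholder.

```python
# Hypothetical sketch of an on-the-fly ingestion pipeline: each analytic
# function takes a record and returns a (possibly transformed) record,
# or None to reject it. Names are illustrative, not PolicyCLOUD's API.
from typing import Callable, Optional

Record = dict
AnalyticFn = Callable[[Record], Optional[Record]]

def drop_irrelevant_fields(record: Record) -> Optional[Record]:
    """Transformation step: keep only the fields the policy analysis needs."""
    keep = {"id", "text", "timestamp"}
    return {k: v for k, v in record.items() if k in keep}

def add_sentiment(record: Record) -> Optional[Record]:
    """Enrichment step: attach a crude sentiment label (placeholder logic)."""
    text = record.get("text", "").lower()
    negative = any(word in text for word in ("angry", "bad", "unsafe"))
    record["sentiment"] = "negative" if negative else "neutral_or_positive"
    return record

def ingest(records: list[Record], pipeline: list[AnalyticFn]) -> list[Record]:
    """Apply each pipeline step to a record as it is ingested; skip records
    that any step rejects by returning None."""
    stored = []
    for record in records:
        for step in pipeline:
            record = step(record)
            if record is None:
                break
        if record is not None:
            stored.append(record)
    return stored

if __name__ == "__main__":
    raw = [{"id": 1, "text": "The new bike lanes feel unsafe",
            "timestamp": "2022-05-01", "debug": "irrelevant"}]
    print(ingest(raw, [drop_irrelevant_fields, add_sentiment]))
```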
PolicyCLOUD also has a set of built-in reusable models and tools that can be used when building ingestion pipelines or when applying analytic transformations on already ingested data. These include tools for situational knowledge acquisition, opinion mining, and sentiment and trend analysis, complemented with tools for data aggregation, linking, cleansing, and interoperability.
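As a similarly hedged illustration of applying an analytic to already ingested data, the sketch below aggregates the sentiment labels added at ingestion time into a simple per-day trend. The names and logic are hypothetical stand-ins for PolicyCLOUD's built-in trend-analysis tools, not their actual implementation.

```python
# Hypothetical post-ingestion analytic: count sentiment labels per day over
# records stored by the ingestion pipeline, as a stand-in for trend analysis.
from collections import Counter, defaultdict

def sentiment_trend(stored_records):
    """Return {timestamp: Counter(sentiment -> count)} for the stored records."""
    per_day = defaultdict(Counter)
    for record in stored_records:
        per_day[record["timestamp"]][record["sentiment"]] += 1
    return dict(per_day)
```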