Submission date: Wednesday, 31 August 2022

WP:

This document is the third and final architecture deliverable for WP4. In this summary we first describe the major additions made during the past year, that is, since D4.3, and then give a brief overview of the overall achievements as of the end of month 32.

The major changes made in the past year are as follows. A new subsection 2.2.5, “Integration of External Frameworks”, was added. It describes the generic integration of external frameworks and how we applied it specifically to the Politika framework.

In subsection 2.3, “Security and Legal concerns”, substantial new information concerning the legal/ethical framework was added. This includes the legal questions that are presented to those registering datasets or analytic functions with the DAA, and that guide the registrant in filling in all the required legal details.

Section 4.1 (SKA analysis) reports incremental improvements, with the exception of situational knowledge analysis based on time series forecasting (SKA-TSF), a new component that has been fully defined and is, at the closure of this document, under development.

Section 4.2 covers the contextual information opinion and sentiment components: it reports on the status of the aspect-based sentiment analysis and introduces a new component for trend analysis.

The description of the Social Dynamics component was extended in multiple directions: section 4.3.5 presents the GUIs that have been developed for defining a policy model during design, section 4.3.6 briefly covers the integration of this component with the PolicyCLOUD platform, and finally sections 4.3.7 and 4.3.8 describe the simulation and policy models for the Agri-Food and the Radicalization use cases, respectively.

The following is a summary of the overall achievements of the WP4 tasks:

The Data Acquisition and Analytics layer (DAA) was architected and implemented successfully, as shown by its use in all the Use Case scenarios. Furthermore, the DAA was the natural basis for the integration of external frameworks, as detailed in Section 2.2.5.

Task T4.2 comprises two major technologies: Data Cleaning and Enhanced Interoperability. Since they are interlinked, we present their major achievements as a whole:

First of all, the design and implementation of all the sub-mechanisms, including the genericity of these components, were completed. The corresponding final software prototypes were released.

The integration with all components and tools of the PolicyCLOUD Data Ingestion Pipeline was also successfully completed.

Concerning Enhanced Data Interoperability, integration and collaboration with T4.4 “Opinion Mining & Sentiment Analysis” was achieved, providing an enhanced Entity-Level Sentiment Analysis (ELSA) mechanism.

Concerning Data Cleaning, an extensive list of generalized cleaning rules and actions, covering multiple domains (going beyond the PolicyCLOUD Use Cases and scenarios), was elaborated and implemented.

Finally, end-to-end integration was achieved for all the Use Cases and scenarios.
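
To make the idea of generalized cleaning rules and actions more concrete, the following is a minimal sketch, not the PolicyCLOUD Data Cleaning implementation. It assumes that a rule pairs a validity check on one field with an action that either repairs the value or discards the record. The names CleaningRule, DROP and clean, as well as the example fields, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical sentinel: an action returns DROP to discard the whole record.
DROP = object()


@dataclass
class CleaningRule:
    """Hypothetical rule: a field, a validity check, and an action for violations."""
    field: str
    is_valid: Callable[[Any], bool]
    action: Callable[[Any], Any]  # returns a repaired value, or DROP


def clean(records: List[Record], rules: List[CleaningRule]) -> List[Record]:
    """Apply every rule to every record; repair or drop records that violate a rule."""
    cleaned = []
    for record in records:
        out = dict(record)  # work on a copy so the input is left untouched
        keep = True
        for rule in rules:
            value = out.get(rule.field)
            if rule.is_valid(value):
                continue
            repaired = rule.action(value)
            if repaired is DROP:
                keep = False
                break
            out[rule.field] = repaired
        if keep:
            cleaned.append(out)
    return cleaned


# Example rules (assumed for illustration, domain independent):
rules = [
    # Country codes must be trimmed upper-case strings; repair strings, drop the rest.
    CleaningRule(
        "country",
        lambda v: isinstance(v, str) and v == v.strip().upper(),
        lambda v: v.strip().upper() if isinstance(v, str) else DROP,
    ),
    # Prices must be non-negative numbers; otherwise the record is dropped.
    CleaningRule(
        "price",
        lambda v: isinstance(v, (int, float)) and v >= 0,
        lambda v: DROP,
    ),
]

records = [
    {"country": " es ", "price": 3.5},
    {"country": "FR", "price": -1},
    {"country": None, "price": 2},
]
result = clean(records, rules)
```

Keeping rules as data rather than hard-coded checks is one way to obtain the genericity mentioned above: supporting a new domain would then mean supplying a new list of rules, without changing the cleaning loop.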

Concerning Situational Knowledge Acquisition and Analysis, several analytical tools have been implemented. In general terms, they focus on i) providing exploratory data analysis based on data visualization and ii) predictive analysis based on time series forecasting. End-to-end integration was achieved for all the use cases and scenarios.

Concerning Sentiment and Opinion Analysis, two main capabilities have been provided: i) a sentiment analysis tool for extracting the feeling and opinion expressed in a text, as well as towards the entities detected in it, and ii) a trend analysis tool for conceptualization, contextualization and monitoring of given entities in social media. End-to-end integration was achieved for all the use cases and scenarios.

T4.5 Concerning Social Dynamics, we implemented this component as a stand-alone web tool referred to as Politika. Politika uses agent-based social simulation as the primary analysis tool for evaluating policy alternatives, based on a special-purpose modeling language, a concurrent simulator and a meta-simulation framework for automatically managing and evaluating simulations of policy alternatives. We then integrated Politika with the rest of the PolicyCLOUD platform through a REST API. Finally, we developed two use cases in Politika as part of the Radicalization and Agri-Food groups of policies examined in PolicyCLOUD.

T4.6 Concerning the Optimization and Reusability of Analytical Tools, we relied on the Seamless Analytical Framework, first introduced by the BigDataStack EU project. The aim of this framework is to combine the benefits of two complementary worlds in data management systems: the operational and the analytical datastores. However, the background technology had some limitations when performing complex relational algebraic operations such as the SQL “JOIN”. Within the scope of PolicyCLOUD, we designed a solution that overcomes this limitation by rewriting the source code related to the runtime execution of this operator when it involves federated datasets that are split across the two datastores.
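
The difficulty with a JOIN over federated datasets can be illustrated with a small sketch. This is not the Seamless Analytical Framework code, which works inside the datastore runtime. It is an in-memory illustration that assumes each dataset keeps part of its rows in an operational store and the rest in an analytical store. The names FederatedTable and federated_join are hypothetical. The point shown is that a correct join has to match rows across both stores on both sides. Joining each store separately would miss pairs whose rows live in different stores.

```python
from collections import defaultdict
from itertools import chain
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


class FederatedTable:
    """Hypothetical dataset whose rows are split across two stores."""

    def __init__(self, operational: List[Row], analytical: List[Row]):
        self.operational = operational  # e.g. recent rows
        self.analytical = analytical    # e.g. historical rows

    def scan(self) -> Iterator[Row]:
        # Present the two slices as one logical table.
        return chain(self.operational, self.analytical)


def federated_join(left: FederatedTable, right: FederatedTable, key: str) -> List[Row]:
    """Inner equi-join on `key` using a hash join over the unified slices."""
    # Build phase: index every right-hand row, whichever store it lives in.
    index = defaultdict(list)
    for row in right.scan():
        index[row[key]].append(row)

    # Probe phase: match every left-hand row against the index.
    joined = []
    for lrow in left.scan():
        for rrow in index.get(lrow[key], []):
            joined.append({**rrow, **lrow})  # left values win on name clashes
    return joined


policies = FederatedTable(
    operational=[{"policy_id": 2, "name": "B"}],
    analytical=[{"policy_id": 1, "name": "A"}],
)
events = FederatedTable(
    operational=[{"policy_id": 1, "event": "new"}],
    analytical=[{"policy_id": 2, "event": "old"}, {"policy_id": 3, "event": "orphan"}],
)
result = federated_join(policies, events, "policy_id")
```

In this example both matching pairs cross the store boundary: policy 1 is stored analytically while its event is operational, and the reverse holds for policy 2. A join evaluated store by store would therefore return no rows.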