Schedule


2020
Tuesday, September 15th
7:40 PM

Improving the management of rapid filtration in a DWTP using k-Means clustering

Lluis Godo-Pla, LEQUIA - Universitat de Girona & ATL, Spain

7:40 PM - 8:00 PM

Rapid filtration has been used since the early days of drinking water treatment plants (DWTPs) as an effective physical barrier against pathogens and suspended solids. The operation seems simple, consisting of setting the filter run-time and backwashing conditions; however, plants that operate large numbers of filters require effective and informed maintenance planning to avoid filter failures. Several causes can affect filter performance, such as an irregular filter bed due to hydraulic impacts, ineffective backwash, the presence of fine particles or sensor failures, among others. Currently, such malfunctioning can be detected using available online data, but this task is very time consuming without an adequate data-mining algorithm. The objective of this work is to develop a flexible framework for providing a data-based diagnosis of filter performance, based on clustering analysis. For this purpose, two statistical clustering techniques were compared (k-Means and hierarchical clustering) using the clean bed headloss and a saturation index as the main features. The developed tool rates the 48 filters of a full-scale DWTP with a traffic light colour as a visual indicator of performance according to the cluster they belong to, thereby indicating where attention is needed. The presented methodology can increase process knowledge and provide a basis for decision-making and for planning maintenance tasks in large filtration systems.
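As an illustrative sketch only (not the plant's actual pipeline), the traffic-light rating described above can be reproduced with a plain Lloyd's-algorithm k-Means over the two features named in the abstract; the synthetic data, cluster count and colour assignment are assumptions:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each filter to its nearest centroid.
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids, keeping the old one if a cluster empties.
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
    return labels, centroids

# Synthetic stand-in for the 48 filters; the two columns play the role of the
# clean bed headloss and saturation index named in the abstract.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal([0.3, 0.2], 0.05, size=(16, 2)),   # well-performing filters
    rng.normal([0.6, 0.5], 0.05, size=(16, 2)),   # degrading filters
    rng.normal([1.0, 0.9], 0.05, size=(16, 2)),   # filters needing attention
])

labels, centroids = kmeans(X, k=3)

# Rank clusters by mean headloss so the colour mapping is deterministic:
# the lowest-headloss cluster is green, the highest is red.
order = np.argsort(centroids[:, 0])
colour = {cluster: c for cluster, c in zip(order, ["green", "amber", "red"])}
colours = [colour[c] for c in labels]
```

Ordering the clusters by their mean headloss before assigning colours avoids the usual k-Means pitfall that cluster indices are arbitrary between runs.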

Thursday, September 17th
8:20 AM

On using 'Emerging Interest' in Scientific Literature to inform Chemical Risk Prioritisation

Jason Whyte, ACEMS & CEBRA, University of Melbourne, Australia

8:20 AM - 8:40 AM

Modern industrial practices employ a large and diverse collection of chemicals. This can challenge regulators charged with environmental protection. Typically, insufficient data is available for risk assessments. Thus, chemicals may find widespread use until adequate evidence of adverse environmental effects prompts regulatory action. Globally, regulators have seen that such ‘reactive’ risk management has disadvantages. Recently in Australia (and elsewhere), certain unrestricted, long-used per- and polyfluoroalkyl substances (PFAS) became, relatively rapidly, subjects of concern and then of regulation. Such events motivate us to support regulators’ ‘proactive’ risk management efforts. We aim to assist regulators in anticipating the emergence of potentially risky chemicals, enabling timely action. We hypothesise that a time series of research interest mined from a scientific publication database may reveal ‘emerging interest’ in a chemical that foreshadows its progress towards regulation. We investigate this for six PFAS by determining the associated research interest in Web of Science. For each chemical, we use R code to apply queries to an application programming interface, and count annual positive results across a range of publication years. Inspection of these time series suggests two tests, each of which determines the first year in which some condition is satisfied. We propose classification rules to interpret the test outcomes, and compare the results against PFAS regulatory histories. For the regulated PFAS, we anticipate the historical progression of Australian regulatory concern. We also judge some unrestricted PFAS to be of concern, and this is validated by interest from other jurisdictions. These results demonstrate our system’s predictive ability, and encourage further development.
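The "first year in which some condition is satisfied" idea can be sketched as follows (the paper uses R against the Web of Science API; the counts and the threshold condition here are purely illustrative, not the paper's actual tests):

```python
# Illustrative annual hit counts for a hypothetical chemical query.
counts = {2000: 1, 2001: 0, 2002: 2, 2003: 5, 2004: 9, 2005: 14, 2006: 25}

def first_year(counts, condition):
    """Return the first publication year whose annual count satisfies
    `condition`, or None if the condition is never met."""
    for year in sorted(counts):
        if condition(counts[year]):
            return year
    return None

# Example test: first year with at least 10 publications (threshold assumed).
emergence = first_year(counts, lambda n: n >= 10)
```

A classification rule can then compare the returned year against the chemical's regulatory history, e.g. flagging chemicals whose emergence year precedes any restriction.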

1:00 PM

General Sentiment Decomposition: Climate Change topic & Twitter.com users

Maurizio Romano, University of Cagliari, Italy

1:00 PM - 1:20 PM

The political and marketing importance of Word of Mouth is growing day by day. This phenomenon can be observed directly in everyday life, e.g. in the rise of influencers and social media managers. If more people talk about a specific product, then more people are encouraged to buy it, and vice versa. This effect is amplified in proportion to how highly the potential customer regards the reviewer, or how close their relationship is. Furthermore, considering the negative reporting bias, it is easy to understand why customer satisfaction is of absolute interest to a company (or a politician). After analysing the impact of Word of Mouth on earnings and the related psychological aspects, we propose an algorithm to extract the sentiment from a natural-language text corpus. Combining Neural Networks, which offer high predictive power at the cost of harder interpretation, with more straightforward and informative models allows us not only to predict how positive (or negative) a sentence is, but also to quantify the sentiment with a numeric value. Assessing an objective quantity improves the interpretation of the results in many fields. For example, it is possible to identify specific critical sectors that require intervention to improve the offered services, to find the strengths of the company (useful for advertising campaigns), and, if time information is present, to analyse trends on macro/micro topics. To support further decision-making, we apply this method to Twitter data, analysing the sentiment of people who discuss environmental issues. In this way, we identify the aspects that people perceive as critical, based on the "feedback" they publish on the web, and quantify how happy (or not) they are about a particular climate change-related problem.
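To make the idea of a numeric sentiment score concrete, here is a deliberately minimal lexicon-averaging sketch; the toy word-polarity table is an assumption and bears no relation to the authors' Neural Network-based model:

```python
# Toy polarity lexicon (illustrative values, not from the paper).
POLARITY = {"good": 0.8, "great": 1.0, "bad": -0.8, "terrible": -1.0, "ok": 0.2}

def sentiment_score(text):
    """Average the polarity of known words to get a numeric sentiment
    in [-1, 1]; unknown-only texts score a neutral 0.0."""
    words = text.lower().split()
    scores = [POLARITY[w] for w in words if w in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0
```

Even this crude score illustrates the abstract's point: a single number per tweet can be aggregated over time or by topic to surface which climate-related issues draw the most negative feedback.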

2:00 PM

Integrating Human Knowledge and Data Mining through interactive Classification and Regression Trees (CART)

Georgios Sarailidis, Department of Civil Engineering, University of Bristol, United Kingdom

2:00 PM - 2:20 PM

Many applications in environmental sciences require learning from data. Machine Learning (ML) methods are considered an efficient tool for that purpose thanks to their ability to identify patterns and structure in large datasets automatically. Fully automatic ML techniques, like Classification and Regression Trees (CART), have been developed and used in numerous applications in environmental modelling. However, they require large amounts of training data, which are often lacking in environmental applications. Moreover, they lack transparency, as they do not explain how (or why) they arrived at a particular decision tree. Furthermore, they rely mainly on statistical methods and performance metrics (i.e. fit to data) to extract information and patterns from data. While performance is important in many applications, in environmental applications it may not be the only objective the modeller wants to achieve: the interpretability, or explanatory power, of the resulting decision trees is also important. On the other hand, fully manual approaches to building a decision tree integrate expert domain knowledge, but they can be tedious and time consuming. In this project, we try to bridge this gap. We combine the strengths of the two approaches and integrate expert domain knowledge with ML strategies (CART) by putting humans in the loop. We propose an approach where users can interact with the automatic tree-building algorithm (e.g. by selecting branches to prune or changing features or thresholds in the nodes), either in an ex-post mode or at certain iterations of the algorithm. Thus, users can explore decision trees which may be less accurate but have more explanatory power. Our interactive approach may also prove useful in cases where data are scarce and/or noisy and the expert's knowledge can guide the algorithm towards a satisfying decision tree.
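The human-in-the-loop idea, letting an expert override a node's split threshold, can be sketched for a single CART node; this is a minimal illustration under assumed Gini-impurity splitting, not the authors' tool:

```python
import numpy as np

def gini(y):
    """Gini impurity of a 1-D label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Threshold on feature x that minimises the weighted Gini impurity."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

def split_node(x, y, user_threshold=None):
    """The human-in-the-loop hook: an expert-chosen threshold, if given,
    overrides the automatically selected one."""
    t = user_threshold if user_threshold is not None else best_split(x, y)
    return t, y[x <= t], y[x > t]
```

Exposing `user_threshold` at every node (and an analogous hook for pruning a branch back to a leaf) is what turns an automatic CART run into an interactive one: the expert can trade a little fit-to-data for a tree whose thresholds match physically meaningful values.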

2:40 PM

The Data Science Added Value for Intelligent Decision Support: Application to Greenhouse Gas Emissions Reduction

Miquel Sanchez-Marra, Intelligent Data Science and Artificial Intelligence Research Center (IDEAI), Dept. of Computer Science, Univ. Politècnica de Catalunya, Spain

2:40 PM - 3:00 PM

Nations that ratified the 2015 Paris Agreement pledged to limit the global average temperature increase to well below 2°C compared with preindustrial levels. To adhere to the Paris Agreement, aggressive emission regulation, mitigation and reduction policies must be proposed and adopted. There is therefore a need to impose emission reduction techniques on new industrial facilities in all industrialised countries, and to modify current industrial facilities so that they emit less. Since the USA is one of the world's largest Greenhouse Gas (GHG) polluters, both in annual and in overall cumulative emissions, it was selected as a case study for the development of an Intelligent Decision Support System (IDSS) that recommends suitable action plans for industrial facilities to control and reduce greenhouse gas emissions, with the aim of mitigating climate change. The use of Data Science (data-driven models) provided very useful knowledge patterns to support the recommendation process. Furthermore, related literature, legislation and expert knowledge were integrated into the IDSS to recommend the most suitable action plans for mitigation and reduction of GHG emissions for a given industrial facility in the USA. Real GHG data for 2017, collected from the Greenhouse Gas Reporting Program (GHGRP) of the United States Environmental Protection Agency (USEPA), were used. The facilities ranged over waste-activity, food industry, manufacturing, petroleum refining, natural gas, metal, mineral, chemical, power generation and HVAC facilities. The deployed IDSS prototype was assessed satisfactorily by experts on sustainability issues belonging to the ISST. The action plans for each kind of facility were tested and pointed to good available technologies and aggressive emission control measures, aligned with the terms of the Paris Agreement.