Smart City Open Data Collector

Project Idea Description

There is an urgent need to improve complex systems such as mobility, energy and industrial supply chains, as expressed in the Sustainable Development Goals of the United Nations. A new generation of automated control systems will control large-scale systems safely and efficiently in an unprecedented way. Machine learning and artificial intelligence are already part of our lives and are becoming omnipresent. Therefore, the Swiss National Science Foundation has selected the proposal to fund the National Center of Competence in Research «dependable, ubiquitous automation», NCCR Automation for short.


Data is the new oil

Research in the ETH domain focuses on theoretical foundations and new methods. In large parts, this has been possible with simulated data or reference datasets. Practical applications of these methods, which require fine-tuning to domain knowledge, need field data, especially as the new techniques will have to cope with the imperfections of our world: noise, glitches, missing data, wrong measurements. EMPA has demonstrated the importance of data in small demonstrators; the large scale is missing.
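The imperfections listed above can be made concrete with a small sketch. It assumes, purely for illustration, that field readings arrive as (timestamp in seconds, power in watts) pairs at a nominal one-second interval. The function name `flag_imperfections`, the `max_watts` threshold and the record layout are hypothetical and do not describe any existing system.

```python
from typing import Dict, List, Optional, Tuple

# Assumed record layout: (unix timestamp in seconds, power in watts or None)
Reading = Tuple[int, Optional[float]]


def flag_imperfections(
    readings: List[Reading],
    max_watts: float = 50_000.0,  # hypothetical plausibility limit
    step: int = 1,                # assumed nominal sampling interval in seconds
) -> Dict[str, list]:
    """Report gaps, missing values and implausible values in a series."""
    gaps = []         # (last seen timestamp, next timestamp) around a gap
    missing = []      # timestamps whose value is None
    implausible = []  # timestamps with negative or too large values
    prev_ts = None
    # Sort by timestamp only, so that None values are never compared.
    for ts, watts in sorted(readings, key=lambda r: r[0]):
        if prev_ts is not None and ts - prev_ts > step:
            gaps.append((prev_ts, ts))
        if watts is None:
            missing.append(ts)
        # Assumes a consumption-only meter; a meter that also records
        # feed-in could legitimately report negative values.
        elif watts < 0 or watts > max_watts:
            implausible.append(ts)
        prev_ts = ts
    return {"gaps": gaps, "missing": missing, "implausible": implausible}
```

A check like this only labels suspicious samples. Whether to drop, interpolate or keep them is left to the research question at hand.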

Today, the vast majority of data collectors are large corporations (mainly from the US and China), fueling the public's anxieties that technological progress is widening the gap between rich and poor and between the digital and analogue worlds.


We need control over our data! Therefore, the European Commission has introduced a strong legal framework to empower users and help European businesses catch up. For details, see the communication from the Commission to the European Parliament and the Council titled "data protection as a pillar of citizens' empowerment and the EU's approach to the digital transition - two years of application of the General Data Protection Regulation". The need for accessible data for public and private organizations remains, though.

Without fully transparent access to the underlying datasets used to train models and build intelligent machines, algorithms will expose their biases and prejudices only when making decisions, as shown by activists and researchers alike. An open dataset can help identify systematic errors and inequalities beforehand, and help build universal policies on how to test datasets and algorithms before they are used in production systems. Paradoxically, access to data gets more difficult over time.


One area of high interest in which such datasets could be collected from volunteers is high-resolution electricity meter data.

Smart meters generate data points on energy usage every second in residential, commercial and industrial buildings. The publicly available datasets are either aggregated (over multiple buildings) or short observations that lack seasonal effects and long-term trends.

Swiss companies (e.g. Smart-Me) offer open APIs, which would make collecting such datasets possible. This initiative is not limited to the fields mentioned above and will grow when the proposal finds social acceptance, produces valuable insights and helps develop digital neutrality, fairness and equality. Data about electric car charging behaviour could be another hot topic to pursue.



Privacy protection – a central reason why public institutions have refrained from starting their own datasets – could become a focus area of research. It remains an open question how much abstraction is necessary to protect the private sphere of individuals while preserving enough detail for algorithms to learn from long-term observations. Since fines of €20 million and more for data breaches threaten commercial viability, industry is increasingly interested in technical solutions to protect data, and hesitant to share data openly with researchers or innovation partners.
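One simple form of abstraction is temporal aggregation. The sketch below averages per-second readings over a configurable window. It is meant only to illustrate the trade-off between detail and abstraction described above, not as a proposed privacy mechanism. The function name `aggregate_mean` and the (timestamp in seconds, watts) record layout are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_mean(
    readings: Iterable[Tuple[int, float]],  # assumed (timestamp in s, watts)
    window_s: int,                          # window length in seconds
) -> Dict[int, float]:
    """Average readings over fixed windows, keyed by window start time."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, watts in readings:
        start = ts - ts % window_s  # align timestamp to its window start
        sums[start] += watts
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}
```

Coarser windows hide short events such as an individual appliance being switched on, but they also remove exactly the high-frequency detail some algorithms need. Temporal aggregation on its own should not be taken as a privacy guarantee; how much of it is enough is the open question raised above.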


The data subjects who volunteer to donate their data will opt in to a contract that defines the license terms under which the data will be published. This could be, for example, a Creative Commons license of the kind often used for Wikipedia media files, or another license that would allow charging a fee for commercial use.

A limited non-profit corporation could be one legal form that protects the funding research institutions, builds trust among the volunteering data subjects (compared to a simple association), and acts independently but in the interest of the partnering research institutions.


Open Data, Open Research

This project aligns with national and international efforts to make research results freely accessible, democratizing education and ensuring equal opportunities for all members of society. Beyond pursuing such honourable goals, young researchers will find this project a chance to demonstrate data management excellence, apply best practices in information security, and distinguish themselves as creators and not just users of these datasets.

A partnership with the NCCR Automation is of mutual benefit, as the project provides the dataset required for the ongoing research and builds a platform for generations of researchers to come. The results derived from the datasets give the project a rigorous scientific foundation, as NCCR Automation members come from the most prestigious institutes of Switzerland. Data subjects volunteering to share their data, which is already commercially exploited anyway, can see how they contribute to a safer, greener, fairer digital future. Switzerland can underline its role as a leading innovator with this social enterprise, which extends the concept of neutrality into the digital realm and cyberspace.

Let's use the empowered rights of data subjects (from GDPR and others) to build an open source data collection, specifically for the smart city algorithms of the future. Data on energy consumption, mobility and EV charging is already collected rigorously, and researchers need access to this long-term, high-quality, high-frequency data. Let's use the open APIs and data access rights to collect it and preserve it for open access by researchers.