Continuous validation for data analytics systems

Authors

Mark Staples, Liming Zhu and John Grundy

NICTA

UNSW

Swinburne University of Technology

Abstract

(From a future history of 2025.) Continuous development is common for build/test (continuous integration) and operations (devOps). This trend continues through the lifecycle, into what we call 'devUsage': continuous usage validation. In addition to ensuring systems meet user needs, organisations must continuously validate their legal and ethical use. The rise of end-user programming and multi-sided platforms exacerbates validation challenges. A separate trend is the specialisation of software engineering for technical domains, including data analytics. This domain has specific validation challenges. We must validate the accuracy of statistical models, but also whether they have illegal or unethical biases. Usage needs addressed by machine learning are sometimes not specifiable in the traditional sense, and statistical models are often 'black boxes'. We describe future research to investigate solutions to these devUsage challenges for data analytics systems. We plan to adapt risk management and governance frameworks previously used for software product qualities, use social network communities for input from aligned stakeholder groups, and perform cross-validation using autonomic experimentation, cyber-physical data streams, and online discursive feedback.

BibTeX Entry

  @inproceedings{Staples_ZG_16,
    author           = {Staples, Mark and Zhu, Liming and Grundy, John},
    month            = may,
    year             = {2016},
    title            = {Continuous Validation for Data Analytics Systems},
    booktitle        = {International Conference on Software Engineering},
    address          = {Austin, USA}
  }

