Publication Details
Thomas Weber, Christina Winiker, Sven Mayer
An Investigation of How Software Developers Read Machine Learning Code. Proceedings of the 18th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM '24), Association for Computing Machinery, 2024-10-20.
Background: Machine Learning plays an ever-growing role in everyday software. This means a paradigmatic shift in how software operates, from algorithm-centered software, where the developer defines the functionality, to data-driven development, where behavior is inferred from data. Aims: The goal of our research is to determine how this paradigmatic shift materializes in the written code, whether developers are aware of these changes, and how the changes affect their behavior. Method: To this end, we performed static analysis of N software repositories to determine structural differences in the code. Following this, we conducted a user study using eye tracking to determine how developers' code reading differs between Machine Learning source code and traditional code. Results: The results show that there are structural differences in the code of this paradigmatically different software. Developers appear to adapt their mental models with growing experience, resulting in distinctly different reading patterns. Conclusions: These differences highlight that we cannot treat all code the same but require paradigm-specific, empirically validated support mechanisms to help developers write high-quality code.