Accurate diagnosis of cross-browser compatibility issues via machine learning

Nataliia Semenenko
Because Web technologies evolve rapidly and Web standards cannot unify every new development, Web developers face the challenge of ensuring that their applications are rendered correctly across a broad range of browsers and platforms. While adherence to Web standards may reduce the chances of Web documents being rendered inconsistently across browsers, in practice cross-browser compatibility issues are recurrent and range from minor layout bugs to critical functional failures, such as a button being invisible in a given browser-platform combination. To detect cross-browser incompatibilities, developers often resort to visually checking that each document produced by their application is rendered consistently across all relevant browser-platform combinations. This manual testing approach is time-consuming and error-prone. Existing cross-browser compatibility testing tools speed up this process by automating the rendering of a Web document in multiple browsers and platforms, and by applying either image analysis or Document Object Model (DOM) analysis to highlight potential cross-browser incompatibilities. However, existing tools in this space are over-sensitive: they tend to classify even insignificant differences as potential incompatibilities and therefore produce a large number of false positives. Reducing the number of false positives is challenging, since the criteria for classifying a difference as an incompatibility are to some extent subjective. This Master's thesis presents a machine learning approach to improve the accuracy of two techniques for cross-browser compatibility testing: one based on image analysis (Browserbite) and one based on DOM analysis (Mogotest).
To this end, we selected over 140 Web pages, each rendered in 10 to 14 browser-platform combinations, and built statistical classifiers to differentiate between true incompatibilities and false alarms. Two classification algorithms were used, namely classification trees and neural networks. An extensive experimental evaluation shows that neural networks produce highly accurate classifiers when post-processing the outputs of both the image-based and the DOM-based techniques. An attempt to combine image-based and DOM-based analysis is also reported.
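
As a rough illustration of the post-processing idea described above, the sketch below trains a one-level classification tree (a decision stump) on differences that have been labelled as true incompatibilities or false alarms, and then uses it to filter newly reported differences. It is not the thesis's implementation: the two features (relative area of the differing region and pixel mismatch ratio), all data values, and the names `Stump`, `train_stump`, and `TRAINING` are hypothetical, and the thesis itself uses full classification trees and neural networks.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One labelled difference: (feature vector, label).
# Label 1 = true incompatibility, label 0 = false alarm.
# ASSUMPTION: the features are hypothetical, chosen only for illustration:
#   [relative area of the differing region, pixel mismatch ratio]
Sample = Tuple[List[float], int]


@dataclass
class Stump:
    """A classification tree with a single split."""
    feature: int
    threshold: float
    left_label: int   # predicted when value <= threshold
    right_label: int  # predicted when value > threshold

    def predict(self, x: List[float]) -> int:
        if x[self.feature] <= self.threshold:
            return self.left_label
        return self.right_label


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels: List[int]) -> int:
    """Majority label; ties are resolved as 'incompatibility' (1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def train_stump(samples: List[Sample]) -> Stump:
    """Pick the single split that minimises the weighted Gini impurity."""
    best = None
    n = len(samples)
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for x, y in samples if x[f] <= t]
            right = [y for x, y in samples if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    _, f, t, left_label, right_label = best
    return Stump(f, t, left_label, right_label)


# Hypothetical hand-labelled differences (not data from the thesis).
TRAINING: List[Sample] = [
    ([0.01, 0.05], 0),
    ([0.02, 0.60], 0),
    ([0.03, 0.10], 0),
    ([0.20, 0.70], 1),
    ([0.35, 0.40], 1),
    ([0.50, 0.90], 1),
]

stump = train_stump(TRAINING)

# Differences reported by a testing tool; keep only the predicted true issues.
reported = [[0.02, 0.80], [0.40, 0.10]]
kept = [d for d in reported if stump.predict(d) == 1]
```

On this toy data the stump splits on the first feature, so a small differing region is discarded as a false alarm even when its pixel mismatch ratio is high. A real classifier would need many more features and labelled examples.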
Graduation Thesis language
Graduation Thesis type
Master - Software Engineering
Marlon Dumas, Tõnis Saar
Defence year