Georg Dotzler

Learning Code Transformations from Repositories

Library updates, program errors, and maintenance tasks in general force developers to apply the same code change to different locations within their projects. If the locations differ considerably from one another, identifying all of them is very time-consuming. Even with sufficient time, there is no guarantee that a manual search reveals every location. If the change is critical, each missed location can have severe consequences. Applying the code change manually to each location can also become tedious. For larger changes, developers have to execute several transformation steps at each code location. In the worst case, they forget a required step and thus introduce new errors into their projects.

To support developers in this task, this thesis presents the recommendation system ARES. It produces more accurate recommendations than previous approaches. ARES achieves this with a pattern design that preserves the variations in the training examples in more detail, and with improved handling of code movements. With the tool C3, this thesis also presents an extension to ARES that extracts training examples from code repositories. Together, the two tools form a recommendation system that automatically learns code recommendation patterns from repositories.

ARES, C3, and similar tools rely on lists of edit operations to express code changes. However, creating compact (i.e., short) lists of edit operations from data in repositories is difficult. Because previous approaches produce lists that are too long for ARES and C3, this thesis presents a novel tree differencing approach called MTDIFF. The evaluation shows that MTDIFF produces shorter edit operation lists than other state-of-the-art approaches.
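
To make the notion of an edit operation list more concrete, the following minimal Python sketch applies such lists to a toy syntax tree. It is not the MTDIFF algorithm and not the representation used by ARES or C3. The `Node` class, the path-based addressing, the operation names, and the example trees are all assumptions made for illustration. The sketch only shows why handling a code movement as a single move operation can give a shorter list than expressing the same change with deletions and insertions.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical syntax tree node: a label plus ordered children."""
    label: str
    children: list = field(default_factory=list)


def node_at(root, path):
    """Follow a tuple of child indices from the root to a node."""
    node = root
    for index in path:
        node = node.children[index]
    return node


def apply_script(root, script):
    """Apply a list of edit operations, in order, to a copy of the tree.

    Paths refer to the tree as it looks when the operation is applied.
    """
    tree = deepcopy(root)
    for op in script:
        kind = op[0]
        if kind == "update":
            _, path, label = op
            node_at(tree, path).label = label
        elif kind == "insert":
            _, parent_path, index, label = op
            node_at(tree, parent_path).children.insert(index, Node(label))
        elif kind == "delete":
            _, path = op
            parent = node_at(tree, path[:-1])
            if parent.children[path[-1]].children:
                raise ValueError("delete only removes leaf nodes")
            parent.children.pop(path[-1])
        elif kind == "move":
            _, path, new_parent_path, index = op
            # Resolve the new parent first, then detach and reattach the subtree.
            target = node_at(tree, new_parent_path)
            moved = node_at(tree, path[:-1]).children.pop(path[-1])
            target.children.insert(index, moved)
        else:
            raise ValueError(f"unknown edit operation: {kind}")
    return tree


# Toy change: the call to init(config) is moved into the body of an if statement.
before = Node("block", [
    Node("call:init", [Node("arg:config")]),
    Node("if:debug", [Node("call:log")]),
])
after = Node("block", [
    Node("if:debug", [
        Node("call:init", [Node("arg:config")]),
        Node("call:log"),
    ]),
])

# One move operation expresses the whole change.
with_move = [("move", (0,), (1,), 0)]

# Without a move, the subtree is deleted leaf by leaf and rebuilt node by node.
without_move = [
    ("delete", (0, 0)),
    ("delete", (0,)),
    ("insert", (0,), 0, "call:init"),
    ("insert", (0, 0), 0, "arg:config"),
]
```

Both `apply_script(before, with_move)` and `apply_script(before, without_move)` produce a tree equal to `after`, but the first list has one operation and the second has four. In this toy setting, the gap grows with the size of the moved subtree.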