Mining Stack Overflow for API class recommendation using DOC2VEC and LDA
Document Type
Article
Publication Date
10-1-2021
Abstract
To address the lexical gaps between natural language (NL) queries and Application Programming Interface (API) documentation, and between NL queries and program code, this study developed a novel approach for recommending Java API classes that are relevant to the programming tasks described in NL queries. A Doc2Vec model was trained using question titles mined from Stack Overflow. The model was used to find question titles that are semantically similar to a query. Latent Dirichlet Allocation (LDA) topic modelling was applied to the Java API classes (extracted from code snippets found in the accepted answers of these similar questions) to extract a single topic comprising the Top-10 Java API classes that are relevant to the query. Benchmarking the proposed approach against the state-of-the-art approaches RACK and NLP2API using four performance metrics shows that it is possible to produce comparable API recommendation results with a less complex approach built on basic machine learning models, in particular Doc2Vec and LDA. The approach was implemented in a Java API class recommender with an Eclipse IDE plug-in serving as the front-end.
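The retrieve-then-rank shape of the pipeline described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: bag-of-words cosine similarity stands in for the trained Doc2Vec model, and a simple frequency count over the retrieved API classes stands in for the single LDA topic. The toy corpus `POSTS`, the function names, and the `n_similar` parameter are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: (question title, API classes found in the accepted answer's code).
POSTS = [
    ("How to read a text file line by line in Java",
     ["BufferedReader", "FileReader", "IOException"]),
    ("Read file contents into a string",
     ["Files", "Paths", "IOException"]),
    ("Parse a JSON string",
     ["JSONObject", "JSONArray"]),
    ("Write text to a file",
     ["BufferedWriter", "FileWriter", "IOException"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recommend(query, posts=POSTS, n_similar=2, top_k=10):
    """Return up to top_k API classes drawn from the titles most similar to the query."""
    q = Counter(tokenize(query))
    # Stand-in for Doc2Vec: rank titles by bag-of-words cosine similarity.
    ranked = sorted(posts,
                    key=lambda p: cosine(q, Counter(tokenize(p[0]))),
                    reverse=True)
    # Stand-in for the LDA topic: rank API classes by how often they occur
    # in the accepted answers of the retrieved questions.
    counts = Counter()
    for _, classes in ranked[:n_similar]:
        counts.update(classes)
    return [name for name, _ in counts.most_common(top_k)]


result = recommend("read a file in Java")
```

In the approach the abstract describes, the similarity step would use vectors inferred by a Doc2Vec model trained on Stack Overflow question titles, and the ranking step would take the highest-weighted terms of a single LDA topic fitted on the extracted API classes; the sketch only mirrors the data flow between those two steps.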
Keywords
Application Programming Interface (API), Stack Overflow, Natural language (NL) queries
Divisions
fsktm
Publication Title
IET Software
Volume
15
Issue
5