Context
Software product line engineering has proven to be an efficient paradigm for developing families of similar software systems at lower cost, in shorter time, and with higher quality.
Objective
This paper analyzes the literature on product lines from 1995 to 2014, identifying the most influential publications, the most researched topics, and how the interest in those topics has evolved along the way.
Method
Bibliographic data have been gathered from ISI Web of Science and Scopus. The data have been examined using two prominent bibliometric approaches: science mapping and performance analysis.
Source data
This repository stores the raw data retrieved from ISIWoS and Scopus after performing the query:
"software product line" or (("product line" or "mass customization" or "product famil" or "program family" or "software factory" or "product platform") and (("domain engineering" or "application engineering") or ("feature model" or "feature diagram" or "decision model" or "decision diagram") or (software and variabilit* and commonalit*)))
Data aggregation
To combine the records, the citation count for the common records was computed as the maximum of the citations given by ISIWoS and Scopus. This repository stores:
- Records included only in ISIWoS
- Records included only in Scopus
- Common records whose citation count is highest in ISIWoS
- Common records whose citation count is highest in Scopus
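The aggregation rule above can be sketched in a few lines of Python. This is only an illustration, not the script used to build the repository: the function name `aggregate`, the dictionary layout (a record identifier mapped to its citation count), and the choice to file ties under ISIWoS are all assumptions made for the example.

```python
def aggregate(wos, scopus):
    """Split two {record_id: citations} mappings into the four groups.

    The record identifier (e.g. a DOI) and the tie-breaking rule are
    assumptions of this sketch; the source only states that common
    records take the maximum of the two citation counts.
    """
    only_wos = {k: v for k, v in wos.items() if k not in scopus}
    only_scopus = {k: v for k, v in scopus.items() if k not in wos}

    common_wos_highest = {}
    common_scopus_highest = {}
    for key in wos.keys() & scopus.keys():
        if wos[key] >= scopus[key]:  # assumption: ties go to ISIWoS
            common_wos_highest[key] = wos[key]
        else:
            common_scopus_highest[key] = scopus[key]

    return only_wos, only_scopus, common_wos_highest, common_scopus_highest


def merged_citations(wos, scopus):
    """Return one {record_id: citations} mapping using the maximum rule."""
    merged = {}
    for group in aggregate(wos, scopus):
        merged.update(group)
    return merged
```

Because the four groups are disjoint, merging them yields exactly one citation count per record, equal to the maximum reported by the two databases.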
Preprocessing
The data retrieved from bibliographic databases usually contain errors. For instance, references may be duplicated, authors’ names may be spelled in different ways, etc. It is therefore necessary to preprocess the data before carrying out any analysis. To track the evolution of the SPL research area and measure its performance, we have used two approaches that require analyzing publication keywords and citations: Co-Word Analysis and H-index. In particular, this repository includes the keyword standardization we have undertaken.
From the ISIWoS records, a set of 2,000 keywords was available: 1,667 were authors' keywords and 333 were words provided by ISIWoS KeyWords Plus. The Scopus records included a set of 9,308 keywords. The initial aggregated set of 11,308 keywords was progressively reduced by applying the following steps:
- Keywords were converted to uppercase, leading and trailing white-spaces were removed, and inner white-spaces were replaced by the character ‘-’. After that, the repeated keywords were removed.
- Keywords useless to identify research topics inside the SPL area were discarded. For example, SOFTWARE-PRODUCT-LINE, PRODUCT-FAMILY, SOFTWARE-ENGINEERING, etc. are applicable to all the records and thus cannot be used to distinguish a particular topic. Hence, those general keywords were removed.
- Keywords were grouped. To improve the interpretability of the co-word analysis results, the set of keywords was reduced by grouping those words that refer to the same topic. For instance, AUTOMATED-CONFIGURATION, FEATURE-BASED-CONFIGURATION, PRODUCT-DERIVATION-TOOL, STAGED-CONFIGURATION, etc. were grouped as PRODUCT-DERIVATION.
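The three steps above can be sketched as a small Python pipeline. This is a minimal illustration under stated assumptions, not the procedure actually used: the names `normalize`, `standardize`, `GENERAL_KEYWORDS`, and `GROUPS` are hypothetical, the two tables contain only the examples mentioned in the text (spelled with the hyphenation that the first step produces), and runs of several inner white-spaces are assumed to collapse into a single ‘-’.

```python
import re

# Only the examples named in the text; the real lists are much longer.
GENERAL_KEYWORDS = {
    "SOFTWARE-PRODUCT-LINE",
    "PRODUCT-FAMILY",
    "SOFTWARE-ENGINEERING",
}

GROUPS = {
    "AUTOMATED-CONFIGURATION": "PRODUCT-DERIVATION",
    "FEATURE-BASED-CONFIGURATION": "PRODUCT-DERIVATION",
    "PRODUCT-DERIVATION-TOOL": "PRODUCT-DERIVATION",
    "STAGED-CONFIGURATION": "PRODUCT-DERIVATION",
}


def normalize(keyword):
    """Uppercase, trim, and replace inner white-space with '-'."""
    # Assumption: consecutive white-space characters become one '-'.
    return re.sub(r"\s+", "-", keyword.strip().upper())


def standardize(keywords):
    """Apply normalization, general-keyword removal, and grouping."""
    seen = set()
    result = []
    for raw in keywords:
        keyword = normalize(raw)
        if keyword in GENERAL_KEYWORDS:
            continue  # too general to distinguish a topic
        keyword = GROUPS.get(keyword, keyword)  # map to its topic group
        if keyword not in seen:  # drop repeats, keep first-seen order
            seen.add(keyword)
            result.append(keyword)
    return result
```

For example, `standardize(["  software product line ", "Automated Configuration", "staged configuration"])` leaves a single keyword, `PRODUCT-DERIVATION`: the first input is discarded as general, and the other two fall into the same group.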
Results
According to the study, (i) software architecture was the initial driver of research in SPL; (ii) work on systematic software reuse has been essential for the development of the area; and (iii) feature modeling has been the most important topic for the last fifteen years, with the best evolution in terms of the number of published papers and citations received.
Conclusion
Science mapping has been used to identify the main research topics, how interest in those topics has evolved, and the relationships among topics. Performance analysis has been used to identify the most influential papers, the journals and conferences that have published the most papers, the size of the literature on product lines, and its distribution over time.
Contact
If you need further information, do not hesitate to contact Ruben Heradio (rheradio@issi.uned.es).