A Similarity Measure for Retrieving Software Artifacts

M. R. Girardi and B. Ibrahim
University of Geneva, Centre Universitaire d'Informatique
CH-1211 Geneve 4, Switzerland
E-mail: {girardi, bertrand}@cui.unige.ch
A Postscript version of this document can be found here.
Sections 5 and 6 make heavy use of HTML+ features that are not available in all browsers.

Table of Contents

Abstract
1 Introduction
2 Overview of the reuse system
3 The semantic formalism
4 The analysis of the descriptions
5 Similarity analysis
6 Experimental comparison
7 Related work
8 Final remarks
References

Abstract

This paper introduces the main features and the retrieval mechanism of ROSA, a software reuse system based on processing the natural language descriptions of software artifacts. The system supports automatic indexing of components by acquiring lexical, syntactic and semantic knowledge from software descriptions. The retrieval mechanism is based on a similarity analysis that provides good retrieval effectiveness through partial matching of descriptions; processing of synonyms, generalizations and specializations of terms; and consideration of the syntactic and semantic information available in the descriptors of software artifacts.

1 Introduction
