3 The semantic formalism

This section outlines the semantic formalism used to represent software descriptions [5]. The formalism establishes the rules for generating the internal representation of both queries and natural language descriptions of software components. It consists of a case system for simple imperative sentences, together with constraints and heuristics that are used to map a description into a frame-like internal representation.

The case system consists of a sequence of one or more semantic cases. Semantic cases are associated with certain syntactic components of an imperative sentence. An imperative sentence consists of a verb (representing an action), possibly followed by a noun phrase (representing the direct object of the action) and by some embedded prepositional phrases. For instance, the sentence `search a file for a string' consists of the verb `search', in the infinitive form, followed by the noun phrase `a file', which represents the object manipulated by the action, and by the prepositional phrase `for a string', which represents the goal of the `search' action. In this example, the semantic cases `Action', `Location' and `Goal' are associated with the verb, the direct object and the prepositional phrase of the sentence, respectively.
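As an illustration only, the assignment of cases in the example sentence can be sketched as a small frame-like structure. The `Frame` class, its `fill` method and the `build_example_frame` helper are hypothetical names introduced here; the paper does not prescribe a concrete data structure, and the sentence is segmented by hand rather than parsed.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical frame: one slot per semantic case."""
    slots: dict = field(default_factory=dict)

    def fill(self, case: str, phrase: str) -> None:
        # Store the phrase under the name of its semantic case.
        self.slots[case] = phrase


def build_example_frame() -> Frame:
    # Hand-segmented components of `search a file for a string',
    # paired with the semantic cases the text assigns to them.
    components = [
        ("Action", "search"),      # verb
        ("Location", "a file"),    # direct object (noun phrase)
        ("Goal", "for a string"),  # prepositional phrase
    ]
    frame = Frame()
    for case, phrase in components:
        frame.fill(case, phrase)
    return frame


frame = build_example_frame()
print(frame.slots)
```

The pairing of the direct object with `Location' holds for this sentence only; in general the case of a component depends on the verb and on the constraints and heuristics of the formalism.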

Semantic cases show how noun phrases are semantically related to the verb in a sentence. For instance, in the sentence `search a file for a string', the semantic case `Goal' associated with the prepositional phrase `for a string' shows the target of the action `search'. We have defined a basic set of semantic cases for software descriptions by analysing the short descriptions of Unix commands in manual pages. These semantic cases mainly describe the functionality of the component (the action, the target of the action, the medium or location, the mode by which the action is performed, etc.).

A semantic case consists of a case generator (possibly omitted) followed by a nominal or verbal phrase. A case generator reveals the presence of a particular semantic case in a sentence. Case generators are mainly prepositions. For instance, in the sentence `search a file for a string', the preposition `for' in the prepositional phrase `for a string' suggests the `Goal' semantic case.
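The role of a case generator can be sketched as a simple lookup on the first word of a phrase. This is a minimal sketch under stated assumptions: the `CASE_GENERATORS` table and the `split_semantic_case` function are hypothetical, and only the `for' to `Goal' entry is taken from the text; a fuller table of case generators is not given in this section.

```python
from typing import Optional, Tuple

# Hypothetical case generator table. Only `for' -> `Goal' comes from
# the text; other generators are deliberately left out.
CASE_GENERATORS = {"for": "Goal"}


def split_semantic_case(phrase: str) -> Tuple[Optional[str], str]:
    """Return (suggested case, remaining phrase) for one phrase."""
    words = phrase.split()
    if words and words[0].lower() in CASE_GENERATORS:
        # The leading preposition acts as the case generator.
        return CASE_GENERATORS[words[0].lower()], " ".join(words[1:])
    # Generator omitted: the case has to be decided by other means.
    return None, phrase


print(split_semantic_case("for a string"))
print(split_semantic_case("a file"))
```

A phrase with no generator, such as the direct object `a file', is returned unchanged with no suggested case, which reflects the "possibly omitted" generator mentioned above.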

4 The analysis of the descriptions


This is a section of a local copy of the paper A Similarity Measure for Retrieving Software Artifacts by M. R. Girardi and B. Ibrahim.
