
CodeHow

BASE-LAB edited this page Feb 13, 2020 · 3 revisions

CodeHow is from the paper "CodeHow: Effective Code Search based on API Understanding and Extended Boolean Model".

To address the lack of query understanding in existing tools, the authors propose CodeHow, which recognizes the potential APIs a query refers to, expands the query with those APIs, and performs code retrieval with the Extended Boolean model, which considers the impact of both text similarity and potential APIs on code search.

Code Structures

The relevant code consists of:

  • test/java/CS/index/CodeHowAPIIndex:

    CodeHow needs an index of potential APIs for API understanding, so this class builds the API index from the Java SE API documentation.

  • test/java/CS/search/CodeHowSearch:

    The search process of CodeHow.

  • main/java/CS/methods/CodeHow/ExpandQueryBuilder:

    Following the paper, we first find K potential APIs for the given query and calculate their scores. Then, for each potential API, we formulate an extended Boolean query to search the whole document set. Finally, we add up all these queries together with the pure text similarity query to assemble a sum query, and take the top K candidates from the ranking result.

    Each sub-query and each wrapping query in this procedure is implemented with FuncScoreQuery.
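
The scoring idea described above can be sketched with the standard p-norm operators of the Extended Boolean model. This is only an illustration, not the code in ExpandQueryBuilder: the class name ExtendedBooleanSketch, its method names, the apiMatch input, and the choice to combine text similarity and each API score with an OR are all assumptions made for the example. All weights are assumed to lie in [0, 1].

```java
/** Hypothetical illustration of Extended Boolean scoring; not the repository's ExpandQueryBuilder. */
class ExtendedBooleanSketch {

    /** p-norm OR: grows with every weight, reaches 1 only if all weights are 1. */
    static double or(double p, double... weights) {
        double sum = 0.0;
        for (double w : weights) {
            sum += Math.pow(w, p);
        }
        return Math.pow(sum / weights.length, 1.0 / p);
    }

    /** p-norm AND: penalizes the distance of each weight from 1. */
    static double and(double p, double... weights) {
        double sum = 0.0;
        for (double w : weights) {
            sum += Math.pow(1.0 - w, p);
        }
        return 1.0 - Math.pow(sum / weights.length, 1.0 / p);
    }

    /**
     * Assumed "sum query": the pure text similarity plus one sub-query per
     * potential API. apiScores[i] is the score of the i-th potential API and
     * apiMatch[i] says how strongly the candidate document matches that API.
     * Combining the two parts with OR is an assumption of this sketch.
     */
    static double sumQueryScore(double textSim, double[] apiScores, double[] apiMatch, double p) {
        double total = textSim;
        for (int i = 0; i < apiScores.length; i++) {
            total += or(p, textSim, apiScores[i] * apiMatch[i]);
        }
        return total;
    }
}
```

With this sketch, a document that matches a high-scoring potential API ranks above a document with the same text similarity that does not match it; p = 2 is a common choice for the norm, but the value used by this project is not stated here.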

Usage

Data Preparation

  • Download the CosBench dataset.
  • Set the path of the CosBench codebase to Config.Util/codebaseOrigin.
  • Download the library documentation dataset. We use API information obtained from the Java 8 documentation. This is the processed dataset we used. To use another library, convert its documentation into JSON in the following format:
    [   
        {
            "func" : [
                         {
                             "funcName": "xxx",
                             "funcDes":"xxx"
                         },
                         ...
                     ],
            "name" : "xxx"
        },
        ...
    ]
    
    where "name" is the package name, "func" is an array of the functions in this package, "funcName" is the name of a function, and "funcDes" is its description. See the original paper for more information.
  • Set the path of the API document to Config.Util/APIOrigin.
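
For another library, the conversion to the JSON layout above could look roughly like the following sketch. It uses only the Java standard library; the class name ApiDocJsonSketch, its methods, and the in-memory input shape (package name mapped to a list of {funcName, funcDes} pairs) are assumptions made for this example and are not part of the project.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper that serializes API docs into the JSON layout shown above. */
class ApiDocJsonSketch {

    /** Wraps a string in double quotes and escapes characters JSON does not allow raw. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /**
     * packages maps a package name to its functions; each String[] holds
     * {funcName, funcDes}. Iteration order of the map decides output order.
     */
    static String toJson(Map<String, List<String[]>> packages) {
        StringBuilder sb = new StringBuilder("[");
        boolean firstPackage = true;
        for (Map.Entry<String, List<String[]>> pkg : packages.entrySet()) {
            if (!firstPackage) {
                sb.append(',');
            }
            firstPackage = false;
            sb.append("{\"func\":[");
            boolean firstFunc = true;
            for (String[] func : pkg.getValue()) {
                if (!firstFunc) {
                    sb.append(',');
                }
                firstFunc = false;
                sb.append("{\"funcName\":").append(quote(func[0]))
                  .append(",\"funcDes\":").append(quote(func[1])).append('}');
            }
            sb.append("],\"name\":").append(quote(pkg.getKey())).append('}');
        }
        return sb.append(']').toString();
    }
}
```

The resulting string can be written to a file and that file's path set as Config.Util/APIOrigin. A JSON library would normally be preferable to hand-written escaping; the manual version is used here only to keep the sketch dependency-free.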

Index

  • Set the output path of the codebase index to Config.Util/codebaseIndex.
  • Run test/java/CS/index/baseIndex.
  • Set the output path of the API index to Config.Util/APIIndex.
  • Run test/java/CS/index/CodeHowAPIIndex.

Search

  • Run test/java/CS/search/CodehowSearch/search with a query.

Evaluation

  • Set the path of the CosBench QA set to Config.Util/QASet.
  • Set the output path of the evaluation results to Config.Util/CodehowResult.
  • Run test/java/CS/search/CodehowSearch/evaluation.
