BASE-LAB edited this page Feb 13, 2020 · 5 revisions

This project reproduces Lucene-based information retrieval methods for code search. The methods focus on query expansion and on modifying the similarity calculation to improve on conventional code search performance.
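As a rough illustration of what query expansion means here, the sketch below appends related terms to the tokens of a natural-language query before it is sent to the index. The class name `QueryExpansionSketch`, the `expand` method, and the related-terms table are hypothetical and are not taken from this repository; the reproduced methods may derive their expansion terms in quite different ways.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of this repository.
final class QueryExpansionSketch {

    // Lowercases the query, splits it on whitespace, and appends any related
    // terms found in the (assumed) lookup table. Insertion order is kept and
    // duplicate terms are dropped.
    static List<String> expand(String query, Map<String, List<String>> related) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // a blank query yields one empty token
            }
            terms.add(token);
            terms.addAll(related.getOrDefault(token, Collections.<String>emptyList()));
        }
        return new ArrayList<>(terms);
    }
}
```

The expanded term list would then be turned into a search query; how the terms are weighted is a separate design choice that this sketch leaves out.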

Methods include:

Dependency

Tested on Ubuntu 16.04

  • Lucene Core 7.4.0

Code Structures

  • main/java/CS/evalution/: This module contains the definitions and implementations of several evaluation metrics, including MAP, MRR, NDCG, Precision, Recall, F-score, First Position, etc.

  • main/java/CS/methods/: This module contains the implementations of the above information retrieval methods, as well as the prerequisites they require.

  • main/java/CS/model/: This module contains the data models needed for serialization and other low-level supporting models.

  • main/java/CS/similarity/: This module contains modifications to the similarity calculation used in the comparison process.

  • main/java/CS/Util/: This module contains several utilities, including static configuration, loading of the training and test datasets, evaluation, and dataset format conversion.

  • test/java/CS/index/: This module contains the implementation of the indexing process.

  • test/java/CS/search/: This module contains the implementation of the search and evaluation process.

  • test/java/CS/Evaluate: This module contains the implementation of the evaluation-only process.
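Some of the metrics named in the evaluation module can be sketched as follows. This is a minimal sketch using the textbook definitions of First Position, reciprocal rank (averaged over queries to give MRR), and average precision (averaged over queries to give MAP). The class `RankMetricsSketch` and its method names are hypothetical, results are assumed to be identified by strings, and the repository's own implementation may differ in detail.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the repository's evaluation code.
// Assumes the ranked list contains no duplicate result ids.
final class RankMetricsSketch {

    // 1-based rank of the first relevant result, or 0 if no result is relevant.
    static int firstPosition(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                return i + 1;
            }
        }
        return 0;
    }

    // Reciprocal rank for one query: 1 / firstPosition, or 0 when nothing relevant is found.
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        int pos = firstPosition(ranked, relevant);
        return pos == 0 ? 0.0 : 1.0 / pos;
    }

    // Average precision for one query: sum of precision@i over the ranks i that
    // hold a relevant result, divided by the total number of relevant items.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    // Mean over per-query values: gives MRR from reciprocal ranks, MAP from average precisions.
    static double mean(double[] perQuery) {
        if (perQuery.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : perQuery) {
            sum += v;
        }
        return sum / perQuery.length;
    }
}
```

For example, with the ranked list [a, b, c, d] and relevant set {b, d}, the first position is 2, the reciprocal rank is 0.5, and the average precision is (1/2 + 2/4) / 2 = 0.5.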

Usage

Data Preparation

To build the indexes and run the search process, first download the Cosbench dataset.

Configuration

Edit the hyper-parameters and settings in java/CS/Util/ConfigUtil.java.

Search and Evaluation

The search and evaluation processes are driven programmatically.

To run them, edit the corresponding files in test/java/CS/index and test/java/CS/search.

Evaluation Only

For convenience, we provide the Top-20 search results returned by the 6 code search methods in the test/resources/searchResult folder. You can generate the evaluation results in the test/resources/evaluateResult folder by modifying the test/java/CS/Evaluate/EvaluateResult file.

Note that the Top-20 search results you obtain from a search method may differ from ours because of factors such as the versions of the libraries used and the external dataset. We provide only the Top-20 search results used in our experiments.
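To make the evaluation of a fixed Top-k result list concrete, the sketch below computes Precision@k, Recall@k, F-score, and a binary-relevance NDCG@k for one query. It assumes each result is identified by a string and that relevance is binary; the class `TopKEvalSketch` and its methods are hypothetical, and the format of the files in test/resources/searchResult is not modeled here.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the repository's EvaluateResult code.
final class TopKEvalSketch {

    // Number of relevant items among the first k results (k is clamped to the list size).
    static int hitsAtK(List<String> ranked, Set<String> relevant, int k) {
        int limit = Math.min(k, ranked.size());
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return hits;
    }

    // Precision@k: relevant results in the top k, divided by k.
    static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        return k <= 0 ? 0.0 : (double) hitsAtK(ranked, relevant, k) / k;
    }

    // Recall@k: relevant results in the top k, divided by all relevant items.
    static double recallAtK(List<String> ranked, Set<String> relevant, int k) {
        return relevant.isEmpty() ? 0.0 : (double) hitsAtK(ranked, relevant, k) / relevant.size();
    }

    // F-score (F1): harmonic mean of precision and recall.
    static double fScore(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    // Binary-relevance NDCG@k: a relevant result at 1-based rank i contributes
    // 1 / log2(i + 1); the ideal ranking puts min(k, |relevant|) relevant items first.
    static double ndcgAtK(List<String> ranked, Set<String> relevant, int k) {
        int limit = Math.min(k, ranked.size());
        double dcg = 0.0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                dcg += 1.0 / log2(i + 2);
            }
        }
        int ideal = Math.min(k, relevant.size());
        double idcg = 0.0;
        for (int i = 0; i < ideal; i++) {
            idcg += 1.0 / log2(i + 2);
        }
        return idcg == 0.0 ? 0.0 : dcg / idcg;
    }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }
}
```

With k = 20 these functions apply directly to a Top-20 list; averaging the per-query values over all queries gives the figure reported for a method.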
