LuSearch

BASE-LAB edited this page Feb 13, 2020 · 2 revisions

LuSearch is from the paper "Query Expansion via WordNet for Effective Code Search".

This code search method focuses on query expansion.

It addresses the problem that a word in a query can be expressed by several semantically similar words in the source code. The approach expands a query with synonyms generated from WordNet, extracts natural language phrases from source code identifiers, matches the expanded queries against these phrases, and sorts the search results.
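The phrase extraction and matching steps can be sketched in plain Java. This is a minimal illustration and not the code from the repository or the paper: the class name `PhraseMatchSketch`, its method names, and the simple "count matched query terms" score are all assumptions made for the example. The expanded query is passed in as one set of alternatives (the original word plus its synonyms) per query term.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names and scoring are illustrative, not taken from the repository.
class PhraseMatchSketch {

    // Split an identifier such as "readFileToString" or "max_value" into lowercase words.
    static List<String> splitIdentifier(String identifier) {
        List<String> words = new ArrayList<>();
        // Break at underscores and at lower-case/digit to upper-case transitions.
        for (String part : identifier.split("_|(?<=[a-z0-9])(?=[A-Z])")) {
            if (!part.isEmpty()) {
                words.add(part.toLowerCase());
            }
        }
        return words;
    }

    // Count how many query terms have at least one alternative among the identifier's words.
    static int score(List<Set<String>> expandedQuery, String identifier) {
        Set<String> phrase = new HashSet<>(splitIdentifier(identifier));
        int hits = 0;
        for (Set<String> alternatives : expandedQuery) {
            for (String alternative : alternatives) {
                if (phrase.contains(alternative)) {
                    hits++;
                    break; // one match per query term is enough
                }
            }
        }
        return hits;
    }

    // Keep identifiers with at least one match and sort them by descending score.
    static List<String> rank(List<Set<String>> expandedQuery, List<String> identifiers) {
        List<String> result = new ArrayList<>();
        for (String id : identifiers) {
            if (score(expandedQuery, id) > 0) {
                result.add(id);
            }
        }
        result.sort(Comparator.comparingInt((String id) -> score(expandedQuery, id)).reversed());
        return result;
    }
}
```

With the query "delete file" expanded to `{delete, remove, erase}` and `{file, document}`, the identifier `removeFile` matches both terms even though it never contains the word "delete", which is the effect query expansion is meant to achieve.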

The relevant code consists of:

  • test/java/CS/search/LuSearch:

    LuSearch uses the same index built by base Lucene. The difference is in the search process, where WordNet expansion requires an analyzer other than the standard analyzer.

        Analyzer analyzer = new WordnetQueryAnalyzer();
    
  • main/java/CS/methods/Wordnet/WordnetQueryAnalyzer: In the WordNet analyzer, we build a WordNet synonym wordset from an external dataset and then apply this wordset to each token in the original token stream.
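The idea behind the analyzer, building a synonym wordset and applying it to every token, can be sketched without Lucene. The following is a simplified stand-in and not the actual `WordnetQueryAnalyzer`: the class `SynonymExpansionSketch` and its methods are hypothetical, and the synonym groups are passed in directly instead of being read from the WordNet dataset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a plain-Java stand-in for a synonym-expanding analyzer.
class SynonymExpansionSketch {

    // Build a word -> synonyms lookup; each input group plays the role of one synset.
    static Map<String, Set<String>> buildWordset(List<List<String>> synsets) {
        Map<String, Set<String>> wordset = new HashMap<>();
        for (List<String> synset : synsets) {
            for (String word : synset) {
                Set<String> synonyms =
                        wordset.computeIfAbsent(word.toLowerCase(), k -> new LinkedHashSet<>());
                for (String other : synset) {
                    synonyms.add(other.toLowerCase());
                }
            }
        }
        return wordset;
    }

    // Replace each token with the set containing the token itself and its synonyms.
    static List<Set<String>> expand(List<String> tokens, Map<String, Set<String>> wordset) {
        List<Set<String>> expanded = new ArrayList<>();
        for (String token : tokens) {
            String normalized = token.toLowerCase();
            Set<String> alternatives = new LinkedHashSet<>();
            alternatives.add(normalized); // the original token always comes first
            alternatives.addAll(wordset.getOrDefault(normalized, Collections.emptySet()));
            expanded.add(alternatives);
        }
        return expanded;
    }
}
```

Tokens without an entry in the wordset pass through unchanged, so the expanded query never loses a term from the original query.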

Usage

Data Preparation

  • Download the CosBench dataset.
  • Set the path of the codebase of CosBench to Config.Util/codebaseOrigin.
  • Download WordNet. This is the WordNet we used. This is the website of WordNet.
  • Set the path of WordNet to Config.Util/wordnetPath.

Index

  • Set the output path of the codebase index to Config.Util/codebaseIndex.
  • Run test/java/CS/index/baseIndex.

Search

  • Run test/java/CS/search/luSearch/search with a query.

Evaluation

  • Set the path of the QAset of CosBench to Config.Util/QASet.
  • Set the output path of the evaluation result to Config.Util/luSearchResult.
  • Run test/java/CS/search/luSearch/evaluation.
