Lucene Source Code Analysis - core

2018-09-30  机器智能

Apache Lucene is a high-performance, full-featured text search engine library. Here's a simple example of how to use Lucene for indexing and searching (using JUnit to check that the results are what we expect):

    Analyzer analyzer = new StandardAnalyzer();

    Path indexPath = Files.createTempDirectory("tempIndex");
    Directory directory = FSDirectory.open(indexPath);
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter iwriter = new IndexWriter(directory, config);
    Document doc = new Document();
    String text = "This is the text to be indexed.";
    doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
    iwriter.addDocument(doc);
    iwriter.close();
    
    // Now search the index:
    DirectoryReader ireader = DirectoryReader.open(directory);
    IndexSearcher isearcher = new IndexSearcher(ireader);
    // Parse a simple query that searches for "text":
    QueryParser parser = new QueryParser("fieldname", analyzer);
    Query query = parser.parse("text");
    ScoreDoc[] hits = isearcher.search(query, 10).scoreDocs;
    assertEquals(1, hits.length);
    // Iterate through the results:
    for (int i = 0; i < hits.length; i++) {
      Document hitDoc = isearcher.doc(hits[i].doc);
      assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
    }
    ireader.close();
    directory.close();
    IOUtils.rm(indexPath);

The Lucene API is divided into several packages:

(The package descriptions are translated in a separate article.)

To use Lucene, an application should:

  1. Create Documents (org.apache.lucene.document.Document) by adding Fields (org.apache.lucene.document.Field);

  2. Create an IndexWriter (org.apache.lucene.index.IndexWriter) and add documents to it with addDocument();

  3. Call QueryParser.parse() to build a query from a string; and

  4. Create an IndexSearcher (org.apache.lucene.search.IndexSearcher) and pass the query to its search() method.
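To get a feel for what these four steps do conceptually, here is a toy sketch that uses only the JDK. It is not Lucene code and not how Lucene is implemented: the class ToyIndex and all of its methods are hypothetical names made up for illustration. It assumes a trivial "analyzer" (lowercase, split on non-alphanumerics) and, unlike a real query parser, simply requires every query term to be present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for the index/search workflow; hypothetical, not part of Lucene. */
final class ToyIndex {
    // Original text of each document, addressed by document id.
    private final List<String> storedTexts = new ArrayList<>();
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Steps 1-2: analyze the text and record which document contains each term. */
    int addDocument(String text) {
        int docId = storedTexts.size();
        storedTexts.add(text);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Steps 3-4: analyze the query string and return ids of documents containing every term. */
    List<Integer> search(String queryText) {
        Set<Integer> result = null;
        for (String term : analyze(queryText)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    /** Returns the stored text of a hit, loosely like reading a stored field. */
    String get(int docId) {
        return storedTexts.get(docId);
    }

    /** Simplified analyzer (an assumption of this sketch): lowercase, split on non-alphanumerics. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("This is the text to be indexed.");
        index.addDocument("Another document.");
        for (int docId : index.search("text")) {
            System.out.println(docId + ": " + index.get(docId));
        }
    }
}
```

The point of the sketch is only the division of labor: documents are analyzed into terms at write time, and a query is analyzed the same way at search time so that the terms line up.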

To demonstrate these, try something like:

> java -cp lucene-core.jar:lucene-demo.jar:lucene-analyzers-common.jar org.apache.lucene.demo.IndexFiles -index index -docs rec.food.recipes/soups 
adding rec.food.recipes/soups/abalone-chowder
  [ ... ]
> java -cp lucene-core.jar:lucene-demo.jar:lucene-queryparser.jar:lucene-analyzers-common.jar org.apache.lucene.demo.SearchFiles
Query: chowder
Searching for: chowder
34 total matching documents
1. rec.food.recipes/soups/spam-chowder 
  [ ... thirty-four documents contain the word "chowder" ... ]

Query: "clam chowder" AND Manhattan 
Searching for: +"clam chowder" +manhattan 
2 total matching documents 
1. rec.food.recipes/soups/clam-chowder 
  [ ... two documents contain the phrase "clam chowder" and the word "manhattan" ... ] 
    [ Note: "+" and "-" are canonical, but "AND", "OR" and "NOT" may be used. ]
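The canonical form shown in the "Searching for:" lines marks each clause as required ("+") or prohibited ("-"). As a rough illustration of what those markers mean, here is a toy matcher using only the JDK. It is not Lucene's query parser or scorer: ToyClauseMatcher and its methods are hypothetical names, it handles single terms only (no quoted phrases), and the rule it applies to queries without any required clause is a simplifying assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy model of "+" (required) and "-" (prohibited) clauses; hypothetical, not Lucene code. */
final class ToyClauseMatcher {

    /** Lowercases the text and splits it on non-alphanumeric characters. */
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    /**
     * Returns true if the document text satisfies a query such as "+chowder -clam soup".
     * Every "+" term must be present and no "-" term may be present. If there is no
     * "+" term, at least one unprefixed (optional) term must be present.
     */
    static boolean matches(String canonicalQuery, String docText) {
        Set<String> docTerms = terms(docText);
        boolean sawRequired = false;
        boolean optionalMatched = false;
        for (String clause : canonicalQuery.trim().split("\\s+")) {
            if (clause.isEmpty()) {
                continue;
            }
            char prefix = clause.charAt(0);
            boolean prefixed = prefix == '+' || prefix == '-';
            String term = (prefixed ? clause.substring(1) : clause).toLowerCase(Locale.ROOT);
            boolean present = docTerms.contains(term);
            if (prefix == '+') {
                sawRequired = true;
                if (!present) {
                    return false; // a required term is missing
                }
            } else if (prefix == '-') {
                if (present) {
                    return false; // a prohibited term is present
                }
            } else if (present) {
                optionalMatched = true;
            }
        }
        return sawRequired || optionalMatched;
    }

    public static void main(String[] args) {
        String doc = "Manhattan clam chowder";
        System.out.println(matches("+chowder +manhattan", doc)); // true
        System.out.println(matches("+chowder -manhattan", doc)); // false
    }
}
```

In these terms, a query typed as chowder AND Manhattan corresponds to +chowder +manhattan: both terms are required.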
