Lucene: Building a File Index and Searching It (Lucene 2.2)

2012-10-26 

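A recent project required me to start learning how to use Lucene. I have a copy of "Lucene in Action" at hand, but as soon as I tried it I found that, with the Lucene 2.0 package I am using, the book's very first example does not compile. After digging through some material I finally got it working, which I will count as a good start.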

1. Building the index:

package demo.example.searcher;

import java.io.*;
import java.util.*;

import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class Indexer {
    private static Log log = LogFactory.getLog(Indexer.class);

    public static void main(String[] args) throws Exception {
        File indexDir = new File("C:\\index");       // where the index is written
        File dataDir = new File("C:\\lucene\\src");  // directory tree to index
        long start = new Date().getTime();
        int numIndexed = index(indexDir, dataDir);
        long end = new Date().getTime();
        System.out.println("indexed:" + numIndexed + " use:" + (end - start) + " ms");
    }

    // Builds a fresh index of everything under dataDir; returns the document count.
    public static int index(File indexDir, File dataDir) {
        int ret = 0;
        try {
            // true = create a new index, overwriting any existing one
            IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
            writer.setUseCompoundFile(false);
            indexDirectory(writer, dataDir);
            ret = writer.docCount();
            writer.optimize();
            writer.close();
        } catch (Exception e) {
            log.error("Indexing failed", e);
        }
        return ret;
    }

    // Recursively indexes every file under dir.
    public static void indexDirectory(IndexWriter writer, File dir) {
        File[] files = dir.listFiles();
        if (files == null) {  // not a directory, or it could not be read
            return;
        }
        for (File f : files) {
            if (f.isDirectory()) {
                indexDirectory(writer, f);
            } else {
                indexFile(writer, f);
            }
        }
    }

    // Adds one document: tokenized "contents" (not stored) plus the stored file path.
    public static void indexFile(IndexWriter writer, File f) {
        try {
            System.out.println("Indexing:" + f.getCanonicalPath());
            Document doc = new Document();
            Reader txtReader = new FileReader(f);  // reads with the platform default charset
            doc.add(new Field("contents", txtReader));
            doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));
            writer.addDocument(doc);
        } catch (Exception e) {
            log.error("Could not index " + f, e);
        }
    }
}

2. Querying the index built by the class above:

package demo.example.searcher;

import java.util.*;

import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.document.*;
import org.apache.lucene.queryParser.*;
import org.apache.lucene.search.*;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class Searcher {
    private static Log log = LogFactory.getLog(Searcher.class);

    public static void main(String[] args) {
        String indexDir = "C:\\index";  // index directory written by Indexer
        String q = "query keywords";    // placeholder: the text to search for
        search(indexDir, q);
    }

    public static void search(String indexDir, String q) {
        IndexSearcher is = null;
        try {
            is = new IndexSearcher(indexDir);
            // Parse the query against "contents" with the analyzer used for indexing
            QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer());
            Query query = queryParser.parse(q);
            long start = new Date().getTime();
            Hits hits = is.search(query);
            long end = new Date().getTime();
            System.out.println("use:" + (end - start) + " ms");
            for (int i = 0; i < hits.length(); i++) {
                Document doc = hits.doc(i);
                System.out.println("The right file:" + doc.get("filename"));
            }
        } catch (Exception e) {
            log.error("Search failed", e);
        } finally {
            if (is != null) {
                try {
                    is.close();  // release the index files
                } catch (Exception e) {
                    log.error("Could not close searcher", e);
                }
            }
        }
    }
}
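
As a side note for readers new to the idea: the two classes above split the work into building an index once and then looking terms up in it. A minimal sketch of that idea in plain Java, using a toy inverted index, might look like the following. The class name ToyInvertedIndex and its methods are hypothetical, made up for illustration; this is not Lucene's API, and it does not reflect how Lucene tokenizes text or stores its index on disk.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical, for illustration only): maps each
// lowercased term to the sorted set of document names that contain it.
class ToyInvertedIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // "Indexing": split the text on non-letter, non-digit characters and
    // record the document name under every term found.
    void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // "Searching": look the term up directly instead of scanning every document.
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Indexer.java", "public class Indexer uses IndexWriter");
        index.add("Searcher.java", "public class Searcher uses IndexSearcher");
        System.out.println(index.search("indexwriter")); // [Indexer.java]
        System.out.println(index.search("class"));       // [Indexer.java, Searcher.java]
    }
}
```

The point of the sketch is only the division of labor: the cost of reading the files is paid once at indexing time, and a query becomes a map lookup. A term that the indexing step never recorded cannot be found later, no matter how the query is written.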

In the end, everything ran correctly.

However, while running the tests I came across something I do not understand:

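The files being indexed are all Java classes. When I test keyword queries, both Chinese and English keywords work fine, but I found that some of the information in the Java source files has been filtered out and cannot be retrieved. What is going on here? Does Lucene automatically filter out the comments in class files?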
