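A while ago I worked on the server side of a mobile application, built mainly on Spring and Hibernate. The system needed full-text search, so Compass, the open-source search engine framework, was a natural choice; in the J2EE world I personally find it the most convenient option currently available. To improve performance, we relied heavily on the indexes Compass builds to serve data queries. The system mostly reads data and rarely modifies it, so this approach fit well and gave good results.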
First, the Spring and Compass integration file, applicationContext-compass.xml:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd"
       default-lazy-init="true">

    <!-- Compass annotation configuration -->
    <bean id="annotationConfiguration" class="org.compass.annotations.config.CompassAnnotationsConfiguration">
    </bean>

    <!-- Compass bean -->
    <bean id="compass" class="org.compass.spring.LocalCompassBean">
        <!-- Directory where the index is stored -->
        <property name="connection">
            <value>file://${index.path}</value>
        </property>
        <!-- Entities to be indexed -->
        <property name="classMappings">
            <list>
                <value>com.***.mobile.entity.KnowledgeInfo</value>
                <value>com.***.mobile.entity.ChannelInfo</value>
                <value>com.***.mobile.entity.MobileBaseEntity</value>
                <value>com.***.mobile.entity.BaseEntity</value>
                <value>com.***.mobile.entity.BusLine</value>
            </list>
        </property>
        <property name="compassConfiguration" ref="annotationConfiguration"></property>
        <property name="compassSettings">
            <props>
                <!-- Compass transaction factory -->
                <prop key="compass.transaction.factory">org.compass.spring.transaction.SpringSyncTransactionFactory</prop>
                <!-- Paoding analyzer (Chinese word segmentation) -->
                <prop key="compass.engine.analyzer.MMAnalyzer.CustomAnalyzer">net.paoding.analysis.analyzer.PaodingAnalyzer</prop>
                <!-- Markup wrapped around highlighted matches -->
                <prop key="compass.engine.highlighter.default.formatter.simple.pre"><![CDATA[<font color="#cc0033"><b>]]></prop>
                <prop key="compass.engine.highlighter.default.formatter.simple.post"><![CDATA[</b></font>]]></prop>
            </props>
        </property>
        <property name="transactionManager" ref="transactionManager" />
    </bean>

    <!-- Uses the Hibernate 3 event system to support real-time data mirroring:
         data changed through Hibernate is reflected in the index automatically -->
    <bean id="hibernateGpsDevice" class="org.compass.gps.device.hibernate.dep.Hibernate3GpsDevice">
        <property name="name">
            <value>hibernateDevice</value>
        </property>
        <property name="sessionFactory" ref="sessionFactory" />
        <property name="mirrorDataChanges">
            <value>true</value>
        </property>
    </bean>

    <!-- Keeps the index in sync with the database -->
    <bean id="compassGps" class="org.compass.gps.impl.SingleCompassGps" init-method="start" destroy-method="stop">
        <property name="compass" ref="compass" />
        <property name="gpsDevices">
            <list>
                <bean class="org.compass.spring.device.hibernate.dep.SpringHibernate3GpsDevice">
                    <property name="name" value="hibernateDevice"/>
                    <property name="sessionFactory" ref="sessionFactory"/>
                </bean>
            </list>
        </property>
    </bean>

    <bean id="compassTemplate" class="org.compass.core.CompassTemplate">
        <property name="compass" ref="compass" />
    </bean>
</beans>
Next come the Compass annotations on the entities. Annotate each entity that should be indexed, along with the properties to index:
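import java.io.Serializable;

import javax.persistence.Entity;
import javax.persistence.Table;

import org.compass.annotations.Index;
import org.compass.annotations.Searchable;
import org.compass.annotations.SearchableId;
import org.compass.annotations.SearchableProperty;
import org.compass.annotations.Store;

@Entity
@Table(name="B_ENTITY_BASE")
@Searchable
public class MobileBaseEntity implements Serializable {

    private static final long serialVersionUID = -5594658438463757978L;

    @SearchableId
    protected String id;

    // Indexed under the shared name "keys"
    @SearchableProperty(name = "keys")
    protected String name;

    // Stored in the index for retrieval, but not searchable
    @SearchableProperty(index = Index.NO, store = Store.YES)
    protected String address;

    protected String phone;
    protected String description;

    @SearchableProperty(index = Index.NO, store = Store.YES)
    protected Double longitude;

    @SearchableProperty(index = Index.NO, store = Store.YES)
    protected Double latitude;

    @SearchableProperty(index = Index.NO, store = Store.YES)
    protected String type;

    protected String channel;
    protected String knowledgeId;
    protected String stauts; // 0 = review rejected, 1 = under review, 2 = normal
}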
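@SearchableId: marks the identifier property; it does not require search meta-data to be defined.
@SearchableProperty(name = "keys"): the name under which the property is indexed (an alias for the property). Queries refer to this name, and several properties across several entity classes can share the same name so that a single full-text query covers all of them.
@SearchableProperty(index = Index.NO, store = Store.YES): stores the property's value in the index but does not index it, so it can be retrieved but not searched. index = Index.XXX selects the indexing strategy, for example: (1) NOT_ANALYZED, indexed without tokenizing; (2) ANALYZED, tokenized and indexed.
@SearchableComponent: maps an associated object as part of the owning object's index; used for composite types.
@SearchableReference: used for reference types.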
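To make the index and store options above more concrete, here is a minimal toy model in plain Java. It is not Compass or Lucene code: the class `IndexStoreSketch`, its `IndexMode` enum, and the whitespace "analyzer" are all hypothetical stand-ins, and a real analyzer such as Paoding tokenizes very differently. The sketch only illustrates the idea that indexing decides whether a value can be searched, while storing decides whether it can be read back.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of the index/store options; NOT Compass or Lucene code. */
class IndexStoreSketch {
    // Hypothetical stand-in for the Index.NO / NOT_ANALYZED / ANALYZED strategies
    enum IndexMode { NO, NOT_ANALYZED, ANALYZED }

    // "field:term" -> ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();
    // "id/field" -> original value, kept only when store is true
    private final Map<String, String> stored = new HashMap<>();

    void add(String id, String field, String value, IndexMode mode, boolean store) {
        if (store) {
            stored.put(id + "/" + field, value);
        }
        List<String> terms = new ArrayList<>();
        if (mode == IndexMode.NOT_ANALYZED) {
            terms.add(value); // the whole value becomes a single term
        } else if (mode == IndexMode.ANALYZED) {
            // Naive "analyzer": lower-case, then split on whitespace
            for (String t : value.toLowerCase().split("\\s+")) {
                if (!t.isEmpty()) {
                    terms.add(t);
                }
            }
        }
        // IndexMode.NO adds no terms, so the value cannot be found by search
        for (String t : terms) {
            postings.computeIfAbsent(field + ":" + t, k -> new LinkedHashSet<>()).add(id);
        }
    }

    Set<String> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, new LinkedHashSet<>());
    }

    String storedValue(String id, String field) {
        return stored.get(id + "/" + field);
    }

    public static void main(String[] args) {
        IndexStoreSketch idx = new IndexStoreSketch();
        idx.add("1", "keys", "Central Railway Station", IndexMode.ANALYZED, true);
        idx.add("1", "type", "Bus Stop", IndexMode.NOT_ANALYZED, true);
        idx.add("1", "address", "12 Main Street", IndexMode.NO, true);

        System.out.println(idx.search("keys", "railway"));   // [1]
        System.out.println(idx.search("type", "Bus Stop"));  // [1]
        System.out.println(idx.search("address", "main"));   // []
        System.out.println(idx.storedValue("1", "address")); // 12 Main Street
    }
}
```

In this model the address behaves like the properties annotated with index = Index.NO, store = Store.YES in the entity: it comes back with a hit, but no query can match on it.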
With that in place, the Compass API can be used to query data. First, a wildcard (fuzzy-match) query:
// Needs: java.util.ArrayList, java.util.List, org.compass.core.*,
// org.compass.core.CompassQueryBuilder.CompassBooleanQueryBuilder
List<BusLine> busLineList = new ArrayList<BusLine>();
Compass compass = this.compassTemplate.getCompass();
CompassSession compassSession = compass.openSession();
try {
    CompassQueryBuilder queryBuilder = compassSession.queryBuilder();
    // Restrict the query to the BusLine entity
    CompassQuery queryAlias = queryBuilder.alias(BusLine.class.getSimpleName());
    CompassBooleanQueryBuilder boolQueryBuilder = queryBuilder.bool();
    boolQueryBuilder.addMust(queryAlias);
    // Match any indexed term that contains the search value
    boolQueryBuilder.addMust(queryBuilder.wildcard("baseEntityKeys", "*" + value + "*"));
    CompassHits compassHits = boolQueryBuilder.toQuery().hits();
    if (compassHits != null) {
        for (int i = 0; i < compassHits.length(); i++) {
            busLineList.add((BusLine) compassHits.data(i));
        }
    }
} finally {
    // Close the session even if the query throws
    compassSession.close();
}
return busLineList;
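The pattern `"*" + value + "*"` in the query above starts with a wildcard. The following plain-Java sketch is a toy model, not Compass or Lucene internals, of why that affects cost: with a sorted term dictionary, a prefix pattern can jump to a range, whereas a pattern with a leading `*` has to test every term. The class `WildcardScanSketch` and its `examined` counter are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy sorted term dictionary; NOT how Compass or Lucene is implemented. */
class WildcardScanSketch {
    private final TreeSet<String> terms = new TreeSet<>();
    int examined; // number of terms tested by the most recent lookup

    WildcardScanSketch(List<String> allTerms) {
        terms.addAll(allTerms);
    }

    /** Pattern "*text*": there is no usable prefix, so every term is tested. */
    List<String> infix(String text) {
        examined = 0;
        List<String> out = new ArrayList<>();
        for (String t : terms) {
            examined++;
            if (t.contains(text)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Pattern "text*": only the sorted range that starts at the prefix is visited. */
    List<String> prefix(String text) {
        examined = 0;
        List<String> out = new ArrayList<>();
        for (String t : terms.tailSet(text, true)) {
            examined++;
            if (!t.startsWith(text)) {
                break; // sorted order: no later term can match
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        WildcardScanSketch dict = new WildcardScanSketch(
                Arrays.asList("airport", "bus12", "bus7", "metro1", "station", "subway"));

        System.out.println(dict.infix("us") + " examined=" + dict.examined);
        System.out.println(dict.prefix("bus") + " examined=" + dict.examined);
    }
}
```

On this six-term dictionary both lookups return the same two terms, but the infix lookup tests all six terms while the prefix lookup tests three. On a real index the gap grows with the number of distinct terms.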
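Whether to use wildcard matching or full-text search depends on the use case. We used wildcard matching wherever it was sufficient, on the view that full-text search is the more expensive of the two. Note, however, that a pattern with a leading `*` cannot take advantage of the sorted term index, so the relative cost is worth measuring on real data. An example of full-text search: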
// Needs: java.util.ArrayList, java.util.List, java.util.Properties, org.compass.core.*,
// org.compass.core.CompassQueryBuilder.CompassBooleanQueryBuilder
List<KnowledgeInfo> result = new ArrayList<KnowledgeInfo>();
// Guard against null before trimming
String keywords = (keyword == null) ? "" : keyword.trim();
Properties prop = PropertyUtil.readPropertyFile();
String type = prop.getProperty("knowledge.zc.type");
ChannelInfo channel = channelInfoService.getChannelById(type);
Compass compass = this.compassTemplate.getCompass();
CompassSession session = compass.openSession();
try {
    CompassQueryBuilder builder = session.queryBuilder();
    CompassBooleanQueryBuilder boolBuilder = builder.bool();
    // Restrict results to entries whose "sort" value starts with the channel's sort code
    boolBuilder.addMust(builder.wildcard("sort", channel.getSort() + "*"));
    if (!keywords.isEmpty()) {
        // Every whitespace-separated keyword must match the "keys" property
        for (String value : keywords.split("\\s+")) {
            boolBuilder.addMust(builder.queryString("keys:" + value).toQuery());
        }
        CompassHits hits = boolBuilder.toQuery().hits();
        if (hits != null) {
            for (int i = 0; i < hits.length(); i++) {
                KnowledgeInfo info = (KnowledgeInfo) hits.data(i);
                // Use the highlighted fragment, when there is one, as the displayed keywords
                String ht = hits.highlighter(i).fragment("keys");
                if (ht != null) {
                    info.setKeywords(ht);
                }
                result.add(info);
            }
        }
    }
} finally {
    // Close the session even if the query throws
    session.close();
}
return result;
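The keyword handling in the example above (split the input, skip blanks, require every term on the `keys` property) can be isolated and tested without an index. The sketch below is plain Java with hypothetical names (`KeywordQuerySketch`, `terms`, `toQueryString`). It builds a single Lucene-style query string with one `+field:term` clause per keyword; that Compass's `queryString` treats such a string the same way as the separate `addMust` calls is an assumption, and special query characters are not escaped here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for turning user input into required query clauses. */
class KeywordQuerySketch {

    /** Splits user input on any run of whitespace and drops empty tokens. */
    static List<String> terms(String keyword) {
        List<String> out = new ArrayList<>();
        if (keyword == null) {
            return out;
        }
        for (String t : keyword.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Builds one required clause per term, e.g. "+keys:bus +keys:station". */
    static String toQueryString(String field, List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String t : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // '+' marks the clause as required in Lucene query syntax;
            // query special characters in t are NOT escaped in this sketch
            sb.append('+').append(field).append(':').append(t);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parsed = terms("  bus   station ");
        System.out.println(parsed);                        // [bus, station]
        System.out.println(toQueryString("keys", parsed)); // +keys:bus +keys:station
    }
}
```

Keeping this step separate makes it easy to check edge cases such as null, blank, or multiply spaced input before any session is opened.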
That completes a simple Compass application.