Chinese Word-Segmentation Queries in elasticsearch with smartcn

Published: 2018-01-16 20:35 | Category: elasticsearch

We create a new index, film2.

Then, when defining the mapping, we specify the smartcn analyzer:

POST http://192.168.1.111:9200/film2/_mapping/dongzuo/

{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "smartcn"
    },
    "publishDate": {
      "type": "date"
    },
    "content": {
      "type": "text",
      "analyzer": "smartcn"
    },
    "director": {
      "type": "keyword"
    },
    "price": {
      "type": "float"
    }
  }
}
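Since the rest of this post works from Java, the same mapping request can also be sent from Java. The following is only a minimal sketch using the JDK's HttpURLConnection rather than the elasticsearch client; the class name PutSmartcnMapping and the helper names are made up for illustration, and it assumes the film2 index already exists and the smartcn plugin is installed on the node.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PutSmartcnMapping {

    /** The mapping body from this post, with every object properly closed. */
    static String mappingJson() {
        return "{\"properties\":{"
            + "\"title\":{\"type\":\"text\",\"analyzer\":\"smartcn\"},"
            + "\"publishDate\":{\"type\":\"date\"},"
            + "\"content\":{\"type\":\"text\",\"analyzer\":\"smartcn\"},"
            + "\"director\":{\"type\":\"keyword\"},"
            + "\"price\":{\"type\":\"float\"}"
            + "}}";
    }

    /** POSTs a JSON body to the given URL and returns the HTTP status code. */
    static int postJson(String url, String json) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws Exception {
        // Host, index and type are the ones used in this post.
        String url = "http://192.168.1.111:9200/film2/_mapping/dongzuo/";
        System.out.println("HTTP status: " + postJson(url, mappingJson()));
    }
}
```

A 200 status would indicate that the node accepted the mapping; any other status is worth checking against the response body before loading data.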

Then run the data-loading code from earlier to populate the index.

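As a result, the earlier film index holds data analyzed with the standard analyzer, which splits Chinese text into one token per character, while film2 uses smartcn, which segments the text into words based on its built-in Chinese vocabulary.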
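To make the difference in token granularity concrete, here is a toy sketch in plain Java. It is not how smartcn works internally (the underlying Lucene analyzer uses a probabilistic model); the forward maximum matching below, the two-word dictionary, and the class name SegmentationDemo are all assumptions chosen only to show per-character tokens next to word-level tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SegmentationDemo {

    // Hypothetical mini-dictionary, for illustration only.
    static final Set<String> DICT = new HashSet<>(Arrays.asList("非洲", "星球"));
    // Longest word in the dictionary above.
    static final int MAX_WORD_LEN = 2;

    /** One token per character, similar to what the standard analyzer yields for Chinese. */
    static List<String> perCharacter(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    /** Toy forward maximum matching: take the longest dictionary word at each position. */
    static List<String> dictionaryBased(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + MAX_WORD_LEN);
            // Shrink the candidate until it is a known word or a single character.
            while (end > i + 1 && !DICT.contains(text.substring(i, end))) {
                end--;
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "非洲星球";
        System.out.println("per character:    " + perCharacter(text));
        System.out.println("dictionary based: " + dictionaryBased(text));
    }
}
```

With per-character tokens, a query matches any document that shares a single character with it; with word-level tokens, a document has to contain a whole word from the query, which usually gives far fewer irrelevant hits.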

Now let's run an analyzed search from Java code.

First, define a static constant:

private static final String ANALYZER="smartcn";

/**
 * Analyzed query on a single field
 * @throws Exception
 */
@Test
public void search() throws Exception{
	// Search the film2 index, type dongzuo
	SearchRequestBuilder srb=client.prepareSearch("film2").setTypes("dongzuo");
	// Analyze the query text with smartcn before matching it against title
	SearchResponse sr=srb.setQuery(QueryBuilders.matchQuery("title", "星球狼").analyzer(ANALYZER))
		// Return only the title and price fields of each hit
		.setFetchSource(new String[]{"title","price"}, null)
		.execute()
		.actionGet();
	SearchHits hits=sr.getHits();
	for(SearchHit hit:hits){
		System.out.println(hit.getSourceAsString());
	}
}

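With the Chinese analyzer specified, the query keywords are segmented first and the resulting tokens are then searched. If no analyzer is specified on the query, elasticsearch falls back to the analyzer mapped for the field (smartcn for title here) and uses the standard analyzer only when the field has none mapped.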
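For readers who want to try the same query over REST, the request body of a match query with an explicit analyzer can be sketched as below. This is only an illustration in plain Java string building: the class name MatchQueryBody and its helpers are hypothetical, the body covers just the query part (not the source filtering on title and price), and passing null for the analyzer simply omits the parameter.

```java
public class MatchQueryBody {

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Builds a search body with a single match query.
     * Pass null as analyzer to leave the parameter out of the request.
     */
    static String matchBody(String field, String text, String analyzer) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"query\":{\"match\":{").append(quote(field)).append(":{");
        sb.append("\"query\":").append(quote(text));
        if (analyzer != null) {
            sb.append(",\"analyzer\":").append(quote(analyzer));
        }
        sb.append("}}}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same field, text and analyzer as the search() test in this post.
        System.out.println(matchBody("title", "星球狼", "smartcn"));
        // Without an explicit analyzer.
        System.out.println(matchBody("title", "星球狼", null));
    }
}
```

Either body could be posted to the film2/dongzuo/_search endpoint of the node used in this post to compare the hits.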

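Next, a word on multi-field queries. A Baidu search, for example, looks not only at the title but also at the content, so two fields are involved.

For this we use multiMatchQuery. Here is the Java code: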

/**
 * Analyzed query across multiple fields
 * @throws Exception
 */
@Test
public void search2() throws Exception{
	// Search the film2 index, type dongzuo
	SearchRequestBuilder srb=client.prepareSearch("film2").setTypes("dongzuo");
	// Analyze the query text with smartcn, then match it against both title and content
	SearchResponse sr=srb.setQuery(QueryBuilders.multiMatchQuery("非洲星球", "title","content").analyzer(ANALYZER))
		// Return only the title and price fields of each hit
		.setFetchSource(new String[]{"title","price"}, null)
		.execute()
		.actionGet();
	SearchHits hits=sr.getHits();
	for(SearchHit hit:hits){
		System.out.println(hit.getSourceAsString());
	}
}
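The multi-field query above corresponds to a multi_match query in the REST query DSL. The sketch below builds such a request body in plain Java; the class name MultiMatchQueryBody and its helpers are hypothetical, and the body covers only the query part, not the source filtering.

```java
public class MultiMatchQueryBody {

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a search body with one multi_match query over the given fields. */
    static String multiMatchBody(String text, String analyzer, String... fields) {
        StringBuilder sb = new StringBuilder("{\"query\":{\"multi_match\":{");
        sb.append("\"query\":").append(quote(text)).append(",\"fields\":[");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(fields[i]));
        }
        sb.append("],\"analyzer\":").append(quote(analyzer)).append("}}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same text, analyzer and fields as the search2() test in this post.
        System.out.println(multiMatchBody("非洲星球", "smartcn", "title", "content"));
    }
}
```

The fields array plays the role of the trailing field-name arguments of multiMatchQuery in the Java client.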

Keywords: elasticsearch, smartcn, Chinese word-segmentation query, multi-field query
