Elasticsearch Series - Integrating ES with SpringBoot: Phrase Matching with match_phrase
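Contents
- 1. What is the Elasticsearch match_phrase query, and how does it differ from the match query?
- 2. What is the syntax of the Elasticsearch match_phrase query?
- 3. What parameters does the Elasticsearch match_phrase query accept?
- 4. Elasticsearch match_phrase query in practice
- 5. Implementing the match_phrase query with SpringBoot and ES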
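1. What is the Elasticsearch match_phrase query, and how does it differ from the match query?
The match_phrase query matches phrases: it is used to find an exact sequence of several words. The query string is analyzed into terms, and a document matches only if it contains all of those terms, adjacent to one another and in the same order as in the query string.
The match query, by contrast, only needs to match one or more of the query terms (with its default OR operator) and ignores their order. For example, for the query "quick brown fox", a match query matches any document containing "quick", "brown", or "fox", in any order. A match_phrase query matches only documents that contain the complete phrase "quick brown fox".
So match_phrase suits cases that need exact phrase matching, while match suits looser, term-level matching.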
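The difference can be illustrated with a small, self-contained Java sketch. It is a simplified model and not how Elasticsearch or Lucene is implemented: the class name MatchVsPhraseSketch and its methods are hypothetical, and the analyzer is replaced by lowercasing and splitting on whitespace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of match vs. match_phrase (slop 0) on whitespace tokens.
// Illustration only; the real queries run against an inverted index.
class MatchVsPhraseSketch {

    // Stand-in for an analyzer: lowercase, then split on whitespace.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // match with the default OR operator: any single query term is enough.
    static boolean match(String doc, String query) {
        List<String> docTokens = tokenize(doc);
        for (String term : tokenize(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // match_phrase with slop 0: every term, adjacent and in query order.
    static boolean matchPhrase(String doc, String query) {
        return Collections.indexOfSubList(tokenize(doc), tokenize(query)) >= 0;
    }

    public static void main(String[] args) {
        String query = "quick brown fox";
        System.out.println(match("the fox is quick", query));              // true
        System.out.println(matchPhrase("the fox is quick", query));        // false
        System.out.println(matchPhrase("a quick brown fox jumps", query)); // true
    }
}
```

The first document contains two of the three terms, which is enough for match but not for match_phrase, because the terms are neither complete nor in order.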
2. What is the syntax of the Elasticsearch match_phrase query?
{
  "query": {
    "match_phrase": {
      "field_name": "query_string"
    }
  }
}
Here field_name is the field to search and query_string is the phrase to match.
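3. What parameters does the Elasticsearch match_phrase query accept?
The match_phrase query accepts the following parameters:
query: the phrase to match.
slop: the maximum number of positions by which the terms may be moved apart and still count as a match. The default is 0, meaning the terms must be adjacent and in the given order.
analyzer: the analyzer used to analyze the query string.
boost: a weight for the query, used to adjust its contribution to the relevance score.
zero_terms_query: what to return when the analyzer removes every token from the query string: none (the default) or all. Note that minimum_should_match is a parameter of the match query and is not accepted by match_phrase, which always requires every term.
For example, the following match_phrase query sets slop:
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "Elasticsearch 中文",
        "slop": 2
      }
    }
  }
}
This query matches documents whose title contains the terms of "Elasticsearch 中文", as long as the terms are no more than 2 positions further apart than they are in the query.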
4. Elasticsearch match_phrase query in practice
① Create the test data:
PUT /my_index
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "content": { "type": "text" }
    }
  }
}

PUT /my_index/_doc/1
{
  "title": "文雅酒店",
  "content": "Beijing City"
}

PUT /my_index/_doc/2
{
  "title": "文雅精品酒店",
  "content": "Huaibei City"
}

PUT /my_index/_doc/3
{
  "title": "文雅超级精品酒店",
  "content": "Qingdao City"
}
② Suppose we search the hotel titles for "文雅酒店" and want "文雅" to be immediately followed by "酒店". Use an exact phrase query:
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": "文雅酒店"
    }
  }
}
Only the document whose title field contains "文雅酒店" is returned:
{
  "took" : 76,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6184612,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6184612,
        "_source" : {
          "title" : "文雅酒店",
          "content" : "Beijing City"
        }
      }
    ]
  }
}
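③ To also return the document whose title contains "文雅精品酒店", set the slop parameter of the match_phrase query. It adjusts the allowed distance between the matched terms and defaults to 0, meaning the terms must be adjacent and in the given order. The mapping above sets no analyzer, so the default standard analyzer applies and indexes each Chinese character as a separate token; "精品" therefore puts 2 positions between "文雅" and "酒店". The following query matches documents in which "文雅" and "酒店" are no more than 2 positions apart:
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "文雅酒店",
        "slop": 2
      }
    }
  }
}
The response now contains two documents:
{
  "took" : 25,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.6184612,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6184612,
        "_source" : {
          "title" : "文雅酒店",
          "content" : "Beijing City"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.25545135,
        "_source" : {
          "title" : "文雅精品酒店",
          "content" : "Huaibei City"
        }
      }
    ]
  }
}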
④ To also return the document whose title contains "文雅超级精品酒店", raise slop to 4:
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "文雅酒店",
        "slop": 4
      }
    }
  }
}
All three documents are returned, with the score dropping as the terms get further apart:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.6184612,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6184612,
        "_source" : {
          "title" : "文雅酒店",
          "content" : "Beijing City"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.25545135,
        "_source" : {
          "title" : "文雅精品酒店",
          "content" : "Huaibei City"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.13824427,
        "_source" : {
          "title" : "文雅超级精品酒店",
          "content" : "Qingdao City"
        }
      }
    ]
  }
}
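The slop values 0, 2, and 4 used in this section can be checked with a small Java sketch. It rests on two assumptions: each Chinese character is one token (as with the default standard analyzer), and the required slop is the spread between the largest and smallest shift of a query term relative to its position in the query. This is a simplified model of sloppy phrase matching, not Lucene's actual implementation, and the class name SlopSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the slop a phrase query needs to match a document.
class SlopSketch {

    // Assumption: one token per character, as for CJK text under the standard analyzer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // Smallest slop at which the query matches in this model, or -1 if a term is missing.
    static int requiredSlop(String doc, String query) {
        List<String> q = tokenize(query);
        if (q.isEmpty()) {
            return -1;
        }
        return search(tokenize(doc), q, 0, Integer.MAX_VALUE, Integer.MIN_VALUE);
    }

    // Try every document position for query term i, tracking the min and max
    // of (document position - query position) over the terms chosen so far.
    static int search(List<String> d, List<String> q, int i, int min, int max) {
        if (i == q.size()) {
            return max - min;
        }
        int best = -1;
        for (int p = 0; p < d.size(); p++) {
            if (!d.get(p).equals(q.get(i))) {
                continue;
            }
            int shift = p - i;
            int r = search(d, q, i + 1, Math.min(min, shift), Math.max(max, shift));
            if (r >= 0 && (best < 0 || r < best)) {
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop("文雅酒店", "文雅酒店"));         // 0
        System.out.println(requiredSlop("文雅精品酒店", "文雅酒店"));     // 2
        System.out.println(requiredSlop("文雅超级精品酒店", "文雅酒店")); // 4
    }
}
```

In this model a document is returned when its required slop is less than or equal to the slop given in the query, which agrees with the three result sets above.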
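5. Implementing the match_phrase query with SpringBoot and ES
The Java code in this section sends the equivalent of the following query:
GET /my_index/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "文雅酒店",
        "slop": 4
      }
    }
  }
}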
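import java.io.IOException;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MatchPhraseQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void searchUser() throws IOException {
        // match_phrase on "title" with slop 4, as in the DSL query above
        MatchPhraseQueryBuilder matchPhraseQueryBuilder = QueryBuilders.matchPhraseQuery("title", "文雅酒店");
        matchPhraseQueryBuilder.slop(4);

        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(matchPhraseQueryBuilder);

        // Run the search against my_index and print the raw response
        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(searchResponse);
    }
}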