
1. References
Suggesters
Elasticsearch Suggester Explained
2. Basic introduction
2.1 Bing example

2.2 The suggest process

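3. Suggesters in ES
3.1 How it works
The input text is split into tokens. For each token, ES looks up similar terms in the index's term dictionary and returns them as suggestions.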
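The lookup described in 3.1 can be illustrated with a toy sketch. This is not how Elasticsearch implements it (a real index does not scan its dictionary term by term); `toy_dictionary`, `suggest_terms` and the `max_edits` parameter are hypothetical names, the frequencies are hand-written, and the normalised score `1 - distance / length` is an assumption that happens to give the same 0.8333333 reported for "lucenl" in the term suggester response in this article.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest_terms(text, dictionary, max_edits=2):
    """Split text into tokens and collect similar dictionary terms per token.

    dictionary maps term -> document frequency (a toy stand-in for the
    index's term dictionary, not an Elasticsearch structure).
    """
    results = {}
    for token in text.lower().split():
        options = []
        for term, freq in dictionary.items():
            distance = edit_distance(token, term)
            if 0 < distance <= max_edits:  # skip the token itself
                score = 1.0 - distance / max(len(token), len(term))
                options.append({"text": term, "score": score, "freq": freq})
        # Best score first, then the more frequent term
        options.sort(key=lambda o: (-o["score"], -o["freq"]))
        results[token] = options
    return results


# Hand-written toy dictionary (assumed values, for illustration only)
toy_dictionary = {"lucene": 2, "search": 2, "apache": 2, "solr": 1, "elasticsearch": 1}
result = suggest_terms("lucenl", toy_dictionary)
print(result)
```

Only "lucene" lies within two edits of "lucenl" in this toy dictionary, so it is the single option returned.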
3.2 The four suggesters
(1) term suggester
(2) phrase suggester
(3) completion suggester
(4) context suggester
4. term suggester
(1) Create the index and write documents
# Create the index
PUT yztest/
{
"mappings": {
"properties": {
"message": {
"type": "text"
}
}
}
}
# Add document 1
POST yztest/_doc/1
{
"message": "The goal of Apache Lucene is to provide world class search capabilities"
}
# Add document 2
POST yztest/_doc/2
{
"message": "Lucene is the search core of both Apache Solr and Elasticsearch."
}
(2) Inspect the analyzed tokens
# Run the field's analyzer on the sample text
GET yztest/_analyze
{
"field": "message",
"text": [
"The goal of Apache Lucene is to provide world class search capabilities",
"Lucene is the search core of both Apache Solr and Elasticsearch."
]
}
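(3) Results for different queries

a) When the input word is misspelled, a list of correctly spelled candidates is returned
# Query
POST yztest/_search
{
"suggest": {
"suggest_message": { # custom name for this suggester
"text": "lucenl", # the string to check, i.e. what the user typed
"term": { # suggester type: term suggester
"field": "message", # field to fetch suggestions from
"suggest_mode": "missing" # suggestion mode; missing means nothing is suggested when the index already contains a term identical to the input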
}
}
}
}
# Response
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"suggest_message" : [
{
"text" : "lucenl",
"offset" : 0,
"length" : 6,
"options" : [ # options is an array holding the actual suggestions
{
"text" : "lucene",
"score" : 0.8333333,
"freq" : 2
}
]
}
]
}
}
b) When the input is a string made up of several words, each token gets its own entry in the response
# Query
POST yztest/_search
{
"suggest": {
"suggest_message": {
"text": "lucene search",
"term": {
"field": "message",
"suggest_mode": "always"
}
}
}
}
# Response
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"suggest_message" : [
{
"text" : "lucene",
"offset" : 0,
"length" : 6,
"options" : [ ]
},
{
"text" : "search",
"offset" : 7,
"length" : 6,
"options" : [ ]
}
]
}
}
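Both responses above share one shape: each suggester name maps to a list with one entry per input token, and every entry carries its own `options` array. A minimal sketch of reading that shape on the client side follows; `extract_suggestions` is a hypothetical helper, and the two dictionaries are the `suggest` parts of the responses above, trimmed by hand. The empty `options` for "lucene search" are presumably because both tokens exist in the index and no other indexed term is close enough to either of them.

```python
def extract_suggestions(response, name):
    """Return {input token: [suggested texts]} for the suggester called name."""
    suggestions = {}
    for entry in response.get("suggest", {}).get(name, []):
        suggestions[entry["text"]] = [option["text"] for option in entry["options"]]
    return suggestions


# "suggest" part of the response to the misspelled input, copied from above
response_a = {
    "suggest": {
        "suggest_message": [
            {"text": "lucenl", "offset": 0, "length": 6,
             "options": [{"text": "lucene", "score": 0.8333333, "freq": 2}]}
        ]
    }
}

# "suggest" part of the response to the multi-word input, copied from above
response_b = {
    "suggest": {
        "suggest_message": [
            {"text": "lucene", "offset": 0, "length": 6, "options": []},
            {"text": "search", "offset": 7, "length": 6, "options": []},
        ]
    }
}

print(extract_suggestions(response_a, "suggest_message"))
print(extract_suggestions(response_b, "suggest_message"))
```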
5. phrase suggester
# Phrase query
POST yztest/_search
{
"suggest": {
"YOUR_SUGGESTION": {
"text": "Solr and Elasticearc", # the string the user typed
"phrase": { # suggester type: phrase suggester
"field": "message", # field to match against
"highlight": { # optional: highlight the corrected tokens
"pre_tag": "<em>",
"post_tag": "</em>"
}
}
}
}
}
# Response
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"YOUR_SUGGESTION" : [
{
"text" : "Solr and Elasticearc",
"offset" : 0,
"length" : 20,
"options" : [
{
"text" : "solr and elasticsearch",
"highlighted" : "solr and <em>elasticsearch</em>", # highlighted part
"score" : 0.017689342
}
]
}
]
}
}

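6. completion suggester
Provides auto-completion.
6.1 Create a mapping that declares the suggest field
# Create the index (if yztest still exists from the term suggester section, remove it first with DELETE yztest, otherwise this request fails)
PUT yztest/
{
"mappings": {
"properties": {
"message": { # the field type decides whether the field can serve completion suggestions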
"type": "completion"
}
}
}
}
6.2 Queries
(1) Index documents
POST yztest/_doc/1
{
"message": "The goal of Apache Lucene is to provide world class search capabilities"
}
POST yztest/_doc/2
{
"message": "Lucene is the search core of both Apache Solr and Elasticsearch."
}
POST yztest/_doc/3
{
"message": "Lucene is the search core of Elasticsearch."
}
POST yztest/_doc/4
{
"message": "Lucene is the search core of Apache Solr."
}
(2) Prefix query
# Query
POST yztest/_search
{
"suggest": {
"message_suggest": { # custom name for this suggester
"prefix": "lucene is the", # prefix string, i.e. what the user has typed so far
"completion": { # suggester type: completion suggester
"field": "message" # field to match against
}
}
}
}
# Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"suggest" : {
"message_suggest" : [
{
"text" : "lucene is the",
"offset" : 0,
"length" : 13,
"options" : [
{
"text" : "Lucene is the search core of Apache Solr.",
"_index" : "yztest",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"message" : "Lucene is the search core of Apache Solr."
}
},
{
"text" : "Lucene is the search core of Elasticsearch.",
"_index" : "yztest",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"message" : "Lucene is the search core of Elasticsearch."
}
},
{
"text" : "Lucene is the search core of both Apache Solr and ",
"_index" : "yztest",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"message" : "Lucene is the search core of both Apache Solr and Elasticsearch."
}
}
]
}
]
}
}
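Note that the `text` of the last option above stops at "... Apache Solr and " while its `_source` holds the full sentence. The cut falls at exactly 50 characters, which matches the default of the completion field's `max_input_length` mapping parameter. The toy sketch below reproduces the prefix query under that assumption. It is not how Elasticsearch stores completions (the real suggester uses a finite-state transducer rather than a linear scan); `complete` and `docs` are hypothetical names, and breaking ties by plain string comparison is an assumption that happens to reproduce the order of the response above.

```python
MAX_INPUT_LENGTH = 50  # assumed: default max_input_length of a completion field

# The four documents indexed above, keyed by _id
docs = {
    "1": "The goal of Apache Lucene is to provide world class search capabilities",
    "2": "Lucene is the search core of both Apache Solr and Elasticsearch.",
    "3": "Lucene is the search core of Elasticsearch.",
    "4": "Lucene is the search core of Apache Solr.",
}


def complete(prefix, documents):
    """Return the stored inputs that start with prefix, ignoring case."""
    needle = prefix.lower()
    options = []
    for doc_id, message in documents.items():
        stored = message[:MAX_INPUT_LENGTH]  # inputs are cut at indexing time
        if stored.lower().startswith(needle):
            options.append({"text": stored, "_id": doc_id})
    # All matches tie on score here, so order them by text
    options.sort(key=lambda option: option["text"])
    return options


for option in complete("lucene is the", docs):
    print(option["_id"], repr(option["text"]))
```

If longer inputs have to be completed in full, `max_input_length` can be raised in the field mapping.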
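(3) skip_duplicates
Filters out duplicate suggestions, i.e. options with the same text that come from different documents.
# Set skip_duplicates in the query; the default is false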
POST yztest/_search
{
"suggest": {
"message_suggest": {
"prefix": "lucene is the",
"completion": {
"field": "message",
"skip_duplicates": true
}
}
}
}
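What counts as a duplicate here is the suggestion text, not the document. A minimal sketch of that filtering is shown below; the `skip_duplicates` function and the document with `_id` 5 are hypothetical and only illustrate the idea. The four documents indexed above all have different texts, so for them the option would presumably leave the result unchanged.

```python
def skip_duplicates(options):
    """Keep only the first option for each distinct suggestion text."""
    seen = set()
    unique = []
    for option in options:
        if option["text"] not in seen:
            seen.add(option["text"])
            unique.append(option)
    return unique


options = [
    {"text": "Lucene is the search core of Elasticsearch.", "_id": "3"},
    # Hypothetical second document with the same text
    {"text": "Lucene is the search core of Elasticsearch.", "_id": "5"},
    {"text": "Lucene is the search core of Apache Solr.", "_id": "4"},
]

print([option["_id"] for option in skip_duplicates(options)])
```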
(4) fuzzy query
# Set the fuzzy option in the query so that the prefix does not have to match exactly
POST yztest/_search
{
"suggest": {
"message_suggest": {
"prefix": "lucen is the",
"completion": {
"field": "message",
"fuzzy": {
"fuzziness": 2
}
}
}
}
}
(5) regex query (regular-expression matching)
# Regex match
POST yztest/_search
{
"suggest": {
"message_suggest": {
"regex": ".*solr.*", # regular expression
"completion": {
"field": "message"
}
}
}
}
7. context suggester
8. How to implement it?
