A query against Elasticsearch (ES) failed with the following error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cmdb.ci_area] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query",
"grouped" : true,
"failed_shards" : [
{
"shard" : 0,
"index" : "alarm_anaysis_2024",
"node" : "k4ujQE7LSOyuWU_qUkeUqQ",
"reason" : {
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cmdb.ci_area] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
],
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cmdb.ci_area] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cmdb.ci_area] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
},
"status" : 400
}
The following explanation was given by ERNIE Bot (文心一言):
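In Elasticsearch, text fields are not optimised for operations that need per-document field data, such as aggregations and sorting, so these operations are disabled on text fields by default. Text fields are designed for full-text search: at index time their values are analyzed (tokenized) and stored in an inverted index, which is very efficient for searching but is not suited to aggregating or sorting directly.
For fields that need to be aggregated or sorted, Elasticsearch recommends keyword fields. Keyword values are not analyzed; each value is indexed as the original string, which makes keyword fields well suited to aggregations and sorting.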
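As an illustration of the keyword recommendation, a mapping can declare a field as text for full-text search and add a keyword sub-field for aggregations and sorting. The sketch below assumes Elasticsearch 7.x or later (typeless mappings). The index name example_index is hypothetical, and the sub-field layout mirrors what dynamic mapping produces by default for string values.

```
PUT example_index
{
  "mappings": {
    "properties": {
      "cmdb": {
        "properties": {
          "ci_area": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
```

With a mapping like this, cmdb.ci_area is searched as analyzed text, while cmdb.ci_area.keyword holds the unanalyzed value for terms aggregations and sorting.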
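If you really do need to aggregate or sort on a text field and cannot change the field type to keyword, you can set fielddata=true to let Elasticsearch load field data for that field. There are several important caveats, however:
Memory usage: loading fielddata significantly increases memory use, because Elasticsearch has to build an in-memory structure (on the JVM heap) holding the field's values for every document. With a large data set this can cause serious memory pressure and may even make the cluster unstable.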
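Performance impact: even with enough memory, queries can be noticeably slower, because fielddata is built by uninverting the inverted index at query time, and the first aggregation or sort that needs it pays that loading cost.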
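Loading behavior: fielddata is built lazily at query time, segment by segment, and is then held in the fielddata cache. It is not stored on disk the way a keyword field's doc values are, so newly written segments have to be loaded again on first use.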
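So before setting fielddata=true, weigh these factors carefully and make sure the cluster has enough memory for the extra load. Where possible, it is better to change the field type to keyword, or to redesign the data model and queries so that no aggregation or sort runs on a text field.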
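For reference, enabling fielddata is done through an update to the existing mapping. The request below is a minimal sketch that assumes Elasticsearch 7.x or later, that cmdb.ci_area is an ordinary text field using the default analyzer, and that the index is the alarm_anaysis_2024 named in the error. It is shown only to make the option concrete, not as the recommended fix.

```
PUT alarm_anaysis_2024/_mapping
{
  "properties": {
    "cmdb": {
      "properties": {
        "ci_area": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }
}
```

Note that a terms aggregation on a text field with fielddata enabled buckets by the analyzed tokens rather than by the whole original value, which is another reason to prefer a keyword field here.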
Here is the DSL query that triggered the error:
GET alarm_anaysis_*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "cmdb.ci_area": [
              "小米",
              "华为",
              "苹果",
              "oppo"
            ]
          }
        }
      ],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "time_zone": "Asia/Shanghai",
              "format": "yyyy-MM-dd HH:mm:ss",
              "gte": "now-2d",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "series": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "10m",
        "time_zone": "Asia/Shanghai",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "aggs": {
        "ci_area": {
          "terms": {
            "field": "cmdb.ci_area",
            "size": 10
          }
        }
      }
    }
  }
}
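The fix is to append .keyword to the cmdb.ci_area field name, in both the terms query and the terms aggregation. This relies on the field having a keyword sub-field, which dynamic mapping creates by default for string values: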
GET alarm_anaysis_*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "cmdb.ci_area.keyword": [
              "小米",
              "华为",
              "苹果",
              "oppo"
            ]
          }
        }
      ],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "time_zone": "Asia/Shanghai",
              "format": "yyyy-MM-dd HH:mm:ss",
              "gte": "now-2d",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "series": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "10m",
        "time_zone": "Asia/Shanghai",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "aggs": {
        "ci_area": {
          "terms": {
            "field": "cmdb.ci_area.keyword",
            "size": 10
          }
        }
      }
    }
  }
}
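If the query still fails or returns no buckets after this change, it is worth confirming that the sub-field actually exists in every matching index. One way to check, sketched here with the get field mapping API against the same index pattern, is:

```
GET alarm_anaysis_*/_mapping/field/cmdb.ci_area
```

For each index, the response shows the mapping of cmdb.ci_area. A "fields" entry containing a "keyword" sub-field of type keyword means cmdb.ci_area.keyword can be used for aggregations and sorting. If it is missing, the mapping would need a keyword sub-field added and the existing data reindexed before the query above can work.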