search(12) - elastic4s - Aggregation = Buckets + Metrics

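This post introduces the aggregation feature of Elasticsearch (ES). Aggregations are the main tool for turning indexed data into readable, useful information, typically for visualization. An aggregation is built from two parts: buckets and metrics.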

A bucket is the equivalent of GROUP BY in SQL, for example:

GET /cartxns/_search
{
  "size" : 2,
  "aggs": {
    "color": {
      "terms": {"field": "color.keyword"}
    }
  }
}

...

  "aggregations": {
    "color": {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets": [
        {
          "key" : "red",
          "doc_count" : 4},
        {
          "key" : "blue",
          "doc_count" : 2},
        {
          "key" : "green",
          "doc_count" : 2}
      ]
    }
  }

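In the example above, the documents are bucketed by color.keyword. In elastic4s the same kind of query is expressed as follows (here the terms bucket is additionally restricted to the keys red and green, and three hits are returned):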

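// Assumes the elastic4s DSL import and the `client` set up earlier in this series.
val aggTerms = search("cartxns").aggregations(
  // terms bucket on color.keyword, restricted to the "red" and "green" keys
  termsAgg("colors", "color.keyword").includeExactValues("red", "green")
).sourceInclude("color", "make").size(3)
println(aggTerms.show) // print the generated request

val termsResult = client.execute(aggTerms).await
// the returned hits, limited to the color and make fields
termsResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
// one line per bucket: key,docCount
termsResult.result.aggregations.terms("colors").buckets.foreach(b => println(s"${b.key},${b.docCount}"))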

Output:

POST:/cartxns/_search?StringEntity({"size":3,"_source":{"includes":["color","make"]},"aggs":{"colors":{"terms":{"field":"color.keyword","include":["red","green"]}}}},Some(application/json))
Map(color -> red, make -> honda)
Map(color -> red, make -> honda)
Map(color -> green, make -> ford)
red,4
green,2
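
To make the GROUP BY analogy concrete, here is a minimal sketch that uses plain Scala collections instead of elastic4s. The names `BucketSketch` and `bucketCounts` are hypothetical. The color list simply mirrors the doc_count values of the first response in this post, and the sort (count descending, then key) is chosen only to reproduce the bucket order shown there.

```scala
object BucketSketch {
  // One entry per document; counts taken from the doc_count values in the response (assumed data).
  val colors: Seq[String] =
    Seq.fill(4)("red") ++ Seq.fill(2)("blue") ++ Seq.fill(2)("green")

  // Roughly: SELECT color, COUNT(*) FROM cartxns GROUP BY color
  def bucketCounts(values: Seq[String]): Seq[(String, Int)] =
    values
      .groupBy(identity)                              // one group (bucket) per distinct key
      .map { case (key, docs) => key -> docs.size }   // doc_count of each bucket
      .toSeq
      .sortBy { case (key, count) => (-count, key) }  // largest bucket first, ties by key

  def main(args: Array[String]): Unit =
    bucketCounts(colors).foreach { case (key, count) => println(s"$key,$count") }
}
```

Each group produced by `groupBy` plays the role of one bucket, and its size corresponds to doc_count.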

The avg_price below is a simple metric:

POST /cartxns/_search
{
  "aggs":{
    "colors":{
      "terms":{"field":"color.keyword"},
      "aggs":{
        "avg_price":{
          "avg":{"field":"price"}
        }
      }
    }
  }
}

...

  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets": [
        {
          "key" : "red",
          "doc_count" : 4,
          "avg_price": {
            "value" : 32500.0}
        },
        {
          "key" : "blue",
          "doc_count" : 2,
          "avg_price": {
            "value" : 20000.0}
        },
        {
          "key" : "green",
          "doc_count" : 2,
          "avg_price": {
            "value" : 21000.0}
        }
      ]
    }
  }

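terms defines the buckets. Nesting an avg aggregation under terms (through aggs) computes avg_price, the average price of the documents that fall into each bucket. In elastic4s this is expressed as follows: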

val aggTermsAvg = search("cartxns").aggregations(
  termsAgg("colors", "color.keyword").subAggregations(
    avgAgg("avg_price", "price")
  )
).sourceInclude("color", "make").size(3)
println(aggTermsAvg.show)

val avgResult = client.execute(aggTermsAvg).await
avgResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgResult.result.aggregations.terms("colors").buckets
  .foreach(b => println(s"${b.key},${b.docCount},${b.avg("avg_price").value}"))

...

POST:/cartxns/_search?StringEntity({"size":3,"_source":{"includes":["color","make"]},"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}},Some(application/json))
Map(color -> red, make -> honda)
Map(color -> red, make -> honda)
Map(color -> green, make -> ford)
red,4,32500.0
blue,2,20000.0
green,2,21000.0
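
The bucket-plus-metric idea can also be sketched without Elasticsearch: group the documents, then compute one number per group. This is only an illustration in plain Scala. `TermsAvgSketch`, `Txn`, `Bucket` and `termsWithAvg` are hypothetical names, and the prices are assumed values chosen to be consistent with the averages in the response above, not the indexed documents themselves.

```scala
object TermsAvgSketch {
  final case class Txn(color: String, price: Double)
  final case class Bucket(key: String, docCount: Int, avgPrice: Double)

  // Assumed sample data, consistent with the avg_price values shown above.
  val txns: Seq[Txn] = Seq(
    Txn("red", 10000), Txn("red", 20000), Txn("red", 20000), Txn("red", 80000),
    Txn("blue", 25000), Txn("blue", 15000),
    Txn("green", 30000), Txn("green", 12000)
  )

  // terms bucket on color, with an avg metric on price inside each bucket
  def termsWithAvg(docs: Seq[Txn]): Seq[Bucket] =
    docs
      .groupBy(_.color)
      .map { case (color, group) =>
        Bucket(color, group.size, group.map(_.price).sum / group.size)
      }
      .toSeq
      .sortBy(b => (-b.docCount, b.key)) // same order as the buckets printed above

  def main(args: Array[String]): Unit =
    termsWithAvg(txns).foreach(b => println(s"${b.key},${b.docCount},${b.avgPrice}"))
}
```

With this sample data, `main` prints the same three bucket lines as the elastic4s output above.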

Next, we can nest another bucket inside a bucket:

POST /cartxns/_search
{
  "aggs":{
    "colors":{
      "terms":{"field":"color.keyword"},
      "aggs":{
        "avg_price":{"avg":{"field":"price"}},
        "makes":{"terms":{"field":"make.keyword"}}
      }
    }
  }
}

...

  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets": [
        {
          "key" : "red",
          "doc_count" : 4,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "honda",
                "doc_count" : 3},
              {
                "key" : "bmw",
                "doc_count" : 1}
            ]
          },
          "avg_price": {
            "value" : 32500.0}
        },
        {
          "key" : "blue",
          "doc_count" : 2,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "ford",
                "doc_count" : 1},
              {
                "key" : "toyota",
                "doc_count" : 1}
            ]
          },
          "avg_price": {
            "value" : 20000.0}
        },
        {
          "key" : "green",
          "doc_count" : 2,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "ford",
                "doc_count" : 1},
              {
                "key" : "toyota",
                "doc_count" : 1}
            ]
          },
          "avg_price": {
            "value" : 21000.0}
        }
      ]
    }
  }

The elastic4s version:

val aggTAvgT = search("cartxns").aggregations(
  termsAgg("colors", "color.keyword").subAggregations(
    avgAgg("avg_price", "price"),
    termsAgg("makes", "make.keyword")
  )
).size(3)
println(aggTAvgT.show)

val avgTTResult = client.execute(aggTAvgT).await
avgTTResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgTTResult.result.aggregations.terms("colors").buckets.foreach { cb =>
  println(s"${cb.key},${cb.docCount},${cb.avg("avg_price").value}")
  cb.terms("makes").buckets.foreach(mb => println(s"${mb.key},${mb.docCount}"))
}

...

POST:/cartxns/_search?StringEntity({"size":3,"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}},"makes":{"terms":{"field":"make.keyword"}}}}}},Some(application/json))
Map(price -> 10000, color -> red, make -> honda, sold -> 2014-10-28)
Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
Map(price -> 30000, color -> green, make -> ford, sold -> 2014-05-18)
red,4,32500.0
honda,3
bmw,1
blue,2,20000.0
ford,1
toyota,1
green,2,21000.0
ford,1
toyota,1
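
How subAggregations ends up as a nested "aggs" object in the request can be sketched with a tiny hand-written renderer. This is not how elastic4s builds its requests internally. `AggRenderSketch`, `Agg`, `Metric`, `Terms`, `render` and `renderAll` are hypothetical names used only for illustration, and the sketch does no JSON string escaping.

```scala
object AggRenderSketch {
  // Hypothetical mini-model of an aggregation tree (not elastic4s types).
  sealed trait Agg
  final case class Metric(name: String, kind: String, field: String) extends Agg
  final case class Terms(name: String, field: String, subs: Seq[Agg] = Nil) extends Agg

  // Render one named aggregation as a `"name":{...}` JSON member.
  def render(agg: Agg): String = agg match {
    case Metric(name, kind, field) =>
      s""""$name":{"$kind":{"field":"$field"}}"""
    case Terms(name, field, subs) =>
      // Sub-aggregations go into an "aggs" object that sits next to "terms".
      val nested = if (subs.isEmpty) "" else ",\"aggs\":" + renderAll(subs)
      s""""$name":{"terms":{"field":"$field"}$nested}"""
  }

  // Render sibling aggregations as one JSON object.
  def renderAll(aggs: Seq[Agg]): String = aggs.map(render).mkString("{", ",", "}")

  // Same shape as the colors / avg_price / makes query above.
  val colors: Agg = Terms("colors", "color.keyword", Seq(
    Metric("avg_price", "avg", "price"),
    Terms("makes", "make.keyword")
  ))

  def main(args: Array[String]): Unit = println(renderAll(Seq(colors)))
}
```

The rendered string corresponds to the value of the top-level "aggs" key in the request printed above: every level of subAggregations adds one more "aggs" object inside its parent bucket.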

Finally, we add two metrics, min and max, to the innermost bucket:

POST /cartxns/_search
{
  "size":3,
  "aggs":{
    "colors":{
      "terms":{"field":"color.keyword"},
      "aggs":{
        "avg_price":{"avg":{"field":"price"}},
        "makes":{
          "terms":{"field":"make.keyword"},
          "aggs":{
            "max_price":{"max":{"field":"price"}},
            "min_price":{"min":{"field":"price"}}
          }
        }
      }
    }
  }
}

...

  "aggregations": {
    "colors": {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets": [
        {
          "key" : "red",
          "doc_count" : 4,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "honda",
                "doc_count" : 3,
                "max_price": {
                  "value" : 20000.0},
                "min_price": {
                  "value" : 10000.0}
              },
              {
                "key" : "bmw",
                "doc_count" : 1,
                "max_price": {
                  "value" : 80000.0},
                "min_price": {
                  "value" : 80000.0}
              }
            ]
          },
          "avg_price": {
            "value" : 32500.0}
        },
        {
          "key" : "blue",
          "doc_count" : 2,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "ford",
                "doc_count" : 1,
                "max_price": {
                  "value" : 25000.0},
                "min_price": {
                  "value" : 25000.0}
              },
              {
                "key" : "toyota",
                "doc_count" : 1,
                "max_price": {
                  "value" : 15000.0},
                "min_price": {
                  "value" : 15000.0}
              }
            ]
          },
          "avg_price": {
            "value" : 20000.0}
        },
        {
          "key" : "green",
          "doc_count" : 2,
          "makes": {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets": [
              {
                "key" : "ford",
                "doc_count" : 1,
                "max_price": {
                  "value" : 30000.0},
                "min_price": {
                  "value" : 30000.0}
              },
              {
                "key" : "toyota",
                "doc_count" : 1,
                "max_price": {
                  "value" : 12000.0},
                "min_price": {
                  "value" : 12000.0}
              }
            ]
          },
          "avg_price": {
            "value" : 21000.0}
        }
      ]
    }
  }

The elastic4s version:

val aggTAvgTMM = search("cartxns").aggregations(
  termsAgg("colors", "color.keyword").subAggregations(
    avgAgg("avg_price", "price"),
    termsAgg("makes", "make.keyword").subAggregations(
      maxAgg("max_price", "price"),
      minAgg("min_price", "price")
    )
  )
).size(3)
println(aggTAvgTMM.show)

val avgTTMMResult = client.execute(aggTAvgTMM).await
avgTTMMResult.result.hits.hits.foreach(m => println(m.sourceAsMap))
avgTTMMResult.result.aggregations.terms("colors").buckets.foreach { cb =>
  println(s"${cb.key},${cb.docCount},${cb.avg("avg_price").value}")
  cb.terms("makes").buckets.foreach { mb =>
    // Note: avg(...) is used here only to read the "value" field of the min and max metrics
    println(s"${mb.key},${mb.docCount},${mb.avg("min_price").value},${mb.avg("max_price").value}")
  }
}

...

POST:/cartxns/_search?StringEntity({"size":3,"aggs":{"colors":{"terms":{"field":"color.keyword"},"aggs":{"avg_price":{"avg":{"field":"price"}},"makes":{"terms":{"field":"make.keyword"},"aggs":{"max_price":{"max":{"field":"price"}},"min_price":{"min":{"field":"price"}}}}}}}},Some(application/json))
Map(price -> 10000, color -> red, make -> honda, sold -> 2014-10-28)
Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
Map(price -> 30000, color -> green, make -> ford, sold -> 2014-05-18)
red,4,32500.0
honda,3,10000.0,20000.0
bmw,1,80000.0,80000.0
blue,2,20000.0
ford,1,25000.0,25000.0
toyota,1,15000.0,15000.0
green,2,21000.0
ford,1,30000.0,30000.0
toyota,1,12000.0,12000.0
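
The full nesting (color buckets, an avg metric per color, make buckets inside each color, and min/max metrics per make) can be mimicked with two levels of groupBy. This is a sketch in plain Scala, not elastic4s. All names in it (`NestedBucketsSketch`, `CarTxn`, `MakeBucket`, `ColorBucket`, `aggregate`) are hypothetical, and the data set is reconstructed from the responses in this post: the third red honda price (20000) is inferred from the 32500 average and is therefore an assumption.

```scala
object NestedBucketsSketch {
  final case class CarTxn(color: String, make: String, price: Double)
  final case class MakeBucket(key: String, docCount: Int, minPrice: Double, maxPrice: Double)
  final case class ColorBucket(key: String, docCount: Int, avgPrice: Double, makes: Seq[MakeBucket])

  // Assumed data, reconstructed from the aggregation responses shown in this post.
  val txns: Seq[CarTxn] = Seq(
    CarTxn("red", "honda", 10000), CarTxn("red", "honda", 20000),
    CarTxn("red", "honda", 20000), CarTxn("red", "bmw", 80000),
    CarTxn("blue", "ford", 25000), CarTxn("blue", "toyota", 15000),
    CarTxn("green", "ford", 30000), CarTxn("green", "toyota", 12000)
  )

  def aggregate(docs: Seq[CarTxn]): Seq[ColorBucket] =
    docs.groupBy(_.color).map { case (color, byColor) =>
      // inner bucket: group the documents of this color by make
      val makes = byColor.groupBy(_.make).map { case (make, byMake) =>
        val prices = byMake.map(_.price)
        MakeBucket(make, byMake.size, prices.min, prices.max) // min and max metrics
      }.toSeq.sortBy(m => (-m.docCount, m.key))
      // the avg metric belongs to the outer (color) bucket
      ColorBucket(color, byColor.size, byColor.map(_.price).sum / byColor.size, makes)
    }.toSeq.sortBy(c => (-c.docCount, c.key))

  def main(args: Array[String]): Unit =
    aggregate(txns).foreach { cb =>
      println(s"${cb.key},${cb.docCount},${cb.avgPrice}")
      cb.makes.foreach(mb => println(s"${mb.key},${mb.docCount},${mb.minPrice},${mb.maxPrice}"))
    }
}
```

The result has the same shape as the response: each metric is attached to the bucket level where it was declared, so avg_price sits on the color bucket while min_price and max_price sit on the make buckets.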
