Elasticsearch Getting-Started Tutorial - Learning the Elasticsearch Search Engine from Scratch
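1. Introduction to Elasticsearch
Elasticsearch (ES for short) is an open-source, distributed search and analytics engine built on Apache Lucene. It offers multi-tenant full-text search over an HTTP web interface and stores schema-free JSON documents.
Core features:
✅ Distributed: scales horizontally and can grow to hundreds of servers
✅ Near-real-time search: newly indexed documents become searchable within about one second by default
✅ RESTful API: all interaction goes through a simple RESTful API
✅ Full-text search: powerful full-text search capabilities
✅ Multi-language support: text analysis for many languages
✅ Extensibility: a plugin mechanism for adding functionality
✅ High availability: cluster deployment with automatic failover
Typical use cases:
Full-text search: site search, e-commerce search
Log analysis: the ELK (Elasticsearch + Logstash + Kibana) log-analysis stack
Data analysis: real-time analytics, business metrics
Monitoring: application performance monitoring, system monitoring
Recommendation: search-based recommendation features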
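Comparison with traditional databases:
| Feature | Elasticsearch | Traditional database |
|---|---|---|
| Data model | Documents (JSON) | Tables (rows and columns) |
| Query language | Query DSL | SQL |
| Full-text search | Built in | Requires extra configuration |
| Scalability | Horizontal | Mostly vertical |
| Freshness | Near real time | Real time |
| Transactions | Not supported | Supported |
Elastic ecosystem components:
- Elasticsearch: the core search engine
- Kibana: data visualization tool
- Logstash: data collection and processing pipeline
- Beats: lightweight data shippers
- Elastic Stack (ELK Stack): the components above combined into a complete log-analysis solution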
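2. Environment Setup
System requirements:
- Java: Elasticsearch 7.0 and later bundle their own OpenJDK, so a separate JDK install is normally unnecessary; if you point 8.x at your own JVM, it needs to be Java 17 or later
- Operating system: Linux, macOS, or Windows
- Memory: at least 2 GB RAM (8 GB or more recommended for production)
- Disk space: depends on data volume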
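Installing on Windows
Download Elasticsearch
- Official site: https://www.elastic.co/downloads/elasticsearch
- Download the Windows build (ZIP file)
Extract and run
# Extract to a directory of your choice
unzip elasticsearch-8.11.0-windows-x86_64.zip
# Change into the bin directory
cd elasticsearch-8.11.0/bin
# Start Elasticsearch
elasticsearch.bat
Verify the installation
- Open http://localhost:9200 in a browser
- Note: Elasticsearch 8.x enables TLS and authentication by default, so a fresh install answers on https://localhost:9200 and asks for the elastic user's password printed at first startup. The plain-HTTP URLs in this tutorial assume security has been turned off for local experiments (xpack.security.enabled: false in elasticsearch.yml).
- You should see a JSON response similar to the following:
{
  "name" : "DESKTOP-XXX",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "xxx",
  "version" : {
    "number" : "8.11.0",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "xxx",
    "build_date" : "2024-01-01T00:00:00.000Z",
    "build_snapshot" : false,
    "lucene_version" : "9.8.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  }
}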
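Installing on macOS
Install with Homebrew:
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install Elasticsearch
# (if Homebrew reports that no such formula exists, Elastic's own tap is an alternative,
#  although it may only carry 7.x builds: brew tap elastic/tap && brew install elastic/tap/elasticsearch-full)
brew install elasticsearch
# Start it as a service
brew services start elasticsearch
# Or start it manually
elasticsearch
Verify the installation:
curl http://localhost:9200
Installing on Linux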
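Using APT (Ubuntu/Debian):
# Import the Elasticsearch signing key (apt-key is deprecated, so the key goes into a keyring file)
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
sudo apt-get install apt-transport-https
# Add the Elasticsearch repository
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
# Install Elasticsearch
sudo apt-get update
sudo apt-get install elasticsearch
# Start the service and enable it at boot
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
Using YUM (CentOS/RHEL):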
# Add the Elasticsearch repository
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
sudo vim /etc/yum.repos.d/elasticsearch.repo
# Add the following content:
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
# Install Elasticsearch
sudo yum install elasticsearch
# Start the service and enable it at boot
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
2.3 Configuring Elasticsearch
Configuration file location:
- Windows/macOS (archive install): config/elasticsearch.yml
- Linux (DEB/RPM package install): /etc/elasticsearch/elasticsearch.yml
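Common settings:
# Cluster name
cluster.name: my-application
# Node name
node.name: node-1
# Data directory
path.data: /var/lib/elasticsearch
# Log directory
path.logs: /var/log/elasticsearch
# Network bind address (0.0.0.0 listens on every interface; only do this on a trusted network)
network.host: 0.0.0.0
# HTTP port
http.port: 9200
# Discovery setting (single node)
discovery.type: single-node
# Heap size (set in jvm.options, not in this file)
# -Xms2g
# -Xmx2g
Kibana is the visualization and management tool for Elasticsearch.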
Install on macOS:
brew install kibana
brew services start kibana
Open Kibana:
- In a browser, go to http://localhost:5601
Check cluster health:
curl http://localhost:9200/_cluster/health
Example response:
{
"cluster_name" : "elasticsearch",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
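}
Status meanings:
- green: all primary and replica shards are allocated
- yellow: all primary shards are allocated, but some replica shards are not
- red: some primary shards are unavailable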
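The three colors can be read as a simple decision rule over unassigned shards. The following is a minimal sketch under that reading; the class and method names are hypothetical and this is not how Elasticsearch itself computes the value.

```java
// Hypothetical helper: maps unassigned shard counts to the health colors described above.
final class ClusterHealthSketch {
    enum Status { GREEN, YELLOW, RED }

    /**
     * @param unassignedPrimaries primary shards that are not allocated to any node
     * @param unassignedReplicas  replica shards that are not allocated to any node
     */
    static Status classify(int unassignedPrimaries, int unassignedReplicas) {
        if (unassignedPrimaries > 0) {
            return Status.RED;      // some data cannot be served at all
        }
        if (unassignedReplicas > 0) {
            return Status.YELLOW;   // all data is served, but redundancy is reduced
        }
        return Status.GREEN;        // every primary and replica is allocated
    }

    public static void main(String[] args) {
        System.out.println(classify(0, 0));
        System.out.println(classify(0, 1));
        System.out.println(classify(1, 0));
    }
}
```

This also explains a common surprise: on a single-node cluster, an index with one replica shows yellow, because a replica is never placed on the same node as its primary.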
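3. Core Concepts
| Elasticsearch | Relational database | Notes |
|---|---|---|
| Index | Database | Logical grouping of data |
| Type | Table | Deprecated in ES 7.x and removed in 8.x; only _doc remains in API paths |
| Document | Row | A single data record |
| Field | Column | A data field |
| Mapping | Schema | Definition of the data structure |
| Shard | - | A partition of an index |
| Replica | - | A copy of a shard |
An index is a collection of documents. The table above follows the traditional analogy with a database; now that types are gone, an index behaves more like a single table.
Characteristics:
- An index contains many documents
- Index names must be lowercase
- Before 7.x an index could hold several types; from 7.x on, each index has the single type _doc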
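A document is the basic unit of data in Elasticsearch and is stored as JSON.
Document structure:
{
  "_index": "users",
  "_id": "1",
  "_source": {
    "name": "张三",
    "age": 25,
    "email": "zhangsan@example.com",
    "city": "北京"
  }
}
A mapping defines a document's structure and field types, much like a table schema in a database.
Field types:
- text: full-text search field (analyzed into individual terms)
- keyword: exact-match field (indexed as a single term)
- long, integer, short, byte: integer types
- double, float: floating-point types
- boolean: boolean type
- date: date type
- object: object type
- nested: nested-object type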
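To make the text/keyword distinction concrete, here is a toy inverted index in plain Java. It is only a sketch: the whitespace-and-lowercase "analyzer" is a stand-in assumed for illustration and is far simpler than Elasticsearch's real analyzers, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of how a value is indexed as "text" versus "keyword".
final class FieldTypeSketch {
    // "text": lowercase the value and split it on whitespace into several terms.
    static List<String> analyzeAsText(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // "keyword": the whole value is kept as one term, unchanged.
    static List<String> analyzeAsKeyword(String value) {
        return List.of(value);
    }

    // Builds term -> set of document ids (the position of each value in the list).
    static Map<String, Set<Integer>> index(List<String> values, boolean asText) {
        Map<String, Set<Integer>> inverted = new TreeMap<>();
        for (int docId = 0; docId < values.size(); docId++) {
            String value = values.get(docId);
            List<String> terms = asText ? analyzeAsText(value) : analyzeAsKeyword(value);
            for (String term : terms) {
                inverted.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return inverted;
    }
}
```

Indexed as text, the value "Quick Brown Fox" can be found by the single word "quick"; indexed as keyword, it can only be found by the exact string "Quick Brown Fox".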
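A shard is a horizontal partition of an index; each shard is a self-contained Lucene index.
Primary shard:
- The count is set when the index is created and cannot be changed in place afterwards (changing it means reindexing or using the split/shrink APIs)
- Default: 1 primary shard since 7.0 (5 in earlier versions)
Replica shard:
- A copy of a primary shard that provides high availability
- Default: 1 replica per primary
- The replica count can be changed at any time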
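Why the primary shard count is fixed becomes clearer with a sketch of document routing. Elasticsearch picks a shard from a hash of the routing value (the document ID by default) modulo a number derived from the primary shard count. The version below is deliberately simplified: it uses String.hashCode() as a stand-in for the Murmur3 hash Elasticsearch actually uses, and the class name is hypothetical.

```java
// Simplified routing sketch: shard = hash(routing) mod number_of_primary_shards.
final class ShardRoutingSketch {
    static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("numberOfPrimaryShards must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        // The same document ID lands on a different shard once the shard count changes,
        // so existing documents could no longer be found where lookups expect them.
        System.out.println(shardFor("1", 3));
        System.out.println(shardFor("1", 5));
    }
}
```

Because the shard number depends on the shard count, changing the count in place would silently misroute every existing document; replicas do not take part in this formula, which is why their count can be changed freely.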
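Node:
- A single running Elasticsearch instance
- Can store data and take part in the cluster's indexing and search work
Cluster:
- A group of one or more nodes
- Provides high availability and scalability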
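4. RESTful API Basics
Elasticsearch exposes a RESTful API; every operation is an HTTP request.
Basic format:
http://localhost:9200/<index>/_doc/<id>
HTTP methods:
- GET: retrieve
- POST: create/update
- PUT: create/update
- DELETE: delete
- HEAD: check existence
Basic curl syntax:
curl -X<METHOD> http://localhost:9200/<path> -H 'Content-Type: application/json' -d '<JSON>'
Common options:
- -X: HTTP method
- -d: request body
- -H: request header
- ?pretty: not a curl option but a query-string parameter that makes Elasticsearch pretty-print the JSON response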
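The same request shape can be expressed with the JDK's built-in HTTP client (Java 11+). This is a minimal sketch that only builds the request and does not send it; the base URL, index name, and class name are assumptions for illustration, and a default 8.x install would additionally need HTTPS and credentials.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the HTTP request that indexes a JSON document.
final class RestRequestSketch {
    static HttpRequest indexDocument(String baseUrl, String index, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc"))
                .header("Content-Type", "application/json")          // same role as curl -H
                .POST(HttpRequest.BodyPublishers.ofString(json))      // same role as curl -X POST -d
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = indexDocument(
                "http://localhost:9200", "users", "{\"name\":\"张三\",\"age\":25}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).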
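Examples:
# GET request (quote the URL so the shell does not interpret "?")
curl 'http://localhost:9200/_cluster/health?pretty'
# POST request
curl -X POST http://localhost:9200/users/_doc -H 'Content-Type: application/json' -d'
{
"name": "张三",
"age": 25
}'
Kibana's Dev Tools console offers a more convenient way to run API requests.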
Access:
- Open Kibana: http://localhost:5601
- Click "Dev Tools" in the left-hand menu
Example:
GET /_cluster/health
POST /users/_doc
{
"name": "张三",
"age": 25
}
5. Index Operations
Simple creation
# Create an index with the PUT method
PUT /users
Response:
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "users"
}
Creation with settings and mappings
PUT /users
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"my_analyzer": {
"type": "standard",
"stopwords": "_english_"
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "my_analyzer"
},
"age": {
"type": "integer"
},
"email": {
"type": "keyword"
},
"create_time": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
}
List all indices
GET /_cat/indices?v
Response:
health status index uuid pri rep docs.count store.size
green open users abc123 1 1 0 208b
View index details
GET /users
View index mappings
GET /users/_mapping
View index settings
GET /users/_settings
Update index settings
PUT /users/_settings
{
"number_of_replicas": 2
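}
Note: the number of primary shards cannot be changed after the index is created.
Delete an index
DELETE /users
Delete several indices at once:
DELETE /index1,index2,index3
# Delete every index (dangerous; 8.x rejects _all and wildcard deletes by default because action.destructive_requires_name is true)
DELETE /_all
An index alias can point to one or more indices.
Create an alias: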
POST /_aliases
{
"actions": [
{
"add": {
"index": "users",
"alias": "users_alias"
}
}
]
}
Use the alias:
GET /users_alias/_search
Switch an alias (zero downtime; the remove and add actions in one request are applied atomically):
POST /_aliases
{
"actions": [
{
"remove": {
"index": "users_v1",
"alias": "users"
}
},
{
"add": {
"index": "users_v2",
"alias": "users"
}
}
]
}
6. Document Operations
Create with an explicit ID
PUT /users/_doc/1
{
"name": "张三",
"age": 25,
"email": "zhangsan@example.com",
"city": "北京",
"create_time": "2024-01-15 10:00:00"
}
Response:
{
"_index" : "users",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
Create with an auto-generated ID
POST /users/_doc
{
"name": "李四",
"age": 30,
"email": "lisi@example.com",
"city": "上海"
}
The _id in the response is generated automatically.
Get by ID
GET /users/_doc/1
Response:
{
"_index" : "users",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "张三",
"age" : 25,
"email" : "zhangsan@example.com",
"city" : "北京",
"create_time" : "2024-01-15 10:00:00"
}
}
Fetch only _source
GET /users/_source/1
Check whether a document exists
HEAD /users/_doc/1
Response:
- 200: exists
- 404: does not exist
Full replacement
PUT /users/_doc/1
{
"name": "张三",
"age": 26,
"email": "zhangsan@example.com",
"city": "北京",
"create_time": "2024-01-15 10:00:00"
}
Partial update
POST /users/_update/1
{
"doc": {
"age": 26
}
}
Update with a script
POST /users/_update/1
{
"script": {
"source": "ctx._source.age += 1"
}
}
Update or insert (upsert)
POST /users/_update/1
{
"script": {
"source": "ctx._source.age += 1"
},
"upsert": {
"name": "张三",
"age": 25
}
}
Delete a document
DELETE /users/_doc/1
Response:
{
"_index" : "users",
"_id" : "1",
"_version" : 2,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
Bulk indexing
POST /_bulk
{ "index" : { "_index" : "users", "_id" : "1" } }
{ "name" : "张三", "age" : 25, "email" : "zhangsan@example.com" }
{ "index" : { "_index" : "users", "_id" : "2" } }
{ "name" : "李四", "age" : 30, "email" : "lisi@example.com" }
{ "create" : { "_index" : "users", "_id" : "3" } }
{ "name" : "王五", "age" : 28, "email" : "wangwu@example.com" }
{ "update" : { "_index" : "users", "_id" : "1" } }
{ "doc" : { "age" : 26 } }
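{ "delete" : { "_index" : "users", "_id" : "2" } }
Note: each action line is followed by a source line, except delete, which takes no source line. Every line is a separate JSON object, and the request body has to end with a newline.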
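The line-oriented layout described in the note can be assembled programmatically. The sketch below builds a bulk request body from action/source pairs using only the JDK; the Item class and method names are hypothetical, and the JSON strings are passed in ready-made rather than serialized from objects.

```java
import java.util.List;

// Assembles a newline-delimited bulk body: one action line, then an optional source line.
final class BulkBodySketch {
    static final class Item {
        final String actionLine;   // e.g. {"index":{"_index":"users","_id":"1"}}
        final String sourceLine;   // document or partial-update JSON; null for delete

        Item(String actionLine, String sourceLine) {
            this.actionLine = actionLine;
            this.sourceLine = sourceLine;
        }
    }

    static String toNdjson(List<Item> items) {
        StringBuilder body = new StringBuilder();
        for (Item item : items) {
            body.append(item.actionLine).append('\n');
            if (item.sourceLine != null) {
                body.append(item.sourceLine).append('\n');
            }
        }
        // Every line, including the last one, ends with '\n', as the bulk API requires.
        return body.toString();
    }
}
```

A delete item is created with a null source line, so it contributes exactly one line to the body while index, create, and update items contribute two.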
7. Query Operations
Match all documents
GET /users/_search
{
"query": {
"match_all": {}
}
}
Pagination
GET /users/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 10
}
Select returned fields
GET /users/_search
{
"query": {
"match_all": {}
},
"_source": ["name", "age"]
}
Sorting
GET /users/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
Full-text match
GET /users/_search
{
"query": {
"match": {
"name": "张三"
}
}
}
Multi-field match
GET /users/_search
{
"query": {
"multi_match": {
"query": "北京",
"fields": ["city", "address"]
}
}
}
Exact match (Term)
GET /users/_search
{
"query": {
"term": {
"email": {
"value": "zhangsan@example.com"
}
}
}
}
Note: a term query does not analyze the search term, so it suits keyword fields.
Exact match on multiple values (Terms)
GET /users/_search
{
"query": {
"terms": {
"age": [25, 30, 35]
}
}
}
Range query
GET /users/_search
{
"query": {
"range": {
"age": {
"gte": 25,
"lte": 35
}
}
}
}
Operators:
- gt: greater than
- gte: greater than or equal to
- lt: less than
- lte: less than or equal to
Bool (compound) query
GET /users/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "张"
}
}
],
"must_not": [
{
"term": {
"city": "上海"
}
}
],
"should": [
{
"range": {
"age": {
"gte": 25
}
}
}
],
"filter": [
{
"term": {
"status": "active"
}
}
]
}
}
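}
Clause reference:
- must: the clause has to match; contributes to the score
- must_not: the clause must not match; does not affect the score
- should: matching is optional and raises the score; when the bool query has no must or filter clause, at least one should clause has to match
- filter: the clause has to match; does not affect the score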
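The clause semantics can be modeled in a few lines of plain Java. This is only a toy sketch: each clause is a predicate, and the "score" simply counts matching must and should clauses, which is an assumption made for illustration and nothing like Elasticsearch's real relevance scoring. All names are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model of bool-query clause semantics over an in-memory value.
final class BoolQuerySketch {
    /** Returns -1 when the document is excluded, otherwise a toy score. */
    static <T> int evaluate(T doc,
                            List<Predicate<T>> must,
                            List<Predicate<T>> filter,
                            List<Predicate<T>> mustNot,
                            List<Predicate<T>> should) {
        for (Predicate<T> clause : filter) {
            if (!clause.test(doc)) return -1;       // filter: required, never scored
        }
        for (Predicate<T> clause : mustNot) {
            if (clause.test(doc)) return -1;        // must_not: excludes, never scored
        }
        int score = 0;
        for (Predicate<T> clause : must) {
            if (!clause.test(doc)) return -1;       // must: required and scored
            score++;
        }
        int shouldMatches = 0;
        for (Predicate<T> clause : should) {
            if (clause.test(doc)) shouldMatches++;  // should: optional, adds to the score
        }
        // With no must/filter clause, at least one should clause has to match.
        boolean shouldIsRequired = must.isEmpty() && filter.isEmpty() && !should.isEmpty();
        if (shouldIsRequired && shouldMatches == 0) return -1;
        return score + shouldMatches;
    }
}
```

Two documents that both pass the required clauses can therefore differ in score only through their must and should matches, which is why moving a condition from must to filter changes ranking but not the set of hits.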
Wildcard query
GET /users/_search
{
"query": {
"wildcard": {
"name": {
"value": "张*"
}
}
}
}
Prefix query
GET /users/_search
{
"query": {
"prefix": {
"email": {
"value": "zhang"
}
}
}
}
Fuzzy query
GET /users/_search
{
"query": {
"fuzzy": {
"name": {
"value": "张三",
"fuzziness": "AUTO"
}
}
}
}
Highlighting
GET /users/_search
{
"query": {
"match": {
"name": "张三"
}
},
"highlight": {
"fields": {
"name": {}
}
}
}
Response:
{
"hits": {
"hits": [
{
"_source": {
"name": "张三"
},
"highlight": {
"name": ["<em>张</em><em>三</em>"]
}
}
]
}
}
Aggregation queries: see Chapter 8, Aggregations.
8. Aggregations
Aggregations compute statistics over your data.
Basic structure:
GET /users/_search
{
"size": 0,
"aggs": {
"<aggregation_name>": {
"<aggregation_type>": {
<aggregation parameters>
}
}
}
}
Average (Avg)
GET /users/_search
{
"size": 0,
"aggs": {
"avg_age": {
"avg": {
"field": "age"
}
}
}
}
Maximum (Max)
GET /users/_search
{
"size": 0,
"aggs": {
"max_age": {
"max": {
"field": "age"
}
}
}
}
Minimum (Min)
GET /users/_search
{
"size": 0,
"aggs": {
"min_age": {
"min": {
"field": "age"
}
}
}
}
Sum (Sum)
GET /users/_search
{
"size": 0,
"aggs": {
"total_age": {
"sum": {
"field": "age"
}
}
}
}
Statistics (Stats)
GET /users/_search
{
"size": 0,
"aggs": {
"age_stats": {
"stats": {
"field": "age"
}
}
}
}
Returns: count, min, max, avg, sum
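For intuition, the five values of a stats aggregation match what the JDK's IntSummaryStatistics computes over an in-memory list. The sketch below is a local analogy only (the class name is hypothetical); Elasticsearch computes the same figures per shard and merges them.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

// Local analogy for the stats aggregation: count, min, max, avg, and sum in one pass.
final class StatsAggSketch {
    static IntSummaryStatistics stats(int... ages) {
        return IntStream.of(ages).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(25, 30, 28);
        System.out.println("count=" + s.getCount() + " min=" + s.getMin()
                + " max=" + s.getMax() + " avg=" + s.getAverage() + " sum=" + s.getSum());
    }
}
```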
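Terms aggregation (Terms)
Note: a terms aggregation needs a keyword or numeric field. If city was mapped dynamically (and is therefore a text field with a keyword sub-field), aggregate on city.keyword instead.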
GET /users/_search
{
"size": 0,
"aggs": {
"cities": {
"terms": {
"field": "city",
"size": 10
}
}
}
}
Response:
{
"aggregations": {
"cities": {
"buckets": [
{
"key": "北京",
"doc_count": 100
},
{
"key": "上海",
"doc_count": 80
}
]
}
}
}
Range aggregation (Range); each range includes its from value and excludes its to value
GET /users/_search
{
"size": 0,
"aggs": {
"age_ranges": {
"range": {
"field": "age",
"ranges": [
{
"to": 25
},
{
"from": 25,
"to": 35
},
{
"from": 35
}
]
}
}
}
}
Date range aggregation (Date Range)
GET /users/_search
{
"size": 0,
"aggs": {
"date_ranges": {
"date_range": {
"field": "create_time",
"format": "yyyy-MM-dd",
"ranges": [
{
"from": "2024-01-01",
"to": "2024-01-31"
}
]
}
}
}
}
Histogram aggregation (Histogram)
GET /users/_search
{
"size": 0,
"aggs": {
"age_histogram": {
"histogram": {
"field": "age",
"interval": 5
}
}
}
}
Nested (sub-) aggregation
GET /users/_search
{
"size": 0,
"aggs": {
"cities": {
"terms": {
"field": "city"
},
"aggs": {
"avg_age": {
"avg": {
"field": "age"
}
}
}
}
}
}
Result: one bucket per city, each with that city's average age.
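The bucket-then-metric shape of this query has a direct in-memory analogy in the Java streams API: group by a key, then reduce each group. The sketch below is only that analogy, with a hypothetical User class and sample values assumed for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory analogy for "terms on city" with an "avg on age" sub-aggregation.
final class NestedAggSketch {
    static final class User {
        final String city;
        final int age;

        User(String city, int age) {
            this.city = city;
            this.age = age;
        }
    }

    static Map<String, Double> avgAgeByCity(List<User> users) {
        return users.stream().collect(
                Collectors.groupingBy(
                        u -> u.city,                          // bucket key, like the terms aggregation
                        Collectors.averagingInt(u -> u.age)   // per-bucket metric, like avg
                ));
    }
}
```

Each map entry corresponds to one bucket in the response: the key is the bucket's key and the value is what the nested avg_age aggregation would report for it.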
9. Java Client Integration
Maven dependencies:
<dependencies>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.17.0</version>
</dependency>
<!-- Or use the newer Java API Client (recommended) -->
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>8.11.0</version>
</dependency>
</dependencies>
Using the REST High Level Client (deprecated since 7.15 in favor of the Java API Client)
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.apache.http.HttpHost;
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("localhost", 9200, "http")
)
);
Using the Java API Client (recommended)
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
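// Create the low-level REST client
RestClient restClient = RestClient.builder(
new HttpHost("localhost", 9200)
).build();
// Create the transport layer
ElasticsearchTransport transport = new RestClientTransport(
restClient, new JacksonJsonpMapper()
);
// Create the API client
ElasticsearchClient client = new ElasticsearchClient(transport);
Index operations
// The remaining snippets use the REST High Level Client API,
// so `client` below is the RestHighLevelClient created earlier.
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
// Create an index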
CreateIndexRequest request = new CreateIndexRequest("users");
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
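// Delete an index
DeleteIndexRequest deleteRequest = new DeleteIndexRequest("users");
AcknowledgedResponse deleteResponse = client.indices().delete(deleteRequest, RequestOptions.DEFAULT);
Document operations
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
// Create a document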
IndexRequest request = new IndexRequest("users");
request.id("1");
request.source(
"name", "张三",
"age", 25,
"email", "zhangsan@example.com"
);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
// Get a document
GetRequest getRequest = new GetRequest("users", "1");
GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
// Update a document
UpdateRequest updateRequest = new UpdateRequest("users", "1");
updateRequest.doc("age", 26);
UpdateResponse updateResponse = client.update(updateRequest, RequestOptions.DEFAULT);
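// Delete a document
DeleteRequest deleteRequest = new DeleteRequest("users", "1");
DeleteResponse deleteResponse = client.delete(deleteRequest, RequestOptions.DEFAULT);
Search
import java.util.Map;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortOrder;
// Build the search request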
SearchRequest searchRequest = new SearchRequest("users");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
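// Match query (each call to query() replaces the previous one;
// the variants are listed together only for illustration)
sourceBuilder.query(QueryBuilders.matchQuery("name", "张三"));
// Term query
sourceBuilder.query(QueryBuilders.termQuery("age", 25));
// Range query
sourceBuilder.query(QueryBuilders.rangeQuery("age").gte(25).lte(35));
// Bool query
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
boolQuery.must(QueryBuilders.matchQuery("name", "张"));
boolQuery.filter(QueryBuilders.rangeQuery("age").gte(25));
sourceBuilder.query(boolQuery);
// Pagination
sourceBuilder.from(0);
sourceBuilder.size(10);
// Sorting
sourceBuilder.sort("age", SortOrder.DESC);
searchRequest.source(sourceBuilder);
// Execute the search
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// Process the results
SearchHits hits = searchResponse.getHits();
for (SearchHit hit : hits) {
String sourceAsString = hit.getSourceAsString();
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
}
Aggregations
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.AvgAggregationBuilder;
SearchRequest searchRequest = new SearchRequest("users");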
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.size(0);
// Terms aggregation
TermsAggregationBuilder aggregation = AggregationBuilders
.terms("cities")
.field("city");
sourceBuilder.aggregation(aggregation);
// Average aggregation
AvgAggregationBuilder avgAgg = AggregationBuilders
.avg("avg_age")
.field("age");
sourceBuilder.aggregation(avgAgg);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// Process the aggregation results
Aggregations aggregations = searchResponse.getAggregations();
Terms cities = aggregations.get("cities");
for (Terms.Bucket bucket : cities.getBuckets()) {
String key = bucket.getKeyAsString();
long docCount = bucket.getDocCount();
}
// Close the client when you are done with it
client.close();
10. Practical Examples
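Requirement: implement product search for an e-commerce site.
Index structure (the ik_max_word analyzer comes from the IK analysis plugin, which has to be installed separately):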
PUT /products
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"price": {
"type": "double"
},
"category": {
"type": "keyword"
},
"brand": {
"type": "keyword"
},
"description": {
"type": "text"
}
}
}
}
Search query:
GET /products/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "手机",
"fields": ["title^2", "description"]
}
}
],
"filter": [
{
"term": {
"category": "电子产品"
}
},
{
"range": {
"price": {
"gte": 1000,
"lte": 5000
}
}
}
]
}
},
"sort": [
{
"price": {
"order": "asc"
}
}
],
"highlight": {
"fields": {
"title": {},
"description": {}
}
}
}
Requirement: analyze application logs and count error entries.
Index structure:
PUT /logs
{
"mappings": {
"properties": {
"timestamp": {
"type": "date"
},
"level": {
"type": "keyword"
},
"message": {
"type": "text"
},
"service": {
"type": "keyword"
}
}
}
}
Aggregation:
GET /logs/_search
{
"size": 0,
"query": {
"range": {
"timestamp": {
"gte": "now-1d"
}
}
},
"aggs": {
"errors_by_service": {
"terms": {
"field": "service"
},
"aggs": {
"error_count": {
"filter": {
"term": {
"level": "ERROR"
}
}
}
}
}
}
}
Requirement: analyze user behavior data and measure user activity.
Aggregation query:
GET /user_actions/_search
{
"size": 0,
"aggs": {
"active_users": {
"cardinality": {
"field": "user_id"
}
},
"actions_by_hour": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "hour"
}
}
}
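}
11. Performance Optimization and Best Practices
Shard strategy
- Primary shard count: decide based on data volume and node count; aim for roughly 20-50 GB per shard
- Replica count: 1-2 replicas recommended in production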
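The 20-50 GB guideline turns into simple arithmetic: divide the expected index size by a target shard size and round up. The helper below is a rough sketch of that calculation; the class and parameter names are hypothetical, and real sizing should also account for node count, growth, and query load.

```java
// Back-of-the-envelope primary shard estimate from the per-shard size guideline.
final class ShardSizingSketch {
    /**
     * @param expectedDataGb expected primary data size of the index, in GB
     * @param targetShardGb  desired size of one shard, in GB (for example 30)
     */
    static int primaryShards(double expectedDataGb, double targetShardGb) {
        if (expectedDataGb <= 0 || targetShardGb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Round up so that no shard exceeds the target; never go below one shard.
        return (int) Math.max(1, Math.ceil(expectedDataGb / targetShardGb));
    }

    public static void main(String[] args) {
        System.out.println(primaryShards(300, 30));
    }
}
```

For example, 300 GB of data at a 30 GB target gives 10 primary shards, while a 10 GB index still gets a single shard.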
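Mapping optimization
- Choose field types deliberately
- Set index: false on fields that never need to be searched
- Set doc_values: false on fields that never need sorting or aggregation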
{
"mappings": {
"properties": {
"content": {
"type": "text",
"index": true
},
"image_url": {
"type": "keyword",
"index": false
}
}
}
}
Prefer filter over query clauses
filter clauses skip scoring and their results can be cached, so they are usually faster.
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "手机"
}
}
],
"filter": [
{
"term": {
"status": "active"
}
}
]
}
}
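}
Avoid deep pagination
Use search_after instead of from/size for deep pagination; from + size is capped at 10,000 hits by default (index.max_result_window). The sort needs a unique tiebreaker. The example sorts on _id for brevity, but 8.x disables fielddata on _id by default, so a dedicated keyword or numeric field is the safer choice.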
GET /users/_search
{
"size": 10,
"sort": [
{
"_id": "asc"
}
],
"search_after": ["last_id"]
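}
Bulk operations
- Use the bulk API for batch writes
- Keep batches bounded (1,000-5,000 documents is a reasonable starting point)
- Send batches from multiple threads in parallel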
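Bounding the batch size usually means splitting a large list of documents into fixed-size chunks before sending each chunk as one bulk request. The following sketch shows only the splitting step with JDK collections; the class name is hypothetical and sending the batches is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Splits a list of documents into batches of at most batchSize elements.
final class BulkBatchSketch {
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the view so each batch stays valid if the source list changes later.
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }
}
```

Each resulting batch can then be handed to a worker thread, which keeps memory use per request predictable and lets a failed batch be retried on its own.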
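Check cluster status
GET /_cluster/health
GET /_nodes/stats
GET /_cat/indices?v
Slow query log
The slow-log thresholds are index-level settings, so apply them per index through the settings API rather than in elasticsearch.yml:
PUT /users/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s"
}
Index design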
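- Size shards and replicas sensibly
- Choose appropriate field types
- Manage indices through aliases
Query optimization
- Use filter instead of query where scoring is not needed
- Avoid deep pagination
- Use aggregations judiciously
Data management
- Delete old data regularly
- Use index lifecycle management (ILM)
- Optimize indices periodically (for example, force-merge indices that are no longer written to)
Monitoring and maintenance
- Monitor cluster health
- Watch the slow query log
- Back up data regularly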
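12. Summary and Next Steps
This tutorial has covered:
- ✅ Elasticsearch's basic concepts and architecture
- ✅ CRUD operations on indices and documents
- ✅ The main query types
- ✅ Aggregations
- ✅ Java client integration
- ✅ Practical examples
- ✅ Performance optimization techniques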
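Topics for further study:
Elasticsearch clusters
- Multi-node cluster deployment
- Shard allocation strategies
- Failure recovery
Advanced queries
- Complex aggregations
- Geo queries
- Nested queries
Plugin development
- Custom analyzers
- Custom plugins
ELK Stack
- Data collection with Logstash
- Visualization with Kibana
- Data shipping with Beats
Performance tuning
- JVM tuning
- Query performance
- Indexing performance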
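Learning resources:
- Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
- Chinese community: https://elasticsearch.cn/
- GitHub: https://github.com/elastic/elasticsearch
Practice projects:
- Build an ELK log-analysis system
- Implement product search
- Build a user-behavior analytics system
- Tune query performance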
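Closing Remarks
Elasticsearch is a powerful, widely used search engine. After working through this tutorial, you should have a solid grasp of its core features and how to use them.
Keep in mind:
- Practice: combine theory with hands-on work
- Understand the principles: learn how the inverted index, shards, and other core concepts work
- Watch performance: pay attention to both query and indexing performance
- Keep learning: follow the features added in new Elasticsearch releases
Happy learning and happy coding! 🚀
This tutorial was written by the Java突击队 learning community; feedback is welcome.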