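Deleting Documents
1. Deletion Overview
1.1 Deletion Types
| Type | Description |
|---|---|
| Delete by ID | Deletes a single document by its unique key |
| Delete by query | Deletes every document that matches a query |
| Batch delete | Deletes multiple documents in one request |
| Delete all | Empties the index |
1.2 Deletion Flow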
text
Delete request
↓
Parse the delete criteria
↓
Mark documents as deleted (soft delete)
↓
Commit
↓
Physical removal (during segment merge)
2. Delete by ID
2.1 JSON Format
bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
2.2 XML Format
bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/xml" \
-d '<delete><id>book-001</id></delete>'
2.3 Using the post Tool
bash
# Delete a single document (with -d, bin/post sends the data as XML unless -type is given)
bin/post -c mycore -d '<delete><id>book-001</id></delete>'
3. Batch Delete
3.1 Deleting Multiple IDs
bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{
"delete": [
{"id": "book-001"},
{"id": "book-002"},
{"id": "book-003"}
]
}'
3.2 XML Batch Delete
bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/xml" \
-d '<delete>
<id>book-001</id>
<id>book-002</id>
<id>book-003</id>
</delete>'
3.3 Deleting from a File
bash
# Create the file that lists the IDs to delete
cat > delete_ids.json << 'EOF'
{
"delete": [
{"id": "book-001"},
{"id": "book-002"},
{"id": "book-003"}
]
}
EOF
# Run the delete
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
--data-binary @delete_ids.json
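When the IDs arrive as plain text from another system, the JSON body can be generated rather than written by hand. The sketch below is one way to do that. The function name `build_delete_payload` and the one-ID-per-line input are assumptions made for illustration, and it emits the array-of-strings form of the delete command (`{"delete": ["id1", "id2"]}`) rather than the array of objects used above.

```shell
# Build a JSON delete payload from plain-text IDs (one per line) on stdin.
# Backslashes and double quotes are escaped so each ID is a valid JSON string.
# Assumption: IDs contain no control characters such as tabs.
build_delete_payload() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk 'BEGIN { printf "{\"delete\":[" }
         NF    { printf "%s\"%s\"", (n++ ? "," : ""), $0 }  # skip blank lines
         END   { printf "]}\n" }'
}
```

It could be combined with curl as `build_delete_payload < ids.txt | curl -X POST "http://localhost:8983/solr/mycore/update" -H "Content-Type: application/json" --data-binary @-`, where `ids.txt` is a hypothetical file of IDs.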
4. Delete by Query
4.1 Basic Syntax
bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:tech"}}'
4.2 Conditional Delete Examples
Delete by field value
bash
# Delete a specific category
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:tech"}}'
# Delete a specific author
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "author:张三"}}'
Range delete
bash
# Delete by price range
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "price:[0 TO 50]"}}'
# Delete by date range
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "publish_date:[* TO 2020-01-01T00:00:00Z]"}}'
Compound conditions
bash
# Multiple conditions
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:tech AND price:[0 TO 100]"}}'
# Using OR
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:tech OR category:book"}}'
# Using NOT
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "NOT category:tech"}}'
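Queries that contain double quotes, such as phrase queries, break a hand-written JSON body unless the quotes are escaped. The following sketch wraps an arbitrary query string in a delete-by-query body; the function name `dbq_payload` is hypothetical, and the function only escapes the string for JSON without validating the Solr query syntax.

```shell
# Wrap a Solr query string in a delete-by-query JSON body.
# Only backslashes and double quotes are escaped; the query is not validated.
dbq_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"delete":{"query":"%s"}}\n' "$escaped"
}
```

For example, `dbq_payload 'title:"Solr in Action"'` produces a body whose inner quotes are escaped, which can then be piped to curl with `--data-binary @-`.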
4.3 Deleting All Documents
bash
# JSON format
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "*:*"}}'
# XML format
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/xml" \
-d '<delete><query>*:*</query></delete>'
5. Deletes and Commits
5.1 Delete and Commit in One Request
bash
# Delete and commit
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
5.2 Committing Separately
bash
# Delete first
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
# Then commit
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"
5.3 Soft Commit
bash
curl -X POST "http://localhost:8983/solr/mycore/update?softCommit=true" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
6. Delete Parameters
6.1 commitWithin
bash
# Commit within 1000 milliseconds
curl -X POST "http://localhost:8983/solr/mycore/update?commitWithin=1000" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
6.2 overwrite
bash
# The overwrite parameter has no effect on delete operations
curl -X POST "http://localhost:8983/solr/mycore/update?overwrite=true" \
-H "Content-Type: application/json" \
-d '{"delete": {"id": "book-001"}}'
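7. Deletion Strategies
7.1 Soft Delete
Solr's schema has no dedicated soft-delete field: every delete is already recorded as a mark in the segment and physically removed at merge time. The fields below are the ones Solr uses to track nested (parent/child) documents, which matters when you delete documents that have children:
xml
<field name="_root_" type="string" indexed="true" stored="true"/>
<field name="_nest_path_" type="string" indexed="true" stored="true"/>
<field name="_nest_parent_" type="string" indexed="true" stored="true"/>
How quickly a delete becomes visible to searches is governed by commits. In solrconfig.xml, automatic soft commits are configured with autoSoftCommit (the 1000 ms interval is only an example value):
xml
<updateHandler class="solr.DirectUpdateHandler2">
<autoSoftCommit>
<maxTime>1000</maxTime>
</autoSoftCommit>
</updateHandler>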
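7.2 TTL (Time To Live)
The fields below only hold the TTL data. Expired documents are removed automatically only if the update chain also includes solr.processor.DocExpirationUpdateProcessorFactory, configured with ttlFieldName=ttl, expirationFieldName=ttl_expire, and an autoDeletePeriodSeconds interval.
xml
<!-- Fields used for TTL -->
<field name="ttl" type="string" indexed="true" stored="true"/>
<field name="ttl_expire" type="pdate" indexed="true" stored="true"/>
bash
# Set the TTL at index time
curl -X POST "http://localhost:8983/solr/mycore/update/json/docs" \
-H "Content-Type: application/json" \
-d '{
"id": "temp-001",
"data": "temporary data",
"ttl": "+7DAYS"
}'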
7.3 Scheduled Cleanup
bash
#!/bin/bash
# delete_expired.sh: remove expired data
# Delete data older than 30 days
curl -X POST "http://localhost:8983/solr/logs/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "timestamp:[* TO NOW-30DAYS]"}}'
# Commit so the deletion becomes visible
curl -X POST "http://localhost:8983/solr/logs/update?commit=true"
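The retention period is hard-coded in the script above. A minimal sketch of a parameterized variant follows, assuming a hypothetical helper named `expired_query` that takes the retention in days and, optionally, the date field name. It rejects anything that is not a positive integer, since a malformed value would change which documents the range matches.

```shell
# Build the range query for a given retention period.
# $1 = retention in days (positive integer, no leading zero)
# $2 = date field name (defaults to "timestamp")
expired_query() {
  days=$1
  field=${2:-timestamp}
  case $days in
    ''|*[!0-9]*|0*)
      echo "retention must be a positive integer: '$days'" >&2
      return 1
      ;;
  esac
  printf '%s:[* TO NOW-%sDAYS]\n' "$field" "$days"
}
```

The result can be placed in the `query` value of the delete request, so the same script serves several retention policies.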
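8. Monitoring Deletes
8.1 Viewing Delete Statistics
bash
# Delete counters of the update handler, read through the Metrics API (Solr 7 to 9)
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=UPDATE.updateHandler.deletes"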
8.2 Viewing Index Statistics
bash
# Number of live documents
curl "http://localhost:8983/solr/mycore/select?q=*:*&rows=0"
# Number of deleted documents not yet merged away
curl "http://localhost:8983/solr/admin/cores?action=STATUS&core=mycore" | jq '.status.mycore.index.deletedDocs'
9. Delete Optimization
9.1 Batch Delete Optimization
bash
# Send one batched request instead of one request per ID
# Not recommended
for id in book-001 book-002 book-003; do
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d "{\"delete\": {\"id\": \"$id\"}}"
done
# Recommended
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{
"delete": [
{"id": "book-001"},
{"id": "book-002"},
{"id": "book-003"}
]
}'
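9.2 Index Optimization
After deleting a large number of documents, an optimize (forced merge) rewrites the segments and drops the deleted documents. It is I/O-intensive on large indexes, so run it off-peak:
bash
# Optimize the index
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true"
# Limit the maximum number of segments
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true&maxSegments=1"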
10. Practical Examples
10.1 Cleaning Up Test Data
bash
# Delete all test data
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "env:test"}}'
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"
10.2 Cleaning Up Expired Logs
bash
# Delete logs older than 7 days
curl -X POST "http://localhost:8983/solr/logs/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "timestamp:[* TO NOW-7DAYS]"}}'
curl -X POST "http://localhost:8983/solr/logs/update?commit=true"
10.3 Removing Delisted Products
bash
# Delete products that are offline
curl -X POST "http://localhost:8983/solr/products/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "status:offline"}}'
curl -X POST "http://localhost:8983/solr/products/update?commit=true"
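10.4 Cleaning Up Duplicate Data
bash
# First fetch the documents to check for duplicates
curl "http://localhost:8983/solr/mycore/select?q=*:*&fl=id,title&sort=id%20asc&rows=1000" > docs.json
# Analyze the results and pick the duplicates
# ... processing logic ...
# Delete the duplicate documents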
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": [{"id": "dup-001"}, {"id": "dup-002"}]}'
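The analysis step is left open above. One possible approach, shown here only as a sketch, treats documents with the same title as duplicates and keeps the first one seen. It assumes the query results have already been converted to tab-separated `id<TAB>title` lines; that input format and the function name `duplicate_ids` are assumptions, not part of the original workflow.

```shell
# Print the IDs of documents whose title has already been seen.
# Input: tab-separated "id<TAB>title" lines on stdin.
# The first document for each title is kept; later ones are reported.
duplicate_ids() {
  awk -F '\t' 'seen[$2]++ { print $1 }'
}
```

The printed IDs can then be turned into a batch delete request like the one above.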
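11. Common Problems
11.1 Delete Has No Effect
Problem: documents are still returned after being deleted.
Causes:
- The delete was not committed
- The delete query does not match the intended documents
- Replica synchronization lag
Solution:
bash
# Make sure the delete is committed
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"
# Check what the delete query actually matches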
curl "http://localhost:8983/solr/mycore/select?q=category:tech"
11.2 Slow Deletes
Problem: deleting a large number of documents is slow.
Solution:
bash
# Use delete by query
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:old"}}'
# Optimize the index after the delete
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true"
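11.3 Disk Space Not Released
Problem: disk space is not freed after a delete.
Cause: Lucene only marks documents as deleted; they are physically removed when segments are merged.
Solution:
bash
# Run an optimize to force a merge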
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true&maxSegments=1"
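Because a forced merge is expensive, it can help to check how much of the index is actually occupied by deleted documents before running one. The sketch below computes that share from the `deletedDocs` and `maxDoc` values of the core STATUS response. The function name `deleted_pct` is hypothetical, and what percentage justifies a merge is a judgment call this example does not settle.

```shell
# Percentage of deleted documents still held in the index, rounded down.
# $1 = deletedDocs, $2 = maxDoc (live plus deleted documents)
deleted_pct() {
  if [ "$2" -eq 0 ]; then
    echo 0   # empty index: nothing to reclaim
  else
    echo $(( $1 * 100 / $2 ))
  fi
}
```

For instance, `deleted_pct 250 1000` reports that a quarter of the index consists of deleted documents.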
12. Safe Deletion
12.1 Confirm Before Deleting
bash
# Query first to check the match count
curl "http://localhost:8983/solr/mycore/select?q=category:old&rows=0"
# Delete once the count is confirmed
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:old"}}'
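The confirmation step can also be automated so that a script refuses to delete when the match count looks wrong. This is a minimal sketch under stated assumptions: the helper names `num_found` and `safe_to_delete` are hypothetical, the response is Solr's JSON select output, and the upper limit is a value you choose for your own data.

```shell
# Extract numFound from a Solr JSON select response read on stdin.
num_found() {
  sed -n 's/.*"numFound"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only if the match count is between 1 and the given limit.
# $1 = numFound, $2 = maximum number of documents you expect to delete
safe_to_delete() {
  [ "$1" -gt 0 ] && [ "$1" -le "$2" ]
}
```

A script could pipe the output of the `rows=0` query into `num_found` and run the delete only when `safe_to_delete "$n" 1000` succeeds.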
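12.2 Back Up Before Deleting
bash
# Back up the matching documents (more than 10000 matches need paging, for example with cursorMark)
curl "http://localhost:8983/solr/mycore/select?q=category:old&fl=*&rows=10000&wt=json" > backup.json
# Delete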
curl -X POST "http://localhost:8983/solr/mycore/update" \
-H "Content-Type: application/json" \
-d '{"delete": {"query": "category:old"}}'
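12.3 Access Control
xml
<!-- solrconfig.xml: route /update through a custom update chain.
"auth-chain" must be defined separately as an updateRequestProcessorChain; Solr has no built-in chain of that name.
Solr's built-in access control is configured in security.json (Rule-Based Authorization Plugin). -->
<requestHandler name="/update" class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">auth-chain</str>
</lst>
</requestHandler>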
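13. Summary
Comparison of delete methods:
| Method | Use case | Performance |
|---|---|---|
| Delete by ID | Single document | High |
| Batch delete by ID | A small number of documents | High |
| Delete by query | A large number of documents | Medium |
| Delete all | Emptying the index | Low |
Best practices:
- Query first to confirm what will be deleted
- Use delete by query for bulk deletions
- Commit promptly after deleting
- Optimize the index periodically, ideally off-peak
- Back up important data first
Next, let's learn about querying documents!
Last updated: 2026-03-27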