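Document Deletion

1. Deletion Overview

1.1 Delete types

Type             Description
Delete by ID     Deletes a single document by its unique key
Delete by query  Deletes every document that matches a query
Batch delete     Deletes several documents in one request
Delete all       Empties the index

1.2 Delete flow

text
Delete request
    ↓
Parse the delete condition
    ↓
Mark as deleted (soft delete)
    ↓
Commit
    ↓
Physical removal (during segment merge)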

2. Delete by ID

2.1 JSON format

bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'

2.2 XML format

bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/xml" \
  -d '<delete><id>book-001</id></delete>'

2.3 Using the post tool

bash
# Delete a single document. With -d, bin/post sends the data as XML by default.
bin/post -c mycore -d '<delete><id>book-001</id></delete>'

3. Batch Delete

3.1 Deleting multiple IDs

bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{
    "delete": [
      {"id": "book-001"},
      {"id": "book-002"},
      {"id": "book-003"}
    ]
  }'

3.2 Batch delete in XML

bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/xml" \
  -d '<delete>
    <id>book-001</id>
    <id>book-002</id>
    <id>book-003</id>
  </delete>'

3.3 Deleting from a file

bash
# Create the file that lists the IDs to delete
cat > delete_ids.json << 'EOF'
{
  "delete": [
    {"id": "book-001"},
    {"id": "book-002"},
    {"id": "book-003"}
  ]
}
EOF

# Run the delete
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  --data-binary @delete_ids.json
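
When the IDs sit in a plain list rather than in a prepared JSON file, the request body can be generated instead of written by hand. The sketch below is only an illustration and not part of Solr: `build_delete_payload` is a hypothetical helper name, it emits the array-of-IDs form of the JSON delete command, and it assumes the IDs contain no double quotes, backslashes, or control characters.

```shell
#!/bin/sh
# build_delete_payload: read document IDs (one per line) on stdin and print
# a Solr JSON delete body of the form {"delete":["id1","id2"]}.
# Assumption: IDs need no JSON escaping (no quotes, backslashes, control chars).
build_delete_payload() {
  first=1
  printf '{"delete":['
  while IFS= read -r id; do
    [ -n "$id" ] || continue            # skip blank lines
    if [ "$first" -eq 1 ]; then
      first=0
    else
      printf ','                        # separator between array items
    fi
    printf '"%s"' "$id"
  done
  printf ']}\n'
}
```

The output can be piped straight to curl, which reads the body from standard input with `--data-binary @-`, for example `build_delete_payload < ids.txt | curl -X POST "http://localhost:8983/solr/mycore/update" -H "Content-Type: application/json" --data-binary @-` (ids.txt is an assumed file name).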

4. Delete by Query

4.1 Basic syntax

bash
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:tech"}}'

4.2 Conditional delete examples

Delete by field value

bash
# Delete one category
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:tech"}}'

# Delete one author
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "author:张三"}}'
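
Field values that contain spaces or quotes break a hand-written `field:value` query, so it can help to build the body programmatically. The following is a minimal sketch under stated assumptions: `esc` and `field_delete_body` are hypothetical helper names, the value is sent as a quoted phrase, and only backslashes and double quotes are escaped (control characters and multi-line values are not handled).

```shell
#!/bin/sh
# esc: backslash-escape backslashes and double quotes in $1.
# The same rule is valid inside a Solr quoted phrase and inside a JSON string.
esc() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# field_delete_body FIELD VALUE: print a delete-by-query body that matches
# FIELD against VALUE as a quoted phrase, e.g. author:"Zhang San".
field_delete_body() {
  query="$1:\"$(esc "$2")\""                                # Solr-level quoting
  printf '{"delete":{"query":"%s"}}\n' "$(esc "$query")"    # JSON-level quoting
}
```

The value is escaped twice on purpose: once for the Solr query syntax and once more because the whole query is embedded in a JSON string.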

Range delete

bash
# Delete a price range (both bounds inclusive)
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "price:[0 TO 50]"}}'

# Delete a date range (everything up to and including the given instant)
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "publish_date:[* TO 2020-01-01T00:00:00Z]"}}'

Compound-condition delete

bash
# Multiple conditions combined with AND
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:tech AND price:[0 TO 100]"}}'

# Using OR
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:tech OR category:book"}}'

# Using NOT
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "NOT category:tech"}}'

4.3 Deleting all documents

bash
# JSON format
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "*:*"}}'

# XML format
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/xml" \
  -d '<delete><query>*:*</query></delete>'

5. Deletes and Commits

5.1 Delete and commit in one request

bash
# Delete and commit immediately
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'

5.2 Committing separately

bash
# Delete first
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'

# Commit afterwards
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"

5.3 Soft commit

bash
curl -X POST "http://localhost:8983/solr/mycore/update?softCommit=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'

6. Delete Parameters

6.1 commitWithin

bash
# Commit within 1000 milliseconds
curl -X POST "http://localhost:8983/solr/mycore/update?commitWithin=1000" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'
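
The commit variants shown so far differ only in the query string of the update URL. As a rough sketch, a small helper can make that choice explicit in scripts; `update_url` and the `SOLR_URL` variable are hypothetical names introduced here, and the mode keywords (`hard`, `soft`, `within:MS`, `none`) are this example's own convention, not Solr parameters.

```shell
#!/bin/sh
# update_url CORE MODE: print the update URL for the chosen commit behaviour.
# SOLR_URL is an assumed environment variable; it defaults to a local Solr.
update_url() {
  base="${SOLR_URL:-http://localhost:8983/solr}/$1/update"
  case $2 in
    hard)     printf '%s?commit=true\n' "$base" ;;                    # hard commit
    soft)     printf '%s?softCommit=true\n' "$base" ;;                # soft commit
    within:*) printf '%s?commitWithin=%s\n' "$base" "${2#within:}" ;; # deadline in ms
    none|'')  printf '%s\n' "$base" ;;                                # commit later
    *)        echo "unknown commit mode: $2" >&2; return 1 ;;
  esac
}
```

A delete would then be sent with `curl -X POST "$(update_url mycore within:1000)" -H "Content-Type: application/json" -d '{"delete": {"id": "book-001"}}'`.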

6.2 overwrite

bash
# overwrite applies to adds only; it has no effect on a delete
curl -X POST "http://localhost:8983/solr/mycore/update?overwrite=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"id": "book-001"}}'

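7. Delete Strategies

7.1 Soft delete

Deletes in Solr are already soft at the Lucene level: a delete marks the document, and the data is physically removed only when segments merge. No schema field switches this on. The fields below are the ones Solr uses for nested (parent/child) documents:

xml
<field name="_root_" type="string" indexed="true" stored="true"/>
<field name="_nest_path_" type="string" indexed="true" stored="true"/>
<field name="_nest_parent_" type="string" indexed="true" stored="true"/>

In solrconfig.xml, the soft-commit interval controls how quickly marked deletes stop appearing in search results (the 1000 ms value is illustrative):

xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>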

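7.2 TTL (Time To Live)

Solr implements TTL through solr.DocExpirationUpdateProcessorFactory. The fields below take effect only when that processor is part of the update chain and is configured with ttlFieldName=ttl, expirationFieldName=ttl_expire, and an autoDeletePeriodSeconds interval:

xml
<!-- TTL fields -->
<field name="ttl" type="string" indexed="true" stored="true"/>
<field name="ttl_expire" type="pdate" indexed="true" stored="true"/>

bash
# Set a TTL at index time
curl -X POST "http://localhost:8983/solr/mycore/update/json/docs" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "temp-001",
    "data": "temporary data",
    "ttl": "+7DAYS"
  }'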

7.3 Periodic cleanup

bash
#!/bin/bash
# delete_expired.sh: remove expired data

# Delete data older than 30 days
curl -X POST "http://localhost:8983/solr/logs/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "timestamp:[* TO NOW-30DAYS]"}}'

curl -X POST "http://localhost:8983/solr/logs/update?commit=true"
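
The retention period in the cleanup script is hard-coded. A parameterized variant might look like the sketch below; `expired_delete_body` is a hypothetical helper name, and the field name and day count are supplied by the caller. It only builds the request body, so it can be checked before anything is sent to Solr.

```shell
#!/bin/sh
# expired_delete_body FIELD DAYS: print a delete-by-query body that matches
# every document whose FIELD is older than DAYS days (Solr date math).
expired_delete_body() {
  field=$1
  days=$2
  case $days in
    ''|*[!0-9]*)
      echo "days must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"delete":{"query":"%s:[* TO NOW-%sDAYS]"}}\n' "$field" "$days"
}
```

For example, `expired_delete_body timestamp 30` prints the same body the script above sends, and the result can be passed to curl with `-d "$(expired_delete_body timestamp 30)"`.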

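8. Monitoring Deletes

8.1 Viewing delete statistics

bash
# The update handler stats include the delete counters (deletesById, deletesByQuery)
curl "http://localhost:8983/solr/mycore/admin/mbeans?stats=true&cat=UPDATE&key=updateHandler&wt=json" | jq '.["solr-mbeans"][1].updateHandler.stats'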

8.2 Viewing index statistics

bash
# Count the live documents
curl "http://localhost:8983/solr/mycore/select?q=*:*&rows=0"

# Count the deleted documents still held in the index
curl "http://localhost:8983/solr/admin/cores?action=STATUS&core=mycore" | jq '.status.mycore.index.deletedDocs'
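
The core status reports `numDocs` (live documents) and `maxDoc` (live plus deleted-but-not-yet-merged documents), so the deleted count equals `maxDoc - numDocs`. A small sketch of turning those two numbers into a percentage follows; `deleted_pct` is a hypothetical helper, and any threshold for acting on the result is a judgment call.

```shell
#!/bin/sh
# deleted_pct MAXDOC NUMDOCS: print the share of deleted documents in the
# index as a whole-number percentage (integer division, rounded down).
deleted_pct() {
  max_doc=$1
  num_docs=$2
  if [ "$max_doc" -eq 0 ]; then
    echo 0                      # empty index: nothing is deleted
    return 0
  fi
  echo $(( (max_doc - num_docs) * 100 / max_doc ))
}
```

With `maxDoc` 1000 and `numDocs` 750, `deleted_pct 1000 750` prints 25.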

9. Optimizing Deletes

9.1 Batch delete optimization

bash
# Send one batched request instead of one request per ID
# Not recommended
for id in book-001 book-002 book-003; do
  curl -X POST "http://localhost:8983/solr/mycore/update" \
    -H "Content-Type: application/json" \
    -d "{\"delete\": {\"id\": \"$id\"}}"
done

# Recommended
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{
    "delete": [
      {"id": "book-001"},
      {"id": "book-002"},
      {"id": "book-003"}
    ]
  }'

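9.2 Index optimization

After deleting a large number of documents, run an optimize. It rewrites index segments and is expensive, so reserve it for large deletes:

bash
# Optimize the index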
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true"

# Limit the number of segments
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true&maxSegments=1"
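
A lighter alternative to a full optimize is a commit with `expungeDeletes=true`, which asks Solr to merge away segments carrying deleted documents rather than rewrite the whole index. The sketch below only prints the command so it can be reviewed first; `expunge_deletes_cmd` and `SOLR_URL` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# expunge_deletes_cmd CORE: print (do not run) a commit request that asks
# Solr to expunge deleted documents from the given core.
# SOLR_URL is an assumed environment variable; it defaults to a local Solr.
expunge_deletes_cmd() {
  printf 'curl -X POST "%s/%s/update?commit=true&expungeDeletes=true"\n' \
    "${SOLR_URL:-http://localhost:8983/solr}" "$1"
}
```

Running `expunge_deletes_cmd mycore` prints the curl command; it can be pasted into a terminal or executed with `eval` once it has been checked.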

10. Practical Examples

10.1 Cleaning up test data

bash
# Delete all test data
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "env:test"}}'

curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"

10.2 Cleaning up expired logs

bash
# Delete logs older than 7 days
curl -X POST "http://localhost:8983/solr/logs/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "timestamp:[* TO NOW-7DAYS]"}}'

curl -X POST "http://localhost:8983/solr/logs/update?commit=true"

10.3 Removing delisted products

bash
# Delete delisted products
curl -X POST "http://localhost:8983/solr/products/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "status:offline"}}'

curl -X POST "http://localhost:8983/solr/products/update?commit=true"

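10.4 Cleaning up duplicate data

bash
# Query the candidate documents first (the space in the sort parameter must be URL-encoded)
curl "http://localhost:8983/solr/mycore/select?q=*:*&fl=id,title&sort=id%20asc&rows=1000" > docs.json

# Analyze the results and pick the duplicates
# ... processing logic ...

# Delete the duplicate documents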
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": [{"id": "dup-001"}, {"id": "dup-002"}]}'
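
The processing logic is left open above because what counts as a duplicate depends on the data. As one possible rule, and purely as an assumption for illustration, the sketch below treats documents with an identical title as duplicates and keeps the first one seen. `duplicate_ids` is a hypothetical helper that expects tab-separated `id` and `title` columns on standard input.

```shell
#!/bin/sh
# duplicate_ids: read "id<TAB>title" lines on stdin and print the id of every
# document whose title was already seen; the first occurrence is kept.
duplicate_ids() {
  awk -F '\t' 'seen[$2]++ { print $1 }'
}
```

If `title` is single-valued, the input can be produced from the saved response with `jq -r '.response.docs[] | [.id, .title] | @tsv' docs.json`, and the printed IDs are the ones to pass to a delete-by-ID request.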

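11. Common Problems

11.1 Delete has no effect

Problem: the document is still there after the delete

Causes

  • The delete was not committed
  • The query condition is wrong
  • Replica synchronization is lagging

Solution

bash
# Make sure the delete is committed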
curl -X POST "http://localhost:8983/solr/mycore/update?commit=true"

# Check what the query condition actually matches
curl "http://localhost:8983/solr/mycore/select?q=category:tech"

11.2 Delete performance problems

Problem: deleting a large number of documents is slow

Solution

bash
# Use delete-by-query
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:old"}}'

# Optimize the index after the delete
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true"

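11.3 Disk space is not released

Problem: disk space is not freed after a delete

Cause: Lucene only marks documents as deleted; the data is physically removed when segments merge

Solution

bash
# Run an optimize to force a merge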
curl -X POST "http://localhost:8983/solr/mycore/update?optimize=true&maxSegments=1"

12. Safe Deletion

12.1 Confirm before deleting

bash
# Query first to confirm
curl "http://localhost:8983/solr/mycore/select?q=category:old&rows=0"

# Delete once the count looks right
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:old"}}'
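
The check-then-delete sequence can be automated with a guard that refuses to continue when the match count is unexpectedly large. This is a minimal sketch: `safe_to_delete` is a hypothetical helper, and the limit is whatever the caller considers a plausible upper bound.

```shell
#!/bin/sh
# safe_to_delete COUNT LIMIT: succeed only when COUNT is a non-negative
# integer that does not exceed LIMIT; fail on anything else.
safe_to_delete() {
  count=$1
  limit=$2
  case $count in
    ''|*[!0-9]*) return 2 ;;    # missing or non-numeric count: refuse
  esac
  [ "$count" -le "$limit" ]
}

# Example wiring (not executed here; assumes jq is installed):
#   count=$(curl -s "http://localhost:8983/solr/mycore/select?q=category:old&rows=0&wt=json" \
#             | jq '.response.numFound')
#   safe_to_delete "$count" 1000 && echo "ok to delete $count documents"
```

Treating an empty or non-numeric count as a failure means a failed query never leads to a delete.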

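12.2 Back up before deleting

bash
# Back up the matching documents (stored fields only; raise rows if more than 10000 match)
curl "http://localhost:8983/solr/mycore/select?q=category:old&fl=*&rows=10000&wt=json" > backup.json

# Delete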
curl -X POST "http://localhost:8983/solr/mycore/update" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "category:old"}}'

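12.3 Access control

xml
<!-- solrconfig.xml: route /update through a custom update chain (auth-chain must be defined separately). -->
<!-- Solr's built-in authentication and authorization are configured in security.json, not here. -->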
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">auth-chain</str>
  </lst>
</requestHandler>

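13. Summary

Comparison of delete methods:

Method            Use case             Performance
Delete by ID      Single document
Batch ID delete   A few documents
Delete by query   Many documents
Delete all        Emptying the index

Best practices:

  • Query first to confirm what a delete will match
  • Use delete-by-query for bulk deletes
  • Commit promptly after deleting
  • Optimize the index periodically
  • Back up important data first

Next, let's learn about querying documents!

Last updated: 2026-03-27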