遍历操作 #

一、遍历概述 #

1.1 什么是图遍历 #

图遍历是指从图中的一个或多个顶点出发，沿着边访问其他顶点的过程。

text

遍历类型：
├── 深度优先遍历（DFS）
├── 广度优先遍历（BFS）
├── 路径遍历
└── 模式匹配

1.2 遍历方向 #

text

遍历方向：
├── Out（出方向）：从顶点沿出边遍历
├── In（入方向）：从顶点沿入边遍历
└── Both（双向）：两个方向都遍历

二、基本遍历 #

2.1 出边遍历 #

gremlin

// 遍历所有出边连接的顶点
g.V('1').out()

// 遍历指定标签的出边顶点
g.V('1').out('knows')

// 遍历多个标签的出边顶点
g.V('1').out('knows', 'follows')

// 获取出边
g.V('1').outE()

// 获取指定标签的出边
g.V('1').outE('knows')

// 从出边获取顶点
g.V('1').outE().inV()  // 等同于 g.V('1').out()

2.2 入边遍历 #

gremlin

// 遍历所有入边连接的顶点
g.V('1').in()

// 遍历指定标签的入边顶点
g.V('1').in('knows')

// 遍历多个标签的入边顶点
g.V('1').in('knows', 'follows')

// 获取入边
g.V('1').inE()

// 获取指定标签的入边
g.V('1').inE('knows')

// 从入边获取顶点
g.V('1').inE().outV()  // 等同于 g.V('1').in()

2.3 双向遍历 #

gremlin

// 遍历所有边连接的顶点
g.V('1').both()

// 遍历指定标签的双向顶点
g.V('1').both('knows')

// 获取所有边
g.V('1').bothE()

// 获取指定标签的所有边
g.V('1').bothE('knows')

2.4 遍历示例 #

gremlin

// Tom认识的人
g.V().has('name', 'Tom').out('knows').values('name')

// 认识Tom的人
g.V().has('name', 'Tom').in('knows').values('name')

// 与Tom有关系的人
g.V().has('name', 'Tom').both('knows').values('name')

三、多跳遍历 #

3.1 链式遍历 #

gremlin

// 两跳遍历
g.V('1').out('knows').out('knows')

// 三跳遍历
g.V('1').out('knows').out('knows').out('knows')

// 混合遍历
g.V('1').out('knows').in('knows')

3.2 repeat遍历 #

gremlin

// 重复N次
g.V('1').repeat(out('knows')).times(3)

// 直到条件
g.V('1').repeat(out('knows')).until(has('name', 'Jerry'))

// 发射所有中间结果
g.V('1').repeat(out('knows')).emit().times(3)

// 条件发射
g.V('1').repeat(out('knows')).emit(has('status', 'active')).times(3)

// emit在repeat前（包含起点）
g.V('1').emit().repeat(out('knows')).times(3)

3.3 until条件 #

gremlin

// 直到属性匹配
g.V('1').repeat(out()).until(has('name', 'Jerry'))

// 直到度数为0（叶子节点）
g.V('1').repeat(out()).until(outE().count().is(0))

// 直到深度限制
g.V('1').repeat(out()).until(loops().is(gt(5)))

// 组合条件
g.V('1').repeat(out()).until(
  or(
    has('name', 'Jerry'),
    loops().is(gt(5))
  )
)

3.4 loops() - 循环计数 #

gremlin

// 获取当前循环次数
g.V('1').repeat(out()).emit().times(5).as('v').
  project('vertex', 'depth').
    by('name').
    by(loops())

// 限制循环深度
g.V('1').repeat(out()).until(loops().is(gt(3)))

四、路径遍历 #

4.1 path() - 获取路径 #

gremlin

// 获取完整路径
g.V('1').out('knows').path()

// 路径属性
g.V('1').out('knows').path().by('name')

// 多跳路径
g.V('1').out('knows').out('knows').path().by('name')

// 带边的路径
g.V('1').outE('knows').inV().path()

4.2 simplePath() - 简单路径 #

gremlin

// 无环路径
g.V('1').repeat(out()).times(3).simplePath()

// 排除特定路径
g.V('1').out('knows').simplePath().where(without('1'))

4.3 cyclicPath() - 环路检测 #

gremlin

// 检测环路
g.V('1').repeat(out()).times(5).cyclicPath()

// 排除环路
g.V('1').repeat(out()).times(5).simplePath()

4.4 路径过滤 #

gremlin

// 路径长度过滤
g.V('1').out('knows').path().filter(count(local).is(eq(3)))

// 路径内容过滤
g.V('1').out('knows').path().filter(unfold().has('status', 'active'))

五、分支遍历 #

5.1 union() - 并集遍历 #

gremlin

// 合并多个遍历结果
g.V('1').union(
  out('knows'),
  out('follows')
).dedup()

// 合并不同方向
g.V('1').union(
  out('knows'),
  in('knows')
).dedup()

// 合并不同深度
g.V('1').union(
  out('knows'),
  out('knows').out('knows')
).dedup()

5.2 choose() - 条件遍历 #

gremlin

// if-else遍历
g.V().choose(
  has('type', 'person'),
  out('knows'),
  out('related')
)

// 多条件遍历
g.V().choose(has('status'))
  .option('active', out('knows'))
  .option('inactive', in('knows'))
  .option(none, both('knows'))

5.3 coalesce() - 首选遍历 #

gremlin

// 尝试多个遍历，返回第一个非空结果
g.V('1').coalesce(
  out('knows'),
  out('follows'),
  out('related')
)

// 带默认值
g.V('1').coalesce(
  out('knows'),
  constant([])  // 默认空列表
)

六、递归遍历 #

6.1 repeat递归 #

gremlin

// 递归遍历直到条件满足
g.V('1').repeat(out('parent')).until(has('name', 'Grandfather'))

// 递归遍历固定深度
g.V('1').repeat(out('parent')).times(3)

// 发射所有祖先
g.V('1').repeat(out('parent')).emit()

6.2 emit递归 #

gremlin

// 发射所有中间结果
g.V('1').emit().repeat(out('parent')).times(3)

// 条件发射
g.V('1').repeat(out('parent')).emit(has('gender', 'male'))

// 发射后继续遍历
g.V('1').emit().repeat(out('parent')).until(has('name', 'Grandfather'))

七、最短路径 #

7.1 BFS最短路径 #

gremlin

// 最短路径（广度优先）
g.V('1').repeat(out().simplePath()).until(hasId('2')).limit(1).path()

// 最短路径带属性
g.V().has('name', 'Tom').
  repeat(out().simplePath()).until(has('name', 'Jerry')).
  limit(1).
  path().by('name')

7.2 加权最短路径 #

gremlin

// 按权重计算最短路径
g.V('1').repeat(
  outE().sack(sum).by('weight').inV().simplePath()
).until(hasId('2')).
  limit(1).
  path().by('name').by('weight')

7.3 所有最短路径 #

gremlin

// 所有最短路径
g.V('1').repeat(out().simplePath()).until(hasId('2').or().loops().is(gt(5))).
  hasId('2').
  path()

// 限制路径数量
g.V('1').repeat(out().simplePath()).until(hasId('2')).limit(10).path()

八、子图遍历 #

8.1 subgraph() - 提取子图 #

gremlin

// 提取子图
g.V('1').repeat(outE().subgraph('subGraph').inV()).times(3).cap('subGraph')

// 提取路径子图
g.V('1').out('knows').outE().subgraph('edges').inV().cap('edges')

8.2 诱导子图 #

gremlin

// 获取顶点集的所有边
g.V().hasLabel('person').aggregate('vertices').
  bothE().where(bothV().where(within('vertices'))).aggregate('edges').
  cap('vertices', 'edges')

九、遍历优化 #

9.1 性能优化技巧 #

gremlin

// 使用标签过滤
g.V().hasLabel('person').out('knows')  // 好
g.V().out('knows')  // 慢

// 使用属性索引
g.V().has('userId', 'user_001').out('knows')  // 好

// 限制遍历深度
g.V('1').repeat(out()).times(3)  // 好
g.V('1').repeat(out()).until(...)  // 注意条件

// 使用limit
g.V('1').out('knows').limit(10)  // 好

// 避免重复计算
g.V().hasLabel('person').as('p').
  out('knows').as('f').
  select('p', 'f')  // 好

9.2 遍历策略 #

text

遍历策略：
├── 从小集合开始
├── 使用索引过滤
├── 限制遍历深度
├── 使用limit限制结果
└── 避免全图扫描

十、实际应用示例 #

10.1 社交网络遍历 #

gremlin

// 查找朋友的朋友
g.V().has('name', 'Tom').out('knows').out('knows').dedup()

// 查找共同好友
g.V().has('name', 'Tom').out('knows').
  where(out('knows').has('name', 'Jerry'))

// 查找N度人脉
g.V().has('name', 'Tom').repeat(out('knows')).times(3).dedup()

// 查找最短社交路径
g.V().has('name', 'Tom').
  repeat(out('knows').simplePath()).until(has('name', 'Jerry')).
  limit(1).path().by('name')

10.2 组织结构遍历 #

gremlin

// 查找所有下属
g.V().has('name', 'CEO').repeat(out('manages')).emit()

// 查找直属下属
g.V().has('name', 'CEO').out('manages')

// 查找上级链
g.V().has('name', 'Employee').repeat(in('manages')).emit()

// 查找同级同事
g.V().has('name', 'Tom').in('manages').out('manages').where(neq('Tom'))

10.3 知识图谱遍历 #

gremlin

// 查找相关概念
g.V().has('name', 'Graph Database').out('related').dedup()

// 查找概念层级
g.V().has('name', 'Graph Database').repeat(in('subClassOf')).emit()

// 查找实例
g.V().has('name', 'Database').repeat(out('subClassOf')).emit().out('instanceOf')

// 查找属性路径
g.V().has('name', 'Person').out('hasProperty').out('propertyType')

十一、总结 #

遍历操作要点：

操作	语法	说明
出边遍历	out()	沿出边遍历
入边遍历	in()	沿入边遍历
双向遍历	both()	双向遍历
重复遍历	repeat()	重复执行
路径遍历	path()	获取路径

最佳实践：

使用标签和属性过滤
限制遍历深度
使用limit限制结果
避免全图扫描
合理使用索引

下一步，让我们学习过滤与转换！