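Nodes and Vertices

1. Vertex Overview

1.1 What Is a Vertex

A vertex is the basic data entity in a graph database and represents a real-world object. In Neptune, a vertex can represent a person, a place, a thing, or a concept.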

text

Vertex components:
├── Unique identifier (ID)
├── Label
└── Property set (Properties)

1.2 Vertex Characteristics

text

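Vertex characteristics:
├── Unique ID, generated automatically unless you supply one
├── Can carry multiple labels
├── Can carry multiple properties
├── Can have multiple incoming and outgoing edges
└── Property values support several data types

1.3 Vertex Structure Diagram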

text

┌─────────────────────────┐
│       Person            │  ← label
│                         │
│  id: "1"                │  ← system ID
│  name: "Tom"            │  ← properties
│  age: 30                │
│  email: "tom@test.com"  │
└─────────────────────────┘
         │
         │ knows
         ▼
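
The three parts of a vertex (ID, label, properties) map naturally onto a small record type. The sketch below is a toy in-memory model for illustration only; the `Vertex` class is a hypothetical name and not part of Neptune or any Gremlin client.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Vertex:
    """Toy model of a property-graph vertex: ID, label, and properties."""
    id: str                                   # unique identifier
    label: str                                # classification, e.g. "Person"
    properties: Dict[str, Any] = field(default_factory=dict)


# The vertex drawn in the diagram above
tom = Vertex(
    id="1",
    label="Person",
    properties={"name": "Tom", "age": 30, "email": "tom@test.com"},
)

print(tom.label, tom.properties["name"])
```

Edges such as `knows` would be stored separately and refer to vertices by ID.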

2. Creating Vertices

2.1 Creating Vertices with Gremlin

Creating a basic vertex:

gremlin

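// Create an empty vertex (it gets the default label "vertex")
g.addV()

// Create a vertex with a label
g.addV('person')

// Create a vertex with one property
g.addV('person').property('name', 'Tom')

// Create a vertex with several properties
g.addV('person').
  property('name', 'Tom').
  property('age', 30).
  property('email', 'tom@test.com')

Creating vertices in bulk: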

gremlin

// Chain several addV() steps in one traversal
g.addV('person').property('name', 'Tom').
  addV('person').property('name', 'Jerry').
  addV('person').property('name', 'Mike')

// Inject a list, unfold it, and create one vertex per element
g.inject(['Tom', 'Jerry', 'Mike']).
  unfold().as('n').
  addV('person').
  property('name', select('n'))

Creating vertices from Python:

python

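from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

# Connect to the cluster; the endpoint below is a placeholder
conn = DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin', 'g')
g = traversal().withRemote(conn)

# Create one vertex and return it
g.addV('person').property('name', 'Tom').property('age', 30).next()

# Create several vertices, one request per name
names = ['Tom', 'Jerry', 'Mike']
for name in names:
    g.addV('person').property('name', name).iterate()

conn.close()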
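
Sending one request per vertex gets slow as the input grows. A common middle ground is to split the input into fixed-size batches and send each batch as a single traversal. The helper below is a minimal sketch of the splitting step only; the name `batched` and the batch size are assumptions, and the right size depends on your data and cluster.

```python
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:                      # emit the final, possibly shorter, batch
        yield batch


all_names = ["Tom", "Jerry", "Mike", "Anna", "Lee"]
batches = list(batched(all_names, 2))
print(batches)
```

Each inner list could then feed one `inject(...).unfold()` traversal instead of one request per name.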

2.2 Creating Vertices with SPARQL

Creating an RDF resource:

sparql

PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

INSERT DATA {
  ex:Tom a foaf:Person ;
         foaf:name "Tom" ;
         ex:age 30 ;
         foaf:mbox <mailto:tom@test.com> .
}

Creating several resources at once:

sparql

PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

INSERT DATA {
  ex:Tom a foaf:Person ; foaf:name "Tom" .
  ex:Jerry a foaf:Person ; foaf:name "Jerry" .
  ex:Mike a foaf:Person ; foaf:name "Mike" .
}

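3. Querying Vertices

3.1 Basic Queries

gremlin

// All vertices (a full scan; avoid on large graphs)
g.V()

// The vertex with a given ID
g.V('1')

// The vertices with any of several IDs
g.V('1', '2', '3')

// Limit the number of results
g.V().limit(10)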

3.2 Querying by Label

gremlin

// Vertices with a given label
g.V().hasLabel('person')

// Vertices with either label (the arguments are ORed)
g.V().hasLabel('person', 'employee')

// The label of every vertex
g.V().label()

3.3 Querying by Property

gremlin

// Exact match
g.V().has('name', 'Tom')

// Match on several properties
g.V().has('name', 'Tom').has('age', 30)

// Numeric comparison
g.V().has('age', gt(25))
g.V().has('age', gte(25))
g.V().has('age', lt(40))
g.V().has('age', lte(40))
g.V().has('age', inside(20, 40))  // exclusive: 20 < age < 40

// String matching (case-sensitive)
g.V().has('name', containing('Tom'))
g.V().has('name', startingWith('T'))
g.V().has('name', endingWith('m'))

// Membership in a list of values
g.V().has('name', within('Tom', 'Jerry', 'Mike'))
g.V().has('name', without('Tom', 'Jerry'))
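
The boundary behaviour of these predicates is easy to get wrong, so here is a small Python sketch of their semantics. These functions are toy stand-ins written for illustration, not the Gremlin client API; the sample data is made up.

```python
# Toy stand-ins for a few Gremlin predicates; each returns a test function.
def gt(bound):
    return lambda value: value > bound


def gte(bound):
    return lambda value: value >= bound


def inside(low, high):
    # Exclusive on both ends: low < value < high
    return lambda value: low < value < high


def within(*options):
    return lambda value: value in options


def starting_with(prefix):
    return lambda value: value.startswith(prefix)


def has(rows, key, predicate):
    """Keep rows that contain `key` and whose value passes `predicate`."""
    return [row for row in rows if key in row and predicate(row[key])]


def names_of(rows):
    return [row["name"] for row in rows]


people = [
    {"name": "Tom", "age": 30},
    {"name": "Jerry", "age": 20},
    {"name": "Mike", "age": 40},
]

# Jerry (20) and Mike (40) sit on the bounds, so inside(20, 40) drops them.
print(names_of(has(people, "age", inside(20, 40))))
print(names_of(has(people, "age", gte(20))))
```

If the bounds should be included, combine `gte` and `lte` rather than relying on `inside`.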

3.4 Complex Queries

gremlin

// Combine conditions
g.V().hasLabel('person').
  has('age', gt(25)).
  has('name', startingWith('T'))

// where(): vertices with at least one outgoing 'knows' edge
g.V().where(outE('knows').count().is(gt(0)))

// filter(): vertices with more than three property values
g.V().filter(
  __.properties().count().is(gt(3))
)

// Combine conditions with and() / or()
g.V().and(
  has('age', gt(25)),
  has('name', startingWith('T'))
)

g.V().or(
  has('name', 'Tom'),
  has('name', 'Jerry')
)

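4. Vertex Labels

4.1 What Labels Are

A label classifies a vertex. In Neptune, a single vertex can carry more than one label.

text

What labels do:
├── Classify vertices by type
├── Speed up query filtering
├── Support multiple labels per vertex
└── Naming convention: lowercase, underscore-separated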

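4.2 Adding Labels

gremlin

// Set the label at creation time
g.addV('person')

// Set several labels at creation time, using Neptune's '::' separator
g.addV('person::employee')

// Note: Gremlin cannot add a label to an existing vertex.
// Chaining addV(), as in g.V('1').addV('manager'), creates a new vertex instead.
// Record the extra role as a property (the 'role' key is only an example),
// or recreate the vertex with the labels you need.
g.V('1').property('role', 'manager')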
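
Neptune writes several labels as one string joined by `::`, and `hasLabel()` matches a vertex if any one of its labels is requested. The sketch below models that matching rule in plain Python; `split_labels` and `has_label` are hypothetical helper names used only for illustration.

```python
LABEL_DELIMITER = "::"   # Neptune's separator for multiple labels


def split_labels(label: str) -> set:
    """Split a stored label string such as 'person::employee' into its parts."""
    return set(label.split(LABEL_DELIMITER))


def has_label(vertex_label: str, *wanted: str) -> bool:
    """True if the vertex carries at least one of the wanted labels."""
    return not split_labels(vertex_label).isdisjoint(wanted)


stored = "person::employee"
print(has_label(stored, "person"))               # one of its labels
print(has_label(stored, "manager", "employee"))  # any match is enough
print(has_label(stored, "person::employee"))     # the joined form is not a label
```

The last call shows why queries should pass labels one at a time rather than the joined string.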

4.3 Querying Labels

gremlin

// Label of one vertex
g.V('1').label()

// All distinct labels
g.V().label().dedup()

// Count vertices per label
g.V().groupCount().by(label)

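5. Vertex Properties

5.1 Property Operations

gremlin

// Add one property
g.V('1').property('name', 'Tom')

// Add several properties
g.V('1').
  property('name', 'Tom').
  property('age', 30)

// Add several values under one key (Neptune supports set, not list, cardinality)
g.V('1').
  property(set, 'email', 'tom@test.com').
  property(set, 'email', 'tom@work.com')

// Get the value(s) of one property
g.V('1').values('name')

// Get all property values
g.V('1').values()

// Get the property keys
g.V('1').properties().key().dedup()

// Get properties as a map of key -> list of values
g.V('1').valueMap()

// Get the property objects
g.V('1').properties()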

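5.2 Property Cardinality

text

Cardinality types:
├── single: one value per key; a new value replaces the old one
├── set: several distinct values per key (Neptune's default)
└── list: several values, duplicates allowed (defined by TinkerPop, not supported by Neptune)

gremlin

// Single-valued property
g.V('1').property(single, 'name', 'Tom')

// Set-valued property
g.V('1').property(set, 'tag', 'important')
g.V('1').property(set, 'tag', 'urgent')

// Several phone numbers: use set, because Neptune does not support list cardinality
g.V('1').property(set, 'phone', '123-456-7890')
g.V('1').property(set, 'phone', '098-765-4321')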
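
Cardinality decides what a second write to the same key does. Neptune supports `single` and `set` and uses `set` when none is given, so a plain `property()` call adds a value instead of replacing one. The toy function below mimics those two behaviours on a dictionary; `write_property` is a hypothetical name and this is a sketch of the semantics, not Neptune's implementation.

```python
from typing import Any, Dict, List


def write_property(props: Dict[str, List[Any]], key: str, value: Any,
                   cardinality: str = "set") -> None:
    """Apply one property write to `props` (a dict of key -> list of values)."""
    values = props.setdefault(key, [])
    if cardinality == "single":
        values[:] = [value]            # replace every existing value
    elif cardinality == "set":
        if value not in values:        # ignore duplicates
            values.append(value)
    else:
        raise ValueError(f"unsupported cardinality: {cardinality}")


props: Dict[str, List[Any]] = {}
write_property(props, "tag", "important")
write_property(props, "tag", "urgent")
write_property(props, "tag", "urgent")        # duplicate, ignored
write_property(props, "age", 30)
write_property(props, "age", 31)              # set: 31 is added next to 30
ages_after_set = list(props["age"])
write_property(props, "age", 31, "single")    # single: 31 replaces both values
print(ages_after_set, props["age"])
```

This is why updates to scalar fields such as `age` should name `single` explicitly.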

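5.3 Updating Properties

gremlin

// Replace a value; without `single`, the default set cardinality would keep the old value too
g.V('1').property(single, 'age', 31)

// Conditional update: filter first, then write
g.V('1').has('age', 30).property(single, 'age', 31)

// Upsert: create the vertex if it is missing, then set the property
g.V('1').fold().
  coalesce(unfold(), addV('person').property(T.id, '1')).
  property(single, 'age', 31)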

5.4 Deleting Properties

gremlin

// Delete one property
g.V('1').properties('age').drop()

// Delete all properties
g.V('1').properties().drop()

// Delete one value of a multi-valued property
g.V('1').properties('email').hasValue('old@test.com').drop()

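6. Vertex IDs

6.1 System IDs

gremlin

// Get a vertex's ID
g.V('1').id()

// Look up by ID
g.V('1')

// Look up several IDs at once
g.V('1', '2', '3')

// ID type:
// Neptune vertex IDs are strings; if you do not supply one, Neptune generates a UUID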

6.2 Custom IDs

gremlin

// Neptune accepts a user-supplied ID at creation time
g.addV('person').property(T.id, 'user_001').property('name', 'Tom')

// Look up by the custom ID
g.V('user_001')
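
If IDs are derived from data you already have, the same input always maps to the same vertex, which makes lookups and repeated loads predictable. The helper below is one possible naming scheme, shown as an assumption rather than a Neptune convention; `make_vertex_id` is a hypothetical name.

```python
import uuid


def make_vertex_id(label: str, natural_key: str = "") -> str:
    """Return '<label>_<natural_key>', or a random UUID string when no key is given."""
    if natural_key:
        return f"{label}_{natural_key}"
    return str(uuid.uuid4())


user_id = make_vertex_id("user", "001")   # matches the 'user_001' ID used above
random_id = make_vertex_id("user")        # fallback when there is no natural key
print(user_id)
```

Whatever scheme you choose, IDs must stay unique across all vertices, not just within one label.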

7. Deleting Vertices

7.1 Basic Deletion

gremlin

// Delete one vertex
g.V('1').drop()

// Delete several vertices
g.V('1', '2', '3').drop()

// Conditional delete
g.V().has('status', 'inactive').drop()

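7.2 Cascading Deletion

gremlin

// Dropping a vertex also removes every edge attached to it;
// Neptune does this automatically
g.V('1').drop()

// Equivalent explicit form: drop the edges first, then the vertex
g.V('1').bothE().drop()
g.V('1').drop()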
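
The cascade exists because an edge cannot point at a vertex that no longer exists. The toy graph below shows the effect on a tiny edge list; `ToyGraph` is a hypothetical class written for illustration and has nothing to do with how Neptune stores data.

```python
class ToyGraph:
    """Minimal in-memory graph: a set of vertex IDs and a list of edges."""

    def __init__(self):
        self.vertices = set()
        self.edges = []            # (from_id, label, to_id) tuples

    def add_vertex(self, vid):
        self.vertices.add(vid)

    def add_edge(self, src, label, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must exist")
        self.edges.append((src, label, dst))

    def drop_vertex(self, vid):
        """Remove the vertex and every edge that starts or ends at it."""
        self.vertices.discard(vid)
        self.edges = [e for e in self.edges if vid not in (e[0], e[2])]


graph = ToyGraph()
for vid in ("1", "2", "3"):
    graph.add_vertex(vid)
graph.add_edge("1", "knows", "2")
graph.add_edge("3", "knows", "1")
graph.add_edge("2", "knows", "3")

graph.drop_vertex("1")
print(graph.vertices, graph.edges)
```

Only the edge between vertices 2 and 3 survives, because it never touched the dropped vertex.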

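8. Traversing from a Vertex

8.1 Outgoing Edges

gremlin

// Vertices reached over any outgoing edge
g.V('1').out()

// Vertices reached over outgoing edges with a given label
g.V('1').out('knows')

// Vertices reached over outgoing edges with any of several labels
g.V('1').out('knows', 'follows')

// The outgoing edges themselves
g.V('1').outE()

// Outgoing edges with a given label
g.V('1').outE('knows')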

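8.2 Incoming Edges

gremlin

// Vertices reached over any incoming edge
g.V('1').in()

// Vertices reached over incoming edges with a given label
g.V('1').in('knows')

// The incoming edges themselves
g.V('1').inE()

// Incoming edges with a given label
g.V('1').inE('knows')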

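8.3 Both Directions

gremlin

// Vertices reached over any edge, in either direction
g.V('1').both()

// Vertices reached over edges with a given label, in either direction
g.V('1').both('knows')

// All edges attached to the vertex
g.V('1').bothE()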

9. Vertex Statistics

9.1 Counting

gremlin

// Count all vertices
g.V().count()

// Count vertices with a given label
g.V().hasLabel('person').count()

// Count vertices per label
g.V().groupCount().by(label)

// Count vertices with a given property value
g.V().has('status', 'active').count()

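9.2 Degree Statistics

gremlin

// Out-degree
g.V('1').outE().count()

// In-degree
g.V('1').inE().count()

// Total degree
g.V('1').bothE().count()

// Ten highest-degree vertices (computes the degree of every vertex, so it is costly on large graphs)
g.V().order().by(bothE().count(), desc).limit(10)

// Isolated vertices (no edges at all)
g.V().where(bothE().count().is(0))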
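
To make the definitions concrete, the same degree figures can be computed by hand from an edge list. The sample vertices and edges below are made up for illustration.

```python
from collections import Counter

vertices = ["1", "2", "3", "4"]
edges = [("1", "2"), ("1", "3"), ("2", "1")]   # (from_id, to_id) pairs

out_degree = Counter(src for src, _ in edges)   # edges leaving each vertex
in_degree = Counter(dst for _, dst in edges)    # edges arriving at each vertex

# Total degree counts every attached edge, whatever its direction.
total_degree = {v: out_degree[v] + in_degree[v] for v in vertices}

# Highest-degree vertices first, like order().by(bothE().count(), desc)
by_degree = sorted(vertices, key=lambda v: total_degree[v], reverse=True)

# Vertices with no edges at all
isolated = [v for v in vertices if total_degree[v] == 0]

print(total_degree, by_degree, isolated)
```

Vertex 4 has no edges, so it is the only isolated vertex; vertex 1 has the highest total degree.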

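10. Best Practices

10.1 Label Design

text

Label design principles:
├── Name labels with nouns: person, product, order
├── Use lowercase and underscores
├── Keep the meaning clear
├── Avoid too many labels
└── Design around your query patterns

10.2 Property Design

text

Property design principles:
├── Give each vertex a unique identifying property
├── Name properties in lowerCamelCase or snake_case
├── Avoid very large property values
├── Use multi-valued properties sparingly
└── Keep naming consistent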

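10.3 Performance Tips

gremlin

// Filter by label with hasLabel()
g.V().hasLabel('person')  // good
g.V().filter(label().is('person'))  // avoid

// Filter on property values; Neptune indexes every property automatically
g.V().has('name', 'Tom')  // good

// Bound the traversal depth
g.V().repeat(out()).times(3)  // good
g.V().repeat(out()).until(has('name', 'Tom'))  // caution: may never stop on a cyclic graph

// Cap the result size with limit()
g.V().limit(100)  // good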
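
The warning about `until()` comes down to cycles: if the stop condition is never met, a walk around a loop can go on indefinitely. Two standard guards are a depth cap and a record of visited vertices. The breadth-first sketch below shows both on a toy adjacency map; it illustrates the guards and is not a re-implementation of Gremlin's `repeat()` step.

```python
from collections import deque


def reachable_within(adjacency, start, max_depth):
    """Return the vertices reachable from `start` in at most `max_depth` out-steps."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == max_depth:          # depth cap: do not expand further
            continue
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in seen:    # visited set: never re-enter a cycle
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


# a -> b -> c -> a forms a cycle; c also leads on to d -> e
adjacency = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"]}

print(sorted(reachable_within(adjacency, "a", 2)))
print(sorted(reachable_within(adjacency, "a", 10)))
```

In Gremlin, the corresponding guards are `times(n)` for the cap and `simplePath()` to discard paths that revisit a vertex.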

11. Practical Examples

11.1 User Vertices

gremlin

// Create a user (the timestamp is an example value)
g.addV('user').
  property('userId', 'user_001').
  property('username', 'tom_hanks').
  property('email', 'tom@example.com').
  property('createdAt', datetime('2024-01-15T09:30:00Z')).
  property('status', 'active')

// The 20 most recently created active users
g.V().hasLabel('user').
  has('status', 'active').
  order().by('createdAt', desc).
  limit(20)

11.2 Product Vertices

gremlin

// Create a product (tags use set cardinality; Neptune does not support list)
g.addV('product').
  property('productId', 'prod_001').
  property('name', 'iPhone 15').
  property('price', 999.99).
  property('category', 'Electronics').
  property(set, 'tags', 'phone').
  property(set, 'tags', 'apple').
  property('stock', 100)

// Electronics priced strictly between 500 and 1500
g.V().hasLabel('product').
  has('category', 'Electronics').
  has('price', inside(500, 1500))

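11.3 Document Vertices

gremlin

// Create a document (the timestamp is an example value)
g.addV('document').
  property('docId', 'doc_001').
  property('title', 'Neptune Guide').
  property('content', '...').
  property('author', 'Tom').
  property('createdAt', datetime('2024-01-15T09:30:00Z')).
  property('status', 'published')

// Published documents created after a cutoff;
// someDate is a placeholder, for example datetime('2024-01-01T00:00:00Z')
g.V().hasLabel('document').
  has('status', 'published').
  has('createdAt', gt(someDate))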

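12. Summary

Key vertex operations:

Operation	Gremlin syntax	Description
Create	addV(label)	Create a new vertex
Query	V() / has()	Find vertices
Update	property()	Add or update properties
Delete	drop()	Delete vertices
Traverse	out()/in()/both()	Follow relationships

Best practices:

Design a coherent label scheme for your vertices
Give each vertex a unique identifying property
Keep property names consistent
Filter on labels and property values early so that Neptune's built-in indexes can do the work
Avoid putting too many properties on one vertex

Next, we will look at edges and relationships.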