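Neptune Environment Setup #

1. Environment Setup Overview #

1.1 Setup Options #

text
Environment setup options:
├── AWS cloud environment
│   ├── AWS Management Console
│   ├── AWS CLI
│   ├── AWS CDK
│   └── Terraform
├── Local development environment
│   ├── Docker containers
│   └── Neptune Notebook
└── Serverless environment
    └── Neptune Serverless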

1.2 Requirements #

text
Basic requirements:
├── AWS account
├── VPC network
├── IAM permissions
├── Security group configuration
└── Client tools

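2. Creating a Cluster in the AWS Console #

2.1 Create the VPC #

First, create the VPC network environment:

text
VPC configuration:
├── VPC CIDR: 10.0.0.0/16
├── Public subnets: 10.0.0.0/24, 10.0.1.0/24
├── Private subnets: 10.0.2.0/24, 10.0.3.0/24
├── NAT gateway (optional)
└── Security groups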
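
A subnet plan like the one above can be sanity-checked before anything is created. The following is a minimal sketch using only Python's standard ipaddress module: it confirms that every subnet lies inside the VPC CIDR and that no two subnets overlap. The function name validate_subnets is an illustrative choice, not part of any AWS tooling.

```python
import ipaddress
from itertools import combinations


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems found in a VPC subnet plan (empty if none)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []

    # Every subnet must be contained in the VPC address range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")

    # No two subnets may share addresses.
    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            problems.append(f"{a} overlaps {b}")

    return problems


plan = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
print(validate_subnets("10.0.0.0/16", plan))  # prints []
```

An empty list means the plan is consistent; each string in a non-empty list names one conflict to fix.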

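2.2 Create the Neptune Cluster #

Step 1: Open the Neptune console

text
AWS Console → Databases → Amazon Neptune → Create database

Step 2: Choose the engine type

text
Engine options:
├── Neptune (standard, provisioned instances)
└── Neptune Serverless

Step 3: Configure the instances

text
Instance configuration:
├── Instance class: db.r5.large (development) / db.r5.xlarge (production)
├── Multi-AZ deployment: yes (production)
└── Number of read replicas: as needed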

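Step 4: Configure storage

text
Storage configuration:
├── Storage type: Standard / I/O-Optimized
├── Allocated storage: nothing to provision; the cluster volume grows with the data
└── Auto scaling: built in, no setting to enable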

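Step 5: Configure networking

text
Network configuration:
├── VPC: select the VPC created earlier
├── Subnet group: select the private subnets
├── Security group: configure access rules
└── Publicly accessible: no (recommended)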

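Step 6: Configure authentication

text
Authentication options:
├── IAM database authentication: enable (recommended)
├── Password authentication: not available (Neptune has no database username/password login)
└── IAM roles: attach the roles the cluster needs (for example, for bulk loading from S3)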

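2.3 Create a Parameter Group #

text
Parameter group configuration:
├── Create a custom parameter group
├── Configure the query timeout
├── Configure cache parameters
└── Apply it to the cluster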

3. Creating a Cluster with the AWS CLI #

3.1 Create the Subnet Group #

bash
aws neptune create-db-subnet-group \
  --db-subnet-group-name my-neptune-subnet-group \
  --db-subnet-group-description "My Neptune subnet group" \
  --subnet-ids subnet-xxx subnet-yyy \
  --tags Key=Name,Value=NeptuneSubnetGroup

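3.2 Create the Parameter Groups #

The parameter group family must match the engine version (neptune1.3 for engine 1.3.0.0). The cluster parameter group is the one referenced when the cluster is created and when cluster-level parameters are changed.

bash
aws neptune create-db-parameter-group \
  --db-parameter-group-name my-neptune-params \
  --db-parameter-group-family neptune1.3 \
  --description "My Neptune parameter group"

aws neptune create-db-cluster-parameter-group \
  --db-cluster-parameter-group-name my-neptune-cluster-params \
  --db-parameter-group-family neptune1.3 \
  --description "My Neptune cluster parameter group"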

3.3 Create the Cluster #

bash
# Neptune does not use a master username or password; access is controlled with IAM.
# Without --kms-key-id, the account's default KMS key encrypts the storage.
aws neptune create-db-cluster \
  --db-cluster-identifier my-neptune-cluster \
  --engine neptune \
  --engine-version 1.3.0.0 \
  --db-subnet-group-name my-neptune-subnet-group \
  --vpc-security-group-ids sg-xxx \
  --db-cluster-parameter-group-name my-neptune-cluster-params \
  --backup-retention-period 7 \
  --preferred-backup-window "03:00-04:00" \
  --preferred-maintenance-window "sun:04:00-sun:05:00" \
  --storage-encrypted \
  --enable-iam-database-authentication \
  --enable-cloudwatch-logs-exports audit
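
The same cluster can also be created from Python. The following is a minimal sketch, assuming boto3 is installed and AWS credentials and a region are configured. The helper names cluster_kwargs and create_cluster are hypothetical; the keyword names are the parameters of boto3's Neptune create_db_cluster call. Building the arguments separately keeps them easy to inspect before any API call is made.

```python
def cluster_kwargs(cluster_id, subnet_group, security_group_ids,
                   engine_version="1.3.0.0", backup_days=7):
    """Build keyword arguments for boto3's Neptune create_db_cluster call."""
    if not 1 <= backup_days <= 35:
        raise ValueError("backup retention must be between 1 and 35 days")
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "neptune",
        "EngineVersion": engine_version,
        "DBSubnetGroupName": subnet_group,
        "VpcSecurityGroupIds": list(security_group_ids),
        "BackupRetentionPeriod": backup_days,
        "StorageEncrypted": True,
        "EnableIAMDatabaseAuthentication": True,
        "EnableCloudwatchLogsExports": ["audit"],
    }


def create_cluster(**kwargs):
    """Create the cluster; boto3 is imported here so the builder has no dependencies."""
    import boto3  # third-party; needs AWS credentials and a configured region
    return boto3.client("neptune").create_db_cluster(**kwargs)


kwargs = cluster_kwargs("my-neptune-cluster", "my-neptune-subnet-group", ["sg-xxx"])
print(kwargs["DBClusterIdentifier"])  # prints my-neptune-cluster
# create_cluster(**kwargs)  # uncomment to make the real API call
```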

3.4 Create the Primary Instance #

bash
aws neptune create-db-instance \
  --db-instance-identifier my-neptune-primary \
  --db-instance-class db.r5.large \
  --engine neptune \
  --db-cluster-identifier my-neptune-cluster \
  --db-parameter-group-name my-neptune-params

3.5 Create a Read Replica #

bash
aws neptune create-db-instance \
  --db-instance-identifier my-neptune-replica-1 \
  --db-instance-class db.r5.large \
  --engine neptune \
  --db-cluster-identifier my-neptune-cluster

3.6 Check the Cluster Status #

bash
aws neptune describe-db-clusters \
  --db-cluster-identifier my-neptune-cluster
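
Cluster creation takes several minutes, so scripts usually poll the status rather than check it once. The following is a minimal sketch of such a polling loop. The names wait_until_available and neptune_cluster_status are hypothetical; the status source is passed in as a function so the loop can be exercised without a real cluster, and boto3 (third-party) is imported only when the real lookup is used.

```python
import time


def wait_until_available(get_status, timeout_s=1800, interval_s=30, sleep=time.sleep):
    """Poll get_status() until it returns 'available' or the timeout elapses.

    Returns True once the cluster is available, False on timeout.
    """
    waited = 0
    while True:
        if get_status() == "available":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


def neptune_cluster_status(cluster_id):
    """Read the cluster status with boto3 (imported lazily)."""
    import boto3
    response = boto3.client("neptune").describe_db_clusters(
        DBClusterIdentifier=cluster_id
    )
    return response["DBClusters"][0]["Status"]


# Offline demonstration with a stand-in status source instead of a real cluster.
fake_statuses = iter(["creating", "creating", "available"])
print(wait_until_available(lambda: next(fake_statuses), sleep=lambda s: None))  # prints True
```

Against a real cluster, the call would look like wait_until_available(lambda: neptune_cluster_status("my-neptune-cluster")).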

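4. Local Development Environment #

4.1 Docker Development Environment #

Run Apache TinkerPop Gremlin Server in Docker as a local stand-in for Neptune's Gremlin endpoint. It is not Neptune itself, so Neptune-specific features and SPARQL are not available. The mounted ./conf directory must contain the gremlin-server-neptune.yaml file that the command refers to: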

yaml
version: '3.8'
services:
  gremlin-server:
    image: tinkerpop/gremlin-server:3.6.2
    ports:
      - "8182:8182"
    volumes:
      - ./conf:/opt/gremlin-server/conf
    command: conf/gremlin-server-neptune.yaml

  gremlin-console:
    image: tinkerpop/gremlin-console:3.6.2
    stdin_open: true
    tty: true
    depends_on:
      - gremlin-server

Start the services:

bash
docker-compose up -d

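4.2 Connect to Gremlin Server #

bash
# The container name depends on the Compose project name (here: neptune).
docker exec -it neptune_gremlin-console_1 bin/gremlin.sh

# Inside the container, conf/remote.yaml must list gremlin-server as the host, not localhost.
gremlin> :remote connect tinkerpop.server conf/remote.yaml
gremlin> :remote console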

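4.3 Neptune Notebook Environment #

Connect to Neptune from a SageMaker notebook instance:

bash
# Create the notebook instance.
# To reach Neptune, also place it in the cluster's VPC (--subnet-id, --security-group-ids).
aws sagemaker create-notebook-instance \
  --notebook-instance-name neptune-notebook \
  --instance-type ml.t3.medium \
  --role-arn arn:aws:iam::xxx:role/service-role/AmazonSageMaker-ExecutionRole

# Lifecycle configuration script (a separate file, attached to the notebook instance)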
#!/bin/bash
set -e
sudo -u ec2-user -i <<'EOF'
source /home/ec2-user/anaconda3/bin/activate JupyterSystemEnv
pip install gremlinpython
pip install SPARQLWrapper
EOF

5. Connection Configuration #

5.1 Get the Connection Endpoints #

bash
# Get the cluster (writer) endpoint
aws neptune describe-db-clusters \
  --db-cluster-identifier my-neptune-cluster \
  --query 'DBClusters[0].Endpoint' \
  --output text

# Get the cluster reader endpoint
aws neptune describe-db-clusters \
  --db-cluster-identifier my-neptune-cluster \
  --query 'DBClusters[0].ReaderEndpoint' \
  --output text
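
The endpoint returned by these commands is a bare host name; the Gremlin, SPARQL, and status URLs used later in this guide are all derived from it. The following is a minimal sketch of that derivation, assuming the default port 8182. The function name neptune_urls is an illustrative choice.

```python
def neptune_urls(endpoint, port=8182):
    """Return the Gremlin, SPARQL, and status URLs for a Neptune endpoint host."""
    host = endpoint.strip()  # CLI text output ends with a newline
    if not host or "/" in host or ":" in host:
        raise ValueError("expected a bare host name without scheme, port, or path")
    return {
        "gremlin": f"wss://{host}:{port}/gremlin",
        "sparql": f"https://{host}:{port}/sparql",
        "status": f"https://{host}:{port}/status",
    }


urls = neptune_urls("my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com\n")
print(urls["gremlin"])
```

Use the writer endpoint for updates and the reader endpoint for read-only queries.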

5.2 Connecting with Gremlin #

Python example:

python
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

connection = DriverRemoteConnection(
    'wss://my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/gremlin',
    'g'
)

g = traversal().withRemote(connection)

result = g.V().limit(10).toList()
print(result)

connection.close()

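JavaScript example:

javascript
const gremlin = require('gremlin');

const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const traversal = gremlin.process.AnonymousTraversalSource.traversal;

// CommonJS modules cannot use top-level await, so the query runs inside an async function.
async function main() {
  const connection = new DriverRemoteConnection(
    'wss://my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/gremlin'
  );

  try {
    const g = traversal().withRemote(connection);
    const result = await g.V().limit(10).toList();
    console.log(result);
  } finally {
    // Always release the WebSocket, even if the query fails.
    await connection.close();
  }
}

main().catch(console.error);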

5.3 Connecting with SPARQL #

Python example:

python
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = 'https://my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/sparql'

sparql = SPARQLWrapper(endpoint)
sparql.setQuery("""
    PREFIX ex: <http://example.org/>
    SELECT ?s ?p ?o
    WHERE {
        ?s ?p ?o
    }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
print(results)

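JavaScript example:

javascript
// Uses the global fetch built into Node.js 18 and later.
const endpoint = 'https://my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/sparql';

const query = `
  PREFIX ex: <http://example.org/>
  SELECT ?s ?p ?o
  WHERE {
    ?s ?p ?o
  }
  LIMIT 10
`;

// CommonJS modules cannot use top-level await, so the request runs inside an async function.
async function main() {
  const response = await fetch(`${endpoint}?query=${encodeURIComponent(query)}`, {
    headers: {
      'Accept': 'application/sparql-results+json'
    }
  });
  if (!response.ok) {
    throw new Error(`SPARQL request failed with HTTP ${response.status}`);
  }

  const results = await response.json();
  console.log(results);
}

main().catch(console.error);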

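5.4 Connecting with IAM Authentication #

python
import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

region = 'us-east-1'
service = 'neptune-db'
url = 'wss://my-neptune-cluster.cluster-xxx.us-east-1.neptune.amazonaws.com:8182/gremlin'

credentials = boto3.Session().get_credentials()
if credentials is None:
    raise RuntimeError('No AWS credentials found')

# Sign the WebSocket handshake request with Signature Version 4.
# gremlinpython does not accept a requests-style auth object, so the
# signed headers are passed to the connection explicitly.
request = AWSRequest(method='GET', url=url, data=None)
SigV4Auth(credentials.get_frozen_credentials(), service, region).add_auth(request)

connection = DriverRemoteConnection(
    url,
    'g',
    headers=dict(request.headers.items())
)

g = traversal().withRemote(connection)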

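6. Security Group Configuration #

6.1 Inbound Rules #

text
Inbound rules:
├── Neptune security group: port 8182 (Gremlin and SPARQL, with or without IAM authentication)
│   ├── Source: application server security group
│   └── Source: VPN or bastion host (administrative access)
└── Bastion host security group: port 22 (SSH management)
    └── Source: management network

6.2 Outbound Rules #

text
Outbound rules:
├── Allow all outbound traffic (default)
└── Or restrict as needed

6.3 Configuration Example #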

bash
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol tcp \
  --port 8182 \
  --source-group sg-yyy

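7. Parameter Configuration #

7.1 Key Parameters #

| Parameter | Default | Recommended | Description |
| --- | --- | --- | --- |
| neptune_query_timeout | 120000 | As needed | Query timeout (milliseconds) |
| neptune_lab_mode | | As needed | Experimental features |
| neptune_enable_slow_query_log | disable | info | Slow-query logging |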

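7.2 Modify a Parameter #

bash
# pending-reboot is accepted for both static and dynamic parameters;
# the new value takes effect after the instances are rebooted.
aws neptune modify-db-cluster-parameter-group \
  --db-cluster-parameter-group-name my-neptune-cluster-params \
  --parameters "ParameterName=neptune_query_timeout,ParameterValue=300000,ApplyMethod=pending-reboot"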
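
Because the timeout is expressed in milliseconds, unit mistakes are easy to make when the value is set from a script. The following is a minimal sketch that builds one entry for the Parameters list accepted by boto3's modify_db_cluster_parameter_group call. The helper name timeout_parameter and its seconds-based interface are assumptions made for illustration.

```python
def timeout_parameter(seconds, apply_method="pending-reboot"):
    """Build one Parameters entry that sets neptune_query_timeout."""
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    if apply_method not in ("immediate", "pending-reboot"):
        raise ValueError("apply_method must be 'immediate' or 'pending-reboot'")
    return {
        "ParameterName": "neptune_query_timeout",
        "ParameterValue": str(int(seconds * 1000)),  # the parameter is in milliseconds
        "ApplyMethod": apply_method,
    }


param = timeout_parameter(300)
print(param["ParameterValue"])  # prints 300000
```

The resulting dictionary would be passed as Parameters=[param], together with DBClusterParameterGroupName, to the boto3 Neptune client.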

8. Verifying the Connection #

8.1 Gremlin Verification #

python
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

connection = DriverRemoteConnection(
    'wss://your-cluster-endpoint:8182/gremlin',
    'g'
)

g = traversal().withRemote(connection)

result = g.V().count().next()
print(f"Total vertices: {result}")

connection.close()

8.2 SPARQL Verification #

python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper('https://your-cluster-endpoint:8182/sparql')
sparql.setQuery('SELECT (COUNT(*) AS ?count) WHERE { ?s ?p ?o }')
sparql.setReturnFormat(JSON)

result = sparql.query().convert()
print(f"Total triples: {result['results']['bindings'][0]['count']['value']}")

8.3 Health Check #

bash
# The status endpoint is read with a GET request.
curl https://your-cluster-endpoint:8182/status
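
The status endpoint returns a JSON document, which makes the health check easy to automate. The following is a minimal sketch that interprets a response body, assuming the instance reports "status": "healthy" when it is ready. The function name is_healthy is hypothetical, and the sample body is illustrative and shorter than a real response.

```python
import json


def is_healthy(status_body):
    """Return True if a /status response body reports a healthy instance."""
    try:
        payload = json.loads(status_body)
    except (TypeError, ValueError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


# Illustrative response body; a real one contains more fields.
sample = '{"status": "healthy", "role": "writer"}'
print(is_healthy(sample))  # prints True
```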

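9. Common Problems #

9.1 Connection Timeouts #

text
Troubleshooting steps:
├── Check the security group configuration
├── Check the VPC routing
├── Check the network ACLs
├── Check the instance status
└── Check the port configuration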
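
Most of these checks come down to whether a TCP connection to port 8182 can be opened from the client's network location. The following is a minimal sketch of such a probe using only the standard library. The function name can_connect is an illustrative choice, and the demonstration runs against a throwaway local listener rather than a real cluster.

```python
import socket


def can_connect(host, port=8182, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Offline demonstration against a temporary local listener.
with socket.socket() as server:
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    demo_port = server.getsockname()[1]
    print(can_connect("127.0.0.1", demo_port))  # prints True
```

Run can_connect("your-cluster-endpoint") from the application host. With the recommended private setup, a False result from inside the VPC usually points to security group, routing, or network ACL rules.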

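9.2 Authentication Failures #

text
Troubleshooting steps:
├── Check the IAM user permissions
├── Check the IAM role configuration
├── Check the authentication method
├── Check that the credentials are valid
└── Check the signature version

9.3 SSL Certificate Issues #

python
# Disable SSL certificate verification (development only, never in production).
# This relies on a private attribute of the ssl module and affects every
# HTTPS connection made by the process.
import ssl
ssl._create_default_https_context = ssl._create_unverified_context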

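10. Summary #

Key points of the environment setup:

| Step | Description |
| --- | --- |
| 1. Create the VPC | Set up the network environment |
| 2. Create the cluster | Choose the instance type |
| 3. Configure security | Security groups, IAM |
| 4. Test the connection | Verify connectivity |

Next, let's look at the core concepts!

Last updated: 2026-03-27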