Building a RAG Knowledge Base with Spring AI + pgvector: Singapore Math AI Tutor, Phase 2 in Practice
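Recap
Phase 1 built a multi-agent problem-solving pipeline on Java 25 + Spring AI. At that point, though, the Planner Agent ran "bare", with no reference question bank behind it. The core problem for Phase 2: make the agents answer with "knowledge" in hand.
Concrete goals:
- Import 40 PSLE past-paper and practice questions into a vector database
- Run RAG retrieval before solving, and use the similar questions found as prompt context
- Filter by grade, so a P5 request never sees P6-level reference questions
- Cache AI responses for identical questions in Redis to avoid repeated LLM calls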
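Overall architecture changes
The Phase 1 data flow was: request → Planner → CPA + Persona → response.
Phase 2 adds two layers:
- In front: a RAG retrieval layer (RagRetrievalService)
- Around it: a Redis cache layer (SolveService + @Cacheable)
Building the PSLE question bank
Question format
Each question consists of content (the question text) and metadata (structured metadata):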
{
  "content": "The ratio of boys to girls in a class is 3:5. If there are 24 boys, how many students are there altogether?",
  "metadata": {
    "grade": 5,
    "topic": "ratio.basic",
    "difficulty": "medium",
    "source": "PSLE Practice"
  }
}
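The grade field in metadata drives the RAG filter: when a student asks a P5 question, only similar questions from P1 to P5 are retrieved, and no P6 content is returned.
Knowledge-point coding scheme
Knowledge-point codes were organized according to the 2026 PSLE syllabus: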
| Grade | Code | Example |
|---|---|---|
| P4 | fractions.of_remainder | "Gave away 1/4, then gave away 1/3 of the remainder…" |
| P4 | measurement.area_perimeter | "Area and perimeter of a rectangle" |
| P5 | ratio.basic / ratio.difference / ratio.before_after | Ratio problems |
| P5 | average.combined_groups | Combined averages |
| P6 | algebra.forming_equation | "Form an equation in x and solve" |
| P6 | algebra.simultaneous_concept | Chickens-and-rabbits problems |
The final bank holds 40 questions covering the core P4-P6 topics: algebra (13), ratio (9), average (4), fractions (6), measurement (3), and others.
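The codes in the table follow a domain.subtopic shape. As a minimal sketch, a tag could be checked against that shape before it is stored or accepted from the LLM. The pattern below is inferred only from the examples above (lowercase words, one dot, optional underscores), and the TopicCode class is a hypothetical helper, not part of the project:

```java
import java.util.regex.Pattern;

final class TopicCode {
    // Assumed shape: lowercase domain, one dot, lowercase subtopic with optional underscores.
    private static final Pattern CODE = Pattern.compile("[a-z]+\\.[a-z]+(_[a-z]+)*");

    /** Returns true when the tag looks like "ratio.basic" or "fractions.of_remainder". */
    static boolean isValid(String code) {
        return code != null && CODE.matcher(code).matches();
    }

    /** Returns the part before the dot, e.g. "ratio" for "ratio.before_after". */
    static String domain(String code) {
        if (!isValid(code)) {
            throw new IllegalArgumentException("Not a topic code: " + code);
        }
        return code.substring(0, code.indexOf('.'));
    }

    public static void main(String[] args) {
        System.out.println(domain("ratio.before_after"));
        System.out.println(isValid("Ratio Basic"));
    }
}
```

A check like this would catch free-form tags such as "Ratio Basic" early, rather than letting them drift into the metadata.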
Spring AI VectorStore integration
PgVectorStore auto-configuration
Spring AI's PgVectorStore auto-configuration is very smooth. It only needs the following in application-dev.yml:
spring:
  ai:
    ollama:
      embedding:
        model: nomic-embed-text   # 768-dimensional vectors
    vectorstore:
      pgvector:
        initialize-schema: true   # create the vector_store table automatically
        dimensions: 768           # must match the embedding model
        index-type: hnsw          # HNSW index
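With initialize-schema: true, Spring AI creates the vector_store table and its HNSW index automatically, so no hand-written DDL is needed.
Importing the question bank at startup
QuestionImportService listens for ApplicationReadyEvent; when the application starts it checks whether vector_store is empty and, if so, imports the questions automatically: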
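@Service
public class QuestionImportService {
    // Logger and constructor are spelled out here so the snippet compiles on its own
    private static final Logger log = LoggerFactory.getLogger(QuestionImportService.class);

    private final VectorStore vectorStore;
    private final JdbcTemplate jdbcTemplate;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public QuestionImportService(VectorStore vectorStore, JdbcTemplate jdbcTemplate) {
        this.vectorStore = vectorStore;
        this.jdbcTemplate = jdbcTemplate;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void importQuestionsOnStartup() {
        Long count = jdbcTemplate.queryForObject(
                "SELECT count(*) FROM vector_store", Long.class);
        if (count != null && count > 0) {
            log.info("Vector store already has {} docs, skip", count);
            return;
        }
        log.info("Vector store is empty, importing...");
        importQuestions();
    }

    @SuppressWarnings("unchecked") // metadata is a JSON object, read back as Map<String, Object>
    public void importQuestions() {
        var resource = new ClassPathResource("data/sg-math-questions.json");
        try (var in = resource.getInputStream()) {
            List<Map<String, Object>> questions = objectMapper.readValue(
                    in, new TypeReference<>() {});
            List<Document> documents = questions.stream()
                    .map(q -> new Document(
                            (String) q.get("content"),
                            (Map<String, Object>) q.get("metadata")))
                    .toList();
            vectorStore.add(documents); // calls the embedding API and writes to PG
        } catch (IOException e) {
            // getInputStream() declares IOException; rethrow unchecked so startup fails loudly
            throw new UncheckedIOException("Could not read question bank", e);
        }
    }
}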
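The single line vectorStore.add(documents) does three things:
- Calls Ollama's nomic-embed-text model to generate 768-dimensional vectors
- Writes content + embedding + metadata into the vector_store table
- Updates the HNSW index automatically
Verify after startup: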
SELECT count(*) FROM vector_store;
-- 40
SELECT content, metadata FROM vector_store LIMIT 1;
-- content: "Ali had 120 stickers..."
-- metadata: {"grade": 4, "topic": "fractions.of_remainder", ...}
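Spring Boot 4.0 pitfall: ObjectMapper
In Spring Boot 4.0, the ObjectMapper bean is no longer registered automatically by spring-boot-starter-web (at least under some configuration combinations). Injecting it directly with @Autowired ObjectMapper fails with NoSuchBeanDefinitionException.
The fix is simple: create one inside the service:
private final ObjectMapper objectMapper = new ObjectMapper();
That is entirely sufficient for this use case, since only the most basic JSON deserialization is needed.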
RAG retrieval + grade filtering
RagRetrievalService
@Service
public class RagRetrievalService {
    private static final int TOP_K = 5;
    private static final double SIMILARITY_THRESHOLD = 0.5;

    private final VectorStore vectorStore;

    // Constructor spelled out so the snippet compiles on its own
    public RagRetrievalService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> retrieveSimilarQuestions(
            String question, int grade) {
        // Only questions at or below the student's grade are eligible
        var filterBuilder = new FilterExpressionBuilder();
        var filter = filterBuilder.lte("grade", grade).build();
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(TOP_K)
                .similarityThreshold(SIMILARITY_THRESHOLD)
                .filterExpression(filter)
                .build();
        return vectorStore.similaritySearch(searchRequest);
    }

    public String formatAsContext(List<Document> documents) {
        var sb = new StringBuilder(
                "=== Similar Questions from PSLE Question Bank ===\n\n");
        for (int i = 0; i < documents.size(); i++) {
            Document doc = documents.get(i);
            sb.append("Question %d: %s\n".formatted(
                    i + 1, doc.getText()));
            sb.append(" Topic: %s\n".formatted(
                    doc.getMetadata().get("topic")));
            sb.append(" Difficulty: %s\n".formatted(
                    doc.getMetadata().get("difficulty")));
            sb.append("\n");
        }
        return sb.toString();
    }
}
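Key design points:
- FilterExpressionBuilder.lte("grade", grade): uses Spring AI's filter DSL for grade filtering. A P5 student's query runs with grade <= 5, so P6 algebra questions are never returned
- similarityThreshold(0.5): a cosine-similarity threshold of 0.5 that filters out unrelated results
- topK(5): returns the 5 most similar questions as context
- formatAsContext(): formats the retrieval results as natural-language text to inject into the Planner Agent's user prompt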
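The filter, threshold, and top-K steps listed above can be pictured with a plain-Java, in-memory sketch. This is not how PgVectorStore works internally (there the filtering and distance computation happen in PostgreSQL); the Entry and Hit records, the two-dimensional vectors, and the GradeFilteredSearch class are all hypothetical, chosen only to make the order of operations visible:

```java
import java.util.Comparator;
import java.util.List;

final class GradeFilteredSearch {
    /** A stored question with a precomputed embedding (tiny vectors, for illustration only). */
    record Entry(String text, int grade, double[] embedding) {}

    /** A search result: the question text and its cosine similarity to the query. */
    record Hit(String text, double score) {}

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static List<Hit> search(List<Entry> store, double[] query,
                            int grade, int topK, double threshold) {
        return store.stream()
                .filter(e -> e.grade() <= grade)                    // metadata filter: grade <= N
                .map(e -> new Hit(e.text(), cosine(query, e.embedding())))
                .filter(h -> h.score() >= threshold)                // drop weak matches
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(topK)                                        // keep the best K
                .toList();
    }

    public static void main(String[] args) {
        var store = List.of(
                new Entry("P4 fractions question", 4, new double[] {1.0, 0.0}),
                new Entry("P5 ratio question", 5, new double[] {0.8, 0.6}),
                new Entry("P6 algebra question", 6, new double[] {0.6, 0.8}),
                new Entry("P5 unrelated question", 5, new double[] {-1.0, 0.0}));
        double[] query = {0.6, 0.8};
        search(store, query, 5, 5, 0.5)
                .forEach(h -> System.out.println(h.text() + " " + h.score()));
    }
}
```

With the sample data in main, the P6 entry is dropped by the grade filter even though it matches the query most closely, and the unrelated P5 entry falls below the threshold.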
Injecting into the agent chain
The Planner Agent's user prompt now contains three parts:
private String runPlannerAgent(
        SolveRequest request, String ragContext) {
    String userMessage = """
            Grade: P%d
            Question: %s
            %s
            """.formatted(
                    request.grade(),
                    request.question(),
                    ragContext);
    return chatClient.prompt()
            .system(PLANNER_SYSTEM_PROMPT)
            .user(userMessage)
            .call()
            .content();
}
The user prompt actually sent to the LLM looks like this:
Grade: P5
Question: The ratio of boys to girls is 4:5. There are 36 students. How many girls are there?
=== Similar Questions from PSLE Question Bank ===
Question 1: The ratio of boys to girls in a class is 3:5. If there are 24 boys, how many students are there altogether?
Topic: ratio.basic
Difficulty: medium
Question 2: The ratio of the number of red beads to blue beads is 2:5. If there are 30 more blue beads than red beads, how many beads are there in total?
Topic: ratio.difference
Difficulty: medium
...
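This lets the LLM draw on solution patterns from questions of the same type when it answers, which noticeably improves output quality.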
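One case runPlannerAgent does not spell out is a retrieval that returns nothing. A minimal sketch of the same message assembly that leaves the context section out in that case; it assumes the caller passes an empty string when no similar questions were found, and PlannerPrompt.userMessage is a hypothetical helper, not code from the project:

```java
final class PlannerPrompt {
    /** Builds the user message; the RAG section is appended only when there is context. */
    static String userMessage(int grade, String question, String ragContext) {
        String header = "Grade: P%d\nQuestion: %s\n".formatted(grade, question);
        if (ragContext == null || ragContext.isBlank()) {
            return header; // nothing retrieved: send the bare question
        }
        return header + "\n" + ragContext.strip() + "\n";
    }

    public static void main(String[] args) {
        System.out.print(userMessage(5, "How many girls are there?",
                "=== Similar Questions from PSLE Question Bank ===\n\nQuestion 1: ...\n"));
    }
}
```

Whether an empty context should be omitted or replaced by an explicit "no reference questions" line is a prompt-design choice; this sketch only shows the first option.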
Verifying via logs
INFO c.m.service.RagRetrievalService : RAG retrieval returned 5 similar questions for grade <= 5
INFO c.m.agent.MathSolverOrchestrator : RAG retrieval completed, found 5 similar questions
INFO c.m.agent.MathSolverOrchestrator : Starting Planner Agent for grade 5 question
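System prompt optimization
Phase 2 substantially strengthened the system prompts of all three agents:
Planner Agent
Phase 1 (basic): only required structured JSON output and the CPA method.
Phase 2 (enhanced): adds three parts:
- PSLE scoring criteria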
## PSLE 2026 Scoring Criteria
- Full marks require: correct answer + complete working + proper units
- Method marks: awarded for correct approach even if final answer is wrong
- Presentation: numbered steps, one operation per step, clear labelling
- Knowledge-point coding scheme: ensures knowledgeTags use the standard codes rather than ad hoc names
- RAG reference instruction: "Reference similar questions from the knowledge base when the approach is applicable"
CPA Designer Agent
New in Phase 2: Bar Model design rules
## Bar Model Design Rules
- Use proportional bar lengths to represent quantities
- For ratio problems: draw bars side by side with equal unit lengths
- For fraction problems: divide a single bar into equal parts
- For algebra: use a bar with unknown length labeled with the variable
Persona Agent
New in Phase 2: grade-tiered output
- For P1-P3: use concrete objects (sweets, toys, stickers)
- For P4-P5: use relatable scenarios (sharing pizza, collecting cards)
- For P6: use slightly more mature contexts while keeping it fun
Instructions were also added to require "Common mistakes" and a "Follow-up practice question", which makes the guidance for parents more targeted.
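In the project the three grade tiers live in the system prompt and the LLM picks the right one. An alternative, sketched here purely as an illustration, is to select the tier in code and send only the matching line. PersonaTier and styleFor are hypothetical names; the tier wording is copied from the prompt above:

```java
final class PersonaTier {
    /** Maps a primary-school grade (1-6) to the matching persona instruction. */
    static String styleFor(int grade) {
        if (grade < 1 || grade > 6) {
            throw new IllegalArgumentException("Grade must be between 1 and 6: " + grade);
        }
        if (grade <= 3) {
            return "use concrete objects (sweets, toys, stickers)";
        }
        if (grade <= 5) {
            return "use relatable scenarios (sharing pizza, collecting cards)";
        }
        return "use slightly more mature contexts while keeping it fun";
    }

    public static void main(String[] args) {
        for (int grade = 1; grade <= 6; grade++) {
            System.out.println("P" + grade + ": " + styleFor(grade));
        }
    }
}
```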
Redis caching
Caching strategy
The AI response for the same question at the same grade is cached for 24 hours. Cache key structure:
solveResults::the ratio of boys to girls is 4:5...:5
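@Service
public class SolveService {
    // Logger and constructor are spelled out here so the snippet compiles on its own
    private static final Logger log = LoggerFactory.getLogger(SolveService.class);

    private final MathSolverOrchestrator orchestrator;

    public SolveService(MathSolverOrchestrator orchestrator) {
        this.orchestrator = orchestrator;
    }

    // Key: trimmed, lower-cased question text + ':' + grade
    @Cacheable(
            value = "solveResults",
            key = "#request.question().trim().toLowerCase()"
                    + " + ':' + #request.grade()")
    public SolveResult solve(SolveRequest request) {
        log.info("Cache miss - running full agent pipeline");
        return orchestrator.solve(request);
    }
}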
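The SpEL key expression above amounts to the following plain Java. This is a sketch for reasoning about which requests share a cache entry; CacheKeys is a hypothetical helper, and the "::" separator reflects the key format shown earlier in this section:

```java
final class CacheKeys {
    /** Mirrors the SpEL expression: trimmed, lower-cased question, a colon, then the grade. */
    static String solveKey(String question, int grade) {
        return question.trim().toLowerCase() + ":" + grade;
    }

    /** Full Redis key: cache name, "::", then the computed key. */
    static String redisKey(String cacheName, String key) {
        return cacheName + "::" + key;
    }

    public static void main(String[] args) {
        System.out.println(redisKey("solveResults",
                solveKey("  The ratio of boys to girls ", 5)));
    }
}
```

Note that only leading and trailing whitespace and letter case are normalized, so two requests that differ by a double space or by punctuation inside the question still produce different keys and each trigger a full pipeline run.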
Redis configuration
@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager(
            RedisConnectionFactory connectionFactory) {
        // Typed to SolveResult: with Object.class and no type information in the JSON,
        // Jackson reads a cached entry back as a LinkedHashMap, which fails on a cache hit
        var jsonSerializer =
                new Jackson2JsonRedisSerializer<>(SolveResult.class);
        RedisCacheConfiguration config =
                RedisCacheConfiguration.defaultCacheConfig()
                        .entryTtl(Duration.ofHours(24))
                        .serializeValuesWith(
                                RedisSerializationContext.SerializationPair
                                        .fromSerializer(jsonSerializer))
                        .disableCachingNullValues();
        return RedisCacheManager.builder(connectionFactory)
                .cacheDefaults(config)
                .build();
    }
}
Note: in Spring Boot 4.0, GenericJackson2JsonRedisSerializer is marked @Deprecated(forRemoval = true), so Jackson2JsonRedisSerializer is used instead.
Results
- First request (cache miss): runs the full agent pipeline, ~2 min (local Ollama)
- Subsequent identical requests (cache hit): returned straight from Redis, <50 ms
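The hit/miss behaviour described above is the cache-aside pattern with a time-to-live. A minimal in-memory sketch, assuming a hypothetical TtlCache class and an explicit now parameter so that expiry can be exercised without waiting; in the real service Redis and @Cacheable do the equivalent work:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class TtlCache<V> {
    /** A cached value together with the instant at which it stops being valid. */
    private record Entry<T>(T value, Instant expiresAt) {}

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;

    TtlCache(Duration ttl) {
        this.ttl = ttl;
    }

    /** Returns the cached value if still fresh; otherwise runs the loader and stores its result. */
    V get(String key, Instant now, Supplier<V> loader) {
        Entry<V> cached = entries.get(key);
        if (cached != null && now.isBefore(cached.expiresAt())) {
            return cached.value();           // cache hit
        }
        V value = loader.get();              // cache miss: run the expensive pipeline
        entries.put(key, new Entry<>(value, now.plus(ttl)));
        return value;
    }

    public static void main(String[] args) {
        var cache = new TtlCache<String>(Duration.ofHours(24));
        Instant start = Instant.now();
        System.out.println(cache.get("q:5", start, () -> "computed"));
        System.out.println(cache.get("q:5", start.plusSeconds(60), () -> "recomputed"));
    }
}
```

The second call in main falls inside the 24-hour window, so the loader is not invoked again and the first value is returned.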
Summary of files added in Phase 2
backend/src/main/
├── java/com/mathlearning/
│   ├── config/
│   │   └── CacheConfig.java             # Redis cache configuration
│   └── service/
│       ├── QuestionImportService.java   # automatic question-bank import
│       ├── RagRetrievalService.java     # RAG retrieval + grade filtering
│       └── SolveService.java            # cache-layer wrapper
└── resources/
    └── data/
        └── sg-math-questions.json       # 40-question PSLE bank
Modified files:
- MathSolverOrchestrator.java: adds RAG injection and the enhanced system prompts
- SolveController.java: now calls SolveService (with caching) instead of calling the Orchestrator directly
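Phase 2 summary
| Task | Status |
|---|---|
| Write the 40-question PSLE bank as JSON | ✅ |
| PgVectorStore integration + automatic import | ✅ |
| RAG retrieval injected into the Planner Agent | ✅ |
| Grade filtering (grade ≤ N) | ✅ |
| System prompt optimization (PSLE scoring criteria + knowledge-point codes) | ✅ |
| Redis 24h cache | ✅ |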
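Key takeaways:
- Spring AI's VectorStore abstraction is well done. A single vectorStore.add(documents) call handles embedding + storage + indexing, and similaritySearch() supports the filter DSL for metadata filtering, so almost no SQL is needed.
- RAG is not hard to implement; the hard part is data quality. Even a small 40-question bank noticeably improves agent output. The difficulty lies in covering enough question types, labelling metadata accurately, and choosing an embedding model that fits.
- Redis caching is the best-value optimization. One full agent chain on local Ollama takes ~2 minutes; with caching, an identical question comes back in well under a second, which makes a huge difference to the user experience.
- Spring Boot 4.0 still has a few "surprises", such as the changes to ObjectMapper and GenericJackson2JsonRedisSerializer, so keep an eye on the migration guide.
Phase 3 plan: build a front-end prototype with Kotlin Multiplatform + Compose for Web (Wasm).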