Recap#

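Phase 1 built a multi-agent problem-solving pipeline on Java 25 + Spring AI. At that point, however, the Planner Agent ran "bare", with no reference question bank. The core problem Phase 2 addresses is letting the agents answer with "knowledge".

Concrete goals:

  1. Import 40 PSLE past-paper and practice questions into a vector database
  2. Run RAG retrieval before solving, and use the similar questions found as prompt context
  3. Filter by grade, so a P5 request never sees P6-level reference questions
  4. Cache AI responses for identical questions in Redis to avoid repeated LLM calls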

Overall architecture changes#

graph TD
    A[SolveRequest] --> B[SolveService - Redis @Cacheable]
    B -->|Cache Miss| C[MathSolverOrchestrator]
    C --> D[RagRetrievalService]
    D --> E[(pgvector - vector_store)]
    D -->|Top-5 similar questions| F[Planner Agent + RAG Context]
    F --> G{StructuredTaskScope}
    G --> H[CPA Designer Agent]
    G --> I[Persona Agent]
    H --> J[SolveResult]
    I --> J
    J --> K[(Redis Cache - 24h TTL)]
    B -->|Cache Hit| L[Return cached result directly]
    classDef cache fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#e65100
    classDef rag fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e7d32
    classDef agent fill:#ffffff,stroke:#1976d2,stroke-width:3px,color:#0d47a1
    classDef db fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c
    class B,K,L cache
    class D,E rag
    class C,F,H,I agent
    class J db

The Phase 1 data flow was: request → Planner → CPA + Persona → response.

Phase 2 adds two layers:

  • In front: a RAG retrieval layer (RagRetrievalService)
  • Around everything: a Redis cache layer (SolveService + @Cacheable)

Building the PSLE question bank#

Question format design#

Each question consists of content (the question text) and metadata (structured metadata):

{
  "content": "The ratio of boys to girls in a class is 3:5. If there are 24 boys, how many students are there altogether?",
  "metadata": {
    "grade": 5,
    "topic": "ratio.basic",
    "difficulty": "medium",
    "source": "PSLE Practice"
  }
}

The grade field in metadata drives RAG filtering: when a student asks a P5 question, only similar questions from P1-P5 are retrieved, and no P6 content is returned.

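Topic code scheme#

Topic codes were organized according to the 2026 PSLE syllabus:

| Grade | Code | Example |
| --- | --- | --- |
| P4 | fractions.of_remainder | "Gave away 1/4, then gave away 1/3 of the remainder..." |
| P4 | measurement.area_perimeter | "Area and perimeter of a rectangle" |
| P5 | ratio.basic / ratio.difference / ratio.before_after | Ratio problems |
| P5 | average.combined_groups | Combined averages |
| P6 | algebra.forming_equation | "Form an equation in x and solve it" |
| P6 | algebra.simultaneous_concept | Chickens-and-rabbits problems |

The final bank holds 40 questions covering the core P4-P6 topics: algebra (13), ratio (9), average (4), fractions (6), measurement (3), and a few others.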
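Since the codes follow a `domain.subtopic` shape, a small check can catch mislabelled metadata before import. The sketch below is only an illustration: `TopicCodes`, its methods, and the regular expression are hypothetical, and the known-code set contains just the codes listed in the table above, not the full bank.

```java
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper: checks that a topic code follows the "domain.subtopic" shape. */
final class TopicCodes {

    // Codes copied from the table above; the full question bank presumably uses more.
    static final Set<String> KNOWN = Set.of(
            "fractions.of_remainder",
            "measurement.area_perimeter",
            "ratio.basic",
            "ratio.difference",
            "ratio.before_after",
            "average.combined_groups",
            "algebra.forming_equation",
            "algebra.simultaneous_concept");

    // Assumed shape: lowercase domain, a dot, then lowercase words joined by underscores.
    private static final Pattern SHAPE = Pattern.compile("[a-z]+\\.[a-z]+(_[a-z]+)*");

    static boolean isWellFormed(String code) {
        return code != null && SHAPE.matcher(code).matches();
    }

    static boolean isKnown(String code) {
        // Set.of(...) rejects null lookups, so guard first.
        return code != null && KNOWN.contains(code);
    }

    /** Returns the part before the dot, e.g. "ratio" for "ratio.basic". */
    static String domain(String code) {
        if (!isWellFormed(code)) {
            throw new IllegalArgumentException("Malformed topic code: " + code);
        }
        return code.substring(0, code.indexOf('.'));
    }
}
```

A check like `TopicCodes.isKnown(topic)` could run over every entry of the JSON file at import time, so a typo in a code shows up at startup and not as a silently missing retrieval result.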

Spring AI VectorStore integration#

PgVectorStore auto-configuration#

Spring AI's PgVectorStore auto-configuration is very smooth. It only needs the following in application-dev.yml:

spring:
  ai:
    ollama:
      embedding:
        model: nomic-embed-text    # 768-dimensional vectors
    vectorstore:
      pgvector:
        initialize-schema: true    # create the vector_store table automatically
        dimensions: 768            # must match the embedding model
        index-type: hnsw           # HNSW index

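initialize-schema: true makes Spring AI create the vector_store table and its HNSW index automatically, so no hand-written DDL is needed.

Importing the question bank at startup#

QuestionImportService listens for ApplicationReadyEvent. At startup it checks whether vector_store is empty and, if so, imports the question bank automatically: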

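@Service
public class QuestionImportService {

    private static final Logger log =
            LoggerFactory.getLogger(QuestionImportService.class);

    private final VectorStore vectorStore;
    private final JdbcTemplate jdbcTemplate;
    private final ObjectMapper objectMapper = new ObjectMapper();

    // Constructor injection for the two Spring-managed collaborators.
    public QuestionImportService(VectorStore vectorStore,
                                 JdbcTemplate jdbcTemplate) {
        this.vectorStore = vectorStore;
        this.jdbcTemplate = jdbcTemplate;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void importQuestionsOnStartup() {
        // Import only when the table is empty, so restarts do not duplicate rows.
        Long count = jdbcTemplate.queryForObject(
                "SELECT count(*) FROM vector_store", Long.class);
        if (count != null && count > 0) {
            log.info("Vector store already has {} docs, skip", count);
            return;
        }

        log.info("Vector store is empty, importing...");
        importQuestions();
    }

    @SuppressWarnings("unchecked")
    public void importQuestions() {
        var resource = new ClassPathResource("data/sg-math-questions.json");
        List<Map<String, Object>> questions;
        // Opening the classpath resource can throw IOException, so handle it here.
        try (var in = resource.getInputStream()) {
            questions = objectMapper.readValue(in, new TypeReference<>() {});
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to read question bank", e);
        }

        List<Document> documents = questions.stream()
                .map(q -> new Document(
                    (String) q.get("content"),
                    (Map<String, Object>) q.get("metadata")))
                .toList();

        vectorStore.add(documents);  // calls the Embedding API and writes to PG
    }
}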

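The single vectorStore.add(documents) line does three things:

  1. Calls Ollama's nomic-embed-text model to generate 768-dimensional vectors
  2. Writes content + embedding + metadata into vector_store
  3. Lets the HNSW index update automatically

Verification after startup: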

SELECT count(*) FROM vector_store;
-- 40

SELECT content, metadata FROM vector_store LIMIT 1;
-- content: "Ali had 120 stickers..."
-- metadata: {"grade": 4, "topic": "fractions.of_remainder", ...}

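Spring Boot 4.0 gotcha: ObjectMapper#

In Spring Boot 4.0, the ObjectMapper bean is no longer auto-registered by spring-boot-starter-web (at least under some configuration combinations), so a direct @Autowired ObjectMapper fails with NoSuchBeanDefinitionException. A likely cause, not verified for this project, is that Spring Boot 4.0 defaults to Jackson 3, whose mapper classes live in the tools.jackson package and not in com.fasterxml.jackson.

The workaround is simple: create the mapper inside the service: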

private final ObjectMapper objectMapper = new ObjectMapper();

This is entirely sufficient here, because only the most basic JSON deserialization is needed.

RAG retrieval + grade filtering#

RagRetrievalService#

@Service
public class RagRetrievalService {

    private static final int TOP_K = 5;
    private static final double SIMILARITY_THRESHOLD = 0.5;

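    private final VectorStore vectorStore;

    // Constructor injection of the auto-configured VectorStore.
    public RagRetrievalService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }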

    public List<Document> retrieveSimilarQuestions(
            String question, int grade) {
        var filterBuilder = new FilterExpressionBuilder();
        var filter = filterBuilder.lte("grade", grade).build();

        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(TOP_K)
                .similarityThreshold(SIMILARITY_THRESHOLD)
                .filterExpression(filter)
                .build();

        return vectorStore.similaritySearch(searchRequest);
    }

    public String formatAsContext(List<Document> documents) {
        var sb = new StringBuilder(
            "=== Similar Questions from PSLE Question Bank ===\n\n");
        for (int i = 0; i < documents.size(); i++) {
            Document doc = documents.get(i);
            sb.append("Question %d: %s\n".formatted(
                i + 1, doc.getText()));
            sb.append("  Topic: %s\n".formatted(
                doc.getMetadata().get("topic")));
            sb.append("  Difficulty: %s\n".formatted(
                doc.getMetadata().get("difficulty")));
            sb.append("\n");
        }
        return sb.toString();
    }
}

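Key design points:

  • FilterExpressionBuilder.lte("grade", grade): uses Spring AI's filter DSL for grade filtering. A P5 query applies grade <= 5, so P6 algebra questions are never returned
  • similarityThreshold(0.5): a cosine-similarity threshold of 0.5 that drops unrelated results
  • topK(5): returns the 5 most similar questions as context
  • formatAsContext(): formats the retrieved results as natural language and injects them into the Planner Agent's User Prompt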
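To make the interplay of the grade filter, the threshold, and top-K concrete, here is a minimal in-memory sketch of the same selection logic. It is not how pgvector or Spring AI implement the search; `RetrievalSketch`, `Item`, and `Scored` are hypothetical names, and the embeddings are toy two-dimensional vectors.

```java
import java.util.Comparator;
import java.util.List;

/** In-memory sketch: filter by grade, score by cosine similarity, apply threshold, keep top-K. */
final class RetrievalSketch {

    // Hypothetical stand-in for a stored row: text, grade metadata, and an embedding.
    record Item(String text, int grade, double[] embedding) {}

    record Scored(Item item, double score) {}

    /** Cosine similarity of two equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static List<Scored> search(List<Item> store, double[] query,
                               int maxGrade, double threshold, int topK) {
        return store.stream()
                .filter(item -> item.grade() <= maxGrade)            // grade <= N
                .map(item -> new Scored(item, cosine(query, item.embedding())))
                .filter(scored -> scored.score() >= threshold)        // drop weak matches
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)                                          // keep the best K
                .toList();
    }
}
```

Note that the grade filter runs before scoring, so a P6 question can never enter the result list for a P5 query, however similar its text is.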

Injecting into the agent chain#

The Planner Agent's User Prompt now has three parts: the grade, the question, and the RAG context:

private String runPlannerAgent(
        SolveRequest request, String ragContext) {
    String userMessage = """
        Grade: P%d
        Question: %s

        %s
        """.formatted(
            request.grade(),
            request.question(),
            ragContext);

    return chatClient.prompt()
            .system(PLANNER_SYSTEM_PROMPT)
            .user(userMessage)
            .call()
            .content();
}

The User Prompt actually sent to the LLM looks roughly like this:

Grade: P5
Question: The ratio of boys to girls is 4:5. There are 36 students. How many girls are there?

=== Similar Questions from PSLE Question Bank ===

Question 1: The ratio of boys to girls in a class is 3:5. If there are 24 boys, how many students are there altogether?
  Topic: ratio.basic
  Difficulty: medium

Question 2: The ratio of the number of red beads to blue beads is 2:5. If there are 30 more blue beads than red beads, how many beads are there in total?
  Topic: ratio.difference
  Difficulty: medium
...

This lets the LLM draw on solution patterns from similar question types when answering, which noticeably improves output quality.
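One edge case worth sketching is a query for which retrieval finds nothing above the threshold. The snippet below is a hypothetical variant of the prompt assembly, not the code used in the project: `PlannerPromptSketch` and `buildUserMessage` are illustrative names, and dropping the context block when it is empty is an assumed design choice.

```java
/** Sketch: build the Planner user message, omitting the context block when retrieval found nothing. */
final class PlannerPromptSketch {

    static String buildUserMessage(int grade, String question, String ragContext) {
        String header = """
                Grade: P%d
                Question: %s
                """.formatted(grade, question);
        if (ragContext == null || ragContext.isBlank()) {
            // No similar questions: send only grade and question.
            return header;
        }
        // A blank line separates the question from the retrieved context.
        return header + "\n" + ragContext;
    }
}
```

Skipping the block avoids sending a bare "Similar Questions" header with no entries under it, which could otherwise invite the model to invent references.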

Verifying via logs#

INFO  c.m.service.RagRetrievalService : RAG retrieval returned 5 similar questions for grade <= 5
INFO  c.m.agent.MathSolverOrchestrator : RAG retrieval completed, found 5 similar questions
INFO  c.m.agent.MathSolverOrchestrator : Starting Planner Agent for grade 5 question

System Prompt improvements#

Phase 2 substantially strengthens the System Prompts of all three agents:

Planner Agent#

Phase 1 (baseline): only required structured JSON output and the CPA method.

Phase 2 (enhanced): adds three parts:

  1. PSLE scoring criteria

## PSLE 2026 Scoring Criteria
- Full marks require: correct answer + complete working + proper units
- Method marks: awarded for correct approach even if final answer is wrong
- Presentation: numbered steps, one operation per step, clear labelling

  2. Topic code scheme: ensures knowledgeTags use the standard codes instead of ad hoc names

  3. RAG reference instruction: "Reference similar questions from the knowledge base when the approach is applicable"

CPA Designer Agent#

New in Phase 2: Bar Model design rules

## Bar Model Design Rules
- Use proportional bar lengths to represent quantities
- For ratio problems: draw bars side by side with equal unit lengths
- For fraction problems: divide a single bar into equal parts
- For algebra: use a bar with unknown length labeled with the variable

Persona Agent#

New in Phase 2: output tiered by grade

- For P1-P3: use concrete objects (sweets, toys, stickers)
- For P4-P5: use relatable scenarios (sharing pizza, collecting cards)
- For P6: use slightly more mature contexts while keeping it fun

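Instructions to provide "Common mistakes" and a "Follow-up practice question" were also added, making the guidance for parents more targeted.

Redis caching#

Caching strategy#

The AI response for the same question at the same grade is cached for 24 hours. Cache key structure: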

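solveResults::the ratio of boys to girls is 4:5...:5

@Service
public class SolveService {

    private static final Logger log =
            LoggerFactory.getLogger(SolveService.class);

    private final MathSolverOrchestrator orchestrator;

    // Constructor injection of the orchestrator.
    public SolveService(MathSolverOrchestrator orchestrator) {
        this.orchestrator = orchestrator;
    }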

    @Cacheable(
        value = "solveResults",
        key = "#request.question().trim().toLowerCase()"
            + " + ':' + #request.grade()")
    public SolveResult solve(SolveRequest request) {
        log.info("Cache miss - running full agent pipeline");
        return orchestrator.solve(request);
    }
}
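The key expression trims and lower-cases the question, so differences in case or surrounding whitespace still hit the same entry, while differences in internal whitespace do not. The plain-Java sketch below mirrors that behaviour and adds a stricter variant. `CacheKeySketch` and `normalizedKey` are hypothetical, and `Locale.ROOT` is used here for determinism, whereas the SpEL expression relies on the default locale.

```java
import java.util.Locale;

/** Sketch of the cache-key normalization, outside of Spring's SpEL. */
final class CacheKeySketch {

    /** Mirrors the SpEL expression: trim and lower-case the question, then append ":" + grade. */
    static String key(String question, int grade) {
        return question.trim().toLowerCase(Locale.ROOT) + ":" + grade;
    }

    /** Hypothetical stricter variant: also collapses runs of whitespace into a single space. */
    static String normalizedKey(String question, int grade) {
        return question.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT) + ":" + grade;
    }
}
```

Whether the stricter variant is worthwhile depends on how questions reach the API; text pasted from PDFs, for instance, often carries double spaces or line breaks that would otherwise cause avoidable cache misses.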

Redis configuration#

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager(
            RedisConnectionFactory connectionFactory) {
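        // Bind the serializer to SolveResult: with Object.class, a cache hit
        // would deserialize into a LinkedHashMap and fail the cast to SolveResult.
        var jsonSerializer =
            new Jackson2JsonRedisSerializer<>(SolveResult.class);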

        RedisCacheConfiguration config =
            RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofHours(24))
                .serializeValuesWith(
                    RedisSerializationContext.SerializationPair
                        .fromSerializer(jsonSerializer))
                .disableCachingNullValues();

        return RedisCacheManager.builder(connectionFactory)
                .cacheDefaults(config)
                .build();
    }
}

Note: in Spring Boot 4.0, GenericJackson2JsonRedisSerializer is marked @Deprecated(forRemoval = true), so Jackson2JsonRedisSerializer is used instead.

Results#

  • First request (Cache Miss): runs the full agent pipeline, ~2 min (local Ollama)
  • Subsequent identical requests (Cache Hit): returned straight from Redis, <50 ms
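The miss-then-hit behaviour can be illustrated without Redis. The following is a minimal in-memory sketch of cache-aside with a TTL. `TtlCacheSketch` is a hypothetical class, it never evicts expired entries, and it stands in for what @Cacheable plus the Redis TTL provide; it is not a replacement for them.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** In-memory sketch of cache-aside with a TTL; a stand-in for Redis, not a replacement. */
final class TtlCacheSketch<V> {

    private record Entry<T>(T value, Instant expiresAt) {}

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Clock clock;

    TtlCacheSketch(Duration ttl, Clock clock) {
        this.ttl = ttl;
        this.clock = clock;
    }

    /** Returns the cached value if present and unexpired; otherwise runs the loader and stores the result. */
    V get(String key, Supplier<V> loader) {
        Instant now = clock.instant();
        Entry<V> cached = entries.get(key);
        if (cached != null && now.isBefore(cached.expiresAt())) {
            return cached.value();   // cache hit: the loader is not called
        }
        V value = loader.get();      // cache miss: run the expensive pipeline
        entries.put(key, new Entry<>(value, now.plus(ttl)));
        return value;
    }
}
```

With a 24-hour TTL, a call such as `cache.get(key, () -> orchestrator.solve(request))` would run the pipeline once and serve repeats from memory until the entry expires. Passing a `Clock` keeps the expiry logic testable.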

Summary of files added in Phase 2#

backend/src/main/
├── java/com/mathlearning/
│   ├── config/
│   │   └── CacheConfig.java          # Redis cache configuration
│   └── service/
│       ├── QuestionImportService.java # automatic question bank import
│       ├── RagRetrievalService.java   # RAG retrieval + grade filtering
│       └── SolveService.java          # caching wrapper
└── resources/
    └── data/
        └── sg-math-questions.json     # 40-question PSLE bank

Modified files

  • MathSolverOrchestrator.java: adds RAG injection and the enhanced System Prompts
  • SolveController.java: now calls SolveService (with caching) instead of calling the Orchestrator directly

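Phase 2 summary#

| Task | Status |
| --- | --- |
| 40-question PSLE bank written as JSON | Done |
| PgVectorStore integration + automatic import | Done |
| RAG retrieval injected into the Planner Agent | Done |
| Grade filtering (grade ≤ N) | Done |
| System Prompt improvements (PSLE scoring criteria + topic codes) | Done |
| Redis 24h cache | Done |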

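Key takeaways:

  1. Spring AI's VectorStore abstraction is well designed. A single vectorStore.add(documents) call handles embedding, storage, and indexing, and similaritySearch() supports a filter DSL for metadata filtering, so almost no SQL is needed.
  2. RAG is not hard to implement; the hard part is data quality. Even a small bank of 40 questions noticeably improves the agents' output. The difficulty lies in covering enough topics, labelling metadata accurately, and choosing an embedding model that fits the content.
  3. Redis caching is the most cost-effective optimization. A full agent chain on local Ollama takes ~2 minutes per run; with caching, an identical question comes back from Redis in under 50 ms, a huge difference in user experience.
  4. Spring Boot 4.0 still holds a few "surprises", such as the changes around ObjectMapper and GenericJackson2JsonRedisSerializer, so keep an eye on the migration guide.

Phase 3 plan: build a front-end prototype with Kotlin Multiplatform + Compose for Web (Wasm).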