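Chapter 14: Hands-On Project: Knowledge Base Assistant (RAG + Tools + SSE)
14.1 Goal: Assemble the Previous Chapters into a Deployable Shape
The finished project must meet these requirements:
- Answer user questions from the knowledge base first (RAG)
- Allow tool calls, restricted to a whitelist, when the retrieved context is insufficient
- Support SSE streaming output (cancellation, client disconnects, heartbeats)
- Provide minimal governance: concurrency limits, data masking, a requestId, and metrics
What really decides whether the system can go live is not how well the model answers, but whether boundaries are clear, failures are contained, and cost is under control.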
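To make the "minimal governance" requirement concrete, here is a library-free sketch of three of its pieces: a concurrency cap, masking, and a request id. The class name GovernanceSketch and the phone-number masking rule are assumptions made for illustration; they are not part of the project code in this chapter.

```java
import java.util.UUID;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;
import java.util.regex.Pattern;

/** Hypothetical governance helper: concurrency cap, masking, request ids. */
final class GovernanceSketch {

    // Assumed masking rule: 11-digit numbers starting with 1, not embedded in longer digit runs.
    private static final Pattern PHONE =
            Pattern.compile("(?<!\\d)(1\\d{2})\\d{4}(\\d{4})(?!\\d)");

    private final Semaphore permits;

    GovernanceSketch(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the call if a permit is free; otherwise fails fast instead of queueing. */
    <T> T limited(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("too many concurrent requests");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always give the permit back, even when the call throws
        }
    }

    /** Keeps the first 3 and last 4 digits of a phone number and masks the middle. */
    static String mask(String text) {
        return PHONE.matcher(text).replaceAll("$1****$2");
    }

    /** A fresh id to attach to the logs and metrics of one request. */
    static String newRequestId() {
        return UUID.randomUUID().toString();
    }
}
```

Failing fast when no permit is free keeps latency predictable under load; a queue with a timeout is a reasonable alternative if brief waiting is acceptable.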
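14.2 Dependencies (Maven)
The following is a minimal set of dependencies that runs as-is and is easy to extend (using OpenAI as the example; see Chapters 4 and 5 for Qwen and DeepSeek):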
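
    <dependencies>
        <!-- Reactive web stack; provides Flux and ServerSentEvent for the SSE endpoint -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
        </dependency>
        <!-- Health and metrics endpoints -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <!-- Auto-configuration for the OpenAI chat, streaming, and embedding models -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
            <version>1.12.2-beta22</version>
        </dependency>
        <!-- Needed for the @AiService annotation used in 14.8 -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>1.12.2-beta22</version>
        </dependency>
    </dependencies>

14.3 Configuration (application.yml)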
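
    langchain4j:
      open-ai:
        chat-model:
          api-key: ${OPENAI_API_KEY:}
          model-name: gpt-4o-mini
          temperature: 0.2
          timeout: PT60S
        streaming-chat-model:
          api-key: ${OPENAI_API_KEY:}
          model-name: gpt-4o-mini
          temperature: 0.2
          timeout: PT60S
        # Assumption: the EmbeddingModel bean injected in 14.6 needs its own entry;
        # the model name below is only an example.
        embedding-model:
          api-key: ${OPENAI_API_KEY:}
          model-name: text-embedding-3-small
    management:
      endpoints:
        web:
          exposure:
            # The prometheus endpoint only exists when micrometer-registry-prometheus is on the classpath.
            include: health,info,metrics,prometheus

14.4 Recommended Package Layout (Thin Controller, Heavy Service)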
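
    com.example.kb
    ├── api         // Web layer: request parameters and responses
    ├── app         // Business orchestration: RAG + Tools + Memory + Governance
    ├── rag         // Retrieval and context assembly
    ├── tools       // Tool definitions (whitelist) and execution
    ├── governance  // Concurrency, masking, fingerprinting, unified error handling
    └── model       // DTOs / structured results (optional)

14.5 Tools: FAQ Lookup + Ticket Creation (Whitelist)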
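Tools must always follow three rules: whitelist, parameter validation, and least privilege. The example in this section is deliberately minimal.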
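As a library-free illustration of the first two rules, the sketch below dispatches tool calls through a fixed whitelist and validates parameters before use. ToolWhitelist and its method names are hypothetical and exist only for this example; LangChain4j does its own tool dispatch.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical dispatcher: only tools registered here can ever be executed. */
final class ToolWhitelist {

    private final Map<String, Function<Map<String, String>, String>> tools;

    ToolWhitelist(Map<String, Function<Map<String, String>, String>> tools) {
        this.tools = Map.copyOf(tools); // immutable copy: nothing can be added at runtime
    }

    /** Executes a tool by name, rejecting any name that is not on the whitelist. */
    String execute(String name, Map<String, String> args) {
        Function<Map<String, String>, String> tool = tools.get(name);
        if (tool == null) {
            throw new SecurityException("tool not whitelisted: " + name);
        }
        return tool.apply(args);
    }

    /** Shared parameter validation: rejects missing or blank values. */
    static String required(Map<String, String> args, String key) {
        String value = args.get(key);
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(key + " must not be blank");
        }
        return value.trim();
    }
}
```

The SupportTools class that follows applies the same validation idea with LangChain4j's @Tool annotation, where the set of annotated methods plays the role of the whitelist.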
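
    package com.example.kb.tools;

    import dev.langchain4j.agent.tool.P;
    import dev.langchain4j.agent.tool.Tool;
    import java.util.List;
    import java.util.Map;
    import org.springframework.stereotype.Component;

    @Component
    public class SupportTools {

        // Demo data only: a real implementation would query the FAQ store.
        @Tool("Search the FAQ for an answer. The keyword parameter is the search term.")
        public Map<String, Object> searchFaq(@P("Search keyword, for example: invoice") String keyword) {
            if (keyword == null || keyword.isBlank()) {
                throw new IllegalArgumentException("keyword must not be blank");
            }
            return Map.of(
                    "keyword", keyword,
                    "hits", 2,
                    "items", List.of(
                            "How to reset a password: go to Profile > Security Settings.",
                            "How to request an invoice: open the order details page and choose Request Invoice."
                    )
            );
        }

        // The description tells the model when this tool may be used; validation still runs in code.
        @Tool("Create a support ticket. Call this only when the user explicitly asks for a human agent.")
        public Map<String, Object> createTicket(@P("Title") String title, @P("Detailed description") String detail) {
            if (title == null || title.isBlank()) {
                throw new IllegalArgumentException("title must not be blank");
            }
            if (detail == null || detail.isBlank()) {
                throw new IllegalArgumentException("detail must not be blank");
            }
            return Map.of(
                    "ticketId", "T-" + System.currentTimeMillis(),
                    "status", "CREATED",
                    "title", title
            );
        }
    }

14.6 RAG: A Minimal Retriever (In-Memory Version, Replaceable with a Vector Store)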
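To keep the example copy-and-run, this section uses InMemoryEmbeddingStore. For production you only need to swap the EmbeddingStore for pgvector, Redis, or Elasticsearch; the orchestration layer above it is largely unaffected.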
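To show what any embedding store does at query time, whichever backend you choose, here is a library-free sketch of similarity search: a linear scan ranked by cosine similarity. TinyVectorStore is a hypothetical stand-in written for this explanation, not the LangChain4j implementation, and it assumes non-zero vectors of equal length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an embedding store: linear scan with cosine similarity. */
final class TinyVectorStore {

    private record Entry(float[] vector, String text) {}

    private final List<Entry> entries = new ArrayList<>();

    void add(float[] vector, String text) {
        entries.add(new Entry(vector.clone(), text));
    }

    /** Returns the texts of the maxResults entries most similar to the query vector. */
    List<String> search(float[] query, int maxResults) {
        return entries.stream()
                // Negate the score so that the most similar entries sort first.
                .sorted(Comparator.comparingDouble((Entry e) -> -cosine(query, e.vector())))
                .limit(maxResults)
                .map(Entry::text)
                .toList();
    }

    /** Cosine similarity; assumes both vectors are non-zero and have the same length. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A production vector store replaces the linear scan with an index, but the contract stays the same: vectors in, the closest segments out. That shared contract is why the store can be swapped without touching the orchestration. The RagConfig class that follows delegates this work to EmbeddingStoreContentRetriever.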

    package com.example.kb.rag;

    import dev.langchain4j.data.segment.TextSegment;
    import dev.langchain4j.model.embedding.EmbeddingModel;
    import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
    import dev.langchain4j.store.embedding.EmbeddingStore;
    import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
    import java.util.List;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class RagConfig {

        @Bean
        public EmbeddingStore<TextSegment> embeddingStore() {
            return new InMemoryEmbeddingStore<>();
        }

        @Bean
        public KnowledgeBaseSeeder knowledgeBaseSeeder(EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> store) {
            return new KnowledgeBaseSeeder(embeddingModel, store);
        }

        @Bean
        public EmbeddingStoreContentRetriever contentRetriever(EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> store) {
            return EmbeddingStoreContentRetriever.builder()
                    .embeddingModel(embeddingModel)
                    .embeddingStore(store)
                    .maxResults(4)
                    .build();
        }

        public static class KnowledgeBaseSeeder {

            private final EmbeddingModel embeddingModel;
            private final EmbeddingStore<TextSegment> store;

            public KnowledgeBaseSeeder(EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> store) {
                this.embeddingModel = embeddingModel;
                this.store = store;
                // Seeds at startup; each segment costs one call to the embedding model.
                seed();
            }

            private void seed() {
                List<TextSegment> segments = List.of(
                        TextSegment.from("Invoice rules: an electronic invoice can be requested within 30 days after the order is completed."),
                        TextSegment.from("Refund rules: 7-day no-questions-asked returns, refunded to the original payment method within 1 to 3 business days."),
                        TextSegment.from("Shipping rules: SF Express by default, shipped within 48 hours.")
                );
                for (TextSegment segment : segments) {
                    store.add(embeddingModel.embed(segment).content(), segment);
                }
            }
        }
    }

14.7 Business Orchestration: RAG → (Optional) Tools → Generation
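This layout stays maintainable at the business layer: the Service assembles the context in one place, then hands the final generation step to the AI Service or the model.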

    package com.example.kb.app;

    import dev.langchain4j.rag.content.Content;
    import dev.langchain4j.rag.content.retriever.ContentRetriever;
    import dev.langchain4j.rag.query.Query;
    import java.util.List;
    import java.util.StringJoiner;
    import org.springframework.stereotype.Component;

    @Component
    public class RagContextBuilder {

        private final ContentRetriever retriever;

        public RagContextBuilder(ContentRetriever retriever) {
            this.retriever = retriever;
        }

        public String buildContext(String question) {
            // ContentRetriever takes a Query, not a raw String.
            List<Content> contents = retriever.retrieve(Query.from(question));
            // Segments are separated by "---"; no hits yields an empty string.
            StringJoiner joiner = new StringJoiner("\n---\n");
            for (Content c : contents) {
                if (c.textSegment() != null) {
                    joiner.add(c.textSegment().text());
                }
            }
            return joiner.toString();
        }
    }

14.8 AI Service: Put the Rules on the Interface, Leave Execution to the System
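
    package com.example.kb.app;

    import dev.langchain4j.service.SystemMessage;
    import dev.langchain4j.service.UserMessage;
    import dev.langchain4j.service.spring.AiService;

    // In the default automatic wiring mode, @AiService picks up the ChatModel bean and the
    // @Tool methods on Spring beans such as SupportTools. A ContentRetriever bean is wired in
    // as well, so consider explicit wiring if the manually built context should be the only RAG step.
    @AiService
    public interface KnowledgeAssistant {

        @SystemMessage("""
                You are an enterprise knowledge base assistant.
                Rules:
                1) Answer from the [Context] first.
                2) If the context is insufficient, you may call searchFaq.
                3) Call createTicket only if the problem is still unresolved and the user explicitly asks for a human agent.
                4) Do not disclose sensitive information; if you are not sure, say so clearly.
                """)
        String answer(@UserMessage String prompt);
    }

The business Service is responsible for putting the context into the prompt:

    package com.example.kb.app;

    import org.springframework.stereotype.Service;

    @Service
    public class KnowledgeAssistantService {

        private final RagContextBuilder contextBuilder;
        private final KnowledgeAssistant assistant;

        public KnowledgeAssistantService(RagContextBuilder contextBuilder, KnowledgeAssistant assistant) {
            this.contextBuilder = contextBuilder;
            this.assistant = assistant;
        }

        public String ask(String question) {
            String context = contextBuilder.buildContext(question);
            // The labels match the ones the system message refers to.
            String prompt = """
                    [Context]
                    %s
                    [Question]
                    %s
                    """.formatted(context, question);
            return assistant.answer(prompt);
        }
    }

14.9 SSE: Streaming Endpoint (Cancellation, Disconnects, Heartbeats)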

    package com.example.kb.api;

    import com.example.kb.app.KnowledgeAssistantService;
    import dev.langchain4j.model.chat.StreamingChatModel;
    import dev.langchain4j.service.AiServices;
    import dev.langchain4j.service.TokenStream;
    import java.time.Duration;
    import org.springframework.http.MediaType;
    import org.springframework.http.codec.ServerSentEvent;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Mono;
    import reactor.core.publisher.Sinks;
    import reactor.core.scheduler.Schedulers;

    @RestController
    public class KnowledgeAssistantController {

        interface StreamingAssistant {
            TokenStream chat(String message);
        }

        private final KnowledgeAssistantService service;
        private final StreamingAssistant streamingAssistant;

        public KnowledgeAssistantController(KnowledgeAssistantService service, StreamingChatModel streamingChatModel) {
            this.service = service;
            this.streamingAssistant = AiServices.create(StreamingAssistant.class, streamingChatModel);
        }

        @GetMapping(value = "/kb/ask", produces = MediaType.TEXT_PLAIN_VALUE)
        public Mono<String> ask(@RequestParam String q) {
            // service.ask blocks on the model call, so keep it off the WebFlux event loop.
            return Mono.fromCallable(() -> service.ask(q))
                    .subscribeOn(Schedulers.boundedElastic());
        }

        @GetMapping(value = "/kb/ask/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
        public Flux<ServerSentEvent<String>> askStream(@RequestParam String q) {
            Sinks.Many<ServerSentEvent<String>> sink = Sinks.many().unicast().onBackpressureBuffer();
            // Note: this path sends the raw question; it does not add the RAG context or the tools.
            TokenStream tokenStream = streamingAssistant.chat(q);
            tokenStream
                    .onPartialResponse(delta -> sink.tryEmitNext(ServerSentEvent.builder(delta).event("delta").build()))
                    // Emit a final "done" event so the merged stream below knows when to stop.
                    .onCompleteResponse(resp -> {
                        sink.tryEmitNext(ServerSentEvent.builder("").event("done").build());
                        sink.tryEmitComplete();
                    })
                    .onError(sink::tryEmitError)
                    .start();

            Flux<ServerSentEvent<String>> heartbeat = Flux.interval(Duration.ofSeconds(10))
                    .map(tick -> ServerSentEvent.<String>builder("ping").event("ping").build());

            // The heartbeat never completes on its own, so end the response at the "done" event.
            return Flux.merge(sink.asFlux(), heartbeat)
                    .takeUntil(event -> "done".equals(event.event()))
                    // Runs on completion, error, and client disconnect.
                    // Assumes TokenStream exposes cancel() in the LangChain4j version in use.
                    .doFinally(signal -> tokenStream.cancel());
        }
    }

14.10 Unit Testing Suggestions (Cover the Boundary Conditions)
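At a minimum, cover the following:
- SupportTools: a blank keyword, title, or detail must throw an exception
- RagContextBuilder: when retrieval returns nothing, the context is an empty string and the assistant must state clearly that it is uncertain
- SSE: tokenStream.cancel() is called when the client disconnects (testable with a stand-in TokenStream)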
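The shape of these three checks can be sketched without Spring or LangChain4j on the classpath. Everything below is a hypothetical stand-in: requireNotBlank mirrors the blank check in SupportTools, join mirrors the joining logic in RagContextBuilder, and FakeStream plays the role of the stand-in TokenStream. In a real project the same assertions would live in your test framework and target the real classes.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, framework-free versions of the three boundary checks. */
final class BoundaryChecks {

    /** Mirrors the blank check used by the tools. */
    static void requireNotBlank(String value, String name) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(name + " must not be blank");
        }
    }

    /** Mirrors the context joining logic, using plain strings as segments. */
    static String join(List<String> segments) {
        StringJoiner joiner = new StringJoiner("\n---\n");
        segments.forEach(joiner::add);
        return joiner.toString();
    }

    /** Stand-in stream that only records whether cancel() was called. */
    static final class FakeStream {
        final AtomicBoolean cancelled = new AtomicBoolean();

        void cancel() {
            cancelled.set(true);
        }
    }

    static boolean throwsIllegalArgument(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    private static void check(boolean ok, String message) {
        if (!ok) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        check(throwsIllegalArgument(() -> requireNotBlank("  ", "keyword")), "blank keyword must throw");
        check(join(List.of()).isEmpty(), "empty retrieval must give an empty context");
        check(join(List.of("a", "b")).equals("a\n---\nb"), "segments are separated by ---");

        FakeStream stream = new FakeStream();
        Runnable onDisconnect = stream::cancel; // the cleanup hook a disconnect would trigger
        onDisconnect.run();
        check(stream.cancelled.get(), "disconnect must cancel the stream");
    }
}
```

The second check relies on StringJoiner returning an empty string when nothing was added, which is why an empty retrieval produces an empty context rather than a stray separator.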
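14.11 Chapter Summary (and What to Do Next)
You now have a minimal closed loop that is ready to deploy:
- RAG: retrieves context and reduces hallucinations
- Tools: let the system take action while you keep control of the boundaries
- SSE: a better user experience, with cancellation and heartbeats
- Engineering structure: thin Controller, heavy Service, which makes governance and iteration easier
To take this into a real business setting, you only need to replace two things:
- Swap InMemoryEmbeddingStore for a real vector store
- Swap the example tools for your own business system APIs (and add permissions, auditing, and masking)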
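The permission and auditing additions mentioned in the last item could take roughly the following shape: a thin gateway placed in front of every tool call. AuditedToolGateway, its per-tool user list, and the in-memory audit log are all assumptions made for this sketch; a real system would delegate to its own identity and logging infrastructure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical wrapper: permission check plus audit trail around a tool call. */
final class AuditedToolGateway {

    record AuditEntry(String user, String tool, boolean allowed) {}

    private final Map<String, Set<String>> allowedUsersByTool;
    private final List<AuditEntry> auditLog = new ArrayList<>();

    AuditedToolGateway(Map<String, Set<String>> allowedUsersByTool) {
        this.allowedUsersByTool = Map.copyOf(allowedUsersByTool);
    }

    /** Runs the action only if the user is allowed to call the tool; every attempt is logged. */
    <T> T call(String user, String tool, Supplier<T> action) {
        boolean allowed = allowedUsersByTool.getOrDefault(tool, Set.of()).contains(user);
        auditLog.add(new AuditEntry(user, tool, allowed)); // denied attempts are recorded too
        if (!allowed) {
            throw new SecurityException(user + " may not call " + tool);
        }
        return action.get();
    }

    /** Read-only snapshot of the audit trail. */
    List<AuditEntry> auditLog() {
        return List.copyOf(auditLog);
    }
}
```

Recording the attempt before the permission decision takes effect means the audit trail also shows who tried to use a tool they were not entitled to, which is often the more interesting signal.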
