LangChain4j Spring Boot Integration

What is LangChain4j?

LangChain4j makes it easy to add large language models (LLMs) to Java apps.

How does it help?

One API for All

  • LLM providers (e.g., OpenAI, Google Vertex AI) and vector stores (e.g., Pinecone, Milvus) have different APIs.
  • LangChain4j gives you a single, unified API, so you don’t need to learn each one.
  • Switch between 15+ LLMs or 20+ vector stores without rewriting code.
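As a rough illustration of what that unified API buys you, here is a minimal sketch; it assumes the langchain4j-open-ai and langchain4j-google-ai-gemini modules are on the classpath, and the model names are just examples.

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class ProviderSwapDemo {
    public static void main(String[] args) {
        // Both builders return the same ChatLanguageModel interface,
        // so the calling code below never changes.
        ChatLanguageModel model = OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("gpt-4o-mini")
                .build();

        // Swapping to Gemini is a drop-in replacement:
        // ChatLanguageModel model = GoogleAiGeminiChatModel.builder()
        //         .apiKey(System.getenv("GEMINI_AI_KEY"))
        //         .modelName("gemini-2.0-flash")
        //         .build();

        System.out.println(model.chat("Say hello in one sentence."));
    }
}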

Ready-Made Tools

LangChain4j Features

  • Built from community ideas since 2023, it offers a toolbox for LLM apps.
  • Includes basics like prompts and memory, plus advanced features like Agents and RAG (Retrieval-Augmented Generation).
  • Integrations
    • 15+ LLM providers
    • 20+ embedding (vector) stores
    • 15+ embedding models
    • 5 image generation models
    • 2 scoring (re-ranking) models
    • 1 moderation model (OpenAI)
  • Input Support
    • Text and images (multimodal)
  • Core Tools
    • AI Services: High-level LLM API
    • Prompt Templates: Customizable prompts
    • Chat Memory: Persistent and in-memory options (message window, token window)
    • Streaming: Real-time LLM responses
    • Output Parsers: For Java types and custom objects
    • Tools: Function calling + dynamic code execution
  • RAG (Retrieval-Augmented Generation)
    • Ingestion (a minimal code sketch follows this feature list)
      • Import docs (TXT, PDF, DOC, PPT, XLS, etc.) from files, URLs, GitHub, Azure Blob, Amazon S3, etc.
      • Split docs into segments with various algorithms
      • Post-process docs and segments
      • Embed segments using embedding models
      • Store embeddings in vector stores
    • Retrieval
      • Query transformation (expand, compress)
      • Query routing
      • Fetch from vector stores or custom sources
      • Re-ranking with Reciprocal Rank Fusion
      • Fully customizable RAG pipeline
  • Extras
    • Text classification
    • Tokenization and token count estimation
    • Kotlin Extensions: Async chat handling with coroutines
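To make the ingestion side of that RAG pipeline concrete, here is a minimal sketch; it assumes the langchain4j core module plus the langchain4j-embeddings-all-minilm-l6-v2 module, and the document path is illustrative.

import java.nio.file.Path;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

public class IngestionDemo {
    public static void main(String[] args) {
        // Import a document from the file system
        Document document = FileSystemDocumentLoader.loadDocument(Path.of("docs/manual.txt"));

        // Split into segments, embed each segment, and store the embeddings
        InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
        EmbeddingStoreIngestor.builder()
                .documentSplitter(DocumentSplitters.recursive(300, 30)) // segment size 300, overlap 30
                .embeddingModel(new AllMiniLmL6V2EmbeddingModel())
                .embeddingStore(embeddingStore)
                .build()
                .ingest(document);
    }
}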

LangChain4j Integration

Why Use LangChain4j with Spring Boot?

  • Unified API: Swap LLMs (15+ supported) without rewriting code.
  • Spring Synergy: Leverage Spring’s ecosystem for config, dependency injection, and REST APIs.
  • Flexibility: From basic chats to advanced RAG or agent-based apps, LangChain4j scales with your needs.

Integrating LangChain4j with Spring Boot

LangChain4j simplifies adding large language models (LLMs) to Java applications, and when paired with Spring Boot, it becomes a powerful combo for building AI-driven features fast. In this article, we’ll explore how to integrate LangChain4j with Spring Boot using the Google Gemini LLM, covering basic setup, AI services, and advanced chat options—all with working code examples.

Create a Spring Boot Project Using Spring Initializr

Basic Integration

Let’s start with a minimal setup to connect Spring Boot with LangChain4j and the Google Gemini model.

Dependencies

Add these to your pom.xml.

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-google-ai-gemini</artifactId>
    <version>1.0.0-beta1</version>
</dependency>
<dependency>
    <groupId>org.springdoc</groupId>
    <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
    <version>2.8.5</version>
</dependency>

Configuration

In application.properties, set up your Gemini API key and model.

spring.application.name=chat0
langchain4j.google-ai.api-key=YOUR_API_KEY
langchain4j.google-ai.model-name=gemini-2.0-flash
langchain4j.google-ai.log-requests=true
langchain4j.google-ai.log-responses=true

Create a configuration class that reads these properties and exposes the Gemini model as a bean.

package com.kgisl.chat0;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;
import dev.langchain4j.model.chat.ChatLanguageModel;

@Configuration
public class GeminiConfig {
    @Value("${langchain4j.google-ai.api-key}")
    private String apiKey;

    @Value("${langchain4j.google-ai.model-name}")
    private String model;

    @Bean
    public ChatLanguageModel googleAiGeminiChatModel() {
        return GoogleAiGeminiChatModel.builder()
            .apiKey(apiKey)
            .modelName(model)
            .build();
    }
}

Controller

Now, expose an endpoint to chat with the model.

package com.kgisl.chat0;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import dev.langchain4j.model.chat.ChatLanguageModel;

@RestController
public class ChatController {
    private final ChatLanguageModel chatLanguageModel;

    public ChatController(ChatLanguageModel chatLanguageModel) {
        this.chatLanguageModel = chatLanguageModel;
    }

    @GetMapping("/chat")
    public String model(@RequestParam(value = "message", defaultValue = "Hello") String message) {
        return chatLanguageModel.chat(message);
    }
}

Hit http://localhost:8080/chat?message=Hi and you’ll get a response from Gemini. This setup manually wires the LLM into Spring Boot, giving you control over the model configuration.

AI Services

LangChain4j’s AiService abstraction simplifies things further by handling LLM interactions declaratively. Here’s how to use it.

Dependencies

Add the Spring Boot starter alongside the previous dependencies.

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

AI Service Interface

Define an interface with a system message to guide the LLM’s behavior.

package com.kgisl.chat0;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
interface Assistant {
    @SystemMessage("You are a polite assistant")
    String chat(String userMessage);
}

Controller

Inject the service into a controller. (If the earlier ChatController is still registered, change one of the paths; two identical GET /chat mappings will fail at startup with an ambiguous-mapping error.)

package com.kgisl.chat0;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AssistantController {
    @Autowired
    private Assistant assistant;

    @GetMapping("/chat")
    public String chat(String message) {
        return assistant.chat(message);
    }
}

With langchain4j-spring-boot-starter on the classpath, the @AiService interface is implemented automatically and backed by the ChatLanguageModel bean from GeminiConfig. Call /chat?message=Hello and you’ll get a polite response. This approach cuts boilerplate and leverages Spring’s dependency injection.
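To add memory to this starter-based setup, @AiService can also pick up a ChatMemoryProvider bean from the application context. A minimal sketch, where the 10-message window is an arbitrary choice:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import dev.langchain4j.memory.chat.ChatMemoryProvider;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

@Configuration
public class MemoryConfig {
    // Each conversation (memory id) gets its own sliding window
    // holding the 10 most recent messages.
    @Bean
    ChatMemoryProvider chatMemoryProvider() {
        return memoryId -> MessageWindowChatMemory.withMaxMessages(10);
    }
}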

Why is the ChatController with Assistant generally the better choice?

To decide which controller to use, let’s compare the two approaches: a ChatLanguageModelController that calls the model directly, and a ChatController that goes through an Assistant interface. The choice depends on your use case and how you intend to interact with the chat model. Here’s the breakdown.

1. First ChatLanguageModelController (Using ChatLanguageModel Directly)

@RestController
class ChatLanguageModelController {

    ChatLanguageModel chatLanguageModel;

    ChatLanguageModelController(ChatLanguageModel chatLanguageModel) {
        this.chatLanguageModel = chatLanguageModel;
    }

    @GetMapping("/model")
    public String model(@RequestParam(value = "message", defaultValue = "Hello") String message) {
        return chatLanguageModel.chat(message);
    }
}

Key Features

  • Direct Interaction: This controller calls the injected ChatLanguageModel (here, the Gemini model configured earlier) directly to generate responses.
  • HTTP Method: Uses @GetMapping with a query parameter (message), defaulting to "Hello" if no input is provided.
  • Simplicity: Minimal setup—just takes a user message and returns the model’s raw response.
  • No Memory: Does not include chat memory, so each request is stateless (no conversation history is maintained).
  • No System Message: No predefined behavior or role (e.g., "polite assistant") is enforced.

Pros

  • Simple and lightweight—ideal for one-off queries or testing the raw model output.
  • Easy to integrate into a basic API endpoint.
  • No overhead from additional abstractions like AiServices or tools.

Cons

  • Lacks conversation context (no ChatMemory), so it’s not suitable for multi-turn conversations.
  • No customization of the assistant’s behavior via a system message.
  • Limited to GET requests with query parameters, which might not be ideal for longer inputs (e.g., complex prompts).

Best Use Case

  • Quick prototyping or single-shot interactions with the model (e.g., a "Hello World" test).
  • Scenarios where conversation history isn’t needed, and you just want the raw model output.

2. Second ChatController (Using Assistant Interface with AiServices)

@RestController
@RequestMapping("/api/chat")
public class ChatController {

    private final Assistant assistant;

    public ChatController(Assistant assistant) {
        this.assistant = assistant;
    }

    @PostMapping
    public String chat(@RequestBody String userMessage) {
        return assistant.chat(userMessage);
    }
}

public interface Assistant {
    @SystemMessage("You are a polite assistant capable of providing SQL Queries, weather forecasts and writing poems.")
    String chat(String userMessage);
}
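Note that the memory and system message in the feature list below come from how this Assistant is constructed. When wired manually with AiServices (instead of the Spring Boot starter), the setup might look like this sketch, where model is the ChatLanguageModel bean from earlier:

import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.service.AiServices;

// Build the Assistant by hand: the system message comes from the
// interface annotation; the memory keeps the last 10 messages.
Assistant assistant = AiServices.builder(Assistant.class)
        .chatLanguageModel(model)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .build();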

Key Features

  • Indirect Interaction: Uses the Assistant interface, which is built with AiServices and wraps the ChatLanguageModel.
  • HTTP Method: Uses @PostMapping with a request body, allowing larger and more complex inputs.
  • Chat Memory: Configured with MessageWindowChatMemory.withMaxMessages(10), enabling a 10-message conversation history.
  • System Message: Defines the assistant’s behavior ("polite assistant capable of providing SQL queries, weather forecasts, and writing poems").
  • Abstraction: Built using AiServices, which supports additional features like tools (though none are configured here).

Pros

  • Supports multi-turn conversations with memory, making it suitable for interactive chat applications.
  • Customizable assistant behavior via the @SystemMessage annotation.
  • POST method with @RequestBody is better for sending larger or structured inputs (e.g., JSON payloads).
  • Extensible with tools (e.g., for SQL generation or weather APIs) if needed later.

Cons

  • Slightly more complex setup due to AiServices and the Assistant interface.
  • Overhead from chat memory and additional abstraction, which might be unnecessary for simple use cases.

Best Use Case

  • Interactive chatbot applications where conversation history and a defined assistant persona are important.
  • Scenarios requiring advanced features like tools or multi-turn dialogues (e.g., generating SQL queries or poems based on prior context).

Comparison and Recommendation

| Feature              | ChatLanguageModelController | ChatController with Assistant |
|----------------------|-----------------------------|-------------------------------|
| HTTP Method          | GET (query param)           | POST (request body)           |
| Chat Memory          | None                        | Yes (10 messages)             |
| System Message       | None                        | Yes (polite assistant)        |
| Complexity           | Low                         | Moderate                      |
| Conversation Support | Single-turn                 | Multi-turn                    |
| Input Flexibility    | Limited (query string)      | High (request body)           |
| Extensibility        | Basic                       | High (tools, memory)          |

Which is Best?

  • Use ChatLanguageModelController if:
    • You need a simple, stateless API for one-off model queries.
    • You’re testing or prototyping and don’t need conversation history or a defined persona.
    • Example: A basic endpoint to test raw model responses.
  • Use ChatController with Assistant if:
    • You’re building an interactive chatbot that needs to remember past messages (e.g., for SQL queries or poem generation based on prior input).
    • You want to enforce a specific assistant behavior (e.g., politeness) or plan to add tools later.
    • Example: A conversational app where users can ask follow-up questions.

Why?

The ChatController with Assistant is generally the better choice for most real-world applications because it offers conversation memory, a defined persona, and extensibility, which align with the needs of a robust chat system. The ChatLanguageModelController is too basic for anything beyond simple testing—it lacks the features needed for a production-ready chatbot.

If your goal is to leverage the capabilities mentioned in the Assistant interface (SQL queries, weather forecasts, poems), go with the ChatController. It’s more aligned with those requirements and provides a foundation for future growth.

Chat Options

For more control, LangChain4j lets you fine-tune the Gemini model with options like temperature, token limits, and function calling. Here’s an example.

ChatLanguageModel gemini = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-1.5-flash")
    .temperature(1.0)                 // Creativity level (0-2)
    .topP(0.95)                        // Nucleus sampling
    .topK(64)                          // Top-K sampling
    .maxOutputTokens(8192)             // Max response length
    .timeout(Duration.ofSeconds(60))   // Request timeout
    .candidateCount(1)                 // Number of responses
    .responseFormat(ResponseFormat.JSON) // Output in JSON
    .stopSequences(List.of("STOP"))    // Stop at specific phrases
    .toolConfig(GeminiMode.ANY, List.of("fnOne", "fnTwo")) // Function calling
    .allowCodeExecution(true)          // Execute generated code
    .logRequestsAndResponses(true)     // Debug logging
    .safetySettings(Map.of(
        GeminiHarmCategory.HATE_SPEECH, GeminiHarmBlockThreshold.MEDIUM
    )) // Safety filters
    .build();

This configuration could replace the @Bean in the first example. It’s ideal for apps needing specific LLM behavior, like structured JSON output or dynamic tool execution.
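Structured JSON output pairs naturally with LangChain4j’s output parsers: an AiService method can declare a typed return value and have the response parsed for you. A minimal sketch; the Recipe type and prompt are illustrative:

import java.util.List;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;

// LangChain4j parses the model's reply into this type.
record Recipe(String title, List<String> ingredients) {}

interface RecipeExtractor {
    @UserMessage("Extract a recipe from the following text: {{it}}")
    Recipe extract(String text);
}

// gemini is the ChatLanguageModel built above
RecipeExtractor extractor = AiServices.create(RecipeExtractor.class, gemini);
Recipe recipe = extractor.extract("Mix flour, water, and salt, then bake for 30 minutes...");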

Types of Chat Messages with AiService

  1. SystemMessage
    • Defined via @SystemMessage in the MathAssistant interface (sketched after this list).
    • Tells the AI: "You’re a math assistant; use the add tool for addition."
    • Handled internally by AiService, sent as the first message in the conversation.
  2. UserMessage
    • Defined via @UserMessage("{{message}}").
    • Takes the input from the controller (e.g., "Add 5 and 3") and passes it to the LLM.
    • AiService wraps it automatically.
  3. AiMessage
    • Generated by the underlying ChatLanguageModel (Gemini) when mathAssistant.chat() is called.
    • It could contain text (e.g., "Let me calculate that") or a ToolExecutionRequest (e.g., for add).
    • AiService processes this transparently and moves to tool execution if needed.
  4. ToolExecutionResultMessage
    • Created automatically when the add tool is executed.
    • If the AI requests add(5, 3), the tool returns 8, and AiService feeds this back to the LLM to finalize the response (e.g., "The sum is 8").
    • You don’t handle this manually—AiService does the heavy lifting.
  5. CustomMessage
    • Not applicable here because AiService doesn’t support it natively, and Gemini doesn’t use it (it’s Ollama-specific).
    • You’d need a custom ChatLanguageModel implementation to use it.
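The MathAssistant and add tool referenced above are not shown earlier in this article; a minimal sketch of how they could be declared (names and wiring are illustrative):

import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;

class Calculator {
    @Tool("Adds two numbers")
    int add(int a, int b) {
        // The return value is wrapped in a ToolExecutionResultMessage
        // and fed back to the LLM automatically.
        return a + b;
    }
}

interface MathAssistant {
    @SystemMessage("You are a math assistant; use the add tool for addition.")
    @UserMessage("{{message}}")
    String chat(@V("message") String message);
}

// gemini is the ChatLanguageModel configured earlier
MathAssistant mathAssistant = AiServices.builder(MathAssistant.class)
        .chatLanguageModel(gemini)
        .tools(new Calculator())
        .build();

String reply = mathAssistant.chat("Add 5 and 3"); // e.g., "The sum is 8"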