
Spring AI Embeddings

Chat Memory and Spring Advisors

Problem with Default ChatClient

  • LLMs are stateless and do not remember past conversations by default.
  • Each message is treated independently, causing the model to forget prior context.
  • Follow-up queries like “more” fail because the model is unaware of previous responses (see the sketch after this list).
  • Tools like ChatGPT appear to remember only because they use additional memory layers.
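
A minimal sketch of the failure (assuming an auto-configured ChatModel bean named chatModel; the prompts are illustrative):

```java
import org.springframework.ai.chat.client.ChatClient;

// Two independent calls to a default ChatClient: no history is carried over,
// so the second request reaches the model as nothing but the word "more".
ChatClient chat = ChatClient.create(chatModel);

String first  = chat.prompt().user("List three features of Java").call().content();
String second = chat.prompt().user("more").call().content(); // model cannot tell what "more" refers to
```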

Introducing Advisors in Spring AI

  • Advisors intercept requests before they reach the LLM and responses after they return, allowing either to be modified.
  • They help extend model behaviour without modifying the LLM itself.
  • Advisors can be used for adding memory, safeguarding responses, or monitoring I/O (the sketch after this list shows the logging case).
  • They provide a structured mechanism to enrich chat interactions.
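
As an illustration, advisors are registered when the ChatClient is built; Spring AI's SimpleLoggerAdvisor, which logs model input and output, is a ready-made example (a sketch, assuming an injected ChatModel):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;

// Advisors added here wrap every request and response of this client.
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(new SimpleLoggerAdvisor()) // logs model I/O for monitoring
        .build();
```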

Memory with Advisors

  • Memory is added using MessageChatMemoryAdvisor, which enables storing previous messages.
  • InMemoryChatMemory holds the conversation history within application memory (wired together in the sketch below).
  • With memory enabled, follow-up questions like “more” or “explain briefly” work correctly.
  • Memory ensures continuity in multi-turn conversations.
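
A minimal sketch wiring the two classes named above (constructor style as in the Spring AI milestone releases; newer versions replace InMemoryChatMemory and use a builder on the advisor, so check your version):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.InMemoryChatMemory;

// The advisor prepends the stored history to each request and records each reply.
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(new MessageChatMemoryAdvisor(new InMemoryChatMemory()))
        .build();

chatClient.prompt().user("List three features of Java").call().content();
chatClient.prompt().user("more").call().content(); // now answered in the context of the first reply
```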

Key Takeaways

  • LLMs forget unless explicit memory is added.
  • Advisors enhance and customize ChatClient behaviour.
  • Memory advisors allow creating conversational applications with context retention.
  • This enables natural, human-like multi-turn dialogue handling.

Prompt Template

Why We Need Prompt Templates

  • Helps avoid rewriting full prompts repeatedly by allowing dynamic replacement of values.
  • Enables building endpoints that take user inputs and auto-generate structured prompts.
  • Enhances reusability, consistency, and reduces manual effort.
  • Maintains prompt quality across API calls.

How Prompt Templates Work

  • Templates contain placeholders such as {type}, {year}, and {lang}.
  • User-supplied input values are mapped to these placeholders during runtime.
  • On each API call, templates are processed to generate final prompts.
  • Produces clear, well-formatted prompts without manual writing (see the sketch below).
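
For example, the placeholders above can be filled at render time (a sketch; the template wording and values are illustrative):

```java
import java.util.Map;

import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;

// Curly-brace placeholders are replaced with the mapped values when the prompt is created.
PromptTemplate template = new PromptTemplate(
        "Suggest one {type} movie released in {year}, described in {lang}.");

Prompt prompt = template.create(Map.of("type", "comedy", "year", "2015", "lang", "English"));
```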

Key Flow of Prompt Template Usage

  • User sends a request with required parameters like type, year, and language.
  • Controller extracts parameters and binds them to template variables.
  • PromptTemplate replaces placeholders with real user inputs.
  • The final prompt is executed by the model for a structured AI response, as in the controller sketch below.
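
Putting the flow together, a controller sketch (path, class, and template wording are illustrative; assumes the auto-configured ChatClient.Builder):

```java
import java.util.Map;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class MovieController {

    private static final String TEMPLATE =
            "Suggest one {type} movie released in {year}, described in {lang}.";

    private final ChatClient chatClient;

    MovieController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Request parameters are bound to the template placeholders on every call.
    @GetMapping("/movies")
    String movies(@RequestParam String type,
                  @RequestParam String year,
                  @RequestParam String lang) {
        var prompt = new PromptTemplate(TEMPLATE)
                .create(Map.of("type", type, "year", year, "lang", lang));
        return chatClient.prompt(prompt).call().content();
    }
}
```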

Key Takeaways

  • PromptTemplate makes prompts reusable and easy to maintain.
  • Always map placeholders to actual values for accuracy.
  • Templates can be edited anytime to improve formatting or output style.
  • Simplifies prompt engineering inside Spring AI applications.

Embeddings

Definition

  • Embeddings represent text as numerical vectors capturing meaning and relationships.
  • They allow semantic understanding rather than keyword matching.
  • Spring AI enables generating embeddings directly from Java applications.
  • Embeddings form the core of similarity search and knowledge retrieval.

Embedding Workflow in Spring AI

  • A controller receives input text for embedding generation.
  • The EmbeddingModel converts the text into a vector of float values.
  • Models like text-embedding-3-large can be configured for embedding tasks.
  • Embeddings are returned as arrays of floats representing semantic information (see the endpoint sketch below).
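
A minimal controller along those lines (class name and endpoint path are illustrative; assumes a single auto-configured EmbeddingModel bean):

```java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Returns the raw embedding vector (an array of floats) for the given text.
    @GetMapping("/embed")
    float[] embed(@RequestParam String text) {
        return embeddingModel.embed(text);
    }
}
```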

When Multiple Embedding Models Exist

  • A Spring AI application may configure multiple embedding providers, such as OpenAI and Ollama, each contributing its own EmbeddingModel bean.
  • @Qualifier specifies exactly which embedding model bean should be injected (see the sketch after this list).
  • Avoids ambiguity during dependency injection.
  • Ensures consistent vector generation from the chosen provider.
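
A sketch of disambiguating with @Qualifier (the bean names below follow Spring AI's auto-configuration defaults, but verify them against your version):

```java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

@Service
class SimilarityService {

    private final EmbeddingModel embeddingModel;

    // Selects the OpenAI bean explicitly; "ollamaEmbeddingModel" would pick Ollama instead.
    SimilarityService(@Qualifier("openAiEmbeddingModel") EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }
}
```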

Key Takeaways

  • Embeddings output high-dimensional float vectors representing text meaning.
  • Useful in clustering, similarity search, and information retrieval.
  • Forms the basis for tasks like search, categorization, and RAG pipelines.

Cosine Similarity

Definition

  • Cosine similarity measures how similar two vectors are by computing the cosine of the angle between them.
  • Represents closeness in meaning rather than exact matching.
  • Values range from -1 to 1, where higher values mean higher similarity (see the formula below).
  • Commonly used in NLP tasks.
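
In symbols, for two embedding vectors A and B (the standard definition):

$$\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$$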

Steps in Cosine Similarity Calculation

  • Convert both pieces of text into embeddings using an EmbeddingModel.
  • Compute the dot product between both embedding vectors.
  • Calculate the magnitude (norm) of each vector.
  • Divide the dot product by the product of the magnitudes to get the cosine similarity score (implemented in the sketch below).
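
Those steps as a minimal sketch in plain Java (assumes both vectors come from the same EmbeddingModel and therefore have equal length):

```java
import org.springframework.ai.embedding.EmbeddingModel;

class CosineSimilarity {

    // cosine(a, b) = dot(a, b) / (||a|| * ||b||)
    static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot   += a[i] * b[i]; // dot product
            normA += a[i] * a[i]; // squared magnitude of a
            normB += b[i] * b[i]; // squared magnitude of b
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Example usage with an injected EmbeddingModel:
    static double score(EmbeddingModel embeddingModel) {
        float[] v1 = embeddingModel.embed("computer");
        float[] v2 = embeddingModel.embed("laptop");
        return cosineSimilarity(v1, v2); // closer to 1.0 means more similar
    }
}
```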

Usage

  • Helps identify semantic similarity in text pairs.
  • Returns higher scores for related words like “computer” and “laptop.”
  • Useful in search engines, recommendation systems, and clustering.
  • Supports intelligent retrieval beyond keyword matching.

Key Takeaways

  • Provides numerical measure of text similarity.
  • Works on embeddings generated from text input.
  • Used in semantic search and related-content discovery.
  • Essential in implementing retrieval-based systems.
