Spring AI Embeddings
Chat Memory and Spring Advisors
Problem with Default ChatClient
- LLMs are stateless and do not remember past conversations by default.
- Each message is treated independently, causing the model to forget prior context.
- Follow-up queries like “more” fail because the model is unaware of previous responses.
- Tools like ChatGPT appear to remember only because the application around the model stores the conversation and sends it back with each request.
Introducing Advisors in Spring AI
- Advisors intercept and modify requests before they reach the LLM, and responses before they return to the caller.
- They help extend model behaviour without modifying the LLM itself.
- Advisors can be used for adding memory, safeguarding responses, or monitoring I/O.
- They provide a structured mechanism to enrich chat interactions.
Memory with Advisors
- Memory is added using MessageChatMemoryAdvisor, which stores previous messages and adds them to each new prompt.
- InMemoryChatMemory holds the conversation history within application memory, so the history is lost when the application restarts.
- With memory enabled, follow-up questions like “more” or “explain briefly” work correctly.
- Memory ensures continuity in multi-turn conversations.
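The mechanism described above can be sketched in plain Java. This is an illustration of the idea only: `MemoryAdvisorSketch`, `MemoryAdvisor`, and `statelessModel` are hypothetical names invented for this example and are not Spring AI classes. The stand-in "model" simply reports how many messages it received, which makes the effect of the memory wrapper visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: none of these types belong to Spring AI.
class MemoryAdvisorSketch {

    // A stateless stand-in for an LLM: it only sees the messages
    // passed in this one call and reports how many there were.
    static String statelessModel(List<String> messages) {
        return "model saw " + messages.size() + " message(s)";
    }

    // Advisor-like wrapper: prepends the stored history to each request,
    // then records the user message and the reply for the next turn.
    static class MemoryAdvisor {
        private final List<String> history = new ArrayList<>();
        private final Function<List<String>, String> model;

        MemoryAdvisor(Function<List<String>, String> model) {
            this.model = model;
        }

        String call(String userMessage) {
            List<String> request = new ArrayList<>(history);
            request.add("user: " + userMessage);
            String reply = model.apply(request);
            history.add("user: " + userMessage);
            history.add("assistant: " + reply);
            return reply;
        }
    }

    public static void main(String[] args) {
        MemoryAdvisor chat = new MemoryAdvisor(MemoryAdvisorSketch::statelessModel);
        System.out.println(chat.call("Tell me about Spring AI")); // model saw 1 message(s)
        System.out.println(chat.call("more"));                    // model saw 3 message(s)
    }
}
```

On the second call the model receives the first question, the first reply, and the new message, which is why a follow-up such as "more" has context to work with. In Spring AI the same role is played by MessageChatMemoryAdvisor registered on the ChatClient; the exact construction differs between Spring AI versions, so check the reference documentation for the release in use.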
Key Takeaways
- LLMs forget unless explicit memory is added.
- Advisors enhance and customize ChatClient behaviour.
- Memory advisors allow creating conversational applications with context retention.
- Memory enables natural, human-like handling of multi-turn dialogue.
Prompt Template
Why We Need Prompt Templates
- Helps avoid rewriting full prompts repeatedly by allowing dynamic replacement of values.
- Enables building endpoints that take user inputs and auto-generate structured prompts.
- Improves reusability and consistency, and reduces manual effort.
- Maintains prompt quality across API calls.
How Prompt Templates Work
- Templates contain placeholders such as {type}, {year}, and {lang}.
- User-supplied input values are mapped to these placeholders during runtime.
- On each API call, templates are processed to generate final prompts.
- Produces clear and formatted prompts without manual writing.
Key Flow of Prompt Template Usage
- User sends a request with required parameters like type, year, and language.
- Controller extracts parameters and binds them to template variables.
- PromptTemplate replaces placeholders with real user inputs.
- The final prompt is sent to the model, which returns the AI response.
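The placeholder substitution step in this flow can be sketched in plain Java. This is not Spring AI's PromptTemplate implementation; `PromptTemplateSketch`, `render`, and the example template text are hypothetical and only illustrate how `{type}`, `{year}`, and `{lang}` are replaced with request values.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of placeholder substitution; not Spring AI's PromptTemplate.
class PromptTemplateSketch {

    // Matches placeholders of the form {name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces every {name} in the template with values.get(name).
    // Fails fast if a placeholder has no value, so no unresolved
    // variable can leak into the final prompt.
    static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + name);
            }
            // quoteReplacement keeps '$' and '\' in user input literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed example template; in a controller these values
        // would come from the request parameters.
        String template = "Suggest one {type} movie released in {year}, and answer in {lang}.";
        String prompt = render(template, Map.of("type", "comedy", "year", "2020", "lang", "English"));
        System.out.println(prompt);
        // Suggest one comedy movie released in 2020, and answer in English.
    }
}
```

Rejecting a missing value, rather than leaving `{year}` in the text, is a design choice of this sketch: it surfaces a forgotten request parameter before the prompt is sent to the model.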
Key Takeaways
- PromptTemplate makes prompts reusable and easy to maintain.
- Map every placeholder to an actual value so the final prompt contains no unresolved variables.
- Templates can be edited anytime to improve formatting or output style.
- Simplifies prompt engineering inside Spring AI applications.
Embeddings
Definition
- Embeddings represent text as numerical vectors capturing meaning and relationships.
- They allow semantic understanding rather than keyword matching.
- Spring AI enables generating embeddings directly from Java applications.
- Embeddings form the core of similarity search and knowledge retrieval.
Embedding Workflow in Spring AI
- A controller receives input text for embedding generation.
- The EmbeddingModel converts the text into a vector of float values.
- Models like text-embedding-3-large can be configured for embedding tasks.
- Embeddings are returned as arrays representing semantic information.
When Multiple Embedding Models Exist
- A Spring AI application may include multiple embedding providers, such as OpenAI and Ollama.
- @Qualifier specifies exactly which embedding model bean should be injected.
- Avoids ambiguity during dependency injection.
- Ensures consistent vector generation from the chosen provider.
Key Takeaways
- Embedding models output high-dimensional float vectors representing text meaning.
- Embeddings are useful in clustering, similarity search, and information retrieval.
- They form the basis for tasks like categorization and RAG pipelines.
Cosine Similarity
Definition
- Cosine similarity measures how similar two vectors are by computing the cosine of the angle between them.
- Represents closeness in meaning rather than exact matching.
- Values range from -1 to 1: 1 means the vectors point in the same direction, 0 means they are orthogonal (unrelated), and -1 means they point in opposite directions.
- Commonly used in NLP tasks.
Steps in Cosine Similarity Calculation
- Convert both pieces of text into embeddings using an EmbeddingModel.
- Compute the dot product between both embedding vectors.
- Calculate the magnitude (norm) of each vector.
- Divide the dot product by the product of the magnitudes to get the cosine similarity score.
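The last three steps above can be sketched in plain Java, assuming the embeddings have already been generated as float arrays of equal length. `CosineSimilaritySketch` and `cosineSimilarity` are hypothetical names for this example, and the small vectors in `main` are hand-made illustrations, not real embeddings.

```java
// Hypothetical helper; assumes both embeddings are float arrays of equal length.
class CosineSimilaritySketch {

    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;   // sum of a[i] * b[i]
        double normA = 0.0; // sum of a[i]^2
        double normB = 0.0; // sum of b[i]^2
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            // The angle is undefined for a zero vector.
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        // Dot product divided by the product of the magnitudes.
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy two-dimensional vectors, chosen by hand for illustration.
        float[] x = {1f, 0f};
        float[] y = {1f, 1f};
        float[] z = {0f, 1f};
        System.out.println(cosineSimilarity(x, y)); // about 0.7071 (1 / sqrt(2))
        System.out.println(cosineSimilarity(x, z)); // 0.0 (orthogonal)
    }
}
```

Accumulating in `double` rather than `float` is a choice made here to limit rounding error when vectors have thousands of dimensions.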
Usage
- Helps identify semantic similarity in text pairs.
- Returns higher scores for related words like “computer” and “laptop.”
- Useful in search engines, recommendation systems, and clustering.
- Supports intelligent retrieval beyond keyword matching.
Key Takeaways
- Provides a numerical measure of text similarity.
- Works on embeddings generated from text input.
- Used in semantic search and related-content discovery.
- Essential in implementing retrieval-based systems.
