Redis backed Caching of Internal Embeddings#3532
Conversation
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…configuration Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
… and telemetry integration Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…ate empty embeddings Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…oint/health sub-objects Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…orization Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…lthCheckConfig in converter
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…dding controller tests, switching the dab schema for the embedding system to default to false
embeddings endpoint is now permanently fixed to /embed with no user-configurable path option. This removes unnecessary configuration surface since the feature has not been released yet, eliminating the need for backward compatibility. Changes: - Remove path property from dab.draft.schema.json - Remove Path, UserProvidedPath, and EffectivePath from EmbeddingsEndpointOptions - Remove EffectiveEndpointPath from EmbeddingsOptions - Remove path deserialization from EmbeddingsOptionsConverterFactory - Remove --runtime.embeddings.endpoint.path CLI option - Remove path configuration logic from ConfigGenerator - Remove endpoint path validation from RuntimeConfigValidator - Update Startup.cs logging to use DEFAULT_PATH constant - Update all tests to remove path references
There was a problem hiding this comment.
Pull request overview
This PR extends the embeddings feature set (“phase2”) by adding embeddings-specific cache configuration (L1 in-memory + optional L2 Redis), wiring a dedicated FusionCache instance for embeddings in the Service startup path, and updating config serialization/schema and tests to cover the new options and related behaviors.
Changes:
- Add
runtime.embeddings.cacheconfiguration surface (object model + JSON converters + JSON schema). - Configure a named FusionCache instance (
EmbeddingsCache) and optionally attach Redis distributed cache/backplane for embeddings. - Add/adjust unit tests and update configuration snapshots to reflect the new runtime config shape.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Service/Startup.cs | Adds embeddings cache setup and overrides embedding service registration to use a named FusionCache (and optional Redis). |
| src/Service/HealthCheck/HealthCheckHelper.cs | Adds fields intended to track incoming role/token for health checks. |
| src/Service/Controllers/EmbeddingController.cs | Minor formatting change (blank line). |
| src/Service.Tests/UnitTests/StartupRedisConnectionTests.cs | New unit tests for Redis Entra-auth detection logic in Startup. |
| src/Service.Tests/UnitTests/EmbeddingsCacheOptionsTests.cs | New unit tests for embeddings cache option defaults/behavior. |
| src/Service.Tests/UnitTests/EmbeddingsCacheOptionsSerializationTests.cs | New unit tests validating (de)serialization of embeddings cache options with runtime config serializer options. |
| src/Service.Tests/UnitTests/EmbeddingsCacheLevel2OptionsTests.cs | New unit tests for embeddings L2 cache option record/config behavior. |
| src/Service.Tests/UnitTests/EmbeddingControllerTests.cs | Extends controller tests around service availability; currently introduces duplicate local declarations. |
| src/Service.Tests/UnitTests/ConfigValidationUnitTests.cs | Adds additional embeddings validation tests; currently duplicates existing test methods in the same class. |
| src/Service.Tests/Snapshots/ConfigurationTests.TestReadingRuntimeConfigForPostgreSql.verified.txt | Updates expected runtime-config snapshot output to match new serialization/config shape. |
| src/Service.Tests/Snapshots/ConfigurationTests.TestReadingRuntimeConfigForMySql.verified.txt | Updates expected runtime-config snapshot output to match new serialization/config shape. |
| src/Service.Tests/Snapshots/ConfigurationTests.TestReadingRuntimeConfigForCosmos.verified.txt | Updates expected runtime-config snapshot output to match new serialization/config shape. |
| src/Service.Tests/Configuration/ConfigurationTests.cs | Fixes JSON request body formatting in an existing strict-mode test. |
| src/Core/Services/Embeddings/EmbeddingTelemetryHelper.cs | Adds OpenTelemetry namespace import for tracing-related extensions. |
| src/Config/RuntimeConfigLoader.cs | Registers the new EmbeddingsCacheOptionsConverterFactory in serializer options. |
| src/Config/ObjectModel/Embeddings/EmbeddingsOptions.cs | Adds cache property to embeddings runtime options and helper booleans for cache enablement. |
| src/Config/ObjectModel/Embeddings/EmbeddingsCacheOptions.cs | Introduces embeddings cache options model (enabled/ttl/level2 + “user provided TTL” semantics). |
| src/Config/ObjectModel/Embeddings/EmbeddingsCacheLevel2Options.cs | Introduces embeddings L2 cache options model (Redis connection string). |
| src/Config/Converters/EmbeddingsOptionsConverterFactory.cs | Extends embeddings options converter to read/write the new cache section. |
| src/Config/Converters/EmbeddingsCacheOptionsConverterFactory.cs | New custom converter to control embeddings cache options serialization (avoid writing default TTL unless explicitly provided). |
| src/Cli/Commands/ConfigureOptions.cs | Adds CLI flags for embeddings runtime configuration; currently contains duplicated option/property blocks. |
| schemas/dab.draft.schema.json | Adds schema definition for runtime.embeddings.cache (including L2 settings). |
Comments suppressed due to low confidence (1)
src/Cli/Commands/ConfigureOptions.cs:474
- Duplicate option properties were introduced for embeddings (e.g., RuntimeEmbeddingsEnabled/Provider/BaseUrl/etc. are declared twice). This will not compile due to duplicate member definitions and duplicate [Option] attributes; remove the duplicated block and keep a single set of embeddings options (including endpoint.path/chunking options as needed).
[Option("runtime.embeddings.enabled", Required = false, HelpText = "Enable/disable the embedding service. Default: true")]
public CliBool? RuntimeEmbeddingsEnabled { get; }
[Option("runtime.embeddings.provider", Required = false, HelpText = "Configure embedding provider type. Allowed values: azure-openai, openai.")]
public EmbeddingProviderType? RuntimeEmbeddingsProvider { get; }
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
|
/azp run |
|
Azure Pipelines successfully started running 6 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 6 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 6 pipeline(s). |
souvikghosh04
left a comment
There was a problem hiding this comment.
Posting some comments which needs to be fixed. Will have another review after that.
| services.AddSingleton<IEmbeddingService>(serviceProvider => | ||
| { | ||
| IFusionCache cache = embeddingsOptions.IsCachingEnabled | ||
| ? serviceProvider.GetRequiredService<IFusionCacheProvider>().GetCache("EmbeddingsCache") | ||
| : serviceProvider.GetRequiredService<IFusionCache>(); // Fallback to default cache | ||
|
|
||
| ILogger<EmbeddingService> logger = serviceProvider.GetRequiredService<ILogger<EmbeddingService>>(); | ||
| IHttpClientFactory httpClientFactory = serviceProvider.GetRequiredService<IHttpClientFactory>(); | ||
| HttpClient httpClient = httpClientFactory.CreateClient(typeof(EmbeddingService).Name); | ||
|
|
||
| return new EmbeddingService(httpClient, embeddingsOptions, logger, cache); | ||
| }); |
There was a problem hiding this comment.
is this a replacement for the existing AddHttpClient<IEmbeddingService> added above previously? please check and validate
There was a problem hiding this comment.
Yes, also added comments around it.
There was a problem hiding this comment.
currently at line 464, you have services.AddHttpClient(nameof(EmbeddingService));. you can may be change to services.AddHttpClient(nameof(EmbeddingService)); as you have the same below
| IFusionCache cache = embeddingsOptions.IsCachingEnabled | ||
| ? serviceProvider.GetRequiredService<IFusionCacheProvider>().GetCache("EmbeddingsCache") | ||
| : serviceProvider.GetRequiredService<IFusionCache>(); // Fallback to default cache |
There was a problem hiding this comment.
is this behaviour correct? currently its falling back to IFusionCache if caching is disabled. shouldn't it completely disable caching if that's what the IsCachingEnabled means?
There was a problem hiding this comment.
No, we default to L1 cache always unless L2 is configured.
There was a problem hiding this comment.
I understand but the naming IsCachingEnabled is confusing. it reads as enabled or disabled but doesn't tell anything about falling back to L1. may be use meaningful naming?
Co-authored-by: Copilot <copilot@github.com>
souvikghosh04
left a comment
There was a problem hiding this comment.
added additional comments. mainly about Redis connection failure -> fallback, retry, loggings. additional tests.
| // Create connection multiplexer for embeddings Azure Managed Redis cache | ||
| Task<IConnectionMultiplexer> connectionMultiplexerTask = | ||
| CreateConnectionMultiplexerAsync(level2Options.ConnectionString); |
There was a problem hiding this comment.
if Redis connection fails here where is the fallback? also, can we know why connection failed? do you have tests covering this? some failures could be transient so may be retry can succeed? can you refactor and address these gaps?
| if (TtlHours is not null) | ||
| { | ||
| this.TtlHours = TtlHours; | ||
| UserProvidedTtlHours = true; | ||
| } |
There was a problem hiding this comment.
you need additional validations like negative or zero ttl in Runtime. although JSON schema enforces it but runtime properties should have validations/checks.
| [NonAction] | ||
| [Consumes("text/plain", "application/json")] | ||
| [Produces("application/json", "text/plain")] |
There was a problem hiding this comment.
if its marked NonAction then what is the use of Consumes and Produces?
| services.AddSingleton<IEmbeddingService>(serviceProvider => | ||
| { | ||
| IFusionCache cache = embeddingsOptions.IsCachingEnabled | ||
| ? serviceProvider.GetRequiredService<IFusionCacheProvider>().GetCache("EmbeddingsCache") | ||
| : serviceProvider.GetRequiredService<IFusionCache>(); // Fallback to default cache | ||
|
|
||
| ILogger<EmbeddingService> logger = serviceProvider.GetRequiredService<ILogger<EmbeddingService>>(); | ||
| IHttpClientFactory httpClientFactory = serviceProvider.GetRequiredService<IHttpClientFactory>(); | ||
| HttpClient httpClient = httpClientFactory.CreateClient(typeof(EmbeddingService).Name); | ||
|
|
||
| return new EmbeddingService(httpClient, embeddingsOptions, logger, cache); | ||
| }); |
There was a problem hiding this comment.
currently at line 464, you have services.AddHttpClient(nameof(EmbeddingService));. you can may be change to services.AddHttpClient(nameof(EmbeddingService)); as you have the same below
1. Embeddings-Specific Cache Configuration
EmbeddingsCacheOptionsclass for embeddings-specific cache settings2. Level 2 (L2) Distributed Redis Cache Support
EmbeddingsCacheLevel2Optionsclass for Redis-specific configuration3. Enhanced Configuration Schema
cachesection underembeddings4. Dedicated FusionCache Instance for Embeddings
5. Comprehensive Test Coverage
EmbeddingsCacheOptionsclass (205 lines)EmbeddingsCacheLevel2Optionsclass (64 lines)