Chat with your data - Semantic Kernel-powered RAG application

My previous blog post showed how to create a RAG application using .NET Aspire, Ollama, and Semantic Kernel. If you're interested in local small language models and .NET Aspire setup in general, I recommend reading that post first.

This blog post continues the development of the same application but concentrates on extending Semantic Kernel with Plugins. The purpose is to add the capability to chat with your own data (cross-domain) and to test how well basic mathematical operations work with the GPT-4 model.

I use electricity consumption and price data with the GPT-4 model in this sample. The aim is to test how well the bot answers questions like these:

  • What was my energy consumption on 15.2.2024 at 10:00?
  • What was my energy cost on 1.1.2024 from 00:00 to 23:59?
  • What was my total energy consumption in January 2024?

👉 See also my older blog post about detecting anomalies in energy consumption using Azure Anomaly Detector.

Function calling and Plugins?

First, a few words about the theory and terminology you should know.

Function calling

Note that on other platforms, functions are often called "tools" or "actions". Function calling is only available in OpenAI models of version 0613 or newer.

"Function calling allows you to connect models like gpt-4o to external tools and systems. This is useful for many things such as empowering AI assistants with capabilities, or building deep integrations between your applications and the models." (Source: Introduction)

I recommend reading this article to understand how function calling works.

Plugins

"At a high-level, a plugin is a group of functions that can be exposed to AI apps and services. The functions within plugins can then be orchestrated by an AI application to accomplish user requests. Within Semantic Kernel, you can invoke these functions automatically with function calling." (Source: What is a Plugin?)

As noted above, function calling lets an LLM interact with external systems and data. Plugins, in turn, extend the capabilities of Semantic Kernel.

There are three types of Plugins available:

  1. Core Plugins. Currently, there are Core Plugins available for Http, Math, Text, Time, etc. operations. Note that these are for evaluation purposes only and may be removed in future releases.
  2. Native Code Plugins. Let you implement a plugin with custom code (e.g. C#, Python).
  3. OpenAPI specification Plugins. Let you use an OpenAPI specification to configure how data is fetched from an external API.


Application Overview

The solution follows the same architecture as in the previous blog post, except that Azure OpenAI Service is used instead of Ollama. This blog post concentrates on the configuration of Native Code Plugins.

Electricity consumption and price data

I use hourly time series data about consumption (kWh) and price (€). I'm using static JSON files as sample data sources in this application.

Electricity consumption data set

Electricity consumption data was fetched from a service called Datahub, a centralized electricity consumption data repository in Finland. The service is owned by Fingrid, Finland’s transmission system operator.

Datahub provides data exports in CSV format by default. I converted the CSV data to JSON, which is easier to handle in C# code.
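A minimal sketch of such a conversion, assuming a hypothetical two-column, semicolon-separated CSV layout. The real Datahub export format is not shown in this post and likely differs, so the column names and the inline sample string below are illustrative only.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.Json;

// Hypothetical CSV layout (NOT the real Datahub export format): "timestamp;kWh".
string csv = "timestamp;kWh\n2023-12-31T22:00:00Z;0.86\n2023-12-31T23:00:00Z;0.86";

var items = csv
    .Split('\n', StringSplitOptions.RemoveEmptyEntries)
    .Skip(1)                                   // skip the header row
    .Select(line => line.Split(';'))
    .Select(cols => new
    {
        // RoundtripKind keeps the "Z" suffix as DateTimeKind.Utc.
        date = DateTime.Parse(cols[0], CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind),
        // Invariant culture so that "0.86" parses the same on every machine.
        value = double.Parse(cols[1], CultureInfo.InvariantCulture)
    })
    .ToList();

// Property names are lowercase so they match the JSON keys used in this post.
string json = JsonSerializer.Serialize(items, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

If the real export uses a different delimiter, decimal separator, or local timestamps, the parsing lines need to change accordingly.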

The converted Electricity Consumption JSON file looks like this:

[
  {
    "date": "2023-12-31T22:00:00Z",
    "value": 0.86
  },
  {
    "date": "2023-12-31T23:00:00Z",
    "value": 0.86
  },
  {
    "date": "2024-01-01T00:00:00Z",
    "value": 0.9
  },
  {
    "date": "2024-01-01T01:00:00Z",
    "value": 0.91
  }
]

Electricity price data set

There are many APIs available for fetching hourly electricity price data programmatically. In this sample, I fetched the data via an API and persisted it in a static JSON file that contains hourly prices from this year.

The Electricity Price JSON file looks like this:

[
  {
    "date": "2023-12-31T22:00:00.000Z",
    "value": 0.0496
  },
  {
    "date": "2023-12-31T23:00:00.000Z",
    "value": 0.0476
  },
  {
    "date": "2024-01-01T00:00:00.000Z",
    "value": 0.0353
  },
  {
    "date": "2024-01-01T01:00:00.000Z",
    "value": 0.0331
  }
]

Electricity Data Item entity

ElectricityDataItem is a general entity representing a single electricity consumption or price value.

 public class ElectricityDataItem
 {
     public DateTime date { get; set; }
     public double value { get; set; }
 }

Electricity Data Repository

This simple ElectricityDataRepository deserializes the JSON data into ElectricityDataItems and returns them to the Plugins.

using System.Text.Json;

public interface IElectricityDataRepository
{
    List<ElectricityDataItem> GetElectricityPrice();
    List<ElectricityDataItem> GetElectricityConsumption();
}

public class ElectricityDataRepository : IElectricityDataRepository
{
    public List<ElectricityDataItem> GetElectricityConsumption()
    {
        // Path.Combine keeps the path valid on both Windows and Linux.
        return GetData(Path.Combine("Data", "consumption.json"));
    }

    public List<ElectricityDataItem> GetElectricityPrice()
    {
        return GetData(Path.Combine("Data", "price.json"));
    }

    private static List<ElectricityDataItem> GetData(string relativePath)
    {
        // Resolve the file relative to the application's base directory.
        string fileName = Path.Combine(AppContext.BaseDirectory, relativePath);
        string jsonString = File.ReadAllText(fileName);
        return JsonSerializer.Deserialize<List<ElectricityDataItem>>(jsonString)!;
    }
}

Semantic Kernel's OpenAI configuration

This configuration registers Semantic Kernel in the dependency injection container and adds the Electricity Price and Consumption plugins to it.

It's also a good idea to configure a timeout and retry logic, because OpenAI responses can sometimes take a while.

var azureOpenApiEndpoint = builder.Configuration["AzureOpenApiEndpointUri"];
var azureOpenApiKey = builder.Configuration["AzureOpenApiKey"];

if (!string.IsNullOrEmpty(azureOpenApiEndpoint) && !string.IsNullOrEmpty(azureOpenApiKey))
{
    var clientOptions = new OpenAIClientOptions();
    clientOptions.Retry.MaxRetries = 2;
    clientOptions.Retry.NetworkTimeout = TimeSpan.FromMinutes(3);
    var openAIClient = new OpenAIClient(new Uri(azureOpenApiEndpoint), new AzureKeyCredential(azureOpenApiKey), clientOptions);

    builder.Services
        .AddKernel()
        .AddAzureOpenAIChatCompletion("gpt-4", openAIClient)
        .Plugins
            .AddFromType<ElectricityPricePlugin>()
            .AddFromType<ElectricityConsumptionPlugin>();
}

ElectricityDataRepository is also registered as a singleton service in the dependency injection container.

builder.Services.AddSingleton<IElectricityDataRepository, ElectricityDataRepository>();

Electricity Price Plugin

The Electricity Price Plugin (Native Code) is responsible for providing the price data for the language model. In this case, data is fetched from the static JSON files as shown above.

The plugin has two functions:

  1. GetPricesAsync returns everything from the JSON file.
  2. GetPriceByTimestampAsync returns the price for a specific hour. This function is used when the user's prompt contains a specific timestamp.

👉 IMPORTANT. You need to use Description Attributes to provide information about the purpose of the Plugin so that the language model knows when to use this particular function. The description should be accurate and precise. You should add a description for the Plugin itself, return value, and parameters.

using System.ComponentModel;
using Microsoft.SemanticKernel;

public class ElectricityPricePlugin
{
    private readonly IElectricityDataRepository _electricityDataRepository;

    public ElectricityPricePlugin(IElectricityDataRepository electricityDataRepository)
    {
        _electricityDataRepository = electricityDataRepository;
    }

    [KernelFunction, Description("Gets history of hourly electricity prices per kilowatt-hour (kWh). Each electricity price item contains date and value fields. The value field presents the price in euros (€).")]
    [return: Description("An array of hourly electricity price items")]
    public Task<List<ElectricityDataItem>> GetPricesAsync()
    {
        // The data is read synchronously, so wrap the result instead of using async.
        return Task.FromResult(_electricityDataRepository.GetElectricityPrice());
    }

    [KernelFunction, Description("Gets the hourly price at a specific time. The electricity price item contains date and value fields. The value field presents the price in euros (€).")]
    [return: Description("Returns the hourly price at a specific time")]
    public Task<ElectricityDataItem?> GetPriceByTimestampAsync(
        [Description("Timestamp")] DateTime timestamp)
    {
        // The JSON timestamps are in UTC, so compare in UTC.
        var utcTimestamp = timestamp.ToUniversalTime();
        var price = _electricityDataRepository.GetElectricityPrice()
            ?.FirstOrDefault(x => x.date == utcTimestamp);

        // The result is null when no item matches the timestamp.
        return Task.FromResult(price);
    }
}

Electricity Consumption Plugin

The Electricity Consumption Plugin (Native Code) follows the same structure as the Price plugin above.

public class ElectricityConsumptionPlugin
{
    private readonly IElectricityDataRepository _electricityDataRepository;

    public ElectricityConsumptionPlugin(IElectricityDataRepository electricityDataRepository)
    {
        _electricityDataRepository = electricityDataRepository;
    }

    [KernelFunction, Description("Gets history of hourly electricity consumption items. Each electricity consumption item contains date and value fields. The value field presents the data in kilowatt-hours (kWh).")]
    [return: Description("An array of hourly electricity consumption items")]
    public Task<List<ElectricityDataItem>> GetElectricityConsumptionAsync()
    {
        // The data is read synchronously, so wrap the result instead of using async.
        return Task.FromResult(_electricityDataRepository.GetElectricityConsumption());
    }

    [KernelFunction, Description("Gets the hourly electricity consumption at a specific time. The electricity consumption item contains date and value fields. The value field presents the data in kilowatt-hours (kWh).")]
    [return: Description("Returns the hourly electricity consumption at a specific time")]
    public Task<ElectricityDataItem?> GetConsumptionByTimestampAsync(
        [Description("Timestamp")] DateTime timestamp)
    {
        // The JSON timestamps are in UTC, so compare in UTC.
        var utcTimestamp = timestamp.ToUniversalTime();
        var consumption = _electricityDataRepository.GetElectricityConsumption()
            ?.FirstOrDefault(x => x.date == utcTimestamp);

        // The result is null when no item matches the timestamp.
        return Task.FromResult(consumption);
    }
}
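Both plugins match on timestamp.ToUniversalTime(). For a DateTime whose Kind is Unspecified or Local, that call converts using the time zone of the machine running the app, so the same question can hit a different hour depending on where the app is hosted. The sketch below shows one way to make the conversion explicit. ToUtc is a hypothetical helper that is not part of the original plugins, and the fixed UTC+2 offset is an assumption matching Finnish winter time.

```csharp
using System;

// Hypothetical helper (not part of the original plugins): interprets a timestamp as
// wall-clock time at a known UTC offset, unless it is already marked as UTC.
static DateTime ToUtc(DateTime timestamp, TimeSpan localOffset) =>
    timestamp.Kind == DateTimeKind.Utc
        ? timestamp
        : new DateTimeOffset(DateTime.SpecifyKind(timestamp, DateTimeKind.Unspecified), localOffset).UtcDateTime;

// Assumption: "1.1.2024 at 00:00" is meant as Finnish winter time (UTC+2).
var asked = new DateTime(2024, 1, 1, 0, 0, 0);          // Kind is Unspecified
var lookupKey = ToUtc(asked, TimeSpan.FromHours(2));

Console.WriteLine(lookupKey.ToString("o")); // 2023-12-31T22:00:00.0000000Z
```

The resulting key matches the first item in the sample JSON files. A fixed offset ignores daylight saving time (Finland uses UTC+3 in summer), so a fuller version would likely resolve the offset through TimeZoneInfo instead.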

Function triggering

Functions (Plugins) can be triggered manually or automatically. In this sample, I use the AutoInvokeKernelFunctions option, which lets Semantic Kernel trigger the Function (Plugin) automatically based on the user's prompt.

 OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()
 {
     ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
     Temperature = 0
 };

var aiData = _chatCompletionService.GetStreamingChatMessageContentsAsync(history, openAIPromptExecutionSettings, _kernel);

Test round

Let's try out how this works with sample questions.

Q1: What was my energy consumption on 15.2.2024 at 10:00?

✔️ The answer is correct and verified from the data.

Q2: What was my energy cost on 1.1.2024 from 00:00 to 23:59?

✔️ Hour-specific cost calculation is correct.
❌ Total sum calculation is not correct.

Q3: What was my total energy consumption in January 2024?

❌ Total sum calculation is not correct.

Summary

Plugins are a great way to extend the capabilities of your chatbot to understand your data. The use of Native Code Plugins enables basically limitless possibilities to connect chatbots to different kinds of services and data storage.

As you can see, there are some challenges and limitations with mathematical operations. Single-value lookups and hour-specific cost calculations were correct, but totals summed over many hours were not. I recommend reading Paul Veitch's article Why GPT-4 Struggles with Maths: Exploring the Limitations of AI Language Models. In a later post, I may revisit this with either the Core Math Plugin or a custom Native Math Plugin configured.
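One possible direction is to do the summing in code and let the model only report the result. The sketch below is not the implementation used in this post: SumConsumption and SumCost are hypothetical helpers, the rows are the four sample items from the JSON excerpts above, and it assumes the price values are euros per kWh. In a real Native Code Plugin these would be methods marked with [KernelFunction].

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Sample rows copied from the JSON excerpts earlier in this post (UTC timestamps).
var consumption = new List<(DateTime Date, double Value)>
{
    (new DateTime(2023, 12, 31, 22, 0, 0, DateTimeKind.Utc), 0.86),
    (new DateTime(2023, 12, 31, 23, 0, 0, DateTimeKind.Utc), 0.86),
    (new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc), 0.9),
    (new DateTime(2024, 1, 1, 1, 0, 0, DateTimeKind.Utc), 0.91),
};
var prices = new List<(DateTime Date, double Value)>
{
    (new DateTime(2023, 12, 31, 22, 0, 0, DateTimeKind.Utc), 0.0496),
    (new DateTime(2023, 12, 31, 23, 0, 0, DateTimeKind.Utc), 0.0476),
    (new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc), 0.0353),
    (new DateTime(2024, 1, 1, 1, 0, 0, DateTimeKind.Utc), 0.0331),
};

var fromUtc = new DateTime(2023, 12, 31, 22, 0, 0, DateTimeKind.Utc);
var toUtc = new DateTime(2024, 1, 1, 2, 0, 0, DateTimeKind.Utc);

double totalKwh = SumConsumption(consumption, fromUtc, toUtc);
double totalEur = SumCost(consumption, prices, fromUtc, toUtc);

Console.WriteLine(totalKwh.ToString("F2", CultureInfo.InvariantCulture)); // 3.53
Console.WriteLine(totalEur.ToString("F4", CultureInfo.InvariantCulture)); // 0.1455

// Hypothetical helper: sums kWh for timestamps in the half-open range [start, end).
static double SumConsumption(
    IEnumerable<(DateTime Date, double Value)> usage, DateTime start, DateTime end) =>
    usage.Where(x => x.Date >= start && x.Date < end).Sum(x => x.Value);

// Hypothetical helper: joins consumption and price by hour and sums kWh * price.
// Assumption: the price value is euros per kWh.
static double SumCost(
    IEnumerable<(DateTime Date, double Value)> usage,
    IEnumerable<(DateTime Date, double Value)> tariff,
    DateTime start, DateTime end) =>
    usage.Where(x => x.Date >= start && x.Date < end)
         .Join(tariff, u => u.Date, t => t.Date, (u, t) => u.Value * t.Value)
         .Sum();
```

With functions like these exposed to the model, a question about a day or a month becomes a single function call with a date range, and the arithmetic no longer depends on the model adding up hundreds of values itself.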

This was a very interesting journey to investigate the capabilities of Semantic Kernel Plugins.
