Building Cloud AI with Copilot – ResNet50 Azure HTTP Function

Introduction

This is another awfully long post about my experience using Copilot to write an Azure HTTP Trigger function that runs a ResNet50 v2.7 Open Neural Network Exchange (ONNX) model on an image in the body of an HTTP POST.

For testing I was uploading the images with Telerik Fiddler Classic.
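
For readers without Fiddler, a minimal HttpClient test harness that POSTs the raw image bytes works just as well. A sketch; the local emulator URL and the default api/ImageClassification route are assumptions:

// Minimal test client sketch: POST the raw image bytes as the request body.
// The URL assumes the local Azure Functions emulator's default port; function
// keys are typically not enforced when running locally.
using var client = new HttpClient();
using var content = new ByteArrayContent(File.ReadAllBytes("pizza.jpg"));
content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("image/jpeg");

var response = await client.PostAsync("http://localhost:7071/api/ImageClassification", content);
Console.WriteLine(await response.Content.ReadAsStringAsync());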

I forgot to specify a language, so Copilot assumed (reasonably) that I wanted a Python Azure HTTP Trigger function.

The initial C# version wouldn't compile because of the FunctionName attribute, which is used for in-process Azure Functions. It did seem a bit odd that Copilot would generate code for a programming model whose support ends on November 10, 2026.

public static class Function1
{
   private static readonly InferenceSession session = new InferenceSession("resnet50.onnx");

   [FunctionName("ImageClassification")]
   public static IActionResult Run(
       [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
       ILogger log)
   {
      log.LogInformation("Processing image classification request...");

      try
      {
         using var ms = new MemoryStream();
         req.Body.CopyTo(ms);
         using var image = Image.FromStream(ms);

         var inputTensor = PreprocessImage(image);

         var inputName = session.InputMetadata.Keys.First();
         var outputName = session.OutputMetadata.Keys.First();
         var result = session.Run(new Dictionary<string, NamedOnnxValue>
            {
                { inputName, NamedOnnxValue.CreateFromTensor(inputName, inputTensor) }
            });

         var predictions = result.First().AsTensor<float>().ToArray();

         return new JsonResult(new { predictions });
      }
      catch (Exception ex)
      {
         log.LogError($"Error: {ex.Message}");
         return new BadRequestObjectResult("Invalid image or request.");
      }
   }
...
}

It was just easier to change the FunctionName attribute manually.

public static class Function1
{
   private static readonly InferenceSession session = new InferenceSession("resnet50.onnx");

   [Function("ImageClassification")]
   public static IActionResult Run(
       [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
       ILogger log)
   {
      log.LogInformation("Processing image classification request...");

      try
      {
         using var ms = new MemoryStream();
         req.Body.CopyTo(ms);
         using var image = Image.FromStream(ms);

         var inputTensor = PreprocessImage(image);

         var inputName = session.InputMetadata.Keys.First();
         var outputName = session.OutputMetadata.Keys.First();
         var inputList = new List<NamedOnnxValue>
            {
                NamedOnnxValue.CreateFromTensor(inputName, inputTensor)
            };

         var result = session.Run(inputList);

         var predictions = result.First().AsTensor<float>().ToArray();

         return new JsonResult(new { predictions });
      }
      catch (Exception ex)
      {
         log.LogError($"Error: {ex.Message}");
         return new BadRequestObjectResult("Invalid image or request.");
      }
   }

The Azure HTTP Trigger function ran but failed when I tried to classify an image

The initialisation of the ILogger injected into the Run method was broken, so I used Copilot to update the code to use constructor Dependency Injection (DI).

public static class Function1
{
   private static readonly ILogger logger;
   private static readonly InferenceSession session = new InferenceSession("resnet50-v2-7.onnx");

   // Static constructor to initialize logger
   static Function1()
   {
      var loggerFactory = LoggerFactory.Create(builder =>
      {
         builder.AddConsole();
      });
      logger = loggerFactory.CreateLogger("Function1Logger");
   }

   [Function("ImageClassification")]
   public static IActionResult Run([HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req)
   {
      logger.LogInformation("Processing image classification request...");

      try
      {
         using var ms = new MemoryStream();
         req.Body.CopyTo(ms);
         using var image = Image.FromStream(ms);

         var inputTensor = PreprocessImage(image);

         var inputName = session.InputMetadata.Keys.First();
         var outputName = session.OutputMetadata.Keys.First();
         var inputList = new List<NamedOnnxValue>
            {
                NamedOnnxValue.CreateFromTensor(inputName, inputTensor)
            };

         var result = session.Run(inputList);

         var predictions = result.First().AsTensor<float>().ToArray();

         return new JsonResult(new { predictions });
      }
      catch (Exception ex)
      {
         logger.LogError($"Error: {ex.Message}");
         return new BadRequestObjectResult("Invalid image or request.");
      }
   }
...
}

It was a bit odd that Copilot generated a static function and constructor, unlike the equivalent YoloSharp Azure HTTP Trigger.
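
For comparison, a sketch of the constructor injection pattern that the isolated worker model supports, based on the Visual Studio quick start shown later in this post (the class and Run method must be non-static):

public class Function1
{
   private readonly ILogger<Function1> _logger;
   private static readonly InferenceSession session = new InferenceSession("resnet50-v2-7.onnx");

   // The isolated worker's dependency injection supplies the ILogger<T>.
   public Function1(ILogger<Function1> logger)
   {
      _logger = logger;
   }
...
}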

The Azure HTTP Trigger function ran but failed with a 400 Bad Request when I tried to classify an image.

After some debugging I realised that Telerik Fiddler Classic was sending the image as form data, so I modified the “composer” payload configuration.

Then the Azure HTTP Trigger function ran, but the confidence values were incorrect, so I checked the ResNet50 pre-processing instructions

The image needs to be preprocessed before fed to the network. The first step is to extract a 224x224 crop from the center of the image. For this, the image is first scaled to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side of the image is resized to 256 and the other side is scaled accordingly to maintain the original aspect ratio. After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std = 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.
 private static Tensor<float> PreprocessImage(Image image)
 {
    var resized = new Bitmap(image, new Size(224, 224));
    var tensorData = new float[1 * 3 * 224 * 224];

    float[] mean = { 0.485f, 0.456f, 0.406f };
    float[] std = { 0.229f, 0.224f, 0.225f };

    for (int y = 0; y < 224; y++)
    {
       for (int x = 0; x < 224; x++)
       {
          var pixel = resized.GetPixel(x, y);

          tensorData[(0 * 3 * 224 * 224) + (0 * 224 * 224) + (y * 224) + x] = (pixel.R / 255.0f - mean[0]) / std[0];
          tensorData[(0 * 3 * 224 * 224) + (1 * 224 * 224) + (y * 224) + x] = (pixel.G / 255.0f - mean[1]) / std[1];
          tensorData[(0 * 3 * 224 * 224) + (2 * 224 * 224) + (y * 224) + x] = (pixel.B / 255.0f - mean[2]) / std[2];
       }
    }

    return new DenseTensor<float>(tensorData, new[] { 1, 3, 224, 224 });
 }
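
Note that the generated code resizes straight to 224x224 rather than scaling the shortest side to 256 and taking a 224x224 centre crop as the quoted instructions describe. A minimal sketch of the described approach, using SixLabors ImageSharp (which the code is moved to later in this post):

// Sketch: scale so the shortest side is 256 (keeping the aspect ratio), then
// take a 224x224 crop from the centre, per the preprocessing instructions.
// Requires the SixLabors.ImageSharp, .Processing and .PixelFormats namespaces.
static Image<Rgb24> ScaleAndCentreCrop(Image<Rgb24> image)
{
   image.Mutate(x => x.Resize(new ResizeOptions
   {
      Size = new Size(256, 256),
      Mode = ResizeMode.Min // shortest side becomes 256; does not upscale smaller images
   }));

   int cropX = (image.Width - 224) / 2;
   int cropY = (image.Height - 224) / 2;
   image.Mutate(x => x.Crop(new Rectangle(cropX, cropY, 224, 224)));

   return image;
}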

When the “normalisation” code was implemented and the Azure HTTP Trigger function was run, the confidence values were still incorrect.

The Azure HTTP Trigger function was now running reliably, but the confidence values were still incorrect and the response payload contained far more results than necessary, so I again checked the ResNet50 post-processing instructions

Postprocessing
The post-processing involves calculating the softmax probability scores for each class. You can also sort them to report the most probable classes. Check imagenet_postprocess.py for code.
 // Compute exponentials for all scores
 var expScores = predictions.Select(MathF.Exp).ToArray();

 // Compute sum of exponentials
 float sumExpScores = expScores.Sum();

 // Normalize scores into probabilities
 var softmaxResults = expScores.Select(score => score / sumExpScores).ToArray();

 // Get top 10 predictions (label ID and confidence)
 var top10 = softmaxResults
     .Select((confidence, labelId) => new { labelId, confidence, label = labelId < labels.Count ? labels[labelId] : $"Unknown-{labelId}" })
     .OrderByDescending(p => p.confidence)
     .Take(10)
     .ToList();

The Azure HTTP Trigger function should run on multiple platforms, so System.Drawing.Common had to be replaced with SixLabors ImageSharp.

The Azure HTTP Trigger function ran but the SixLabors ImageSharp based image classification failed.

After some debugging I realised that the MemoryStream used to copy the HttpRequest body was not being reset.

[Function("ImageClassification")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req)
{
   logger.LogInformation("Processing image classification request...");

   try
   {
      using var ms = new MemoryStream();
      await req.Body.CopyToAsync(ms);

      ms.Seek(0, SeekOrigin.Begin);

      using var image = Image.Load<Rgb24>(ms);

      var inputTensor = PreprocessImage(image);
...   
   }
   catch (Exception ex)
   {
      logger.LogError($"Error: {ex.Message}");
      return new BadRequestObjectResult("Invalid image or request.");
   }
}

The odd thing was that the confidence values changed slightly when the code was modified to use SixLabors ImageSharp, most likely because the two libraries use different default resampling algorithms when resizing.

The Azure HTTP Trigger function worked but the labelId wasn't very “human readable”.

public static class Function1
{
   private static readonly ILogger logger;
   private static readonly InferenceSession session = new InferenceSession("resnet50-v2-7.onnx");
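   // Note: static field initialisers run before the static constructor body,
   // so LoadLabels executes while logger is still null; if labels.txt fails
   // to load, the catch block's logger.LogError throws a NullReferenceException.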
   private static readonly List<string> labels = LoadLabels("labels.txt");
...
   [Function("ImageClassification")]
   public static async Task<IActionResult> Run(
       [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req)
   {
      logger.LogInformation("Processing image classification request...");

      try
      {
...
         // Get top 10 predictions (label ID and confidence)
         var top10 = softmaxResults
             .Select((confidence, labelId) => new { labelId, confidence, label = labelId < labels.Count ? labels[labelId] : $"Unknown-{labelId}" })
             .OrderByDescending(p => p.confidence)
             .Take(10)
             .ToList();

         return new JsonResult(new { predictions = top10 });
      }
      catch (Exception ex)
      {
         logger.LogError($"Error: {ex.Message}");
         return new BadRequestObjectResult("Invalid image or request.");
      }
   }
...
   private static List<string> LoadLabels(string filePath)
   {
      try
      {
         return File.ReadAllLines(filePath).ToList();
      }
      catch (Exception ex)
      {
         logger.LogError($"Error loading labels file: {ex.Message}");
         return new List<string>(); // Return empty list if file fails to load
      }
   }
}
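
The code assumes labels.txt contains one ImageNet class name per line in class-index order, so the zero-based line number is the labelId. For example, the first three lines would be:

tench
goldfish
great white shark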

Summary

The GitHub Copilot-generated code was okay but would be fragile and not scale terribly well. The confidence values changing very slightly when the code was updated for SixLabors ImageSharp was disconcerting, but not surprising.

The Copilot-generated code in this post is not suitable for production.

Building Edge AI with Copilot – ResNet50 Client

Introduction

This is an awfully long post about my experience using Copilot to write a console application that runs a validated ResNet50 v2.7 Open Neural Network Exchange (ONNX) model on an image loaded from disk.

I have found that Copilot's code generation is often “better”, but the user interface can be limiting.

The Copilot-generated code compiled after the System.Drawing.Common and Microsoft.ML.OnnxRuntime NuGet packages were added to the project.

Input
All pre-trained models expect input images normalized in the same way, i.e. mini-batches 
of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size, and H and 
W are expected to be at least 224. The inference was done using jpeg image.

Preprocessing
The image needs to be preprocessed before fed to the network. The first step is to 
extract a 224x224 crop from the center of the image. For this, the image is first scaled 
to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side 
of the image is resized to 256 and the other side is scaled accordingly to maintain 
the original aspect ratio. 

After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std 
= 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.

The code also had a reasonable implementation of the ResNet50 preprocessing instructions.

static void Main()
{
   string modelPath = "resnet50-v2-7.onnx"; // Path to your ONNX model
   string imagePath = "pizza.jpg"; // Path to the input image

   using var session = new InferenceSession(modelPath);
   var inputTensor = LoadAndPreprocessImage(imagePath);

   var inputs = new List<NamedOnnxValue>
   {
       NamedOnnxValue.CreateFromTensor("input", inputTensor)
   };

   using var results = session.Run(inputs);
   var output = results.First().AsTensor<float>().ToArray();

   Console.WriteLine("Predicted class index: " + Array.IndexOf(output, output.Max()));
}

static DenseTensor<float> LoadAndPreprocessImage(string imagePath)
{
   using Bitmap bitmap = new Bitmap(imagePath);
   int width = 224, height = 224; // ResNet50 expects 224x224 input
   using Bitmap resized = new Bitmap(bitmap, new Size(width, height));

   var tensor = new DenseTensor<float>(new[] { 1, 3, width, height });
   for (int y = 0; y < height; y++)
   {
      for (int x = 0; x < width; x++)
      {
         Color pixel = resized.GetPixel(x, y);
         tensor[0, 0, y, x] = pixel.R / 255f; // Normalize
         tensor[0, 1, y, x] = pixel.G / 255f;
         tensor[0, 2, y, x] = pixel.B / 255f;
      }
   }
   return tensor;
}

The program ran but failed with a Microsoft.ML.OnnxRuntime.OnnxRuntimeException: Message=[ErrorCode:InvalidArgument] Input name: ‘input’ is not in the metadata.

The name of the input tensor was wrong, so I used Netron to inspect the graph properties of the model.

After the input tensor name was updated, the program ran.
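
An alternative to opening the model in Netron is to read the tensor names from the session metadata at run time. A minimal sketch:

// The ONNX Runtime exposes the graph's input (and output) names, so they
// don't have to be hardcoded or looked up in Netron.
using var session = new InferenceSession("resnet50-v2-7.onnx");

foreach (var name in session.InputMetadata.Keys)
{
   Console.WriteLine($"Input tensor: {name}");
}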

I checked the labels using the torchvision ImageNet categories and the results looked reasonable.

The model and input file paths were wrong and I had been manually fixing them.

The confidence values didn’t look right so I re-read the preprocessing requirements for a ResNet model

Input
All pre-trained models expect input images normalized in the same way, i.e. mini-batches 
of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size, and H and 
W are expected to be at least 224. The inference was done using jpeg image.

Preprocessing
The image needs to be preprocessed before fed to the network. The first step is to 
extract a 224x224 crop from the center of the image. For this, the image is first scaled 
to a minimum size of 256x256, while keeping aspect ratio. That is, the shortest side 
of the image is resized to 256 and the other side is scaled accordingly to maintain 
the original aspect ratio. 

After that, the image is normalized with mean = 255*[0.485, 0.456, 0.406] and std 
= 255*[0.229, 0.224, 0.225]. Last step is to transpose it from HWC to CHW layout.

The Copilot-generated code compiled and ran, but the confidence values still didn't look right, and the results tensor contained 1000 confidence values.

static void Main()
{
   string modelPath = "resnet50-v2-7.onnx"; // Updated model path
   string imagePath = "pizza.jpg"; // Updated image path

   using var session = new InferenceSession(modelPath);
   var inputTensor = LoadAndPreprocessImage(imagePath);

   var inputs = new List<NamedOnnxValue>
   {
      NamedOnnxValue.CreateFromTensor("data", inputTensor) // Using "data" as the input tensor name
   };

   using var results = session.Run(inputs);
   var output = results.First().AsTensor<float>().ToArray();

   Console.WriteLine("Predicted class index: " + Array.IndexOf(output, output.Max()));
}

static DenseTensor<float> LoadAndPreprocessImage(string imagePath)
{
   using Bitmap bitmap = new Bitmap(imagePath);
   int width = 224, height = 224; // ResNet50 expects 224x224 input
   using Bitmap resized = new Bitmap(bitmap, new Size(width, height));

   var tensor = new DenseTensor<float>(new[] { 1, 3, width, height });

   // ImageNet mean & standard deviation values
   float[] mean = { 0.485f, 0.456f, 0.406f };
   float[] stdev = { 0.229f, 0.224f, 0.225f };

   for (int y = 0; y < height; y++)
   {
      for (int x = 0; x < width; x++)
      {
         Color pixel = resized.GetPixel(x, y);

         // Normalize using mean and standard deviation
         tensor[0, 0, y, x] = (pixel.R / 255f - mean[0]) / stdev[0]; // Red channel
         tensor[0, 1, y, x] = (pixel.G / 255f - mean[1]) / stdev[1]; // Green channel
         tensor[0, 2, y, x] = (pixel.B / 255f - mean[2]) / stdev[2]; // Blue channel
      }
   }
   return tensor;
}

Because the results didn't look right, I went back and read the ResNet50 post-processing instructions.

Output
The model outputs image scores for each of the 1000 classes of ImageNet.

Postprocessing
The post-processing involves calculating the softmax probability scores for each 
class. You can also sort them to report the most probable classes. Check 
imagenet_postprocess.py for code.

The Copilot-generated code wouldn't compile due to a syntax error.

static void Main()
{
   string modelPath = "resnet50-v2-7.onnx"; // Updated model path
   string imagePath = "pizza.jpg"; // Updated image path

   using var session = new InferenceSession(modelPath);
   var inputTensor = LoadAndPreprocessImage(imagePath);

   var inputs = new List<NamedOnnxValue>
   {
      NamedOnnxValue.CreateFromTensor("data", inputTensor) // Using "data" as the input tensor name
   };

   using var results = session.Run(inputs);
   var output = results.First().AsTensor<float>().ToArray();

   // Calculate softmax
   var probabilities = Softmax(output);

   // Get the class index with the highest probability
   int predictedClass = Array.IndexOf(probabilities, probabilities.Max());
   Console.WriteLine($"Predicted class index: {predictedClass}");
   Console.WriteLine($"Probabilities: {string.Join(", ", probabilities.Select(p => p.ToString("F4")))}");
}
...
static float[] Softmax(float[] logits)
{
   // Compute softmax
   var expScores = logits.Select(Math.Exp).ToArray();
   double sumExpScores = expScores.Sum();
   return expScores.Select(score => (float)(score / sumExpScores)).ToArray();
}

Copilot was adamant that the generated code was correct.

After trying different Copilot prompts, the code had to be manually fixed before it would compile.

The Copilot-generated code ran, and the results for the top 10 confidence values looked reasonable.

static void Main()
{
   string modelPath = "resnet50-v2-7.onnx"; // Updated model path
   string imagePath = "pizza.jpg"; // Updated image path
   string labelsPath = "labels.txt"; // Path to labels file

   using var session = new InferenceSession(modelPath);
   var inputTensor = LoadAndPreprocessImage(imagePath);

   var inputs = new List<NamedOnnxValue>
   {
       NamedOnnxValue.CreateFromTensor("data", inputTensor) // Using "data" as the input tensor name
   };

   using var results = session.Run(inputs);
   var output = results.First().AsTensor<float>().ToArray();

   // Calculate softmax
   var probabilities = Softmax(output);

   // Load labels
   var labels = File.ReadAllLines(labelsPath);

   // Find Top 10 labels and their confidence scores
   var top10 = probabilities
          .Select((prob, index) => new { Label = labels[index], Confidence = prob })
          .OrderByDescending(item => item.Confidence)
          .Take(10);

   Console.WriteLine("Top 10 Predictions:");
   foreach (var item in top10)
   {
      Console.WriteLine($"{item.Label}: {item.Confidence:F4}");
   }
}
...
static float[] Softmax(float[] logits)
{
   // Compute softmax
   float maxVal = logits.Max();
   var expScores = logits.Select(v => (float)Math.Exp(v - maxVal)).ToArray();
   double sumExpScores = expScores.Sum();
   return expScores.Select(score => (float)(score / sumExpScores)).ToArray();
}
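
Subtracting the largest logit before exponentiation doesn't change the probabilities, since exp(z - m) / Σ exp(z - m) = exp(z) / Σ exp(z) for any constant m, but it keeps the exponents non-positive and so avoids float overflow for large logits.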

The code will have to run on non-Windows devices, so System.Drawing.Common had to be replaced with SixLabors ImageSharp, a multi-platform graphics library.

The SixLabors ImageSharp update compiled and ran first time.

using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;

namespace ResnetV5ObjectClassificationApplication
{
   class Program
   {
      static void Main()
      {
         string modelPath = "resnet50-v2-7.onnx"; // Updated model path
         string imagePath = "pizza.jpg"; // Updated image path
         string labelsPath = "labels.txt"; // Path to labels file

         using var session = new InferenceSession(modelPath);
         var inputTensor = LoadAndPreprocessImage(imagePath);

         var inputs = new List<NamedOnnxValue>
         {
            NamedOnnxValue.CreateFromTensor("data", inputTensor) // Using "data" as the input tensor name
         };

         using var results = session.Run(inputs);
         var output = results.First().AsTensor<float>().ToArray();

         // Calculate softmax
         var probabilities = Softmax(output);

         // Load labels
         var labels = File.ReadAllLines(labelsPath);

         // Find Top 10 labels and their confidence scores
         var top10 = probabilities
             .Select((prob, index) => new { Label = labels[index], Confidence = prob })
             .OrderByDescending(item => item.Confidence)
             .Take(10);

         Console.WriteLine("Top 10 Predictions:");
         foreach (var item in top10)
         {
            Console.WriteLine($"{item.Label}: {item.Confidence}");
         }

         Console.WriteLine("Press ENTER to exit");
         Console.ReadLine();
      }

      static DenseTensor<float> LoadAndPreprocessImage(string imagePath)
      {
         int width = 224, height = 224; // ResNet50 expects 224x224 input

         using var image = Image.Load<Rgb24>(imagePath);
         image.Mutate(x => x.Resize(width, height));

         var tensor = new DenseTensor<float>(new[] { 1, 3, width, height });

         // ImageNet mean & standard deviation values
         float[] mean = { 0.485f, 0.456f, 0.406f };
         float[] stdev = { 0.229f, 0.224f, 0.225f };

         for (int y = 0; y < height; y++)
         {
            for (int x = 0; x < width; x++)
            {
               var pixel = image[x, y];

               // Normalize using mean and standard deviation
               tensor[0, 0, y, x] = (pixel.R / 255f - mean[0]) / stdev[0]; // Red channel
               tensor[0, 1, y, x] = (pixel.G / 255f - mean[1]) / stdev[1]; // Green channel
               tensor[0, 2, y, x] = (pixel.B / 255f - mean[2]) / stdev[2]; // Blue channel
            }
         }

         return tensor;
      }

      static float[] Softmax(float[] logits)
      {
         // Compute softmax  
         float maxVal = logits.Max();
         var expScores = logits.Select(logit => Math.Exp(logit - maxVal)).ToArray(); // Math.Exp promotes the float argument to double
         double sumExpScores = expScores.Sum();
         return expScores.Select(score => (float)(score / sumExpScores)).ToArray();
      }
   }
}

Summary

The Copilot-generated code in this post was “inspired” by the Image recognition with ResNet50v2 in C# sample application.

The Copilot-generated code in this post is not suitable for production.

Building Cloud AI with GitHub Copilot – YoloSharp Azure HTTP Functions

Introduction

For this post I have used GitHub Copilot prompts to generate Azure HTTP Trigger functions which use Ultralytics YOLOv8 and Compunet YoloSharp for object classification, object detection, and pose estimation.

I started with the Visual Studio 2022 Azure Functions quick start code, which ran first time.

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace YoloSharpxxxxxHttpTriggerFunction
{
    public class Function1
    {
        private readonly ILogger<Function1> _logger;

        public Function1(ILogger<Function1> logger)
        {
            _logger = logger;
        }

        [Function("Function1")]
        public IActionResult Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequest req)
        {
            _logger.LogInformation("C# HTTP trigger function processed a request.");
            return new OkObjectResult("Welcome to Azure Functions!");
        }
    }
}

The code generated by GitHub Copilot for the three functions changed the Function attribute to FunctionName and didn't initialise the ILogger correctly.

[FunctionName("DetectObjects")]
public static async Task<IActionResult> Run(
     [HttpTrigger(AuthorizationLevel.Function, "post", Route = "detect")] HttpRequest req,
     ILogger log)
{
   log.LogInformation("DetectObjects function received a request for object detection.");
   ...
}

Every so often, when uploading more than one image at a time, there was a “System.IO.InvalidDataException: The stream exceeded the data limit 16384” error. In previous examples I had tried batch processing multiple images but had hit memory issues, so putting the image in the message body rather than multipart form data seems like a reasonable approach for future development.

Image in the body of the request

Object Classification

I started with an object classification function because in previous examples GitHub Copilot was better at unpacking the results.

please write an httpTrigger azure function that uses YoloSharp and a Yolo image classification onnx model
Image classification not object detection
The image is in the form data
The multipart/form-data check can be removed
The YoloPredictor should be released after use
Many image files could be uploaded in one request
Only one image file can be uploaded in one request

[Function("YoloImageClassificationFunction")]
public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
{
   _logger.LogInformation("C# HTTP trigger function processed a request.");

   var form = await req.ReadFormAsync();
   var files = form.Files;

   if (files.Count == 0)
   {
      return new BadRequestObjectResult("No images uploaded.");
   }

   if (files.Count > 1)
   {
      return new BadRequestObjectResult("Only one image file can be uploaded in one request.");
   }

   var file = files[0];
   if (file.Length > 0)
   {
      using (var memoryStream = new MemoryStream())
      {
         await file.CopyToAsync(memoryStream);
         var imageBytes = memoryStream.ToArray();

         using (var yoloModel = new YoloPredictor("yolov8s-cls.onnx"))
         {
            var classifications = yoloModel.Classify(imageBytes);

            return new OkObjectResult(new { file.FileName, classifications });
         }
      }
   }

   return new BadRequestObjectResult("Invalid image file.");
}

Image in the form data of the request

When I compared the results with previous applications, they were the same. I should have added code which discarded any result with less than a specified confidence to reduce the size of the results, as in the sketch below.
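
A minimal sketch of that filtering, reusing the classifications and file variables from the function above and assuming each YoloSharp classification result exposes a float Confidence property:

// Hypothetical post-filter: discard results below a confidence threshold
// to reduce the size of the response payload.
const float minimumConfidence = 0.5f;

var significant = classifications
   .Where(c => c.Confidence >= minimumConfidence)
   .ToList();

return new OkObjectResult(new { file.FileName, classifications = significant });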

Object Detection

please write an azure function with an http trigger that uses yolosharp and an onnx file to detect objects in an image
The image is not in the body it is in the form
Image.Load is not used
yolo.Detect can process an image file stream
The YoloPredictor should be released after use
Many image files could be uploaded in one request
Only one image per request
Add multipart/form-data content type check
Add check that there are headers — Didn’t get this right
Add check that req.Headers is not null
Add check that request has ContentType and it is multipart/form-data

[Function("ObjectDetectionFunction")]
public async Task<IActionResult> Run( [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
{
   _logger.LogInformation("C# HTTP trigger function processed a request.");

   // Intellisense fix
   if (req.Headers == null || !req.Headers.TryGetValue("Content-Type", out Microsoft.Extensions.Primitives.StringValues value) || !value.ToString().Contains("multipart/form-data"))
   {
      return new BadRequestObjectResult("The request must have a Content-Type header with multipart/form-data.");
   }

   // Read the images from the form data
   var form = await req.ReadFormAsync();
   var files = form.Files;
   if (files.Count == 0)
   {
      return new BadRequestObjectResult("No image files uploaded.");
   }

   if (files.Count > 1)
   {
      return new BadRequestObjectResult("Only one image file is allowed per request.");
   }

   var file = files[0];
   if (file.Length == 0)
   {
      return new BadRequestObjectResult("The uploaded image file is empty.");
   }

   // Load the YOLOv8 model
   using (var yolo = new YoloPredictor("yolov8s.onnx"))
   {
      // Perform object detection
      using (var stream = file.OpenReadStream())
      {
         var items = yolo.Detect(stream);
         var result = new { FileName = file.FileName, Detections = items };

         // Return the detection results
         return new OkObjectResult(result);
      }
   }
}

After a couple of manual fixes the function ran in the desktop emulator

When I compared the results with previous applications, they were the same.

Pose Estimation

I forgot the ILogger initialisation, so I had to do it at the end.

please write an azure function with an http trigger that uses yolosharp to estimate the pose of humans in an uploaded image.
Yolo v8 pose estimation model and yolosharp library
Make into azure function
The image files are in the form of the request
Modify the code so more than one image per request can be processed
Initialise ILogger in the constructor

//[FunctionName("PoseEstimation")]
[Function("PoseEstimation")]
public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
{
   _log.LogInformation("Pose estimation function processed a request.");

   if (!req.HasFormContentType || !req.Form.Files.Any())
   {
      return new BadRequestObjectResult("Please upload image files.");
   }

   var results = new List<object>();

   foreach (var file in req.Form.Files)
   {
      using var memoryStream = new MemoryStream();
      await file.CopyToAsync(memoryStream);
      memoryStream.Position = 0;

      using var image = Image.Load<Rgba32>(memoryStream);

      // Initialize the YOLO model
      //using var predictor = new YoloPredictor("path/to/model.onnx");
      using var predictor = new YoloPredictor("yolov8s-pose.onnx");

      // Perform pose estimation
      var result = await predictor.PoseAsync(image);

      // Format the results
      //var poses = result.Poses.Select(pose => new
      var poses = result.Select(pose => new
      {
         //Keypoints = pose.Keypoints.Select(k => new { k.X, k.Y }),
         Keypoints = pose.Select(k => new { k.Point.X, k.Point.Y }),
         Confidence = pose.Confidence
      });

      results.Add(new
      {
         Image = file.FileName,
         Poses = poses
      });
   }

   return new OkObjectResult(new { results });
}

After a couple of manual fixes, including changing the way the results were generated, the function ran in the desktop emulator.

Summary

The generated code worked but required manual fixes and was pretty ugly

The GitHub Copilot-generated code in this post is not suitable for production.

Building Edge AI with Copilot – Faster R-CNN Client

Introduction

To better understand how Copilot and GitHub Copilot-generated code was impacted by the wording and order of prompts, I replayed one of my previous sessions.

The generated code looked “reasonable” and vaguely familiar.

static void Main(string[] args)
{
    // Paths to the ONNX model and the image file (adjust these paths as needed)
    string modelPath = @"..\\..\\..\\..\\Models\\FasterRCNN-10.onnx";
    string imagePath = "sports.jpg";

    // Create the OnnxInference session
    using var session = new InferenceSession(modelPath);

    // Load the image from disk using ImageSharp
    using var image = Image.Load<Rgb24>(imagePath);

    // Resize the image to fit within the range and adjust dimensions to be divisible by 32
    ResizeImage(image);

    // Extract tensor data from the image (with shape [3, height, width])
    var inputTensor = ExtractTensorFromImage(image);

    // Create NamedOnnxValue input (ensure that the input name "image" matches your model's input)
    var inputs = new List<NamedOnnxValue>
    {
        NamedOnnxValue.CreateFromTensor("image", inputTensor)
    };

    // Run the model inference  
    using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);

    // Process and display the outputs (bounding boxes, labels, and confidences)
    ProcessOutput(results);

    Console.WriteLine("Press Enter to exit");
    Console.ReadLine();
}

/// <summary>
/// Resizes the input image such that both width and height are within the range [800, 1333]
/// and ensures the dimensions are divisible by 32.
/// </summary>
private static void ResizeImage(Image<Rgb24> image)
{
   const int minSize = 800;
   const int maxSize = 1333;
   int originalWidth = image.Width;
   int originalHeight = image.Height;

   // Determine the scaling factor so that the smallest side is at least minSize and the largest does not exceed maxSize.
   float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight),
                                   (float)minSize / Math.Min(originalWidth, originalHeight));

   // Compute the new dimensions based on the scale
   int newWidth = (int)(originalWidth * scale);
   int newHeight = (int)(originalHeight * scale);

   // Adjust dimensions to be divisible by 32
   newWidth = (newWidth / 32) * 32;
   newHeight = (newHeight / 32) * 32;

   image.Mutate(x => x.Resize(newWidth, newHeight));
}
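
As a worked example, a 1920x1080 image gives scale = min(1333/1920, 800/1080) ≈ 0.694, i.e. roughly 1333x749 before rounding and 1312x736 after rounding down to multiples of 32. Note that because Math.Min is used, the 1333 upper bound wins and the shorter side can end up below 800.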

/// <summary>
/// Converts the resized image into a DenseTensor<float> with shape [3, height, width].
/// The image is processed to subtract the Faster-RCNN channel means (B, G, R order).
/// </summary>
private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
   int width = image.Width;
   int height = image.Height;

   // Create a tensor with shape [channels, height, width]
   var tensor = new DenseTensor<float>(new[] { 3, height, width });

   // Faster-RCNN channel means (order: blue, green, red)
   float[] mean = { 102.9801f, 115.9465f, 122.7717f };

   // Process each pixel row; ImageSharp provides efficient pixel row access.
   image.ProcessPixelRows(accessor =>
   {
      for (int y = 0; y < height; y++)
      {
         var pixelRow = accessor.GetRowSpan(y);
         for (int x = 0; x < width; x++)
         {
             // Subtract the channel mean value (ensuring B, G, R order)
             tensor[0, y, x] = pixelRow[x].B - mean[0];
             tensor[1, y, x] = pixelRow[x].G - mean[1];
             tensor[2, y, x] = pixelRow[x].R - mean[2];
         }
      }
   });

   return tensor;
}

/// <summary>
/// Processes the model output, extracting bounding boxes, labels, and confidences.
/// Only detections with confidence scores above a defined threshold are printed.
/// </summary>
private static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> output)
{
    // Note: The output node names ("6379", "6381", "6383") are model-specific.
    // They correspond, respectively, to boxes, labels, and confidence scores.
    var boxesTensor = output.First(x => x.Name == "6379").AsTensor<float>();
    var labelsTensor = output.First(x => x.Name == "6381").AsTensor<long>();
    var confidencesTensor = output.First(x => x.Name == "6383").AsTensor<float>();

    float[] boxes = boxesTensor.ToArray();
    long[] labels = labelsTensor.ToArray();
    float[] confidences = confidencesTensor.ToArray();

    const float minConfidence = 0.7f;

    // Each bounding box is represented by 4 values: x1, y1, x2, y2.
    for (int i = 0; i < boxes.Length; i += 4)
    {
        int detectionIndex = i / 4;
        if (confidences[detectionIndex] >= minConfidence)
        {
           long label = labels[detectionIndex];
           float confidence = confidences[detectionIndex];
           float x1 = boxes[i];
           float y1 = boxes[i + 1];
           float x2 = boxes[i + 2];
           float y2 = boxes[i + 3];
           Console.WriteLine($"Label: {label}, Confidence: {confidence}, Bounding Box: [{x1}, {y1}, {x2}, {y2}]");
        }
    }
}

The Copilot-generated code had the names of the output tensors (6379, 6381, 6383), the mean calculation, and the order of the colours (B, G, R) correct. The name of the image file and the path to the model file in the Explanation and Additional information sections looked a lot like mine.

All I had to do was add the Microsoft.ML.OnnxRuntime and SixLabors.ImageSharp NuGet packages, and the code compiled and ran first time. I then checked the results, and they looked reasonable.

The similarities between the generated code for the different blog posts were suspicious, so I asked…

Summary

The Copilot-generated code in this post was “inspired” by the Copilot code generated for my Building Edge AI with GitHub Copilot – Faster R-CNN Client post, the Building Edge AI with GitHub Copilot – Faster R-CNN Client Revisited post, or the AIIoTForTheEdgeAndAzureBuiltWithCopilot repository.

The GitHub Copilot-generated code in my AIIoTForTheEdgeAndAzureBuiltWithCopilot repository was in turn “inspired” by the Object detection with Faster RCNN Deep Learning in C# and Image recognition with ResNet50v2 in C# samples.

The Copilot-generated code in this post is not suitable for production.

Building Edge AI with GitHub Copilot – Faster R-CNN Client Revisited

Introduction

While building the first GitHub Copilot version I noticed that changing System.Drawing.Common to SixLabors.ImageSharp made a significant difference to the generated code. For this post I'll be using GitHub Copilot to generate code for a console application that uses a Faster R-CNN ONNX model to process an image loaded from disk.

please write a C# console application that uses Onnx to run a ONNX Faster-RCNN object detection model on an image loaded from disk

I manually added the System.Drawing.Common and Microsoft.ML.OnnxRuntime NuGet packages and the required using directives, and then the code compiled.

static void Main(string[] args)
{
   // Path to the ONNX model and input image
   string modelPath = "FasterRCNN-10.onnx";
   string imagePath = "sports.jpg";

   // Load the image
   Bitmap image = new Bitmap(imagePath);
   var inputTensor = PreprocessImage(image);

   // Load the ONNX model
   using var session = new InferenceSession(modelPath);

   // Create input data
   var inputName = session.InputMetadata.Keys.First();
   var inputs = new List<NamedOnnxValue>
   {
         NamedOnnxValue.CreateFromTensor(inputName, inputTensor)
   };

   //// Run inference
   using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);

   // Process output
   foreach (var result in results)
   {
      Console.WriteLine($"{result.Name}: {string.Join(", ", result.AsEnumerable<float>())}");
   }
}

static DenseTensor<float> PreprocessImage(Bitmap image)
{
   // Resize image to model's expected input size (e.g., 224x224)
   int targetWidth = 224;
   int targetHeight = 224;
   Bitmap resizedImage = new Bitmap(image, new Size(targetWidth, targetHeight));

   // Convert image to float array and normalize
   var input = new DenseTensor<float>(new[] { 1, 3, targetHeight, targetWidth });
   for (int y = 0; y < targetHeight; y++)
   {
      for (int x = 0; x < targetWidth; x++)
      {
         Color pixel = resizedImage.GetPixel(x, y);
         input[0, 0, y, x] = pixel.R / 255f;
         input[0, 1, y, x] = pixel.G / 255f;
         input[0, 2, y, x] = pixel.B / 255f;
      }
   }

   return input;
}

I fixed the paths to the ONNX model and image file; the application then ran but there was an error.

The dimensions of the input tensor were wrong based on the Faster R-CNN documentation because the declaration had an extra dimension.

Change dimensions of tensor by removing batch from declaration

static DenseTensor<float> PreprocessImage(Bitmap image)
{
   // Resize image to model's expected input size (e.g., 224x224)
   int targetWidth = 224;
   int targetHeight = 224;
   Bitmap resizedImage = new Bitmap(image, new Size(targetWidth, targetHeight));

   // Convert image to float array and normalize
   var input = new DenseTensor<float>(new[] { 3, targetHeight, targetWidth });
   for (int y = 0; y < targetHeight; y++)
   {
      for (int x = 0; x < targetWidth; x++)
      {
         Color pixel = resizedImage.GetPixel(x, y);
         input[0, y, x] = pixel.R / 255f;
         input[1, y, x] = pixel.G / 255f;
         input[2, y, x] = pixel.B / 255f;
      }
   }

   return input;
}

GitHub Copilot fixed the tensor declaration by removing the batch id, then the application ran but there was an error.

The dimensions of the input image were wrong so, based on the Faster R-CNN documentation, I used GitHub Copilot to resize the image.

resize the image such that both height and width are within the range of [800, 1333], such that both height and width are divisible by 32

static Bitmap ResizeImageForModel(Bitmap image)
{
   // Define the target range and divisibility
   const int minSize = 800;
   const int maxSize = 1333;
   const int divisor = 32;

   // Get original dimensions
   int originalWidth = image.Width;
   int originalHeight = image.Height;

   // Calculate scale factor to fit within the range while maintaining aspect ratio
   float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight),
                          (float)minSize / Math.Min(originalWidth, originalHeight));

   // Calculate new dimensions
   int newWidth = (int)(originalWidth * scale);
   int newHeight = (int)(originalHeight * scale);

   // Ensure dimensions are divisible by 32
   newWidth = (newWidth / divisor) * divisor;
   newHeight = (newHeight / divisor) * divisor;

   // Resize the image
   return new Bitmap(image, new Size(newWidth, newHeight));
}

GitHub Copilot fixed the image resizing, so the application ran but there was still an error.

The processing of the output tensor was wrong so, based on the Faster R-CNN documentation, I used GitHub Copilot to add the code required to “correctly” display the results.

Display label, confidence and bounding box

I also manually added the using directive for System.Drawing.Drawing2D

static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results, Bitmap image)
{
   // Extract output tensors
   var boxes = results.First(r => r.Name == "boxes").AsEnumerable<float>().ToArray();
   var labels = results.First(r => r.Name == "labels").AsEnumerable<long>().ToArray();
   var scores = results.First(r => r.Name == "scores").AsEnumerable<float>().ToArray();

   using Graphics graphics = Graphics.FromImage(image);
   graphics.SmoothingMode = SmoothingMode.AntiAlias;

   for (int i = 0; i < labels.Length; i++)
   {
      if (scores[i] < 0.5) continue; // Filter low-confidence detections

      // Extract bounding box coordinates
      float x1 = boxes[i * 4];
      float y1 = boxes[i * 4 + 1];
      float x2 = boxes[i * 4 + 2];
      float y2 = boxes[i * 4 + 3];

      // Draw bounding box
      RectangleF rect = new RectangleF(x1, y1, x2 - x1, y2 - y1);
      graphics.DrawRectangle(Pens.Red, rect.X, rect.Y, rect.Width, rect.Height);

      // Display label and confidence
      string label = $"Label: {labels[i]}, Confidence: {scores[i]:0.00}";
      graphics.DrawString(label, new Font("Arial", 12), Brushes.Yellow, new PointF(x1, y1 - 20));
   }

   // Save the image with annotations
   image.Save("output.jpg");
   Console.WriteLine("Output image saved as 'output.jpg'.");
}

The application ran but there was an error because the output tensor names were wrong.

I used Netron to determine the correct output tensor names.

It was quicker to manually fix the output tensor names.
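
Alternatively, the output tensor names can be read at run time from the session metadata (using the session instance from the code above). A minimal sketch:

// List the model's output tensor names and element types from the session
// metadata instead of inspecting the graph in Netron.
foreach (var output in session.OutputMetadata)
{
   Console.WriteLine($"Output: {output.Key}, element type: {output.Value.ElementType}");
}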

static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results, Bitmap image)
 {
    // Extract output tensors
    var boxes = results.First(r => r.Name == "6379").AsEnumerable<float>().ToArray();
    var labels = results.First(r => r.Name == "6381").AsEnumerable<long>().ToArray();
    var scores = results.First(r => r.Name == "6383").AsEnumerable<float>().ToArray();

    using Graphics graphics = Graphics.FromImage(image);
    graphics.SmoothingMode = SmoothingMode.AntiAlias;

    for (int i = 0; i < labels.Length; i++)
    {
       if (scores[i] < 0.5) continue; // Filter low-confidence detections

       // Extract bounding box coordinates
       float x1 = boxes[i * 4];
       float y1 = boxes[i * 4 + 1];
       float x2 = boxes[i * 4 + 2];
       float y2 = boxes[i * 4 + 3];

       // Draw bounding box
       RectangleF rect = new RectangleF(x1, y1, x2 - x1, y2 - y1);
       graphics.DrawRectangle(Pens.Red, rect.X, rect.Y, rect.Width, rect.Height);

       // Display label and confidence
       string label = $"Label: {labels[i]}, Confidence: {scores[i]:0.00}";
       graphics.DrawString(label, new Font("Arial", 12), Brushes.Yellow, new PointF(x1, y1 - 20));
    }

    // Save the image with annotations
    image.Save("output.jpg");
    Console.WriteLine("Output image saved as 'output.jpg'.");
 }

The application ran but the results were bad, so I checked the format of the input tensor and figured out the mean adjustment was missing.

Apply mean to each channel

I used GitHub Copilot to add code for the mean adjustment for each pixel

static DenseTensor<float> PreprocessImage(Bitmap image)
{
   // Resize image to model's expected input size  
   Bitmap resizedImage = ResizeImageForModel(image);

   // Apply FasterRCNN mean values to each channel  
   float[] mean = { 102.9801f, 115.9465f, 122.7717f };

   // Convert image to float array and normalize  
   var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
   for (int y = 0; y < resizedImage.Height; y++)
   {
      for (int x = 0; x < resizedImage.Width; x++)
      {
         Color pixel = resizedImage.GetPixel(x, y);
         input[0, y, x] = (pixel.R - mean[0]) / 255f;
         input[1, y, x] = (pixel.G - mean[1]) / 255f;
         input[2, y, x] = (pixel.B - mean[2]) / 255f;
      }
   }

   return input;
}

The application ran but the results were still bad, so I checked the format of the input tensor and figured out the mean adjustment was wrong: Faster R-CNN expects the per-channel means subtracted from the raw 0-255 pixel values, not values scaled to [0, 1]. It was quicker to manually fix up the mean calculation.

static DenseTensor<float> PreprocessImage(Bitmap image)
{
   // Resize image to model's expected input size  
   Bitmap resizedImage = ResizeImageForModel(image);

   // Apply FasterRCNN mean values to each channel  
   float[] mean = { 102.9801f, 115.9465f, 122.7717f };

   // Convert image to float array and normalize  
   var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
   for (int y = 0; y < resizedImage.Height; y++)
   {
      for (int x = 0; x < resizedImage.Width; x++)
      {
         Color pixel = resizedImage.GetPixel(x, y);

         input[0, y, x] = pixel.R - mean[0];
         input[1, y, x] = pixel.G - mean[1];
         input[2, y, x] = pixel.B - mean[2];
      }
   }

   return input;
}

The application ran but the results were still bad, so I checked the format of the input tensor and figured out it should be BGR rather than RGB.

Change to B,G,R

static DenseTensor<float> PreprocessImage(Bitmap image)
{
   // Resize image to model's expected input size  
   Bitmap resizedImage = ResizeImageForModel(image);

   // Apply FasterRCNN mean values to each channel  
   float[] mean = { 102.9801f, 115.9465f, 122.7717f };

   // Convert image to float array and normalize  
   var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
   for (int y = 0; y < resizedImage.Height; y++)
   {
      for (int x = 0; x < resizedImage.Width; x++)
      {
         Color pixel = resizedImage.GetPixel(x, y);
         input[0, y, x] = pixel.B - mean[0] ;
         input[1, y, x] = pixel.G - mean[1] ;
         input[2, y, x] = pixel.R - mean[2] ;
      }
   }

   return input;
}

Finally, the application's minimum bounding rectangles (MBRs), labels and confidences looked correct.

Summary

The GitHub Copilot generated code looked like it was “inspired” by the onnxruntime.ai Object detection with Faster RCNN Deep Learning in C# sample.

The additional code for marking up the image in ProcessOutput was unexpected, and I wonder if there wasn't a SixLabors.ImageSharp example for “inspiration”.

The Copilot-generated code in this post is not suitable for production.

Building Edge AI with GitHub Copilot – Faster R-CNN Client

Introduction

For this post I’ll be using GitHub Copilot to generate code for a console application that uses a Faster R-CNN ONNX model to process an image loaded from disk.

please write a C# console application that uses Onnx to run a ONNX Faster-RCNN object detection model on an image loaded from disk

I added the System.Drawing.Common and Microsoft.ML.OnnxRuntime NuGet packages and the required using directives, then the code compiled. The ProcessOutput method was stubbed out, so I would need to figure out the format of the output tensor.

static void Main(string[] args)
{
   string modelPath = "FasterRCNN-10.onnx";
   string imagePath = "sports.jpg";

   using var session = new InferenceSession(modelPath);
   using var image = new Bitmap(imagePath);

   var inputTensor = ExtractTensorFromImage(image);
   var inputs = new List<NamedOnnxValue>
   {
       NamedOnnxValue.CreateFromTensor("image", inputTensor)
   };

   using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);
   var output = results.First().AsEnumerable<float>().ToArray();

   // Process the output (e.g., draw bounding boxes on the image)
   ProcessOutput(output, image);

   image.Save("output.jpg", ImageFormat.Jpeg);
   Console.WriteLine("Object detection completed. Output saved as output.jpg");
}

private static DenseTensor<float> ExtractTensorFromImage(Bitmap image)
{
   int width = image.Width;
   int height = image.Height;
   var tensor = new DenseTensor<float>(new[] { 1, 3, height, width });

   for (int y = 0; y < height; y++)
   {
      for (int x = 0; x < width; x++)
      {
         Color pixel = image.GetPixel(x, y);
         tensor[0, 0, y, x] = pixel.R / 255.0f;
         tensor[0, 1, y, x] = pixel.G / 255.0f;
         tensor[0, 2, y, x] = pixel.B / 255.0f;
      }
   }

   return tensor;
}

private static void ProcessOutput(float[] output, Bitmap image)
{
   // Implement the logic to process the output and draw bounding boxes on the image
   // This is a placeholder for the actual implementation
}

I fixed the paths to the ONNX model and image files; the application then ran but there was an error.

The dimensions of the input tensor were wrong because the declaration included a batch id.

Change dimensions of tensor by removing batch from declaration

private static DenseTensor<float> ExtractTensorFromImage(Bitmap image)
{
   int width = image.Width;
   int height = image.Height;
   var tensor = new DenseTensor<float>(new[] { 3, height, width });

   for (int y = 0; y < height; y++)
   {
      for (int x = 0; x < width; x++)
      {
         Color pixel = image.GetPixel(x, y);
         tensor[0, y, x] = pixel.R / 255.0f;
         tensor[1, y, x] = pixel.G / 255.0f;
         tensor[2, y, x] = pixel.B / 255.0f;
      }
   }

   return tensor;
 }

GitHub Copilot fixed the tensor declaration by removing the batch id. The application also used System.Drawing which is not supported on my target embedded platforms.

change from System.Drawing to ImageSharp

I added the SixLabors.ImageSharp NuGet package, removed System.Drawing.Common, and then updated the using directives as required.

using var image = Image.Load<Rgb24>(imagePath);
...
private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
   int width = image.Width;
   int height = image.Height;
   var tensor = new DenseTensor<float>(new[] { 3, height, width });

   image.ProcessPixelRows(accessor =>
   {
      for (int y = 0; y < height; y++)
      {
         var pixelRow = accessor.GetRowSpan(y);
         for (int x = 0; x < width; x++)
         {
            tensor[0, y, x] = pixelRow[x].R / 255.0f;
            tensor[1, y, x] = pixelRow[x].G / 255.0f;
            tensor[2, y, x] = pixelRow[x].B / 255.0f;
         }
      }
   });

   return tensor;
}

The application ran but there was an error as the dimensions of the input image were wrong.

Based on the Faster R-CNN documentation I used ImageSharp to resize the image

Use ImageSharp to resize the image such that both height and width are within the range of [800, 1333], such that both height and width are divisible by 32.

 private static void ResizeImage(Image<Rgb24> image)
 {
    const int minSize = 800;
    const int maxSize = 1333;

    int originalWidth = image.Width;
    int originalHeight = image.Height;

    float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight), (float)minSize / Math.Min(originalWidth, originalHeight));
    int newWidth = (int)(originalWidth * scale);
    int newHeight = (int)(originalHeight * scale);

    // Ensure dimensions are divisible by 32
    newWidth = (newWidth / 32) * 32;
    newHeight = (newHeight / 32) * 32;

    image.Mutate(x => x.Resize(newWidth, newHeight));
 }

The application ran but the results were bad, so I checked the format of the input tensor and figured out the R, G, B values needed to be adjusted.

Apply mean to each channel

private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
   int width = image.Width;
   int height = image.Height;
   var tensor = new DenseTensor<float>(new[] { 3, height, width });

   // Mean values for each channel
   float[] mean = { 0.485f, 0.456f, 0.406f };

   image.ProcessPixelRows(accessor =>
   {
      for (int y = 0; y < height; y++)
      {
         var pixelRow = accessor.GetRowSpan(y);
         for (int x = 0; x < width; x++)
         {
            tensor[0, y, x] = (pixelRow[x].R / 255.0f) - mean[0];
            tensor[1, y, x] = (pixelRow[x].G / 255.0f) - mean[1];
            tensor[2, y, x] = (pixelRow[x].B / 255.0f) - mean[2];
         }
      }
   });

   return tensor;
 }

The application ran but the results were still bad, so I checked the format of the input tensor and figured out the ResNet50 means had been used and the input tensor should be BGR rather than RGB.

Use FasterRCNN means not resnet ones
Change to B,G,R

private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
   int width = image.Width;
   int height = image.Height;
   var tensor = new DenseTensor<float>(new[] { 3, height, width });

   // Mean values for each channel for FasterRCNN
   float[] mean = { 102.9801f, 115.9465f, 122.7717f };

   image.ProcessPixelRows(accessor =>
   {
      for (int y = 0; y < height; y++)
      {
         var pixelRow = accessor.GetRowSpan(y);
         for (int x = 0; x < width; x++)
         {
            tensor[0, y, x] = pixelRow[x].B - mean[0];
            tensor[1, y, x] = pixelRow[x].G - mean[1];
            tensor[2, y, x] = pixelRow[x].R - mean[2];
         }
      }
   });

   return tensor;
}

When I inspected the values in the output tensor in the debugger they looked “reasonable”, so I got GitHub Copilot to add the code required to display the results.

Display label, confidence and bounding box

The application ran but there was an exception because the names of the output tensor “dimensions” were wrong.

I used Netron to get the correct output tensor “dimension” names.

I then manually fixed the output tensor “dimension” names

private static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> output)
{
   var boxes = output.First(x => x.Name == "6379").AsTensor<float>().ToArray();
   var labels = output.First(x => x.Name == "6381").AsTensor<long>().ToArray();
   var confidences = output.First(x => x.Name == "6383").AsTensor<float>().ToArray();

   const float minConfidence = 0.7f;

   for (int i = 0; i < boxes.Length; i += 4)
   {
      var index = i / 4;
      if (confidences[index] >= minConfidence)
      {
         long label = labels[index];
         float confidence = confidences[index];
         float x1 = boxes[i];
         float y1 = boxes[i + 1];
         float x2 = boxes[i + 2];
         float y2 = boxes[i + 3];

         Console.WriteLine($"Label: {label}, Confidence: {confidence}, Bounding Box: [{x1}, {y1}, {x2}, {y2}]");
      }
   }
}
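
As an aside, rather than using Netron, the output tensor names can also be discovered at runtime from the session’s OutputMetadata. A minimal sketch (the model file name is a placeholder; the FasterRCNN model from the ONNX Model Zoo emits its outputs in boxes, labels, scores order):

// Enumerate the model's outputs at runtime rather than hard-coding "6379"/"6381"/"6383"
using Microsoft.ML.OnnxRuntime;

using var session = new InferenceSession("FasterRCNN-10.onnx"); // placeholder path

foreach (var output in session.OutputMetadata)
{
   Console.WriteLine($"Name: {output.Key} Type: {output.Value.ElementType} Dimensions: [{string.Join(",", output.Value.Dimensions)}]");
}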

I manually compared the output of the console application with the output of the equivalent YoloSharp application and the results looked close enough.

Summary

The Copilot prompts required to generate the code were significantly more complex than in previous examples and I had to regularly refer to the documentation to figure out what was wrong. The code wasn’t great and Copilot didn’t add much value.

The Copilot generated code in this post is not suitable for production.

Building Edge AI with GitHub Copilot – Security Camera HTTP YoloSharp

When I started with the Security Camera HTTP code and added code to process the images with an Ultralytics Object Detection model, I found the order of the prompts could make a difference. My first attempt at adding YoloSharp to the SecurityCameraHttpClient application with GitHub Copilot didn’t go well and needed some “human intervention”. When I thought more about the order of the prompts, adding the same functionality went a lot better.

// Use a stream rather than loading image from a file
// Use YoloSharp to run an onnx Object Detection model on the image
// Make the YoloPredictor a class variable
// Save image if object with specified image class name detected
// Modify so objectDetected supports multiple image class names
// Modify code to make use of GPU configurable
// Make display of detections configurable in app settings
// Make saving of image configurable in app settings

internal class Program
{
   private static HttpClient _client;
   private static bool _isRetrievingImage = false;
   private static ApplicationSettings _applicationSettings;
   private static YoloPredictor _yoloPredictor;

   static void Main(string[] args)
   {
      Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss} SecurityCameraClient starting");
#if RELEASE
         Console.WriteLine("RELEASE");
#else
         Console.WriteLine("DEBUG");
#endif

      var configuration = new ConfigurationBuilder()
            .AddJsonFile("appsettings.json", false, true)
            .AddUserSecrets<Program>()
            .Build();

      _applicationSettings = configuration.GetSection("ApplicationSettings").Get<ApplicationSettings>();

      // Initialize YoloPredictor with GPU configuration
      _yoloPredictor = new YoloPredictor(_applicationSettings.OnnxModelPath, new YoloPredictorOptions()
      {
         UseCuda = _applicationSettings.UseCuda, // Configurable GPU usage
      });

      using (HttpClientHandler handler = new HttpClientHandler { Credentials = new NetworkCredential(_applicationSettings.Username, _applicationSettings.Password) })
      using (_client = new HttpClient(handler))
      using (var timer = new Timer(async _ => await RetrieveImageAsync(), null, _applicationSettings.TimerDue, _applicationSettings.TimerPeriod))
      {
         Console.WriteLine("Press any key to exit...");
         Console.ReadKey();
      }
   }

   private static async Task RetrieveImageAsync()
   {
      if (_isRetrievingImage) return;

      _isRetrievingImage = true;
      try
      {
         Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} SecurityCameraClient download starting");

         HttpResponseMessage response = await _client.GetAsync(_applicationSettings.CameraUrl);
         response.EnsureSuccessStatusCode();

         using (Stream imageStream = await response.Content.ReadAsStreamAsync())
         {
            var detections = _yoloPredictor.Detect(imageStream);
            bool objectDetected = false;

            foreach (var detection in detections)
            {
               if (_applicationSettings.LogDetections) // Check if logging detections is enabled
               {
                  Console.WriteLine($"Detected {detection.Name.Name} with confidence {detection.Confidence}");
               }

               if (_applicationSettings.ClassNames.Contains(detection.Name.Name))
               {
                  objectDetected = true;
               }
            }

            if (objectDetected && _applicationSettings.SaveImage) // Check if saving images is enabled
            {
               string savePath = string.Format(_applicationSettings.SavePath, DateTime.UtcNow);
               using (FileStream fileStream = new FileStream(savePath, FileMode.Create, FileAccess.Write, FileShare.None))
               {
                  imageStream.Position = 0;
                  await imageStream.CopyToAsync(fileStream);
               }
            }
         }

         Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} SecurityCameraClient download done");
      }
      catch (Exception ex)
      {
         Console.WriteLine($"An error occurred: {ex.Message}");
      }
      finally
      {
         _isRetrievingImage = false;
      }
   }
}

public class ApplicationSettings
{
   public string CameraUrl { get; set; } = "";
   public string SavePath { get; set; } = "";
   public string Username { get; set; } = "";
   public string Password { get; set; } = "";
   public TimeSpan TimerDue { get; set; } = TimeSpan.Zero;
   public TimeSpan TimerPeriod { get; set; } = TimeSpan.Zero;
   public string OnnxModelPath { get; set; } = "";
   public bool UseCuda { get; set; } = false; // Configurable GPU usage
   public List<string> ClassNames { get; set; } //= new List<string>();
   public bool LogDetections { get; set; } = false; // Configurable logging of detections 
   public bool SaveImage { get; set; } = false; // Configurable saving of images
}
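
For reference, a matching appsettings.json might look something like this. A sketch only: the property names mirror the ApplicationSettings class above, the values are illustrative, and in practice the Username and Password would live in user secrets, which the ConfigurationBuilder already loads.

{
   "ApplicationSettings": {
      "CameraUrl": "http://192.168.1.100/snapshot.jpg",
      "SavePath": "c:\\Images\\{0:yyMMddHHmmss}.jpg",
      "Username": "",
      "Password": "",
      "TimerDue": "00:00:05",
      "TimerPeriod": "00:00:30",
      "OnnxModelPath": "yolov8s.onnx",
      "UseCuda": false,
      "ClassNames": [ "person", "car" ],
      "LogDetections": true,
      "SaveImage": true
   }
}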

The interaction of Visual Studio IntelliSense with the GitHub Copilot prompts was interesting.

I wonder if this is because Visual Studio IntelliSense has local context, whereas GitHub Copilot has “cloud” context.

It took a couple of failed attempts to find the best prompt order, though I think the number of failures would reduce with experience.

The Copilot generated code in this post is not suitable for production.

Building Edge AI with AI – YoloDotNet Client

Introduction

For this post I used Copilot prompts to generate code which uses Ultralytics YoloV8 and YoloDotNet by NickSwardh for object detection, object classification, and pose estimation.

Object Detection

static void Main(string[] args)
{
   string modelPath = "path/to/yolo_model.onnx";  // Update with actual model path
   string imagePath = "path/to/image.jpg";        // Update with actual image path

   // Load YOLO model
   var yolo = new YoloModel(modelPath);

   // Load image
   using Bitmap image = new Bitmap(imagePath);

   // Run classification
   var results = yolo.Predict(image);

   // Display results
   foreach (var result in results)
   {
      Console.WriteLine($"Detected: {result.Label} with confidence {result.Confidence:P}");
   }
}

The code didn’t compile so I tried a different prompt.

This didn’t compile and looked like it was based on https://github.com/techwingslab/yolov5-net. The generated code also added code to check that the model and image files existed.

static void Main(string[] args)
{
   string modelPath = "yolov8.onnx"; // Replace with your actual model path
   string imagePath = "image.jpg"; // Replace with your actual image path

   if (!File.Exists(modelPath))
   {
      Console.WriteLine("Error: Model file not found!");
      return;
   }

   if (!File.Exists(imagePath))
   {
      Console.WriteLine("Error: Image file not found!");
      return;
   }

   try
   {
      // Load the YOLO model
      using var yolo = new Yolo(modelPath);

      // Load image from disk
      using var image = new Bitmap(imagePath);

      // Run object detection
      var results = yolo.Predict(image);

      // Display detected objects
      foreach (var result in results)
      {
         Console.WriteLine($"Detected: {result.Label} - Confidence: {result.Confidence}");
         Console.WriteLine($"Bounding Box: {result.BoundingBox}");
      }
   }
   catch (Exception ex)
   {
      Console.WriteLine($"Error: {ex.Message}");
   }
}

This didn’t compile, so I tried again.

static void Main(string[] args)
{
   string modelPath = "yolov8.onnx"; // Replace with your actual model path
   string imagePath = "image.jpg"; // Replace with your actual image path
...
   try
   {
      var yoloOptions = new YoloOptions
      {
         ConfidenceThreshold = 0.5, // Confidence threshold (adjust as needed)
         IoUThreshold = 0.4        // Intersection over Union threshold
      };

      // Load the YOLO model
      //using var yolo = new Yolo(modelPath);
      using var yolo = new Yolo(yoloOptions);

      // Load image from disk
      using var image = new Bitmap(imagePath);

      // Run object detection
      //var results = yolo.Predict(image);
      var results = yolo.RunObjectDetection(image);

      // Display detected objects
      foreach (var result in results)
      {
         Console.WriteLine($"Detected: {result.Label} - Confidence: {result.Confidence}");
         Console.WriteLine($"Bounding Box: {result.BoundingBox}");
      }
   }
   catch (Exception ex)
   {
      Console.WriteLine($"Error: {ex.Message}");
   }
}

This didn’t compile, so I tried a different approach.

I manually modified the code, removing ConfidenceThreshold and IoUThreshold, then used IntelliSense to “discover” and add the ModelType and OnnxModel properties.

static void Main(string[] args)
{
   string modelPath = "yolov8.onnx"; // Replace with your actual model path
   string imagePath = "image.jpg"; // Replace with your actual image path
...
   try
   {
      var yoloOptions = new YoloOptions
      {
         ModelType = ModelType.ObjectDetection,
         OnnxModel = modelPath
      };

      // Load the YOLO model
      //using var yolo = new Yolo(modelPath);
      //using var yolo = new Yolo(yoloOptions);
      //using var yolo = new Yolo(modelPath, yoloOptions);
      using var yolo = new Yolo(yoloOptions);

      // Load image using SkiaSharp
      using var skBitmap = SKBitmap.Decode(imagePath);

      // Convert SKBitmap to a format YOLO can process
      using var skImage = SKImage.FromBitmap(skBitmap);
      using var skData = skImage.Encode(SKEncodedImageFormat.Jpeg, 100);
      using var memoryStream = new MemoryStream(skData.ToArray());
      //var results = yolo.Predict(memoryStream);
      var results = yolo.RunObbDetection(skImage);

      // Display detected objects
      foreach (var result in results)
      {
         Console.WriteLine($"Detected: {result.Label} - Confidence: {result.Confidence}");
         Console.WriteLine($"Bounding Box: {result.BoundingBox}");
      }
   }
   catch (Exception ex)
   {
      Console.WriteLine($"Error: {ex.Message}");
   }
}

The code compiled and ran but didn’t work because YoloDotNet assumed that my computer had CUDA support.

static void Main(string[] args)
{
   string modelPath = "yolov8.onnx"; // Replace with your actual model path
   string imagePath = "image.jpg"; // Replace with your actual image path
...
   try
   {
      var yoloOptions = new YoloOptions
      {
         ModelType = ModelType.ObjectDetection,
         OnnxModel = modelPath,
         Cuda = false
      };

      // Load the YOLO model
      //using var yolo = new Yolo(modelPath);
      //using var yolo = new Yolo(yoloOptions);
      //using var yolo = new Yolo(modelPath, yoloOptions);
      using var yolo = new Yolo(yoloOptions);

      // Load image using SkiaSharp
      using var skBitmap = SKBitmap.Decode(imagePath);

      // Convert SKBitmap to a format YOLO can process
      using var skImage = SKImage.FromBitmap(skBitmap);
      using var skData = skImage.Encode(SKEncodedImageFormat.Jpeg, 100);
      using var memoryStream = new MemoryStream(skData.ToArray());
      //var results = yolo.Predict(memoryStream);
      var results = yolo.RunObbDetection(skImage);

      // Display detected objects
      foreach (var result in results)
      {
         Console.WriteLine($"Detected: {result.Label} - Confidence: {result.Confidence}");
         Console.WriteLine($"Bounding Box: {result.BoundingBox}");
      }
   }
   catch (Exception ex)
   {
      Console.WriteLine($"Error: {ex.Message}");
   }
}

The code compiled and ran but didn’t work because I had accidentally used oriented bounding box detection rather than object detection.

I was getting sick of prompting, so I manually fixed the SKImage loading, switched to RunObjectDetection, and then removed the “cruft”.

static void Main(string[] args)
{
   string modelPath = "yolov8s.onnx"; // Replace with your actual model path
   string imagePath = "sports.jpg"; // Replace with your actual image path
...
   try
   {
      var yoloOptions = new YoloOptions
      {
         ModelType = ModelType.ObjectDetection,
         OnnxModel = modelPath,
         Cuda = false
      };

      using var yolo = new Yolo(yoloOptions);

      using var skImage = SKImage.FromEncodedData(imagePath);

      var results = yolo.RunObjectDetection(skImage);

      foreach (var result in results)
      {
         Console.WriteLine($"Detected: {result.Label} - Confidence: {result.Confidence:F2}");
         Console.WriteLine($"Bounding Box: {result.BoundingBox}");
      }
   }
   catch (Exception ex)
   {
      Console.WriteLine($"Error: {ex.Message}");
   }

   Console.WriteLine("Press Enter to exit the application");
   Console.ReadLine();
}

I tested the implementation with the sample “sports” image from the YoloSharp GitHub repository.

The console application output looked reasonable.

Classification

My initial Copilot prompt

I don’t understand why a reference to OpenCV was included.

static void Main(string[] args)
{
   string modelPath = "path/to/yolo_model.onnx"; // Update with actual model path
   string imagePath = "path/to/image.jpg"; // Update with actual image path

   // Load YOLO model
   var yolo = new YoloModel(modelPath);

   // Load image
   using Bitmap image = new Bitmap(imagePath);

   // Run classification
   var results = yolo.Predict(image);

   // Display results
   foreach (var result in results)
   {
      Console.WriteLine($"Detected: {result.Label} with confidence {result.Confidence:P}");
   }
}

The code didn’t compile, so I prompted for the code to be modified to use SkiaSharp, which is used by YoloDotNet.

This was a bit strange, so I tried again.

I was getting sick of prompting, so I manually fixed the SKImage loading, switched to RunClassification, and then removed the “cruft”.

static void Main(string[] args)
{
   string modelPath = "yolov8s-cls.onnx";  // Update with actual model path
   string imagePath = "pizza.jpg";        // Update with actual image path

   var yolo = new Yolo(new YoloOptions()
   {
      ModelType = ModelType.Classification,
      OnnxModel = modelPath,
      Cuda = false
   });

   // Load image
   using SKImage image = SKImage.FromEncodedData(imagePath);

   // Run classification
   var results = yolo.RunClassification(image);

   // Display results
   foreach (var result in results)
   {
      Console.WriteLine($"Detected: {result.Label} with confidence {result.Confidence:P}");
   }

   Console.WriteLine("Press Enter to exit the application");
   Console.ReadLine();
}

At this point the code compiled and ran.

Pretty confident this is a picture of a pizza.

Pose

My Copilot prompt

Replace path/to/yolo_model.onnx and path/to/image.jpg with the actual paths to your model file and input image.

This example assumes that YoloDotNet V2 supports the loaded YOLO model. Verify compatibility with the YOLO ObjectDetection variant.

Copilot had “assumed” I meant Ultralytics Yolo V2 and the code didn’t compile. So, I tried again without V2.

At this point I gave up.
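
For completeness, extrapolating from the object detection and classification examples above, I’d expect a working version to look something like this. This is an untested sketch only, since I gave up at this point: ModelType.PoseEstimation, RunPoseEstimation and the result member names are assumptions based on the library’s naming pattern, not confirmed against the YoloDotNet API.

static void Main(string[] args)
{
   string modelPath = "yolov8s-pose.onnx"; // Hypothetical model path
   string imagePath = "person.jpg";        // Hypothetical image path

   var yolo = new Yolo(new YoloOptions()
   {
      ModelType = ModelType.PoseEstimation, // Assumed enum value, mirroring ModelType.Classification/ObjectDetection
      OnnxModel = modelPath,
      Cuda = false
   });

   // Load image
   using SKImage image = SKImage.FromEncodedData(imagePath);

   // Run pose estimation (assumed method name, mirroring RunClassification/RunObjectDetection)
   var results = yolo.RunPoseEstimation(image);

   // Display results (member names assumed to match the detection results earlier in this post)
   foreach (var result in results)
   {
      Console.WriteLine($"Detected: {result.Label} with confidence {result.Confidence:P}");
   }
}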

Summary

Using Copilot prompts to generate code which uses Ultralytics YoloV8 and YoloDotNet didn’t go well. In addition, Visual Studio IntelliSense often struggled to offer useful options.

Overall, the experience wasn’t great.

RTSP Camera rosenbjerg.FFMpegCore GDI Error

While working on my SecurityCameraRTSPClientFFMpegCore project I noticed that every so often, after opening the Real Time Streaming Protocol (RTSP) connection with my HiLook IPCT250H Security Camera, there was a “Parameter is not valid” or “A generic error occurred in GDI+.” exception and sometimes the image was corrupted.

My test harness code was “inspired” by the Continuous Snapshots on Live Stream #280 sample.

using (var ms = new MemoryStream())
{
    await FFMpegArguments
        .FromUrlInput(new Uri("udp://192.168.2.12:9000"))
        .OutputToPipe(new StreamPipeSink(ms), options => options
            .ForceFormat("rawvideo")
            .WithVideoCodec(VideoCodec.Png)
            .Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
            .WithCustomArgument("-vf fps=1 -update 1")
        )
        .NotifyOnProgress(o => 
        {
            try
            {
                if (ms.Length > 0)
                {
                    ms.Position = 0;
                    using (var bitmap = new Bitmap(ms))
                    {
                        // Modify bitmap here

                        // Save the bitmap
                        bitmap.Save("test.png");
                    }

                    ms.SetLength(0);
                }
            }
            catch { }
        })
        .ProcessAsynchronously();
}

My implementation is slightly different because I caught and then displayed any exceptions generated while converting the image stream to a bitmap or saving it.

using (var ms = new MemoryStream())
{
   await FFMpegArguments
         .FromUrlInput(new Uri(_applicationSettings.CameraUrl))
         .OutputToPipe(new StreamPipeSink(ms), options => options
         .ForceFormat("mpeg1video")
         //.ForceFormat("rawvideo")
         .WithCustomArgument("-rtsp_transport tcp")
         .WithFramerate(10)
         .WithVideoCodec(VideoCodec.Png)
         //.Resize(1024, 1024)
         //.ForceFormat("image2pipe")
         //.Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
         //.Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
         //.WithCustomArgument("-vf fps=1 -update 1")
         //.WithCustomArgument("-vf fps=5 -update 1")
         //.WithSpeedPreset( Speed.)
         //.UsingMultithreading()
         //.UsingThreads()
         //.WithVideoFilters(filter => filter.Scale(640, 480))
         //.UsingShortest()
         //.WithFastStart()
         )
         .NotifyOnProgress(o =>
         {
            try
            {
               if (ms.Length > 0)
               {
                  ms.Position = 0;

                   string outputPath = Path.Combine(_applicationSettings.SavePath, string.Format(_applicationSettings.FrameFileNameFormat, DateTime.UtcNow));

                  using (var bitmap = new Bitmap(ms))
                  {
                     // Save the bitmap
                     bitmap.Save(outputPath);
                  }

                  ms.SetLength(0);
               }
            }
            catch (Exception ex)
            {
               Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} {ex.Message}");
            }
         })
         .ProcessAsynchronously();
}

I have created an issue, Continuous Snapshots on Live Stream Memory stream contains invalid bitmap image #562, to track the problem.

One odd thing that I noticed when scrolling “back and forth” through the images around the time there was an exception was that the date and time in the top left of the image were broken.

I wonder if the image was “broken” in some subtle way and FFMpegCore is handling this differently to the other libraries I’m trialing.
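
One cheap check that might help narrow this down (a sketch, not something in the current code) is validating the PNG signature before handing the MemoryStream to the Bitmap constructor, so obviously truncated frames are skipped rather than throwing inside GDI+.

// A minimal sketch, assuming the frames are PNG encoded (as requested via VideoCodec.Png).
// Checks the 8 byte PNG signature before constructing the Bitmap so obviously
// truncated or corrupt frames can be skipped and logged instead of throwing.
private static readonly byte[] PngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

private static bool LooksLikePng(MemoryStream ms)
{
   if (ms.Length < PngSignature.Length)
   {
      return false;
   }

   ms.Position = 0;
   for (int i = 0; i < PngSignature.Length; i++)
   {
      if (ms.ReadByte() != PngSignature[i])
      {
         ms.Position = 0;
         return false;
      }
   }

   ms.Position = 0;
   return true;
}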