Building Edge AI with GitHub Copilot- Security Camera HTTP YoloSharp

When I started with the Security Camera HTTP code and added code to process the images with an Ultralytics Object Detection model, I found the order of the prompts could make a difference. My first attempt at adding YoloSharp to the SecurityCameraHttpClient application with GitHub Copilot didn’t go well and needed some “human intervention”. When I thought more about the order of the prompts, adding the same functionality went a lot better.

// Use a stream rather than loading image from a file
// Use YoloSharp to run an onnx Object Detection model on the image
// Make the YoloPredictor a class variable
// Save image if object with specified image class name detected
// Modify so objectDetected supports multiple image class names
// Modify code to make use of GPU configurable
// Make display of detections configurable in app settings
// Make saving of image configurable in app settings

internal class Program
{
   private static HttpClient _client;
   private static bool _isRetrievingImage = false;
   private static ApplicationSettings _applicationSettings;
   private static YoloPredictor _yoloPredictor;

   static void Main(string[] args)
   {
      Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss} SecurityCameraClient starting");
#if RELEASE
      Console.WriteLine("RELEASE");
#else
      Console.WriteLine("DEBUG");
#endif

      var configuration = new ConfigurationBuilder()
            .AddJsonFile("appsettings.json", false, true)
            .AddUserSecrets<Program>()
            .Build();

      _applicationSettings = configuration.GetSection("ApplicationSettings").Get<ApplicationSettings>();

      // Initialize YoloPredictor with GPU configuration
      _yoloPredictor = new YoloPredictor(_applicationSettings.OnnxModelPath, new YoloPredictorOptions()
      {
         UseCuda = _applicationSettings.UseCuda, // Configurable GPU usage
      });

      using (HttpClientHandler handler = new HttpClientHandler { Credentials = new NetworkCredential(_applicationSettings.Username, _applicationSettings.Password) })
      using (_client = new HttpClient(handler))
      using (var timer = new Timer(async _ => await RetrieveImageAsync(), null, _applicationSettings.TimerDue, _applicationSettings.TimerPeriod))
      {
         Console.WriteLine("Press any key to exit...");
         Console.ReadKey();
      }
   }

      private static async Task RetrieveImageAsync()
      {
         if (_isRetrievingImage) return;

         _isRetrievingImage = true;
         try
         {
            Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} SecurityCameraClient download starting");

            HttpResponseMessage response = await _client.GetAsync(_applicationSettings.CameraUrl);
            response.EnsureSuccessStatusCode();

            using (Stream imageStream = await response.Content.ReadAsStreamAsync())
            {
               var detections = _yoloPredictor.Detect(imageStream);
               bool objectDetected = false;

               foreach (var detection in detections)
               {
                  if (_applicationSettings.LogDetections) // Check if logging detections is enabled
                  {
                     Console.WriteLine($"Detected {detection.Name.Name} with confidence {detection.Confidence}");
                  }

                  if (_applicationSettings.ClassNames.Contains(detection.Name.Name))
                  {
                     objectDetected = true;
                  }
               }

               if (objectDetected && _applicationSettings.SaveImage) // Check if saving images is enabled
               {
                  string savePath = string.Format(_applicationSettings.SavePath, DateTime.UtcNow);
                  using (FileStream fileStream = new FileStream(savePath, FileMode.Create, FileAccess.Write, FileShare.None))
                  {
                     imageStream.Position = 0;
                     await imageStream.CopyToAsync(fileStream);
                  }
               }
            }

            Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} SecurityCameraClient download done");
         }
         catch (Exception ex)
         {
            Console.WriteLine($"An error occurred: {ex.Message}");
         }
         finally
         {
            _isRetrievingImage = false;
         }
      }
}

public class ApplicationSettings
{
   public string CameraUrl { get; set; } = "";
   public string SavePath { get; set; } = "";
   public string Username { get; set; } = "";
   public string Password { get; set; } = "";
   public TimeSpan TimerDue { get; set; } = TimeSpan.Zero;
   public TimeSpan TimerPeriod { get; set; } = TimeSpan.Zero;
   public string OnnxModelPath { get; set; } = "";
   public bool UseCuda { get; set; } = false; // Configurable GPU usage
   public List<string> ClassNames { get; set; } //= new List<string>();
   public bool LogDetections { get; set; } = false; // Configurable logging of detections 
   public bool SaveImage { get; set; } = false; // Configurable saving of images
}
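
The SavePath setting is passed to string.Format together with the current UTC time, so it can carry a composite format item for the timestamp. The post does not show the appsettings.json value, so the pattern below is only an assumed example, and SavePathSketch is a hypothetical helper rather than part of the Copilot generated code.

```csharp
using System;
using System.Globalization;

public static class SavePathSketch
{
    // Assumed example pattern; the real SavePath value is not shown in the post.
    public const string ExamplePattern = "images/{0:yyyyMMdd-HHmmss}.jpg";

    // Expands the {0:...} format item with the supplied timestamp,
    // mirroring the string.Format call in RetrieveImageAsync.
    public static string Build(string pattern, DateTime utcNow) =>
        string.Format(CultureInfo.InvariantCulture, pattern, utcNow);
}
```

With this pattern, a detection at 03:04:05 UTC on 2 January 2025 would be saved as images/20250102-030405.jpg, so each saved image gets its own file name as long as detections are at least a second apart.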

The interaction of Visual Studio IntelliSense with the GitHub Copilot prompts was interesting.

I wonder if this is because Visual Studio IntelliSense has local context, whereas GitHub Copilot has “cloud” context.

It took a couple of failed attempts to find the best order of prompts, which I think would reduce over time.

The Copilot generated code in this post is not suitable for production.

NickSwardh NuGet Nvidia Jetson Orin Nano™ GPU CUDA Inferencing

My Seeedstudio reComputer J3011 has two processors: an ARM64 CPU and an Nvidia Jetson Orin 8G coprocessor. YoloDotNet by NickSwardh V2 (uses SkiaSharp) was significantly faster than the other NuGets when run on the ARM64 CPU, so I wanted to try inferencing with the Nvidia Jetson Orin 8G coprocessor.

Performance of YoloDotNet by NickSwardh V2 running on the ARM64 CPU

Performance of YoloDotNet by NickSwardh V2 running on the Nvidia Jetson Orin 8G with Compute Unified Device Architecture (CUDA) enabled.

Enabling CUDA reduced the total image scaling, pre-processing, inferencing, and post-processing time from 115 mSec to 36 mSec, which is roughly a threefold improvement.
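
To put the two timings in proportion, the speedup and percentage reduction can be worked out with a couple of one-line helpers. This is just arithmetic on the two figures quoted above; SpeedupSketch is a hypothetical name.

```csharp
public static class SpeedupSketch
{
    // How many times faster the "after" timing is than the "before" timing.
    public static double Speedup(double beforeMs, double afterMs) => beforeMs / afterMs;

    // Percentage of the original time that was saved.
    public static double ReductionPercent(double beforeMs, double afterMs) =>
        (beforeMs - afterMs) / beforeMs * 100.0;
}
```

For the CPU and CUDA timings, Speedup(115, 36) is about 3.2 and ReductionPercent(115, 36) is about 68.7.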

YoloV8 NuGet on a diet – Part 1

Recently most of my YoloV8 projects run inference on edge devices or in Azure and do not use Graphics Processing Unit (GPU) hardware. Most of my projects use the dme-compunet YoloV8 NuGet, and for compatibility reasons (removal of extended CUDA & TensorRT configuration functionality) this post uses version 4.2 of the source.

The initial version of the YoloV8.dll in version 4.2 of the NuGet was 96.5 KB. Most of my applications deployed to edge devices and Azure do not require plotting functionality, so I started by commenting it out (not terribly subtle).

With the plotting functionality “commented out” in the intermediate, “diet” version of the YoloV8.dll in version 4.2 of the NuGet, the SixLabors.ImageSharp.Drawing NuGet could also be removed.

The final release version of the YoloV8.dll in version 4.2 of the NuGet was 64.0 KB. The main purpose of this process was to remove any unnecessary functionality to see how hard it would be to replace SixLabors.ImageSharp and SixLabors.ImageSharp.Drawing with libraries that have better performance.

YoloV8-NuGet Performance ARM64 CPU

To see how the dme-compunet, updated YoloDotNet and sstainba NuGets performed on an ARM64 CPU I built a test rig for the different NuGets using standard images and ONNX Models.
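
The test rig itself is not shown in the post. As a rough idea of how per-image timings like the ones below could be collected, here is a minimal sketch that uses Stopwatch to average several runs of an arbitrary action. TimingSketch and its method name are hypothetical, and the action passed in would wrap whichever NuGet's predict call is being measured.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action once as a warm-up (model load, JIT), then returns
    // the mean elapsed milliseconds over the requested number of runs.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        action(); // warm-up run, not included in the timing

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Excluding the first run matters for this kind of comparison because the first inference usually includes one-off start-up costs that would otherwise skew the average.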

I started with the dme-compunet YoloV8 NuGet which found all the tennis balls and the results were consistent with earlier tests.

The YoloDotNet by NickSwardh NuGet update had some “breaking changes” so I built “old” and “updated” test harnesses.

The YoloDotNet by NickSwardh V1 and V2 results were slightly different. The V2 NuGet uses SkiaSharp which appears to significantly improve the performance.

Even though the YoloV8 by sstainba NuGet hadn’t been updated I ran the test harness just in case.

The dme-compunet YoloV8 and NickSwardh YoloDotNet V1 versions produced the same results, but the NickSwardh YoloDotNet V2 results were slightly different.

  • dme-compunet 291 mSec
  • NickSwardh V1 480 mSec
  • NickSwardh V2 115 mSec
  • sstainba 422 mSec

As in the YoloV8-NuGet Performance X64 CPU post, the NickSwardh V2 implementation, which uses SkiaSharp, was significantly faster, so it looks like SixLabors.ImageSharp is the issue.

Supporting Compute Unified Device Architecture (CUDA) or TensorRT inferencing with NickSwardh V2 (for SkiaSharp) will need some major modifications to the code, so it might be better to build my own YoloV8 NuGet.

YoloV8-NuGet Performance X64 CPU

When checking the dme-compunet, YoloDotNet, and sstainba NuGets I noticed the YoloDotNet readme.md detailed some performance enhancements…

What’s new in YoloDotNet v2.0?

YoloDotNet 2.0 is a Speed Demon release where the main focus has been on supercharging performance to bring you the fastest and most efficient version yet. With major code optimizations, a switch to SkiaSharp for lightning-fast image processing, and added support for Yolov10 as a little extra 😉 this release is set to redefine your YoloDotNet experience:

Changing the implementation to use SkiaSharp caught my attention because in previous testing, manipulating images with the SixLabors.ImageSharp library took longer than expected.

I built a test rig for comparing the performance of the different NuGets using standard images and ONNX Models.

I started with the dme-compunet YoloV8 NuGet which found all the tennis balls and the results were consistent with earlier tests.

dme-compunet test harness image bounding boxes

The YoloDotNet by NickSwardh NuGet update had some “breaking changes” so I built “old” and “updated” test harnesses. The V1 version found all the tennis balls and the results were consistent with earlier tests.

NickSwardh V1 test harness image bounding boxes

The “breaking changes” in the YoloDotNet by NickSwardh NuGet update meant some code changes for the V2 test harness, and the V1 and V2 results were slightly different.

NickSwardh V2 test harness image bounding boxes

Even though the YoloV8 by sstainba NuGet hadn’t been updated I ran the test harness just in case and the results were consistent with previous tests.

sstainba test harness image bounding boxes

The dme-compunet YoloV8 and NickSwardh YoloDotNet V1 versions produced the same results, but the NickSwardh YoloDotNet V2 results were slightly different. The YoloV8 by sstainba results were unchanged.

  • dme-compunet 71 mSec
  • NickSwardh V1 76 mSec
  • NickSwardh V2 33 mSec
  • sstainba 82 mSec

The NickSwardh V2 implementation was significantly faster, but I need to investigate the slight difference in the bounding boxes. It looks like SixLabors.ImageSharp might be the performance issue.

Azure Event Grid YoloV8- Basic MQTT Client Pose Estimation

The Azure.EventGrid.Image.YoloV8.Pose application downloads images from a security camera, processes them with the default YoloV8 (by Ultralytics) Pose Estimation model, then publishes the results to an Azure Event Grid MQTT broker topic.

private async void ImageUpdateTimerCallback(object? state)
{
   DateTime requestAtUtc = DateTime.UtcNow;

   // Just in case - stop code being called while photo or prediction already in progress
   if (_ImageProcessing)
   {
      return;
   }
   _ImageProcessing = true;

   try
   {
      _logger.LogDebug("Camera request start");

      PoseResult result;

      using (Stream cameraStream = await _httpClient.GetStreamAsync(_applicationSettings.CameraUrl))
      {
         result = await _predictor.PoseAsync(cameraStream);
      }

      _logger.LogInformation("Speed Preprocess:{Preprocess} Postprocess:{Postprocess}", result.Speed.Preprocess, result.Speed.Postprocess);


      if (_logger.IsEnabled(LogLevel.Debug))
      {
         _logger.LogDebug("Pose results");

         foreach (var box in result.Boxes)
         {
            _logger.LogDebug(" Class:{box.Class} Confidence:{Confidence:f1}% X:{X} Y:{Y} Width:{Width} Height:{Height}", box.Class.Name, box.Confidence * 100.0, box.Bounds.X, box.Bounds.Y, box.Bounds.Width, box.Bounds.Height);

            foreach (var keypoint in box.Keypoints)
            {
               Model.PoseMarker poseMarker = (Model.PoseMarker)keypoint.Index;

               _logger.LogDebug("  Class:{Class} Confidence:{Confidence:f1}% X:{X} Y:{Y}", Enum.GetName(poseMarker), keypoint.Confidence * 100.0, keypoint.Point.X, keypoint.Point.Y);
            }
         }
      }

      var message = new MQTT5PublishMessage
      {
         Topic = string.Format(_applicationSettings.PublishTopic, _applicationSettings.UserName),
         Payload = Encoding.ASCII.GetBytes(JsonSerializer.Serialize(new
         {
            result.Boxes
         })),
         QoS = _applicationSettings.PublishQualityOfService,
      };

      _logger.LogDebug("HiveMQ.Publish start");

      var resultPublish = await _mqttclient.PublishAsync(message);

      _logger.LogDebug("HiveMQ.Publish done");
   }
   catch (Exception ex)
   {
      _logger.LogError(ex, "Camera image download, processing, or telemetry failed");
   }
   finally
   {
      _ImageProcessing = false;
   }

   TimeSpan duration = DateTime.UtcNow - requestAtUtc;

   _logger.LogDebug("Camera Image download, processing and telemetry done {TotalSeconds:f2} sec", duration.TotalSeconds);
}

The application uses a Timer (with configurable Due and Period times) to poll the security camera, run pose estimation on the image, then publish a JavaScript Object Notation (JSON) representation of the results to an Azure Event Grid MQTT broker topic using a HiveMQ client.
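
System.Threading.Timer callbacks run on thread pool threads, so if a callback takes longer than the Period a second one can start while the first is still running. The _ImageProcessing bool check-then-set in the code above is not atomic, so there is a small window where both could get through. A minimal sketch of an atomic alternative is below; ReentrancyGuard is a hypothetical helper, not part of the application.

```csharp
using System.Threading;

public sealed class ReentrancyGuard
{
    private int _busy; // 0 = idle, 1 = work in progress

    // Atomically flips the flag from 0 to 1. Only the one caller that
    // performs the flip gets true; everyone else gets false.
    public bool TryEnter() => Interlocked.CompareExchange(ref _busy, 1, 0) == 0;

    // Marks the work as finished so the next callback can enter.
    public void Exit() => Volatile.Write(ref _busy, 0);
}
```

In the timer callback this would look like "if (!guard.TryEnter()) return;" at the top, with guard.Exit() in the finally block, which keeps the same shape as the existing code.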

Ultralytics Pose Model input image

The Unv ADZK-10 camera used in this sample has a Hypertext Transfer Protocol (HTTP) Uniform Resource Locator (URL) for downloading the current image. Like the YoloV8.Detect.SecurityCamera.Stream sample, the image is “streamed” using HttpClient.GetStreamAsync to the YoloV8 PoseAsync method.

Azure.EventGrid.Image.YoloV8.Pose application console output

The same approach as the YoloV8.Detect.SecurityCamera.Stream sample is used because the image doesn’t have to be saved on the local filesystem.

Ultralytics Pose Model marked-up image

To check the results, I put a breakpoint in the timer callback just after the PoseAsync method is called and then used the Visual Studio 2022 Debugger QuickWatch functionality to inspect the contents of the PoseResult object.

Visual Studio 2022 Debugger PoseResult Quickwatch

For testing I configured a single Azure Event Grid custom topic subscription with an Azure Storage Queue as the destination.

Azure Event Grid Topic Metrics

An Azure Storage Queue is an easy way to store messages while debugging/testing an application.

Azure Storage Explorer messages list

Azure Storage Explorer is a good tool for listing recent messages, then inspecting their payloads.

Azure Storage Explorer Message Details

The Azure Event Grid custom topic message text (in data_base64) contains the JavaScript Object Notation (JSON) of the pose detection result.

{"Boxes":[{"Keypoints":[{"Index":0,"Point":{"X":744,"Y":58,"IsEmpty":false},"Confidence":0.6334442},{"Index":1,"Point":{"X":746,"Y":33,"IsEmpty":false},"Confidence":0.759928},{"Index":2,"Point":{"X":739,"Y":46,"IsEmpty":false},"Confidence":0.19036674},{"Index":3,"Point":{"X":784,"Y":8,"IsEmpty":false},"Confidence":0.8745915},{"Index":4,"Point":{"X":766,"Y":45,"IsEmpty":false},"Confidence":0.086735755},{"Index":5,"Point":{"X":852,"Y":50,"IsEmpty":false},"Confidence":0.9166329},{"Index":6,"Point":{"X":837,"Y":121,"IsEmpty":false},"Confidence":0.85815763},{"Index":7,"Point":{"X":888,"Y":31,"IsEmpty":false},"Confidence":0.6234426},{"Index":8,"Point":{"X":871,"Y":205,"IsEmpty":false},"Confidence":0.37670398},{"Index":9,"Point":{"X":799,"Y":21,"IsEmpty":false},"Confidence":0.3686208},{"Index":10,"Point":{"X":768,"Y":205,"IsEmpty":false},"Confidence":0.21734264},{"Index":11,"Point":{"X":912,"Y":364,"IsEmpty":false},"Confidence":0.98523325},{"Index":12,"Point":{"X":896,"Y":382,"IsEmpty":false},"Confidence":0.98377174},{"Index":13,"Point":{"X":888,"Y":637,"IsEmpty":false},"Confidence":0.985927},{"Index":14,"Point":{"X":849,"Y":645,"IsEmpty":false},"Confidence":0.9834709},{"Index":15,"Point":{"X":951,"Y":909,"IsEmpty":false},"Confidence":0.96191007},{"Index":16,"Point":{"X":921,"Y":894,"IsEmpty":false},"Confidence":0.9618156}],"Class":{"Id":0,"Name":"person"},"Bounds":{"X":690,"Y":3,"Width":315,"Height":1001,"Location":{"X":690,"Y":3,"IsEmpty":false},"Size":{"Width":315,"Height":1001,"IsEmpty":false},"IsEmpty":false,"Top":3,"Right":1005,"Bottom":1004,"Left":690},"Confidence":0.8341071}]}
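
As a rough sketch of how the queued message could be checked in code rather than by eye, the helper below decodes a base64 string and counts the keypoints in each box using System.Text.Json. It assumes the data_base64 value is simply the base64 encoding of JSON shaped like the payload above; PosePayloadSketch is a hypothetical name.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class PosePayloadSketch
{
    // Decodes the base64 text, parses the JSON and returns the number
    // of keypoints found in each entry of the "Boxes" array.
    public static int[] KeypointCounts(string dataBase64)
    {
        string json = Encoding.UTF8.GetString(Convert.FromBase64String(dataBase64));

        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement boxes = document.RootElement.GetProperty("Boxes");

        int[] counts = new int[boxes.GetArrayLength()];
        int index = 0;
        foreach (JsonElement box in boxes.EnumerateArray())
        {
            counts[index++] = box.GetProperty("Keypoints").GetArrayLength();
        }
        return counts;
    }
}
```

The message above contains a single “person” box with 17 keypoints (Index 0 to 16).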

YoloV8-Training a model with Ultralytics Hub

After uploading the roboflow Tennis Ball dataset from my previous post to an Ultralytics Hub dataset, I used my Ultralytics Pro plan to train a proof of concept (PoC) YoloV8 model.

Creating a new Ultralytics project
Selecting the training type and the dataset to upload
Checking the Tennis Ball dataset upload
Confirming the number of classes and splits of the training dataset
Selecting the output model architecture (YoloV8s).
Configuring the number of epochs and payment method
Preparing the cloud instance(s) for training
The midpoint of the training process
The training process completed with some basic model metrics.
The resources used and model accuracy metrics.
Model training metrics.
Testing the trained model inference results with my test image.
Exporting the trained YoloV8 model in ONNX format.
The duration and cost of training the model.
Testing the YoloV8 model with the dme-compunet.Image console application
Marked-up image generated by the dme-compunet.Image console application.

In this post I have not covered YoloV8 model selection and tuning of the training configuration to optimise the “performance” of the model. I used the default settings and then ran the model training overnight, which cost USD6.77.

This post is not about how to create a “good” model; it is about the approach I took to create a “proof of concept” model for a demonstration.

YoloV8-Selecting a roboflow dataset

To comply with the Ultralytics AGPL-3.0 License and to use an Ultralytics Pro plan the source code and models for an application have to be open source. Rather than publishing my YoloV8 model (which is quite large), this is the first in a series of posts which detail the process I used to create it (which I think is more useful).

The single test image (not a good idea) is a photograph of 30 tennis balls on my living room floor.

Test image of 30 tennis balls on my living room floor

I started with the “default” yolov8s.onnx model which is included in the YoloV8.Demo application of the YoloV8 NuGet package GitHub repository.

YoloV8s.Onnx Tennis ball object detection results

The object detection results using the “default” model were pretty bad, but this wasn’t a surprise as the model is not optimised for this sort of problem.

Roboflow has a suite of tools for annotating, automatic labelling, training and deployment of models as well as a roboflow universe which (according to their website) is “The largest resource of computer vision datasets and pre-trained models”.

roboflow universe open-source model dataset search

I have used datasets from roboflow universe which is a great resource for building “proof of concept” applications.

roboflow universe dataset search

The first step was to identify some datasets which would improve my tennis ball object detection model results. After some searching (with tennis, tennis-ball etc. classes) and filtering (object detection, has a model for faster evaluation, more than 5000 images) to reduce the search results to a manageable number, I identified 5 datasets worth further evaluation.

In my scenario the performance of the Acebot by Mrunal model was worse than the “default” yolov8s model.

In my scenario the performance of the tennis racket by test model was similar to the “default” yolov8s model.

In my scenario the performance of the Tennis Ball by Hust model was a bit better than the “default” yolov8s model.

In my scenario the performance of the roboflow_oball by ahmedelshalkany model was pretty good; it detected 28 of the 30 tennis balls.

In my scenario the performance of the Tennis Ball by Ugur Ozdemir model was good; it detected all 30 tennis balls.

I then exported the Tennis Ball by Ugur Ozdemir dataset in a YoloV8 compatible format so I could use it on the Ultralytics Hub service with my Ultralytics Pro plan to train a model.

This post is not about how to create a “good” dataset; it is about the approach I took to create a “proof of concept” dataset for a demonstration.

Azure Event Grid YoloV8- Basic MQTT Client Object Detection

The Azure.EventGrid.Image.Detect application downloads images from a security camera, processes them with the default YoloV8 (by Ultralytics) object detection model, then publishes the results to an Azure Event Grid MQTT broker topic.

The Unv ADZK-10 camera used in this sample has a Hypertext Transfer Protocol (HTTP) Uniform Resource Locator (URL) for downloading the current image. Like the YoloV8.Detect.SecurityCamera.Stream sample, the image is “streamed” using HttpClient.GetStreamAsync to the YoloV8 DetectAsync method.

private async void ImageUpdateTimerCallback(object? state)
{
   DateTime requestAtUtc = DateTime.UtcNow;

   // Just in case - stop code being called while photo or prediction already in progress
   if (_ImageProcessing)
   {
      return;
   }
   _ImageProcessing = true;

   try
   {
      _logger.LogDebug("Camera request start");

      DetectionResult result;

      using (Stream cameraStream = await _httpClient.GetStreamAsync(_applicationSettings.CameraUrl))
      {
         result = await _predictor.DetectAsync(cameraStream);
      }

      _logger.LogInformation("Speed Preprocess:{Preprocess} Postprocess:{Postprocess}", result.Speed.Preprocess, result.Speed.Postprocess);

      if (_logger.IsEnabled(LogLevel.Debug))
      {
         _logger.LogDebug("Detection results");

         foreach (var box in result.Boxes)
         {
            _logger.LogDebug(" Class {box.Class} {Confidence:f1}% X:{box.Bounds.X} Y:{box.Bounds.Y} Width:{box.Bounds.Width} Height:{box.Bounds.Height}", box.Class, box.Confidence * 100.0, box.Bounds.X, box.Bounds.Y, box.Bounds.Width, box.Bounds.Height);
         }
      }

      var message = new MQTT5PublishMessage
      {
         Topic = string.Format(_applicationSettings.PublishTopic, _applicationSettings.UserName),
         Payload = Encoding.ASCII.GetBytes(JsonSerializer.Serialize(new
         {
            result.Boxes,
         })),
         QoS = _applicationSettings.PublishQualityOfService,
      };

      _logger.LogDebug("HiveMQ.Publish start");

      var resultPublish = await _mqttclient.PublishAsync(message);

      _logger.LogDebug("HiveMQ.Publish done");
   }
   catch (Exception ex)
   {
      _logger.LogError(ex, "Camera image download, processing, or telemetry failed");
   }
   finally
   {
      _ImageProcessing = false;
   }

   TimeSpan duration = DateTime.UtcNow - requestAtUtc;

   _logger.LogDebug("Camera Image download, processing and telemetry done {TotalSeconds:f2} sec", duration.TotalSeconds);
}

The application uses a Timer (with configurable Due and Period times) to poll the security camera, detect objects in the image, then publish a JavaScript Object Notation (JSON) representation of the results to an Azure Event Grid MQTT broker topic using a HiveMQ client.
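
The payload is built by wrapping the boxes in an anonymous type, so the property name in the JSON comes from the member name. The sketch below shows the same idea with JsonSerializer.SerializeToUtf8Bytes, which produces the payload bytes directly without an intermediate string. DetectedBox is a hypothetical stand-in for the YoloV8 box type, used only so the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical, simplified stand-in for a detection result box.
public record DetectedBox(string Name, float Confidence);

public static class PayloadSketch
{
    // Wraps the boxes in an anonymous type so the JSON root object has a
    // single "Boxes" property, then serialises straight to UTF-8 bytes.
    public static byte[] Build(IReadOnlyList<DetectedBox> boxes) =>
        JsonSerializer.SerializeToUtf8Bytes(new { Boxes = boxes });
}
```

For a single box with the name “person” and a confidence of 0.5, this produces {"Boxes":[{"Name":"person","Confidence":0.5}]}.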

Console application displaying object detection results

The application uses the Microsoft.Extensions.Logging library to publish diagnostic information to the console while debugging.

Visual Studio 2022 QuickWatch displaying object detection results.

To check the results I put a breakpoint in the timer callback just after the DetectAsync method is called and then used the Visual Studio 2022 Debugger QuickWatch functionality to inspect the contents of the DetectionResult object.

Visual Studio 2022 JSON Visualiser displaying object detection results.

To check the JSON payload of the MQTT message I put a breakpoint just before the HiveMQ PublishAsync method. I then inspected the payload using the Visual Studio 2022 JSON Visualizer.

Security Camera image for object detection photo bombed by Yarnold our Standard Apricot Poodle.

This application can also be deployed as a Linux systemd Service so it will start then run in the background. The same approach as the YoloV8.Detect.SecurityCamera.Stream sample is used because the image doesn’t have to be saved on the local filesystem.