RTSP Camera rosenbjerg.FFMpegCore GDI Error

While working on my SecurityCameraRTSPClientFFMpegCore project I noticed that every so often after opening the Realtime Streaming Protocol(RTSP) connection with my HiLook IPCT250H Security Camera there was a “Paremeter is not valid” or “A generic error occurred in GDI+.” exception and sometimes the image was corrupted.

My test harness code was “inspired” by the Continuous Snapshots on Live Stream #280 sample

sing (var ms = new MemoryStream())
{
    await FFMpegArguments
        .FromUrlInput(new Uri("udp://192.168.2.12:9000"))
        .OutputToPipe(new StreamPipeSink(ms), options => options
            .ForceFormat("rawvideo")
            .WithVideoCodec(VideoCodec.Png)
            .Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
            .WithCustomArgument("-vf fps=1 -update 1")
        )
        .NotifyOnProgress(o => 
        {
            try
            {
                if (ms.Length > 0)
                {
                    ms.Position = 0;
                    using (var bitmap = new Bitmap(ms))
                    {
                        // Modify bitmap here

                        // Save the bitmap
                        bitmap.Save("test.png");
                    }

                    ms.SetLength(0);
                }
            }
            catch { }
        })
        .ProcessAsynchronously();
}

My implementation is slightly different because I caught then displayed any exceptions generated converting the image stream to a bitmap or saving it.

using (var ms = new MemoryStream())
{
   await FFMpegArguments
         .FromUrlInput(new Uri(_applicationSettings.CameraUrl))
         .OutputToPipe(new StreamPipeSink(ms), options => options
         .ForceFormat("mpeg1video")
         //.ForceFormat("rawvideo")
         .WithCustomArgument("-rtsp_transport tcp")
         .WithFramerate(10)
         .WithVideoCodec(VideoCodec.Png)
         //.Resize(1024, 1024)
         //.ForceFormat("image2pipe")
         //.Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
         //.Resize(new Size(Config.JpgWidthLarge, Config.JpgHeightLarge))
         //.WithCustomArgument("-vf fps=1 -update 1")
         //.WithCustomArgument("-vf fps=5 -update 1")
         //.WithSpeedPreset( Speed.)
         //.UsingMultithreading()
         //.UsingThreads()
         //.WithVideoFilters(filter => filter.Scale(640, 480))
         //.UsingShortest()
         //.WithFastStart()
         )
         .NotifyOnProgress(o =>
         {
            try
            {
               if (ms.Length > 0)
               {
                  ms.Position = 0;

                  string outputPath = Path.Combine(_applicationSettings.SavePath, string.Format(_applicationSettings.FrameFileNameFormat, DateTime.UtcNow ));

                  using (var bitmap = new Bitmap(ms))
                  {
                     // Save the bitmap
                     bitmap.Save(outputPath);
                  }

                  ms.SetLength(0);
               }
            }
            catch (Exception ex)
            {
               Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} {ex.Message}");
            }
         })
         .ProcessAsynchronously();
}

I have created a Continuous Snapshots on Live Stream Memory stream contains invalid bitmap image #562 to track the issue.

One odd thing that I noticed when scrolling “back and forth” through the images around when there was exception was that the date and time on the top left of the image was broken.

I wonder if the image was “broken” in some subtle way and FFMpegCore is handling this differently to the other libraries I’m trialing.

RTSP Camera RabbitOM.Streaming

The RTSPCameraNagerVideoStream library had significant latency which wasn’t good as I wanted to trigger the processing of images from the Real-time Streaming Protocol(RTSP) on my Seeedstudio J3011 Industrial device by strobing one of the digital inputs and combine streamed images with timestamped static ones.

HiLook IPCT250H Camera configuration

To get a Moving Picture Experts Group(MPEG) stream I had to change the camera channel rather than use than H.264+ video Encoding

RtspCameraUrl”: “rtsp://10.0.0.19/ISAPI/Streaming/channels/102”

The KSAH-42.RabbitOM library looked worth testing so I built a test harness inspired by RabbitOM.Streaming.Tests.ConsoleApp.

client.PacketReceived += (sender, e) =>
{
   var interleavedPacket = e.Packet as RtspInterleavedPacket;

   if (interleavedPacket != null && interleavedPacket.Channel > 0)
   {
      // In most of case, avoid this packet
      Console.ForegroundColor = ConsoleColor.DarkCyan;
      Console.WriteLine("Skipping some data : size {0}", e.Packet.Data.Length);
      return;
   }

   Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} New image received, bytes:{e.Packet.Data.Length}");

   File.WriteAllBytes(Path.Combine(_applicationSettings.SavePath, string.Format(_applicationSettings.FrameFileNameFormat, DateTime.UtcNow)), e.Packet.Data);
};

When I ran my test harness the number of images didn’t match the frame rate configured in the camera

The format of the images was corrupted, and I couldn’t open them

It looked like I was writing RTSP packets to the disk rather than Joint Photographic Experts Group(JPEG) images from the MPEG stream.

There was another sample application RabbitOM.Streaming.Tests.Mjpeg which displayed JPEG images. After looking at the code I figured out I need to use the RtpFrameBuilder class to assemble the RTSP packets into frames.

private static readonly RtpFrameBuilder _frameBuilder = new JpegFrameBuilder();
...
_frameBuilder.FrameReceived += OnFrameReceived;
...
client.PacketReceived += (sender, e) =>
{
   var interleavedPacket = e.Packet as RtspInterleavedPacket;

   if (interleavedPacket != null && interleavedPacket.Channel > 0)
   {
      // In most of case, avoid this packet
      Console.ForegroundColor = ConsoleColor.DarkCyan;
      Console.WriteLine("Skipping some data : size {0}", e.Packet.Data.Length);
      return;
   }

   _frameBuilder.Write(interleavedPacket.Data); 
};
private static void OnFrameReceived(object sender, RtpFrameReceivedEventArgs e)
{
   Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} New image received, bytes:{e.Frame.Data.Length}");

   File.WriteAllBytes(Path.Combine(_applicationSettings.SavePath, string.Format(_applicationSettings.FrameFileNameFormat, DateTime.UtcNow)), e.Frame.Data);
}

With the modified code the image size looked roughly the same as the SecurityCameraHttpClient images

The format of the images was good, and I could open them

Looks like KSAH-42.RabbitOM might be a good choice as it doesn’t have any external dependencies and the latency is minimal.

Building Cloud AI with AI-YoloSharp Object detection Azure HTTP Trigger function

Introduction

Using Copilot to generate the code to retrieve an image from a security camera and small applications for object detection, object classification, with YoloSharp also went surprisingly well. An Azure HTTP Trigger function is a bit more complex so I decided to do one next. I started with the Visual Studio 2022 generated code

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace YoloObjectDetectionFunction
{
    public class Function1
    {
        private readonly ILogger<Function1> _logger;

        public Function1(ILogger<Function1> logger)
        {
            _logger = logger;
        }

        [Function("Function1")]
        public IActionResult Run([HttpTrigger(AuthorizationLevel.Function, "get", "post")] HttpRequest req)
        {
            _logger.LogInformation("C# HTTP trigger function processed a request.");
            return new OkObjectResult("Welcome to Azure Functions!");
        }
    }
}

I then generated the first version of the code with Copilot desktop

The code looked “okay” but had lots of extra “usings”, used FunctionName vs Function attributes etc. so didn’t compile.

I think the “var yolo = new Yolo(modelPath)” comes from YoloDotNet by Niklas Swärd and “var detections = yolo.Predict(image)” is from YoloV8.Net by Samuel Stainback

using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
using YoloSharp;
using YoloSharp.DataStructures;

public static class YoloObjectDetectionFunction
{
    private static readonly string modelPath = "path/to/your/yolo-model.onnx";

    [FunctionName("YoloObjectDetection")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequestMessage req,
        ILogger log)
    {
        log.LogInformation("Yolo Object Detection function processed a request.");

        // Read the image from the request
        byte[] imageBytes = await req.Content.ReadAsByteArrayAsync();
        using (Image<Rgba32> image = Image.Load<Rgba32>(imageBytes))
        {
            // Initialize YoloSharp
            using (var yolo = new Yolo(modelPath))
            {
                // Detect objects in the image
                var detections = yolo.Predict(image);

                // Process the detections
                foreach (var detection in detections)
                {
                    // Draw bounding boxes on the image
                    image.Mutate(ctx => ctx.DrawPolygon(Rgba32.Red, 2, detection.BoundingBox.ToPolygon()));
                }

                // Save the modified image to a memory stream
                using (var ms = new MemoryStream())
                {
                    image.SaveAsPng(ms);
                    ms.Seek(0, SeekOrigin.Begin);

                    // Return the modified image as a response
                    var result = new HttpResponseMessage(HttpStatusCode.OK)
                    {
                        Content = new StreamContent(ms)
                    };
                    result.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("image/png");
                    return new FileStreamResult(ms, "image/png");
                }
            }
        }
    }
}

I manually fixed all the issues in the code with a lot of assistance from Visual Studio 2022 Intellisense

using System.Net;

using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
using SixLabors.ImageSharp.Drawing.Processing;

//using YoloSharp;
//using YoloSharp.DataStructures;
using Compunet.YoloSharp;


public static class YoloObjectDetectionFunction
{
   //private static readonly string modelPath = "path/to/your/yolo-model.onnx";
   private static readonly string modelPath = "yolov8s.onnx";

   //[FunctionName("YoloObjectDetection")]
   [Function("YoloObjectDetection")]
   public static async Task<IActionResult> Run(
       [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequestMessage req,
       ILogger log)
   {
      log.LogInformation("Yolo Object Detection function processed a request.");

      // Read the image from the request
      byte[] imageBytes = await req.Content.ReadAsByteArrayAsync();
      using (Image<Rgba32> image = Image.Load<Rgba32>(imageBytes))
      {
         // Initialize YoloSharp
         //using (var yolo = new Yolo(modelPath))
         using (var yolo = new YoloPredictor(modelPath))
         {
            // Detect objects in the image
            //var detections = yolo.Predict(image);
            var detections = yolo.Detect(image);

            // Process the detections
            foreach (var detection in detections)
            {
               // Draw bounding boxes on the image
               //image.Mutate(ctx => ctx.DrawPolygon(Rgba32.Red, 2, detection.BoundingBox.ToPolygon()));

               var rectangle = new PointF[] {new Point(detection.Bounds.Bottom, detection.Bounds.Left), new Point(detection.Bounds.Bottom, detection.Bounds.Right), new Point(detection.Bounds.Right, detection.Bounds.Top), new Point(detection.Bounds.Left, detection.Bounds.Top)};

               image.Mutate(ctx => ctx.DrawPolygon(Rgba32.ParseHex("FF0000"), 2, rectangle));
            }

            // Save the modified image to a memory stream
            using (var ms = new MemoryStream())
            {
               image.SaveAsPng(ms);
               ms.Seek(0, SeekOrigin.Begin);

               // Return the modified image as a response
               var result = new HttpResponseMessage(HttpStatusCode.OK)
               {
                  Content = new StreamContent(ms)
               };
               result.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("image/png");
               return new FileStreamResult(ms, "image/png");
            }
         }
      }
   }
}

When I posted an image to the function with Telerik Fiddler Classic it threw an exception, because the ILogger injected into the Run method was null.

I then asked Github Copilot to “Modify code so the ILogger is parameter of constructor”

using System.Net;

using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
using SixLabors.ImageSharp.Drawing.Processing;

//using YoloSharp;
//using YoloSharp.DataStructures;
using Compunet.YoloSharp;


//public static class YoloObjectDetectionFunction
public class YoloObjectDetectionFunction
{
   private static readonly ILogger _log;
   //private static readonly string modelPath = "path/to/your/yolo-model.onnx";
   private static readonly string modelPath = "yolov8s.onnx";

   public YoloObjectDetectionFunction(ILogger<YoloObjectDetectionFunction> log)
   {
      _log = log;
   }

   //[FunctionName("YoloObjectDetection")]
   [Function("YoloObjectDetection")]
   //public static async Task<IActionResult> Run( [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequestMessage req, ILogger log)
   public static async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequestMessage req)
   {
      _log.LogInformation("Yolo Object Detection function processed a request.");

      // Read the image from the request
      byte[] imageBytes = await req.Content.ReadAsByteArrayAsync();
      using (Image<Rgba32> image = Image.Load<Rgba32>(imageBytes))
      {
         // Initialize YoloSharp
         //using (var yolo = new Yolo(modelPath))
         using (var yolo = new YoloPredictor(modelPath))
         {
            // Detect objects in the image
            //var detections = yolo.Predict(image);
            var detections = yolo.Detect(image);

            // Process the detections
            foreach (var detection in detections)
            {
               // Draw bounding boxes on the image
               //image.Mutate(ctx => ctx.DrawPolygon(Rgba32.Red, 2, detection.BoundingBox.ToPolygon()));

               var rectangle = new PointF[] {new Point(detection.Bounds.Bottom, detection.Bounds.Left), new Point(detection.Bounds.Bottom, detection.Bounds.Right), new Point(detection.Bounds.Right, detection.Bounds.Top), new Point(detection.Bounds.Left, detection.Bounds.Top)};

               image.Mutate(ctx => ctx.DrawPolygon(Rgba32.ParseHex("FF0000"), 2, rectangle));
            }

            // Save the modified image to a memory stream
            using (var ms = new MemoryStream())
            {
               image.SaveAsPng(ms);
               ms.Seek(0, SeekOrigin.Begin);

               // Return the modified image as a response
               var result = new HttpResponseMessage(HttpStatusCode.OK)
               {
                  Content = new StreamContent(ms)
               };
               result.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("image/png");
               return new FileStreamResult(ms, "image/png");
            }
         }
      }
   }
}

When I posted an image to the function it threw an exception, because content of the HttpRequestMessage was null.

I then asked Github Copilot to “Modify the code so that the image is read from the form”

// Read the image from the form
var form = await req.ReadFormAsync();
var file = form.Files["image"];
if (file == null || file.Length == 0)
{
   return new BadRequestObjectResult("Image file is missing or empty.");
}

When I posted an image to the function it returned a 400 Bad Request Error.

After inspecting the request I realized that the name field was wrong, as the generated code was looking for “image”

Content-Disposition: form-data; name=”image”; filename=”sports.jpg”

Then, when I posted an image to the function it returned a 500 error.

But, the FileStreamResult was failing so I modified the code to return a FileContentResult

using (var ms = new MemoryStream())
{
   image.SaveAsJpeg(ms);

   return new FileContentResult(ms.ToArray(), "image/jpg");
}

Then, when I posted an image to the function it succeeded

But, the bounding boxes around the detected objects were wrong.

I then manually fixed up the polygon code so the lines for each bounding box were drawn in the correct order.

// Process the detections
foreach (var detection in detections)
{
   var rectangle = new PointF[] {
      new Point(detection.Bounds.Left, detection.Bounds.Bottom),
      new Point(detection.Bounds.Right, detection.Bounds.Bottom),
      new Point(detection.Bounds.Right, detection.Bounds.Top),
      new Point(detection.Bounds.Left, detection.Bounds.Top)
 };

Then, when I posted an image to the function it succeeded

The bounding boxes around the detected objects were correct.

I then “refactored” the code, removing all the unused “using”s, removed any commented out code, changed ILogger to be initialised using a Primary Constructor etc.

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
using SixLabors.ImageSharp.Drawing.Processing;

using Compunet.YoloSharp;

public class YoloObjectDetectionFunction(ILogger<YoloObjectDetectionFunction> log)
{
   private readonly ILogger<YoloObjectDetectionFunction> _log = log;
   private readonly string modelPath = "yolov8s.onnx";

   [Function("YoloObjectDetection")]
   public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
   {
      _log.LogInformation("Yolo Object Detection function processed a request.");

      // Read the image from the form
      var form = await req.ReadFormAsync();
      var file = form.Files["image"];
      if (file == null || file.Length == 0)
      {
         return new BadRequestObjectResult("Image file is missing or empty.");
      }

      using (var stream = file.OpenReadStream())
      using (Image<Rgba32> image = Image.Load<Rgba32>(stream))
      {
         // Initialize YoloSharp
         using (var yolo = new YoloPredictor(modelPath))
         {
            // Detect objects in the image
            var detections = yolo.Detect(image);

            // Process the detections
            foreach (var detection in detections)
            {
               var rectangle = new PointF[] {
                  new Point(detection.Bounds.Left, detection.Bounds.Bottom),
                  new Point(detection.Bounds.Right, detection.Bounds.Bottom),
                  new Point(detection.Bounds.Right, detection.Bounds.Top),
                  new Point(detection.Bounds.Left, detection.Bounds.Top)
               };

               image.Mutate(ctx => ctx.DrawPolygon(Rgba32.ParseHex("FF0000"), 2, rectangle));
            }

            // Save the modified image to a memory stream
            using (var ms = new MemoryStream())
            {
               image.SaveAsJpeg(ms);

              return new FileContentResult(ms.ToArray(), "image/jpg");
            }
         }
      }
   }
}

Summary

The initial code generated by Copilot was badly broken but with the assistance of Visual Studio 2022 Intellisense was fixed fairly quickly. The ILogger not being initialised, me using the “wrong” upload file name were easy to debug, but the FileContentResult exception was a bit more difficult.

It took me a quite a bit longer to write the function with Copilot desktop/Github Copilot than what it would have taken me normally. But, I think a lot of this was due to having to take screen shots, writing this blog post as I went, and having already written several Azure HTTP Trigger function for processing uploaded images.

The Copilot generated code in this post is not suitable for production

IoT and AI Building Edge AI with AI- YoloSharp Client

Introduction

Using Copilot to generate the code to retrieve an image from a security camera went surprisingly well so next step was something a bit more complex. For this post I have used Copilot prompts to generate code which uses Ultralytics YoloV8 and Compunet YoloSharp for object detection, object classification, pose estimation, and instance segmentation

using System;
using System.Drawing;
using YOLOSharp;

class Program
{
    static void Main(string[] args)
    {
        string modelPath = "path/to/yolov8.onnx"; // Replace with the path to your YOLOv8 model
        string imagePath = "path/to/image.jpg"; // Replace with the path to your image

        // Load the YOLOv8 model
        var yolo = new YOLO(modelPath);

        // Load the image from disk
        using (var image = Image.FromFile(imagePath))
        {
            // Run the YOLOv8 model on the image
            var results = yolo.Predict(image);

            // Display the results
            foreach (var result in results)
            {
                Console.WriteLine($"Label: {result.Label}, Confidence: {result.Confidence}, BoundingBox: {result.BoundingBox}");
            }
        }
    }
}
In this example:

Replace path/to/yolov8.onnx with the actual path to your YOLOv8 model file.

Replace path/to/image.jpg with the actual path to your image file.

This application loads the YOLOv8 model using YOLOSharp, loads an image from disk, runs the model on the image, and displays the results, including the label, confidence, and bounding box for each detected object.

Happy coding! 🚀

Copilot had “assumed” I meant object detection(reasonable) with YoloSharp and the code didn’t compile.

Object Detection

After some minor fixes and shifting the configuration to an appsettings file the application compiled.

using Compunet.YoloSharp;
using Microsoft.Extensions.Configuration;
using SixLabors.ImageSharp;

class Program
{
   static void Main(string[] args)
   {
      Model.ApplicationSettings applicationSettings;

      Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss} 0.3.CameraHTTPClient starting");
#if RELEASE
       Console.WriteLine("RELEASE");
#else
      Console.WriteLine("DEBUG");
#endif
      // load the app settings into configuration
      var configuration = new ConfigurationBuilder()
           .AddJsonFile("appsettings.json", false, true)
      .Build();

      applicationSettings = configuration.GetSection("ApplicationSettings").Get<Model.ApplicationSettings>();

      // Load the YOLOv8 model
      var yolo = new YoloPredictor(applicationSettings.ModelPath);

      // Load the image from disk
      using (var image = Image.Load(applicationSettings.ImagePath))
      {
         // Run the YOLOv8 model on the image
         var results = yolo.Detect(image);

         // Display the results
         foreach (var result in results)
         {
            Console.WriteLine($"Label: {result.Name}, Confidence: {result.Confidence}, BoundingBox: {result.Bounds}");
         }
      }
   }
}

I tested the implementation with sample “sports” image from the YoloSharp Github repository

The console application output looked reasonable

I manually compared the output of the console application with the Ultralytics Hub model preview tab and the results looked close enough.

Pose

My Copilot prompt

please write a C# console application that uses Compunet YoloSharp to run a yolov8 pose model on an image loaded from disk
using Compunet.YoloSharp;
using Microsoft.Extensions.Configuration;
using SixLabors.ImageSharp;

class Program
{
   static void Main(string[] args)
   {
      Model.ApplicationSettings applicationSettings;

      Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss} 0.3.CameraHTTPClient starting");
#if RELEASE
       Console.WriteLine("RELEASE");
#else
      Console.WriteLine("DEBUG");
#endif
      // load the app settings into configuration
      var configuration = new ConfigurationBuilder()
           .AddJsonFile("appsettings.json", false, true)
      .Build();

      applicationSettings = configuration.GetSection("ApplicationSettings").Get<Model.ApplicationSettings>();

      // Load the YOLOv8 pose model
      var yolo = new YoloPredictor(applicationSettings.ModelPath);

      // Load the image from disk
      using (var image = Image.Load(applicationSettings.ImagePath))
      {
         // Run the YOLOv8 pose model on the image
         var results = yolo.Pose(image);

         // Display the results
         foreach (var result in results)
         {
            Console.WriteLine($"Label: {result.Name.Name}, Confidence: {result.Confidence}, BoundingBox: {result.Bounds}");
            Console.WriteLine("Keypoints:");
            foreach (var keypoint in result)
            {
               Console.WriteLine($"  - {keypoint.Point}");
            }
         }
      }
   }
}

After some minor fixes and shifting the configuration to an appsettings file the application compiled. I tested the implementation with sample “sports” image from the YoloSharp Github repository

The console application output looked reasonable

I manually compared the output of the console application with the Ultralytics Hub model preview tab and the results were reasonable

Classification

My Copilot prompt

please write a C# console application that uses Compunet YoloSharp to run a yolov8 pose model on an image loaded from disk
using Compunet.YoloSharp;
using Microsoft.Extensions.Configuration;
using SixLabors.ImageSharp;

class Program
{
   static void Main(string[] args)
   {
      Model.ApplicationSettings applicationSettings;

      Console.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss} 0.3.CameraHTTPClient starting");
#if RELEASE
       Console.WriteLine("RELEASE");
#else
      Console.WriteLine("DEBUG");
#endif

      // load the app settings into configuration
      var configuration = new ConfigurationBuilder()
           .AddJsonFile("appsettings.json", false, true)
      .Build();

      applicationSettings = configuration.GetSection("ApplicationSettings").Get<Model.ApplicationSettings>();

      // Load the YOLOv8 classification model
      var yolo = new YoloPredictor(applicationSettings.ModelPath);

      // Load the image from disk
      using (var image = Image.Load(applicationSettings.ImagePath))
      {
         // Run the YOLOv8 classification model on the image
         var results = yolo.Classify(image);

         // Display the results
         foreach (var result in results)
         {
             Console.WriteLine($"Label: {result.Name.Name}, Confidence: {result.Confidence}");
         }
      }
   }
}

After some minor fixes and shifting the configuration to an appsettings file the application compiled. I tested the implementation with sample “toaster” image from the YoloSharp Github repository

The console application output looked reasonable

I’m pretty confident the input image was a toaster.

Summary

The Copilot prompts to generate code which uses Ultralytics YoloV8 and Compunet YoloSharp and may have produced better code with some “prompt engineering”. Using Visual Studio intellisense the generated code was easy to fix.

The Copilot generated code in this post is not suitable for production

RTSP Camera Nager.VideoStream Startup Latency

While working on my RTSPCameraNagerVideoStream project I noticed that after opening the Realtime Streaming Protocol(RTSP) connection with my HiLook IPCT250H Security Camera it took a while for the application to start writing image files.

HiLook IPCT250H Camera configuration

My test harness code was “inspired” by the Nager.VideoStream.TestConsole application with a slightly different file format for the start-stop marker text and camera images files.

private static async Task StartStreamProcessingAsync(InputSource inputSource, CancellationToken cancellationToken = default)
{
   Console.WriteLine("Start Stream Processing");
   try
   {
      var client = new VideoStreamClient();

      client.NewImageReceived += NewImageReceived;
#if FFMPEG_INFO_DISPLAY
      client.FFmpegInfoReceived += FFmpegInfoReceived;
#endif
      File.WriteAllText(Path.Combine(_applicationSettings.ImageFilepathLocal, $"{DateTime.UtcNow:yyyyMMdd-HHmmss.fff}.txt"), "Start");

      await client.StartFrameReaderAsync(inputSource, OutputImageFormat.Png, cancellationToken: cancellationToken);

      File.WriteAllText(Path.Combine(_applicationSettings.ImageFilepathLocal, $"{DateTime.UtcNow:yyyyMMdd-HHmmss.fff}.txt"), "Finish");

      client.NewImageReceived -= NewImageReceived;
#if FFMPEG_INFO_DISPLAY
      client.FFmpegInfoReceived -= FFmpegInfoReceived;
#endif
      Console.WriteLine("End Stream Processing");
   }
   catch (Exception exception)
   {
      Console.WriteLine($"{exception}");
   }
}

private static void NewImageReceived(byte[] imageData)
{
   Debug.WriteLine($"{DateTime.UtcNow:yy-MM-dd HH:mm:ss.fff} NewImageReceived");

   File.WriteAllBytes( Path.Combine(_applicationSettings.ImageFilepathLocal, $"{DateTime.UtcNow:yyyyMMdd-HHmmss.fff}.png"), imageData);
}

I used Path.Combine so no code or configuration changes were required when the application was run on different operating systems (still need to ensure ImageFilepathLocal in the appsettings.json is the correct format).

Developer Desktop

I used my desktop computer a 13th Gen Intel(R) Core(TM) i7-13700 2.10 GHz with 32.0 GB running Windows 11 Pro 24H2.

In the test results below (representative of multiple runs while testing) the delay between starting streaming and the first image file was on average 3.7 seconds with the gap between the images roughly 100mSec.

Files written by NagerVideoStream timestamps roughly 100mSec apart, but 3

Industrial Computer

I used a reComputer J3011 – Edge AI Computer with NVIDIA® Jetson™ Orin™ Nano 8GB running Ubuntu 22.04.5 LTS (Jammy Jellyfish)

In the test results below (representative of multiple runs while testing) the delay between starting streaming and the first image file was on average roughly 3.7 seconds but the time between images varied a lot from 30mSec to >300mSec.

At 10FPS the results for my developer desktop were more consistent, and the reComputer J3011 had significantly more “jitter”. Both could cope with 1oFPS so the next step is to integrate YoloDotNet library to process the video frames.

NickSwardh NuGet Nvidia Jetson Orin Nano™ GPU CUDA Inferencing

My Seeedstudio reComputer J3011 has two processors an ARM64 CPU and an Nividia Jetson Orin 8G coprocessor. YoloDotNet by NickSwardh V2 (uses SkiaSharp) was significantly faster when run on the ARM64 CPU so I wanted to try inferencing with the Nividia Jetson Orin 8G coprocessor.

Performance of YoloDotNet by NickSwardh V2 running on the ARM64 CPU

Performance of YoloDotNet by NickSwardh V2 running on the Nividia Jetson Orin 8G with Compute Unified Device Architecture (CUDA) enabled.

Enabling CUDA reduced the total image scaling, pre-processing, inferencing, and post processing time from 115mSec to 36mSec which is a significant improvement.

YoloV8-NuGet Performance ARM64 CPU

To see how the dme-compunet, updated YoloDotNet and sstainba NuGets performed on an ARM64 CPU I built a test rig for the different NuGets using standard images and ONNX Models.

I started with the dme-compunet YoloV8 NuGet which found all the tennis balls and the results were consistent with earlier tests.

The YoloDotNet by NickSwardh NuGet update had some “breaking changes” so I built “old” and “updated” test harnesses.

The YoloDotNet by NickSwardh V1 and V2 results were slightly different. The V2 NuGet uses SkiaSharp which appears to significantly improve the performance.

Even though the YoloV8 by sstainba NuGet hadn’t been updated I ran the test harness just in case

The dme-compunet YoloV8 and NickSwardh YoloDotNet V1 versions produced the same results, but the NickSwardh YoloDotNet V2 results were slightly different.

  • dme-Compunet 291 mSec
  • NickSwardV1 480 mSec
  • NickSwardV2 115 mSecs
  • SStainba 422 mSec

Like in the YoloV8-NuGet Performance X64 CPU post the NickSwardV2 implementation which uses SkiaSharp was significantly faster so it looks like Sixlabors.ImageSharp is the issue.

To support Compute Unified Device Architecture (CUDA) or TensorRT inferencing with NickSwardV2(for SkiaSharp) will need some major modifications to the code so it might be better to build my own YoloV8 Nuget.

YoloV8-NuGet Performance X64 CPU

When checking the dme-compunet, YoloDotNet, and sstainba and NuGets I noticed YoloDotNet readme.md detailed some performance enhancements…

What’s new in YoloDotNet v2.0?

YoloDotNet 2.0 is a Speed Demon release where the main focus has been on supercharging performance to bring you the fastest and most efficient version yet. With major code optimizations, a switch to SkiaSharp for lightning-fast image processing, and added support for Yolov10 as a little extra 😉 this release is set to redefine your YoloDotNet experience:

Changing the implementation to use SkiaSharp caught my attention because in previous testing manipulating images with the Sixlabors.ImageSharp library took longer than expected.

I built a test rig for comparing the performance of the different NuGets using standard images and ONNX Models.

I started with the dme-compunet YoloV8 NuGet which found all the tennis balls and the results were consistent with earlier tests.

dme-compunet test harness image bounding boxes

The YoloDotNet by NickSwardh NuGet update had some “breaking changes” so I built “old” and “updated” test harnesses. The V1 version found all the tennis balls and the results were consistent with earlier tests.

NickSwardh V1 test harness image bounding boxes

The YoloDotNet by NickSwardh NuGet update had some “breaking changes” so there were some code changes but the V1 and V2 results were slightly different.

NickSwardh V2 test harness image bounding boxes

Even though the YoloV8 by sstainba NuGet hadn’t been updated I ran the test harness just in case and the results were consistent with previous tests.

sstainba test harness image bounding boxes

The dme-compunet YoloV8 and NickSwardh YoloDotNet V1 versions produce the same results, but the NickSwardh YoloDotNet V2 results were slightly different. The YoloV8 by sstainba results were unchanged.

  • dme-Compunet 71 mSec
  • NickSwardV1 76 mSec
  • NickSwardV2 33 mSecs
  • SStainba 82mSec

The NickSwardV2 implementation was significantly faster, but I need to investigate the slight difference in the bounding boxes. It looks like Sixlabors.ImageSharp might be the issue.

YoloV8 ONNX – Nvidia Jetson Orin Nano™ Execution Providers

The Seeedstudio reComputer J3011 has two processors an ARM64 CPU and an Nvidia Jetson Orin 8G which can be used for inferencing with the Open Neural Network Exchange(ONNX)Runtime.

Story of Fail

Inferencing worked first time on the ARM64 CPU because the required runtime is included in the Microsoft.ML.OnnxRuntime NuGet

ARM64 Linux ONNX runtime
Microsoft.ML.OnnxRuntime NuGet ARM64 Linux runtime

Inferencing failed on the Nividia Jetson Orin 8G because the CUDA Execution provider and TensorRT Execution Provider for the ONNXRuntime were not included in the Microsoft.ML.OnnxRuntime.GPU.Linux NuGet.

Missing ARM64 Linux GPU runtime

There were Linux x64 and Windows x64 versions of the ONNXRuntime library included in the Microsoft.ML.OnnxRuntime.Gpu NuGet

Microsoft.ML.OnnxRuntime.Gpu NuGet x64 Linux runtime

Desperately Seeking libonnxruntime.so

The Nvidia ONNX runtime site had pip wheel files for the different versions of Python and the Open Neural Network Exchange(ONNX)Runtime.

The onnxruntime_gpu-1.18.0-cp312-cp312-linux_aarch64.whl matched the version of the ONNXRuntime I needed and version of Python on the device..

When the pip wheel file was renamed onnxruntime_gpu-1.18.0-cp312-cp312-linux_aarch64.zip it could be opened, but there wasn’t a libonnruntime.so.

Onnxruntime_gpu-1.18.0-cp312-cp312-linux_aarch64 file listing

Building the TensorRT & CUDA Execution Providers

The ONNXRuntime build has to be done on Nividia Jetson Orin so after installing all the necessary prerequisites the first attempt failed.

bryn@ubuntu:~/onnxruntime/onnxruntime$ ./build.sh --config Release --update --build --build_wheel \
--use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu \
--tensorrt_home /usr/lib/aarch64-linux-gnu

When in high power mode more cores are used but this consumes more resource when building the ONNXRuntime. To limit resource utilisation --parallel2 was added the command line because the compile process was having “out of memory” failures.

bryn@ubuntu:~/onnxruntime/onnxruntime$ ./build.sh --config Release --update --build --parallel 2 --build_wheel \
--use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu \
--tensorrt_home /usr/lib/aarch64-linux-gnu

There were some compiler warnings but they appear to be benign.

First attempt at running the application failed because libonnxruntime.so was missing so –build_shared_lib was added to the command line

2024-06-10 18:21:58,480 build [INFO] - Build complete
bryn@ubuntu:~/onnxruntime/onnxruntime$ ./build.sh --config Release --update --build --parallel 2 --build_wheel --use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu --tensorrt_home /usr/lib/aarch64-linux-gnu --build_shared_lib

When the build completed the files were copied to the runtime folder of the program.

The application could then be configured to use the TensorRT Execution Provider.

Getting CUDA and TensorRT working on the Nvidia Jetson Orin 8G took much longer than I expected, with many dead ends and device factory resets before the process was repeatable.

YoloV8 ONNX – Nvidia Jetson Orin Nano™ CPU & GPU TensorRT Inferencing

The Seeedstudio reComputer J3011 has two processors an ARM64 CPU and an Nividia Jetson Orin 8G. To speed up TensorRT inferencing I built an Open Neural Network Exchange(ONNX) TensorRT Execution Provider. After updating the code to add a “warm-up” and tracking of average pre-processing, inferencing & post-processing durations I did a series of CPU & GPU performance tests.

The testing consisted of permutations of three models TennisBallsYoloV8s20240618640×640.onnx, TennisBallsYoloV8s2024062410241024.onnx & TennisBallsYoloV8x20240614640×640 (limited testing as slow) and three images TennisBallsLandscape640x640.jpg, TennisBallsLandscape1024x1024.jpg & TennisBallsLandscape3072x4080.jpg.

Executive Summary

As expected, inferencing with a TensorRT 640×640 model and a 640×640 image was fastest, 9mSec pre-processing, 21mSec inferencing, then 4mSec post-processing.

If the image had to be scaled with SixLabors.ImageSharp this significantly increased the preprocessing (and overall) time.

CPU Inferencing

GPU TensorRT Small model Inferencing

GPU TensorRT Large model Inferencing