Random wanderings through Microsoft Azure esp. PaaS plumbing, the IoT bits, AI on Micro controllers, AI on Edge Devices, .NET nanoFramework, .NET Core on *nix and ML.NET+ONNX
public class Function1
{
private readonly ILogger<Function1> _logger;
private readonly List<string> _labels;
private readonly InferenceSession _session;
public Function1(ILogger<Function1> logger)
{
_logger = logger;
_labels = File.ReadAllLines(Path.Combine(AppContext.BaseDirectory, "labels.txt")).ToList();
_session = new InferenceSession(Path.Combine(AppContext.BaseDirectory, "FasterRCNN-10.onnx"));
}
[Function("ObjectDetectionFunction")]
public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req, ExecutionContext context)
{
if (!req.ContentType.StartsWith("image/"))
return new BadRequestObjectResult("Content-Type must be an image.");
using var ms = new MemoryStream();
await req.Body.CopyToAsync(ms);
ms.Position = 0;
using var image = Image.Load<Rgb24>(ms);
var inputTensor = PreprocessImage(image);
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor("image", inputTensor)
};
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = _session.Run(inputs);
var output = results.ToDictionary(x => x.Name, x => x.Value);
var boxes = (DenseTensor<float>)output["6379"];
var labels = (DenseTensor<long>)output["6381"];
var scores = (DenseTensor<float>)output["6383"];
var detections = new List<object>();
for (int i = 0; i < scores.Length; i++)
{
if (scores[i] > 0.5)
{
detections.Add(new
{
label = _labels[(int)labels[i]],
score = scores[i],
box = new
{
x1 = boxes[i, 0],
y1 = boxes[i, 1],
x2 = boxes[i, 2],
y2 = boxes[i, 3]
}
});
}
}
return new OkObjectResult(detections);
}
private static DenseTensor<float> PreprocessImage(Image<Rgb24> image)
{
// Step 1: Resize so that min(H, W) = 800, max(H, W) <= 1333, keeping aspect ratio
int origWidth = image.Width;
int origHeight = image.Height;
int minSize = 800;
int maxSize = 1333;
float scale = Math.Min((float)minSize / Math.Min(origWidth, origHeight),
(float)maxSize / Math.Max(origWidth, origHeight));
int resizedWidth = (int)Math.Round(origWidth * scale);
int resizedHeight = (int)Math.Round(origHeight * scale);
image.Mutate(x => x.Resize(resizedWidth, resizedHeight));
// Step 2: Pad so that both dimensions are divisible by 32
int padWidth = ((resizedWidth + 31) / 32) * 32;
int padHeight = ((resizedHeight + 31) / 32) * 32;
var paddedImage = new Image<Rgb24>(padWidth, padHeight);
paddedImage.Mutate(ctx => ctx.DrawImage(image, new Point(0, 0), 1f));
// Step 3: Convert to BGR and normalize
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
var tensor = new DenseTensor<float>(new[] { 3, padHeight, padWidth });
for (int y = 0; y < padHeight; y++)
{
for (int x = 0; x < padWidth; x++)
{
Rgb24 pixel = default;
if (x < resizedWidth && y < resizedHeight)
pixel = paddedImage[x, y];
tensor[0, y, x] = pixel.B - mean[0];
tensor[1, y, x] = pixel.G - mean[1];
tensor[2, y, x] = pixel.R - mean[2];
}
}
paddedImage.Dispose();
return tensor;
}
}
For my initial testing in the Azure Functions emulator using Fiddler Classic I manually generated 10 requests, then replayed them sequentially, and then finally concurrently.
The results for the manual, then sequential results were fairly consistent but the 10 concurrent requests each to took more than 10x longer. In addition, the CPU was at 100% usage while the concurrently executed functions were running.
The next couple of posts will compare and look at options for improving the “performance” (scalability, execution duration, latency, jitter, billing etc.) of the Github Copilot generated code.
While testing the FasterRCNNObjectDetectionHttpTrigger function with Telerik Fiddler Classic and my “standard” test image I noticed the response bodies were different sizes.
Initially the application plan was an S1 SKU (1 vCPU 1.75G RAM)
I used Netron to inspect the model properties to get the correct names for the output tensors
I had a couple of attempts at resizing the image to see what impact this had on the accuracy of the confidence and minimum bounding rectangles.
resize the image such that both height and width are within the range of [800, 1333], and then pad the image with zeros such that both height and width are divisible by 32.
modify the code to resize the image such that both height and width are within the range of [800, 1333], and then pad the image with zeros such that both height and width are divisible by 32 and the aspect ratio is not changed.
The final version of the image processing code scaled then right padded the image to keep the aspect ratio and MBR coordinates correct.
As a final test I deployed the code to Azure and the first time I ran the function it failed because the labels file couldn’t be found because Unix file paths are case sensitive (labels.txt vs. Labels.txt).
The inferencing time was a bit longer than I expected.
// please write an httpTrigger azure function that uses Faster RCNN and ONNX to detect the object in an image uploaded in the body of an HTTP Post
// manually added the ML.Net ONNX NuGet + using directives
// manually added the ImageSharp NuGet + using directives
// Used Copilot to add Microsoft.ML.OnnxRuntime.Tensors using directive
// Manually added ONNX FIle + labels file sorted out paths
// Used Netron to fixup output tensor names
// Change DenseTensor to BGR (based on https://github.com/onnx/models/tree/main/validated/vision/object_detection_segmentation/faster-rcnn#preprocessing-steps)
// Normalise colour values with mean = [102.9801, 115.9465, 122.7717]
// resize the image such that both height and width are within the range of [800, 1333], and then pad the image with zeros such that both height and width are divisible by 32.
// modify the code to resize the image such that both height and width are within the range of [800, 1333], and then pad the image with zeros such that both height and width are divisible by 32 and the aspect ratio is not changed.
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using SixLabors.ImageSharp; // Couldn't get inteliisense after adding NuGet package
using SixLabors.ImageSharp.PixelFormats; // Couldn't get inteliisense after adding NuGet package
using SixLabors.ImageSharp.Processing; // Couldn't get inteliisense after adding NuGet package
namespace FasterRCNNObjectDetectionHttpTriggerGithubCopilot
{
public class Function1
{
private readonly ILogger<Function1> _logger;
private readonly InferenceSession _session;
private readonly List<string> _labels;
public Function1(ILogger<Function1> logger)
{
_logger = logger;
_session = new InferenceSession("FasterRCNN-10.onnx");
_labels = File.ReadAllLines("labels.txt").ToList();
}
[Function("ObjectDetectionFunction")]
public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req)
{
if (!req.ContentType.StartsWith("image/"))
return new BadRequestObjectResult("Content-Type must be an image.");
using var ms = new MemoryStream();
await req.Body.CopyToAsync(ms);
ms.Position = 0;
using var image = Image.Load<Rgb24>(ms);
var inputTensor = PreprocessImage(image);
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor("image", inputTensor)
};
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = _session.Run(inputs);
var output = results.ToDictionary(x => x.Name, x => x.Value);
var boxes = (DenseTensor<float>)output["6379"];
var labels = (DenseTensor<long>)output["6381"];
var scores = (DenseTensor<float>)output["6383"];
var detections = new List<object>();
for (int i = 0; i < scores.Length; i++)
{
if (scores[i] > 0.5)
{
detections.Add(new
{
label = _labels[(int)labels[i]],
score = scores[i],
box = new
{
x1 = boxes[i, 0],
y1 = boxes[i, 1],
x2 = boxes[i, 2],
y2 = boxes[i, 3]
}
});
}
}
return new OkObjectResult(detections);
}
private static DenseTensor<float> PreprocessImage( Image<Rgb24> image)
{
// Step 1: Resize so that min(H, W) = 800, max(H, W) <= 1333, keeping aspect ratio
int origWidth = image.Width;
int origHeight = image.Height;
int minSize = 800;
int maxSize = 1333;
float scale = Math.Min((float)minSize / Math.Min(origWidth, origHeight),
(float)maxSize / Math.Max(origWidth, origHeight));
/*
float scale = 1.0f;
// If either dimension is less than 800, scale up so the smaller is 800
if (origWidth < minSize || origHeight < minSize)
{
scale = Math.Max((float)minSize / origWidth, (float)minSize / origHeight);
}
// If either dimension is greater than 1333, scale down so the larger is 1333
if (origWidth * scale > maxSize || origHeight * scale > maxSize)
{
scale = Math.Min((float)maxSize / origWidth, (float)maxSize / origHeight);
}
*/
int resizedWidth = (int)Math.Round(origWidth * scale);
int resizedHeight = (int)Math.Round(origHeight * scale);
image.Mutate(x => x.Resize(resizedWidth, resizedHeight));
// Step 2: Pad so that both dimensions are divisible by 32
int padWidth = ((resizedWidth + 31) / 32) * 32;
int padHeight = ((resizedHeight + 31) / 32) * 32;
var paddedImage = new Image<Rgb24>(padWidth, padHeight);
paddedImage.Mutate(ctx => ctx.DrawImage(image, new Point(0, 0), 1f));
// Step 3: Convert to BGR and normalize
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
var tensor = new DenseTensor<float>(new[] { 3, padHeight, padWidth });
for (int y = 0; y < padHeight; y++)
{
for (int x = 0; x < padWidth; x++)
{
Rgb24 pixel = default;
if (x < resizedWidth && y < resizedHeight)
pixel = paddedImage[x, y];
tensor[0, y, x] = pixel.B - mean[0];
tensor[1, y, x] = pixel.G - mean[1];
tensor[2, y, x] = pixel.R - mean[2];
}
}
paddedImage.Dispose();
return tensor;
}
}
}
It took roughly an hour to “vibe code” the function, but it would have taken much longer for someone not familiar with the problem domain.
Summary
The Github Copilot generated code was okay but would be fragile, performance would suck and not scale terribly well.
The Copilot generated code in this post is not suitable for production
After some investigation I think the scaling of the image used for inferencing (based on the requirements on the model), then the scaling of the minimum bounding rectangles isn’t quite right.
To better understand how Copilot and Github Copilot generated code was impacted by the wording and order of prompts I re-played one of my previous sessions.
The generated code looked “reasonable” and vaguely familiar.
static void Main(string[] args)
{
// Paths to the ONNX model and the image file (adjust these paths as needed)
string modelPath = @"..\\..\\..\\..\\Models\\FasterRCNN-10.onnx";
string imagePath = "sports.jpg";
// Create the OnnxInference session
using var session = new InferenceSession(modelPath);
// Load the image from disk using ImageSharp
using var image = Image.Load<Rgb24>(imagePath);
// Resize the image to fit within the range and adjust dimensions to be divisible by 32
ResizeImage(image);
// Extract tensor data from the image (with shape [3, height, width])
var inputTensor = ExtractTensorFromImage(image);
// Create NamedOnnxValue input (ensure that the input name "image" matches your model's input)
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor("image", inputTensor)
};
// Run the model inference
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);
// Process and display the outputs (bounding boxes, labels, and confidences)
ProcessOutput(results);
Console.WriteLine("Press Enter to exit");
Console.ReadLine();
}
/// <summary>
/// Resizes the input image such that both width and height are within the range [800, 1333]
/// and ensures the dimensions are divisible by 32.
/// </summary>
private static void ResizeImage(Image<Rgb24> image)
{
const int minSize = 800;
const int maxSize = 1333;
int originalWidth = image.Width;
int originalHeight = image.Height;
// Determine the scaling factor so that the smallest side is at least minSize and the largest does not exceed maxSize.
float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight),
(float)minSize / Math.Min(originalWidth, originalHeight));
// Compute the new dimensions based on the scale
int newWidth = (int)(originalWidth * scale);
int newHeight = (int)(originalHeight * scale);
// Adjust dimensions to be divisible by 32
newWidth = (newWidth / 32) * 32;
newHeight = (newHeight / 32) * 32;
image.Mutate(x => x.Resize(newWidth, newHeight));
}
/// <summary>
/// Converts the resized image into a DenseTensor<float> with shape [3, height, width].
/// The image is processed to subtract the Faster‑RCNN channel means (B, G, R order).
/// </summary>
private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
int width = image.Width;
int height = image.Height;
// Create a tensor with shape [channels, height, width]
var tensor = new DenseTensor<float>(new[] { 3, height, width });
// Faster‑RCNN channel means (order: blue, green, red)
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
// Process each pixel row; ImageSharp provides efficient pixel row access.
image.ProcessPixelRows(accessor =>
{
for (int y = 0; y < height; y++)
{
var pixelRow = accessor.GetRowSpan(y);
for (int x = 0; x < width; x++)
{
// Subtract the channel mean value (ensuring B, G, R order)
tensor[0, y, x] = pixelRow[x].B - mean[0];
tensor[1, y, x] = pixelRow[x].G - mean[1];
tensor[2, y, x] = pixelRow[x].R - mean[2];
}
}
});
return tensor;
}
/// <summary>
/// Processes the model output, extracting bounding boxes, labels, and confidences.
/// Only detections with confidence scores above a defined threshold are printed.
/// </summary>
private static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> output)
{
// Note: The output node names ("6379", "6381", "6383") are model-specific.
// They correspond, respectively, to boxes, labels, and confidence scores.
var boxesTensor = output.First(x => x.Name == "6379").AsTensor<float>();
var labelsTensor = output.First(x => x.Name == "6381").AsTensor<long>();
var confidencesTensor = output.First(x => x.Name == "6383").AsTensor<float>();
float[] boxes = boxesTensor.ToArray();
long[] labels = labelsTensor.ToArray();
float[] confidences = confidencesTensor.ToArray();
const float minConfidence = 0.7f;
// Each bounding box is represented by 4 values: x1, y1, x2, y2.
for (int i = 0; i < boxes.Length; i += 4)
{
int detectionIndex = i / 4;
if (confidences[detectionIndex] >= minConfidence)
{
long label = labels[detectionIndex];
float confidence = confidences[detectionIndex];
float x1 = boxes[i];
float y1 = boxes[i + 1];
float x2 = boxes[i + 2];
float y2 = boxes[i + 3];
Console.WriteLine($"Label: {label}, Confidence: {confidence}, Bounding Box: [{x1}, {y1}, {x2}, {y2}]");
}
}
}
The Copilot generated code had the names of the output tensors (6379,6381, 6383), the mean calculation and the order of the colours (B,G,R) correct. The name of the image file and the path to the model file in The Explanation and Additional information looked a lot like mine.
static void Main(string[] args)
{
// Path to the ONNX model and input image
string modelPath = "FasterRCNN-10.onnx";
string imagePath = "sports.jpg";
// Load the image
Bitmap image = new Bitmap(imagePath);
var inputTensor = PreprocessImage(image);
// Load the ONNX model
using var session = new InferenceSession(modelPath);
// Create input data
var inputName = session.InputMetadata.Keys.First();
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor(inputName, inputTensor)
};
//// Run inference
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);
// Process output
foreach (var result in results)
{
Console.WriteLine($"{result.Name}: {string.Join(", ", result.AsEnumerable<float>())}");
}
}
static DenseTensor<float> PreprocessImage(Bitmap image)
{
// Resize image to model's expected input size (e.g., 224x224)
int targetWidth = 224;
int targetHeight = 224;
Bitmap resizedImage = new Bitmap(image, new Size(targetWidth, targetHeight));
// Convert image to float array and normalize
var input = new DenseTensor<float>(new[] { 1, 3, targetHeight, targetWidth });
for (int y = 0; y < targetHeight; y++)
{
for (int x = 0; x < targetWidth; x++)
{
Color pixel = resizedImage.GetPixel(x, y);
input[0, 0, y, x] = pixel.R / 255f;
input[0, 1, y, x] = pixel.G / 255f;
input[0, 2, y, x] = pixel.B / 255f;
}
}
return input;
}
I fixed the paths to the ONNX model and image file, so the application ran but there was an error.
The dimensions of the input tensor were wrong based on the Faster R-CNN documentation because the declaration had an extra dimension.
Change dimensions of tensor by removing batch from declaration
static DenseTensor<float> PreprocessImage(Bitmap image)
{
// Resize image to model's expected input size (e.g., 224x224)
int targetWidth = 224;
int targetHeight = 224;
Bitmap resizedImage = new Bitmap(image, new Size(targetWidth, targetHeight));
// Convert image to float array and normalize
var input = new DenseTensor<float>(new[] { 3, targetHeight, targetWidth });
for (int y = 0; y < targetHeight; y++)
{
for (int x = 0; x < targetWidth; x++)
{
Color pixel = resizedImage.GetPixel(x, y);
input[0, y, x] = pixel.R / 255f;
input[1, y, x] = pixel.G / 255f;
input[2, y, x] = pixel.B / 255f;
}
}
return input;
}
GitHub Copilot fixed the tensor declaration by removing the batch id, then the application ran but there was an error.
resize the image such that both height and width are within the range of [800, 1333], such that both height and width are divisible by 32
static Bitmap ResizeImageForModel(Bitmap image)
{
// Define the target range and divisibility
const int minSize = 800;
const int maxSize = 1333;
const int divisor = 32;
// Get original dimensions
int originalWidth = image.Width;
int originalHeight = image.Height;
// Calculate scale factor to fit within the range while maintaining aspect ratio
float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight),
(float)minSize / Math.Min(originalWidth, originalHeight));
// Calculate new dimensions
int newWidth = (int)(originalWidth * scale);
int newHeight = (int)(originalHeight * scale);
// Ensure dimensions are divisible by 32
newWidth = (newWidth / divisor) * divisor;
newHeight = (newHeight / divisor) * divisor;
// Resize the image
return new Bitmap(image, new Size(newWidth, newHeight));
}
GitHub Copilot fixed the image resizing, so the application ran but there was still an error.
I also manually added the using directive for System.Drawing.Drawing2D
static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results, Bitmap image)
{
// Extract output tensors
var boxes = results.First(r => r.Name == "boxes").AsEnumerable<float>().ToArray();
var labels = results.First(r => r.Name == "labels").AsEnumerable<long>().ToArray();
var scores = results.First(r => r.Name == "scores").AsEnumerable<float>().ToArray();
using Graphics graphics = Graphics.FromImage(image);
graphics.SmoothingMode = SmoothingMode.AntiAlias;
for (int i = 0; i < labels.Length; i++)
{
if (scores[i] < 0.5) continue; // Filter low-confidence detections
// Extract bounding box coordinates
float x1 = boxes[i * 4];
float y1 = boxes[i * 4 + 1];
float x2 = boxes[i * 4 + 2];
float y2 = boxes[i * 4 + 3];
// Draw bounding box
RectangleF rect = new RectangleF(x1, y1, x2 - x1, y2 - y1);
graphics.DrawRectangle(Pens.Red, rect.X, rect.Y, rect.Width, rect.Height);
// Display label and confidence
string label = $"Label: {labels[i]}, Confidence: {scores[i]:0.00}";
graphics.DrawString(label, new Font("Arial", 12), Brushes.Yellow, new PointF(x1, y1 - 20));
}
// Save the image with annotations
image.Save("output.jpg");
Console.WriteLine("Output image saved as 'output.jpg'.");
}
The application ran but there was an error because the output tensor names were wrong.
I used Netron to determine the correct output tensor names.
It was quicker to manually fix the output tensor names
static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results, Bitmap image)
{
// Extract output tensors
var boxes = results.First(r => r.Name == "6379").AsEnumerable<float>().ToArray();
var labels = results.First(r => r.Name == "6381").AsEnumerable<long>().ToArray();
var scores = results.First(r => r.Name == "6383").AsEnumerable<float>().ToArray();
using Graphics graphics = Graphics.FromImage(image);
graphics.SmoothingMode = SmoothingMode.AntiAlias;
for (int i = 0; i < labels.Length; i++)
{
if (scores[i] < 0.5) continue; // Filter low-confidence detections
// Extract bounding box coordinates
float x1 = boxes[i * 4];
float y1 = boxes[i * 4 + 1];
float x2 = boxes[i * 4 + 2];
float y2 = boxes[i * 4 + 3];
// Draw bounding box
RectangleF rect = new RectangleF(x1, y1, x2 - x1, y2 - y1);
graphics.DrawRectangle(Pens.Red, rect.X, rect.Y, rect.Width, rect.Height);
// Display label and confidence
string label = $"Label: {labels[i]}, Confidence: {scores[i]:0.00}";
graphics.DrawString(label, new Font("Arial", 12), Brushes.Yellow, new PointF(x1, y1 - 20));
}
// Save the image with annotations
image.Save("output.jpg");
Console.WriteLine("Output image saved as 'output.jpg'.");
}
The application ran but the results were bad, so I checked format of the input tensor and figured out the mean adjustment was missing.
Apply mean to each channel
I used GitHub Copilot to add code for the mean adjustment for each pixel
static DenseTensor<float> PreprocessImage(Bitmap image)
{
// Resize image to model's expected input size
Bitmap resizedImage = ResizeImageForModel(image);
// Apply FasterRCNN mean values to each channel
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
// Convert image to float array and normalize
var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
for (int y = 0; y < resizedImage.Height; y++)
{
for (int x = 0; x < resizedImage.Width; x++)
{
Color pixel = resizedImage.GetPixel(x, y);
input[0, y, x] = (pixel.R - mean[0]) / 255f;
input[1, y, x] = (pixel.G - mean[1]) / 255f;
input[2, y, x] = (pixel.B - mean[2]) / 255f;
}
}
return input;
}
The application ran but the results were still bad, so I checked format of the input tensor and figured out the mean adjustment was wrong. It was quicker to manually fix up the mean calculation.
static DenseTensor<float> PreprocessImage(Bitmap image)
{
// Resize image to model's expected input size
Bitmap resizedImage = ResizeImageForModel(image);
// Apply FasterRCNN mean values to each channel
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
// Convert image to float array and normalize
var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
for (int y = 0; y < resizedImage.Height; y++)
{
for (int x = 0; x < resizedImage.Width; x++)
{
Color pixel = resizedImage.GetPixel(x, y);
input[0, y, x] = pixel.R - mean[0];
input[1, y, x] = pixel.G - mean[1];
input[2, y, x] = pixel.B - mean[2];
}
}
return input;
}
The application ran but the results were still bad, so I checked format of the input tensor and figured out the input tensor was BGR rather than RGB.
Change to B,G,R
static DenseTensor<float> PreprocessImage(Bitmap image)
{
// Resize image to model's expected input size
Bitmap resizedImage = ResizeImageForModel(image);
// Apply FasterRCNN mean values to each channel
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
// Convert image to float array and normalize
var input = new DenseTensor<float>(new[] { 3, resizedImage.Height, resizedImage.Width });
for (int y = 0; y < resizedImage.Height; y++)
{
for (int x = 0; x < resizedImage.Width; x++)
{
Color pixel = resizedImage.GetPixel(x, y);
input[0, y, x] = pixel.B - mean[0] ;
input[1, y, x] = pixel.G - mean[1] ;
input[2, y, x] = pixel.R - mean[2] ;
}
}
return input;
}
Finally, the application minimum bounding rectangles (MBRs), labels and confidences looked correct.
The additional code for marking up the image in the ProcessOutput was unexpected and I wonder if there wasn’t a Sixlabors.ImageSharp example for “inspiration”
The Copilot generated code in this post is not suitable for production
For this post I’ll be using GitHub Copilot to generate code for a console application that uses a Faster R-CNNONNX model to process an image loaded from disk.
please write a C# console application that uses Onnx to run a ONNX Faster-RCNN object detection model on an image loaded from disk
static void Main(string[] args)
{
string modelPath = "FasterRCNN-10.onnx";
string imagePath = "sports.jpg";
using var session = new InferenceSession(modelPath);
using var image = new Bitmap(imagePath);
var inputTensor = ExtractTensorFromImage(image);
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor("image", inputTensor)
};
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);
var output = results.First().AsEnumerable<float>().ToArray();
// Process the output (e.g., draw bounding boxes on the image)
ProcessOutput(output, image);
image.Save("output.jpg", ImageFormat.Jpeg);
Console.WriteLine("Object detection completed. Output saved as output.jpg");
}
private static DenseTensor<float> ExtractTensorFromImage(Bitmap image)
{
int width = image.Width;
int height = image.Height;
var tensor = new DenseTensor<float>(new[] { 1, 3, height, width });
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
Color pixel = image.GetPixel(x, y);
tensor[0, 0, y, x] = pixel.R / 255.0f;
tensor[0, 1, y, x] = pixel.G / 255.0f;
tensor[0, 2, y, x] = pixel.B / 255.0f;
}
}
return tensor;
}
private static void ProcessOutput(float[] output, Bitmap image)
{
// Implement the logic to process the output and draw bounding boxes on the image
// This is a placeholder for the actual implementation
}
I fixed the paths to the ONNX model and image files, so the application ran but there was an error.
The dimensions of the input tensor were wrong because the declaration included a batch id.
Change dimensions of tensor by removing batch from declaration
private static DenseTensor<float> ExtractTensorFromImage(Bitmap image)
{
int width = image.Width;
int height = image.Height;
var tensor = new DenseTensor<float>(new[] { 3, height, width });
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
Color pixel = image.GetPixel(x, y);
tensor[0, y, x] = pixel.R / 255.0f;
tensor[1, y, x] = pixel.G / 255.0f;
tensor[2, y, x] = pixel.B / 255.0f;
}
}
return tensor;
}
GitHub Copilot fixed the tensor declaration by removing the batch id. The application also used System.Drawing which is not supported on my target embedded platforms.
Use ImageSharp to resize the image such that both height and width are within the range of [800, 1333], such that both height and width are divisible by 32.
private static void ResizeImage(Image<Rgb24> image)
{
const int minSize = 800;
const int maxSize = 1333;
int originalWidth = image.Width;
int originalHeight = image.Height;
float scale = Math.Min((float)maxSize / Math.Max(originalWidth, originalHeight), (float)minSize / Math.Min(originalWidth, originalHeight));
int newWidth = (int)(originalWidth * scale);
int newHeight = (int)(originalHeight * scale);
// Ensure dimensions are divisible by 32
newWidth = (newWidth / 32) * 32;
newHeight = (newHeight / 32) * 32;
image.Mutate(x => x.Resize(newWidth, newHeight));
}
The application ran but the results were bad, so I checked format of the input tensor and figured out the R,G,B values need to be adjusted.
Apply mean to each channel
private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
int width = image.Width;
int height = image.Height;
var tensor = new DenseTensor<float>(new[] { 3, height, width });
// Mean values for each channel
float[] mean = { 0.485f, 0.456f, 0.406f };
image.ProcessPixelRows(accessor =>
{
for (int y = 0; y < height; y++)
{
var pixelRow = accessor.GetRowSpan(y);
for (int x = 0; x < width; x++)
{
tensor[0, y, x] = (pixelRow[x].R / 255.0f) - mean[0];
tensor[1, y, x] = (pixelRow[x].G / 255.0f) - mean[1];
tensor[2, y, x] = (pixelRow[x].B / 255.0f) - mean[2];
}
}
});
return tensor;
}
The application ran but the results were still bad, so I checked format of the input tensor and figured out the Resnet50 means had be used and the input tensor was BGR rather than RGB
Use FasterRCNN means not resnet ones Change to B,G,R
private static DenseTensor<float> ExtractTensorFromImage(Image<Rgb24> image)
{
int width = image.Width;
int height = image.Height;
var tensor = new DenseTensor<float>(new[] { 3, height, width });
// Mean values for each channel for FasterRCNN
float[] mean = { 102.9801f, 115.9465f, 122.7717f };
image.ProcessPixelRows(accessor =>
{
for (int y = 0; y < height; y++)
{
var pixelRow = accessor.GetRowSpan(y);
for (int x = 0; x < width; x++)
{
tensor[0, y, x] = pixelRow[x].B - mean[0];
tensor[1, y, x] = pixelRow[x].G - mean[1];
tensor[2, y, x] = pixelRow[x].R - mean[2];
}
}
});
return tensor;
}
When I inspected the values in the output tensor in the debugger they looked “reasonable” so got GitHub Copilot to add the code required to display the results.
Display label, confidence and bounding box
The application ran but there was an exception because the names of the output tensor “dimensions” were wrong.
I used Netron to get the correct output tensor “dimension” names.
I then manually fixed the output tensor “dimension” names
private static void ProcessOutput(IDisposableReadOnlyCollection<DisposableNamedOnnxValue> output)
{
var boxes = output.First(x => x.Name == "6379").AsTensor<float>().ToArray();
var labels = output.First(x => x.Name == "6381").AsTensor<long>().ToArray();
var confidences = output.First(x => x.Name == "6383").AsTensor<float>().ToArray();
const float minConfidence = 0.7f;
for (int i = 0; i < boxes.Length; i += 4)
{
var index = i / 4;
if (confidences[index] >= minConfidence)
{
long label = labels[index];
float confidence = confidences[index];
float x1 = boxes[i];
float y1 = boxes[i + 1];
float x2 = boxes[i + 2];
float y2 = boxes[i + 3];
Console.WriteLine($"Label: {label}, Confidence: {confidence}, Bounding Box: [{x1}, {y1}, {x2}, {y2}]");
}
}
}
I manually compared the output of the console application with equivalent YoloSharp application output and the results looked close enough.
Summary
The Copilot prompts required to generate code were significantly more complex than previous examples and I had to regularly refer to the documentation to figure out what was wrong. The code wasn’t great and Copilot didn’t add much value
The Copilot generated code in this post is not suitable for production