Windows 10 IoT Core Cognitive Services Computer Vision API

This application was inspired by one of teachers I work with wanting to check occupancy of different areas in the school library. I had been using the Computer Vision service to try and identify objects around my home and office which had been moderately successful but not terribly useful or accurate.

I added the Azure Cognitive Services Computer Vision API NuGet packages to my Visual Studio 2017 Windows IoT Core project.

Azure Cognitive Services Computer Vision API library

Then I initialised the Computer Vision API client

try
{
	this.computerVisionClient = new ComputerVisionClient(
			 new Microsoft.Azure.CognitiveServices.Vision.ComputerVision.ApiKeyServiceClientCredentials(this.azureCognitiveServicesSubscriptionKey),
			 new System.Net.Http.DelegatingHandler[] { })
	{
		Endpoint = this.azureCognitiveServicesEndpoint,
	};
}
catch (Exception ex)
{
	this.logging.LogMessage("Azure Cognitive Services Computer Vision client configuration failed " + ex.Message, LoggingLevel.Error);
	return;
}

Every time the digital input is strobed by the passive infra red motion detector an image is captured, then uploaded for processing, and finally results displayed. For this sample I’m looking for categories which indicate the image is of a group of people (The categories are configured in the appsettings file)

{
  "InterruptPinNumber": 24,
  "interruptTriggerOn": "RisingEdge",
  "DisplayPinNumber": 35,
  "AzureCognitiveServicesEndpoint": "https://australiaeast.api.cognitive.microsoft.com/",
  "AzureCognitiveServicesSubscriptionKey": "1234567890abcdefghijklmnopqrstuv",
  "ComputerVisionCategoryNames":"people_group,people_many",
  "LocalImageFilenameFormatLatest": "{0}.jpg",
  "LocalImageFilenameFormatHistoric": "{1:yyMMddHHmmss}.jpg",
  "DebounceTimeout": "00:00:30"
} 

If any of the specified categories are identified in the image I illuminate a Light Emitting Diode (LED) for 5 seconds, if an image is being processed or the minimum period between images has not passed the LED is illuminated for 5 milliseconds .

		private async void InterruptGpioPin_ValueChanged(GpioPin sender, GpioPinValueChangedEventArgs args)
		{
			DateTime currentTime = DateTime.UtcNow;
			Debug.WriteLine($"Digital Input Interrupt {sender.PinNumber} triggered {args.Edge}");

			if (args.Edge != this.interruptTriggerOn)
			{
				return;
			}

			// Check that enough time has passed for picture to be taken
			if ((currentTime - this.imageLastCapturedAtUtc) < this.debounceTimeout)
			{
				this.displayGpioPin.Write(GpioPinValue.High);
				this.displayOffTimer.Change(this.timerPeriodDetectIlluminated, this.timerPeriodInfinite);
				return;
			}

			this.imageLastCapturedAtUtc = currentTime;

			// Just incase - stop code being called while photo already in progress
			if (this.cameraBusy)
			{
				this.displayGpioPin.Write(GpioPinValue.High);
				this.displayOffTimer.Change(this.timerPeriodDetectIlluminated, this.timerPeriodInfinite);
				return;
			}

			this.cameraBusy = true;

			try
			{
				using (Windows.Storage.Streams.InMemoryRandomAccessStream captureStream = new Windows.Storage.Streams.InMemoryRandomAccessStream())
				{
					this.mediaCapture.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), captureStream).AsTask().Wait();
					captureStream.FlushAsync().AsTask().Wait();
					captureStream.Seek(0);

					IStorageFile photoFile = await KnownFolders.PicturesLibrary.CreateFileAsync(ImageFilename, CreationCollisionOption.ReplaceExisting);
					ImageEncodingProperties imageProperties = ImageEncodingProperties.CreateJpeg();
					await this.mediaCapture.CapturePhotoToStorageFileAsync(imageProperties, photoFile);

					ImageAnalysis imageAnalysis = await this.computerVisionClient.AnalyzeImageInStreamAsync(captureStream.AsStreamForRead());

					Debug.WriteLine($"Tag count {imageAnalysis.Categories.Count}");

					if (imageAnalysis.Categories.Intersect(this.categoryList, new CategoryComparer()).Any())
					{
						this.displayGpioPin.Write(GpioPinValue.High);

						// Start the timer to turn the LED off
						this.displayOffTimer.Change(this.timerPeriodFaceIlluminated, this.timerPeriodInfinite);
					}

					LoggingFields imageInformation = new LoggingFields();

					imageInformation.AddDateTime("TakenAtUTC", currentTime);
					imageInformation.AddInt32("Pin", sender.PinNumber);
					Debug.WriteLine($"Categories:{imageAnalysis.Categories.Count}");
					imageInformation.AddInt32("Categories", imageAnalysis.Categories.Count);
					foreach (Category category in imageAnalysis.Categories)
					{
						Debug.WriteLine($" Category:{category.Name} {category.Score}");
						imageInformation.AddDouble($"Category:{category.Name}", category.Score);
					}

					this.logging.LogEvent("Captured image processed by Cognitive Services", imageInformation);
				}
			}
			catch (Exception ex)
			{
				this.logging.LogMessage("Camera photo or save failed " + ex.Message, LoggingLevel.Error);
			}
			finally
			{
				this.cameraBusy = false;
			}
		}

		private void TimerCallback(object state)
		{
			this.displayGpioPin.Write(GpioPinValue.Low);
		}

		internal class CategoryComparer : IEqualityComparer<Category>
		{
			public bool Equals(Category x, Category y)
			{
				if (string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase))
				{
					return true;
				}

				return false;
			}

			public int GetHashCode(Category obj)
			{
				return obj.Name.GetHashCode();
			}
		}

I found that the Computer vision service was pretty good at categorising photos of images like this displayed on my second monitor as containing a group of people.

The debugging output of the application includes the different categories identified in the captured image.

Digital Input Interrupt 24 triggered RisingEdge
Digital Input Interrupt 24 triggered FallingEdge
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Diagnostics.DiagnosticSource.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Collections.NonGeneric.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Runtime.Serialization.Formatters.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Diagnostics.TraceSource.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Collections.Specialized.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Drawing.Primitives.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Runtime.Serialization.Primitives.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Data.Common.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Xml.ReaderWriter.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'C:\Data\Programs\WindowsApps\Microsoft.NET.CoreFramework.Debug.2.2_2.2.27505.2_arm__8wekyb3d8bbwe\System.Private.Xml.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'backgroundTaskHost.exe' (CoreCLR: CoreCLR_UWP_Domain): Loaded 'Anonymously Hosted DynamicMethods Assembly'. 
Tag count 1
Categories:1
 Category:people_group 0.8671875
The thread 0x634 has exited with code 0 (0x0).

I used an infrared motion sensor to trigger capture and processing of an image to simulate a application for detecting if there is a group of people in an area of the school library.

I’m going to run this application alongside one of my time-lapse applications to record a days worth of images and manually check the accuracy of the image categorisation. I think that camera location maybe important as well so I’ll try a selection of different USB cameras and locations.

Trial PIR triggered computer vision client

I also found the small PIR motion detector didn’t work very well in a larger space so I’m going to trial a configurable sensor and a repurposed burglar alarm sensor.

Windows 10 IoT Core Cognitive Services Face API

After building a series of Windows 10 IoT Core applications to capture images and store them

I figured some sample applications which used Azure Cognitive Services Vision Services to process captured images would be interesting.

This application was inspired by one of my students who has been looking at an Arduino based LoRa wireless connected sensor for monitoring Ultraviolet(UV) light levels and wanted to check that juniors at the school were wearing their hats on sunny days before going outside.

First I needed create a Cognitive Services instance and get the subscription key and endpoint.

Azure Cognitive Services Instance Creation

Then I added the Azure Cognitive Services Face API NuGet packages into my Visual Studio Windows IoT Core project

Azure Cognitive Services Vision Face API library

Then initialise the Face API client

try
{
	this.faceClient = new FaceClient(
			 new Microsoft.Azure.CognitiveServices.Vision.Face.ApiKeyServiceClientCredentials(this.azureCognitiveServicesSubscriptionKey),
											 new System.Net.Http.DelegatingHandler[] { })
	{
		Endpoint = this.azureCognitiveServicesEndpoint,
	};
}
catch (Exception ex)
{
	this.logging.LogMessage("Azure Cognitive Services Face Client configuration failed " + ex.Message, LoggingLevel.Error);
	return;
}

Then every time a digital input is strobed and image is captured, then uploaded for processing, and finally results displayed. The interrupt handler has code to stop re-entrancy and contactor bounce causing issues. I also requested that the Face service include age and gender attributes with associated confidence values.

If a face is found in the image I illuminate a Light Emitting Diode (LED) for 5 seconds, if an image is being processed or the minimum period between images has not passed the LED is illuminated for 5 milliseconds .

private async void InterruptGpioPin_ValueChanged(GpioPin sender, GpioPinValueChangedEventArgs args)
{
	DateTime currentTime = DateTime.UtcNow;
	Debug.WriteLine($"Digital Input Interrupt {sender.PinNumber} triggered {args.Edge}");

	if (args.Edge != this.interruptTriggerOn)
	{
		return;
	}

	// Check that enough time has passed for picture to be taken
	if ((currentTime - this.imageLastCapturedAtUtc) < this.debounceTimeout)
	{
		this.displayGpioPin.Write(GpioPinValue.High);
		this.displayOffTimer.Change(this.timerPeriodDetectIlluminated, this.timerPeriodInfinite);
		return;
	}

	this.imageLastCapturedAtUtc = currentTime;

	// Just incase - stop code being called while photo already in progress
	if (this.cameraBusy)
	{
		this.displayGpioPin.Write(GpioPinValue.High);
		this.displayOffTimer.Change(this.timerPeriodDetectIlluminated, this.timerPeriodInfinite);
		return;
	}

	this.cameraBusy = true;

	try
	{
		using (Windows.Storage.Streams.InMemoryRandomAccessStream captureStream = new Windows.Storage.Streams.InMemoryRandomAccessStream())
		{
			this.mediaCapture.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), captureStream).AsTask().Wait();
			captureStream.FlushAsync().AsTask().Wait();
			captureStream.Seek(0);
			IStorageFile photoFile = await KnownFolders.PicturesLibrary.CreateFileAsync(ImageFilename, CreationCollisionOption.ReplaceExisting);
			ImageEncodingProperties imageProperties = ImageEncodingProperties.CreateJpeg();
			await this.mediaCapture.CapturePhotoToStorageFileAsync(imageProperties, photoFile);

			IList<FaceAttributeType> returnfaceAttributes = new List<FaceAttributeType>();
			returnfaceAttributes.Add(FaceAttributeType.Gender);
			returnfaceAttributes.Add(FaceAttributeType.Age);

			IList<DetectedFace> detectedFaces = await this.faceClient.Face.DetectWithStreamAsync(captureStream.AsStreamForRead(), returnFaceAttributes: returnfaceAttributes);

			Debug.WriteLine($"Count {detectedFaces.Count}");

			if (detectedFaces.Count > 0)
			{
				this.displayGpioPin.Write(GpioPinValue.High);

						// Start the timer to turn the LED off
				this.displayOffTimer.Change(this.timerPeriodFaceIlluminated, this.timerPeriodInfinite);
			}

			LoggingFields imageInformation = new LoggingFields();
			imageInformation.AddDateTime("TakenAtUTC", currentTime);
			imageInformation.AddInt32("Pin", sender.PinNumber);
			imageInformation.AddInt32("Faces", detectedFaces.Count);
			foreach (DetectedFace detectedFace in detectedFaces)
			{
				Debug.WriteLine("Face");
				if (detectedFace.FaceId.HasValue)
				{
					imageInformation.AddGuid("FaceId", detectedFace.FaceId.Value);
					Debug.WriteLine($" Id:{detectedFace.FaceId.Value}");
				}
				imageInformation.AddInt32("Left", detectedFace.FaceRectangle.Left);
				imageInformation.AddInt32("Width", detectedFace.FaceRectangle.Width);
				imageInformation.AddInt32("Top", detectedFace.FaceRectangle.Top);
				imageInformation.AddInt32("Height", detectedFace.FaceRectangle.Height);
				Debug.WriteLine($" L:{detectedFace.FaceRectangle.Left} W:{detectedFace.FaceRectangle.Width} T:{detectedFace.FaceRectangle.Top} H:{detectedFace.FaceRectangle.Height}");
				if (detectedFace.FaceAttributes != null)
				{
					if (detectedFace.FaceAttributes.Gender.HasValue)
					{
						imageInformation.AddString("Gender", detectedFace.FaceAttributes.Gender.Value.ToString());
						Debug.WriteLine($" Gender:{detectedFace.FaceAttributes.Gender.ToString()}");
					}

					if (detectedFace.FaceAttributes.Age.HasValue)
					{
						imageInformation.AddDouble("Age", detectedFace.FaceAttributes.Age.Value);
						Debug.WriteLine($" Age:{detectedFace.FaceAttributes.Age.Value.ToString("F1")}");
					}
				}
			}

			this.logging.LogEvent("Captured image processed by Cognitive Services", imageInformation);
		}
	}
	catch (Exception ex)
	{
		this.logging.LogMessage("Camera photo or save failed " + ex.Message, LoggingLevel.Error);
	}
	finally
	{
		this.cameraBusy = false;
	}
}

private void TimerCallback(object state)
{
	this.displayGpioPin.Write(GpioPinValue.Low);
}

This is the image uploaded to the Cognitive Services Vision Face API from my DragonBoard 410C

Which was a photo of this sample image displayed on my second monitor

The debugging output of the application includes the bounding box, gender, age and unique identifier of each detected face.

Digital Input Interrupt 24 triggered RisingEdge
Digital Input Interrupt 24 triggered FallingEdge
Count 13
Face
 Id:41ab8a38-180e-4b63-ab47-d502b8534467
 L:12 W:51 T:129 H:51
 Gender:Female
 Age:24.0
Face
 Id:554f7557-2b78-4392-9c73-5e51fedf0300
 L:115 W:48 T:146 H:48
 Gender:Female
 Age:19.0
Face
 Id:f67ae4cc-1129-46a8-8c5b-0e79f350cbaa
 L:547 W:46 T:162 H:46
 Gender:Female
 Age:56.0
Face
 Id:fad453fb-0923-4ae2-8c9d-73c9d89eaaf4
 L:585 W:45 T:116 H:45
 Gender:Female
 Age:25.0
Face
 Id:c2d2ca4e-faa6-49e8-8cd9-8d21abfc374c
 L:410 W:44 T:154 H:44
 Gender:Female
 Age:23.0
Face
 Id:6fb75edb-654c-47ff-baf0-847a31d2fd85
 L:70 W:44 T:57 H:44
 Gender:Male
 Age:37.0
Face
 Id:d6c97a9a-c49f-4d9c-8eac-eb2fbc03abc1
 L:469 W:44 T:122 H:44
 Gender:Female
 Age:38.0
Face
 Id:e193bf15-6d8c-4c30-adb5-4ca5fb0f0271
 L:206 W:44 T:117 H:44
 Gender:Male
 Age:33.0
Face
 Id:d1ba5a42-0475-4b65-afc8-0651439e1f1e
 L:293 W:44 T:74 H:44
 Gender:Male
 Age:59.0
Face
 Id:b6a7c551-bdad-4e38-8976-923b568d2721
 L:282 W:43 T:144 H:43
 Gender:Female
 Age:28.0
Face
 Id:8be87f6d-7350-4bc3-87f5-3415894b8fac
 L:513 W:42 T:78 H:42
 Gender:Male
 Age:36.0
Face
 Id:e73bd4d7-81a4-403c-aa73-1408ae1068c0
 L:163 W:36 T:94 H:36
 Gender:Female
 Age:44.0
Face
 Id:462a6948-a05e-4fea-918d-23d8289e0401
 L:407 W:36 T:73 H:36
 Gender:Male
 Age:27.0
The thread 0x8e0 has exited with code 0 (0x0).

I used a simple infrared proximity sensor trigger the image capture to simulate an application for monitoring the number of people in or people entering a room.

Infrared Proximity Sensor triggered Face API test client

Overall I found that with not a lot of code I could capture an image, upload it to Azure Cognitive Services Face API for processing and the algorithm would reasonably reliably detect faces and features.