HTTPS with NetMF calling an Azure Service Bus endpoint

Back in Q1 of 2013 I posted a sample application which called an Azure Service Bus endpoint, just to confirm that the certificate configuration etc. was good.

Since I published that sample the Azure root certificate has been migrated, so I have created a new project which uses the Baltimore CyberTrust Root certificate.

The sample Azure Service Bus client uses a wired LAN connection (the original used a WiFi module) and to run it locally you will have to update the date information around line 24.

SQL Azure Database performance oddness

For a few years I have worked on a midsize Azure application for managing promotional vouchers, which are delivered via SMS or email and redeemed via software integrated into the point-of-sale terminals of several local vendors.

A promotion has a batch of vouchers available for allocation to consumers, and I had noticed from the logs and performance counters that getting the next voucher slowed down significantly as the number of available vouchers decreased.

Voucher batches range in size from hundreds to hundreds of thousands of vouchers, and I first noticed how much the duration increased on a small promotion for a local wine shop. The promotion had roughly 850 vouchers; allocating the first vouchers took tens of mSec each, but the last 10 vouchers sometimes took more than 1000 mSec each.

I took a copy of the live database and removed the customer data so I could explore the performance of my TSQL in a controlled environment. I initially restored the database onto one of my development servers and tried to reproduce the problem while monitoring performance with SQL Profiler and other tools, but the allocation time was fast and consistent.

The database performance problem appeared to be a SQL Azure specific issue, so I built a cut-back web role test harness which called the underlying stored procedure so I could closely monitor its performance. The test harness made a specified number of calls, recording the duration of each call and the overall duration. I then de-allocated all the vouchers in the wine shop promotion and allocated them in chunks (all durations are in mSec and are the average time it takes to make a single call).

100 vouchers at a time (0 – 800 of 843 vouchers)

43, 88, 136, 191, 260, 305, 358, 379

10 vouchers at a time (800-840 of 843 vouchers)

431, 440, 404, 412

1 voucher at a time (840-843 of 843 vouchers, last one is failure)

400, 423, 404, 390

After some debugging, progressive removal of code and looking at query plans I identified the problematic TSQL statement.

UPDATE TOP(1) Voucher SET
Voucher.AllocatedByActivity = @ActivityUID,
Voucher.AllocatedAtUTC = @RequestDateTimeUTC,
Voucher.AllocationReference = @ClientReference,
@VoucherCode = Voucher.Code
WHERE (( Voucher.VoucherBatchUID = @VoucherBatchUID ) AND ( Voucher.AllocatedByActivity IS NULL ))

This update statement uses a single statement transaction to get the next random un-allocated voucher code in the specified voucher batch.

After some conversations with a (very helpful) SQL Azure support engineer at Microsoft, he figured out that in SQL Azure the query processor needed a query hint telling it to use an existing index to make the statement perform consistently.

UPDATE TOP(1) Voucher SET
Voucher.AllocatedByActivity = @ActivityUID,
Voucher.AllocatedAtUTC = @RequestDateTimeUTC,
Voucher.AllocationReference = @ClientReference,
@VoucherCode = Voucher.Code
FROM Voucher WITH (INDEX(IX_VoucherByVoucherBatchUIDAllocatedByActivity))
WHERE (( Voucher.VoucherBatchUID = @VoucherBatchUID ) AND ( Voucher.AllocatedByActivity IS NULL ))
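The index named in the hint isn't defined in the post; a plausible shape for it (an assumption based on the WHERE clause and the selected column, not the actual schema) would be something like:

```sql
-- Assumed definition of the index referenced by the hint.
-- Keying on the batch and allocation columns lets the query
-- processor find an unallocated voucher without a wider scan,
-- and including Code covers the @VoucherCode assignment.
CREATE NONCLUSTERED INDEX IX_VoucherByVoucherBatchUIDAllocatedByActivity
ON Voucher (VoucherBatchUID, AllocatedByActivity)
INCLUDE (Code);
```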

100 vouchers at a time (0 – 800 of 843 vouchers)

20, 8, 13, 9, 32, 7, 15, 9

10 vouchers at a time (800-840 of 843 vouchers)

17, 9, 14, 13

1 voucher at a time (840-843 of 843 vouchers, last one is failure)

12, 11, 13, 5

I always have plenty of performance counters and logging (using the Enterprise Library) on my Azure web and worker roles, but I was also lucky to notice something odd in the logs while checking on the progress of another promotion. I actively monitor the performance of my Azure applications, as the performance characteristics of the underlying hardware change over time as fixes and enhancements are released.

EntLib Logging in Azure – Revisited

At one of my clients I was debugging an application which used the Microsoft Enterprise Library logging block and I added some code to log the service configuration on start up.

The odd thing was that when I turned on verbose logging I noticed the first couple of log entries were sometimes missing. After some mucking around I found that the log entries were only missing for a new deployment; everything was fine for a restart.

The missing log entries weren't immediately obvious, as I was testing a clustered application and multiple instances were logging information simultaneously.

public override bool OnStart()
{
    Trace.Listeners.Add(new DiagnosticMonitorTraceListener());
    Trace.AutoFlush = true;

    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"));
    RoleInstanceDiagnosticManager roleInstanceDiagnosticManager = storageAccount.CreateRoleInstanceDiagnosticManager(RoleEnvironment.DeploymentId, RoleEnvironment.CurrentRoleInstance.Role.Name, RoleEnvironment.CurrentRoleInstance.Id);

    var configuration = roleInstanceDiagnosticManager.GetCurrentConfiguration();
    if (configuration == null)
    {
        configuration = DiagnosticMonitor.GetDefaultInitialConfiguration();
    }

    configuration.Logs.BufferQuotaInMB = 4;
    configuration.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
    configuration.Logs.ScheduledTransferLogLevelFilter = LogLevel.Undefined;
    roleInstanceDiagnosticManager.SetCurrentConfiguration(configuration);

    EnterpriseLibraryContainer.Current = EnterpriseLibraryContainer.CreateDefaultContainer(new FileConfigurationSource("web.config", false));
    Logger.Write("OnStart " + DateTime.UtcNow.ToLongTimeString());

    return base.OnStart();
}

public override void Run()
{
    int index = 0;

    Logger.Write("Run Start " + DateTime.UtcNow.ToLongTimeString());

    while (true)
    {
        Logger.Write("Run Loop " + index.ToString() + " " + DateTime.UtcNow.ToLongTimeString());
        index++;
        Thread.Sleep(10000);
    }
}

The trace log for a new deployment looked like this (I have found it starts somewhere between loop 2 and loop 6):

Run Loop 6 8:04:09 AM
Run Loop 7 8:04:19 AM
Run Loop 8 8:04:29 AM
….

For a restart it looked like this:

OnStart 8:19:01 AM
Run Start 8:19:01 AM
Run Loop 0 8:19:01 AM
Run Loop 1 8:19:11 AM
Run Loop 2 8:19:21 AM

I’m assuming this is a result of the time it takes for DiagnosticsAgent.exe to spin up.

If I use the Windows Event Log the results are similar, but with duplicate entries every so often, which is odd.

Christchurch DNUG Internet of Things Presentation 5:30PM 26th March

I’m doing a presentation at the EPIC centre on the 26th of March about Windows Azure and the Internet of Things.

The local Java user group has been invited as well, so I’ll also be going into interop and connectivity considerations. More information about the location and time is available from here.

Tim Carr from Mindkits has also kindly provided some swag: a Fez Panda II and a Robot Chassis to give away for the best two questions.

HTTP Headers Payload encryption

So far the content of the messages has been sent as clear text, which would not be acceptable for many applications. The requirement for data privacy causes a problem on the Netduino+ (as of Nov 2012), as the standard NetMF crypto libraries are not baked into the platform.

I then set about finding some crypto libraries which were NetMF friendly. RSA and Xtea are included in some other NetMF platforms in the Microsoft.SPOT.Cryptography assembly, so Xtea seemed like a reasonable choice to ensure interoperability.

When looking for crypto implementations one of the best sources is the Legion of the Bouncy Castle, which was where I started. I downloaded the V17.7 zip file, had a look at the size of the Xtea code and all the associated support libraries, and then parked that approach as I was worried about the size and performance of the resulting binaries.

I then looked at other possible TEA, XTEA and XXTEA implementations (including porting the original C code to C#).

I am not a cryptographer so I can’t personally confirm the quality and correctness of an implementation, so after having interop problems I went back to the Legion of the Bouncy Castle, which has been peer reviewed by experts, and had another look. To get an Xtea implementation working on a NetMF platform like the Netduino+ you need to include the following files…

  • CryptoException.cs
  • DataLengthException.cs
  • IBlockCipher.cs
  • ICipherParameters.cs
  • KeyParameter.cs
  • XTEAEngine.cs

On the Azure side of things where size is not so critical I just added a reference to the Bouncy Castle main project.

Xtea works on 64 bit (8 byte) blocks, so you need to pad out the data on the client, then trim off the padding on the server.

// Pad out the data to a multiple of 8 bytes with spaces
if (xPosition.Length % 8 != 0)
{
    xPosition += new string(' ', 8 - xPosition.Length % 8);
}

The key and the data need to be converted to byte arrays, the Xtea engine initialised, and a buffer for storing the encrypted data created.

byte[] dataBytes = Encoding.UTF8.GetBytes(xPosition);
byte[] keyBytes = Encoding.UTF8.GetBytes(key);

xteaEngine.Init(true, new KeyParameter(keyBytes));

Then the data can be encrypted in 8 byte chunks:

byte[] cryptoBytes = new byte[dataBytes.Length];
for (int i = 0; i < dataBytes.Length; i += 8)
{
    xteaEngine.ProcessBlock(dataBytes, i, cryptoBytes, i);
}
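On the Azure side the same engine runs in reverse. The post doesn’t show the decrypt code; a sketch, assuming the same key, the full Bouncy Castle library, and the space padding from the client, would be:

```csharp
// Server side sketch (an assumption - the original decrypt code isn't shown).
// Re-initialise the engine for decryption and process the same 8 byte blocks.
xteaEngine.Init(false, new KeyParameter(keyBytes)); // false = decrypt
byte[] plainBytes = new byte[cryptoBytes.Length];
for (int i = 0; i < cryptoBytes.Length; i += 8)
{
    xteaEngine.ProcessBlock(cryptoBytes, i, plainBytes, i);
}
// Trim off the space padding added on the client
string position = new string(Encoding.UTF8.GetChars(plainBytes)).TrimEnd(' ');
```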

I hex encoded the encrypted data for transmission. The downside was that it doubled the size of the payload:
string hexEncodedCryptoBytes = ByteArrayToHex(cryptoBytes);
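The ByteArrayToHex helper isn’t shown in the post; a minimal NetMF friendly sketch (my implementation, not necessarily the original) could look like this:

```csharp
// Hypothetical helper - converts a byte array to an upper-case hex string.
// Avoids BitConverter.ToString, which inserts '-' separators.
static string ByteArrayToHex(byte[] bytes)
{
    const string hexDigits = "0123456789ABCDEF";
    char[] chars = new char[bytes.Length * 2];

    for (int i = 0; i < bytes.Length; i++)
    {
        chars[i * 2] = hexDigits[bytes[i] >> 4];       // high nibble
        chars[i * 2 + 1] = hexDigits[bytes[i] & 0x0F]; // low nibble
    }
    return new string(chars);
}
```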

I added a Stopwatch so I could measure the time taken to encrypt the position data (roughly 72 chars) on my Netduino+:
285,342,277,345,282,345,342,350,278,343
Average 318mSec

The size of the payload had grown a bit
Request - Bytes Sent: 262
POST http://gpstrackerhttpheaders.cloudapp.net/posV7.aspx HTTP/1.1
Host: gpstrackerhttpheaders.cloudapp.net
x-Pos: 693A7AC6EBF4E5848CE8ABBA2BC6CAC1ED20574C1B2384E7E246A202C8A67E3DE14EE5231A5DF98C211F64F8402547F8BFDCC2241AAE3782A820086E5EF37AA2C50744941F588442
Content-Length: 0
Connection: Close

Response - Bytes Received: 132
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html
Date: Sun, 03 Feb 2013 04:53:30 GMT
Content-Length: 0

This increase in size had an impact on the time taken to send the message

1123,1144,1122,1142,1125,1125,1138,1111,1099,1141
Average 1127mSec

The binary downloaded to the Netduino+ had grown to 28K which still left plenty of space for additional functionality.

HTTP Headers Request reduction

The requests generated by the HTTP_Client were a bit chunky with user-agent strings, content type etc.

POST http://gpstrackerhttpheaders.cloudapp.net/posV4.aspx HTTP/1.1
Accept: */*
Accept-Language: en
User-Agent: NETMFToolbox/0.1 (textmode; Netduino; IntegratedSocket; HTTP_Client)
Host: gpstrackerhttpheaders.cloudapp.net
x-Pos: 5C-86-4A-00-3F-63 20130130080711 F -43.00000 172.00000 16.9 1.28 0 0
Content-Type: application/x-www-form-urlencoded
Content-Length: 0
Connection: Close

I then modified the HTTP_Client so that the Accept, Accept-Language, User-Agent and Content-Type headers could be removed if not necessary.

POST http://gpstrackerhttpheaders.cloudapp.net/posV4.aspx HTTP/1.1
Host: gpstrackerhttpheaders.cloudapp.net
x-Pos: 5C-86-4A-00-3F-63 20130130081228 F -43.00000 172.00000 31.4 1.31 0 0
Content-Length: 0
Connection: Close

This reduced the size of the request to 186 bytes, which was comparable with the smallest System.Http HttpWebRequest. But the durations looked a bit odd…

1295,1381,1323,1347,1281,1264,1351,1305,1350,1269
Average duration 1317 mSec

This was roughly 200 mSec slower than the larger request version. After some digging I think the TCP socket was buffering the send (Nagle algorithm send coalescing).

I then modified the IntegratedSocket.cs Connect method to disable this buffering

EndPoint Destination = new IPEndPoint(address.AddressList[0], (int)this._Port);
// Connects to the socket
this._Sock.Connect(Destination);
// Disable Nagle send coalescing so small requests are sent immediately
this._Sock.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);

I then checked the request durations again

10 requests to gpstrackerhttpheaders.cloudapp.net
1026,1248,983,1027,966,843,1009,983,834,915
Average duration 983 mSec

10 requests to IP Address
584, 595,578,597,581,595,577,593,580,599
Average duration 588 mSec

These were faster than any of the results with the baked in HttpWebRequest in System.http.

So I fired up .NET Reflector and had a look at the decompiled System.Http code; though there are calls to socket.SetSocketOption in EstablishConnection, I can’t see any way for a user to set the NoDelay socket option, as allowWriteStreamBuffering appears not to be used.

HTTP Headers NetMF Toolbox

The System.Http assembly adds roughly 40K to the size of an application download. The NetMF Toolbox has an HTTP client implementation which should significantly reduce the size of the download.

For my first proof of concept attempt, I downloaded the latest version of the NetMF toolbox (21733) and

  • Removed system.http reference
  • Added Toolbox.NETMF.NET.HTTP_Client (4.2)
  • Added Toolbox.NETMF.NET.Core (4.2)
  • Added Toolbox.NETMF.NET.Integrated (4.2)
  • Added Toolbox.NETMF.Core (4.2)
  • Modified code to use HTTP_Client instead of HttpWebRequest

The standard NetMF Toolbox HTTP_Client didn’t support adding HTTP headers to a request, so I modified HTTP_Client.cs, adding my code in the style of the cookies implementation. After a look at the HTTP request generation code I was interested to see how the performance of the NetMF Toolbox HTTP client compared to the Microsoft one with my payload reducing tweaks.
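For illustration, the change was along these lines. The member and method names below are my assumptions, modelled on the toolbox’s cookie support rather than copied from the actual diff:

```csharp
// Sketch only - names are assumptions, modelled on the cookie support.
// Custom headers to send with each request
private Hashtable _Headers = new Hashtable();

/// <summary>Adds a custom header to be sent with each request</summary>
public void AddHeader(string Name, string Value)
{
    this._Headers[Name] = Value;
}

// Then, where the request text is built, after the standard headers:
foreach (object Name in this._Headers.Keys)
{
    Request += (string)Name + ": " + (string)this._Headers[Name] + "\r\n";
}
```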

gpstrackerhttpheaders.cloudapp.net
1122, 1175, 1200, 1169, 1162, 1153, 1150, 1262, 1171, 1152
Average 1171 mSec

Which was a bit slower than I expected.

I then hacked the HTTP client code so I could use Fiddler to inspect the request & response payloads.

Request - Bytes Sent: 351
POST http://gpstrackerhttpheaders.cloudapp.net/posV4.aspx HTTP/1.1
Accept: */*
Accept-Language: en
User-Agent: NETMFToolbox/0.1 (textmode; Netduino; IntegratedSocket; HTTP_Client)
Host: gpstrackerhttpheaders.cloudapp.net
x-Pos: 5C-86-4A-00-3F-63 20000101025524 F -43.00000 172.00000 25.2 0.80 0 0
Content-Type: application/x-www-form-urlencoded
Content-Length: 0
Connection: Close


Response - Bytes Received: 132
HTTP/1.1 200 OK
Cache-Control: private
x-UpdMin: 30
Date: Tue, 29 Jan 2013 02:55:25 GMT
Connection: close
Content-Length: 0

The download went from 57K to 25K, which is roughly a 32K reduction in size. The HTTP request was a bit more chunky at 351 bytes vs. 192 bytes, but in a future post I’ll look into reducing it.

HTTP Headers threading

Running the HTTP requests and the NMEA stack on the foreground thread was always going to be a bit marginal: there were regular buffer overflows and memory allocation issues which the code would not always recover from. The simplest approach was to modify the code so a new thread is spawned for each HTTP request.

static void Gps_PositionChangedV5(uint data1, uint data2, DateTime time)
{
    ...
    Thread thread = new Thread(() => HandleRequestSyncV5(Gps.Fix3D, Gps.GPSTime, Gps.Latitude, Gps.Longitude, Gps.HDoP, Gps.Altitude, Gps.Kmh, Gps.TrackAngle));
    thread.Start();
}

static void HandleRequestSyncV5(bool fix3D, DateTime gpsTime, float latitude, float longitude, float hDop, float altitude, float kmh, float trackAngle)
{
    try
    {
        Stopwatch stopwatch = Stopwatch.StartNew();

        using (HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(CloudServerUrl + "posV4.aspx"))
        {
            stopwatch.Stop();
            ...

10 requests to gpstrackerhttpheaders.cloudapp.net
1172,1143,1182,1136,1178,1257,1110,1117,1099,1225
Average 1161 mSec

10 requests to IP Address
1031,834,1033,1086,1029,1108,1049,1072,1036,819
Average 1010 mSec

The application is now quite stable, but at roughly 57K bytes there isn’t a lot of space left for any business functionality…

HTTP Headers Request reduction

Approaching the minimal request & response payload sizes.

Request Bytes Sent: 192
POST http://gpstrackerhttpheaders.cloudapp.net/posV3.aspx HTTP/1.1
x-Pos: 5C-86-4A-00-3F-63 20130126054537 T -43.00000 172.00000 12.2 4.08 1 83
Content-Length: 0
Connection: Keep-Alive
Host: gpstrackerhttpheaders.cloudapp.net

Response Bytes Received: 113
HTTP/1.1 200 OK
Cache-Control: private
x-UpdMin: 30
Date: Sat, 26 Jan 2013 05:45:37 GMT
Content-Length: 0

10 requests to gpstrackerhttpheaders.cloudapp.net
944,969,1246,1043,1282,948,940,968,980,968
Average 1029 mSec

10 requests to IP Address
692,696,691,702,693,684,691,692,689,693
Average 692 mSec

Removing any more headers from the request or response could break HTTP compatibility.

I went back and timed a V1 request & response to see what impact the reduction in payload size had…

10 requests to gpstrackerhttpheaders.cloudapp.net
2534,1457,1426,1571,1453,1462,1516,1422,1416,1421
Average 1569 mSec

10 requests to IP Address
1222,1200,1221,1238,1210,1266,1199,1235,1218,1230,1215
Average 1345 mSec

V1 573 bytes – 1569mSec & 1345 mSec
V2 480 bytes
V3 376 bytes
V4 305 bytes – 1029 mSec & 692 mSec

Most probably another 20 or 30 bytes could be saved by removing the ‘-‘ from the MAC address, shorter url, fixed width fields with no delimiter, removing the keep alive etc.

During my testing the NMEA stack was quite unstable; it would fail with buffer overflows and overruns. The HTTP client requests are blocking, and sharing the main execution thread meant the NMEA stack was unable to keep up.

HTTP Headers response reduction

In the previous post the response was still looking a bit chunky

Response – Bytes Received: 217
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html
Server: Microsoft-IIS/7.0
x-UpdMin: 30
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Sat, 19 Jan 2013 10:25:12 GMT
Content-Length: 0

The HTTP headers returned by IIS were a significant portion of the response. I found that there were several different approaches to removing them, but the simplest to set up was discussed here.
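I haven’t reproduced the linked article, but one common way to strip most of these headers (an assumption, not necessarily the approach I used) is via web.config:

```xml
<!-- Sketch only: common IIS/ASP.NET header removal settings -->
<system.web>
  <!-- Removes the X-AspNet-Version header -->
  <httpRuntime enableVersionHeader="false" />
</system.web>
<system.webServer>
  <httpProtocol>
    <customHeaders>
      <!-- Removes the X-Powered-By header -->
      <remove name="X-Powered-By" />
    </customHeaders>
  </httpProtocol>
</system.webServer>
```

The Server header usually takes an HttpModule (or URLScan) to remove, and the Content-Type header can be suppressed by clearing Response.ContentType when there is no body.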

Response – Bytes Received: 113
HTTP/1.1 200 OK
Cache-Control: private
x-UpdMin: 30
Date: Sun, 20 Jan 2013 01:51:37 GMT
Content-Length: 0

V1 573 bytes
V2 480 bytes
V3 376 bytes

With a little effort that’s roughly a 200 byte, or one third, reduction in payload size. Still some scope for further reductions…