Airbnb Dataset – NetTopologySuite spatial searching

I have used Entity Framework Core, a “full fat” Object Relational Mapper (ORM), for a couple of projects, and it supports mapping to spatial data types using the NetTopologySuite library. There was some discussion on Stack Overflow about Dapper’s spatial support, so I thought I would try it out.

The DapperSpatialNetTopologySuite project has a Dapper SqlMapper TypeHandler, a Microsoft SQL Server stored procedure and an ASP.NET Core Minimal API endpoint for each location column handler implementation.

app.MapGet("/Spatial/NearbyGeography", async (double latitude, double longitude, int distance, [FromServices] IDapperContext dapperContext) =>
{
   var origin = new Point(longitude, latitude) { SRID = 4326 };

   using (var connection = dapperContext.ConnectionCreate())
   {
      var results = await connection.QueryWithRetryAsync<Model.ListingNearbyListGeographyDto>("ListingsSpatialNearbyNTS_____", new { origin, distance }, commandType: CommandType.StoredProcedure);

      return results;
   }
})
.Produces<IList<Model.ListingNearbyListGeographyDto>>(StatusCodes.Status200OK)
.Produces<ProblemDetails>(StatusCodes.Status400BadRequest)
.WithOpenApi();

After adding the NetTopologySuite spatial library to the project the schemas list got a lot bigger.

NetTopology Suite additional schemas

My first attempt, inspired by a really old Marc Gravell post and the NetTopologySuite.IO.SqlServerBytes documentation, didn’t work.

CREATE PROCEDURE [dbo].[ListingsSpatialNearbyNTSLocation]
	@Origin AS GEOGRAPHY,
	@distance AS INTEGER
AS
BEGIN
DECLARE @Circle AS GEOGRAPHY = @Origin.STBuffer(@distance); 

SELECT TOP(50) UID AS ListingUID
	,[Name]
	,listing_url as ListingUrl
	,Listing.Location.STDistance(@Origin) as Distance
	,Listing.Location
FROM Listing
WHERE (Listing.Location.STWithin(@Circle) = 1) 
ORDER BY Distance
END

NetTopology Suite Microsoft.SqlServer.Types library load exception

I could see the Dapper SqlMapper TypeHandler getting called for the @Origin parameter but not for the Location column values.

NetTopology Suite @Origin parameter typehandler in Visual Studio 2022 Debugger

Then I found a Brice Lambson post about how to use the NetTopologySuite.IO.SqlServerBytes library to read and write geography and geometry columns.

CREATE PROCEDURE [dbo].[ListingsSpatialNearbyNTSSerialize]
	@Origin AS GEOGRAPHY,
	@distance AS INTEGER
AS
BEGIN
DECLARE @Circle AS GEOGRAPHY = @Origin.STBuffer(@distance); 

SELECT TOP(50) UID AS ListingUID
	,[Name]
	,listing_url as ListingUrl
	,Listing.Location.STDistance(@Origin) as Distance
	,Location.Serialize() as Location
FROM [listing] 
WHERE (Location.STWithin(@Circle) = 1) 
ORDER BY Distance
END

class PointHandlerSerialise : SqlMapper.TypeHandler<Point>
{
   public override Point Parse(object value)
   {
      var reader = new SqlServerBytesReader { IsGeography = true };

      return (Point)reader.Read((byte[])value);
   }

   public override void SetValue(IDbDataParameter parameter, Point? value)
   {
      ((SqlParameter)parameter).SqlDbType = SqlDbType.Udt;  // @Origin parameter?
      ((SqlParameter)parameter).UdtTypeName = "GEOGRAPHY";

      var writer = new SqlServerBytesWriter { IsGeography = true };

      parameter.Value = writer.Write(value);
   }
}

Once the location column serialisation was working (I could see a valid response in the debugger), the generation of the response was failing with a “System.Text.Json.JsonException: A possible object cycle was detected. This can either be due to a cycle or if the object depth is larger than the maximum allowed depth of 64”.

NetTopology Suite serialisation “possible object cycle detection” exception

After fixing that issue the response generation failed with “System.ArgumentException: .NET number values such as positive and negative infinity cannot be written as valid JSON.”

NetTopology Suite serialisation “infinity cannot be written as valid JSON” exception

Fixing these two issues required adjusting two of the HttpJsonOptions serialiser settings:

//...
builder.Services.ConfigureHttpJsonOptions(options =>
{
   options.SerializerOptions.ReferenceHandler = ReferenceHandler.IgnoreCycles;
   options.SerializerOptions.NumberHandling = JsonNumberHandling.AllowNamedFloatingPointLiterals;
});

var app = builder.Build();
//...

Swagger NetTopology Suite location serialisation response
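
To see what the two settings do in isolation, here is a minimal console sketch. The Node class is a hypothetical stand-in for an object graph that refers back to itself and holds a non-finite number; it is not an NTS type, and I haven't traced which NTS properties actually trigger the two exceptions.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical object graph: a node that points at itself and holds an infinite value.
var node = new Node { Value = double.PositiveInfinity };
node.Self = node; // deliberate reference cycle

var options = new JsonSerializerOptions
{
    // Write null in place of a reference to an object that is already being serialised.
    ReferenceHandler = ReferenceHandler.IgnoreCycles,
    // Write NaN and positive/negative infinity as the strings "NaN", "Infinity" and "-Infinity".
    NumberHandling = JsonNumberHandling.AllowNamedFloatingPointLiterals
};

string json = JsonSerializer.Serialize(node, options);
Console.WriteLine(json);

// Illustrative type only, not part of NetTopologySuite.
class Node
{
    public double Value { get; set; }
    public Node? Self { get; set; }
}
```

With the default options the same call throws, first for the cycle and then (once that is handled) for the infinite value, which matches the order in which the two exceptions turned up above.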

After digging into the Dapper source code, I wondered how ADO.NET handled loading the Microsoft.SqlServer.Types library.

app.MapGet("/Listing/Search/Ado", async (double latitude, double longitude, int distance, [FromServices] IDapperContext dapperContext) =>
{
   var origin = new Point(longitude, latitude) { SRID = 4326 };

   using (SqlConnection connection = (SqlConnection)dapperContext.ConnectionCreate())
   {
      await connection.OpenAsync();

      var geographyWriter = new SqlServerBytesWriter { IsGeography = true };

      using (SqlCommand command = connection.CreateCommand())
      {
         command.CommandText = "ListingsSpatialNearbyNTSLocation";
         command.CommandType = CommandType.StoredProcedure;

         var originParameter = command.CreateParameter();
         originParameter.ParameterName = "Origin";
         originParameter.Value = new SqlBytes(geographyWriter.Write(origin));
         originParameter.SqlDbType = SqlDbType.Udt;
         originParameter.UdtTypeName = "GEOGRAPHY";
         command.Parameters.Add(originParameter);

         var distanceParameter = command.CreateParameter();
         distanceParameter.ParameterName = "Distance";
         distanceParameter.Value = distance;
         distanceParameter.DbType = DbType.Int32;
         command.Parameters.Add(distanceParameter);

         var geographyReader = new SqlServerBytesReader { IsGeography = true };

         using (var dbDataReader = await command.ExecuteReaderAsync())
         {
            List<Model.ListingNearbyListGeographyDto> listings = new List<Model.ListingNearbyListGeographyDto>();

            int listingUIDColumn = dbDataReader.GetOrdinal("ListingUID");
            int nameColumn = dbDataReader.GetOrdinal("Name");
            int listingUrlColumn = dbDataReader.GetOrdinal("ListingUrl");
            int distanceColumn = dbDataReader.GetOrdinal("Distance");
            int LocationColumn = dbDataReader.GetOrdinal("Location");

            while (await dbDataReader.ReadAsync())
            {
               listings.Add(new Model.ListingNearbyListGeographyDto
               {
                  ListingUID = dbDataReader.GetGuid(listingUIDColumn),
                  Name = dbDataReader.GetString(nameColumn),
                  ListingUrl = dbDataReader.GetString(listingUrlColumn),
                  Distance = (int)dbDataReader.GetDouble(distanceColumn),
                  Location = (Point)geographyReader.Read(dbDataReader.GetSqlBytes(LocationColumn).Value)
               });
            }

            return listings;
         }
      }
   }
})
.Produces<IList<Model.ListingNearbyListGeographyDto>>(StatusCodes.Status200OK)
.Produces<ProblemDetails>(StatusCodes.Status400BadRequest)
.WithOpenApi();

The ADO.NET implementation worked and didn’t produce any exceptions in the response.

Swagger ADO.Net location serialisation response

In the Visual Studio 2022 debugger I could see the Microsoft.SqlServer.Types load exception, but it wasn’t “bubbling” up to the response generation code.

ADO.Net location serialisation Microsoft.SqlServer.Types load failure

The location columns could also be returned as Open Geospatial Consortium (OGC) Well-Known Binary (WKB) format using the STAsBinary method.

CREATE PROCEDURE [dbo].[ListingsSpatialNearbyNTSWkb]
	@Origin AS GEOGRAPHY,
	@distance AS INTEGER
AS
BEGIN
DECLARE @Circle AS GEOGRAPHY = @Origin.STBuffer(@distance); 

SELECT TOP(50) UID AS ListingUID
	,[Name]
	,listing_url as ListingUrl
	,Listing.Location.STDistance(@Origin) as Distance
	,Location.STAsBinary() as Location
FROM [listing] 
WHERE (Location.STWithin(@Circle) = 1) 
ORDER BY Distance
END

The values were then converted to and from NTS Point values using WKBReader and SqlServerBytesWriter.

SqlMapper.AddTypeHandler(new PointHandlerWkb());
//...
class PointHandlerWkb : SqlMapper.TypeHandler<Point>
{
   public override Point Parse(object value)
   {
      var reader = new WKBReader();

      return (Point)reader.Read((byte[])value);
   }

   public override void SetValue(IDbDataParameter parameter, Point? value)
   {
      ((SqlParameter)parameter).SqlDbType = SqlDbType.Udt;  // @Origin parameter?
      ((SqlParameter)parameter).UdtTypeName = "GEOGRAPHY";

      var geometryWriter = new SqlServerBytesWriter { IsGeography = true };

      parameter.Value = geometryWriter.Write(value);
   }
}

Successful Location processing with WKBReader
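
One thing to watch with this approach is that plain OGC WKB has no SRID field, so a Point parsed from STAsBinary output does not know it is WGS84. The sketch below round-trips a point through WKBWriter and WKBReader and then restores the SRID by hand. It assumes the NetTopologySuite NuGet package is referenced, the coordinates are illustrative only, and setting SRID = 4326 assumes every Location value is WGS84.

```csharp
using System;
using NetTopologySuite.Geometries;
using NetTopologySuite.IO;

// In NTS longitude is X and latitude is Y; the coordinates are illustrative only.
var original = new Point(172.6362, -43.5321) { SRID = 4326 };

// With its default settings WKBWriter emits plain OGC WKB, which carries no SRID.
byte[] wkb = new WKBWriter().Write(original);

var parsed = (Point)new WKBReader().Read(wkb);

// Restore the SRID explicitly (assumption: all values are WGS84).
parsed.SRID = 4326;

Console.WriteLine($"X={parsed.X} Y={parsed.Y} SRID={parsed.SRID}");
```

The same assignment could go in PointHandlerWkb.Parse if downstream code relies on the SRID being set.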

The location columns could also be returned in Open Geospatial Consortium (OGC) Well-Known Text (WKT) format using the STAsText method.

CREATE PROCEDURE [dbo].[ListingsSpatialNearbyNTSWkt]
	@Origin AS GEOGRAPHY,
	@distance AS INTEGER
AS
BEGIN
DECLARE @Circle AS GEOGRAPHY = @Origin.STBuffer(@distance); 

SELECT TOP(50) UID AS ListingUID
	,[Name]
	,listing_url as ListingUrl
	,Listing.Location.STDistance(@Origin) as Distance
	,Location.STAsText() as Location
FROM [listing] 
WHERE (Location.STWithin(@Circle) = 1) 
ORDER BY Distance
END

The values were then converted to and from NTS Point values using WKTReader and SqlServerBytesWriter.

class PointHandlerWkt : SqlMapper.TypeHandler<Point>
{
   public override Point Parse(object value)
   {
      WKTReader wktReader = new WKTReader();

      return (Point)wktReader.Read(value.ToString());
   }

   public override void SetValue(IDbDataParameter parameter, Point? value)
   {
      ((SqlParameter)parameter).SqlDbType = SqlDbType.Udt;  // @Origin parameter?
      ((SqlParameter)parameter).UdtTypeName = "GEOGRAPHY";

      parameter.Value = new SqlServerBytesWriter() { IsGeography = true }.Write(value);
   }
}

Successful Location processing with WKTReader
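
The WKT form also makes the axis order easy to see. geography::Point takes latitude then longitude, but the text returned by STAsText for a geography point lists longitude first, which lines up with the X and Y properties of an NTS Point. A small sketch, assuming the NetTopologySuite NuGet package is referenced and using a made-up coordinate:

```csharp
using System;
using NetTopologySuite.Geometries;
using NetTopologySuite.IO;

// Same shape as STAsText() output for a geography point: longitude first, then latitude.
var point = (Point)new WKTReader().Read("POINT (172.6362 -43.5321)");

double longitude = point.X;
double latitude = point.Y;

Console.WriteLine($"Latitude={latitude} Longitude={longitude}");
```

This is why the endpoints construct the origin as new Point(longitude, latitude) rather than the other way around.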

I have focused on getting the spatial queries to work and will stress/performance test my implementations in a future post. I will also revisit the /Spatial/NearbyGeography method to see if I can get it to work without “Location.Serialize() as Location”.

I downloaded the Microsoft.Data.SqlClient source code and found this in SqlConnection.cs, which doesn’t help…

  // UDT SUPPORT
  private Assembly ResolveTypeAssembly(AssemblyName asmRef, bool throwOnError)
  {
      Debug.Assert(TypeSystemAssemblyVersion != null, "TypeSystemAssembly should be set !");
      if (string.Equals(asmRef.Name, "Microsoft.SqlServer.Types", StringComparison.OrdinalIgnoreCase))
      {
          if (asmRef.Version != TypeSystemAssemblyVersion && SqlClientEventSource.Log.IsTraceEnabled())
          {
              SqlClientEventSource.Log.TryTraceEvent("SqlConnection.ResolveTypeAssembly | SQL CLR type version change: Server sent {0}, client will instantiate {1}", asmRef.Version, TypeSystemAssemblyVersion);
          }
          asmRef.Version = TypeSystemAssemblyVersion;
      }
      try
      {
          return Assembly.Load(asmRef);
      }
      catch (Exception e)
      {
          if (throwOnError || !ADP.IsCatchableExceptionType(e))
          {
              throw;
          }
          else
          {
              return null;
          }
      }
  }

Airbnb Dataset – Microsoft spatial searching

The Inside Airbnb London dataset has the polygons of 33 neighbourhoods, and each of the roughly 87,900 listings has a latitude and longitude. The WebMinimalAPIDapper Spatial project uses only Microsoft SQL Server’s spatial functionality for searching and distance calculations.

Spatial Projections supported by SQL Server

The “magic number” 4326 indicates that the latitude and longitude values are expressed in the World Geodetic System 1984 (WGS84), which is also used by the Global Positioning System (GPS) operated by the United States Space Force.

CREATE TABLE [dbo].[Neighbourhood](
	[id] [bigint] IDENTITY(1,1) NOT NULL,
	[NeighbourhoodUID] [uniqueidentifier] NOT NULL,
	[name] [nvarchar](50) NOT NULL,
	[neighbourhood_url] [nvarchar](100) NOT NULL,
	[boundary] [geography] NOT NULL,
 CONSTRAINT [PK_Neighbourhood] PRIMARY KEY CLUSTERED 
(
	[id] ASC
)
)

-- Then create a spatial index on the GEOGRAPHY column which contains the boundary polygon(s)
CREATE SPATIAL INDEX [ISX_NeighbourhoodBoundary] ON [dbo].[Neighbourhood]
(
	[boundary]
)

I added a GEOGRAPHY column to the Listing table, populated it using the latitude and longitude of each listing, then added a spatial index.

-- Use latitude and longitude to populate Location GEOGRAPHY column
UPDATE listing
SET Listing.Location = geography::Point(latitude, longitude, 4326)

-- Then index Location column after changing to NOT NULL
CREATE SPATIAL INDEX [IXS_ListingByLocation] ON [dbo].[listing]
(
	[Location]
)

The first spatial search uses the latitude and longitude (most probably extracted from image metadata) to get a Listing’s neighbourhood.

Testing listing in Neighbourhood SQL

It uses the STContains method to find the neighbourhood polygon (if there is one) which the listing location is inside.

const string ListingInNeighbourhoodSQL = @"SELECT neighbourhoodUID, name, neighbourhood_url as neighbourhoodUrl FROM Neighbourhood WHERE Neighbourhood.Boundary.STContains(geography::Point(@Latitude, @Longitude, 4326)) = 1";
...

app.MapGet("/Spatial/Neighbourhood", async (double latitude, double longitude, [FromServices] IDapperContext dapperContext) =>
{
   Model.NeighbourhoodSearchDto neighbourhood;

   using (var connection = dapperContext.ConnectionCreate())
   {
      neighbourhood = await connection.QuerySingleOrDefaultWithRetryAsync<Model.NeighbourhoodSearchDto>(ListingInNeighbourhoodSQL, new { latitude, longitude });
   }

   if (neighbourhood is null)
   {
      return Results.Problem($"Neighbourhood for Latitude:{latitude} Longitude:{longitude} not found", statusCode: StatusCodes.Status404NotFound);
   }

   return Results.Ok(neighbourhood);
})
.Produces<Model.NeighbourhoodSearchDto>(StatusCodes.Status200OK)
.Produces<ProblemDetails>(StatusCodes.Status404NotFound)
.WithOpenApi();

The case sensitivity of the OGC geography methods tripped me up a few times.

Testing listing Neighbourhood lookup with Swagger user interface

In a future blog post I will compare the performance of STContains vs. STWithin with a load testing application.
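
The two predicates are mirror images of each other: a polygon contains a point exactly when the point is within the polygon. The sketch below shows that relationship with NetTopologySuite rather than SQL Server, so it is only an illustration. NTS works on a flat plane whereas the geography type is geodetic, and the rectangle and coordinates are made up. It assumes the NetTopologySuite NuGet package is referenced.

```csharp
using System;
using NetTopologySuite.Geometries;

var factory = new GeometryFactory(new PrecisionModel(), 4326);

// A made-up rectangular "neighbourhood"; the ring must be closed (first == last coordinate).
var boundary = factory.CreatePolygon(new[]
{
    new Coordinate(-0.20, 51.45),
    new Coordinate(-0.05, 51.45),
    new Coordinate(-0.05, 51.55),
    new Coordinate(-0.20, 51.55),
    new Coordinate(-0.20, 51.45)
});

// A made-up listing location inside the rectangle (X = longitude, Y = latitude).
var listing = factory.CreatePoint(new Coordinate(-0.1276, 51.5072));

bool contains = boundary.Contains(listing); // asked from the polygon's side
bool within = listing.Within(boundary);     // asked from the point's side

Console.WriteLine($"Contains={contains} Within={within}");
```

Both calls answer the same question, so any performance difference in SQL Server would come from how each form uses the spatial index rather than from the result.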

Testing listings near a location SQL

The second search simulates a customer looking for listing(s) within a specified distance of a point of interest.

const string ListingsNearbySQL = @"DECLARE @Origin AS GEOGRAPHY = geography::Point(@Latitude, @Longitude, 4326); 
                                  DECLARE @Circle AS GEOGRAPHY = @Origin.STBuffer(@distance); 
                                  --DECLARE @Circle AS GEOGRAPHY = @Origin.BufferWithTolerance(@distance, 0.09,true); 
                                  SELECT uid as ListingUID, Name, listing_url as ListingUrl, 
                                  @Origin.STDistance(Listing.Location) as Distance 
                                  FROM [listing] 
                                  WHERE Listing.Location.STWithin(@Circle) = 1 ORDER BY Distance";
...
app.MapGet("/Spatial/NearbyText", async (double latitude, double longitude, double distance, [FromServices] IDapperContext dapperContext) =>
{
   using (var connection = dapperContext.ConnectionCreate())
   {
      return await connection.QueryWithRetryAsync<Model.ListingNearbyListDto>(ListingsNearbySQL, new { latitude, longitude, distance });
   }
})
.Produces<IList<Model.ListingNearbyListDto>>(StatusCodes.Status200OK)
.WithOpenApi();

The STBuffer method returns a geography object that represents a circle centred on @Origin with a radius of @distance (in metres for SRID 4326).
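
The buffer-then-STWithin filter amounts to "keep everything no further than the given distance from the origin, nearest first". Here is a rough client-side sketch of that idea using the haversine formula. It is not what SQL Server does internally: the Earth is treated as a sphere rather than the WGS84 ellipsoid, so distances will differ slightly, and the sample listings and the HaversineMetres helper are hypothetical.

```csharp
using System;
using System.Linq;

// Mean Earth radius in metres: a spherical approximation, not the WGS84 ellipsoid.
const double EarthRadiusMetres = 6371008.8;

double ToRadians(double degrees) => degrees * Math.PI / 180.0;

// Great-circle distance in metres between two latitude/longitude pairs (haversine formula).
double HaversineMetres(double lat1, double lon1, double lat2, double lon2)
{
    double dLat = ToRadians(lat2 - lat1);
    double dLon = ToRadians(lon2 - lon1);
    double a = Math.Pow(Math.Sin(dLat / 2), 2)
             + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2)) * Math.Pow(Math.Sin(dLon / 2), 2);
    return 2 * EarthRadiusMetres * Math.Asin(Math.Sqrt(a));
}

// Hypothetical sample data.
var origin = (Latitude: 51.5007, Longitude: -0.1246);
var listings = new[]
{
    (Name: "Near", Latitude: 51.5014, Longitude: -0.1419),
    (Name: "Far", Latitude: 51.4545, Longitude: -2.5879)
};
double distance = 2000; // metres

var nearby = listings
    .Select(l => (l.Name, Distance: HaversineMetres(origin.Latitude, origin.Longitude, l.Latitude, l.Longitude)))
    .Where(l => l.Distance <= distance)
    .OrderBy(l => l.Distance)
    .ToList();

foreach (var l in nearby)
{
    Console.WriteLine($"{l.Name}: {l.Distance:F0} m");
}
```

Doing the filter in the database is still the better choice for roughly 87,900 listings because the spatial index avoids calculating a distance for every row.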

Testing listings near a location with Swagger user interface

The third and final search simulates a customer looking for listing(s) within a specified distance of a point of interest with the latitude and longitude of the listing included in the results.

Testing listings near a location SQL with latitude & Longitude

const string ListingsNearbyLatitudeLongitudeSQL = @"DECLARE @Location AS GEOGRAPHY = geography::Point(@Latitude, @longitude,4326)
                                 DECLARE @Circle AS GEOGRAPHY = @Location.STBuffer(@distance);
                                 SELECT UID as ListingUID
	                              ,[Name]
	                              ,listing_url as ListingUrl
	                              ,Listing.Location.STDistance(@Location) as Distance
	                              ,latitude
                                 ,longitude
                                 FROM [listing]
                                 WHERE Listing.Location.STWithin(@Circle) = 1
                                 ORDER BY Distance";

app.MapGet("/Spatial/NearbyLatitudeLongitude", async (double latitude, double longitude, double distance, [FromServices] IDapperContext dapperContext) =>
{
   using (var connection = dapperContext.ConnectionCreate())
   {
      return await connection.QueryWithRetryAsync<Model.ListingNearbyListLatitudeLongitudeDto>(ListingsNearbyLatitudeLongitudeSQL, new { latitude, longitude, distance });
   }
})
.Produces<IList<Model.ListingNearbyListLatitudeLongitudeDto>>(StatusCodes.Status200OK)
.WithOpenApi();

Testing listings near a location with latitude & Longitude with Swagger user interface

The next couple of posts will use the third-party libraries Geo and NetTopologySuite.

floor, ceil, trunc and casting

I left a Wisnode Track Lite RAK7200 outside on the deck for a day and the way the positions “snapped” to a grid caught my attention. Based on the size of my property the grid looked to be roughly 10 x 10 meters.

The sample Low Power Payload Mbed C code uses a cast, which I think is the same as a floor (strictly, a C cast truncates towards zero, so the two only match for non-negative values).


“These functions round x downwards to the nearest integer, returning that value as a double. Thus, floor (1.5) is 1.0 and floor (-1.5) is -2.0.”

In the C code the latitude and longitude values are truncated to four decimal places and the altitude to two decimal places. In my C# code I used Math.Round and I wondered what impact that could have…

public void GpsLocationAdd(byte channel, float latitude, float longitude, float altitude)
{
   IsChannelNumberValid(channel);
   IsBfferSizeSufficient(Enumerations.DataType.Gps);

   if ((latitude < Constants.LatitudeMinimum ) || (latitude > Constants.LatitudeMaximum))
   {
      throw new ArgumentException($"Latitude must be between {Constants.LatitudeMinimum} and {Constants.LatitudeMaximum}", "latitude");
   }

   if ((longitude < Constants.LongitudeMinimum) || (longitude > Constants.LongitudeMaximum))
   {
      throw new ArgumentException($"Longitude must be between {Constants.LongitudeMinimum} and {Constants.LongitudeMaximum}", "longitude");
   }

   if ((altitude < Constants.AltitudeMinimum) || (altitude > Constants.AltitudeMaximum))
   {
      throw new ArgumentException($"Altitude must be between {Constants.AltitudeMinimum} and {Constants.AltitudeMaximum}", "altitude");
   }

   int lat = (int)Math.Round(latitude * 10000.0f);
   int lon = (int)Math.Round(longitude * 10000.0f);
   int alt = (int)Math.Round(altitude * 100.0f);

   buffer[index++] = channel;
   buffer[index++] = (byte)Enumerations.DataType.Gps;

   buffer[index++] = (byte)(lat >> 16);
   buffer[index++] = (byte)(lat >> 8);
   buffer[index++] = (byte)lat;
   buffer[index++] = (byte)(lon >> 16);
   buffer[index++] = (byte)(lon >> 8);
   buffer[index++] = (byte)lon;
   buffer[index++] = (byte)(alt >> 16);
   buffer[index++] = (byte)(alt >> 8);
   buffer[index++] = (byte)alt;
}
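
To compare the options side by side, here is a small sketch that scales a coordinate by 10,000 and converts it to an integer four different ways. The Quantise helper and the sample coordinates are hypothetical, and doubles are used instead of floats to keep the arithmetic easy to follow.

```csharp
using System;

// Scales degrees to 0.0001 degree units and converts to an integer four ways.
(int Cast, int Floor, int Ceiling, int Round) Quantise(double degrees)
{
    double scaled = degrees * 10000.0;

    return (
        (int)scaled,               // like the C cast: truncates towards zero
        (int)Math.Floor(scaled),   // rounds towards negative infinity
        (int)Math.Ceiling(scaled), // rounds towards positive infinity
        (int)Math.Round(scaled));  // nearest integer (exact ties go to the even number)
}

// Illustrative coordinates only: one positive, one negative.
foreach (double degrees in new[] { 51.50079, -43.53216 })
{
    var q = Quantise(degrees);
    Console.WriteLine($"{degrees}: cast={q.Cast} floor={q.Floor} ceiling={q.Ceiling} round={q.Round}");
}
```

For the positive value the cast and the floor agree (515007) while Math.Round gives 515008. For the negative value the cast gives -435321 but the floor and Math.Round both give -435322. Truncating can be out by almost a full 0.0001 degree step, whereas rounding to nearest is out by at most half a step.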

Using the WGS84 World Geodetic System Distance Calculator at the point where the Greenwich Meridian and the Equator cross (off the coast of Ghana), the theoretical maximum error is 15.69 m.

I live in Christchurch, New Zealand, where the theoretical maximum error is 13.6 m. So, in summary, the LPP latitude and longitude values are most probably fine for tracking applications.
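
Those figures can be roughly reproduced by working out the size of one 0.0001 degree grid cell and taking its diagonal, which I'm assuming is how the worst case was measured. The sketch below treats the Earth as a sphere, so it will not match a WGS84 calculator exactly; the GridCellMetres helper and the -43.53 latitude used for Christchurch are my own approximations.

```csharp
using System;

// Mean Earth radius in metres: a spherical approximation, not the WGS84 ellipsoid.
const double EarthRadiusMetres = 6371008.8;
const double StepDegrees = 0.0001; // LPP latitude/longitude resolution

// North-south size, east-west size and diagonal (all in metres) of one grid cell at a latitude.
(double NorthSouth, double EastWest, double Diagonal) GridCellMetres(double latitudeDegrees)
{
    double metresPerDegree = EarthRadiusMetres * Math.PI / 180.0;
    double northSouth = StepDegrees * metresPerDegree;
    // Lines of longitude converge towards the poles, so the cell narrows by cos(latitude).
    double eastWest = northSouth * Math.Cos(latitudeDegrees * Math.PI / 180.0);

    return (northSouth, eastWest, Math.Sqrt(northSouth * northSouth + eastWest * eastWest));
}

// The Equator and (approximately) Christchurch.
foreach (double latitude in new[] { 0.0, -43.53 })
{
    var cell = GridCellMetres(latitude);
    Console.WriteLine($"{latitude}: {cell.NorthSouth:F2} x {cell.EastWest:F2} m, diagonal {cell.Diagonal:F2} m");
}
```

The diagonals come out a little under 16 m at the Equator and a little under 14 m at Christchurch, and the cell is roughly 11 m north to south, which fits the grid I saw on the deck. The diagonal is the worst case for truncation; with Math.Round the worst case is half of it.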