Airbnb Dataset – Calendar information

As part of some scale testing of my WebAPIDapper and WebMinimalAPIDapper i have been “cleaning up” a portion of the Inside Airbnb London dataset. To make the scale testing results more realistic I wanted at least one table with lots of rows.

CREATE TABLE [dbo].[CalendarRawDetailed](
	[listing_id] [bigint] NOT NULL,
	[Date] [date] NOT NULL,
	[Xavailable] [bit] NULL,
	[available] [nvarchar](5) NOT NULL,
	[Xprice] [money] NULL,
	[price] [nvarchar](30) NOT NULL,
	[Xadjusted_price] [money] NULL,
	[adjusted_price] [nvarchar](30) NOT NULL,
	[Xminimum_nights] [smallint] NULL,
	[minimum_nights] [nvarchar](30) NOT NULL,
	[Xmaximum_nights] [smallint] NULL,
	[maximum_nights] [nvarchar](30) NOT NULL
) ON [PRIMARY]

The CalendarRawDetailed had some invalid values which were most probably due formatting inconsistencies on the AirBnb website

SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xminimum_nights IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xminimum_nights IS NULL

SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL

SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xadjusted_price IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xadjusted_price IS NULL

SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xprice IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xprice IS NULL

Where possible I recovered the values with an “incorrect” format, but some rows had to be deleted.

UPDATE CalendarRawDetailed SET Xmaximum_nights =  TRY_CONVERT(smallint,  RTRIM(maximum_nights, '"')) WHERE Xmaximum_nights IS NULL

UPDATE CalendarRawDetailed SET XmINimum_nights =  TRY_CONVERT(smallint,  RTRIM(mINimum_nights, '"')) WHERE Xminimum_nights IS NULL

UPDATE CalendarRawDetailed SET Xadjusted_price =  TRY_CONVERT(money,  LTRIM(adjusted_price, '$')) --WHERE Xmaximum_nights IS NULL

SELECT *
FROM CalendarRawDetailed 
WHERE Xadjusted_price IS NULL 
DELETE FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL 

UPDATE CalendarRawDetailed set Xavailable = 1 where available = 't'

The Calendar table has 365 rows for each listing, and I will update Calendar dates, so they are in the “future”.

CREATE TABLE [dbo].[Calendar](
   [listing_id] [bigint] NOT NULL,
   [date] [date] NOT NULL,
   [available] [bit] NOT NULL,
   [price] [money] NOT NULL,
   [adjusted_price] [money] NOT NULL,
   [minimum_nights] [smallint] NOT NULL,
   [maximum_nights] [smallint] NOT NULL
) ON [PRIMARY]

The Calendar table as approximately 31 million rows which should be plenty for my scale testing.