As part of some scale testing of my WebAPIDapper and WebMinimalAPIDapper i have been “cleaning up” a portion of the Inside Airbnb London dataset. To make the scale testing results more realistic I wanted at least one table with lots of rows.
CREATE TABLE [dbo].[CalendarRawDetailed](
[listing_id] [bigint] NOT NULL,
[Date] [date] NOT NULL,
[Xavailable] [bit] NULL,
[available] [nvarchar](5) NOT NULL,
[Xprice] [money] NULL,
[price] [nvarchar](30) NOT NULL,
[Xadjusted_price] [money] NULL,
[adjusted_price] [nvarchar](30) NOT NULL,
[Xminimum_nights] [smallint] NULL,
[minimum_nights] [nvarchar](30) NOT NULL,
[Xmaximum_nights] [smallint] NULL,
[maximum_nights] [nvarchar](30) NOT NULL
) ON [PRIMARY]
The CalendarRawDetailed had some invalid values which were most probably due formatting inconsistencies on the AirBnb website
SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xminimum_nights IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xminimum_nights IS NULL
SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL
SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xadjusted_price IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xadjusted_price IS NULL
SELECT COUNT(*) FROM CalendarRawDetailed WHERE Xprice IS NULL
SELECT * FROM CalendarRawDetailed WHERE Xprice IS NULL
Where possible I recovered the values with an “incorrect” format, but some rows had to be deleted.
UPDATE CalendarRawDetailed SET Xmaximum_nights = TRY_CONVERT(smallint, RTRIM(maximum_nights, '"')) WHERE Xmaximum_nights IS NULL
UPDATE CalendarRawDetailed SET XmINimum_nights = TRY_CONVERT(smallint, RTRIM(mINimum_nights, '"')) WHERE Xminimum_nights IS NULL
UPDATE CalendarRawDetailed SET Xadjusted_price = TRY_CONVERT(money, LTRIM(adjusted_price, '$')) --WHERE Xmaximum_nights IS NULL
SELECT *
FROM CalendarRawDetailed
WHERE Xadjusted_price IS NULL
DELETE FROM CalendarRawDetailed WHERE Xmaximum_nights IS NULL
UPDATE CalendarRawDetailed set Xavailable = 1 where available = 't'
The Calendar table has 365 rows for each listing, and I will update Calendar dates, so they are in the “future”.
CREATE TABLE [dbo].[Calendar](
[listing_id] [bigint] NOT NULL,
[date] [date] NOT NULL,
[available] [bit] NOT NULL,
[price] [money] NOT NULL,
[adjusted_price] [money] NOT NULL,
[minimum_nights] [smallint] NOT NULL,
[maximum_nights] [smallint] NOT NULL
) ON [PRIMARY]
The Calendar table as approximately 31 million rows which should be plenty for my scale testing.