The INE has begun extracting data from Airbnb, Booking and Vrbo to find out the real number of tourist apartments in Spain

Over the past year, the National Institute of Statistics (INE) has been using 'web scraping' techniques (that is, automated extraction of data from the visible content of web pages) to download information that would allow it to estimate the number of tourist homes in our country.

Since the end of 2019, this public body has been running 'experimental statistics' projects to incorporate new data sources, a necessary step given the traditional refusal of the large tourist-housing platforms to share information.

In fact, Airbnb itself stated the following a month ago in its IPO prospectus:

"If a new regulation forces us to share host data with a city, revenues will drop because some hosts will not want that and will leave the platform."

Until now, it was other kinds of organizations that resorted to web scraping to collect data on tourist flats: companies such as AirDNA or DataHippo, or open projects, which are therefore not always geographically exhaustive.

But now, the INE technical project (PDF) highlights the need for the public administration to have its own up-to-date information in this field, in order to better analyze the impact of tourist housing and optimize regulation.

To that end, web scraping has been applied to the Airbnb, Vrbo and Booking websites. The process was made easier by the fact that the three sites work in a very similar way, each having "a search engine with the following fields to complete":

  • Destination / Name of the accommodation.

  • Arrival and departure date.

  • Number of guests.
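
Because all three search engines take the same three inputs, a single query structure can in principle be reused for each site. The following is a minimal sketch of that idea, not the INE's actual code: the `SearchQuery` class, the `build_search_url` helper, the parameter names and the `example.com` address are all hypothetical, and each real platform uses its own URL scheme.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlencode


@dataclass(frozen=True)
class SearchQuery:
    """The three fields shared by the search engines (hypothetical model)."""
    destination: str
    arrival: date
    departure: date
    guests: int

    def __post_init__(self):
        # Reject queries that no search form would accept.
        if self.departure <= self.arrival:
            raise ValueError("departure must be after arrival")
        if self.guests < 1:
            raise ValueError("guests must be at least 1")


def build_search_url(base_url: str, query: SearchQuery) -> str:
    # Parameter names are assumptions; every real site names them differently.
    params = {
        "destination": query.destination,
        "arrival": query.arrival.isoformat(),
        "departure": query.departure.isoformat(),
        "guests": query.guests,
    }
    return f"{base_url}?{urlencode(params)}"


query = SearchQuery("Madrid", date(2021, 7, 1), date(2021, 7, 8), 2)
url = build_search_url("https://example.com/search", query)
print(url)
```

Only the base address and the parameter names would need to change per platform; the query itself stays the same. Any real crawler would also have to respect each site's terms of use and rate limits.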

This has allowed the INE to download a range of variables, categorized by zone, that capture the basic information for every accommodation listed in each zone:

"During this year, a complete download was made for each of the platforms, obtaining more than 100,000 accommodations for all of them."
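
As a rough illustration of how downloaded listings might be tallied by zone, here is a small sketch. It is an assumption, not the INE's method: the rows are invented toy data, and the field names `platform`, `listing_id` and `zone` are hypothetical.

```python
from collections import Counter

# Invented toy rows; field names are assumptions, not the INE's variables.
records = [
    {"platform": "Airbnb", "listing_id": "a1", "zone": "Madrid"},
    {"platform": "Airbnb", "listing_id": "a1", "zone": "Madrid"},  # repeat from a second search
    {"platform": "Booking", "listing_id": "b7", "zone": "Madrid"},
    {"platform": "Vrbo", "listing_id": "v3", "zone": "Valencia"},
]


def count_by_zone(rows):
    # A listing is identified by (platform, listing_id), so repeats count once.
    unique = {(r["platform"], r["listing_id"]): r["zone"] for r in rows}
    return Counter(unique.values())


counts = count_by_zone(records)
print(counts)
```

This only removes repeats within a platform. The same dwelling advertised on two platforms would still be counted twice, and matching listings across sites is a separate problem that the sketch does not address.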
