Schema

OpenSiteRec is an open dataset for Site Recommendation.
It consists of data from four international metropolises: Chicago, New York City, Singapore and Tokyo.
There are five types of entities: Brand, Category, POI, Business Area and Region.
The relations among these entities capture both commercial and geographical information.
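
As a rough illustration of how the five entity types and their relations could be held in memory, here is a minimal Python sketch. The class names, the `neighbors` helper, the relation labels (`belongs_to`, `located_in`) and the example entities are all hypothetical placeholders; they are not taken from the released files, whose actual format may differ.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """The five entity types named in the schema description."""
    BRAND = "Brand"
    CATEGORY = "Category"
    POI = "POI"
    BUSINESS_AREA = "Business Area"
    REGION = "Region"


@dataclass(frozen=True)
class Entity:
    type: EntityType
    name: str


@dataclass(frozen=True)
class Relation:
    head: Entity
    label: str  # hypothetical label, not a name from the dataset
    tail: Entity


def neighbors(relations, entity):
    """Return the tail entities directly linked from `entity`."""
    return [r.tail for r in relations if r.head == entity]


# Placeholder entities for illustration only.
poi = Entity(EntityType.POI, "example_poi")
brand = Entity(EntityType.BRAND, "example_brand")
region = Entity(EntityType.REGION, "example_region")

relations = [
    Relation(poi, "belongs_to", brand),
    Relation(poi, "located_in", region),
]

print([e.name for e in neighbors(relations, poi)])
```

Storing relations as (head, label, tail) triples keeps the commercial links (POI to Brand or Category) and the geographical links (POI to Business Area or Region) in one uniform structure.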

Statistics & Usage

Statistics

Chicago, New York City, Singapore and Tokyo

Data statistics of OpenSiteRec.

Annotation

Chicago, New York City, Singapore and Tokyo

Annotations of OpenSiteRec.

Brand Category

Chicago, New York City, Singapore and Tokyo

The distributions of brand category are different across cities.

Brand Site Number

Chicago, New York City, Singapore and Tokyo

The long-tail problem is severe in OpenSiteRec.
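
One way to quantify a long-tail distribution like this is to measure how many sites belong to the largest brands. The sketch below is an assumption about how such a check could be written, not the authors' analysis; the function names and the toy brand list are invented for illustration.

```python
from collections import Counter


def site_counts(brand_per_site):
    """Count how many sites each brand has, given one brand label per site."""
    return Counter(brand_per_site)


def head_share(counts, top_fraction=0.2):
    """Fraction of all sites owned by the top `top_fraction` of brands."""
    sizes = sorted(counts.values(), reverse=True)
    if not sizes:
        return 0.0
    k = max(1, int(len(sizes) * top_fraction))  # always keep at least one brand
    return sum(sizes[:k]) / sum(sizes)


# Toy data: one brand with six sites, one with two, three with a single site.
toy = ["a"] * 6 + ["b"] * 2 + ["c", "d", "e"]
counts = site_counts(toy)
share = head_share(counts, top_fraction=0.2)
print(f"top 20% of brands hold {share:.0%} of sites")
```

A share far above `top_fraction` indicates that a few brands dominate while most brands have very few sites.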

Chicago

Population: 2.7 million | Area: 590 km²

Chicago is the most populous city in the U.S. state of Illinois and the third most populous in the United States.

New York City

Population: 8.8 million | Area: 778 km²

New York City, or NYC, is the most populous city in the United States.

Singapore

Population: 5.6 million | Area: 733 km²

Singapore is a sovereign island country and city-state in maritime Southeast Asia.

Tokyo

Population: 9.7 million | Area: 620 km²

Tokyo, officially the Tokyo Metropolis, is the capital and most populous city of Japan.

Download

Benchmark

We benchmark 16 representative baselines on OpenSiteRec, grouped into four families.
For Machine Learning methods, we have LR, GBDT, SVR and RankNet.
For Collaborative Filtering methods, we have MF-BPR, NeuMF, FISM and NAIS.
For CTR Prediction methods, we have DNN, Wide&Deep, DeepFM and xDeepFM.
For Graph-based methods, we have GC-MC, GraphRec, NGCF and LightGCN.
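
When scripting over benchmark results, it can help to keep the baseline families in a single mapping. The snippet below is only a convenience sketch: the variable name `BASELINES` is an assumption, and the method names are copied from the list above rather than from the repository's code.

```python
# Baseline names copied from the benchmark description; the grouping
# variable itself is a hypothetical helper, not part of the repository.
BASELINES = {
    "Machine Learning": ["LR", "GBDT", "SVR", "RankNet"],
    "Collaborative Filtering": ["MF-BPR", "NeuMF", "FISM", "NAIS"],
    "CTR Prediction": ["DNN", "Wide&Deep", "DeepFM", "xDeepFM"],
    "Graph-based": ["GC-MC", "GraphRec", "NGCF", "LightGCN"],
}

total = sum(len(methods) for methods in BASELINES.values())
print(f"{len(BASELINES)} families, {total} baselines")
```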

The implementation code is available in the OpenSiteRec GitHub repository.

Amazing Team

WEST Lab

Department of Computer Science and Technology, Tsinghua University

DMAL

School of Computer Science and Engineering, Nanyang Technological University

AML Lab

School of Data Science, City University of Hong Kong

FIL Lab

Department of Electronic Engineering, Tsinghua University