People are interested in predicting the future. For example, which films will bomb, or who will win the upcoming Grammy awards? Making predictions is not only fun but can also bring real value to those who correctly predict the course of world events, such as which stocks are the best purchases for short-term gains. Predictive analytics is thus a field that has attracted major attention in both academia and industry.
As social media has become an inseparable part of modern life, there has been increasing research interest in leveraging social media as an information source for inferring rich social facts and knowledge. A large number of social media datasets have already been established for various research tasks, and they have helped drive great advancements in social media technology and applications.
Therefore, as a joint activity with research teams from the Chinese Academy of Sciences (CAS), Academia Sinica (AS), and Microsoft Research Asia (MSRA), we are releasing a large-scale social media dataset for sociological understanding and prediction, namely the Social Media Prediction (SMP) dataset, with over 770K posts and 80K users in total. Our goal is to make the SMP dataset as varied and rich as possible to thoroughly represent the social media “world”.