Open datasets with quality, popularity, and Authors’ Interest (AI) measures for over 39 million Wikipedia articles are available online! The datasets are released under a Creative Commons license.
This dataset includes a list of over 39 million Wikipedia articles in 55 languages with quality scores by WikiRank (https://wikirank.net). Quality scores are based on Wikipedia dumps from May 2019. Popularity and Authors’ Interest are based on activity in April 2019. Each record contains the following fields:
- page_id — The identifier of the Wikipedia article (int), e.g. 4519301
- page_name — The title of the Wikipedia article (utf-8), e.g. General relativity
- wikirank_quality — quality score of the Wikipedia article on a scale of 0-100 (as of May 1, 2019)
- popularity — median of the daily number of page views of the Wikipedia article during April 2019
- authors_interest — number of authors who edited the Wikipedia article during April 2019
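The field list above can be turned into typed records with a few lines of Python. This is only a minimal sketch under stated assumptions: it assumes each file is comma-separated with a header row, and that the header uses the field names listed above, with the page-view column spelled `popularity`. The description does not specify either, so check the actual files and adjust. The `ArticleScore` type, the `read_scores` function, and the two sample rows are hypothetical and exist only for illustration; the sample values are placeholders, not real WikiRank scores.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class ArticleScore(NamedTuple):
    """One record, mirroring the fields described above (hypothetical type)."""
    page_id: int
    page_name: str
    wikirank_quality: float  # 0-100 scale
    popularity: float        # median of daily page views, so it may be fractional
    authors_interest: int    # number of authors in the month


def read_scores(stream: TextIO) -> Iterator[ArticleScore]:
    """Yield one typed record per data row.

    Assumption: comma-separated text with a header row whose column
    names match the field list. Adjust the keys if the real header differs.
    """
    for row in csv.DictReader(stream):
        yield ArticleScore(
            page_id=int(row["page_id"]),
            page_name=row["page_name"],
            wikirank_quality=float(row["wikirank_quality"]),
            popularity=float(row["popularity"]),
            authors_interest=int(row["authors_interest"]),
        )


# Placeholder rows for demonstration only; these are NOT real dataset values.
SAMPLE = (
    "page_id,page_name,wikirank_quality,popularity,authors_interest\n"
    "1,Example article A,72.5,130.5,4\n"
    "2,Example article B,18.0,3,0\n"
)

records = list(read_scores(io.StringIO(SAMPLE)))

# Pick the record with the highest quality score.
top = max(records, key=lambda r: r.wikirank_quality)
print(top.page_name, top.wikirank_quality)
```

For a real file, the same function can be given an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points to a downloaded file. Because `read_scores` is a generator, rows are processed one at a time, which helps with files of this size.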
The datasets are available at: https://doi.org/10.6084/m9.figshare.8231273