Automatic Quality Assessment of Wikipedia Articles -- A Systematic Literature Review (2310.02235v1)
Published 3 Oct 2023 in cs.CL, cs.AI, cs.CY, cs.LG, and cs.SI
Abstract: Wikipedia is the world's largest online encyclopedia, but maintaining article quality through collaboration is challenging. Wikipedia designed a quality scale, but because assessment is manual, many articles remain unassessed. We review existing methods for automatically measuring the quality of Wikipedia articles. Examining 149 distinct studies, we identify and compare their machine learning algorithms, article features, quality metrics, and datasets, and explore commonalities and gaps among them. The literature is extensive, and the approaches follow past technological trends. However, machine learning is still not widely used by Wikipedia, and we hope that our analysis helps future researchers change that reality.
- Klaus Stein and Claudia Hess. 2007. Does it matter who contributes: a study on featured articles in the german wikipedia. In HT ’07: Conference on Hypertext and Hypermedia. Association for Computing Machinery, New York City, United States, 171–174. https://dl.acm.org/doi/10.1145/1286240.1286290
- Issues of cross-contextual information quality evaluation—The case of Arabic, English, and Korean Wikipedias. Library & Information Science Research 31 (2009), 232–239. Issue 4. https://www.sciencedirect.com/science/article/pii/S0740818809000954
- A framework for information quality assessment. Journal of the Association for Information Science and Technology 58 (2007), 1720–1733. Issue 12. https://onlinelibrary.wiley.com/doi/10.1002/asi.20652
- Information quality discussions in wikipedia. In ICKM ’05: International Conference on Knowledge Management. Universiti Putra Malaysia, Seri Kembangan, Malaysia. https://www.researchgate.net/publication/200773232_Information_Quality_Discussions_in_Wikipedia
- Assessing information quality of a community-based encyclopedia. In ICIQ ’05: International Conference on Information Quality. Massachusetts Institute of Technology, Cambridge, Massachusetts, 442–454. https://www.semanticscholar.org/paper/Assessing-Information-Quality-of-a-Community-Based-Stvilia-Twidale/dd888dddccc2075a44f99ec2380fda652040afaf
- Qi Su and Pengyuan Liu. 2015. A Psycho-Lexical Approach to the Assessment of Information Quality on Wikipedia. In WI-IAT ’15: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. Institute of Electrical and Electronic Engineers, New York City, United States, 184–187. https://ieeexplore.ieee.org/document/7397452
- Chinthani Sugandhika and Supunmali Ahangama. 2022. Assessing Information Quality of Wikipedia Articles through Google’s E-A-T Model. IEEE Access 10 (2022), 52196–52209. https://ieeexplore.ieee.org/document/9770051
- Modelling Wikipedia’s Information Quality using Informativeness, Reliability and Authority. In ICAC ’21: International Conference on Advancements in Computing. Institute of Electrical and Electronic Engineers, New York City, United States, 169–174. https://ieeexplore.ieee.org/document/9671092
- Yu Suzuki. 2012. Assessing Quality Values of Wikipedia Articles Using Implicit Positive and Negative Ratings. In WAIM ’12: International Conference on Web-Age Information Management. Springer, Berlin, Heidelberg, 127–138. https://link.springer.com/chapter/10.1007/978-3-642-32281-5_13
- Yu Suzuki. 2013. Effects of Implicit Positive Ratings for Quality Assessment of Wikipedia Articles. Journal of Information Processing 21 (2013), 342–348. Issue 2. https://www.jstage.jst.go.jp/article/ipsjjip/21/2/21_342/_article
- Yu Suzuki. 2015. Quality Assessment of Wikipedia Articles Using h-index. Journal of Information Processing 23 (2015), 22–30. Issue 1. https://www.jstage.jst.go.jp/article/ipsjjip/23/1/23_22/_article
- Yu Suzuki and Masatoshi Yoshikawa. 2012a. Mutual evaluation of editors and texts for assessing quality of Wikipedia articles. In WikiSym ’12: International Symposium on Wikis and Open Collaboration. Association for Computing Machinery, New York City, United States, 1–10. https://dl.acm.org/doi/10.1145/2462932.2462956
- Yu Suzuki and Masatoshi Yoshikawa. 2012b. QualityRank: assessing quality of wikipedia articles by mutually evaluating editors and texts. In HT ’12: ACM Conference on Hypertext & Social Media. Association for Computing Machinery, New York City, United States, 307–308. https://dl.acm.org/doi/10.1145/2309996.2310047
- Yu Suzuki and Masatoshi Yoshikawa. 2013. Assessing quality score of Wikipedia article using mutual evaluation of editors and texts. In CIKM ’13: ACM International Conference on Information & Knowledge Management. Association for Computing Machinery, New York City, United States, 1722–1732. https://dl.acm.org/doi/10.1145/2505515.2505610
- Diversity of editors and teams versus quality of cooperative work: experiments on wikipedia. Journal of Intelligent Information Systems 48 (2017), 601–632. https://link.springer.com/article/10.1007/s10844-016-0428-1
- Diego Sáez-Trumper. 2021. Disinformation and AI: The Differences Between Wikipedia and Social Media. https://diff.wikimedia.org/2021/09/15/disinformation-and-ai-the-differences-between-wikipedia-and-social-media/. Accessed: 2023-04-04.
- Nathan Teblunthuis. 2021. Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression. In Proceedings of the 17th International Symposium on Open Collaboration (Online, Spain) (OpenSym ’21). Association for Computing Machinery, New York, NY, USA, Article 5, 10 pages. https://doi.org/10.1145/3479986.3479991
- Michail Tsikerdekis. 2017. Cumulative Experience and Recent Behavior and their Relation to Content Quality on Wikipedia. Interacting with Computers 29 (2017), 737–754. Issue 5. https://academic.oup.com/iwc/article/29/5/737/3885842
- On the Assessment of Information Quality in Spanish Wikipedia. In CACIC ’19: Argentine Congress of Computer Science. National University of La Plata, La Plata, Argentina, 702–711. http://sedici.unlp.edu.ar/handle/10915/56750
- Srikar Velichety. 2019. Quality Assessment of Peer-Produced Content in Knowledge Repositories Using Big Data and Social Networks: The Case of Implicit Collaboration in Wikipedia. ACM SIGMIS Database: The DATABASE for Advances in Information Systems 50 (2019), 28–51. Issue 4. https://dl.acm.org/doi/10.1145/3371041.3371045
- Quality Assessment of Peer-Produced Content in Knowledge Repositories using Development and Coordination Activities. Journal of Management Information Systems 36 (2019), 478–512. Issue 2. https://www.tandfonline.com/doi/full/10.1080/07421222.2019.1598692
- On the Feasibility of External Factual Support as Wikipedia’s Quality Metric. Processamiento del Lenguaje Natural 58 (2017), 93–100. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/5417
- Nicholas Vincent and Brent Hecht. 2021. A Deeper Investigation of the Importance of Wikipedia Links to Search Engine Results. Proceedings of the ACM on Human-Computer Interaction 5 (2021), 18. Issue CSCW1. https://doi.org/10.1145/3449078
- A hybrid approach to classifying Wikipedia article quality flaws with feature fusion framework. Expert Systems with Applications 181 (2021), 115089. Issue 1. https://www.sciencedirect.com/science/article/pii/S0957417421005303?via%253Dihub
- Ping Wang and Xiaodan Li. 2020. Assessing the quality of information on wikipedia: A deep‐learning approach. Journal of the Association for Information Science and Technology 71 (2020), 16–28. Issue 1. https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24210
- A deep learning-based quality assessment model of collaboratively edited documents: A case study of Wikipedia. Journal of Information Science 47 (2019), 176 – 191. Issue 2. https://journals.sagepub.com/doi/10.1177/0165551519877646
- Se Wang and Mizuho Iwaihara. 2010. Quality Evaluation of Wikipedia Articles through Edit History and Editor Groups. In APWeb ’11: Asia-Pacific Web Conference. Springer, Berlin, Heidelberg, 188–199. https://link.springer.com/chapter/10.1007/978-3-642-20291-9_20
- Tell me more: an actionable quality model for Wikipedia. In WikiSym ’13: International Symposium on Open Collaboration. Association for Computing Machinery, New York City, United States, 1–10. https://dl.acm.org/doi/10.1145/2491055.2491063
- Wikimedia. 2022a. Wikipedia Statistics - Edit and Revert Trends. https://stats.wikimedia.org/EN/EditsRevertsEN.htm. Accessed: 2023-04-03.
- Wikimedia. 2022b. Wikistats - Statistics For Wikimedia Projects. https://stats.wikimedia.org. Accessed: 2023-04-03.
- Wikipedia. 2022a. List of Wikipedias. https://meta.wikimedia.org/wiki/List_of_Wikipedias. Accessed: 2023-04-03.
- Wikipedia. 2022b. Wikipedia. https://en.wikipedia.org/wiki/Wikipedia. Accessed: 2023-04-03.
- Wikipedia. 2022c. Wikipedia: Content assessment. https://en.wikipedia.org/wiki/Wikipedia:Content_assessment. Accessed: 2023-04-03.
- Wikipedia. 2022d. Wikipedia: Size of Wikipedia. https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia. Accessed: 2023-04-03.
- Wikipedia. 2023. Wikipedia: Manual of Style. https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/. Accessed: 2023-04-04.
- Dennis Wilkinson and Bernardo Huberman. 2007. Cooperation and quality in wikipedia. In WikiSym ’07: International Symposium on Wikis. Association for Computing Machinery, New York City, United States, 157–164. https://dl.acm.org/doi/10.1145/1296951.1296968
- David H Wolpert and William G Macready. 1997. No Free Lunch Theorems for Optimization. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 1 (1997), 67. Issue 1.
- Characterizing Wikipedia pages using edit network motif profiles. In SMUC ’11: International Workshop on Search and Mining User-generated Contents. Association for Computing Machinery, New York City, United States, 45–52. https://dl.acm.org/doi/10.1145/2065023.2065036
- Classifying Wikipedia articles using network motif counts and ratios. In WikiSym ’12: International Symposium on Wikis and Open Collaboration. Association for Computing Machinery, New York City, United States, 1–12. https://dl.acm.org/doi/10.1145/2462932.2462948
- Mining the Factors Affecting the Quality of Wikipedia Articles. In ISME ’10: International Conference of Information Science and Management Engineering. Institute of Electrical and Electronic Engineers, New York City, United States, 343–346. https://ieeexplore.ieee.org/document/5572324
- Good Authors = Good Articles? - How Wikis Work. In WI ’15: International Conference on Wirtschaftsinformatik. Association for Information Systems, Atlanta, Georgia. https://aisel.aisnet.org/wi2015/59/?utm_source=aisel.aisnet.org%252Fwi2015%252F59%26utm_medium=PDF%26utm_campaign=PDFCoverPages
- Thomas Wöhner and Ralf Peters. 2009. Assessing the quality of Wikipedia articles with lifecycle based metrics. In WikiSym ’09: International Symposium on Wikis and Open Collaboration. Association for Computing Machinery, New York City, United States, 1–10. https://dl.acm.org/doi/10.1145/1641309.1641333
- Krzysztof Węcel and Włodzimierz Lewoniewski. 2015. Modelling the Quality of Attributes in Wikipedia Infoboxes. In BIS ’15: International Conference on Business Information Systems. Springer, Cham, Switzerland, 308–320. https://link.springer.com/chapter/10.1007/978-3-319-26762-3_27
- Detection of Article Qualities in the Chinese Wikipedia Based on C4.5 Decision Tree. In KSEM ’13: International Conference on Knowledge Science. Springer, Berlin, Heidelberg, 444–452. https://link.springer.com/chapter/10.1007/978-3-642-39787-5_36
- Explainable AI: A brief survey on history, research areas, approaches and challenges. In Natural Language Processing and Chinese Computing: 8th CCF International Conference, NLPCC 2019, Dunhuang, China, October 9–14, 2019, Proceedings, Part II 8. Springer International Publishing, Cham, 563–574.
- Yanxiang Xu and Tiejian Luo. 2011. Measuring article quality in Wikipedia: Lexical clue model. In SWS ’11: Symposium on Web Society. Institute of Electrical and Electronic Engineers, New York City, United States, 141–146. https://ieeexplore.ieee.org/document/6101286
- Models for Arabic Document Quality Assessment. In BIS ’20: International Conference on Business Information Systems. Springer, Cham, Switzerland, 297–310. https://link.springer.com/chapter/10.1007/978-3-030-61146-0_24
- Adnan Yahya and Ali Salhi. 2014. Quality assessment of Arabic web content: The case of the Arabic Wikipedia. In IIT ’14: International Conference on Innovations in Information Technology. Institute of Electrical and Electronic Engineers, New York City, United States, 36–41. https://ieeexplore.ieee.org/document/6987558
- Who Did What: Editor Role Identification in Wikipedia. In ICWSM ’16: International AAAI Conference on Web and Social Media. Association for the Advancement of Artificial Intelligence, Palo Alto, California, 446–455. https://ojs.aaai.org/index.php/ICWSM/article/view/14732
- Literature Review of Deep Learning Research Areas. Gazi Mühendislik Bilimleri Dergisi 5 (2019), 188 – 215. Issue 3. https://doi.org/10.30855/gmbd.2019.03.01
- Linfeng Yu and Mizuho Iwaihara. 2018. Finding high quality documents through link and click graphs. In IIAI-AAI ’18: International Congress on Advanced Applied Informatics. Institute of Electrical and Electronic Engineers, New York City, United States, 49–54. https://ieeexplore.ieee.org/abstract/document/8693372
- Computing trust from revision history. Technical Report. Stanford Univ Ca Knowledge Systems LAB. https://apps.dtic.mil/sti/citations/ADA454704
- Predicting Low-Quality Wikipedia Articles Using User’s Judgements. Springer, Cham, Switzerland. https://link.springer.com/chapter/10.1007/978-3-319-05467-4_6
- History-Based Article Quality Assessment on Wikipedia. In BIGCOMP ’18: International Conference on Big Data and Smart Computing. Institute of Electrical and Electronic Engineers, New York City, United States, 1–8. https://ieeexplore.ieee.org/document/8367090
- Adjusting the imbalance ratio by the dimensionality of imbalanced data. Pattern Recognition Letters 133 (2020), 217–223. https://doi.org/10.1016/j.patrec.2020.03.004
- Didem Ölçer and Tuğba Taşkaya Temizel. 2022. Quality assessment of web-based information on type 2 diabetes. Online Information Review 46 (2022), 715–732. Issue 4. https://www.emerald.com/insight/content/doi/10.1108/OIR-02-2021-0089/full/html