Muhammad Humayoun
Research
Machine learning, computational linguistics, and natural language processing
My doctoral work developed a framework for formalizing mathematical language for mechanical validation. I explored translating informal mathematical texts into formal representations that proof assistants can process.
I have helped develop language resources, such as morphologies and lexicons, for Urdu and Punjabi, two South Asian languages. I have also worked on text summarization, especially for Urdu, creating benchmark corpora and assessing the impact of pre-processing on automatic summarization algorithms. In addition, my participation in text classification competitions has produced competitive results in abusive language detection and fake news detection for Urdu.
Measuring plagiarism in programming assignments is an essential task. I have published on the methods of plagiarism and their detection in introductory programming course assignments written in C++. I have used machine learning classifiers/models extensively to perform these tasks.
Keywords: Computational Linguistics, NLP, Formalization, South Asian languages, Linguistic Resources, Text Summarization, Text Classification
Teaching
I taught the following courses regularly from 2012 to 2018, at the undergraduate level unless stated otherwise.
- Machine Learning (Graduate level)
- Natural Language Processing (Graduate level)
- Artificial Intelligence
- Introduction to Machine Learning
- Discrete Structures
- Programming Fundamentals (using C)
- Computing for Management
I taught the following programming courses to undergraduate students at the Higher Colleges of Technology from 2018 to 2023:
- Computational Thinking and Coding (using Python)
- Programming for Information Security (using Python)
- Fundamentals of Programming (using Java)
- Data Driven Web Development (ASP.NET MVC and C#)
- Web Technologies (HTML, CSS, JavaScript)
- Enterprise Database Applications (using Oracle Apex)
- Advanced Object Oriented Programming (using Java)
- Advanced Mobile Application Development (Android)
Bio
Currently, I am working at ¹û¶³´«Ã½, Sweden as a Senior Lecturer, designing and teaching courses related to informatics.
I earned a Ph.D. in computer science from the University of Grenoble Alpes, France in January 2012 under the supervision of Prof. Dr. Christophe Raffalli and Prof. Dr. Aarne Ranta. I received a Master's degree in 2006 from Chalmers University of Technology, Sweden.
I have more than ten years of post-PhD teaching and research experience. I am generally interested in natural language processing/technology in the context of under-resourced languages and controlled languages. I have practical experience in the development of natural language applications.
Selected publications
- Muhammad Humayoun and Naheed Akhtar (2022). CORPURES: Benchmark Corpus for Urdu Extractive Summaries and Experiments using Supervised Learning. Intelligent Systems with Applications. Elsevier. [online]
- Muhammad Humayoun, Muhammad Adnan Hashmi, Ali Hanzala Khan (2022). Measuring Plagiarism in Introductory Programming Course Assignments. 8th International Conference on Information Technology Trends (ITT), Higher Colleges of Technology - Dubai Men’s Campus on 25-26 May 2022. Dubai, United Arab Emirates. [online]
- Muhammad Humayoun (2021). Abusive and Threatening Language Detection in Urdu using Supervised Machine Learning and Feature Combinations. FIRE 2021. CICLing 2021 track. International Conference on Computational Linguistics and Intelligent Text Processing. [online]
Recognition: Our submitted results were selected for the third recognition with a monetary prize of 10K RUB (Russian rubles) from ODS Summer of Code. URL: https://www.urduthreat2021.cicling.org/home
- Muhammad Humayoun (2021). The 2021 Urdu Fake News Detection Task using Supervised Machine Learning and Feature Combinations. FIRE 2021. CICLing 2021 track. International Conference on Computational Linguistics and Intelligent Text Processing. [online]
Recognition: Our system ranked 5th among 18 teams. By the time of paper submission, we had improved our results beyond the second-best score in the competition. URL: https://www.urdufake2021.cicling.org/results-and-rankings
- Muhammad Humayoun, Rao Muhammad Adeel Nawab, Muhammad Uzair, Saba Aslam and Omer Farzand (2016). Urdu summary corpus. In Nicoletta Calzolari, et al., editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association (ELRA). ISBN: 978-2-9517408-9-1. [online]
- Muhammad Humayoun and Hwanjo Yu (2016). Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization. In Nicoletta Calzolari, et al., editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association (ELRA). ISBN: 978-2-9517408-9-1. [online]
- Muhammad Salman Khan, Adnan Ahmad, Muhammad Humayoun (2014). Building an Effective Automated Assessment System for C/C++ Introductory Programming Courses in ODL Environment. Proceedings of 28th Annual Conference of the Asian Association of Open Universities. The Hong Kong University, Hong Kong, China. [pdf]
- Jinoh Oh, Youngchul Sung, Jinha Kim, Muhammad Humayoun, Young-Ho Park, Hwanjo Yu (2012). Time-Dependent User Profiling for TV Recommendation. Second International Conference on Cloud and Green Computing (CGC 2012). DOI: 10.1109/CGC.2012.119. Pages 783–787. IEEE Conference Publications.
- Shafqat M. Virk, M. Humayoun, A. Ranta (2011). An Open Source Punjabi Resource Grammar. Proceedings of the 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011). (Ranking: 0.54, in range 0.00–1.00, short paper acceptance rate: 38%) [online]
- M. Humayoun and C. Raffalli (2010). MathAbs: A Representational Language for Mathematics. 8th International Conference on Frontiers of Information Technology. December 21-23, 2010, Islamabad, Pakistan. ACM 978-1-4503-0342-2/10/12. (Acceptance rate: 29.25%) [online]
- M. Humayoun and C. Raffalli (2010). MathNat - Mathematical Text in a Controlled Natural Language. Special issue: Natural Language Processing and its Applications. Journal on Research in Computing Science. Volume 46. ISSN: 1870-4069. CICLing 2010: 11th International Conference on Intelligent Text Processing and Computational Linguistics, March 21-27, 2010, Iasi, Romania. (Acceptance rate: 27%) [online]
- M. Humayoun and A. Ranta (2010). Developing Punjabi Morphology, Corpus and Lexicon. In R. Otoguro, K. Ishikawa, H. Umemoto, K. Yoshimoto, and Y. Harada, editors, Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation (PACLIC24). Pages 163–172. Tohoku University, Japan, November 2010. ISBN 978–4–905166–00–9. (Acceptance rate: 27.45%) [online]
- Shafqat M. Virk, M. Humayoun, A. Ranta (2010). An Open Source Urdu Resource Grammar. Proceedings of the Eighth Workshop on Asian Language Resources. August 2010, Beijing, China. Co-located with COLING 2010. (Acceptance rate: 62.86%) [online]
- M. Humayoun, H. Hammarstrom, and A. Ranta (2007). Urdu Morphology, Orthography and Lexicon Extraction. In Ali Farghaly & Karine Megerdoomian (eds.), Proceedings of the 2nd Workshop on Computational Approaches to Arabic Script-based Languages. Pages 59–68, LSA 2007 Linguistic Institute, Stanford University, USA. (Acceptance rate: not mentioned, but frequently cited paper) [online]