About me

I am a Text Analytics and NLP researcher, currently working as an ML Engineer at Aidaptive, powered by Jarvis ML. I completed my Ph.D. at the Dept. of Computer Science, University of Delhi, under the supervision of Prof. Vasudha Bhatnagar. During my Ph.D., I was a researcher at the Network Science Lab led by Prof. Bhatnagar. My research interests include Text Mining, Graph Analytics, Natural Language Processing, Machine Learning, and Discourse Analysis. Apart from research, I enjoy reading fiction and classic novels, watching culinary and travel vlogs, and writing poems and short stories. For more details, please download my CV.

Recent Updates

November 28, 2022: Successfully defended my Ph.D. thesis titled "Complex Networks for Textual Discourse Coherence Analysis".
June 22, 2020: Presented my Pre-PhD seminar on the proposed thesis title "Complex Networks for Textual Discourse Coherence Analysis". Currently, I am in the process of drafting my thesis, which I intend to submit by the end of February 2021.
June 04, 2020: Communicated our work on discourse coherence analysis of scholarly articles. This work is currently under review.

Duari, S., & Bhatnagar, V. (March, 2019). sCAKE: Semantic Connectivity Aware Keyword Extraction. Information Sciences, 477, 100-117. DOI: 10.1016/j.ins.2018.10.034. (Paper: ScienceDirect, arXiv Preprint) (Data and Code) (SCI, Impact Factor: 8.233 (2021))
Duari, S., & Bhatnagar, V. (2020). Complex Network based Supervised Keyword Extractor. Expert Systems with Applications, 140, 112876. DOI: 10.1016/j.eswa.2019.112876. (Paper: ScienceDirect, arXiv Preprint) (Data and Code) (SCIE, Impact Factor: 8.665 (2021))
Duari, S., & Bhatnagar, V. (May, 2019). Semi-automatic System for Title Construction. In: Gani A., Das P., Kharb L., Chahal D. (eds) Information, Communication and Computing Technology. ICICCT 2019. Communications in Computer and Information Science, vol. 1025, pp. 216-227. Springer, Singapore. DOI: 10.1007/978-981-15-1384-8_18. (Paper: SpringerLink, arXiv Preprint) (Data and Code)
Chaturvedi, R., Dhani, J.S., Joshi, A., Khanna, A., Tomar, N., Duari, S., Khurana, A. and Bhatnagar, V. (November, 2020). Divide and Conquer: From Complexity to Simplicity for Lay Summarization. In Proceedings of the First Workshop on Scholarly Document Processing (pp. 344-355). (Paper: ACL Anthology)
Duari, S. and Bhatnagar, V. (2021). FFCD: A Fast-and-Frugal Coherence Detection Method. IEEE Access. vol. 10, pp. 85305-85314. DOI: 10.1109/ACCESS.2021.3135048. (SCIE, Impact Factor: 3.476 (2021))
Bhatnagar, V., Duari, S., and Gupta, S. K. (2022). Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks. IEEE Access. vol. 10, pp. 88538-88557. DOI: 10.1109/ACCESS.2022.3198952. (SCIE, Impact Factor: 3.476 (2021))

I am currently working on a project on computational discourse coherence analysis. My objective is to analyse scientific articles on the basis of their cohesion and coherence, and to quantify writing quality in terms of these properties. We have recently communicated a paper in which we explored a complex network based framework for modelling textual discourse.
Used classical ML algorithms to engineer solutions for automatically extracting keywords from single documents. We transformed the text to a complex network representation and extracted node properties as features. The proposed method works on all texts, irrespective of the domain, collection, or language.
Engineered solutions for unsupervised, graph-based keyword extraction from single documents. We proposed a novel, parameterless, graph-based keyword extraction algorithm (sCAKE) and its language-agnostic variant (LAKE). We also proposed a context-aware graph construction method and a semantic connectivity based word scoring method.
Designed a semi-automatic system for identifying and recommending keywords for inclusion in the title of a scientific manuscript. Here, the keyword extraction phase is automatic, and title construction from the extracted keywords is manual. For extracting keywords, we induced supervised models using a graph-theoretic feature set.
Designed a Revenue Management System for Assam Power Distribution Company Limited (APDCL), Assam, India. It was developed using JSP as front-end and MySQL as back-end. The objective was to design an efficient database to store information regarding revenue collection of APDCL and to build a web-based application to effectively view, manipulate, and aggregate the stored information.
An NLP and sentiment analysis based project for movie classification. The objective was to analyse movie reviews and assign an aggregated sentiment polarity (positive, negative, or neutral) to the reviews. It was developed using JSP; a bag-of-words model was used for document representation, and polarity dataset v1.0 was used for evaluation.
I am collaborating with Master's students (2019 Batch) of the department on this fake news detection project. My role is to provide guidance and brainstorm with the students. I work as an assistant under the guidance of my PhD supervisor, Prof. Vasudha Bhatnagar.
I work on developing property recommendation models for short-term vacation rentals using textual similarity.
Highlights:
Courses taught: Introduction to Computers, Algorithms, Data Structures, Internet and Web Technologies, DBMS, Software Engineering
Courses taught: "Computer Skills", a compulsory paper for all 2nd-year undergraduate students. Special Mention: Took the initiative to teach computer skills to economically weaker students from the college. Simultaneously taught 3-month courses on basic computer skills to 3 batches of HS and BA students (maximum class size: 10 students). This initiative was outside of my regular academic responsibilities.
I was trained in J2EE. As a group project during training, our team developed a finance management system using J2EE as front-end and MySQL as back-end.
Thesis Title: Complex Networks for Textual Discourse Coherence Analysis
Supervisor: Prof. Vasudha Bhatnagar
Research Area: Text Analytics, NLP, and Computational Discourse Analysis
Relevant Coursework: Machine Learning, Special Topics in Data Mining (Graph Analytics), Text Mining
Graduated with CGPA 9/10 | Secured 2nd rank
Relevant Coursework: Algorithms, Data Mining, Data Structures, DBMS, Advanced Discrete Structures, Compiler Design
Graduated with 80.2% | Secured 1st rank
Relevant Coursework: Data Structures, DBMS, Algorithms, Theory of Computing, Programming in C and C++
Python (Proficient), R (Proficient), Java (Proficient), C (Proficient), C++ (Prior experience), JSP (Prior experience), JavaScript (Prior experience), HTML (Prior experience), CSS (Prior experience).
BigQuery (Intermediate), Amazon RDS (Familiar), MySQL (Prior experience), Oracle (Prior experience)
LaTeX, Weka, AWS Lambda, EC2, Amazon SageMaker, Amazon S3, GCP, Apache Airflow
Text Analytics, Machine Learning, Deep Learning, Data Analytics