Evaluating Handwritten Answers Using DeepSeek: A Comparative Analysis of Deep Learning-Based Assessment
Abstract: Artificial intelligence is revolutionizing the education sector by making learning more accessible, efficient, and customized, and recent advances have sparked significant interest in automating the evaluation of handwritten answers. Traditional handwritten evaluation is influenced by the evaluator's mental and physical state, environmental factors, human bias, emotional swings, and logistical challenges such as storage and retrieval. Although sequence-to-sequence neural networks and other existing AI evaluation methods have shown promise, they are constrained by their reliance on high-performance hardware such as GPUs, lengthy training periods, and difficulty handling varied scenarios. Bidirectional Encoder Representations from Transformers (BERT) overcame the drawbacks of earlier NLP techniques such as Bag of Words, TF-IDF, and Word2Vec, but BERT-based scoring still depends on surface-level keyword similarity, so accuracy suffers when a correct answer uses different wording. This study presents a technique that combines optical character recognition (OCR) with the DeepSeek-R1 1.5B model to create a robust, efficient, and accurate grading system. To overcome these challenges, the proposed evaluation technique uses the Google Cloud Vision API to extract handwritten responses and convert them into machine-readable text, providing pre-processed input for subsequent evaluation. The main aim of this study is a scalable, automated, and effective system for grading handwritten responses that combines DeepSeek for response evaluation with the Google Cloud Vision API for text extraction. To assess the proposed DeepSeek evaluation method, its results are compared with cosine similarity metrics. Across multiple assignments, DeepSeek's independent evaluation method gave the best results: the lowest MAE (0.0580), the lowest RMSE (0.147), and the strongest correlation (0.895). These findings indicate that the proposed technique is reliable and accurate.
| Main Authors: | Sanskar Bansal, Vinay Gupta, Eshita Gupta, Peeyush Garg |
|---|---|
| Affiliations: | Department of Electrical Engineering, Manipal University Jaipur (Bansal, V. Gupta, Garg); Department of AI & Machine Learning, Manipal University Jaipur (E. Gupta) |
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | International Journal of Computational Intelligence Systems, Vol. 18, No. 1, pp. 1-16 |
| ISSN: | 1875-6883 |
| Subjects: | Large language model; DeepSeek; AI-based evaluation technique; Evaluating handwritten answer sheet |
| Online Access: | https://doi.org/10.1007/s44196-025-00946-w |
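The record describes a two-stage pipeline: handwritten responses are first converted into machine-readable text with the Google Cloud Vision API, and the resulting text is then graded by the DeepSeek-R1 1.5B model. Below is a minimal sketch of the OCR stage using the standard google-cloud-vision Python client; the file path is a hypothetical placeholder, and the authors' exact preprocessing is not specified in the record.

```python
# Minimal sketch of the OCR stage, assuming the google-cloud-vision client
# and a locally stored scan; "answer_sheet.jpg" is a hypothetical path.
from google.cloud import vision

def extract_handwritten_text(image_path: str) -> str:
    """Convert a scanned handwritten answer into machine-readable text."""
    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    # document_text_detection is the Vision API feature suited to dense
    # and handwritten text, as opposed to sparse scene text.
    response = client.document_text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text

student_text = extract_handwritten_text("answer_sheet.jpg")
```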
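For the grading stage, the record names the DeepSeek-R1 1.5B model but gives neither the authors' prompt nor their deployment. The sketch below assumes the publicly released deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B checkpoint loaded through Hugging Face transformers and a simple rubric prompt; it illustrates the idea, not the paper's implementation.

```python
# Hedged sketch of LLM-based grading. The checkpoint name and the prompt
# wording are assumptions; the paper's exact setup is not in the record.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def grade(reference: str, student: str) -> str:
    """Ask the model for a 0-1 score of the student answer."""
    prompt = (
        "Grade the student answer against the reference on a 0-1 scale "
        "and reply with only the score.\n"
        f"Reference answer: {reference}\n"
        f"Student answer: {student}\n"
        "Score:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

Note that R1-style models typically emit chain-of-thought reasoning before a final answer, so in practice the generated text would need parsing to isolate a numeric score.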
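The record also reports that DeepSeek's scores were compared with a cosine similarity baseline using MAE, RMSE, and correlation against reference grades. A sketch of that comparison follows; the TF-IDF representation is an assumption (the record does not say which text representation the baseline uses), and the score arrays are placeholders, not the paper's data.

```python
# Sketch of the cosine-similarity baseline and the three reported metrics
# (MAE, RMSE, correlation). TF-IDF vectors are an assumed representation.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def cosine_score(reference: str, student: str) -> float:
    """Baseline similarity between reference and student answers."""
    tfidf = TfidfVectorizer().fit_transform([reference, student])
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

def evaluate(predicted: np.ndarray, human: np.ndarray) -> dict:
    """MAE, RMSE, and Pearson correlation against human grades."""
    errors = predicted - human
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        "correlation": float(np.corrcoef(predicted, human)[0, 1]),
    }

# Placeholder arrays only; the paper's reported best values were
# MAE 0.0580, RMSE 0.147, and correlation 0.895.
print(evaluate(np.array([0.9, 0.7, 0.4]), np.array([1.0, 0.6, 0.5])))
```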