Evaluating Handwritten Answers Using DeepSeek: A Comparative Analysis of Deep Learning-Based Assessment

Abstract: Artificial intelligence is revolutionizing the education sector by making learning more accessible, efficient, and customized. Recent advancements in artificial intelligence have sparked significant interest in automating the evaluation of handwritten answers. Traditional handwritten evaluation techniques are influenced by the evaluator's mental and physical state, environmental factors, human bias, emotional swings, and logistical challenges such as storage and retrieval. Although sequence-to-sequence neural networks and other existing AI evaluation methods have demonstrated promise, they are constrained by their reliance on high-performance hardware such as GPUs, by lengthy training periods, and by difficulty handling varied scenarios. Bidirectional Encoder Representations from Transformers (BERT) overcame the drawbacks of earlier NLP techniques such as Bag of Words, TF-IDF, and Word2Vec, but it depends on surface-level keyword similarity: when a correct answer uses different keywords from the reference, its accuracy suffers. This study presents a technique that combines optical character recognition (OCR) with the DeepSeek-R1 1.5B model to create a robust, efficient, and accurate grading system. To overcome these challenges, the proposed evaluation technique uses the Google Cloud Vision API to extract handwritten responses and convert them into machine-readable text, providing pre-processed input for further evaluation. The main aim of this study is to develop a scalable, automated, and effective system for grading handwritten responses by combining DeepSeek for response evaluation with the Google Cloud Vision API for text extraction. To assess the performance of the proposed DeepSeek evaluation method, its results are compared with cosine similarity metrics. Across multiple assignments, DeepSeek's independent evaluation method gave the best results: the lowest MAE (0.0580), the lowest RMSE (0.147), and the strongest correlation (0.895). These findings show that the proposed technique is reliable and accurate.
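A minimal sketch of the two-stage pipeline the abstract describes, assuming the Google Cloud Vision API for OCR and the DeepSeek-R1 1.5B model served through Hugging Face transformers. The model identifier (deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B), the prompt format, and the grading scale are illustrative assumptions, not the authors' exact implementation:

    # Hypothetical pipeline sketch: OCR a handwritten answer sheet, then ask an LLM to grade it.
    from google.cloud import vision            # pip install google-cloud-vision
    from transformers import pipeline          # pip install transformers torch

    def extract_handwritten_text(image_path: str) -> str:
        """OCR a scanned answer sheet with the Google Cloud Vision API."""
        client = vision.ImageAnnotatorClient()
        with open(image_path, "rb") as f:
            image = vision.Image(content=f.read())
        # document_text_detection is the Vision endpoint suited to dense or handwritten text.
        response = client.document_text_detection(image=image)
        return response.full_text_annotation.text

    # Assumed checkpoint: the paper says "DeepSeek-R1 1.5B", which plausibly maps to this distilled model.
    grader = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

    def grade_answer(question: str, reference: str, student_answer: str) -> str:
        """Prompt the model to grade the OCR'd answer against a reference (prompt is an assumption)."""
        prompt = (
            f"Question: {question}\n"
            f"Reference answer: {reference}\n"
            f"Student answer: {student_answer}\n"
            "Grade the student answer on a 0-1 scale and briefly justify the score.\nScore:"
        )
        out = grader(prompt, max_new_tokens=200, do_sample=False)
        return out[0]["generated_text"]

Using an LLM judge rather than embedding similarity is what lets the system credit correct answers phrased with different keywords, which is the BERT limitation the abstract calls out.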

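The reported comparison metrics (a cosine similarity baseline, plus MAE, RMSE, and correlation against human grades) can be computed as below. The abstract does not specify the text embedding behind the cosine baseline, so the TF-IDF choice here is an assumption, and Pearson is assumed for the correlation:

    # Hedged sketch of the evaluation metrics; TF-IDF vectors for the cosine baseline are an assumption.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def cosine_score(reference: str, answer: str) -> float:
        """Surface-level similarity baseline between reference and student answer."""
        tfidf = TfidfVectorizer().fit_transform([reference, answer])
        return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

    def agreement_with_human(predicted: np.ndarray, human: np.ndarray) -> dict:
        """MAE, RMSE, and Pearson correlation of machine scores against human-assigned grades."""
        err = predicted - human
        return {
            "MAE": float(np.mean(np.abs(err))),
            "RMSE": float(np.sqrt(np.mean(err ** 2))),
            "Pearson r": float(np.corrcoef(predicted, human)[0, 1]),
        }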
Bibliographic Details
Main Authors: Sanskar Bansal, Vinay Gupta, Eshita Gupta, Peeyush Garg
Affiliations: Department of Electrical Engineering, Manipal University Jaipur (S. Bansal, V. Gupta, P. Garg); Department of AI & Machine Learning, Manipal University Jaipur (E. Gupta)
Format: Article
Language: English
Published: Springer, 2025-08-01
Series: International Journal of Computational Intelligence Systems, Vol. 18, No. 1, pp. 1-16
ISSN: 1875-6883
Collection: DOAJ
Subjects: Large language model; DeepSeek; AI-based evaluation technique; Evaluating handwritten answer sheet
Online Access: https://doi.org/10.1007/s44196-025-00946-w