Recent developments in deep learning have greatly improved automated plant identification; however, accurate recognition of medicinal plants remains challenging due to high intra-class variation, visual similarity among species, and the limited availability of labeled data. This study proposes a Vision Transformer (ViT)-based medicinal plant classification and knowledge extraction framework for robust leaf-based recognition. The proposed system was trained on a curated dataset of 6,104 leaf images from 90 medicinal plant species, with 5,195 images used for training and 909 for testing. The dataset combines samples from a publicly available benchmark dataset with additional images and metadata sourced from the IMPPAT medicinal plant database. A ViT-Base Patch-16 architecture was fine-tuned on 224 × 224 leaf images using extensive data augmentation to improve generalization. Beyond classification, the framework incorporates a structured herbal information retrieval module that automatically extracts species-specific details, including scientific and local names, medicinal uses, plant parts utilized, other applications, and cultivation information. Experimental results demonstrate strong validation performance, highlighting the effectiveness of transformer-based models for fine-grained botanical classification. The proposed framework provides a practical and extensible solution for medicinal plant identification while supporting ethnobotanical knowledge preservation and digital herbal documentation.
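
To make the stated configuration concrete, the following minimal sketch works through the arithmetic implied by a Patch-16 model on 224 × 224 inputs and by the reported train/test split. It is illustrative only and is not the authors' code: the names `PatchGrid` and `split_summary` are hypothetical, and the three-channel (RGB) input and the single prepended class token are assumptions based on the standard ViT design rather than details given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchGrid:
    """Hypothetical helper describing how a ViT tiles an input image."""
    image_size: int = 224  # input resolution stated in the abstract
    patch_size: int = 16   # the "Patch-16" in ViT-Base Patch-16
    channels: int = 3      # assumption: RGB leaf images

    @property
    def patches_per_side(self) -> int:
        if self.image_size % self.patch_size != 0:
            raise ValueError("image size must be divisible by patch size")
        return self.image_size // self.patch_size

    @property
    def num_patches(self) -> int:
        # Non-overlapping square patches covering the whole image.
        return self.patches_per_side ** 2

    @property
    def sequence_length(self) -> int:
        # Assumption: one class token is prepended, as in the standard ViT.
        return self.num_patches + 1

    @property
    def patch_dim(self) -> int:
        # Number of raw pixel values in one flattened patch.
        return self.patch_size ** 2 * self.channels


def split_summary(train: int, test: int) -> dict:
    """Return the total image count and the fraction held out for testing."""
    total = train + test
    return {"total": total, "test_fraction": test / total}


grid = PatchGrid()
summary = split_summary(train=5195, test=909)
print(grid.num_patches, grid.sequence_length, grid.patch_dim)  # 196 197 768
print(summary["total"], round(summary["test_fraction"], 3))    # 6104 0.149
```

Under these assumptions, each leaf image becomes a sequence of 197 tokens, and the held-out test set is roughly 15% of the 6,104 images.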