Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education