publications

2025

  1. Evaluating Intermediate Reasoning of Code-Assisted Large Language Models for Mathematics
    Zena Al-Khalili, Nick Howell, and Dietrich Klakow
    In Proceedings of the Workshop on Generation, Evaluation, and Metrics @ ACL 2025, 2025