Our Transformers Code Agent beats the GAIA benchmark!
transformers.agents
is a bit tedious to use, because it goes into great detail.
FYI, I've finally built this here: https://huggingface.co/spaces/m-ric/text_to_dollars
Read the article here: https://www.adyen.com/knowledge-hub/llm-inference-at-scale-with-tgi
Yes, that's why I added the big "if" after it!
Thanks a lot @Wauplin !
@sidbin You can use this newer repo on GitHub: https://github.com/aymeric-roucher/GAIA (it has a requirements.txt!)
It's not using GPT-4o for evaluation: evaluation is done with exact string match!
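For anyone wondering what exact string match scoring can look like, here is a minimal sketch. The normalization step (trimming whitespace, ignoring case) and the function names are assumptions made for illustration; this is not the scorer from the repo.

```python
def normalize(answer: str) -> str:
    # Assumption: trim surrounding whitespace and ignore case before comparing.
    return answer.strip().lower()


def exact_match(prediction: str, truth: str) -> bool:
    """Return True only if the normalized strings are identical."""
    return normalize(prediction) == normalize(truth)


def accuracy(predictions: list[str], truths: list[str]) -> float:
    """Fraction of predictions that exactly match their reference answer."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0
    hits = sum(exact_match(p, t) for p, t in zip(predictions, truths))
    return hits / len(truths)
```

With this kind of scoring, an answer like "Paris, France" counts as wrong when the reference is "Paris", so no LLM judge is involved in deciding the score.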
Great idea! Can I build it @victor, or would you like to make it yourself?
{
"rationale": "The answer does not match the true answer at all.",
"score": 1,
"confidence_level": 0.85
}