
You can hover over some elements, click on a model to get more info such as the tested categories, and hover over the correct-test numbers to see some info about what the model got wrong.

I just started on this, so I'm currently adding more tests and I keep improving the UI. Let me know if you have any suggestions.

The ranking is currently mostly about the "smartest" model, meaning the one most likely to respond correctly to any given question or request, regardless of the domain.
