Hacker News

> By better I mean grading based on whether there is any nonsense in the output or any internal contradictions, or similar criteria

Sounds like you want a hard AI to determine whether a language model generates nonsense.



