Discussion about this post

John Giudice

On your discussion point about evaluating AI models: I would propose that people develop a set of evaluation prompts and questions for AI implementations, with expected answers, that could be used to evaluate the AI models people care about. These evaluations would be run regularly, maybe quarterly, to track changes in quality and in other aspects that matter. Since many models are regularly updated and changed, it would be helpful to track and understand those changes and the progress of their implementation. Let’s discuss this if others are interested as well. I haven’t found anyone who is doing this publicly.

Nick Potkalitsky

Nice work. The AI Snake Oil guys are putting on an all-day forum about safety and open AI models on Sept 21. Should be good.
