GrowlErval

Big idea:

A high-level view of the workflow:

Eval

Do we need a better name?

Requirements for an ideal Eval setup:

How does this work?

I should note that one blocker for this is Gallup World Poll, since it can’t be imported without a huge amount of RAM. We would need to partition it for this to work, but that’s already on our to-do list.
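As a rough illustration of what partitioning could look like, here is a minimal sketch that streams a large CSV row by row and writes one smaller file per value of a chosen column, so that no step needs the whole dataset in memory. Everything specific here is an assumption: the note doesn't say what format Gallup World Poll arrives in, and `partition_csv`, the `part_<value>.csv` naming, and the choice of partition column are hypothetical.

```python
import csv
from pathlib import Path


def partition_csv(source: Path, out_dir: Path, key: str) -> dict[str, int]:
    """Stream `source` and write one CSV per distinct value of column `key`.

    Hypothetical helper: only one row is held in memory at a time, plus one
    open file handle per partition. Returns the row count per partition.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}   # partition value -> open file handle
    writers = {}   # partition value -> csv.DictWriter
    counts = {}    # partition value -> rows written
    try:
        with source.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            for row in reader:
                value = row[key]
                if value not in writers:
                    # First time we see this value: open its partition file.
                    # Assumes the value is safe to use in a file name.
                    handle = (out_dir / f"part_{value}.csv").open(
                        "w", newline="", encoding="utf-8"
                    )
                    handles[value] = handle
                    writers[value] = csv.DictWriter(
                        handle, fieldnames=reader.fieldnames
                    )
                    writers[value].writeheader()
                    counts[value] = 0
                writers[value].writerow(row)
                counts[value] += 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts
```

Each partition can then be imported on its own. A column with a modest number of distinct values (a survey year, say) keeps the number of open files small.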

Growler

The beauty of the above change to Eval is that it makes Growler simpler. Growler runs in a shorter loop with fewer steps, and eval happens outside the main agent’s control in a more realistic setting (the actual Chat Plot tools and agent).

Growler would have a series of agents/prompts/loops that are good at different tasks:

The Editor and Writer form a natural loop that doesn’t stop until we’ve got solid work. Then the Fact Checker comes in and verifies that the changes were not hallucinated.
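The control flow described here could be sketched roughly as follows. This is only an illustration under assumptions: `run_growler`, `Review`, and the three callables are hypothetical stand-ins for the real agents/prompts, and the `max_rounds` cap is an added safety limit that the note itself doesn't mention.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    """Hypothetical editor verdict: approve, or send feedback to the writer."""
    approved: bool
    feedback: str = ""


def run_growler(
    task: str,
    write: Callable[[str, str], str],        # (task, feedback) -> draft
    edit: Callable[[str, str], Review],      # (task, draft) -> review
    fact_check: Callable[[str], list[str]],  # draft -> unsupported claims
    max_rounds: int = 5,                     # assumption: cap on loop length
) -> tuple[str, list[str]]:
    """Loop writer and editor until approval, then fact-check the result."""
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = write(task, feedback)
        review = edit(task, draft)
        if review.approved:
            break
        # Not solid yet: pass the editor's notes back to the writer.
        feedback = review.feedback
    # The fact checker runs once, after the writer/editor loop has finished.
    return draft, fact_check(draft)
```

Keeping the Fact Checker outside the loop mirrors the ordering above: it only sees work the Editor has already accepted (or the last draft, if the round cap is hit).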

Data Catalog

Our catalog expands to hold more things:

Chat Plot

Required changes to Chat Plot for this to work: