Currently, if a message exceeds 200,000 tokens, we don't stop it from going through. This is problematic from a cost perspective: at that point a single query is costing us dollars. We should lower the token limit, detect when a request exceeds it, and either (a) summarize the conversation so far or (b) tell the user to start a new conversation.
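
A rough sketch of what the check could look like, to make the proposal concrete. Everything named here is an assumption for illustration: `TOKEN_LIMIT` (the new limit has not been decided; 100,000 is a placeholder), `estimate_tokens` (a crude 4-characters-per-token stand-in for whatever tokenizer we actually use), and the `Action` enum and `check_conversation` function, which do not exist in our codebase.

```python
from enum import Enum
from typing import Callable, Sequence

# Hypothetical value: the issue only says the limit should be lowered from 200,000.
TOKEN_LIMIT = 100_000


class Action(Enum):
    SEND = "send"                          # under the limit, let the request through
    SUMMARIZE = "summarize"                # option (a): summarize the conversation so far
    NEW_CONVERSATION = "new_conversation"  # option (b): ask the user to start over


def estimate_tokens(text: str) -> int:
    """Placeholder estimate (about 4 characters per token), rounded up.

    This is an assumption, not a real tokenizer; swap in the actual one.
    """
    return (len(text) + 3) // 4


def check_conversation(
    messages: Sequence[str],
    limit: int = TOKEN_LIMIT,
    can_summarize: bool = True,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> Action:
    """Decide what to do with a conversation before sending it to the model."""
    total = sum(count_tokens(m) for m in messages)
    if total <= limit:
        return Action.SEND
    # Over the limit: prefer summarizing, otherwise fall back to a new conversation.
    return Action.SUMMARIZE if can_summarize else Action.NEW_CONVERSATION
```

The caller would run `check_conversation` before each request and branch on the result. Whether we default to (a) or (b) is still an open question; the `can_summarize` flag is only there so both paths can be exercised.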