Current problem: when a user submits a message, the app freezes until the language model has generated its response, so the interface is unresponsive in the meantime.
Ideally, the request to the language model would be asynchronous, so that the app keeps running while it waits for the reply.
However, Flask does not handle requests asynchronously by default.
Possible solutions:
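- Quart: an async reimplementation of the Flask API. It is unclear how much effort the migration would take.
- Celery: a task queue that would run the request in a background worker, but it has a high overhead.
- A responsive front-end framework: if we use one, we might be able to send the request from the client instead.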
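
To make the goal concrete, here is a minimal sketch of the "submit now, fetch the reply later" pattern that a task queue or a polling front end would rely on. It uses only the Python standard library, so it is neither Celery nor Quart, and all names in it (`generate_reply`, `submit_message`, `poll_reply`, `jobs`) are hypothetical and not part of the app. It assumes the language model call is an ordinary blocking function.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the slow, blocking language model call.
def generate_reply(message):
    time.sleep(0.2)  # simulate model latency
    return f"echo: {message}"

# Worker threads run the slow calls so the caller is not blocked.
executor = ThreadPoolExecutor(max_workers=2)

# Maps job id -> Future. Kept in memory here, which is an assumption
# that only holds for a single-process app.
jobs = {}

def submit_message(message):
    """Start generation in a worker thread and return a job id immediately."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(generate_reply, message)
    return job_id

def poll_reply(job_id):
    """Report whether the reply is ready, without blocking."""
    future = jobs[job_id]
    if not future.done():
        return {"status": "pending"}
    return {"status": "done", "reply": future.result()}
```

In a Flask app, one route could call `submit_message` and return the job id, while a second route calls `poll_reply` so the front end can check back until the status is `done`. Whether this is sufficient, or whether Quart or Celery is the better fit, depends on how many concurrent users the app has to serve.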