chat
Send a multi-turn conversation to a local Ollama model. Each message carries a role (user, assistant, or system) and its content, letting the model produce context-aware responses.
Instructions
Send a multi-turn conversation to an Ollama model. Messages should follow the format [{'role': 'user'|'assistant'|'system', 'content': '...'}].
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Ollama model name. | |
| messages | Yes | Conversation history as a list of role/content dicts. | |
| temperature | No | Sampling temperature (0.0–2.0). | `None` |
| max_tokens | No | Maximum number of tokens to generate; forwarded to Ollama as `num_predict`. | `None` |
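Taken together, a well-formed argument payload for this tool might look like the following sketch; the model name and values are purely illustrative, and only `model` and `messages` are required:

```python
# Illustrative arguments for the `chat` tool.
arguments = {
    "model": "llama3",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Hello"},
    ],
    "temperature": 0.2,   # optional sampling temperature (0.0-2.0)
    "max_tokens": 128,    # optional cap on generated tokens
}
```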
Output Schema
This tool declares no structured output fields.
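In practice, the handler shown under Implementation Reference returns a two-key dictionary; the reply text below is only illustrative:

```python
# Shape of the dict returned by the `chat` handler (values are illustrative).
{
    "model": "llama3",                          # the requested model, echoed back
    "response": "Hi there! How can I help?",    # assistant content from Ollama
}
```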
Implementation Reference
- `src/foundry_reverse/server.py:189-219` (registration): MCP tool registration and handler function for the `chat` tool. The `@mcp.tool` decorator registers it with `name='chat'`, and the async function implements the handler logic, calling `oc.chat()` to delegate to the Ollama client.

```python
@mcp.tool(
    name="chat",
    description=(
        "Send a multi-turn conversation to an Ollama model. Messages should "
        "follow the format [{'role': 'user'|'assistant'|'system', 'content': '...'}]."
    ),
)
async def chat(
    model: str,
    messages: list[dict[str, str]],
    temperature: float | None = None,
    max_tokens: int | None = None,
) -> dict[str, Any]:
    """
    Args:
        model: Ollama model name.
        messages: Conversation history as a list of role/content dicts.
        temperature: Sampling temperature (0.0–2.0).
        max_tokens: Maximum tokens to generate.
    """
    options: dict[str, Any] = {}
    if temperature is not None:
        options["temperature"] = temperature
    if max_tokens is not None:
        options["num_predict"] = max_tokens

    response = await oc.chat(
        model=model,
        messages=messages,
        options=options or None,
    )
    return {"model": model, "response": response}
```
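As a usage illustration, here is a minimal sketch that drives the same underlying client call the handler makes; it assumes the package imports as `foundry_reverse` and that a local Ollama server with the `llama3` model is reachable:

```python
import asyncio

from foundry_reverse import ollama_client as oc

async def main() -> None:
    # The handler above maps temperature -> options["temperature"] and
    # max_tokens -> options["num_predict"] before delegating here.
    reply = await oc.chat(
        model="llama3",
        messages=[{"role": "user", "content": "Hello"}],
        options={"temperature": 0.2, "num_predict": 64},
    )
    print(reply)

asyncio.run(main())
```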
- `src/foundry_reverse/ollama_client.py` (helper): the function that makes the HTTP POST to Ollama's `/api/chat` endpoint. It builds the payload from model, messages, and options, then returns the assistant's content from the response.

```python
async def chat(
    model: str,
    messages: list[dict[str, str]],
    options: dict[str, Any] | None = None,
) -> str:
    payload: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "stream": False,
    }
    if options:
        payload["options"] = options
    async with _client() as c:
        r = await c.post("/api/chat", json=payload)
        r.raise_for_status()
        return r.json().get("message", {}).get("content", "")
```
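`_client()` is not shown in this section; the following is a hypothetical sketch of such a factory, assuming httpx and Ollama's default local endpoint (the real implementation may configure the base URL and timeout differently):

```python
import httpx

# Hypothetical sketch of a _client() factory compatible with the helper above.
# Assumes Ollama's default endpoint; foundry_reverse.ollama_client may instead
# read the base URL from configuration or the environment.
def _client() -> httpx.AsyncClient:
    return httpx.AsyncClient(base_url="http://localhost:11434", timeout=60.0)
```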
- `tests/test_ollama_client.py:53-68` (test): unit test for `ollama_client.chat()`, mocking the HTTP client to verify correct request/response behavior.

```python
@pytest.mark.asyncio
async def test_chat():
    mock_client = AsyncMock()
    mock_client.__aenter__ = AsyncMock(return_value=mock_client)
    mock_client.__aexit__ = AsyncMock(return_value=False)
    mock_client.post = AsyncMock(
        return_value=_mock_response({"message": {"content": "Hi there!", "role": "assistant"}})
    )
    with patch("foundry_reverse.ollama_client.httpx.AsyncClient", return_value=mock_client):
        result = await oc.chat(
            model="llama3",
            messages=[{"role": "user", "content": "Hello"}],
        )
    assert result == "Hi there!"
```
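`_mock_response` is defined elsewhere in the test module and is not shown here; a plausible sketch of what it might look like, returning a fake response whose `json()` yields the given payload and whose `raise_for_status()` is a no-op:

```python
from unittest.mock import MagicMock

# Hypothetical reconstruction of the test module's _mock_response helper.
def _mock_response(payload: dict) -> MagicMock:
    resp = MagicMock()
    resp.json.return_value = payload
    resp.raise_for_status.return_value = None
    return resp
```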
- Comment delineating the INFERENCE section, which includes both the `generate` and `chat` tools.

```python
# INFERENCE (generate / chat)
# ────────────────────────────────────────────────────────────────────────────
```