OllamaLLM

Inheritance diagram: OllamaLLM inherits from LLMBase, which inherits from Monitorable.
class council.llm.OllamaLLM(config: OllamaLLMConfiguration)

Bases: LLMBase[OllamaLLMConfiguration]

__init__(config: OllamaLLMConfiguration) → None

Initialize a new instance.

Parameters:

config (OllamaLLMConfiguration) – configuration for the instance
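
A minimal construction sketch. The exact `OllamaLLMConfiguration` constructor arguments (the `model` parameter and the model name below) are assumptions for illustration; check `OllamaLLMConfiguration` for the actual signature.

```python
from council.llm import OllamaLLM, OllamaLLMConfiguration

# NOTE: the "model" keyword and the model name are illustrative assumptions;
# consult OllamaLLMConfiguration for the exact constructor parameters.
config = OllamaLLMConfiguration(model="llama3.2")
llm = OllamaLLM(config)
```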

property client: Client

Ollama Client.

While self._post_chat_request() focuses on chat-based LLM interactions, you can use the client for broader model management, such as listing, pulling, and deleting models, or generating completions and embeddings. See https://github.com/ollama/ollama/blob/main/docs/api.md
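
For instance, a short sketch of querying the underlying client directly, assuming the `llm` instance from the construction example above. The shape of the response returned by `list()` depends on the installed version of the ollama Python client.

```python
# Access the underlying ollama.Client for model management.
local_models = llm.client.list()  # list models available locally
print(local_models)
```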

pull() → Mapping[str, Any]

Download the model from the Ollama library.

load(keep_alive: float | str | None = None) → Mapping[str, Any]

Load the model into memory.

unload() → Mapping[str, Any]

Unload the model from memory.
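
A sketch of the model lifecycle helpers above, assuming the `llm` instance from the construction example. The `keep_alive` value follows Ollama's duration convention (a string such as "10m" or a number of seconds).

```python
llm.pull()                  # download the model from the Ollama library if not present
llm.load(keep_alive="10m")  # keep the model in memory for 10 minutes
# ... send chat requests ...
llm.unload()                # release the memory when done
```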

post_chat_request(context: LLMContext, messages: Sequence[LLMMessage], **kwargs: Any) → LLMResult

Sends a chat request to the language model.

Parameters:
  • context (LLMContext) – a context to track execution metrics

  • messages (Sequence[LLMMessage]) – A list of LLMMessage objects representing the chat messages.

  • **kwargs – Additional keyword arguments for the chat request.

Returns:

The response from the language model.

Return type:

LLMResult

Raises:
  • LLMTokenLimitException – If messages exceed the maximum number of tokens.

  • Exception – If an error occurs during the execution of the chat request.
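
A usage sketch, assuming the `llm` instance from the construction example. `LLMContext.empty()` is used here as a stand-in for a real execution context, and `first_choice` is assumed as the accessor for the first completion on `LLMResult`; if your council version differs, build the context from your agent or skill context and inspect the result's choices instead.

```python
from council.contexts import LLMContext
from council.llm import LLMMessage

messages = [LLMMessage.user_message("Write a haiku about local LLMs.")]
result = llm.post_chat_request(LLMContext.empty(), messages)
print(result.first_choice)
```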

render_as_dict(include_children: bool = True) → Dict[str, Any]

Returns the graph of operations as a dictionary.

render_as_json() → str

Returns the graph of operations as a JSON string.
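
For example, to inspect the monitoring graph of the instance from the construction example:

```python
print(llm.render_as_json())   # monitoring graph as a JSON string
graph = llm.render_as_dict()  # the same information as a dictionary
```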