AI APIs are the primary interface for integrating large language models into your applications. This lesson covers how to make API calls using both REST and SDKs, how the chat message format works, how to stream responses, and how to handle errors gracefully.
Most modern LLM APIs use a chat completions interface. You send a list of messages (conversation history) and receive a model-generated response.
| Role | Purpose |
|---|---|
| `system` | Sets the model's behaviour, persona, or instructions |
| `user` | The human user's input |
| `assistant` | The model's previous responses (for context) |
```python
messages = [
    {"role": "system", "content": "You are a concise coding assistant."},
    {"role": "user", "content": "Write a Python function to reverse a string."},
]
```
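The `assistant` role is what lets you carry context across turns: after each response, append the model's reply to the list before sending the next user message. A sketch (the assistant reply here is a hard-coded placeholder, not a real API response):

```python
messages = [
    {"role": "system", "content": "You are a concise coding assistant."},
    {"role": "user", "content": "Write a Python function to reverse a string."},
]

# Placeholder standing in for the model's actual reply.
assistant_reply = "def reverse(s):\n    return s[::-1]"

# Append the assistant turn, then the user's follow-up, so the next
# API call sees the whole conversation history.
messages.append({"role": "assistant", "content": assistant_reply})
messages.append({"role": "user", "content": "Now add a docstring."})
```

The API itself is stateless: the model only knows what is in `messages`, so your application is responsible for accumulating the history on every call.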
```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0.7,
    max_tokens=256,
)

answer = response.choices[0].message.content
print(answer)
```
| Parameter | Description | Typical Value |
|---|---|---|
| `model` | Which model to use | `gpt-4o-mini` |
| `temperature` | Randomness (0 = deterministic, 2 = very creative) | 0.0–1.0 |
| `max_tokens` | Maximum length of the response, in tokens | 256–4096 |
| `top_p` | Nucleus sampling (alternative to temperature) | 0.9–1.0 |
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(message.content[0].text)
```
Note: Anthropic's API uses a top-level `system` parameter rather than a system message in the `messages` array.
For long responses, streaming delivers tokens as they are generated, improving perceived latency.
```python
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
```python
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
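For graceful error handling, transient failures such as rate limits are usually wrapped in a retry loop with exponential backoff. A minimal sketch (the helper name `with_retries` is my own; in real code you would catch the SDK's specific exceptions, e.g. rate-limit or connection errors, rather than bare `Exception`):

```python
import random
import time

def with_retries(call, max_attempts=4, base_delay=1.0):
    """Run an API call, retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:  # narrow this to the SDK's rate-limit/connection errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with a little jitter: ~1s, 2s, 4s, ...
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Usage is the same for either SDK, e.g. `answer = with_retries(lambda: client.chat.completions.create(...))`.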
You can also use the APIs directly via HTTP if you prefer not to use an SDK:
```python
import requests
import os
```
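The snippet above is cut off; a sketch of how a direct HTTP call likely continues, POSTing to OpenAI's documented chat completions endpoint with a bearer token. The helper names `build_payload` and `chat_completion` and the `max_tokens` default are my own, and `OPENAI_API_KEY` must be set in the environment:

```python
import os
import requests

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(messages, model="gpt-4o-mini", max_tokens=256):
    """Assemble the JSON body expected by the chat completions endpoint."""
    return {"model": model, "messages": messages, "max_tokens": max_tokens}

def chat_completion(messages, **kwargs):
    """POST the payload and return the first choice's reply text."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json=build_payload(messages, **kwargs),
        timeout=30,
    )
    resp.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return resp.json()["choices"][0]["message"]["content"]
```

The raw HTTP route is handy in languages without an official SDK, but you give up the SDK's built-in retries and typed responses.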