1.2 Classification

In classification tasks, we provide the model with clear context and ask it to assign a given text to one of several predefined categories. Message roles in this dialogue are simple and complementary: the system message describes the task and lists the allowed classes, while the user message contains the text fragment to be assigned to one of those classes. This order — system first, then user — establishes an unambiguous context in which the model responds predictably and consistently.

To lock down the format, let’s start with the smallest example: classify a customer review by sentiment — “Positive”, “Negative”, or “Neutral”. The system message gives a direct instruction, and the user message provides the review text to be evaluated.

system_message = """Classify the customer review into one of the categories: Positive, Negative, or Neutral."""

For user_message, use the review you want to classify:

user_message = """I recently bought a product at your store. The purchase went great, and the quality exceeded my expectations!"""

This dialogue follows the common Chat Completions API pattern: each message is a structure with role and content keys. role indicates the source (system or user), and content carries the text. Separating roles lets you initialize the model’s behavior up front and then pass in the specific input. In the simplest case, the system message sets rules and style, and the user message formulates the task. For example, if you want a playful poem about a happy carrot, you might first set the instruction {'role': 'system', 'content': "You are an assistant who replies in a playful poet’s style."}, then send {'role': 'user', 'content': "Write a very short poem about a happy carrot."}. The very same user request would produce a different tone and format under a different system context, which is why the system → user sequence is key to controlling model behavior.

A complete working example for classifying reviews is built from these same elements; the only difference is that we call the model from code and return the answer:

import os
from openai import OpenAI
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())  # read local .env

client = OpenAI()

def classify(messages, model="gpt-4o-mini", temperature=0, max_tokens=500):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message["content"]

delimiter = "####"
system_message = """Classify the customer review into one of the categories: Positive, Negative, or Neutral."""

user_message = """I recently bought a product at your store. The purchase went great, and the quality exceeded my expectations!"""

messages = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': f"{delimiter}{user_message}{delimiter}"},
]

response = classify(messages)
print(response)

The same principles power other classification scenarios. In email, it’s useful to separate work messages from personal and spam (categories: Work, Personal, Spam). A suitable system message might be: “Classify the following email as Work, Personal, or Spam.” with a sample user message: “Great discounts on our new electronics! Click now and save.” For movie review sentiment, distinguish “Positive”, “Negative”, and “Neutral”: the system message could be “Determine the sentiment of the following movie review: Positive, Negative, or Neutral.” and the user message: “Visually stunning, but the plot is predictable and shallow.” For news, classify the topic as Politics, Technology, Sports, or Entertainment: “Determine the topic of the news item: Politics, Technology, Sports, or Entertainment.” with “A new smartphone model uses breakthrough technology that’s reshaping the industry.”

For product ratings from reviews, star classes work well — 1, 2, 3, 4, or 5: “Based on the review, assign a rating from 1 to 5 stars.” and “The design is interesting, but frequent breakdowns and weak support make it hard to recommend.” When routing customer requests, common intents are Billing, Support, Sales, or General Question — “Identify the intent of the request: Billing, Support, Sales, or General Question.” and “Tell me about available plans and current promotions.” For text genre classification, use categories like Fiction, Non‑fiction, Poetry, News — “Identify the genre of the text: Fiction, Non‑fiction, Poetry, or News.” and “In the heart of the city, among noisy streets, there was a garden untouched by time.”

On social media, automatic tone assessment is valuable — Serious, Ironic, Inspiring, or Irritated. A fitting system message: “Determine the tone of the following post: Serious, Ironic, Inspiring, or Irritated.” with a user example: “There’s nothing better than starting the day with a smile. Happiness is contagious!” In academic writing, classify the field: Biology, Computer Science, Psychology, or Mathematics — “Identify the field of the following abstract: Biology, Computer Science, Psychology, or Mathematics.” and “This study examines the algorithmic complexity of sorting methods and their efficiency.” In food reviews, you might extract the flavor profile: Sweet, Salty, Sour, Bitter, Umami — “Identify the flavor profile in the review: Sweet, Salty, Sour, Bitter, or Umami.” and “A dish with a perfect balance of umami and a light sweetness that enhances the taste.” Finally, for emergency calls, quickly determine the situation type: Fire, Medical, Crime, or Other — “Identify the emergency type from the call transcript: Fire, Medical, Crime, or Other.” and “The building next door is filled with smoke; we can see flames. Please help urgently!” In all of these cases, the key to quality answers is a clear system message that defines the boundaries and lists the categories; the user message remains a concise carrier of the text to be labeled.

For each scenario, you can freely change the user_message content for the specific case; the important part is keeping the system message concrete and unambiguous about the set of allowed labels.

Theory Questions

What are the key components of a message when working with GPT models (role and content), and why is it important to distinguish them?
How does the role of system messages differ from user messages in a dialogue with the AI?
Provide an example of how a system message can set the model’s behavior or response style.
How does the system → user message sequence influence the model’s answer?
In the review classification example, which categories are used?
Describe a scenario where classifying the sentiment of a movie review is useful. Which categories fit?
How does classifying the topic of a news article help with content management or recommendations? Give category options.
Discuss the importance of classifying customer requests in business. Which categories help optimize support?
What is the role of user_message in classification tasks, and how should it be structured for accurate results?
How is classifying the tone of social posts useful for moderation or marketing? Provide example categories.