AI Text Classification for Support Tickets: Train BERT Models for Automated Ticket Routing

Train AI models for ticket classification using BERT and machine learning. Automate ticket routing, priority prediction, and queue assignment with Python.

Author

Tobias Bueck

This comprehensive guide demonstrates how to train an Artificial Intelligence (AI) for automated ticket classification in helpdesk and support systems like OTOBO. Learn how to implement intelligent ticket routing, automated priority assignment, and smart queue distribution using state-of-the-art machine learning techniques with BERT (Bidirectional Encoder Representations from Transformers).

Whether you're looking to optimize your IT service desk, automate customer support workflows, or implement AI-powered ticket management, this tutorial covers the complete process from data preparation through model training to evaluation.

Why AI-Powered Ticket Classification Matters

Modern support teams handle hundreds or thousands of tickets daily. Manual classification is time-consuming, error-prone, and inconsistent. AI-powered ticket automation offers:

  • Reduced response times through instant, accurate routing
  • Improved first-contact resolution by matching tickets to the right experts
  • Consistent classification across all incoming requests
  • 24/7 automated ticket processing without human intervention
  • Cost savings through workflow optimization and automation

For organizations seeking professional ticket automation services, we offer comprehensive solutions for implementing AI classification systems. Contact us to learn how we can optimize your support workflows with custom AI models and integration services.

Key Use Cases for AI Ticket Classification

Automated ticket classification powered by machine learning can transform various support scenarios:

  • IT Helpdesk: Automatically route hardware, software, and network issues to specialized teams
  • Customer Support: Classify product inquiries, complaints, and technical questions
  • E-commerce: Sort order issues, payment problems, and shipping inquiries
  • SaaS Platforms: Direct bug reports, feature requests, and usage questions appropriately
  • Multi-channel Support: Unify classification across email, chat, phone, and social media

Requirements

  • Python 3.10+
  • Libraries: datasets, transformers[torch], psutil, gputil, nvidia_smi, huggingface_hub, nlpaug, nltk, sentencepiece, pandas, scikit-learn

Install the required packages with:

pip install datasets "transformers[torch]" psutil gputil nvidia_smi huggingface_hub nlpaug nltk sentencepiece pandas scikit-learn

or in a Jupyter Notebook:

!pip install datasets "transformers[torch]" psutil gputil nvidia_smi huggingface_hub nlpaug nltk sentencepiece pandas scikit-learn

Step 1: Data Preparation

First, the ticket data must be prepared. This includes loading the data, cleaning, and preprocessing the text. For this tutorial, we use the following data:

Example Data

| subject | body | priority | queue |
|---|---|---|---|
| Login Issue | Unable to login to the system | High | Software |
| Password Reset | Need to reset my password | Medium | Hardware |
| Email Problem | Not receiving emails | Low | Accounting |
| Network Down | Network is down in building 5 | High | Software |
| Printer Issue | Printer not working | Medium | Hardware |

We use subject and body as features, and priority and queue are the labels we want to predict.

Features and Labels

| Feature 1 (subject) | Feature 2 (body) | Label 1 (priority) | Label 2 (queue) |
|---|---|---|---|
| Login Issue | Unable to login to the system | High | Software |
| Password Reset | Need to reset my password | Medium | Hardware |
| Email Problem | Not receiving emails | Low | Accounting |
| Network Down | Network is down in building 5 | High | Software |
| Printer Issue | Printer not working | Medium | Hardware |

BERT sequence classification takes a single text input, so we combine subject and body into one feature. To give the subject more weight, we concatenate the subject twice and the body once.

import pandas as pd

# Example Data
data = {
    'subject': ["Login Issue", "Password Reset", "Email Problem", "Network Down", "Printer Issue"],
    'body': ["Unable to login to the system", "Need to reset my password", "Not receiving emails",
             "Network is down in building 5", "Printer not working"],
    'priority': ["High", "Medium", "Low", "High", "Medium"],
    'queue': ["Software", "Hardware", "Accounting", "Software", "Hardware"]
}

df = pd.DataFrame(data)

# Create combined feature
df['combined_feature'] = df.apply(lambda row: f"{row['subject']} {row['subject']} {row['body']}", axis=1)

print(df[['combined_feature', 'priority', 'queue']])

Transformed Table

| Combined Feature | Label 1 (priority) | Label 2 (queue) |
|---|---|---|
| Login Issue Login Issue Unable to login to the system | High | Software |
| Password Reset Password Reset Need to reset my password | Medium | Hardware |
| Email Problem Email Problem Not receiving emails | Low | Accounting |
| Network Down Network Down Network is down in building 5 | High | Software |
| Printer Issue Printer Issue Printer not working | Medium | Hardware |

To train the model, we need to convert the labels into numbers. Here is the code to do this:

from sklearn.preprocessing import LabelEncoder

# Initialize Label Encoder
le_priority = LabelEncoder()
le_queue = LabelEncoder()

# Convert labels to numbers
df['priority_encoded'] = le_priority.fit_transform(df['priority'])
df['queue_encoded'] = le_queue.fit_transform(df['queue'])

print(df[['combined_feature', 'priority_encoded', 'queue_encoded']])

Result:

| Combined Feature | priority_encoded | queue_encoded |
|---|---|---|
| Login Issue Login Issue Unable to login to the system | 0 | 2 |
| Password Reset Password Reset Need to reset my password | 2 | 1 |
| Email Problem Email Problem Not receiving emails | 1 | 0 |
| Network Down Network Down Network is down in building 5 | 0 | 2 |
| Printer Issue Printer Issue Printer not working | 2 | 1 |
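The model will later return integers, so the mapping between names and codes has to be reversible. LabelEncoder assigns codes in sorted order of the class names, keeps them in its classes_ attribute, and offers inverse_transform for the way back. The following dependency-free sketch mirrors that behaviour to make the mapping visible; the helper names build_label_map and decode are illustrative and not part of scikit-learn.

```python
def build_label_map(labels):
    """Assign integer codes in sorted order, as LabelEncoder does for strings."""
    classes = sorted(set(labels))
    return {name: code for code, name in enumerate(classes)}


def decode(codes, label_map):
    """Translate integer codes back into label names."""
    inverse = {code: name for name, code in label_map.items()}
    return [inverse[code] for code in codes]


priorities = ["High", "Medium", "Low", "High", "Medium"]
priority_map = build_label_map(priorities)
encoded = [priority_map[p] for p in priorities]

print(priority_map)                   # {'High': 0, 'Low': 1, 'Medium': 2}
print(encoded)                        # [0, 2, 1, 0, 2]
print(decode(encoded, priority_map))  # ['High', 'Medium', 'Low', 'High', 'Medium']
```

Note that the codes follow alphabetical order, not the order of urgency: High is 0 and Medium is 2.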

Since a single classification model predicts only one label, we now have two options.

  1. Combine the two labels into one. This yields combined labels such as High_Software, High_Hardware, High_Accounting, and so on. The number of classes is the product of the number of unique values per label, in our case len(unique(priorities)) * len(unique(queues)) = 3 * 3 = 9.
    Advantages:
    • Simple implementation and management.
    • One model for the entire classification.

    Disadvantages:
    • Increased complexity and size of the classification problem.
    • Potentially worse performance with low data per combination.
  2. Train a separate model for each label. In this tutorial, we use method 2. We have a separate model for each of Queue and Priority.
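The difference in problem size between the two options can be sketched in a few lines. The label values come from the example data; the underscore separator in the combined label is an arbitrary choice made for this illustration.

```python
from itertools import product
from math import prod

priorities = ["High", "Medium", "Low"]
queues = ["Software", "Hardware", "Accounting"]

# Option 1: one combined label per (priority, queue) pair
combined_labels = [f"{p}_{q}" for p, q in product(priorities, queues)]
n_combined = prod(len(values) for values in (priorities, queues))

# Option 2: one model per label, each with its own small class set
n_per_model = {"priority": len(priorities), "queue": len(queues)}

print(len(combined_labels), n_combined)  # 9 9
print(n_per_model)                       # {'priority': 3, 'queue': 3}
```

With more labels or more values per label, the combined class count grows multiplicatively, while the per-model class counts stay small.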

Code to Split the Table into Queue and Priority Table

# Split into Queue and Priority Tables
queue_df = df[['combined_feature', 'queue_encoded']]
priority_df = df[['combined_feature', 'priority_encoded']]

print(queue_df)
print(priority_df)

Table for Queue Model

| Combined Feature | queue_encoded |
|---|---|
| Login Issue Login Issue Unable to login to the system | 2 |
| Password Reset Password Reset Need to reset my password | 1 |
| Email Problem Email Problem Not receiving emails | 0 |
| Network Down Network Down Network is down in building 5 | 2 |
| Printer Issue Printer Issue Printer not working | 1 |

Table for Priority Model

| Combined Feature | priority_encoded |
|---|---|
| Login Issue Login Issue Unable to login to the system | 0 |
| Password Reset Password Reset Need to reset my password | 2 |
| Email Problem Email Problem Not receiving emails | 1 |
| Network Down Network Down Network is down in building 5 | 0 |
| Printer Issue Printer Issue Printer not working | 2 |

Tokenizer Explanation

A tokenizer converts text into smaller units called tokens. Depending on the tokenizer, tokens can be whole words, punctuation marks, or subword pieces; BERT uses WordPiece tokenization, which splits words that are not in its vocabulary into smaller known pieces. Tokenization matters because machine learning and NLP models cannot process raw strings: they need a fixed vocabulary of units from which they can learn patterns.
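To make the idea of subword pieces concrete, here is a simplified sketch of greedy longest-match splitting in the style of WordPiece. The vocabulary is a made-up toy set (a real BERT vocabulary has roughly 30,000 entries), and the function handles a single lowercase word only, so treat it as an illustration rather than a replacement for BertTokenizer.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one lowercase word into the longest matching vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits: the whole word becomes unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary, invented for this example
toy_vocab = {"print", "##er", "##s", "login", "log", "##in", "issue"}

print(wordpiece("printers", toy_vocab))  # ['print', '##er', '##s']
print(wordpiece("login", toy_vocab))     # ['login']
print(wordpiece("xyz", toy_vocab))       # ['[UNK]']
```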

Token Encoding

In token encoding, tokens are converted into numbers so they can be processed by machine learning models. Here is an example of what a tokenized and encoded text for our table might look like:

from transformers import BertTokenizer

# Initialize BERT Tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenization and encoding of an example
example_text = df['combined_feature'][0]
tokens = tokenizer.tokenize(example_text)
encoded_tokens = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
print(encoded_tokens)

Illustrative output for "Login Issue Login Issue Unable to login to the system" (the exact tokens and IDs depend on the tokenizer's vocabulary):

Tokens:

['login', 'issue', 'login', 'issue', 'unable', 'to', 'login', 'to', 'the', 'system']

Encoded Tokens:

[2653, 3277, 2653, 3277, 3928, 2000, 2653, 2000, 1996, 2291]
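Before a list of token IDs reaches the model, it is wrapped in special tokens, truncated or padded to a common length, and accompanied by an attention mask. Calling the tokenizer directly, for example tokenizer(text, padding='max_length', truncation=True, max_length=...), takes care of this. The sketch below reproduces the idea by hand; build_inputs is a hypothetical helper, and the special token IDs are those of bert-base-uncased.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # [CLS], [SEP], [PAD] in bert-base-uncased


def build_inputs(token_ids, max_length):
    """Add [CLS]/[SEP], truncate, pad, and build the attention mask."""
    body = token_ids[: max_length - 2]     # leave room for [CLS] and [SEP]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)  # 1 marks a real token
    padding = max_length - len(input_ids)
    input_ids += [PAD_ID] * padding
    attention_mask += [0] * padding        # 0 marks padding, which the model ignores
    return {"input_ids": input_ids, "attention_mask": attention_mask}


some_ids = [3277, 2000, 1996, 2291]  # a few IDs taken from the example output above
inputs = build_inputs(some_ids, max_length=8)
print(inputs["input_ids"])       # [101, 3277, 2000, 1996, 2291, 102, 0, 0]
print(inputs["attention_mask"])  # [1, 1, 1, 1, 1, 1, 0, 0]
```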

Splitting Tables into Train and Test Dataset

To train and test our models, we split the data into training and test datasets. Here is the code to do this:

from sklearn.model_selection import train_test_split

# Split the Queue table into train and test datasets
queue_train, queue_test, y_queue_train, y_queue_test = train_test_split(queue_df['combined_feature'],
                                                                        queue_df['queue_encoded'], test_size=0.2,
                                                                        random_state=42)

# Split the Priority table into train and test datasets
priority_train, priority_test, y_priority_train, y_priority_test = train_test_split(priority_df['combined_feature'],
                                                                                    priority_df['priority_encoded'],
                                                                                    test_size=0.2, random_state=42)

print(queue_train, queue_test, y_queue_train, y_queue_test)
print(priority_train, priority_test, y_priority_train, y_priority_test)

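Splitting the data lets us evaluate the models on tickets they have not seen during training, which gives a realistic estimate of how they will perform on new tickets. With only five example tickets, a 20% test set contains a single ticket; a real project needs far more data per class.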
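With realistic ticket volumes, it usually helps to keep the class proportions the same in both sets. scikit-learn's train_test_split supports this through its stratify argument. The following dependency-free sketch shows the underlying idea on a hypothetical toy dataset; stratified_split is an illustrative helper, not a library function.

```python
import random
from collections import defaultdict


def stratified_split(texts, labels, test_size=0.2, seed=42):
    """Split so each class contributes roughly test_size of its examples to the test set."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Classes with a single example stay entirely in the training set
        n_test = max(1, round(len(indices) * test_size)) if len(indices) > 1 else 0
        test_idx += indices[:n_test]
        train_idx += indices[n_test:]

    def pick(idx):
        return [texts[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


# Hypothetical toy dataset: 10 tickets for each of 3 queues
labels = [0] * 10 + [1] * 10 + [2] * 10
texts = [f"ticket {i}" for i in range(len(labels))]

(train_texts, train_labels), (test_texts, test_labels) = stratified_split(texts, labels)
print(len(train_texts), len(test_texts))  # 24 6
print(sorted(set(test_labels)))           # [0, 1, 2]
```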

Step 2: Model Training

In this step, we train the models on our training data. We use the transformers library from Hugging Face together with torch to fine-tune BERT models.

BERT Model

BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model capable of capturing the context of words in a sentence. It is commonly used for tasks such as text classification, question answering, and many other NLP tasks.

Parameters for Training

  • batch_size: The number of examples processed in one pass through the model. Smaller batch sizes require less memory but result in more frequent updates of the model parameters.
  • epochs: The number of complete passes through the entire training dataset. More epochs can lead to a better model but risk overfitting.
  • learning_rate: The step size with which the model adjusts its parameters. Too high a learning rate can lead to unstable training processes, while too low a learning rate can result in slow learning.
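In the Hugging Face TrainingArguments class, these settings correspond to per_device_train_batch_size, num_train_epochs, and learning_rate. To get a feeling for how batch size and epochs interact, the sketch below counts the number of parameter updates for a hypothetical dataset of 10,000 tickets, assuming one update per batch (no gradient accumulation, a single device).

```python
import math


def training_steps(num_examples, batch_size, epochs):
    """Number of parameter updates, assuming one update per batch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset of 10,000 tickets, trained for 3 epochs
for batch_size in (8, 16, 32):
    print(batch_size, training_steps(10_000, batch_size, epochs=3))
# 8 3750
# 16 1875
# 32 939
```

Halving the batch size roughly doubles the number of updates per epoch, which is why batch size and learning rate are usually tuned together.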

Initializing the Model

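We define a class TicketClassifier that loads the tokenizer and model, tokenizes the texts, and runs the training. The number of classes is passed in rather than hard-coded, and each label gets its own classifier instance so that the queue model and the priority model do not share weights.

from datasets import Dataset
from transformers import BertForSequenceClassification, BertTokenizer, Trainer, TrainingArguments


class TicketClassifier:
    def __init__(self, model_name: str, num_labels: int, output_dir: str):
        self.tokenizer = BertTokenizer.from_pretrained(model_name)
        self.model = BertForSequenceClassification.from_pretrained(model_name, num_labels=num_labels)
        self.output_dir = output_dir

    def encode(self, texts, labels) -> Dataset:
        # Tokenize the raw texts; pad and truncate so every example has the same length
        encodings = self.tokenizer(list(texts), truncation=True, padding='max_length', max_length=64)
        return Dataset.from_dict({**dict(encodings), 'labels': [int(label) for label in labels]})

    def train(self, train_texts, train_labels) -> Trainer:
        training_args = TrainingArguments(output_dir=self.output_dir)
        trainer = Trainer(model=self.model, args=training_args,
                          train_dataset=self.encode(train_texts, train_labels))
        trainer.train()
        return trainer


# One classifier per label; the class counts come from the fitted label encoders
queue_classifier = TicketClassifier('bert-base-uncased', num_labels=len(le_queue.classes_),
                                    output_dir='./results_queue')
priority_classifier = TicketClassifier('bert-base-uncased', num_labels=len(le_priority.classes_),
                                       output_dir='./results_priority')

Training the Model

We train each model on its training texts and labels (queue_train with y_queue_train, priority_train with y_priority_train) and keep the test split for evaluation.

# Training the Queue model
trainer_queue = queue_classifier.train(queue_train, y_queue_train)

# Training the Priority model
trainer_priority = priority_classifier.train(priority_train, y_priority_train)

Model Evaluation

After training, we evaluate each model on its test data, tokenized in the same way as the training data.

# Evaluating the Queue model
eval_queue_results = trainer_queue.evaluate(eval_dataset=queue_classifier.encode(queue_test, y_queue_test))
print(eval_queue_results)

# Evaluating the Priority model
eval_priority_results = trainer_priority.evaluate(eval_dataset=priority_classifier.encode(priority_test, y_priority_test))
print(eval_priority_results)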

These steps give us one trained model per label and a first evaluation on held-out data.

By default, the evaluation reports the loss on the test set. To judge how useful a model is for routing, we also compute metrics on its predictions. One of the most important metrics is accuracy, which indicates what share of the predictions is correct.

Prediction Accuracy

Accuracy is calculated by dividing the number of correct predictions by the total number of predictions. Here is Python code to calculate the accuracy:

from sklearn.metrics import accuracy_score

# Example predictions and actual labels
y_true = [0, 2, 1, 0, 2]  # Actual labels
y_pred = [0, 2, 1, 0, 1]  # Predicted labels

# Calculating accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
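In practice, the predicted labels are derived from the model's raw scores (logits), one per class and ticket; with the Hugging Face Trainer, such scores are available through trainer.predict(...). The predicted class is the index of the highest score. The sketch below uses made-up logits for five tickets and three classes to show that step without any dependencies; argmax and accuracy_from_logits are illustrative helpers.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_from_logits(logits, y_true):
    """Pick the highest-scoring class per example and compare with the true labels."""
    y_pred = [argmax(row) for row in logits]
    correct = sum(p == t for p, t in zip(y_pred, y_true))
    return y_pred, correct / len(y_true)


# Made-up logits for five tickets and three classes
logits = [
    [2.1, 0.3, -1.0],
    [-0.5, 0.2, 1.7],
    [0.1, 1.4, 0.0],
    [3.0, -0.2, 0.4],
    [0.2, 0.9, 0.6],
]
y_true = [0, 2, 1, 0, 2]

y_pred, accuracy = accuracy_from_logits(logits, y_true)
print(y_pred)                              # [0, 2, 1, 0, 1]
print(f"Accuracy: {accuracy * 100:.2f}%")  # Accuracy: 80.00%
```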

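Evaluation with Ordinal Labels

Priority levels are ordinal rather than continuous: they have a natural order, so it matters how far a prediction is from the true value. For example, predicting 2 when the true value is 1 is a smaller error than predicting 3. This only holds if the integer codes follow the priority order (for example Low = 0, Medium = 1, High = 2). LabelEncoder assigns codes alphabetically (High = 0, Low = 1, Medium = 2), so priorities need an explicit ordered mapping before distance-based metrics are meaningful. Two such metrics are the mean absolute error and the mean squared error.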

Mean Absolute Error

The mean absolute error measures how far predictions are from the actual values on average. Here is Python code to calculate the mean absolute error:

import numpy as np

# Example predictions and actual labels
y_true = np.array([0, 2, 1, 0, 2])  # Actual labels
y_pred = np.array([0, 2, 1, 0, 1])  # Predicted labels

# Calculating mean absolute error
mean_absolute_error = np.mean(np.abs(y_true - y_pred))
print(f"Mean Absolute Error: {mean_absolute_error}")

Mean Squared Error

The mean squared error measures the squared deviation of predictions from actual values, which weights larger errors more heavily. Here is Python code to calculate the mean squared error:

# Calculating mean squared error
mean_squared_error = np.mean((y_true - y_pred) ** 2)
print(f"Mean Squared Error: {mean_squared_error}")
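Both error measures are only meaningful when the integer codes reflect the order of the priorities. Since LabelEncoder assigns codes alphabetically (High = 0, Low = 1, Medium = 2), one option is to map the priority names to explicit ranks before computing the errors. The sketch below does this without dependencies; the rank mapping and the helper ordinal_errors are choices made for this example.

```python
PRIORITY_RANK = {"Low": 0, "Medium": 1, "High": 2}  # explicit order, chosen for this example


def ordinal_errors(true_labels, predicted_labels, rank=PRIORITY_RANK):
    """Mean absolute and mean squared error on rank positions."""
    diffs = [rank[p] - rank[t] for t, p in zip(true_labels, predicted_labels)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse


true_labels = ["High", "Medium", "Low", "High", "Medium"]
near_miss = ["High", "Medium", "Low", "Medium", "Medium"]  # one High predicted as Medium
far_miss = ["High", "Medium", "Low", "Low", "Medium"]      # one High predicted as Low

print(ordinal_errors(true_labels, near_miss))  # (0.2, 0.2)
print(ordinal_errors(true_labels, far_miss))   # (0.4, 0.8)
```

Both prediction lists have the same accuracy (80%), but the error measures show that mistaking High for Low is the more serious miss, and the squared error penalizes it more strongly.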

With these metrics, we can better understand and improve the performance of our models. Accuracy gives an overall view of the performance, while mean absolute error and mean squared error add detail for ordered labels such as priority by showing how far off the wrong predictions are.

Professional Ticket Automation Services

While this tutorial provides the foundation for building AI-powered ticket classification systems, implementing production-ready solutions requires additional expertise in:

  • Model optimization and fine-tuning for your specific ticket types
  • Integration with existing ticket systems (OTOBO, Znuny, Zammad, and others)
  • Scalable deployment and infrastructure setup
  • Continuous model improvement based on feedback and new data
  • Multi-language support for international helpdesks
  • Custom classification categories tailored to your business needs

Our Ticket Automation Services

We offer comprehensive ticket automation consulting and implementation services:

  • AI Model Development: Custom-trained models for your specific ticket types and workflows
  • System Integration: Seamless integration with OTOBO, OTRS, Znuny, and other ticketing platforms
  • API Development: REST APIs for real-time ticket classification and routing
  • Training & Support: Team training and ongoing technical support
  • Performance Optimization: Continuous model improvement and fine-tuning

Learn more about our automated ticket classification solutions at OpenTicketAI.com - a specialized platform for AI-powered ticket management and intelligent helpdesk automation.

Ready-to-Use Solutions

If you prefer a managed solution over building from scratch, or have custom requirements or an enterprise implementation in mind, contact us at atc-api@softoft.de to discuss your ticket automation needs.

Summary

In this article, we demonstrated how to train an AI model for automated ticket classification using BERT and Python. By leveraging machine learning libraries like transformers and scikit-learn, you can create intelligent systems that automatically classify support tickets by priority, queue, and category.

This AI-powered approach to ticket management significantly improves:

  • Support team efficiency through automated routing
  • Customer satisfaction via faster response times
  • Operational costs by reducing manual classification work
  • Ticket handling consistency across your organization

The techniques covered here form the foundation for building sophisticated helpdesk automation and intelligent ticket routing systems. Whether you're implementing this yourself or seeking professional assistance, AI-powered ticket classification is transforming modern customer support operations.