🔧 Synthetic IT Ticket Generator — Custom Dataset
Discover the expanded version of this dataset with 50,000 ticket entries! Perfect for training models to classify and prioritize support tickets. This dataset includes different files with varying numbers of tickets, languages, and queue configurations.
Get an On-Prem Ticket Tagging AI
Need an on-premises AI to auto-classify tickets? Check out Open Ticket AI — our solution for automated ticket classification that runs in your own infrastructure.
Create Your Own Custom Dataset
Want a dataset tailored to your specific queues, priorities, and language requirements? Use our Synthetic Data Generation Service to create custom ticket data without any personally identifiable information (PII).
- 👉 Define your queues, priorities, and language preferences
- 👉 Generate realistic ticket data for your specific use case
- 👉 Train models on data that matches your business needs
Overview
The Customer IT Support Ticket Dataset is a comprehensive collection of synthetic email tickets designed to support customer support optimization, NLP research, and machine learning projects. The dataset provides well-classified data with complete ticket lifecycle information including customer emails, agent responses, priorities, queues, types, tags, and business context.
Dataset Structure
The dataset offers a detailed structure with classifications by:
- Department/Queue: Where the ticket should be routed
- Type: The nature of the ticket (Incident, Request, Problem, Change)
- Priority: Urgency level (Low, Medium, Critical)
- Language: Multilingual support (EN, DE, ES, FR, PT)
- Subject & Body: Complete email text from customers
- Agent Answer: Professional responses from helpdesk agents
- Business Type: Context of the support organization
- Tags: Additional categorization for detailed analysis
Features and Attributes
| Field | Description | Example Values |
|---|---|---|
| 🔀 Queue | Specifies the department to which the email ticket is routed | Technical Support, Customer Service, Billing and Payments, Product Support, IT Support, Returns and Exchanges, Sales and Pre-Sales, Human Resources, Service Outages and Maintenance, General Inquiry |
| 🚦 Priority | Indicates the urgency and importance of the issue | 🟢 Low, 🟠 Medium, 🔴 Critical |
| 🗣️ Language | Language in which the email is written | EN, DE, ES, FR, PT |
| 📧 Subject | Subject line of the customer's email | Various customer inquiry subjects |
| 📝 Body | Full text content of the customer's email | Detailed customer descriptions |
| 💬 Answer | Response provided by the helpdesk agent | Professional agent responses with solutions |
| 🏷️ Type | Type of ticket as picked by the agent | Incident, Request, Problem, Change |
| 🏢 Business Type | The business type of the support helpdesk | Tech Online Store, IT Services, Software Development Company |
| 🏷️ Tags | Tags/categories assigned to the ticket (10 columns) | Software Bug, Warranty Claim, Password Reset, etc. |
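
As a rough sketch of how one record from the table above could be represented in code, the dataclass below mirrors the listed fields. The attribute names (`subject`, `business_type`, `tags`, and so on) and the example ticket are assumptions made up for illustration; the actual column headers in the downloaded files may differ.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass(frozen=True)
class Ticket:
    """One ticket record; field names are illustrative, not the file's headers."""
    subject: str
    body: str
    answer: str
    type: str            # Incident, Request, Problem, or Change
    queue: str           # one of the 10 departments
    priority: str        # Low, Medium, or Critical
    language: str        # EN, DE, ES, FR, or PT
    business_type: str
    tags: Tuple[str, ...] = ()   # up to 10 tags, one per tag column

    def __post_init__(self):
        # The dataset provides 10 tag columns, so more than 10 tags is an error.
        if len(self.tags) > 10:
            raise ValueError("a ticket carries at most 10 tags")


# A made-up example ticket, not taken from the dataset.
ticket = Ticket(
    subject="Cannot log in after password change",
    body="Since resetting my password I am locked out of the portal.",
    answer="We have unlocked your account. Please try signing in again.",
    type="Incident",
    queue="IT Support",
    priority="Critical",
    language="EN",
    business_type="IT Services",
    tags=("Password Reset", "Account Management"),
)
print(asdict(ticket)["queue"])  # IT Support
```
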
Queue (Department)
Specifies the department to which the email ticket is assigned, so that it can be routed to the appropriate support team for resolution.
| Icon | Queue | Description |
|---|---|---|
| 💻 | Technical Support | Technical issues and support requests |
| 🈂️ | Customer Service | Customer inquiries and service requests |
| 💰 | Billing and Payments | Billing issues and payment processing |
| 🖥️ | Product Support | Support for product-related issues |
| 🌐 | IT Support | Internal IT support and infrastructure issues |
| 🔄 | Returns and Exchanges | Product returns and exchanges |
| 📞 | Sales and Pre-Sales | Sales inquiries and pre-sales questions |
| 🧑‍💻 | Human Resources | Employee inquiries and HR-related issues |
| ❌ | Service Outages and Maintenance | Service interruptions and maintenance |
| 📮 | General Inquiry | General inquiries and information requests |
Priority Levels
Indicates the urgency and importance of the issue. Helps in managing the workflow by prioritizing tickets that need immediate attention.
| Icon | Level | Description | Examples |
|---|---|---|---|
| 🟢 | 1 (Low) | Non-urgent issues that do not require immediate attention | General inquiries, minor inconveniences, routine updates, feature requests |
| 🟠 | 2 (Medium) | Moderately urgent issues that need timely resolution but are not critical | Performance issues, intermittent errors, detailed user questions |
| 🔴 | 3 (Critical) | Urgent issues that require immediate attention and quick resolution | System outages, security breaches, data loss, major malfunctions |
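
Because the three priorities are ordered, it can help to map the labels to the numeric levels 1 to 3 from the table before sorting or modeling. The following is a minimal sketch; the `priority` key and the sample tickets are assumptions for illustration, and the label spelling stored in the files may differ.

```python
# Numeric levels taken from the priority table: 1 = Low, 2 = Medium, 3 = Critical.
PRIORITY_LEVEL = {"low": 1, "medium": 2, "critical": 3}


def priority_level(label: str) -> int:
    """Return the numeric level for a priority label, ignoring case and spaces."""
    return PRIORITY_LEVEL[label.strip().lower()]


# Made-up tickets; "priority" is an assumed column name.
tickets = [
    {"subject": "Feature request", "priority": "Low"},
    {"subject": "System outage", "priority": "Critical"},
    {"subject": "Intermittent errors", "priority": "Medium"},
]

# Most urgent tickets first.
by_urgency = sorted(tickets, key=lambda t: priority_level(t["priority"]), reverse=True)
print([t["subject"] for t in by_urgency])
# ['System outage', 'Intermittent errors', 'Feature request']
```
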
Language Support
Indicates the language in which the email is written. Useful for language-specific NLP models and multilingual support analysis.
| Language Code | Language | Use Case |
|---|---|---|
| en | English | International support, primary language |
| de | German | DACH region support |
| es | Spanish | Spanish-speaking markets |
| fr | French | French-speaking markets |
| pt | Portuguese | Portuguese-speaking markets |
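
For language-specific models, one option is to partition the tickets by language code before training. The sketch below assumes each row is a dict with a `language` key (a hypothetical column name) and normalizes the code to lowercase, since the casing used in the files is not specified here.

```python
from collections import defaultdict


def split_by_language(rows):
    """Group ticket rows by their lowercased language code."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["language"].strip().lower()].append(row)
    return dict(groups)


# Made-up rows for illustration.
rows = [
    {"language": "EN", "subject": "Printer offline"},
    {"language": "de", "subject": "Drucker offline"},
    {"language": "en", "subject": "Invoice question"},
]
groups = split_by_language(rows)
print({code: len(items) for code, items in groups.items()})  # {'en': 2, 'de': 1}
```
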
Ticket Types
Each ticket is assigned one of four types that describe the nature of the request or issue.
| Icon | Type | Description |
|---|---|---|
| ❗ | Incident | Unexpected issue requiring immediate attention |
| 📝 | Request | Routine inquiry or service request |
| ⚠️ | Problem | Underlying issue causing multiple incidents |
| 🔄 | Change | Planned change or update |
Business Types
The business type of the support helpdesk helps in understanding the context of the support provided.
Examples include:
- Tech Online Store
- IT Services
- Software Development Company
- SaaS Provider
- E-commerce Platform
- Enterprise IT Department
Tags and Categories
Tags (categories) are assigned to each ticket to classify it further and to identify common issues or topics. The dataset includes 10 tag columns per ticket.
Example Tags:
- Product Support
- Technical Support
- Sales Inquiry
- Software Bug
- Warranty Claim
- Password Reset
- Network Issue
- Account Management
- Feature Request
- Billing Question
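
Since tags are spread over 10 columns, a common first step is to collapse them into one list per ticket and count how often each tag occurs. The sketch below assumes the columns are named `tag_1` through `tag_10`, which is a guess; adjust `TAG_COLUMNS` to the actual headers. The sample rows are made up.

```python
from collections import Counter

# Assumed column names for the 10 tag columns.
TAG_COLUMNS = [f"tag_{i}" for i in range(1, 11)]


def tags_of(row):
    """Return the non-empty tags of one row, in column order."""
    return [row[c].strip() for c in TAG_COLUMNS if (row.get(c) or "").strip()]


rows = [
    {"tag_1": "Software Bug", "tag_2": "Technical Support", "tag_3": ""},
    {"tag_1": "Password Reset", "tag_2": "Account Management"},
    {"tag_1": "Software Bug", "tag_2": "Feature Request"},
]

tag_counts = Counter(tag for row in rows for tag in tags_of(row))
print(tag_counts.most_common(1))  # [('Software Bug', 2)]
```
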
Use Cases
| Task | Description |
|---|---|
| Text Classification | Train machine learning models to accurately classify email content into appropriate departments, improving ticket routing and handling |
| Priority Prediction | Develop algorithms to predict the urgency of emails, ensuring that critical issues are addressed promptly |
| Customer Support Analysis | Analyze the dataset to gain insights into common customer issues, optimize support processes, and enhance overall service quality |
| NLP Model Training | Build natural language processing models for intent detection, sentiment analysis, and automated response generation |
| Quality Assurance | Train models to evaluate agent response quality and consistency |
| Multilingual Support | Develop language-specific models or test multilingual NLP approaches |
| Agent Training | Use realistic examples to train new support agents on proper response techniques |
| Process Optimization | Identify patterns in ticket resolution to improve support workflows |
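
To make the text classification use case concrete, here is a small bag-of-words Naive Bayes router written with the standard library only. It is a baseline sketch under stated assumptions, not the method used by Open Ticket AI or a recommended production model; the class name `NaiveBayesRouter` and the training sentences are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (Unicode-aware, so accents survive)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesRouter:
    """Multinomial Naive Bayes with add-one smoothing over word counts."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples using two of the dataset's queue names.
texts = [
    "I was charged twice on my invoice",
    "Refund for duplicate payment on my card",
    "The VPN connection drops every hour",
    "Cannot connect to the office network after update",
]
labels = ["Billing and Payments", "Billing and Payments", "IT Support", "IT Support"]

router = NaiveBayesRouter().fit(texts, labels)
print(router.predict("My invoice shows a duplicate charge"))  # Billing and Payments
```

The same class could be trained on priority labels instead of queues for the priority prediction task. With real data, hold out a test split and compare against stronger models before relying on the results.
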
Dataset Statistics
- Total Tickets: 50,000+ entries across different files
- Languages: 5 (EN, DE, ES, FR, PT)
- Queues: 10 different departments
- Priority Levels: 3 (Low, Medium, Critical)
- Ticket Types: 4 (Incident, Request, Problem, Change)
- Business Types: Multiple business contexts
- Tags: Comprehensive categorization with 10 tag columns per ticket
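
The figures above can double as a sanity check after loading a file: every row should use one of the 10 queues, 3 priorities, 4 types, and 5 languages. The sketch below assumes column names `queue`, `priority`, `type`, and `language` (hypothetical) and compares values case-insensitively, because the exact casing in the files is not stated here.

```python
QUEUES = {
    "Technical Support", "Customer Service", "Billing and Payments",
    "Product Support", "IT Support", "Returns and Exchanges",
    "Sales and Pre-Sales", "Human Resources",
    "Service Outages and Maintenance", "General Inquiry",
}
ALLOWED = {
    "queue": QUEUES,
    "priority": {"Low", "Medium", "Critical"},
    "type": {"Incident", "Request", "Problem", "Change"},
    "language": {"EN", "DE", "ES", "FR", "PT"},
}
# Case-folded copies so "en" and "EN" are treated alike.
ALLOWED_FOLDED = {col: {v.casefold() for v in vals} for col, vals in ALLOWED.items()}


def find_invalid(rows):
    """Return (row index, column, value) for every value outside the allowed set."""
    problems = []
    for index, row in enumerate(rows):
        for column, allowed in ALLOWED_FOLDED.items():
            value = (row.get(column) or "").strip()
            if value.casefold() not in allowed:
                problems.append((index, column, value))
    return problems


# Made-up rows: the second one has an unknown queue and priority.
rows = [
    {"queue": "IT Support", "priority": "Critical", "type": "Incident", "language": "en"},
    {"queue": "Billing", "priority": "High", "type": "Request", "language": "DE"},
]
print(find_invalid(rows))  # [(1, 'queue', 'Billing'), (1, 'priority', 'High')]
```
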
Important Links
Download and Access
- Kaggle Dataset - Download the complete dataset
- Open Ticket AI - On-premises AI for automatic ticket classification
- Synthetic Data Generator - Create custom datasets tailored to your needs
Network Diagram Tags
The dataset includes network diagram representations showing relationships between different ticket attributes, helping visualize how queues, priorities, and types interact within the support ecosystem.
Why Use This Dataset?
- ✅ Synthetic Data: No PII, completely safe to use for training and development
- ✅ Comprehensive: Includes full ticket lifecycle from customer email to agent response
- ✅ Multilingual: Support for 5 languages enables international applications
- ✅ Realistic: Generated with realistic business scenarios and agent responses
- ✅ Flexible: Multiple files with different configurations for various use cases
- ✅ Well-Structured: Clean, consistent format ready for immediate use in ML pipelines
Getting Started
- Download the dataset from Kaggle
- Choose the file that best matches your needs (language, size, queue configuration)
- Load the data into your preferred ML framework
- Start training your ticket classification models!
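
The loading step above might look like the following sketch, which uses Python's built-in `csv` module. It assumes the files are CSV with a header row; the column names and the in-memory sample are made up, and `tickets.csv` in the comment is a placeholder rather than an actual file name from the dataset.

```python
import csv
import io


def load_tickets(source):
    """Read all rows from an open text file (or file-like object) as dicts."""
    return list(csv.DictReader(source))


# In-memory stand-in for a downloaded file. With a real file, use something like:
#   with open("tickets.csv", newline="", encoding="utf-8") as f:
#       tickets = load_tickets(f)
sample = io.StringIO(
    "subject,body,queue,priority,language\n"
    '"VPN down","Cannot connect, since this morning.",IT Support,Critical,EN\n'
)
tickets = load_tickets(sample)
print(tickets[0]["queue"])  # IT Support
```

Quoted fields keep commas inside email bodies from being split into extra columns, which is why a CSV parser is preferable to splitting lines by hand.
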
For more advanced features like custom queue definitions, specific business types, or integration with your existing ticketing system, check out Open Ticket AI.
Support This Project
Your support through an upvote on Kaggle would be greatly appreciated! ❤️🙂 Thank you for helping make this resource available to the community.
Conclusion
The Customer IT Support Ticket Dataset is a resource for companies and researchers who want data-driven insight into customer support. With more than 50,000 entries, five languages, 10 tag columns per ticket, and realistic agent responses, it offers a solid starting point for building and evaluating ticket classification systems.
Whether you're training ML models, optimizing support processes, conducting NLP research, or developing automated support solutions, this dataset provides the foundation for success.