Your team has decided to add ML to the product. Now someone in the room says “we need to predict this”, someone else says “no, we need to classify it”, and a third person is sketching a forecasting model on the whiteboard. The meeting ends without a decision.
This confusion is not a technical problem. It is a framing problem. Prediction, classification, and forecasting are three distinct ML tasks that answer three different types of questions. Getting clear on which one you need before writing a line of code is the decision that determines everything else: what data you need, how long it will take, what the output looks like, and whether the team using it will trust it.
This article gives you the framework to make that call.
Why the distinction matters before you start building
The three terms get used interchangeably in most business conversations about AI. That imprecision has a cost. A team that builds a classification model when they needed a forecast will spend months training a system that answers the wrong question. A team that reaches for a forecasting algorithm when a simple prediction model would have sufficed will spend infrastructure budget they did not need to spend.
According to research published in the Harvard Business Review on AI adoption in organisations, one of the most common reasons ML projects fail to deliver value is misalignment between the business question and the ML task type chosen. The technical execution is often sound. The problem definition is not.
Getting the task type right first is the highest-leverage decision in any ML project.
What are prediction, classification, and forecasting in plain terms?
Prediction answers the question: what will happen? It takes historical data and produces a probability or score about a future event. The model learns from past outcomes to estimate the likelihood of future ones. Will this customer churn? How likely is this lead to convert? What is the risk score for this account? The output is always a number or a probability, not a category.
Classification answers the question: what category does this belong to? It assigns an input to one of a predefined set of labels the model has been trained to recognise. Is this transaction fraudulent or legitimate? Is this support ticket urgent or routine? Is this email spam or not? The output is always a label, not a number.
Forecasting answers the question: how much, and by when? It predicts the future value of a metric at a specific point or across a time horizon, using patterns from historical time-series data. What will revenue look like next quarter? How many units will the warehouse need in 30 days? What will support ticket volume be next week? The output is a value tied to a future moment in time.
The practical test: if you are trying to produce a number or probability about a future event, you need prediction. If you are trying to assign a label, you need classification. If you are tracking a metric over time and projecting it forward, you need forecasting.
Which ML task fits your business question?
Before choosing an algorithm, the team needs to agree on what the system is being asked to produce. The flowchart above gives you a decision path. Start with the output type, then narrow to the specific task. If the path leads to regression rather than prediction, that is still a valid ML approach and often the right one for continuous value outputs like pricing and revenue estimation.
The most expensive mistake is skipping this step entirely and jumping straight to model selection. The algorithm is irrelevant until the output type is defined.
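As a rough illustration, the practical test described earlier can be written as a tiny helper that forces the output-type question to be answered first. This is a minimal sketch, not a reproduction of the flowchart: the `Task` enum, the `choose_task` function, and its two boolean parameters are hypothetical names introduced only for this example.

```python
from enum import Enum


class Task(Enum):
    PREDICTION = "prediction"
    CLASSIFICATION = "classification"
    FORECASTING = "forecasting"


def choose_task(output_is_label: bool, metric_over_time: bool) -> Task:
    """Map the desired output to a task type, following the practical test."""
    if metric_over_time:
        # A metric tracked over time and projected forward.
        return Task.FORECASTING
    if output_is_label:
        # One label from a predefined set of categories.
        return Task.CLASSIFICATION
    # A number or probability about a future event.
    return Task.PREDICTION


# Three questions from the article, expressed as output types.
print(choose_task(output_is_label=False, metric_over_time=False).value)  # churn score
print(choose_task(output_is_label=True, metric_over_time=False).value)   # ticket triage
print(choose_task(output_is_label=False, metric_over_time=True).value)   # next-quarter revenue
```

The helper is deliberately trivial. Its value is that nobody can call it without first stating what the system is supposed to produce.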
Real use cases by task type
Prediction use cases
Customer churn prediction is the most common SaaS use case for ML prediction. The model is trained on historical data from accounts that churned and accounts that did not, learning the combination of signals that preceded cancellation. The output is a churn probability score per account that updates as new usage data comes in. A sales or customer success team can act on the score before the customer reaches the cancellation decision.
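To make the idea of a churn probability score concrete, here is a minimal sketch that fits a logistic model to a handful of accounts using only the Python standard library. Everything in it is an assumption made for illustration: the toy `HISTORY` data, the two features (logins and open tickets), and the training settings. A production system would normally use an established ML library and far more data.

```python
import math

# Hypothetical toy history: (logins in last 30 days, open support tickets, churned?).
HISTORY = [
    (25, 0, 0), (30, 1, 0), (18, 0, 0), (22, 2, 0),
    (3, 4, 1), (1, 2, 1), (5, 5, 1), (2, 3, 1),
]


def scale(logins, tickets):
    """Bring both features into a roughly 0-1 range so training is stable."""
    return (logins / 30.0, tickets / 5.0)


def churn_probability(weights, bias, features):
    """Logistic model: squash a weighted sum of features into a probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(history, epochs=2000, learning_rate=0.5):
    """Fit weights with plain batch gradient descent on log loss."""
    rows = [(scale(logins, tickets), churned) for logins, tickets, churned in history]
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, churned in rows:
            error = churn_probability(weights, bias, features) - churned
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


weights, bias = train(HISTORY)
quiet_account = churn_probability(weights, bias, scale(2, 4))
active_account = churn_probability(weights, bias, scale(28, 0))
print(f"quiet account churn probability:  {quiet_account:.2f}")
print(f"active account churn probability: {active_account:.2f}")
```

The output per account is a score between 0 and 1, not a label. Re-scoring an account as new usage data arrives is just another call to `churn_probability`.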
Lead conversion prediction works on the same principle. The model is trained on historical leads that converted and leads that did not, using CRM data, engagement signals, firmographics, and behavioural data. The output is a conversion probability per lead that allows the sales team to prioritise effort without relying on intuition.
Payment default prediction is the standard approach for credit risk in fintech and lending products. The model learns from historical repayment data and produces a default probability per borrower at the point of application or at each billing cycle.
Account expansion prediction identifies which existing accounts are ready for an upsell or expansion conversation, trained on the behavioural patterns that preceded previous expansions in the customer base.
Classification use cases
Fraud detection is one of the oldest and most production-hardened classification use cases in ML. The model is trained on labelled historical transactions, flagged as fraudulent or legitimate, and assigns a label to each incoming transaction in real time. The challenge in fraud classification is handling severe class imbalance: fraudulent transactions are a tiny fraction of the total, and the model must learn to detect them without flagging legitimate activity.
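The class imbalance problem is easier to see with numbers. The sketch below uses a made-up sample of 1,000 transactions with a 1 percent fraud rate and compares a "model" that never flags anything with one that catches most fraud at the cost of some false alarms. The data and both sets of predictions are hypothetical.

```python
# Hypothetical labelled sample: 1 = fraudulent, 0 = legitimate (1% fraud).
labels = [1] * 10 + [0] * 990

# Baseline that labels every transaction as legitimate.
never_flags = [0] * 1000

# Detector that catches 8 of the 10 frauds and raises 20 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970


def accuracy(y_true, y_pred):
    """Share of all transactions labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual fraud that was flagged."""
    flagged_fraud = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return flagged_fraud / sum(y_true)


def precision(y_true, y_pred):
    """Share of flagged transactions that were actually fraud."""
    flagged = sum(y_pred)
    if flagged == 0:
        return 0.0  # nothing flagged, so no correct flags either
    return sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / flagged


for name, preds in [("never flags", never_flags), ("detector", detector)]:
    print(name, accuracy(labels, preds), recall(labels, preds), round(precision(labels, preds), 3))
```

The baseline scores higher on accuracy (0.99 versus 0.978) while catching no fraud at all, which is why accuracy alone is a poor yardstick for imbalanced classification.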
Support ticket triage classifies incoming tickets into categories: urgent versus routine, bug report versus feature request, billing versus technical. The model is trained on historical tickets with known resolutions and routes new tickets to the right team without human intervention at the first step.
Lead quality classification assigns a quality label to inbound leads before a sales rep reviews them. The model is trained on the characteristics of past leads that were qualified, not qualified, or marked as not a fit, and applies those patterns to new inbound leads automatically.
Content moderation classifies user-generated content against a defined set of categories: compliant, policy violation, borderline, or requires human review. The model handles the high-volume routine decisions and routes edge cases to a human moderator.
Forecasting use cases
Revenue forecasting applies time-series ML to project MRR or ARR over a defined horizon, typically a quarter or a fiscal year. The model learns from historical revenue patterns including seasonal effects, growth trends, and the impact of past campaigns or pricing changes, and produces a forward projection with confidence intervals.
Demand planning forecasts how much of a product, service, or resource will be needed at a future point in time. For SaaS products this might be infrastructure capacity. For e-commerce it is inventory. For services businesses it is staffing. The model learns from historical demand patterns and external signals to produce a forward estimate the operations team can act on.
Support volume forecasting predicts how many tickets will arrive next week or next month, broken down by category. This allows the support team to staff appropriately, plan coverage, and avoid reactive hiring decisions.
Hiring pipeline forecasting projects how many open roles the organisation will need to fill within a defined period, based on historical attrition rates, growth trajectories, and departmental headcount plans.
What each ML task actually requires from your team and data
Before committing to any ML task type, the team needs to understand what that task actually demands. The requirements matrix above gives you a comparison across four dimensions: data volume, label quality, time to first result, and infrastructure cost.
Prediction tasks are the most accessible entry point for most SaaS teams. They require medium data volumes, medium label quality (you need historical outcomes but the labelling is often implicit in your existing data), and are achievable with low infrastructure cost and relatively fast time to first result.
Classification tasks have the most demanding label quality requirements. Every training example needs to be correctly labelled, and in cases like fraud detection that labelling needs to be accurate and consistent. The consequences of a mislabelled training set show up directly in the model’s production error rate.
Forecasting tasks have the highest data volume requirements. Time-series models need enough historical data to learn seasonal patterns, trends, and cycles. A model trained on three months of data will not capture annual seasonality. For most business forecasting applications, a minimum of one to two years of historical data produces meaningfully reliable projections.
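One way to see why history length matters is a seasonal naive baseline, which forecasts each future period by repeating the value from one season earlier. This is a simple, well-known baseline swapped in for illustration, not the method the article prescribes, and the monthly ticket counts are invented. With fewer observations than one full season, the baseline cannot be computed at all.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future step by repeating the value one season earlier."""
    if len(history) < season_length:
        raise ValueError(
            f"need at least {season_length} observations, got {len(history)}"
        )
    last_season_start = len(history) - season_length
    # step % season_length keeps repeating the last observed season
    # when the horizon is longer than one season.
    return [history[last_season_start + (step % season_length)] for step in range(horizon)]


# Hypothetical monthly ticket volume, January to December, with a year-end spike.
monthly_tickets = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300]

# Twelve months of history is enough to project the next quarter.
print(seasonal_naive_forecast(monthly_tickets, season_length=12, horizon=3))

# Three months of history cannot capture annual seasonality.
try:
    seasonal_naive_forecast(monthly_tickets[:3], season_length=12, horizon=3)
except ValueError as exc:
    print("cannot forecast:", exc)
```

Real forecasting models do more than repeat last year, but they face the same constraint: a pattern that never appears in the training window cannot be learned.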
According to Google’s research on production ML systems published in their technical infrastructure blog, the data pipeline and labelling infrastructure account for a larger share of ML project time and cost than the model training itself. Understanding the data requirements before starting is the most effective form of scope management available to a product team.
Five questions to answer before starting any ML project
Do you have historical data where the outcome already happened? ML learns from examples. If customers have churned, if fraud has occurred, if revenue has been recorded, if tickets have been routed, you have the training signal. Without historical outcomes, there is nothing for the model to learn from. The answer to this question determines whether you can start building now or need to invest in data collection first.
Can you define what a correct answer looks like? Classification and prediction models both require that you can label training examples correctly. If your team cannot agree on what “at risk” means, what “fraudulent” means, or what “high quality” means, the model cannot learn those distinctions. Clarifying the definition is a business problem, not a technical one, and it needs to be solved before any algorithm is selected.
Is your data structured or unstructured? Tabular data from a CRM, database, or event log points toward classical ML: gradient boosting, random forests, logistic regression. Images, audio, video, or raw text at scale points toward deep learning. The data type is one of the most reliable signals for which class of algorithm is appropriate.
Does the answer need to be explainable to a non-technical stakeholder? A churn score a sales rep cannot understand will not be used. A fraud flag a compliance team cannot justify will create operational and legal risk. If explainability is a requirement, it is a constraint on algorithm selection from the start. Decision trees, logistic regression, and gradient boosting with feature importance satisfy this requirement. Most deep learning architectures do not, unless additional interpretability tooling is added.
What happens if the model gets it wrong? The acceptable false positive rate for a fraud detection model is very different from the acceptable false positive rate for a lead scoring model. Defining the cost of each error type before training begins determines which evaluation metric to optimise for, which threshold to set for predictions, and how to calibrate the model for production use.
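The last question can be made concrete by pricing each error type and picking the decision threshold that minimises total cost on a validation set. The sketch below is one possible way to do that under stated assumptions: the scores, labels, and cost figures are hypothetical, and `total_cost` and `best_threshold` are illustrative helper names.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Sum the cost of false positives and false negatives at a given threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 0:
            cost += cost_fp  # acted on something that was fine
        elif not flagged and label == 1:
            cost += cost_fn  # missed something that mattered
    return cost


def best_threshold(scores, labels, cost_fp, cost_fn):
    """Try every observed score (plus 'flag nothing') and keep the cheapest."""
    candidates = sorted(set(scores)) + [1.01]
    return min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))


# Hypothetical validation scores and true outcomes (1 = positive case).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 1, 0, 1]

# Fraud-like setting: a missed case is far more expensive than a false alarm.
fraud_threshold = best_threshold(scores, labels, cost_fp=1, cost_fn=50)

# Lead-scoring-like setting: chasing a bad lead costs more than missing a good one.
lead_threshold = best_threshold(scores, labels, cost_fp=10, cost_fn=1)

print("fraud-like threshold:", fraud_threshold)
print("lead-like threshold: ", lead_threshold)
```

The same scores produce a low threshold (0.35) when misses are expensive and a high one (0.9) when false alarms are expensive, which is why the cost of each error type needs to be agreed before training begins.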
What getting this wrong looks like in practice
A team builds a classification model to route support tickets into three categories. They train it on six months of historical tickets and achieve 82 percent accuracy on the evaluation set. They ship it to production.
Three months later, the model is routing 40 percent of tickets incorrectly during a product launch week, because the spike in a new category of tickets was not represented in the training data. The model had never seen the problem pattern before.
The team needed a model that could detect distribution shift, alert on new patterns, and be retrained quickly on fresh examples. None of that infrastructure existed because the question asked before building was “what algorithm should we use” rather than “what happens when the world changes.”
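A basic version of that monitoring can be sketched by comparing the mix of ticket categories in the training data against a recent, human-labelled sample. The example assumes such a labelled sample exists (for instance from tickets that agents re-routed by hand). The category names, counts, and the 0.1 alert threshold are hypothetical, and total variation distance is just one of several possible drift measures.

```python
from collections import Counter


def category_shares(labels):
    """Return each category's share of the sample."""
    if not labels:
        raise ValueError("cannot compute shares of an empty sample")
    counts = Counter(labels)
    return {category: count / len(labels) for category, count in counts.items()}


def total_variation_distance(train_labels, recent_labels):
    """Half the summed absolute difference in category shares (0 = identical mix)."""
    p = category_shares(train_labels)
    q = category_shares(recent_labels)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in set(p) | set(q))


def unseen_categories(train_labels, recent_labels):
    """Categories present in recent data that the model was never trained on."""
    return sorted(set(recent_labels) - set(train_labels))


# Hypothetical samples: training period versus a product launch week.
train = ["billing"] * 50 + ["technical"] * 30 + ["account"] * 20
launch_week = ["billing"] * 30 + ["technical"] * 20 + ["account"] * 10 + ["launch_bug"] * 40

ALERT_THRESHOLD = 0.1  # hypothetical; tune to the team's tolerance for noise

drift = total_variation_distance(train, launch_week)
new_categories = unseen_categories(train, launch_week)
if drift > ALERT_THRESHOLD or new_categories:
    print(f"drift={drift:.2f}, unseen categories={new_categories}: review and retrain")
```

A check like this does not fix the model, but it turns a silent failure into an alert that can trigger relabelling and retraining.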
The questions in the section above are designed to surface these gaps before they become production incidents. They are not a checklist to complete once. They are the starting point for every ML project conversation.
If your team is working through any of these questions for a current or upcoming ML project, feel free to connect with us to scope it correctly from the start.
Your queries, our answers
Prediction produces a probability or score about whether a specific event will or will not happen. Forecasting produces a numeric value at a future point in time for a continuous metric. Predicting whether a customer will churn is a prediction task. Projecting what MRR will be in three months is a forecasting task. The key difference is that forecasting is inherently time-series based and accounts for trends, seasonality, and cycles in historical data.
Use classification when the output needs to be a label from a fixed set of categories rather than a probability. If the question is "is this fraudulent or not" or "which support queue does this belong to" or "what product category is this," classification is the right task type. If the question is "how likely is this to happen" or "what is the risk score for this," prediction is the right task type.
How much data you need depends on the task type and the complexity of the problem. For a binary classification or prediction task on structured data, a minimum of a few thousand labelled examples is usually enough to train a baseline model that produces useful signals. For forecasting, you need enough historical time-series data to capture the patterns you want the model to learn, typically at least one to two years for any use case with seasonal variation. In all cases, more data improves performance up to a point of diminishing returns.
Customer churn prediction is the most widely deployed ML use case in SaaS. It requires data that most SaaS companies already collect (usage events, login frequency, billing history, support interactions) and produces output that directly maps to a retention workflow the customer success or sales team can act on. It is also one of the most accessible use cases from a data and infrastructure perspective, making it a practical starting point for teams new to production ML.
Prediction, classification, and forecasting cannot all be handled by one model. Each task type requires a different model architecture, training objective, and output format. A model trained on one task type cannot be repurposed for another. A churn prediction model produces probabilities. A support triage classification model produces category labels. A revenue forecasting model produces time-series projections. They are different systems that can run in parallel within a product, but they are not interchangeable.
Author
SathishPrabhu
Sathish is an accomplished Project Manager at Mallow, leveraging his exceptional business analysis skills to drive success. With over 8 years of experience in the field, he brings a wealth of expertise to his role, consistently delivering outstanding results. Known for his meticulous attention to detail and strategic thinking, Sathish has successfully spearheaded numerous projects, ensuring timely completion and exceeding client expectations. Outside of work, he cherishes his time with family, often seen embarking on exciting travels together.

