The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model that serves as the base for a data science process.
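It has six sequential phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.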
ML pipeline
Tuning pipelines is hard

HAMLET: (Francia, Giovanelli, and Pisano 2023)
AutoML aims to automate the instantiation of ML pipelines:
Examples of AutoML tools:

HAMLET: Human-centric AutoML via Logic and Argumentation

The LogicalKB enables:

The Problem Graph allows the Data Scientist to:

The Data Scientist iterates on:

// Declare pipeline steps
s1 : ⇒ step(D).
s2 : ⇒ step(N).
s3 : ⇒ step(Cl).
// Declare classification algorithms
a1 : ⇒ algorithm(Cl, Dt).
a2 : ⇒ algorithm(Cl, Knn).
// Forbid Normalization when using DT
c1 : ⇒ forbidden(⟨N⟩, Dt).
// Mandatory Normalization in Classification Pipelines
c2 : ⇒ mandatory(⟨N⟩, Cl).
// Conflict between c1 and c2!

// Declare pipeline steps
s1 : ⇒ step(D).
s2 : ⇒ step(N).
s3 : ⇒ step(Cl).
// Declare classification algorithms
a1 : ⇒ algorithm(Cl, Dt).
a2 : ⇒ algorithm(Cl, Knn).
// Forbid Normalization when using DT
c1 : ⇒ forbidden(⟨N⟩, Dt).
// Mandatory Normalization in Classification Pipelines
c2 : ⇒ mandatory(⟨N⟩, Cl).
// Resolve conflict between c1 and c2
sup(c1, c2).
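To make the effect of the rules above concrete, here is a minimal Python sketch of how the two constraints and the preference sup(c1, c2) could prune the pipeline search space. It is an illustration under stated assumptions, not HAMLET's implementation: HAMLET reasons through structured argumentation, while this sketch uses plain filtering. The names `Constraint`, `conflicts`, `active_constraints`, and `allowed_pipelines` are hypothetical, step ordering is ignored, and D and N are treated as optional pre-processing steps.

```python
from dataclasses import dataclass
from itertools import product

# Assumption: D and N are optional pre-processing steps; Cl is always present.
STEPS = ["D", "N"]
ALGORITHMS = {"Cl": ["Dt", "Knn"]}


@dataclass(frozen=True)
class Constraint:
    """Hypothetical encoding of a forbidden/mandatory rule."""
    name: str
    kind: str              # "forbidden" or "mandatory"
    step: str              # the pre-processing step the rule talks about
    algorithms: frozenset  # algorithms the rule applies to


# c1: forbidden(<N>, Dt); c2: mandatory(<N>, Cl), i.e. every Cl algorithm.
c1 = Constraint("c1", "forbidden", "N", frozenset({"Dt"}))
c2 = Constraint("c2", "mandatory", "N", frozenset(ALGORITHMS["Cl"]))


def conflicts(constraints):
    """List (forbidding, mandating, algorithm) triples that clash on a step."""
    found = []
    for a in constraints:
        for b in constraints:
            if a.kind == "forbidden" and b.kind == "mandatory" and a.step == b.step:
                for alg in sorted(a.algorithms & b.algorithms):
                    found.append((a.name, b.name, alg))
    return found


def active_constraints(constraints, algorithm, preferences):
    """Keep the rules for `algorithm`, dropping those beaten by a preferred rival."""
    applicable = [c for c in constraints if algorithm in c.algorithms]
    defeated = set()
    for a in applicable:
        for b in applicable:
            clash = a.step == b.step and a.kind != b.kind
            if clash and (a.name, b.name) in preferences:
                defeated.add(b.name)
    return [c for c in applicable if c.name not in defeated]


def allowed_pipelines(constraints, preferences):
    """Enumerate (pre-processing steps, algorithm) pairs satisfying active rules."""
    result = []
    for algorithm in ALGORITHMS["Cl"]:
        active = active_constraints(constraints, algorithm, preferences)
        for mask in product([False, True], repeat=len(STEPS)):
            steps = tuple(s for s, used in zip(STEPS, mask) if used)
            ok = all(
                (c.step not in steps) if c.kind == "forbidden" else (c.step in steps)
                for c in active
            )
            if ok:
                result.append((steps, algorithm))
    return result


if __name__ == "__main__":
    rules = [c1, c2]
    print(conflicts(rules))
    print(allowed_pipelines(rules, preferences=set()))
    print(allowed_pipelines(rules, preferences={("c1", "c2")}))
```

With no preference, this sketch leaves no valid pipeline for Dt, because Normalization is both forbidden and mandatory for it; that is a simplification and not necessarily how HAMLET treats an unresolved conflict. Passing the pair ("c1", "c2"), which mirrors sup(c1, c2), lets c1 win for Dt, so Dt pipelines without Normalization become valid while Knn pipelines still require it.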
Settings:
Accuracy vs budget
Settings:
Computational overhead
Comparison with AutoML tools
Accuracy vs AutoML tools
Key features:
Future directions:

Matteo Francia - Machine Learning and Data Mining (Module 2) - A.Y. 2025/26