Available immediately

Hi, I'm

Tahar Guenfoud

Data Analyst & Data Scientist

I combine a decade of IT teaching experience with a Master's in Computer Science and intensive Data Science training to bridge the gap between complex data and clear, impactful decision-making.

Mons, Belgium
22M+ Lines Analyzed
10 Years Technical Expertise
400+ Hours Data Science Training
200+ Students Trained/Year

Featured Projects

Real-world data projects solving business problems

★ Featured

Blackspots — Railway Delay Cartography

Geospatial analysis of 19.7 million train records from Infrabel Open Data (Jan 2025 – Feb 2026). Interactive Folium heatmap identifying delay hotspots across 500 Belgian stations. Monthly trend analysis with Plotly, Streamlit dashboard, and optimized Parquet data pipeline.

Python · Pandas · Folium · Plotly · Streamlit · Geospatial
★ Featured

Railway Delay ML — Predictive Modeling

End-to-end ML pipeline on 19.7M Infrabel Open Data records. Geographic clustering (DBSCAN haversine, 28 basins), anomaly detection (Isolation Forest, 50 stations), temporal decomposition (STL weekly), and multi-model comparison (XGBoost · GradientBoosting · RandomForest · Stacking). 5-Fold CV with sklearn Pipeline — no data leakage. AUC=0.813 · MAE=0.275 min (cross-validated).

Python · XGBoost · Scikit-learn · DBSCAN · Isolation Forest · STL · Cross-Validation
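The haversine metric behind the geographic clustering above measures great-circle distance between station coordinates. The following is a minimal sketch only, not the project's code: the function name, the mean Earth radius of 6371 km, and the approximate sample coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate, illustrative coordinates (not taken from the dataset)
brussels = (50.845, 4.357)
mons = (50.454, 3.942)
distance_km = haversine_km(*brussels, *mons)
```

When scikit-learn's DBSCAN is run with `metric="haversine"`, it expects coordinates in radians, so a neighborhood radius in kilometres would typically be divided by the Earth radius before being passed as `eps`.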
★ Featured

Infrabel Dashboard: Network Quality KPIs

Interactive dashboard of punctuality and reliability for the Belgian rail network. 5 Infrabel Open Data datasets (punctuality by station, causes of delays, cancelled trains, Performance Contract). Complete ETL/EDA notebook + Streamlit + Power BI.

Python · Pandas · Plotly · Streamlit · Power BI · Open Data
★ Featured

SNCB Live Dashboard

Real-time dashboard monitoring Belgian railway punctuality across 20 major stations. Live data from iRail API (parallel fetch via ThreadPoolExecutor), GPS coordinates extracted directly from API response. KPI gauges, Folium interactive map, Plotly heatmap, delay distribution, auto-refresh every 60s. Jupyter notebook with 12 analysis sections.

Python · Streamlit · Plotly · Folium · iRail API · Real-time · Pandas
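The parallel fetch mentioned above can be sketched with `ThreadPoolExecutor` from the standard library. This is an illustrative outline under stated assumptions, not the dashboard's actual code: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real HTTP call to the iRail API so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(stations, fetch_one, max_workers=8):
    """Call fetch_one(station) for every station using a thread pool.

    Returns a dict mapping each station to its result, or to the
    exception raised while fetching it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every request first so they run concurrently
        futures = {station: pool.submit(fetch_one, station) for station in stations}
        for station, future in futures.items():
            try:
                results[station] = future.result()
            except Exception as exc:
                # One failing station should not abort the whole refresh
                results[station] = exc
    return results


def fake_fetch(station):
    """Hypothetical stand-in for an HTTP request to the iRail API."""
    if station == "Unknown":
        raise ValueError("no such station")
    return {"station": station, "delay_min": len(station) % 5}


snapshot = fetch_all(["Mons", "Namur", "Unknown"], fake_fetch)
```

Threads suit this kind of workload because each request spends most of its time waiting on the network; storing exceptions per station lets the dashboard show partial data when one call fails.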
★ Featured

Bank Customer Churn Prediction

End-to-end ML pipeline to predict customer churn in banking. Identified 3 key factors impacting 40% of at-risk customers using behavioral analysis of 10,000+ users. Built predictive models (Random Forest, Logistic Regression) with advanced feature engineering.

Python · Pandas · Scikit-learn · Streamlit · Feature Engineering
★ Featured

Maven Music — Churn Analysis

Customer churn analysis for a music streaming platform, aimed at reducing customer attrition through behavioral analysis of 10,000+ users. Deployed an interactive Streamlit dashboard accessible online.

Python · Scikit-learn · Matplotlib · Seaborn · Streamlit

Bank Customer Data Preparation

Comprehensive data preparation and cleaning pipeline for bank customer datasets. ETL processes, data quality assessment, and feature preparation for downstream ML models.

Python · Pandas · NumPy · Data Cleaning · ETL

Le Wagon — Board Game Popularity Analysis

Capstone project for Le Wagon Data Science & AI bootcamp (400h+). Predictive modeling of board game popularity from 20,000+ BGG records enriched via REST API. Feature engineering, cross-validation, and Streamlit dashboard deployment.

Python · Scikit-learn · NumPy · Seaborn · REST API · Streamlit

Technical Skills

💻 Languages & Data

Python · SQL · Pandas · NumPy · Git · Docker

🤖 Machine Learning

Scikit-learn · Classification · Regression · Clustering · NLP · Feature Engineering

📊 Visualization & BI

Power BI · Tableau · Streamlit · Plotly · Seaborn · Matplotlib

☁️ Cloud & Infrastructure

Azure · ETL · CI/CD · Docker · REST APIs · MLOps

Experience & Education

2025

Data Science & AI Bootcamp — 400h+

Le Wagon

Machine Learning, Deep Learning, Data Engineering, MLOps. Final team project with end-to-end deployment.

2025

Master in Computer Science

UMONS — University of Mons

Advanced studies in algorithms, databases, and software engineering.

2016 — Present

Computer Science Teacher

Fédération Wallonie-Bruxelles Enseignement · Mons

Teaching Python, SQL, and databases to 100+ students/year. Systematic analysis of performance data to adapt pedagogical strategies. Developed automated correction and reporting tools.

2014 — 2015

Independent IT Consultant

E-zzy · Belgium

Needs analysis, network architecture, and IT infrastructure deployment for SMEs. User training and technical support.

About Me

When I'm not crunching data, you'll find me training for my next triathlon 🏊🚴🏃. I believe that the discipline and analytical thinking required in sports translate directly to my approach to data science.

Languages

French: Bilingual
English: Fluent
Arabic: Native
Russian: Intermediate
Dutch: Basic

Let's Connect

Open to data analyst & data scientist opportunities in Belgium