Python pipeline developed during my internship at CEREMA to clean, resample, and analyse
rainfall/flow/storage time series — estimating dry-weather baseflow and separating
rainfall-induced flow from raw sensor data.
Python 100% · pandas · numpy · matplotlib · Baseflow Separation · Rainfall–Runoff · CEREMA Data · 30-min time step
Raw sensor exports from CEREMA field sites contain irregular timestamps, missing-value codes
(−6999 / −7999), and mixed rainfall, flowrate, and storage-tank data in a
single semicolon-separated file. This pipeline automates the full path from raw export to
clean diagnostic figures, ready for model calibration.
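As an illustration of the first cleaning step, the sketch below reads a semicolon-separated export and converts the two missing-value codes to NaN. The column names (`timestamp`, `rain_mm`, `flow_l_s`, `volume_m3`), the date format, and the helper name `load_export` are assumptions made for this example; the real CEREMA exports may be laid out differently.

```python
import io

import numpy as np
import pandas as pd

MISSING_CODES = [-6999, -7999]  # logger codes named in the text above

# Hypothetical miniature export; real column names and formats may differ.
RAW = """timestamp;rain_mm;flow_l_s;volume_m3
2023-05-01 00:17:00;0.2;-6999;10.4
2023-05-01 00:02:00;0.0;1.2;10.0
2023-05-01 00:41:00;-7999;1.5;-6999
"""


def load_export(source):
    """Read a semicolon-separated export and turn missing-value codes into NaN."""
    df = pd.read_csv(source, sep=";", parse_dates=["timestamp"])
    df = df.set_index("timestamp").sort_index()
    # Drop repeated timestamps so later resampling sees each sample once.
    df = df[~df.index.duplicated(keep="first")]
    return df.replace(MISSING_CODES, np.nan)


clean = load_export(io.StringIO(RAW))
print(clean)
```

Replacing the codes before any aggregation matters: a single −6999 averaged into a 30-minute bin would otherwise dominate the result.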
02 Resample: Regularise irregular timestamps to a chosen time step (default: 30 min) using pandas resampling with gap-aware aggregation.
03 Rain / Dry Period Marking: Mark rain vs dry periods from rain-event tables to separate wet- and dry-weather signals in the flow record.
04 Baseflow Estimation: Detect storage-tank filling segments from dV/dt dynamics; fit baseflow from dry-weather periods; build a continuous baseflow series.
05 Rainfall-Induced Flow: Compute rainfall-induced flow = total flow − baseflow. Export intermediate and final CSVs alongside diagnostic QA figures.
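The resampling and rain-marking steps can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names, the column names, the choice to sum rainfall depths and average the other signals, and the `start`/`end` layout of the rain-event table are all assumptions. "Gap-aware" is interpreted here as "a bin with no valid sample stays NaN".

```python
import numpy as np
import pandas as pd


def resample_gap_aware(df, step="30min", sum_cols=("rain_mm",)):
    """Aggregate to a regular step; a bin with no valid sample stays NaN."""
    grouped = df.resample(step)
    out = {}
    for col in df.columns:
        if col in sum_cols:
            # min_count=1 keeps an empty bin as NaN instead of a misleading 0.0
            out[col] = grouped[col].sum(min_count=1)
        else:
            out[col] = grouped[col].mean()
    return pd.DataFrame(out)


def mark_rain(index, events):
    """True for every time step that falls inside a [start, end] rain event."""
    is_rain = pd.Series(False, index=index)
    for start, end in zip(events["start"], events["end"]):
        is_rain.loc[(index >= start) & (index <= end)] = True
    return is_rain


# Hypothetical irregular record with a gap between 00:17 and 01:41.
raw = pd.DataFrame(
    {"rain_mm": [0.2, 0.4, np.nan], "flow_l_s": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(
        ["2023-05-01 00:02", "2023-05-01 00:17", "2023-05-01 01:41"]
    ),
)
regular = resample_gap_aware(raw)

# Hypothetical rain-event table with one event.
events = pd.DataFrame(
    {
        "start": pd.to_datetime(["2023-05-01 00:00"]),
        "end": pd.to_datetime(["2023-05-01 00:30"]),
    }
)
is_rain = mark_rain(regular.index, events)
print(regular.assign(is_rain=is_rain))
```

Leaving empty bins as NaN, rather than interpolating or zero-filling them, keeps sensor outages visible in the later QA figures.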
Outputs
Example figures
These figures are generated from the 30-minute resampled time series. Raw input data are
not included in the repository (field site data, confidential).
Figure 1. QA plot — observed total flow (blue), estimated dry-weather baseflow (orange),
and computed rainfall-induced flow (green) at 30-minute resolution. The separation quality can be
visually inspected against the rainfall record.
Figure 2. Rainfall time series (30-min). Two pluviometers shown.
Data cleaned and resampled from raw CEREMA export.
Figure 3. Flowrate and storage tank volume (30-min).
Storage dynamics (dV/dt) are used to detect dry-weather filling segments for baseflow fitting.
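A rough sketch of how dV/dt could be used to find dry-weather filling segments and derive a baseflow estimate is given below. It rests on assumptions that the text does not confirm: that the tank outlet is closed while the tank fills (so the filling rate approximates the inflow), that a single median rate is an acceptable stand-in for the fitted baseflow, and that volumes are in m³ and flows in L/s. The function name `filling_segments` and the sample values are hypothetical.

```python
import pandas as pd


def filling_segments(volume, is_rain, min_rate=0.0):
    """Return dV/dt (m3/s) and a segment id for each dry-weather filling run."""
    dt_s = volume.index.to_series().diff().dt.total_seconds()
    dvdt = volume.diff() / dt_s
    filling = (dvdt > min_rate) & ~is_rain
    # A new id starts every time the filling flag switches on or off.
    seg_id = (filling != filling.shift(fill_value=False)).cumsum()
    return dvdt, seg_id.where(filling)  # NaN outside filling runs


idx = pd.date_range("2023-05-01", periods=6, freq="30min")
volume = pd.Series([10.0, 10.9, 11.8, 5.0, 5.9, 6.8], index=idx)  # m3
is_rain = pd.Series(False, index=idx)  # dry weather throughout
total_flow = pd.Series([0.6, 0.5, 0.5, 0.5, 0.5, 0.5], index=idx)  # L/s

dvdt, segments = filling_segments(volume, is_rain)

# Mean filling rate per segment, converted from m3/s to L/s.
segment_rates = dvdt.groupby(segments).mean() * 1000.0

# Assumption: a constant median rate stands in for the fitted baseflow.
baseflow = pd.Series(segment_rates.median(), index=idx)

# Rainfall-induced flow = total flow - baseflow, as stated in step 05.
rain_flow = total_flow - baseflow
print(segment_rates)
print(rain_flow)
```

In this toy record the tank empties once (the drop from 11.8 to 5.0 m³), which splits the series into two filling segments with the same rate.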
Technical Details
Stack & repository structure
Python 3 · pandas · numpy · matplotlib · Time Series Analysis · Baseflow Separation · Rainfall–Runoff · CEREMA Île-de-France · 30-min resampling
internship-hydrology-workflow/
├── scripts/
│ └── run_pipeline.py ← entry point (410 lines): load → clean → resample → export
├── src/ ← package modules (expand as workflow grows)
├── example_figures/
│ ├── rainfall_30min.png
│ ├── flow_and_storage_30min.png
│ └── qa_baseflow_rainflow_30min.png
├── Inputs/ ← place raw data here (git-ignored)
├── Outputs/ ← results written here (git-ignored)
├── requirements.txt
└── README.md
The pipeline is designed to be modular: each processing stage is a separate
function, which makes it easy to extend. The entry script run_pipeline.py currently implements
steps 1–2 (load, clean, resample, export). Steps 3–5 (rain marking, baseflow estimation, flow separation)
are implemented as the next modules in src/.
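One way to picture the "one function per stage" design is a list of callables applied in order, so that a new stage is added by appending a function. The stage names and the `run` helper below are hypothetical and are not taken from run_pipeline.py.

```python
import numpy as np
import pandas as pd


def clean(df):
    """Replace the logger's missing-value codes with NaN."""
    return df.replace([-6999, -7999], np.nan)


def resample(df, step="30min"):
    """Average onto a regular time step (NaN samples are ignored)."""
    return df.resample(step).mean()


STAGES = [clean, resample]  # later stages would be appended here


def run(df, stages=STAGES):
    """Pass the frame through each stage in turn."""
    for stage in stages:
        df = stage(df)
    return df


raw = pd.DataFrame(
    {"flow_l_s": [1.0, -6999, 3.0]},
    index=pd.to_datetime(
        ["2023-05-01 00:05", "2023-05-01 00:20", "2023-05-01 00:40"]
    ),
)
result = run(raw)
print(result)
```

Because every stage takes and returns a DataFrame, each one can be tested on its own with a few hand-written rows.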