Calibration Prozess

1 Overview

In this article, we will learn how to manage the entire hydrological modeling process (mehr Concept Details in Concept of Modelling) with the minimal model Linear-Reservoir model, including automatic calibration.

2 Library

All the functions used in this tutorial are bundled in the HydroCourse package. You will need to install this package from GitHub before loading it.

Once installed, you’ll also need to load the following packages into your R workspace: HydroCourse, hydroGOF, tidyverse, and plotly.

# If don't installed:
# If remotes not installed, use: install.packages("remotes")
# remotes::install_github("HydroSimul/HydroCourse")
library(HydroCourse)
# Evalute functions
library(hydroGOF)

# Data Manipulation and Plot
library(tidyverse)
theme_set(theme_bw())
# Interactive Plot
library(plotly)

3 Run the Model

The first step is to become familiar with the hydrological model. You need to understand what input data it requires, which typically includes boundary condition (forcing) data, initial conditions, and one or more parameters. Additionally, you should know what kind of output the model will generate.

You can access detailed information about the model using the help(model_linearReservoir) or ?model_linearReservoir commands in R Console. This will provide you with a description of the model’s functionality and usage. Alternatively, you can also find information about the model online in the model_linearReservoir documentation.

Once you have reviewed the model’s documentation, the next step is to test the model with synthetic or test data to get a hands-on understanding of how it works.

To understand the flow of data from concept to simulation, we can create a visual representation. We’ll use blue squares to represent the data and blue circles to represent the functions:

  • 1.1 -> Q_In = 1:10
  • 1.2 -> Q_Out0 = 2
  • 1.3 -> param_K = 5
  • f1 -> model_linearReservoir()
# Try with synthetic data
model_linearReservoir(Q_In = 1:10, Q_Out0 = 2, param_K = 5)
 [1] 2.000000 1.909091 2.016529 2.286251 2.688751 3.199887 3.799908 4.472652
 [9] 5.204897 5.985825

4 Evaluate

Before proceeding with the calibration, it’s essential to evaluate the simulation results. The question arises: are some results better evaluated as good while others as bad?

4.1 Load Experimental Data

In this phase, we will utilize experimental data from a labor experiment. This dataset involves the physical simulation of a linear reservoir and provides the measured inflow and outflow data in liters per second (L/h).

The labor data is available on GitHub.

# Load Labor Data
df_Labor <- read_delim("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/tbl_LaborMess_LinearReservior.txt", delim = "\t")

# Rename the data, in order to more flexible Manipulation
names(df_Labor) <- c("t", "QZ", "QA")

# The first 10 Line
head(df_Labor)
# A tibble: 6 × 3
      t    QZ    QA
  <dbl> <dbl> <dbl>
1     0     0    30
2     1     0    30
3     2     0    29
4     3     0    29
5     4     0    29
6     5     0    29

4.2 Mapping from Concept to Proceeding

To provide clarity, let’s map the conceptual understanding to the simulation before proceeding:

  • 1.1 -> Q_In = df_Labor$QZ
  • 1.2 -> Q_Out0 = 0
  • 1.3 -> param_K = 50
  • f1 -> model_linearReservoir()
  • 2 -> df_Labor$QZ
  • f2 -> NSE(), KGE()

4.3 Running the Model with Forcing Data

As a preliminary test, we can suggest certain parameter values, such as \(K\) at 90 and 60. After the simulation, we will store the results in variables num_Q_Sim and num_Q_Sim2 for further analysis.

# run the model

num_Q_Sim <- model_linearReservoir(df_Labor$QZ, 0, param_K =  90)

num_Q_Sim2 <- model_linearReservoir(df_Labor$QZ, 0, param_K =  60)

4.4 Visual Evaluation

Before employing quantitative criteria, it’s beneficial to visually evaluate the simulation results using time series plots, which provide an initial sense of the model’s performance.

# Visual Evaluation
ggLabor <- ggplot(df_Labor) +
  geom_line(aes(t, QZ, color = "Inflow")) +
  geom_line(aes(t, num_Q_Sim, color = "Simul1")) +
  geom_line(aes(t, num_Q_Sim2, color = "Simul2")) +
  geom_line(aes(t, QA, color = "Observ")) +
  scale_color_manual(values = c(Inflow = "cyan", Simul1 = "red", Simul2 = "orange", Observ = "blue"))+
  labs(x = "Time [s]", y = "In-/Outflow [L/h]", color = "Flow")
ggplotly(ggLabor)

4.5 Quantitative Evaluation

For short or simplified time series, visual evaluation may suffice. However, when dealing with long-term data, we require standardized criteria to objectively assess the model’s performance.

NSE and KGE are the most commonly used criteria in hydrological research. However, there are also additional criteria available in the hydroGOF package (use ?hydroGOF):

Quantitative statistics included are: Mean Error (me), Mean Absolute Error (mae), Root Mean Square Error (rms), Normalized Root Mean Square Error (nrms), Pearson product-moment correlation coefficient (r), Spearman Correlation coefficient (r.Spearman), Coefficient of Determination (R2), Ratio of Standard Deviations (rSD), Nash-Sutcliffe efficiency (NSE), Modified Nash-Sutcliffe efficiency (mNSE), Relative Nash-Sutcliffe efficiency (rNSE), Index of Agreement (d), Modified Index of Agreement (md), Relative Index of Agreement (rd), Coefficient of Persistence (cp), Percent Bias (pbias), Kling-Gupta efficiency (KGE), the coef. of determination multiplied by the slope of the linear regression between ‘sim’ and ‘obs’ (bR2), and volumetric efficiency (VE).

NSE(num_Q_Sim, df_Labor$QA)
[1] 0.9234607
NSE(num_Q_Sim2, df_Labor$QA)
[1] 0.8752109
KGE(num_Q_Sim, df_Labor$QA)
[1] 0.8425622
KGE(num_Q_Sim2, df_Labor$QA)
[1] 0.9273187

5 Calibrate

As always, the first step is to map from the concept to the procedure:

  • 1.1 -> Q_In = df_Labor$QZ
  • 1.2 -> Q_Out0 = 0
  • f1 -> model_linearReservoir()
  • 2 -> df_Labor$QZ
  • f2 -> NSE(), KGE()
  • 3 -> x_Min = 40, x_Max = 90
  • f3.1 -> cali_UVS()
  • f3.2 -> eva_fit()

5.1 Create the Fit Function

Before proceeding with automatic calibration, an important step is to create a function that the calibration algorithm will use. This function should take the parameter to be calibrated as an input. Thus, we need to modify our model and evaluation function into a “Fit Function.”

eva_fit <- function(model_Param,
                     model_Input,
                     Q_Observ,
                     fct_gof = NSE) {
  Q_Simu <- model_linearReservoir(model_Input, param_K = model_Param)
  
  - fct_gof(Q_Simu, Q_Observ)
  
}

eva_fit(60, df_Labor$QZ, df_Labor$QA)
[1] -0.8752109

There is another critical point to consider when creating the Fit Function. Calibration algorithms need to know which criteria is better. Most calibration algorithms compare the current criteria value with the previous one (or several previous ones) and consider the minimum (or maximum) criteria value as the best. However, in the case of NSE and KGE, a better simulation results in a higher value. To handle this, we should set these criteria as negative values. By doing so, calibration algorithms like cali_UVS() can work effectively with them.

5.2 Calibrating

With the fit function in place, we can choose a calibration algorithm to optimize our model parameter (in this case, parameter \(K\)).

lst_Cali <- cali_UVS(eva_fit, x_Min = 40, x_Max = 90, model_Input = df_Labor$QZ, Q_Observ = df_Labor$QA, fct_gof = KGE)

  |                                                                            
  |                                                                      |   0%

6 Validate

After calibration, it’s crucial to validate the calibrated parameter to ensure that it performs well in other phases. In this course, we will demonstrate this through visual evaluation.

num_Q_Sim_Cali <- model_linearReservoir(df_Labor$QZ, param_K =  lst_Cali$x_Best)

ggVali <- ggplot(df_Labor) +
  geom_line(aes(t, QZ), color = "cyan") +
  geom_line(aes(t, num_Q_Sim_Cali), color = "tomato") +
  geom_line(aes(t, QA), color = "blue") +
  labs(x = "Time [s]", y = "In-/Outflow [L/h]")
ggplotly(ggVali)