Energy-efficient and quality-focused control of conveyor belt dryers in petrochemical production (2025)

Introduction

The petrochemical industry is not only crucial for national production and daily life but is also characterized by high pollution and carbon emissions. Within petrochemical plants, the drying process stands out as a major consumer of energy, constituting 10–20% of the total energy consumption1. It goes beyond being a mere production stage; instead, it plays a vital role in determining both product quality and energy utilization. Numerous dryers have been discussed in the literature based on specific applications and products to be dried, including spray dryers, rotary dryers, and belt dryers2. While each type has its advantages and disadvantages, a conveyor belt dryer emerges as a versatile and widely adopted solution, particularly in petrochemical drying, where the product is typically in granular form, requiring continuous and uniform processing3. Effectively addressing the challenge of implementing reliable process control in petrochemical drying is crucial for minimizing energy losses and ensuring product quality.

In conveyor belt dryers, attaining the desired product quality with optimal energy consumption depends on maintaining a delicate balance among various contributing factors such as temperature, load, and speed, among others. Managing these parameters is a challenging task, compounded by uncontrollable climatic conditions that can substantially influence the impact of these controllable factors. Moreover, while the energy consumption profile can be estimated analytically4, predicting product quality is not as straightforward due to the intricate non-linear relationships among these contributing factors.

Common control methods in petrochemical drying still rely on manual empirical operation, which is subjective and demands specialized operators. Alternative automatic control approaches include feedback control, computer simulation control, and model predictive control (MPC). Feedback control, exemplified by proportional, integral, and derivative (PID) tuning, adjusts the drying process based on real-time product moisture detection to minimize deviations5. However, it may not align well with independent events of the inlet product, drying conditions, and outlet product. Computer simulation control utilizes AI algorithms to model and solve non-linear processes6, yet obtaining effective coefficients is challenging due to the randomness of in-field drying. Boltzmann coefficient generators7 have been developed to generate real-time coefficients based on field measurements, but this significantly increases control costs and poses challenges in accurately measuring data online, especially in the presence of large-scale dynamic disturbances. MPC establishes an analytical model of the drying system, incorporates disturbance information through online optimization, and calculates optimal future inputs based on the current state8,9. It addresses time delay effects and is suitable for controlling drying with high inertia, non-linearity, and multiple disturbances during the dehydration process. However, these approaches fall short of achieving complex control objectives, particularly when influenced by factors such as material properties, climatic conditions, and the drying process itself.

Machine learning (ML), particularly reinforcement learning (RL) and deep reinforcement learning (DRL), has emerged as a transformative approach for addressing complex control problems10,11. Their capacity to learn from data, adapt to changing conditions, and optimize outcomes makes them well-suited for the intricate challenges posed by conveyor belt dryers. However, employing these directly for control purposes is not straightforward. The difficulty lies in balancing product quality against the energy consumption profile, which demands algorithmic modifications tailored to the specific control requirements. Simply applying these algorithms without customization might overlook the balance necessary for optimizing both quality and energy efficiency. Therefore, this study integrates a graph convolution network (GCN) model with the multiagent deep deterministic policy gradient (MADDPG) algorithm. The GCN model is first trained with data derived from industrial drying lines and then validated with real data from a petrochemical dryer. Once validated, the model becomes an integral part of the overall control framework, and the MADDPG algorithm is employed to control the drying process. The trained GCN model is incorporated into the MADDPG algorithm’s reward function to estimate the product quality and energy consumption after each time step. The primary contributions of this paper are:

  • Establish a comprehensive model that utilizes a GCN framework, drawing upon real dryer data. The model integrates a detailed description of a conveyor belt dryer, encompassing its dynamic modeling, to enable accurate predictions of product quality in response to fluctuating input conditions.

  • Formulate the drying process as a multiagent control problem, enabling real-time adjustment of process parameters in response to changes in climate conditions. This approach ensures energy-efficient production while adhering to quality standards.

  • Utilize the MADDPG algorithm to enhance energy efficiency in the conveyor belt dryer, treating fans, chambers, and belts as distinct agents, to improve the efficiency of the control.

The rest of the paper is organized as follows: “Results” presents the results, “Discussion” offers a comprehensive discussion, and “Methods” outlines the methods for evaluating quality, energy consumption, and problem formulation.

Results

In this section, we conduct simulation experiments to validate the effectiveness of the GCN modeling and control framework. The experiments compare the production output (dried product) obtained under the control policy learned by the proposed method with the output obtained without control. The algorithm is evaluated based on the production quality (samples within the desired product weight range \(X\)) and the associated energy consumption of the chambers and conveyor belts. We aim to demonstrate three key findings: (1) the GCN model accurately predicts the sample product weight, (2) the proposed MADDPG-based control framework results in a higher yield, and (3) it reduces the total energy consumption.

Experiment parameter setting

The simulation experiments are conducted using the Python platform. We create and analyze 50 distinct conveyor belt dryers, which encompass various types such as food drying, grain drying, and chemical drying, among others. Through these experiments, we can verify the effectiveness and resilience of the proposed method across the 50 diverse drying systems. Each dryer is constructed by randomly selecting from the following sets:

$${No}.\,{of\; fans},f\in \left\{2,3,4,5,6\right\}$$

$${No}.\,{of\; conveyor\; belts},b\in \left\{1,2,3,4,5,6\right\}$$

$${No}.\,{of\; chambers},C\in \left\{3,4,\ldots ,18\right\}$$

$${Fan\; speed},{v}_{{f}_{i}}\in \left[700,1400\right],i=1,2,\ldots ,n$$

$${Conveyor\; belt\; speed},{v}_{{b}_{i}}\in \left[10,20\right],i=1,2,\ldots ,n$$

$${Chamber\; temperature},{t}_{{C}_{i}}\in \left[60,110\right],i=1,2,\ldots ,n$$

$${Ambient\; temperature},{t}_{{amb}}\in \left[15,50\right]$$

$${Average\; humidity},h\in \left[15,80\right]$$
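The random construction of the test dryers can be sketched as follows. This is a minimal illustration rather than the simulation code used in the study: the function name `sample_dryer`, the dictionary keys, the fixed seed, and the choice of uniform sampling are all assumptions. Only the sets and intervals are taken from the definitions above.

```python
import random


def sample_dryer(rng):
    """Draw one hypothetical dryer configuration from the stated sets."""
    n_fans = rng.randint(2, 6)        # f in {2, ..., 6} (randint is inclusive)
    n_belts = rng.randint(1, 6)       # b in {1, ..., 6}
    n_chambers = rng.randint(3, 18)   # C in {3, ..., 18}
    return {
        # Uniform sampling inside each interval is an assumption.
        "fan_speeds": [rng.uniform(700, 1400) for _ in range(n_fans)],
        "belt_speeds": [rng.uniform(10, 20) for _ in range(n_belts)],
        "chamber_temps": [rng.uniform(60, 110) for _ in range(n_chambers)],
        "ambient_temp": rng.uniform(15, 50),
        "humidity": rng.uniform(15, 80),
    }


rng = random.Random(0)  # fixed seed so the set of dryers is reproducible
dryers = [sample_dryer(rng) for _ in range(50)]
```

Each entry of `dryers` then describes one of the 50 systems on which a control policy could be trained and evaluated.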

To illustrate the application of the proposed method, a comprehensive numerical case study involving a conveyor belt dryer is provided. This dryer consists of 9 chambers, 3 fans, and 3 conveyor belts, similar to the configuration shown in Fig. 5. The specific parameters for this setup are presented in Tables 1–4.


Industrial data

In this research, we employ data derived from a petrochemical plant12 engaged in the processing of molding compound powder (MCP). It’s important to note that the data exclusively pertains to the drying process associated with MCP. Currently, within the industry, a conventional approach is utilized to optimize the drying process. This approach relies on operators’ experience, wherein dryer parameters are initially set and subsequently adhered to. While this methodology generally aligns with quality standards, it falls short when it comes to guaranteeing product quality under sudden fluctuations in uncontrollable parameters, such as ambient temperature and humidity. Furthermore, there hasn’t been any prior exploration into issues related to energy consumption or production optimization.

For analysis and control framework training, we have gathered a dataset encompassing 1000 hours of operational data from the aforementioned dryer system. Each data sample is collected following a time interval of 2 hours. This dataset encompasses information regarding the speeds of three conveyor belts, three fans, and the temperatures of nine drying chambers. The conveyor belts maintain consistent speeds of 40, 40, and 50 RPM, while the temperatures in the drying chambers are carefully controlled within the range of 80–102 °C. Notably, the temperature tends to be higher in the initial chambers and gradually decreases toward the end of the dryer system. Similarly, the fans can operate within a speed range of 800–1200 RPM. These variables are adjustable, falling under the category of controllable parameters, and in typical weather conditions, they ensure the desired product quality. However, it’s important to acknowledge that sudden changes in climatic conditions, i.e., ambient temperature and humidity, can lead to variations in product quality. While it may be feasible to manage them under strict conditions, in the situation at hand, these variables remain uncontrolled and can fluctuate due to changing weather conditions. Therefore, addressing these variations through adjustments in controllable parameters, such as conveyor belt speeds, chamber temperatures, and fan speeds, becomes imperative.

Based on the available data for the dryer system, there are 15 controllable parameters, as outlined in the adjacency matrix in Table 7, along with two uncontrollable parameters: humidity and ambient temperature. For all 17 parameters, the output, which is the product weight, is observed. At each time point, the product weights are recorded based on the specific settings of these parameters. While humidity and ambient temperature cannot be controlled, the 15 controllable parameters are adjusted to keep the product within the desired weight range of 120–130 grams. Any deviation outside this range results in a product that does not meet quality standards and is classified as scrap. The primary control objective is to maintain product quality by adhering to the specified weight standards while minimizing energy consumption.

GCN-MADDPG training

Utilizing the specified system parameters, the proposed method involves two key training phases: GCN training and MADDPG training. First, the GCN model is trained offline using raw labeled data from industrial dryers to estimate product quality, specifically the sampled product weight.

To validate the predictive performance of the GCN model, a comparative study is conducted. As shown in Fig. 1, the training dataset comprises 200 production samples, each taken over a two-hour period. During this time, the dryer operates under a specific set of input parameters, and at the end of the two hours, a product sample is tested to measure its weight. The predicted product weight is then compared to the actual sample weight.

a GCN prediction model, b Linear regression model, c Support vector regression model, d Random forest regression model.


To evaluate the performance of the proposed GCN regression model, it is compared with several other regression models: linear regression (LR), support vector regression (SVR), and random forest regression (RFR). The GCN regression model achieved the best results, with an \({R}^{2}\) score of 0.91 and a mean squared error (MSE) of 2.87. This indicates that the GCN model explains a large proportion of the variance in the data and provides highly accurate predictions, suggesting its effectiveness in capturing complex relationships within the data. In contrast, LR performs poorly with an \({R}^{2}\) score of 0.21 and an MSE of 25.87, which can be attributed to its limited ability to handle the non-linear dependencies in the data. SVR exhibits a moderate performance with an \({R}^{2}\) score of 0.76 and an MSE of 7.98, indicating that while it handles non-linearity better than LR, it still falls short compared to the GCN model. RFR also performs well with an \({R}^{2}\) score of 0.84 and an MSE of 5.11, demonstrating its capability to capture non-linear relationships but not reaching the accuracy level of the GCN model. Overall, the GCN model outperforms the others, highlighting its superior ability to model the data accurately.
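For reference, the two metrics used in this comparison can be computed as in the sketch below. It uses the standard definitions of MSE and the coefficient of determination. The sample weights are made-up illustrative values, not data from the study, and the function names are chosen here for readability.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted weights."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sample weights in grams (illustrative only).
actual = [124.0, 127.5, 121.0, 129.0, 125.5]
predicted = [123.0, 128.0, 122.5, 128.0, 125.0]

print(f"MSE = {mse(actual, predicted):.3f}")
print(f"R^2 = {r2_score(actual, predicted):.3f}")
```

An \({R}^{2}\) close to 1 means the residual sum of squares is small relative to the spread of the measured weights, which is why it is reported alongside the MSE.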

Once the GCN model is trained, it is incorporated into the reward function of the MADDPG algorithm. This integration allows the GCN model to guide the MADDPG algorithm in optimizing the control of the dryer system to achieve the desired product quality.

The MADDPG algorithm is then trained in an online manner to continuously adjust control inputs in the dryer, considering the estimated product quality. Control decisions are activated at each time step \(t\) throughout the training process. The neural network architecture includes two fully connected hidden layers, each containing 64 hidden units. The reward function conforms to Equation (19), and the necessary training parameters are detailed in Table 4.
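As an illustration of the stated architecture, the sketch below implements the forward pass of a deterministic actor with two fully connected hidden layers of 64 units each. Several details are assumptions, since the text does not specify them: the ReLU and tanh activations, the weight initialization, the observation layout, and the class name `Actor`. The 800–1200 RPM action range is the fan range reported for the industrial data. Training (critic, replay buffer, target networks) is omitted.

```python
import numpy as np


class Actor:
    """Deterministic policy mapping an observation to a bounded action."""

    def __init__(self, obs_dim, act_dim, low, high, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        sizes = [obs_dim, hidden, hidden, act_dim]  # two hidden layers of 64 units
        self.weights = [rng.normal(0.0, 0.1, size=(m, n))
                        for m, n in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(n) for n in sizes[1:]]
        self.low, self.high = low, high

    def act(self, obs):
        h = np.asarray(obs, dtype=float)
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            h = np.maximum(0.0, h @ W + b)            # ReLU (assumed activation)
        squashed = np.tanh(h @ self.weights[-1] + self.biases[-1])
        # Rescale from [-1, 1] to the admissible action interval.
        return self.low + 0.5 * (squashed + 1.0) * (self.high - self.low)


# Hypothetical fan agent observing [own speed, ambient temperature, humidity].
fan_actor = Actor(obs_dim=3, act_dim=1, low=800.0, high=1200.0)
action = fan_actor.act([1000.0, 30.0, 45.0])
```

In practice the observations would normally be normalized before being fed to the network. The raw values are used here only to keep the example short.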

Upon completing the training process, a machine control policy is derived based on the updated network parameter \(\theta\). Figure 2 shows the convergence curve for the proposed GCN-MADDPG algorithm. It is observed that achieving a stable policy takes around 25 episodes, where each episode corresponds to 24 hours of drying time.

Training data, reward versus time.


Yield comparison

Because our focus extends beyond energy consumption to system yield, Fig. 3 compares the actual product quality output with that achieved under the trained policy. The comparison examines sample weights obtained from the actual data and from the simulation model trained through the proposed method. As previously discussed, the dryer setting is disrupted when uncontrollable parameters change unexpectedly. In such instances, the product quality, quantified as the sample weight recorded after each time step \(t\), may deviate from the desired range \(X\), resulting in subpar outcomes or scrap. The proposed algorithm, however, adapts by adjusting the controllable parameters, i.e., chamber temperatures, fan speeds, and conveyor belt speeds, to counteract these unexpected variations. This adjustment mechanism is designed to maintain product quality within the desired range, ultimately enhancing the system yield.

Quality comparison between actual and controlled parameters.


Table 5 displays the outcomes over 500 time steps, representing 500 samples. Under actual operational conditions, 25 samples were rejected because they deviated from the desired quality range \(X\). Conversely, under the trained policy, no samples were rejected. This result emphasizes the system’s ability to adjust the controllable parameters dynamically, keeping product quality within the desired range and preventing scrap production.
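Expressed as yield, the reported counts correspond to 95% under actual operation and 100% under the trained policy. The short sketch below shows this arithmetic together with a helper that applies the acceptance range to a list of weights. The helper name `yield_rate` is an assumption, and the 120–130 g bounds are the desired weight range stated for the industrial data.

```python
def yield_rate(weights, lower=120.0, upper=130.0):
    """Fraction of samples whose weight lies inside the acceptance range X."""
    accepted = sum(1 for w in weights if lower <= w <= upper)
    return accepted / len(weights)


# Reported counts: 500 samples, 25 rejected under actual operation,
# none rejected under the trained policy.
actual_yield = (500 - 25) / 500
policy_yield = (500 - 0) / 500

print(f"actual: {actual_yield:.0%}, trained policy: {policy_yield:.0%}")
```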


Energy comparison

Utilizing Eq. (14) as our basis, we calculate the energy consumption attributed to temperature fluctuations. This equation indicates an average energy consumption of 47.5 kWh for the chambers, based on real-world data obtained from the industrial dryer. However, under the trained policy, the average energy consumption notably reduces to approximately 45 kWh. This reduction signifies the achievement of a more optimized and efficient performance under controlled conditions.

Equation (15) offers a stable estimate of conveyor belt energy consumption, particularly in scenarios featuring constant speed and uniform load variations. However, energy consumption may vary under controlled settings, depending on the control policy; Eqs. (7), (15), and (16) together determine this variation. Table 6 presents a detailed breakdown of the calculations, including the mass computed via Eq. (7) and the load, determined as the product of mass and gravitational acceleration (\(g=10\,m/{s}^{2}\)). Using data from Tables 2 and 4, Eqs. (15) and (16) give \({P}_{{mech}}\) and \({P}_{{input}}\). Furthermore, \({P}_{{input}}\) is converted to kilowatts (1 kW = 1000 N·m/s) to measure the belt energy consumption over 1 hour, adjusted for a 2-hour time step. The table thus summarizes the energy dynamics within the system.
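The chain of conversions described above can be sketched as follows. Equations (15) and (16) are not reproduced at this point in the text, so the forms used here are assumptions: mechanical power is taken as load times belt speed, and electrical input power as mechanical power divided by the motor efficiency \(\eta\). The mass, speed, and efficiency values are hypothetical and are not entries from Table 6.

```python
G = 10.0  # gravitational acceleration in m/s^2, as used in the text


def belt_energy_kwh(mass_kg, speed_m_s, efficiency, hours=2.0):
    """Energy drawn by one belt over a time step, under the assumed power model."""
    load_n = mass_kg * G                  # load = m * g, in newtons
    p_mech_w = load_n * speed_m_s         # ASSUMED form of Eq. (15), in N*m/s (W)
    p_input_w = p_mech_w / efficiency     # ASSUMED form of Eq. (16)
    p_input_kw = p_input_w / 1000.0       # 1 kW = 1000 N*m/s
    return p_input_kw * hours             # kWh over the 2-hour time step


# Hypothetical values: 900 kg on the belt, 1.0 m/s, 90% motor efficiency.
energy = belt_energy_kwh(mass_kg=900.0, speed_m_s=1.0, efficiency=0.9)
print(f"{energy:.2f} kWh per 2-hour step")
```

Under this model, the energy scales linearly with both belt speed and load, so a policy that lowers either one reduces consumption proportionally.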


Table 6 not only provides insights into energy consumption under actual (industrial) conditions but also presents data reflecting the average load and speeds observed when operating under the trained policy. These observations serve as the basis for calculating energy consumption. Specifically, the average speeds recorded for belts 1, 2, and 3 under the controlled policy are ~1.15 m/s, 1.15 m/s, and 1.17 m/s, respectively.

In Fig. 4, we offer a comprehensive comparison of the accumulated energy consumption between chambers and belts under real-world (industrial) conditions and controlled settings. This comparison relies on average data collected per time step over a week (168 hours). Notably, there’s an average 3% reduction in energy consumption per time step observed for both chambers and belts. Over the week-long comparison period, this translates to an average energy savings of 154 kWh for controlled chambers and 295 kWh for controlled belts, respectively. It’s important to note that the cumulative nature of energy consumption over time can result in a broad y-axis range, making it challenging to distinguish minor differences between two values.

Comparing energy consumption under an actual and controlled setting.


It is important to acknowledge certain limitations in our work. We did not consider the impact of motors’ random disruptions in conveyor belt dryers, which can potentially affect both product quality and energy consumption. Additionally, we overlooked energy losses, particularly those arising from electric motors and within drying chambers. In future research, we aim to address these factors and explore operations under conditions of random disturbance to further enhance the efficiency and robustness of conveyor belt drying systems.

Discussion

This research investigates conveyor belt dryers, focusing specifically on their applications within the petrochemical industry. It entails gathering and analyzing real-world data derived from industrial conveyor belt dryers. Given the intricate interplay among various drying variables—such as chamber temperatures, conveyor belt speeds, fan speeds, and product quality—the study adopts a machine-learning approach, centering around a GCN modified for regression tasks. The GCN model undergoes training on industrial data to predict product quality based on input parameters. It’s imperative to note that product quality isn’t solely dependent on controllable parameters. Uncontrollable factors like ambient temperature and humidity also exert a significant influence. Abrupt shifts in these uncontrollable variables can push product quality beyond acceptable limits. To mitigate this, a MADDPG algorithm is employed for real-time control amid such uncertainties. The MADDPG algorithm, well-suited for managing extensive state-action spaces and continuous action values, treats each drying chamber, conveyor belt, and fan as an autonomous agent. Moreover, energy consumption emerges as a significant concern within the drying process. The heat sources in the chambers and the power driving the conveyor belts constitute major energy consumers in conveyor belt dryers. Hence, the energy associated with these aspects is monitored and analyzed. The MADDPG algorithm plays a pivotal role in controlling both product quality and energy consumption within the conveyor belt drying system. To evaluate the effectiveness of the control algorithm, a case study is conducted, configuring a conveyor belt dryer identical to the industrial setup. The results are then compared, encompassing production quality and energy consumption, between the actual industrial scenario and the controlled setting. 
Encouragingly, the control algorithm yields notable reductions in total energy consumption and improvements in product quality.

Methods

System description

An industrial conveyor belt dryer is constructed using a series of drying chambers, with each drying chamber serving as the fundamental module that forms the entire system. To optimize performance, these drying chambers are grouped to create drying sections. Within a drying section, all chambers share a common conveyor belt, which evenly distributes the product to be dried as it enters the section. Naturally, the product undergoes redistribution as it exits one drying section and enters the subsequent one.

For illustration purposes, a conveyor dryer comprising three belt conveyors \({b}_{i}\), each driven by an electric motor, is shown in Fig. 5. The dryer is divided into 9 chambers, represented as \({C}_{i}\), and each belt \({b}_{i}\) is associated with three chambers (or one section) and one fan \({F}_{i}\). The product arrives continuously at an arrival rate \({q}_{t}\), and each belt \({b}_{i}\), moving at speed \({v}_{{b}_{i}}^{t}\), feeds the next one. Each chamber \({C}_{i}\) has an individual temperature setting \({t}_{i}\). The fans can operate at a variable speed \({v}_{{F}_{i}}^{t}\).

Structure of a conveyor belt dryer.


In this system, the following notations are used.

  1. \({C}_{i}\), where \(i=1,2,\ldots ,n\), represents the \({i}^{{th}}\) chamber;

  2. \({b}_{i}\), where \(i=1,2,\ldots ,n\), represents a motor controlling the \({i}^{{th}}\) belt;

  3. \({F}_{i}\), where \(i=1,2,\ldots ,n\), denotes the \({i}^{{th}}\) fan;

  4. \({v}_{{b}_{i}}^{t}\) represents the speed of the conveyor belt \({b}_{i}\) at time \(t\);

  5. \({v}_{{F}_{i}}^{t}\) represents the speed of the fan \({F}_{i}\) at time \(t\);

  6. \({t}_{{C}_{i}}^{t}\) is the temperature of the chamber \({C}_{i}\) at time \(t\);

  7. \({e}_{{b}_{i}}^{t}\) is the energy consumption rate of the belt \({b}_{i}\) at time \(t\);

  8. \({H}_{t}\) is the average humidity in the dryer at time \(t\);

  9. \({t}_{{amb}}\) is the ambient temperature;

  10. \({q}_{i}^{t}\) is the product arrival rate of the belt \({b}_{i}\) at time \(t\);

  11. \({L}_{i}\) is the length of the conveyor belt \({b}_{i}\);

  12. \(w\) is the width of each belt \({b}_{i}\) and remains constant for all the belts;

  13. \({h}_{i}^{t}\) is the thickness of the product layer on the belt \({b}_{i}\) at time \(t\);

  14. \(c\) is the junction point on each conveyor belt which defines the starting and ending of the conveyor’s length. It joins the length of a conveyor into a loop. The arrival rate \({q}_{t}\) and belt speed \({v}_{{b}_{i}}^{t}\) can change only at point \(c\);

  15. \(T{h}_{{b}_{i}}^{t}\) is the throughput of the conveyor belt \({b}_{i}\) at time \(t\);

  16. \({Q}_{i}^{t}\) denotes the amount of product processed by the belt \({b}_{i}\) at time \(t\);

  17. \({d}_{i}^{t}\) represents the product departing rate of the conveyor belt \({b}_{i}\) at time \(t\);

  18. \({q}_{{ev}}^{i}\) represents the amount of solvent evaporated in section \(i\);

  19. \({c}_{{et}}\) is the cost of energy associated with unit degree change in the chamber’s temperature;

  20. \({P}_{{input}}\) represents the electric input power;

  21. \({P}_{{mech}}\) represents the mechanical input power;

  22. \(\eta\) is the efficiency of an electric motor;

  23. \({o}_{t}^{i}\) represents the observation of agent \(i\) at time \(t\);

  24. \(X\) denotes an acceptable range for the sample weight, i.e., \(X=\left[\min ,\max \right]\).

The product arrives at an arrival rate \({q}_{t}\) to the first conveyor belt \({b}_{1}\). As the product arrives continuously and concurrently with the conveyor’s movement, it accumulates as a layer with a thickness of \(h\) on the conveyor belt. The processed amount \({Q}_{i}\left(x\right)\) by belt \({b}_{i}\) at position \(x\) is the product of speed, width, and thickness:

$${Q}_{i}\left(x\right)={v}_{{b}_{i}}{h}_{i}\left(x\right)w$$

(1)

Let \(q\) be the product arrival rate, \(t\) be the time, \(w\) be the width of the conveyor, \(v\) be the speed of the conveyor belt, and \(h\left(x\right)\) be the initial thickness of the product layer at position \(x\) along the conveyor. The integral of the processed amount over the conveyor length should equal the total product input:

$$\mathop{\int} \nolimits_{0}^{L}Q\left(x\right){dx}={qt}$$

(2)

$$\mathop{\int} \nolimits_{0}^{L}\left({vh}(x)w\right){dx}={qt}$$

(3)

With constant speed and width,

$${vw}\mathop{\int} \nolimits_{0}^{L}h\left(x\right){dx}={qt}$$

(4)

$$\mathop{\int} \nolimits_{0}^{L}h\left(x\right){dx}=({qt})/({vw})$$

(5)

where \(L\) is the length of the conveyor.

Equation (5) implies that the integral of thickness over the conveyor length is constant for a given product arrival rate, time, speed, and conveyor width.

Similarly, let \({q}_{1}^{t}\) and \({q}_{2}^{t}\) be the product arrival rates to conveyors \({b}_{1}\) and \({b}_{2}\) at time \(t\), respectively, and let \(\rho\) be the density of the product. The conservation of mass equation for a small time interval \(\Delta t\) can be written as

$${q}_{1}\left(t\right)\Delta t={q}_{2}\left(t\right)\Delta t$$

(6)

where \({q}_{i}={A}_{i}{v}_{i}{\rho }_{i}\) and \(A\) is the cross-sectional area \(\left(A={wh}\right)\). Based on the conservation of mass equation, the input, i.e., the product arrival rate \({q}_{i}^{t}\), should be equal to the product departing rate \({d}_{i}^{t}\). Since all the conveyors are interlinked, the product departing rate of conveyor belt \({b}_{i}\) is the product arrival rate of conveyor belt \({b}_{i+1}\), i.e., \({q}_{i+1}^{t}={d}_{i}^{t}\). Therefore,

$$\begin{array}{c}{q}_{1}^{t}={q}_{2}^{t}\\ {A}_{1}{v}_{1}^{t}{\rho }_{1}={A}_{2}{v}_{2}^{t}{\rho }_{2}\\ (w{h}_{1}^{t}){v}_{1}^{t}{\rho }_{1}=(w{h}_{2}^{t}){v}_{2}^{t}{\rho }_{2}\end{array}$$

(7)

Since the product is heated along the conveyor, its density on conveyor belt \({b}_{i}\) will always be greater than its density on conveyor belt \({b}_{i+1}\) because of the continuous evaporation. Letting the amount of evaporation in section \(i\) be represented by a constant \({q}_{{ev}}^{i}\), we can write \({\rho }_{i}={\rho }_{i+1}+{q}_{{ev}}^{i}\). Furthermore, our throughput represents the volume of the product rather than the mass, so we neglect the density variations while estimating the throughput:

$${h}_{1}^{t}{v}_{1}^{t}={h}_{2}^{t}{v}_{2}^{t}$$

(8)

$$\frac{{h}_{1}^{t}}{{h}_{2}^{t}}=\frac{{v}_{2}^{t}}{{v}_{1}^{t}}$$

(9)

Thus, we can see that the speed and layer thickness are inversely related, and the variation in one is balanced by the other. Hence, the overall quantity remains the same i.e.,

$$T{h}_{{b}_{1}}^{t}=T{h}_{{b}_{2}}^{t}=\cdots =T{h}_{{b}_{n}}^{t}={Q}_{i}^{t}$$

(10)
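A small numerical check of Eqs. (1) and (8)–(10) is given below. The belt speeds, belt width, and initial layer thickness are hypothetical values chosen for readability, and the function names are illustrative. The point is only that scaling the thickness inversely with speed keeps the volumetric throughput identical on every belt.

```python
def layer_thicknesses(h1, speeds):
    """Thickness on each belt from Eq. (8): h_1 * v_1 = h_i * v_i."""
    v1 = speeds[0]
    return [h1 * v1 / v for v in speeds]


def throughput(v, h, w):
    """Eq. (1): processed amount Q = v * h * w."""
    return v * h * w


speeds = [1.0, 1.25, 2.0]   # hypothetical belt speeds (m/s)
width = 2.0                 # hypothetical belt width (m), same for all belts
h = layer_thicknesses(0.05, speeds)                       # layer thickness per belt (m)
q = [throughput(v, hi, width) for v, hi in zip(speeds, h)]

print("thickness per belt:", h)
print("throughput per belt:", q)
```

A faster downstream belt carries a thinner layer, and all three throughputs coincide, as stated in Eq. (10).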

At each time step \(t\), the throughput (processed product) of \({i}^{{th}}\) conveyor belt \({Q}_{i}^{t}\) can be simply calculated based on Eq. (1). Similarly, once \({Q}_{i}^{t}\) is found, we can use \({q}_{i}=w{h}_{i}{v}_{i}\) to find the thickness of the product layer \({h}_{i}\) on belt \({b}_{i}\). In the dryer system modeling, production throughput is not the sole critical parameter; the quality of the output is equally significant. Typically, quality is measured by assessing the dryness level of the product. To evaluate this, the weight of the product is measured at the end of each time step \(t\). If the product weight falls within a predefined range \(X\), the product batch (where one batch corresponds to the quantity processed in a single time step, \(t\)) is considered acceptable. Otherwise, it necessitates reprocessing through the dryer. It is worth noting that this paper does not delve into the details of the reprocessing, and any product batch failing to meet the quality standard is treated as scrap.

Furthermore, it is important to note that although the quantity of product, denoted as \({Q}_{i}^{t}\) remains constant, the load itself changes as the product progresses through the dryer. This change in load arises from the fact that, as the product traverses the dryer, its mass decreases due to evaporation, subsequently resulting in variations in density. To accurately estimate this density variation, we leverage industrial data and chart the density fluctuations across the drying chambers, as exemplified in Fig. 6. This visual representation of density changes serves as a crucial reference point for estimating the amount of mass that undergoes evaporation. This information, in turn, aids in determining the load on the succeeding conveyor belt, which is used to estimate the belt’s energy consumption in “GCN Modeling”.

Density variation along the drying chambers.


Since the load depends on the product mass, i.e., \({load}={mg}\), Eqs. (7–10) are used to estimate the mass, i.e., \({q}_{i}=w{h}_{i}{v}_{i}{\rho }_{i}\), where \(\rho =m/V\). This will be further used in “GCN Modeling” to estimate the energy.

Building upon the density transitions depicted in Fig. 6, the average density difference between two successive sections is approximately \(350\,{kg}/{m}^{3}\). In practical terms, this means that, on average, evaporation in each section removes ~350 kg of solvent per cubic meter of product. For section \(i+1\), the density \({\rho }_{i+1}\) is therefore obtained by subtracting this value from the density of the previous section: \({\rho }_{i+1}={\rho }_{i}-350\). Additionally, it is prudent to assume that the capacity of the conveyor motors exceeds the maximum load by at least 30%. This margin ensures that the conveyor system can accommodate load variability and operate efficiently under different conditions.
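The density recurrence and the resulting belt loads can be sketched as below. The initial density of 1500 kg/m³ and the product volume of 0.5 m³ per belt are hypothetical values, and the function names are illustrative. The 350 kg/m³ drop per section, \(g=10\,m/{s}^{2}\), and the 30% motor margin are taken from the text, and the volume is held constant across belts in line with Eq. (10).

```python
G = 10.0              # gravitational acceleration (m/s^2), as used in the text
DENSITY_DROP = 350.0  # average density decrease per section (kg/m^3)


def section_densities(rho_first, n_sections):
    """Apply rho_{i+1} = rho_i - 350 starting from the first section."""
    return [rho_first - DENSITY_DROP * i for i in range(n_sections)]


def belt_loads(volume_m3, densities):
    """load = m * g with m = rho * V; the volume on each belt is held constant."""
    return [rho * volume_m3 * G for rho in densities]


densities = section_densities(1500.0, 3)  # hypothetical initial density
loads = belt_loads(0.5, densities)        # hypothetical volume per belt

# The motor capacity should exceed the maximum load by at least 30%.
min_motor_capacity = 1.3 * max(loads)

print("densities (kg/m^3):", densities)
print("loads (N):", loads)
print("minimum motor capacity (N):", min_motor_capacity)
```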

GCN modeling

In the existing literature, rigorous modeling techniques based on the fundamental principles of heat and mass transfer are employed to estimate product quality4,13. However, these models often fall short of a comprehensive description that incorporates environmental conditions and process parameters such as temperature, humidity, fan speed, and conveyor belt speed together with product quality, owing to the complex interactions among these factors. Therefore, in this paper, we adopt a machine-learning approach to describe the connections between these input parameters and sample product quality. Specifically, we employ a GCN model14 to predict the sample weight for a given set of input parameters.

The choice of GCN to model the system is driven by the data’s inherent graph-like structure, the interactions between components in the system, and the suitability of this approach for facilitating the control of the system. In this framework, system components such as fans, chambers, and belts are represented as distinct nodes within a graph, which can later be treated as agents for control purposes. The relationships and interdependencies among these agents are identified and captured in an adjacency matrix, as shown in Table 7. This table outlines the interconnections between the 15 nodes: \({B}_{i}\) represents the three belts, \({C}_{i}\) nine chambers, and \({F}_{i}\) the three fans. A value of 1 in the matrix indicates a relationship between two nodes, while a 0 denotes no connection.


The adjacency matrix is then used to construct a graphical representation of the system. Each graph corresponds to a specific combination of input parameters, such as fan speeds, chamber temperatures, and conveyor belt speeds. Additionally, each graph is paired with a corresponding label indicating sample weight. These graph-label pairs form the input dataset for training the GCN model.
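The construction of graph-label pairs can be sketched as follows. Since Table 7 is not reproduced here, the connectivity below is an assumption made only for illustration: each belt and fan is linked to the three chambers of its own zone and to each other, following the grouping that appears later in Eq. (17). The names `build_adjacency` and `make_sample`, and the choice of one scalar feature per node, are likewise hypothetical.

```python
import numpy as np

# Node order: three belts, nine chambers, three fans (15 nodes in total).
NODES = ([f"B{i}" for i in range(1, 4)]
         + [f"C{i}" for i in range(1, 10)]
         + [f"F{i}" for i in range(1, 4)])
INDEX = {name: k for k, name in enumerate(NODES)}


def build_adjacency() -> np.ndarray:
    """Symmetric 0/1 adjacency matrix under the ASSUMED zone-wise connectivity."""
    A = np.zeros((len(NODES), len(NODES)), dtype=int)
    for zone in range(1, 4):
        belt, fan = f"B{zone}", f"F{zone}"
        chambers = [f"C{c}" for c in range(3 * zone - 2, 3 * zone + 1)]
        edges = [(belt, fan)]
        edges += [(belt, c) for c in chambers]
        edges += [(fan, c) for c in chambers]
        for a, b in edges:
            A[INDEX[a], INDEX[b]] = 1
            A[INDEX[b], INDEX[a]] = 1
    return A


def make_sample(belt_speeds, chamber_temps, fan_speeds, weight):
    """Pair one parameter combination (one scalar per node) with its weight label."""
    x = np.array(list(belt_speeds) + list(chamber_temps) + list(fan_speeds),
                 dtype=float).reshape(-1, 1)
    return {"x": x, "adj": build_adjacency(), "y": float(weight)}


A = build_adjacency()
sample = make_sample([0.2] * 3, [90.0] * 9, [1200.0] * 3, weight=4.8)
```

A dataset is then a list of such samples, one per recorded combination of input parameters.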

The intricate interconnections and dependencies between these nodes pose a challenge for traditional regression models, which are typically designed for simpler, often linear relationships and do not account for interactions between components in the data. GCNs, however, offer significant advantages for graph-structured data. Unlike conventional methods that often assume independent and identically distributed inputs, GCNs capture and exploit the relationships between interconnected nodes. This enables them to model complex dependencies and interactions more effectively, which is an advantage where the data naturally forms a graph-like structure, as is the case here.

The GCN model is designed to learn these complex relationships from empirical data. Trained on real-world industrial data, it can capture patterns and dependencies that may not be apparent through traditional analytical approaches, and it then serves as a tool for estimating product weight from the input parameters. The trained model can be applied to new, unseen graphs, which makes it well-suited for testing and validation in real-world scenarios. The core mathematical operation in each GCN layer can be described as:

$${H}^{(l+1)}=\sigma \left({\widehat{D}}^{-\frac{1}{2}}\widehat{A}{\widehat{D}}^{-\frac{1}{2}}{H}^{(l)}{W}^{(l)}\right)$$

(11)

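where \(\sigma \left(.\right)\) is the activation function, \(\hat{A}\) denotes the adjacency matrix (with self-loops added, \(\hat{A}=A+I\), in the standard GCN formulation), \(\hat{D}\) is the degree matrix of \(\hat{A}\), \({H}^{(l)}\) represents the node features at layer \(l\), and \({W}^{(l)}\) is the learnable weight matrix of that layer.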
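A single propagation step of Eq. (11) can be sketched in NumPy as below. The sketch assumes the common convention that \(\hat{A}\) is the adjacency matrix with self-loops added and uses ReLU as the activation; the three-node path graph, the identity features, and the all-ones weights are illustrative values, not the paper's data.

```python
import numpy as np


def gcn_layer(A, H, W, activation=lambda z: np.maximum(z, 0.0)):
    """One GCN layer: sigma(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops (assumed convention)
    deg = A_hat.sum(axis=1)               # node degrees of A_hat
    D_inv_sqrt = np.diag(deg ** -0.5)     # D^-1/2 as a diagonal matrix
    return activation(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)


# Toy path graph 0 - 1 - 2 with one-hot node features and all-ones weights.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.eye(3)
W = np.ones((3, 2))
out = gcn_layer(A, H, W)
print(out.shape)  # (3, 2)
```

With identity features and all-ones weights, each output row is simply the row sum of the normalized adjacency matrix, which makes the effect of the symmetric normalization easy to inspect.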

Equation (11) encapsulates the propagation of information between connected nodes through successive layers. After the model has been trained, the sample weight \(w\) can be predicted as follows:

$${w}_{{pred}}={GCN}\left(G\right)$$

(12)

where \(G\) is the individual graph representing a combination of different input parameters, and \({GCN}\) is the trained model.

While GCNs are traditionally used for classification tasks, they can be adapted for regression by modifying their final layer and choosing an appropriate loss function. In our approach, the final layer of the GCN outputs continuous values, allowing it to predict real-valued variables such as the sample product weight. The primary adaptation involves using the MSE loss function, which quantifies the discrepancy between the predicted and actual sample product weight. The GCN model is trained to minimize the MSE loss function given in Eq. (13):

$$L\left(y,\hat{y}\right)=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}$$

(13)

where \(L\) represents the MSE loss; \(n\) is the number of graphs, i.e., each combination of input parameters; \(y\) denotes the true expected weights; \(\hat{y}\) signifies the predicted weights. By optimizing this loss function, the GCN model is fine-tuned to make accurate continuous predictions, making it well-suited for modeling this system.
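As a small worked instance of Eq. (13), the loss over a batch of graphs can be computed as follows. The weights in the example are made-up numbers used only to show the arithmetic.

```python
def mse_loss(y_true, y_pred):
    """Mean squared error over n graphs, as in Eq. (13)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Three hypothetical graphs: squared errors are 1, 0 and 4, so the loss is 5/3.
loss = mse_loss([10.0, 12.0, 11.0], [9.0, 12.0, 13.0])
```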

Energy consumption evaluation

Managing energy consumption in the context of a conveyor belt dryer presents a substantial challenge. The primary contributor to energy consumption during drying is the energy needed to provide the latent heat required for evaporation15. Accurately estimating and calculating this energy demand necessitates a comprehensive mathematical model that accounts for all relevant factors. However, the detailed mathematical modeling of this energy aspect goes beyond the scope of our current paper. Given that we have adopted an ML approach to control the drying process, we introduce a constant, denoted as \({c}_{{et}}\), which represents the cost of energy associated with a unit-degree change in chamber temperature for each time step. This energy cost value plays a pivotal role in the reward function of our control algorithm, as will be elucidated later in our discussion. The overall energy consumed in maintaining a specific temperature within the dryer can be approximated using Eq. (14).

$${\rm{Chambers}}^{\prime}\,{\rm{energy\; consumption}}=\mathop{\sum }\limits_{i=1}^{n}{t}_{{C}_{i}}^{t}{c}_{{et}}$$

(14)

where \(n\) is the number of chambers, \({t}_{{C}_{i}}^{t}\) represents the temperature of the chamber \({C}_{i}\) at time \(t\), and \({c}_{{et}}\) is the energy constant associated with unit degree change in the chamber’s temperature.

In addition to the energy consumption associated with heat transfer, conveyor belts represent another significant source of energy usage16. Extensive research has been dedicated to understanding and mitigating energy consumption related to heat transfer17. However, there is a notable research gap when it comes to the energy demands imposed by the robust electric motors responsible for driving conveyor loads. These motors play a pivotal role in the energy consumption of a conveyor belt dryer since their operation is closely linked to the load on the conveyor. The energy consumption of these motors is contingent on their speed and the product’s weight being conveyed.

Most electric motors are designed to operate efficiently within a load range of 50%–100% of their rated capacity. Typically, peak efficiency occurs at around 75% of the rated load. For instance, a 10-horsepower (hp) motor has a suitable load range of 5–10 hp, with its highest efficiency point at 7.5 hp. Below ~50% load, a motor’s efficiency tends to drop significantly18. However, it’s important to note that the specific range of optimal efficiency can vary among individual motors and tends to be broader for larger motors.

To estimate the energy consumption per unit time (typically measured in watts or kilowatts) for an electric motor in a conveyor belt dryer system that features variable speed and varying product weight, we can apply Eq. (15).

$${\rm{Energy\; Consumption\; Rate}},{e}_{{b}_{i}}^{t}=\frac{{P}_{{input}}}{\triangle t}$$

(15)

where \({P}_{{input}}\) is the electric input power and \(\triangle t\) is the time interval over which the estimated energy consumption is to be found. It is usually measured in seconds.

The electric input power \({P}_{{input}}\) depends on the mechanical power \({P}_{{mech}}\) and motor efficiency \(\eta\) i.e.,

$${P}_{{input}}=\frac{{P}_{{mech}}}{\eta }$$

(16)

Similarly,

$${P}_{{mech}}={Load}\left(N\frac{2\pi }{60}\right)$$

where \({Load}={mg}\) represents the weight of the product being conveyed. Equations (7)–(10) from “Methods” are employed to calculate the mass of the product on the conveyor. \(N\) is the speed, which depends on the circumference of the drive roller; the circumference can be calculated from the roller diameter, i.e., \({circumference}=\pi D\). If the speed is given in \({RPM}\), it is converted to radians per second by multiplying by \(\frac{2\pi }{60}\).
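The chain from product mass to electric input power can be sketched as below. The code transcribes Eq. (16) and the expression for \({P}_{{mech}}\) exactly as stated in the text, without adding any roller geometry; the value of \(g\), the 100 kg mass, the 30 RPM speed, and the efficiency of 0.9 are illustrative assumptions.

```python
import math


def rpm_to_rad_per_s(rpm: float) -> float:
    """Convert a speed in RPM to radians per second (multiply by 2*pi/60)."""
    return rpm * 2.0 * math.pi / 60.0


def mechanical_power(mass_kg: float, speed_rpm: float, g: float = 9.81) -> float:
    """P_mech = Load * (N * 2*pi/60) with Load = m*g, as written in the text."""
    load = mass_kg * g
    return load * rpm_to_rad_per_s(speed_rpm)


def input_power(p_mech: float, efficiency: float) -> float:
    """Eq. (16): P_input = P_mech / eta, for 0 < eta <= 1."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return p_mech / efficiency


# Hypothetical operating point: 100 kg on the belt, 30 RPM, 90% motor efficiency.
p_mech = mechanical_power(100.0, 30.0)
p_in = input_power(p_mech, 0.9)
```

Because efficiency drops below roughly 50% load, a fuller sketch would make `efficiency` a function of the load fraction; a constant is used here for brevity.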

Control problem formulation

In a conveyor belt dryer system, designing the control strategy is not trivial because several control inputs must be coordinated19. In this work, the control inputs include the fan speeds, belt speeds, and chamber temperatures, which determine the quality and quantity of the dried product and the associated energy consumption. Because the system lacks a closed-form representation, applying traditional control methods is challenging20. Furthermore, the complexity of the problem, which involves a large number of states and actions and unknown transition probabilities, makes it difficult to use operations research techniques. As a result, the problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP) and tackled using multi-agent reinforcement learning (MARL)21. The objective of the control problem is to determine an optimal policy, mapping states to actions, that maximizes the reward.

The Dec-POMDP framework is particularly suitable for decision-making and coordination in multi-agent environments. It can be represented by a tuple \((n,S,\{{U}_{i}\},\{{O}_{i}\},P,r,\gamma)\)22. Within this framework, there are \(n\) agents denoted by \(i\in \left\{1,2,\ldots ,n\right\}\) that collaborate to achieve a common task. At each decision point, agent \(i\) selects an action \({u}^{i}\in U\), forming a joint action \({\bf{u}}\in {\bf{U}}\equiv {U}^{n}\). These actions influence the true state of the environment \(s\in S\) according to a transition probability function \(P\left({s}^{{\prime} }\,|\,s,{\bf{u}}\right):S\times {\bf{U}}\times S\to [0,1]\). Consequently, a global reward \(r\left(s,{\bf{u}}\right):S\times {\bf{U}}\to {\mathbb{R}}\) is generated. Each agent has access only to partial observations, denoted as \({o}^{i}\in {O}^{i}\), and selects actions based on a policy \({\pi }^{i}({u}^{i}\,|\,{\sigma }^{i})\) that depends on its local observation-action history \({\sigma }^{i}\in \Sigma \equiv {\left({O}^{i}\times U\right)}^{*}\). The primary objective of each agent is to maximize the discounted sum of rewards over an episode, using a discount factor \(\gamma\).
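The discounted sum of rewards that the agents maximize can be computed as in the following sketch. The function name, the reward sequence, and the discount factor of 0.5 are illustrative and not taken from the paper.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma^t * r_t, accumulated backwards over one episode."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Three steps of reward 1 with gamma = 0.5: 1 + 0.5 + 0.25 = 1.75
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```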

Dec-POMDP framework for MARL

Within the framework of MARL, control decisions typically involve the generation of a probability distribution encompassing the action space. At each discrete time step \(t\), an action is stochastically selected for each agent. This action selection results in a transition to a new state and the accrual of a corresponding reward. The dynamics of this transition to the new state are dictated by a transition probability, a feature inherent to the system. Agents, through a learning process, adapt their action selections based on their environmental observations, collectively known as the policy. The effectiveness of this policy is evaluated by considering the cumulative sum of rewards obtained over an episode, taking into account a discount factor. The primary aim of MARL is to compute a policy that maximizes this cumulative sum of rewards. Notably, in this system, the transition probabilities remain unknown. Consequently, we employ a model-free MARL algorithm, a strategy that obviates the need for making assumptions about transition probabilities. Instead, it concentrates solely on defining the observations, actions, and rewards in the learning process.

Observations: Observations refer to information that each agent has about the current state of the system. These "raw observations" only reflect local information from individual agents and may result in sub-optimal performance in coordinated control problems. The observation of each agent \(i\) is represented as,

$${o}_{t}^{i}=\left\{\begin{array}{cc}\begin{array}{c}{v}_{{b}_{1}}^{t},{v}_{{f}_{1}}^{t},{t}_{{C}_{1}}^{t},{t}_{{C}_{2}}^{t},{t}_{{C}_{3}}^{t},{H}_{t},{t}_{{amb}}{\rm{;}}\\ {v}_{{b}_{2}}^{t},{v}_{{f}_{2}}^{t},{t}_{{C}_{4}}^{t},{t}_{{C}_{5}}^{t},{t}_{{C}_{6}}^{t},{H}_{t},{t}_{{amb}}{\rm{;}}\\ {v}_{{b}_{3}}^{t},{v}_{{f}_{3}}^{t},{t}_{{C}_{7}}^{t},{t}_{{C}_{8}}^{t},{t}_{{C}_{9}}^{t},{H}_{t},{t}_{{amb}}{\rm{;}}\end{array} & \begin{array}{c}{if}i\in [{f}_{1},{b}_{1},{C}_{1},{C}_{2},{C}_{3}]\\ {if}i\in [{f}_{2},{b}_{2},{C}_{4},{C}_{5},{C}_{6}]\\ {if}i\in [{f}_{3},{b}_{3},{C}_{7},{C}_{8},{C}_{9}]\end{array}\end{array}\right.$$

(17)

where \({v}_{{b}_{i}}^{t}\) is the speed of the conveyor belt \({b}_{i}\) at time \(t\); \({v}_{{f}_{i}}^{t}\) is the speed of the fan \({f}_{i}\) at time \(t\); \({t}_{{C}_{i}}^{t}\) is the temperature of the chamber \({C}_{i}\) at time \(t\); \({H}_{t}\) is the humidity at time \(t\); and \({t}_{{amb}}\) is the ambient temperature at time \(t\).
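The grouping in Eq. (17) means that every agent in a zone (one fan, one belt, and three chambers) receives the same seven-element local observation. A minimal sketch of that lookup is given below; the agent labels, function name, and numeric values are hypothetical.

```python
# Map each agent label to its zone: zone z holds f_z, b_z and chambers C_{3z-2}..C_{3z}.
ZONE_OF = {}
for zone in (1, 2, 3):
    members = [f"f{zone}", f"b{zone}"]
    members += [f"C{c}" for c in range(3 * zone - 2, 3 * zone + 1)]
    for name in members:
        ZONE_OF[name] = zone


def local_observation(agent, belt_speeds, fan_speeds, chamber_temps, humidity, t_amb):
    """Build the observation of Eq. (17) for one agent from plant-wide readings."""
    k = ZONE_OF[agent] - 1  # zero-based zone index
    return (belt_speeds[k], fan_speeds[k],
            *chamber_temps[3 * k:3 * k + 3],
            humidity, t_amb)


obs = local_observation(
    "C5",
    belt_speeds=[0.1, 0.2, 0.3],
    fan_speeds=[10, 20, 30],
    chamber_temps=[1, 2, 3, 4, 5, 6, 7, 8, 9],
    humidity=0.4,
    t_amb=25,
)
print(obs)  # (0.2, 20, 4, 5, 6, 0.4, 25)
```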

Action: An agent’s control actions encompass the adjustment of fan or conveyor belt speeds and the regulation of chamber temperatures, depending on its type. These control actions have a direct effect on the quantity and quality of the dried product, as well as the energy consumption. The control action is defined as

$$A=\left\{{v}_{{b}_{i}}^{t},{v}_{{f}_{i}}^{t},{t}_{{C}_{j}}^{t}\right\}$$

(18)

where \({v}_{{b}_{i}}^{t}\) represents the action of the conveyor belt \({b}_{i};\) \({v}_{{f}_{i}}^{t}\) represents the action of the fan \({f}_{i}\); and \({t}_{{C}_{j}}^{t}\) represents the action of the chamber \({C}_{j}\).

Reward: A global reward describes how good the combined action of all agents is. A proper reward setting is therefore required to ensure that the resulting policy behaves as expected. A good reward setting requires domain knowledge of the system and is designed so that increasing the overall discounted sum of the reward pushes the system toward the desired operation. A variety of possible reward settings were explored for the problem, and the following was found to perform best.

$$R={Q}^{t}-\mathop{\sum }\limits_{j=1}^{k}\mathop{\sum }\limits_{i=1}^{n}\left({t}_{{C}_{j}}^{t}{c}_{{et}}+{e}_{{b}_{i}}^{t}\right)+{\theta }^{t}$$

(19)

where \({Q}^{t}\) is the amount of product dried at time \(t\); \({t}_{{C}_{j}}^{t}\) is the temperature of chamber \(j\) at time \(t\); \({c}_{{et}}\) is the energy cost constant introduced with Eq. (14); \({e}_{{b}_{i}}^{t}\) represents the energy consumption of conveyor belt \({b}_{i}\) at time \(t\), which can be calculated based on Eq. (15); and \({\theta }^{t}\) is a quality term defined as follows.

$${\theta }^{t}=\left\{\begin{array}{l}\begin{array}{ll}1{\rm{;}} & {if}{w}_{{pred}}\in X\end{array}\\ \begin{array}{ll}-1{\rm{;}} & {if}{w}_{{pred}}\,\notin\, X\end{array}\end{array}\right.$$

(20)

where \(X\) is the desired range for the predicted sample weight \({w}_{{pred}}\).
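Equations (19) and (20) can be transcribed into a short sketch. Two assumptions are made here and are not stated in the text: the index ranges \(k\) and \(n\) are taken to be the numbers of chambers and belts, and the desired range \(X\) is treated as a closed interval. The double sum is evaluated literally as written in Eq. (19), and all numeric values are illustrative.

```python
def quality_term(w_pred, w_min, w_max):
    """Eq. (20): +1 if the predicted weight lies in X = [w_min, w_max], else -1."""
    return 1.0 if w_min <= w_pred <= w_max else -1.0


def global_reward(q_dried, chamber_temps, belt_energy, c_et, w_pred, w_range):
    """Eq. (19): R = Q - sum_j sum_i (t_Cj * c_et + e_bi) + theta."""
    energy = sum(t * c_et + e for t in chamber_temps for e in belt_energy)
    return q_dried - energy + quality_term(w_pred, *w_range)


# Two chambers and one belt: energy = (2*0.5 + 1) + (4*0.5 + 1) = 5
r_in = global_reward(100.0, [2.0, 4.0], [1.0], 0.5, w_pred=5.0, w_range=(4.0, 6.0))
r_out = global_reward(100.0, [2.0, 4.0], [1.0], 0.5, w_pred=7.0, w_range=(4.0, 6.0))
print(r_in, r_out)  # 96.0 94.0
```

The only difference between the two calls is whether the predicted weight falls inside the desired range, which shifts the reward by 2.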

MARL implementation using MADDPG algorithm

Contrary to single-agent reinforcement learning algorithms, MARL algorithms can deal with large action spaces. Furthermore, actor-critic MARL algorithms can handle medium to large state spaces, unlike commonly used value-based algorithms such as SARSA, Q-learning, and Deep Q-learning. However, in a multi-agent environment with continuous actions, coordination among the agents is often more subtle and requires fine-grained control, which MADDPG23 can facilitate.

MADDPG, an extension of the Deep Deterministic Policy Gradients (DDPG) algorithm24, is tailored for coordinating the continuous actions of multiple agents in multi-agent environments. Unlike DDPG, which relies solely on self-observation and action data, MADDPG equips critics with additional information, incorporating the observations and actions of all agents. The training process follows a centralized training and decentralized execution approach, where agents independently act based on their policies during execution but collaborate during training. MADDPG employs actor-critic networks, utilizing policy gradients, to optimize agent policies. These networks take into account local observations, actions of other agents, and environmental states during training, enabling cooperative adaptation. However, during execution, agents rely solely on their local observations. Each agent in MADDPG employs two primary neural networks, actor and critic, along with target networks for stability. Critic network updates minimize a loss function, while policy networks are updated to maximize expected rewards. Soft updates ensure target network stability. This comprehensive approach empowers MADDPG to effectively address complex multi-agent scenarios, promoting cooperation and adaptability in dynamic environments.
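Two of the mechanisms described above, the centralized critic input and the soft target update, can be sketched in plain Python. This is a simplified illustration rather than the authors' implementation: parameters are represented as flat lists of floats, and the mixing coefficient `tau` is a hypothetical hyperparameter whose value is not given in the text.

```python
def critic_input(observations, actions):
    """Centralized critic input: every agent's observation, then every agent's action."""
    flat = []
    for obs in observations:
        flat.extend(obs)
    for act in actions:
        flat.extend(act)
    return flat


def soft_update(target_params, online_params, tau):
    """Soft target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t
            for w_t, w in zip(target_params, online_params)]


# Two agents with 2-element observations and 1-element actions.
x = critic_input([[0.1, 0.2], [0.3, 0.4]], [[1.0], [-1.0]])
print(x)  # [0.1, 0.2, 0.3, 0.4, 1.0, -1.0]

# The target network moves a quarter of the way toward the online network.
print(soft_update([0.0, 1.0], [1.0, 0.0], tau=0.25))  # [0.25, 0.75]
```

During execution only the actor is used, so each agent needs its own observation alone; the concatenated input is required only while training the critic.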

Figure 7 represents a broad overview of our proposed methodology where the GCN and MADDPG are integrated. The environment, i.e., the dryer system, is employed to extract the relationship between different agents and develop an adjacency matrix. The adjacency matrix is used to graph the relationship between the agents at each time step. This graph is fed as input to the trained GCN model to predict the product weight, which will be used in the reward estimation for MADDPG. At the same time, the MADDPG also receives the graph-structured data and the predicted weight. It also receives energy consumption from the environment, and based on the total reward, it updates the control actions for the interacting agents, and this loop continues until the model is trained.

Fig. 7: Process flow for the proposed GCN-MADDPG algorithm.
