### Ordering and shipping

We consider a simple supply chain (SC) composed of *N* echelons, each adopting an order-up-to policy^{8,30,31} (Fig. 1). Each echelon can represent, for example, a retailer, a wholesaler, or a manufacturer of a given consumer good. When goods are sold at the consumer end of the SC (i.e., by the retailer), the retailer's inventory decreases and it places orders with a wholesaler to restock. The wholesaler in turn places orders with the next wholesaler or manufacturer to meet demand. In response to these orders, products or materials are shipped in the opposite direction.

The target stock level in the policy is determined by each echelon's forecast model, which is described in Section 2.2. In short, within a single discrete time step each echelon first receives an order from the adjacent downstream echelon (or from the end customer) and ships to it, and then immediately replenishes its stock by sending an order to the adjacent upstream echelon. A single discrete time step represents the interval between successive orders, which may correspond to a day, a week, etc., in a real setting. For simplicity, we assume that every company in the SC places its order synchronously at each time step. Note that we ignore lead time in this model to exclude its contribution to the bullwhip effect.

Technically, we update the system from \( t = T \) to \( t = T + 1 \) as follows. First, at the most downstream point (the end customer), demand is drawn from a Gaussian distribution, \( \mathcal{N}(\mu_0(T), \sigma_0(T)) \) (with time-varying parameters; see Sec. 2.3 for details). Note that we also performed all simulations with log-normal distributions (Supporting Information) to confirm that the conclusions are not specific to the Gaussian distribution. Then, we sequentially update each echelon, starting from the downstream end. Updating one echelon involves two stages: shipping to the adjacent downstream echelon, followed by replenishment through an order placed with the adjacent upstream echelon. Echelon *i* receives an order from the adjacent downstream echelon \( i-1 \) (or from the end customer for \( i = 1 \)), denoted \( O_{i-1}(T) \). If the ordered amount is lower than the stock level of echelon *i*, \( I_{i} \) […]

\[ I_{i,\mathrm{target}}(T) = \mu_i + c\,\sigma_i \tag{1} \]

where *c* is the safety factor that determines the amount of safety stock (i.e., the order-up-to policy^{8,30,31}). The order quantity is simply the difference between the target and the current (at \( t = T + 0.5 \)) inventory levels, i.e., \( O_i(T) = I_{i,\mathrm{target}}(T) - I_i(T + 0.5) \). If the current stock level exceeds the target stock level (which can happen when the forecast model is retrained), we set \( O_i(T) = 0 \). Note that this control policy alone does not amplify the demand signal and therefore does not cause the bullwhip effect^{36}.
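The replenishment rule can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the function name `order_quantity` is hypothetical, and the target level is assumed to take the form \( \mu_i + c\,\sigma_i \), an assumption that is consistent with the initial inventory stated under "Initial settings and conditions" (1.0 + 1.96 × 0.1 = 1.196).

```python
def order_quantity(mu: float, sigma: float, inventory: float, c: float = 1.96) -> float:
    """Order placed upstream under an order-up-to rule.

    Assumption: the target stock level is mu + c * sigma.
    `inventory` is the stock level after shipping (t = T + 0.5).
    """
    target = mu + c * sigma
    # Order the shortfall; order nothing when stock already exceeds the target.
    return max(0.0, target - inventory)


print(order_quantity(mu=1.0, sigma=0.1, inventory=0.2))  # shortfall is ordered
print(order_quantity(mu=1.0, sigma=0.1, inventory=1.5))  # above target -> 0.0
```

Clamping at zero mirrors the case described above in which a freshly retrained forecast lowers the target below the stock already on hand.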

### Fluctuations in demand

We also let the demand distribution at the end customer, \( \mathcal{N}(\mu_0, \sigma_0) \), vary over time. As an example, we define the following simple stochastic process with moderate variability. We fixed \( \sigma_0 = 0.1 \). At each update, the mean of the distribution, \( \mu_0 \), is redrawn from a uniform distribution between 1/2 and 2, that is, \( \mu_{0,\mathrm{new}} \sim U(1/2, 2) \).

We considered two types of update interval for the demand distribution, \( L_\text{int} \). The first is a constant update interval, with which the demand distribution is updated every \( L_\text{int} \) (= const.) time steps. In this study, we examined \( L_\text{int} = 50 \) and 100. The second is a random update interval, which is redrawn from the uniform distribution \( U(L_\text{min}, L_\text{max}) \) whenever the demand distribution is updated. Here we set \( L_\text{min} = 50 \) and \( L_\text{max} = 100 \). These two types of interval were used to show that our results do not depend on whether the update interval is constant or stochastic.
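The demand process described above might be simulated along the following lines. This is a sketch under stated assumptions, not the study's implementation: the function name `simulate_demand` and its parameters are hypothetical, random gaps are taken to be integers, and negative Gaussian draws are left unclipped.

```python
import random


def simulate_demand(n_steps, interval="constant", l_int=50,
                    l_min=50, l_max=100, sigma0=0.1, seed=0):
    """Return (demands, means) for the end-customer demand process."""
    rng = random.Random(seed)

    def next_gap():
        # Constant interval, or an integer gap drawn uniformly from [l_min, l_max]
        # (integer gaps are an assumption of this sketch).
        return l_int if interval == "constant" else rng.randint(l_min, l_max)

    mu0 = 1.0                      # initial mean, as in the initial conditions
    steps_left = next_gap()
    demands, means = [], []
    for _ in range(n_steps):
        if steps_left == 0:
            mu0 = rng.uniform(0.5, 2.0)   # mu_0,new ~ U(1/2, 2)
            steps_left = next_gap()
        demands.append(rng.gauss(mu0, sigma0))
        means.append(mu0)
        steps_left -= 1
    return demands, means


demands, means = simulate_demand(200, interval="constant", l_int=50)
print(len(set(means)))  # number of distinct means used over 200 steps
```

Passing `interval="random"` switches to gaps redrawn from \( U(L_\text{min}, L_\text{max}) \) at every update.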

### Retraining schemes for demand forecasting at each echelon

As the demand model changes, the forecast models of the echelons are also updated (i.e., retrained). Each echelon *i* refers to the past \( L_\text{train} \) orders it has received, \( \{O_{i-1}(T-1), \ldots, O_{i-1}(T-L_\text{train})\} \). When the model is updated, \( \mu_i \) and \( \sigma_i \) are replaced by the sample mean and the corrected sample standard deviation, respectively. The optimal value of \( L_\text{train} \) depends on the environment and is generally unknown.
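A retraining step of this kind could look as follows. The helper name `retrain` is hypothetical, and "corrected" is assumed here to mean the standard deviation with the \( n-1 \) denominator, which is what Python's `statistics.stdev` computes.

```python
from statistics import mean, stdev


def retrain(orders, l_train):
    """Re-estimate (mu_i, sigma_i) from the last l_train received orders."""
    window = orders[-l_train:]          # O_{i-1}(T-1), ..., O_{i-1}(T-L_train)
    return mean(window), stdev(window)  # stdev uses the n-1 denominator


mu_i, sigma_i = retrain([1.0, 2.0, 3.0, 4.0], l_train=3)
print(mu_i, sigma_i)
```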

We consider the following four types of schemes, which define when and how retraining is performed at each echelon.

#### Regular update

We simultaneously update the models of all echelons every \( L_\text{train} \) time steps, that is, at \( t = nL_\text{train} \) (\( n = 1, 2, \ldots \)).

#### Independent update

At each time step, each echelon *i* checks whether the sample mean of its most recent data, \( \{O_{i-1}(T-1), \ldots, O_{i-1}(T-L_\text{train})\} \), lies in the interval \( [\mu_i - 1.96\sigma_i/\sqrt{L_\text{train}},\ \mu_i + 1.96\sigma_i/\sqrt{L_\text{train}}] \); if not, it updates its forecast model. If the current model predicts demand correctly, the sample mean falls within this range 95% of the time. Note that this scheme therefore produces false positives at a rate of 5% per time step. This criterion is evaluated independently at each echelon.
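The retraining trigger amounts to a two-sided test on the sample mean. A minimal sketch, with the hypothetical function name `needs_retraining`, is:

```python
import math
from statistics import mean


def needs_retraining(recent_orders, mu, sigma, z=1.96):
    """True when the sample mean of recent orders leaves the 95% band around mu."""
    half_width = z * sigma / math.sqrt(len(recent_orders))
    return abs(mean(recent_orders) - mu) > half_width


print(needs_retraining([1.0] * 50, mu=1.0, sigma=0.1))  # mean on target -> False
print(needs_retraining([1.5] * 50, mu=1.0, sigma=0.1))  # mean far off   -> True
```

With \( L_\text{train} = 50 \) and \( \sigma_i = 0.1 \), the half-width of the band is about 0.028, so even a modest shift in mean demand triggers an update.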

#### Shared forecasting model

Only echelon 1 updates its model, following the same rule as in the independent update scheme. When its forecast model is updated, the model is copied to all other echelons. In this way, the forecast model of echelon 1, which has the most accurate information on demand, is shared with the other companies.

### Initial settings and conditions

We performed simulations for \( N = 5 \) echelons, varying \( L_\text{train} \). The safety factor was set to \( c = 1.96 \). As initial conditions, we set \( \mu_0 = \mu_1 = \cdots = \mu_N = 1.0 \), \( \sigma_0 = \sigma_1 = \cdots = \sigma_N = 0.1 \), and \( I_1 = \cdots = I_N = 1.196 \).

### Constant policy

To assess the effectiveness of the retraining schemes, we prepared a baseline model, called the constant policy. In this policy, the forecast of each echelon is based on a Gaussian distribution, \( \mathcal{N}(\bar{\mu}, \sigma) \), that does not change over time. Here, \( \bar{\mu} = 1.25 \) is a fixed parameter representing the long-term average of demand (\( \mu_0 \sim U(1/2, 2) \)), and we examined different values of \( \sigma \) ranging from 0 to 1.2 in separate runs. This policy assumes an ideal situation in which each echelon knows the value of \( \bar{\mu} \).
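The value of \( \bar{\mu} \) follows directly from the mean of the uniform distribution from which \( \mu_0 \) is drawn:

\[ \bar{\mu} = \mathbb{E}[\mu_0] = \frac{1/2 + 2}{2} = 1.25 . \]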