[1] Miguel F Anjos, Russell CH Cheng, and Christine SM Currie, Maximizing revenue in the airline industry under one-way pricing, Journal of the Operational Research Society 55 (2004), no. 5, 535–541.
[2] Miguel F Anjos, Russell CH Cheng, and Christine SM Currie, Optimal pricing policies for perishable products, European Journal of Operational Research 166 (2005), no. 1, 246–254.
[3] Tal Avinadav and Teijo Arponen, An EOQ model for items with a fixed shelf-life and a declining demand rate based on time-to-expiry: technical note, Asia-Pacific Journal of Operational Research 26 (2009), no. 6, 759–767.
[4] Tal Avinadav, Avi Herbon, and Uriel Spiegel, Optimal ordering and pricing policy for demand functions that are separable into price and inventory age, International Journal of Production Economics 155 (2014), 406–417.
[5] Yossi Aviv and Amit Pazgal, A partially observed Markov decision process for dynamic pricing, Management Science 51 (2005), no. 9, 1400–1416.
[6] Seyed Mohammad Esmaeil Pour Mohammad Azizi and Abdolsadeh Neisy, A new approach in geometric Brownian motion model, Fuzzy Information and Engineering and Decision (Cham) (Bing-Yuan Cao, ed.), Springer International Publishing, 2018, pp. 336–342.
[7] Seyed Mohammad Esmaeil Pourmohammad Azizi and Abdolsadeh Neisy, Mathematic modelling and optimization of bank asset and liability by using fractional goal programing approach, International Journal of Modeling and Optimization 7 (2017), no. 2, 85.
[8] Alexandre X Carvalho and Martin L Puterman, Dynamic pricing and reinforcement learning, Proceedings of the International Joint Conference on Neural Networks, 2003, vol. 4, IEEE, 2003, pp. 2916–2921.
[9] Yan Cheng, Real time demand learning-based Q-learning approach for dynamic pricing in e-retailing setting, 2009 International Symposium on Information Engineering and Electronic Commerce, IEEE, 2009, pp. 594–598.
[10] Richard P Covert and George C Philip, An EOQ model for items with Weibull distribution deterioration, AIIE Transactions 5 (1973), no. 4, 323–326.
[11] Guillermo Gallego and Garrett Van Ryzin, Optimal dynamic pricing of inventories with stochastic demand over finite horizons, Management Science 40 (1994), no. 8, 999–1020.
[12] PM Ghare, A model for an exponentially decaying inventory, J. Ind. Engng 14 (1963), 238–243.
[13] Abhijit Gosavi, Naveen Bandla, and Tapas K Das, A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking, IIE Transactions 34 (2002), no. 9, 729–742.
[14] Chengzhi Jiang and Zhaohan Sheng, Case-based reinforcement learning for dynamic inventory control in a multi agent supply-chain system, Expert Systems with Applications 36 (2009), no. 3, 6520–6526.
[15] Ahmet Kara and Ibrahim Dogan, Reinforcement learning approaches for specifying ordering policies of perishable inventory systems, Expert Systems with Applications 91 (2018), 150–158.
[16] Kyle Y Lin, Dynamic pricing with real-time demand learning, European Journal of Operational Research 174 (2006), no. 1, 522–538.
[17] Amir Hossein Nafei, Seyed Mohammad Esmaeil Pourmohammad Azizi, and Rajab Ali Ghasempour, An approach in solving data envelopment analysis with stochastic data, International Workshop on Mathematics and Decision Science, Springer, 2016, pp. 154–162.
[18] Amir Hossein Nafei, Wenjun Yuan, and Hadi Nasseri, Group multi-attribute decision making based on interval neutrosophic sets, Infinite Study, 2019.
[19] Jing Peng and Ronald J Williams, Incremental multi-step Q-learning, Machine Learning Proceedings 1994, Elsevier, 1994, pp. 226–232.
[20] CVL Raju, Y Narahari, and K Ravikumar, Learning dynamic prices in electronic retail markets with customer segmentation, Annals of Operations Research 143 (2006), no. 1, 59–75.
[21] Rupal Rana and Fernando S Oliveira, Real-time dynamic pricing in a non-stationary environment using model-free reinforcement learning, Omega 47 (2014), no. C, 116–126.
[22] Rupal Rana and Fernando S Oliveira, Dynamic pricing policies for interdependent perishable products or services using reinforcement learning, Expert Systems with Applications 42 (2015), no. 1, 426–436.
[23] Gerald Tesauro and Jeffrey O Kephart, Pricing in agent economies using multi-agent Q-learning, Autonomous Agents and Multi-Agent Systems 5 (2002), no. 3, 289–304.
[24] Chih-Te Yang, Liang-Yuh Ouyang, and Hsing-Han Wu, Retailer's optimal pricing and ordering policies for non-instantaneous deteriorating items with price dependent demand and partial backlogging, Mathematical Problems in Engineering 2009 (2009).
[25] Yajun Zhang and Zheng Wang, Integrated ordering and pricing policy for perishable products with inventory inaccuracy, 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE) (2018), 1230–1236.
[26] Wen Zhao and Yu-Sheng Zheng, Optimal dynamic pricing for perishable assets with nonhomogeneous demand, Management Science 46 (2000), no. 3, 375–388.