q-fin.MF

2 posts

arXiv:2501.06701v1 Announce Type: cross Abstract: This paper investigates the investment problem of constructing an optimal no-short sequential portfolio strategy in a market with a latent dependence structure between asset prices and partly unobservable side information, which is often high-dimensional. The results demonstrate that a dynamic strategy, which forms a portfolio based on perfect knowledge of the dependence structure and full market information over time, may not grow at a higher rate infinitely often than a constant strategy, which remains invariant over time. Specifically, if the market is stationary, implying that the dependence structure is statistically stable, the growth rate of an optimal dynamic strategy, utilizing the maximum capacity of the entire market information, almost surely decays over time into an equilibrium state, asymptotically converging to the growth rate of a constant strategy. Technically, this work reassesses the common belief that a constant strategy only attains the optimal limiting growth rate of dynamic strategies when the market process is identically and independently distributed. By analyzing the dynamic log-optimal portfolio strategy as the optimal benchmark in a stationary market with side information, we show that a random optimal constant strategy almost surely exists, even when a limiting growth rate for the dynamic strategy does not. Consequently, two approaches to learning algorithms for portfolio construction are discussed, demonstrating the safety of removing side information from the learning process while still guaranteeing an asymptotic growth rate comparable to that of the optimal dynamic strategy.

Duy Khanh Lam1/14/2025

arxiv

cs.LG math.OC q-fin.MF

Reinforcement Learning for Jump-Diffusions, with Financial Applications

arXiv:2405.16449v3 Announce Type: replace Abstract: We study continuous-time reinforcement learning (RL) for stochastic control in which system dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized exploratory control problem with stochastic policies to capture the exploration--exploitation balance essential for RL. Unlike the pure diffusion case initially studied by Wang et al. (2020), the derivation of the exploratory dynamics under jump-diffusions calls for a careful formulation of the jump part. Through a theoretical analysis, we find that one can simply use the same policy evaluation and $q$-learning algorithms in Jia and Zhou (2022a, 2023), originally developed for controlled diffusions, without needing to check a priori whether the underlying data come from a pure diffusion or a jump-diffusion. However, we show that the presence of jumps ought to affect parameterizations of actors and critics in general. We investigate as an application the mean--variance portfolio selection problem with stock price modelled as a jump-diffusion, and show that both RL algorithms and parameterizations are invariant with respect to jumps. Finally, we present a detailed study on applying the general theory to option hedging.

Xuefeng Gao, Lingfei Li, Xun Yu Zhou1/8/2025