Adding context to your dynamic pricing problem can increase opportunities as well as challenges
In my previous article, I conducted a thorough analysis of the most popular strategies for tackling the dynamic pricing problem using simple Multi-armed Bandits. If you’ve come here from that piece, firstly, thank you. It’s by no means an easy read, and I truly appreciate your enthusiasm for the subject. Secondly, get ready, as this new article promises to be even more demanding. However, if this is your introduction to the topic, I strongly advise starting with the previous article. There, I present foundational concepts, which I’ll assume readers are familiar with in this discussion.
Anyway, a brief recap: the prior analysis aimed to simulate a dynamic pricing scenario. The main goal was to assess various price points as quickly as possible to find the one yielding the highest cumulative reward. We explored four distinct algorithms: greedy, ε-greedy, Thompson Sampling, and UCB1, detailing the strengths and weaknesses of each. Although the methodology employed in that article is theoretically sound, it bears oversimplifications that don’t hold up in more complex, real-world situations. The most problematic of these simplifications is the assumption that the underlying process is stationary, meaning that the optimal price remains constant regardless of the external environment. This is clearly not the case. Consider, for example, fluctuations in demand during holiday seasons, sudden shifts in competitor pricing, or changes in raw material costs.
To solve this issue, Contextual Bandits come into play. Contextual Bandits are an extension of the Multi-armed Bandit problem in which the decision-making agent not only receives a reward for each action (or “arm”) but also has access to context, or environment-related information, before choosing an arm. The context can be any piece of information that might influence the outcome, such as customer demographics or external market conditions.
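To make the idea concrete, here is a minimal sketch of the observe-context, choose-arm, receive-reward loop. It is not the method developed in this article: it simply runs ε-greedy with a separate reward estimate for each discrete context, which is about the simplest contextual strategy one could write. The prices, the two context labels, the purchase probabilities, and the function names are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical setup (not from the article): three candidate prices (the arms)
# and two discrete contexts the agent can observe before pricing.
PRICES = [10, 15, 20]
CONTEXTS = ["regular", "holiday"]

# Assumed purchase probabilities per (context, price). The best price differs
# by context, which a context-free bandit cannot represent.
PURCHASE_PROB = {
    "regular": {10: 0.60, 15: 0.20, 20: 0.05},
    "holiday": {10: 0.90, 15: 0.80, 20: 0.75},
}


def simulate_sale(context, price, rng):
    """Revenue of one interaction: the price if the customer buys, else 0."""
    return price if rng.random() < PURCHASE_PROB[context][price] else 0


def run_contextual_epsilon_greedy(n_rounds=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy with one running mean reward per (context, price) pair."""
    rng = random.Random(seed)
    counts = {c: {p: 0 for p in PRICES} for c in CONTEXTS}
    means = {c: {p: 0.0 for p in PRICES} for c in CONTEXTS}

    for _ in range(n_rounds):
        # 1. Observe the context before acting.
        context = rng.choice(CONTEXTS)

        # 2. Choose a price: explore with probability epsilon, otherwise
        #    exploit the best current estimate for this context.
        if rng.random() < epsilon:
            price = rng.choice(PRICES)
        else:
            price = max(PRICES, key=lambda p: means[context][p])

        # 3. Observe the reward and update the running mean incrementally.
        reward = simulate_sale(context, price, rng)
        counts[context][price] += 1
        n = counts[context][price]
        means[context][price] += (reward - means[context][price]) / n

    return means


means = run_contextual_epsilon_greedy()
best = {c: max(PRICES, key=lambda p: means[c][p]) for c in CONTEXTS}
print(best)
```

Under these assumed probabilities, the expected revenue is highest at the lowest price in the "regular" context and at the highest price in the "holiday" context, so the agent should end up recommending a different price for each. Keeping one table per context only works when contexts are few and discrete; with continuous or high-dimensional context, a model that maps context to expected reward would be needed instead.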
Here’s how they work: before deciding which arm to pull (or, in our case, which price to set), the agent observes the current…