fast_training_env
=================

.. py:module:: fast_training_env

Attributes
----------

.. autoapisummary::

   fast_training_env.EPSILON
   fast_training_env.PCT_TO_REWARD_SCALE

Classes
-------

.. autoapisummary::

   fast_training_env.FastTrainingEnv

Module Contents
---------------

.. py:data:: EPSILON
   :value: 1e-08

.. py:data:: PCT_TO_REWARD_SCALE
   :value: 100.0

.. py:class:: FastTrainingEnv(data, cfg, features, time_step = (TimeFrameUnit.Day, 1))

   Bases: :py:obj:`trading.src.alg.environments.base_environment.BaseTradingEnv`

   Fast training environment with minimal state tracking.

   Optimized for speed with constant-time operations.
   Does NOT maintain position history, trade constraints, or complex metrics.
   Target: 10,000 iterations per second.

   Anti-memorization features:

   - Symbol shuffling at each reset to prevent learning position-specific patterns
   - hmax constraint to limit concentration in any single stock

   .. py:attribute:: initial_cash

   .. py:attribute:: cash

   .. py:attribute:: holdings

   .. py:attribute:: _symbol_permutation

   .. py:attribute:: _inverse_permutation

   .. py:attribute:: _hmax

   .. py:method:: _precompute_price_arrays()

      Pre-compute price arrays and feature matrices for fast lookups.

   .. py:method:: _get_observation(i = -1)

      Get observation with minimal computation using pre-computed matrices.

      Returns: [cash, holdings, current_prices, indicators]

      Note: Holdings and prices are returned in SHUFFLED order matching
      the current symbol permutation, so the model sees a consistent view.

   .. py:method:: _get_shuffled_features(start_idx, end_idx)

      Get features for the lookback window, shuffled to match symbol permutation.

      Maintains speed by using pre-computed indices.

   .. py:method:: reset(*, seed = None, options = None)

      Reset to initial state with symbol shuffling.

      Symbol shuffling prevents the model from memorizing that a specific
      action index corresponds to the best-performing stock. Each episode,
      the mapping between action indices and actual stocks is randomized.

   .. py:method:: step(action)

      Fast step with minimal state updates.

      Reward based on immediate portfolio value change.
      Actions are mapped through the symbol permutation to actual stock indices.
      hmax constraint limits maximum shares traded per stock per step.
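
The symbol shuffling and ``hmax`` limit described for ``reset`` and ``step`` can be illustrated with a minimal NumPy sketch. It is not the module's implementation. The helper names (``make_permutation``, ``to_shuffled``, ``map_action_to_real``, ``pct_reward``), the convention that actions lie in [-1, 1] and are scaled by ``hmax``, and the percentage-change reward built from ``EPSILON`` and ``PCT_TO_REWARD_SCALE`` are all assumptions made for illustration.

```python
import numpy as np

# Values copied from the module-level constants documented above.
EPSILON = 1e-08
PCT_TO_REWARD_SCALE = 100.0


def make_permutation(n_symbols, rng):
    """Draw a random symbol permutation and its inverse (hypothetical helper).

    Slot k of the shuffled view shows real symbol perm[k];
    inverse[j] is the shuffled slot that holds real symbol j.
    """
    perm = rng.permutation(n_symbols)
    inverse = np.argsort(perm)  # satisfies inverse[perm[k]] == k
    return perm, inverse


def to_shuffled(values, perm):
    """Reorder per-symbol values (holdings, prices) into the shuffled view."""
    return values[perm]


def map_action_to_real(action, inverse, hmax):
    """Map a shuffled-order action to real symbol order, capped at hmax.

    Assumption: each action component is a fraction of hmax, so the number
    of shares traded per symbol per step never exceeds hmax in magnitude.
    """
    shares = np.clip(np.rint(action * hmax), -hmax, hmax)
    return shares[inverse]


def pct_reward(old_value, new_value):
    """Assumed reward: percentage change in portfolio value over one step."""
    return (new_value - old_value) / (old_value + EPSILON) * PCT_TO_REWARD_SCALE


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    perm, inverse = make_permutation(4, rng)
    prices = np.array([10.0, 20.0, 30.0, 40.0])
    print("shuffled prices:", to_shuffled(prices, perm))
    print("real-order shares:", map_action_to_real(
        np.array([1.0, -0.25, 0.0, 2.0]), inverse, hmax=100))
```

Under these assumptions both directions of the mapping are a single vectorized indexing operation, so drawing a fresh permutation at each reset adds very little per-step work.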