data_loader

Classes

DataSource

Base class for all data sources in the Rambling Realms trading system.

AlpacaDataLoader

TODO: add columns to and from Parquet.

DataLoader

DataLoader class for loading and processing data from various sources.

Module Contents

class data_loader.DataSource(**kwargs)

Base class for all data sources in the Rambling Realms trading system. This class provides a common interface for fetching and processing data from various sources. Subclasses should implement the get_data method to fetch data from their respective sources.

_registry: ClassVar[Dict[trading.cli.alg.config.DataSourceType, Type[DataSource]]]
abstractmethod get_data(fetch_data, request, df, cache_path, start_date, end_date, time_step_unit=TimeFrameUnit.Day, cache_enabled=True, time_step_period=1, **kwargs)
Parameters:
  • fetch_data (bool)

  • request (trading.cli.alg.config.DataRequests)

  • df (pandas.DataFrame)

  • cache_path (str)

  • start_date (str)

  • end_date (str)

  • time_step_unit (alpaca.data.timeframe.TimeFrameUnit)

  • cache_enabled (bool)

  • time_step_period (int)

Return type:

pandas.DataFrame

classmethod __init_subclass__(**kwargs)
classmethod factory(data)

Factory method to create an instance of a DataSource subclass based on the provided data. The data dictionary must contain a ‘source’ key that matches one of the DataSourceType enum values. Raises ValueError if the ‘source’ key is missing or if the type is unknown.

Parameters:
  • data (dict)

Return type:

DataSource
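
The registry and factory described above can be illustrated with a minimal, dependency-free sketch. Everything here is a stand-in, not the module's actual code: `SourceType`, `Source`, `AlpacaSource`, and the `"alpaca"` enum value are hypothetical names, and the rule that `__init_subclass__` registers any subclass that sets `TYPE` is an assumption inferred from the `_registry` and `TYPE` attributes.

```python
from enum import Enum
from typing import ClassVar, Dict, Optional, Type


class SourceType(Enum):
    """Hypothetical stand-in for trading.cli.alg.config.DataSourceType."""

    ALPACA = "alpaca"  # assumed value, for illustration only


class Source:
    """Hypothetical stand-in for DataSource."""

    _registry: ClassVar[Dict[SourceType, Type["Source"]]] = {}
    TYPE: ClassVar[Optional[SourceType]] = None

    def __init__(self, **kwargs):
        self.options = kwargs

    def __init_subclass__(cls, **kwargs):
        # Assumption: a subclass registers itself under its TYPE when defined.
        super().__init_subclass__(**kwargs)
        if cls.TYPE is not None:
            Source._registry[cls.TYPE] = cls

    @classmethod
    def factory(cls, data: dict) -> "Source":
        # A missing 'source' key and an unknown type both raise ValueError.
        if "source" not in data:
            raise ValueError("data must contain a 'source' key")
        try:
            source_type = SourceType(data["source"])
        except ValueError:
            raise ValueError(f"unknown data source type: {data['source']!r}") from None
        subclass = cls._registry.get(source_type)
        if subclass is None:
            raise ValueError(f"no data source registered for {source_type}")
        # Pass the remaining keys to the subclass constructor.
        options = {key: value for key, value in data.items() if key != "source"}
        return subclass(**options)


class AlpacaSource(Source):
    """Hypothetical stand-in for AlpacaDataLoader."""

    TYPE = SourceType.ALPACA


source = Source.factory({"source": "alpaca", "symbol": "AAPL"})
```

With this layout, adding a new source only requires defining a subclass with a `TYPE`; the factory itself does not change.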

class data_loader.AlpacaDataLoader(**kwargs)

Bases: DataSource

TODO: add columns to and from Parquet.

TYPE: ClassVar[trading.cli.alg.config.DataSourceType]
get_data(fetch_data, request, df, cache_path, start_date, end_date, time_step_unit=TimeFrameUnit.Day, cache_enabled=True, time_step_period=1, **kwargs)

Fetches data from Alpaca and caches it locally.

Parameters:
  • fetch_data (bool)

  • request (trading.cli.alg.config.DataRequests)

  • df (pandas.DataFrame)

  • cache_path (str)

  • start_date (str)

  • end_date (str)

  • time_step_unit (alpaca.data.timeframe.TimeFrameUnit)

  • cache_enabled (bool)

  • time_step_period (int)

Return type:

pandas.DataFrame
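
One way the fetch-and-cache behaviour could work is sketched below. The documentation does not say how `fetch_data` and `cache_enabled` interact, so the rule used here is an assumption: read the cache when it is enabled, present, and no refresh is forced, otherwise fetch and write the result back. To stay dependency-free the sketch stores rows as JSON rather than a pandas DataFrame, and `load_with_cache` and `fake_fetch` are hypothetical names that do not exist in the module.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

Rows = List[Dict[str, object]]


def load_with_cache(
    cache_path: str,
    fetch: Callable[[], Rows],
    fetch_data: bool = False,
    cache_enabled: bool = True,
) -> Rows:
    """Return cached rows when allowed, otherwise fetch and refresh the cache."""
    # Assumed rule: fetch_data=True forces a refresh even if a cache file exists.
    if cache_enabled and not fetch_data and os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    rows = fetch()
    if cache_enabled:
        with open(cache_path, "w", encoding="utf-8") as handle:
            json.dump(rows, handle)
    return rows


calls = []


def fake_fetch() -> Rows:
    """Hypothetical stand-in for a remote request; records each call."""
    calls.append(1)
    return [{"timestamp": "2024-01-02", "close": 101.5}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bars.json")
    first = load_with_cache(path, fake_fetch)  # cache miss: fetches and writes
    second = load_with_cache(path, fake_fetch)  # cache hit: no fetch
    third = load_with_cache(path, fake_fetch, fetch_data=True)  # forced refresh
```

In this run `fake_fetch` is called twice: once for the initial miss and once for the forced refresh.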

class data_loader.DataLoader(data_config, feature_config, fetch_data=False, **kwargs)

DataLoader class for loading and processing data from various sources. It initializes with a DataConfig and FeatureConfig, fetches data from the specified sources, and applies the specified features to the data.

TODO: no drop on live

Parameters:
data_config
feature_config
df
columns
features
classmethod data_info(df)

Returns a string representation of the DataFrame.

Parameters:
  • df (pandas.DataFrame)

Return type:

str

get_train_test()

Splits the DataFrame into training and validation sets based on the validation split ratio.
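
A ratio-based split might look like the sketch below. The documentation does not state whether rows are shuffled or where the ratio comes from, so this assumes an ordered split in which the last `validation_split` fraction of rows becomes the validation set. `split_train_validation` is a hypothetical helper, not a method of `DataLoader`, and it works on any sliceable sequence rather than a DataFrame.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    rows: Sequence[T], validation_split: float
) -> Tuple[Sequence[T], Sequence[T]]:
    """Split ordered rows so the trailing fraction becomes the validation set."""
    if not 0.0 <= validation_split <= 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    # Round the validation size down, then cut at the matching index.
    n_validation = int(len(rows) * validation_split)
    split_index = len(rows) - n_validation
    return rows[:split_index], rows[split_index:]


closes = [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]
train, validation = split_train_validation(closes, 0.2)
```

For a pandas DataFrame the same cut can be made with `df.iloc[:split_index]` and `df.iloc[split_index:]`. Keeping the rows in order avoids leaking later observations into the training set, which usually matters for time series.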

to_csv(path)

Saves the DataFrame to a CSV file.

Parameters:
  • path (str)