mizarlabs.transformers.targets package

Submodules

mizarlabs.transformers.targets.labeling module

class mizarlabs.transformers.targets.labeling.BaseLabeling(n_expiration_bars: int)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Base class for labeling.

fit(y)

Fit the model (just for sklearn compatibility).

Parameters
  • x

  • y

Returns

transform(y: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame

Return a dataframe with the target labelled.

Parameters

y

Returns

class mizarlabs.transformers.targets.labeling.TripleBarrierMethodLabeling(num_expiration_bars: int, profit_taking_factor: float, stop_loss_factor: float, metalabeling: bool = False, close_column_name: str = 'close', side_column_name: str = 'side', volatility_window: int = 100, volatility_adjusted_horizontal_barriers: bool = True, expiration_label: bool = False)

Bases: mizarlabs.transformers.targets.labeling.BaseLabeling

Implements the triple barrier method used to label the target.

See page 45 of Advances in Financial Machine Learning by Marcos Lopez de Prado for additional information.

Parameters
  • num_expiration_bars (int) – Maximum number of bars between opening a position and closing it

  • profit_taking_factor (float) – The factor that multiplies the volatility for the creation of the horizontal upper barrier

  • stop_loss_factor (float) – The factor that multiplies the volatility for the creation of the horizontal lower barrier

  • metalabeling (bool, optional) – Whether metalabeling is activated

  • close_column_name (str, optional) – The name of the close column

  • side_column_name (str, optional) – The name of the side column (metalabeling)

  • volatility_window (int, optional) – The number of bars used for the volatility calculation

  • volatility_adjusted_horizontal_barriers (bool, optional) – Whether to scale the horizontal barriers by volatility

  • expiration_label (bool, optional) – Whether a label of 0 is returned to indicate that the expiration (vertical barrier) has been hit
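The way the two factors scale the horizontal barriers can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper `horizontal_barriers` and its column names are hypothetical, and it assumes the barriers are expressed as price levels around the entry price.

```python
import pandas as pd


def horizontal_barriers(
    close: pd.Series,
    volatility: pd.Series,
    profit_taking_factor: float,
    stop_loss_factor: float,
) -> pd.DataFrame:
    """Hypothetical helper: price levels of the upper and lower barriers."""
    # Upper barrier: entry price plus a multiple of the volatility.
    upper = close * (1.0 + profit_taking_factor * volatility)
    # Lower barrier: entry price minus a multiple of the volatility.
    lower = close * (1.0 - stop_loss_factor * volatility)
    return pd.DataFrame({"upper": upper, "lower": lower})


close = pd.Series([100.0, 102.0, 101.0])
volatility = pd.Series([0.01, 0.02, 0.01])
barriers = horizontal_barriers(
    close, volatility, profit_taking_factor=2.0, stop_loss_factor=1.0
)
```

With a volatility of 1% on the first bar, a profit-taking factor of 2 places the upper barrier 2% above the close and a stop-loss factor of 1 places the lower barrier 1% below it.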

fit(X, y=None, **fit_params)

Fit the model (just for sklearn compatibility).

Parameters
  • x

  • y

Returns

mizarlabs.transformers.targets.labeling.get_daily_vol(close: pandas.core.series.Series, ewm_span: int = 100) → pandas.core.series.Series

Estimate the daily volatility.

Parameters
  • close (pd.Series) – Contains the close price

  • ewm_span (int) – The span of the exponentially weighted standard deviation

Returns

The daily volatility for each bar

Return type

pd.Series
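A simplified sketch of this kind of estimate is shown below. It is not the body of `get_daily_vol`: the function name `ewm_volatility` is hypothetical, and the sketch assumes one bar per day, so that bar-to-bar returns are already daily returns.

```python
import numpy as np
import pandas as pd


def ewm_volatility(close: pd.Series, ewm_span: int = 100) -> pd.Series:
    """Hypothetical helper: exponentially weighted std of simple returns."""
    # Bar-to-bar simple returns; the first value is NaN by construction.
    returns = close.pct_change()
    # Exponentially weighted moving standard deviation of those returns.
    return returns.ewm(span=ewm_span).std()


index = pd.date_range("2021-01-01", periods=6, freq="D")
close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 101.0], index=index)
volatility = ewm_volatility(close, ewm_span=3)
```

The result has one value per bar; the earliest bars are NaN because a standard deviation needs at least two returns.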

mizarlabs.transformers.targets.labeling.get_labels(barriers_df: pandas.core.frame.DataFrame, barriers_info_df: pandas.core.frame.DataFrame, close: pandas.core.series.Series, metalabeling: bool, expiration_label: bool = False) → pandas.core.frame.DataFrame

Calculate returns and assign return classes based on the first touched bar.

Case 1 (‘side’ not in barriers_info_df): bin in (-1, 1), labelled by price action.

Case 2 (‘side’ in barriers_info_df): bin in (0, 1), labelled by PnL (meta-labeling).

Parameters
  • barriers_df (pd.DataFrame) – dataframe with datetime when barriers are hit

  • barriers_info_df (pd.DataFrame) – Info for creating the barriers

  • close (pd.Series) – Series of prices.

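  • metalabeling (bool) – Whether or not metalabeling is activated

  • expiration_label (bool, optional) – Whether a label of 0 is returned to indicate that the expiration (vertical barrier) has been hit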

Returns

Dataframe containing event

Return type

pd.DataFrame
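The two labelling cases can be illustrated with a short sketch. The function `label_returns` is hypothetical and works on a plain series of realized returns rather than on the barrier dataframes that `get_labels` takes; it only shows how the label ranges (-1, 1) and (0, 1) arise.

```python
from typing import Optional

import numpy as np
import pandas as pd


def label_returns(returns: pd.Series, side: Optional[pd.Series] = None) -> pd.Series:
    """Hypothetical helper: turn realized returns into class labels."""
    if side is None:
        # Case 1: label by price action, using the sign of the return.
        # (A return of exactly zero would map to 0 in this sketch.)
        return np.sign(returns).astype(int)
    # Case 2 (meta-labeling): label by the PnL of the bet taken on `side`.
    pnl = returns * side
    return (pnl > 0).astype(int)


returns = pd.Series([0.02, -0.01])
price_action_labels = label_returns(returns)
meta_labels = label_returns(returns, side=pd.Series([-1, -1]))
```

In the meta-labeling case a short position (side -1) on a falling price is profitable, so the second observation is labelled 1 even though its return is negative.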

mizarlabs.transformers.targets.labeling.triple_barrier_labeling(close: pandas.core.series.Series, barrier_info_df: pandas.core.frame.DataFrame, profit_taking_factor: float, stop_loss_factor: float) → pandas.core.frame.DataFrame

Calculate the first hit on the stop loss and profit taking barrier.

As described in Advances in Financial Machine Learning, Marcos Lopez de Prado, 2018.

Parameters
  • close (pd.Series) – Series of prices.

  • barrier_info_df (pd.DataFrame) – Info for creating the barriers

  • profit_taking_factor (float) – The factor that multiplies the volatility for the creation of the horizontal upper barrier

  • stop_loss_factor (float) – The factor that multiplies the volatility for the creation of the horizontal lower barrier

Returns

Dataframe containing the first hit for each of the barriers

Return type

pd.DataFrame
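The idea of a first hit can be sketched for a single position as follows. This is an illustration under stated assumptions, not the library's code: `first_barrier_touch` is a hypothetical helper, the barriers are given as return thresholds, and the position is identified by its integer location in the price series.

```python
import pandas as pd


def first_barrier_touch(
    close: pd.Series,
    start: int,
    num_expiration_bars: int,
    upper_return: float,
    lower_return: float,
) -> dict:
    """Hypothetical helper: first time each barrier is touched, or None."""
    # Price path from entry up to and including the vertical barrier.
    path = close.iloc[start : start + num_expiration_bars + 1]
    returns = path / path.iloc[0] - 1.0

    upper_hits = returns.index[returns >= upper_return]
    lower_hits = returns.index[returns <= -lower_return]

    return {
        "profit_taking": upper_hits[0] if len(upper_hits) else None,
        "stop_loss": lower_hits[0] if len(lower_hits) else None,
        "expiration": path.index[-1],
    }


index = pd.date_range("2021-01-01", periods=5, freq="D")
close = pd.Series([100.0, 101.0, 103.0, 99.0, 98.0], index=index)
touches = first_barrier_touch(
    close, start=0, num_expiration_bars=3, upper_return=0.02, lower_return=0.015
)
```

Here the profit-taking barrier is touched on the third bar, the stop-loss barrier is never touched inside the window, and the vertical barrier falls on the fourth bar.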

mizarlabs.transformers.targets.trend_scanning module

Implementation of Trend-Scanning labels described in Advances in Financial Machine Learning: Lecture 3/10

class mizarlabs.transformers.targets.trend_scanning.TrendScannerLabeling(t_events: Optional[Union[numpy.ndarray, list]] = None, look_forward_window: int = 20, min_sample_length: int = 5, step: int = 1)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Trend scanning is both a classification and a regression labeling technique. It can be used in the following ways:

  1. Classification: By taking the sign of the t-value for a given observation, we can set {-1, 1} labels to define the trends as either downward or upward.

  2. Classification: By adding a minimum t-value threshold, you can generate {-1, 0, 1} labels for downward, no-trend, and upward.

  3. The t-values can be used as sample weights in classification problems.

  4. Regression: The t-values can be used in a regression setting to determine the magnitude of the trend.

The output of this algorithm is a DataFrame with t1 (time stamp for the farthest observation), t-value, returns for the trend, and bin.

Parameters
  • t_events (Union[np.ndarray, list, None], optional) – Filtered events, array/list of pd.Timestamps, defaults to None

  • look_forward_window (int, optional) – Maximum look forward window used to get the trend value, defaults to 20

  • min_sample_length (int, optional) – Minimum sample length used to fit regression, defaults to 5

  • step (int, optional) – Optimal t-value index is searched every ‘step’ indices, defaults to 1
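The scan itself can be sketched with plain NumPy. This is a simplified illustration, not the code of `TrendScannerLabeling`: the helpers `slope_t_value` and `scan_trend` are hypothetical, and the sketch assumes the trend is measured by the t-value of the slope in an ordinary least squares fit of the values against time, keeping the window with the largest absolute t-value.

```python
import numpy as np


def slope_t_value(y: np.ndarray) -> float:
    """Hypothetical helper: t-value of the slope of an OLS fit of y on time."""
    n = len(y)
    x = np.arange(n, dtype=float)
    x_centered = x - x.mean()
    slope = (x_centered @ (y - y.mean())) / (x_centered @ x_centered)
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    # Residual variance with n - 2 degrees of freedom (needs n >= 3).
    sigma2 = (residuals @ residuals) / (n - 2)
    standard_error = np.sqrt(sigma2 / (x_centered @ x_centered))
    return slope / standard_error


def scan_trend(y, start, look_forward_window=20, min_sample_length=5, step=1):
    """Hypothetical helper: best t-value and end position over forward windows."""
    best_t, best_end = 0.0, None
    for length in range(min_sample_length, look_forward_window + 1, step):
        end = start + length
        if end > len(y):
            break
        t_value = slope_t_value(y[start:end])
        if abs(t_value) > abs(best_t):
            best_t, best_end = t_value, end - 1
    return best_t, best_end


rng = np.random.default_rng(0)
series = 0.5 * np.arange(40) + rng.normal(scale=0.2, size=40)
t_value, end_position = scan_trend(series, start=0)
label = int(np.sign(t_value))
```

Taking the sign of the selected t-value gives the {-1, 1} label from the first use case; comparing its absolute value against a threshold would give the {-1, 0, 1} variant.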

fit(y)

transform(y: pandas.core.series.Series) → pandas.core.frame.DataFrame

Scans for trends in the provided series and provides a DataFrame with results.

DataFrame contains the start_time, event_end_time, t_value, return and label.

Parameters

y (pd.Series) – series used to label the data set

Returns

DataFrame indexed by start time, with the columns event_end_time, t_value, return and label.

Return type

pd.DataFrame

Module contents