neurobcl.base package#

Submodules#

neurobcl.base.model module#

class neurobcl.base.model.NeuroBucketClassifier(quantile_gap: int, max_depth: int, indexer_hash: dict = {}, filter_features: Dict[str, List] = [], bucket_features: List = [])[source]#

Bases: ABC

The NeuroBucketClassifier class classifies data into buckets based on the given filters and percentiles.

classmethod fromJson(json_str)[source]#

Converts the JSON string to a classifier

Parameters:

json_str (str) – The JSON string to be converted

Returns:

The classifier from the JSON string

Return type:

NeuroBucketClassifier
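
As an illustration, a minimal sketch of restoring a classifier from a previously saved JSON string; the file name classifier.json and the surrounding I/O are assumptions, not part of the library.

from neurobcl.base.model import NeuroBucketClassifier

# Assumed: classifier.json holds a string previously produced by toJson()
with open("classifier.json") as f:
    classifier = NeuroBucketClassifier.fromJson(f.read())
classifier.get("age", 1, '>')  # the restored classifier can be queried as usual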

get(target_key: str, target_key_bucket: int, operator: str = '<', filters: dict = {}, buckets: List[int] = [25, 25, 25, 25], debug: bool = False)[source]#

Get the value for the given target key and target key bucket based on the given filters and buckets

Parameters:
  • target_key (str) – The target key for which the value is to be found

  • target_key_bucket (int) – The target key bucket for which the value is to be found

  • operator (str, optional) – The operator to be used, defaults to ‘<’

  • filters (dict, optional) – The filters to be applied, defaults to {}

  • buckets (List[int], optional) – The buckets to be used, defaults to [25, 25, 25, 25]

  • debug (bool, optional) – Whether to output verbose debug information, defaults to False

Returns:

The bucket boundary value on the side selected by the operator (upper bound for ‘<’, lower bound for ‘>’)

Return type:

int

Raises:
  • ValueError – If the feature is not found in bucket features

  • ValueError – If the bucket is out of range

  • ValueError – If the sum of all buckets is not 100

  • ValueError – If no percentile list is found for the given filters

classifier = trainer.index()
# Minimum age that should be present in the first bucket (i.e. 0-25% of the data)
classifier.get("age", 1, '>', buckets=[25, 25, 25, 25])
# Maximum age that should be present in the fourth bucket (i.e. 75-100% of the data) for the country "India"
classifier.get("age", 4, '<', buckets=[25, 25, 25, 25], filters={"country": "India"})
Note:

The operator can be either ‘<’ or ‘>’. If ‘<’, the upper bound of the bucket is returned; if ‘>’, the lower bound is returned.
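
Bucket sizes need not be equal as long as they sum to 100. The sketch below reuses the classifier from the example above and is illustrative only: it queries the upper bound of the top decile, then shows the ValueError raised when the sizes do not sum to 100.

# Upper bound of the top 10% of ages (sizes sum to 100)
classifier.get("age", 4, '<', buckets=[10, 40, 40, 10])

# Sizes summing to anything other than 100 raise ValueError
try:
    classifier.get("age", 1, '<', buckets=[30, 30, 30])
except ValueError as exc:
    print(exc)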

toJson()[source]#

Converts the classifier to a JSON string

Returns:

The JSON string representation of the classifier

Return type:

str
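
A minimal persistence sketch, assuming a trained classifier obtained from a trainer's index() call; writing to classifier.json is an illustrative choice and pairs with the fromJson() example above.

json_str = classifier.toJson()
with open("classifier.json", "w") as f:
    f.write(json_str)  # reload later with NeuroBucketClassifier.fromJson()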

class neurobcl.base.model.NeuroBucketTrainer(quantile_gap: int = 10, max_depth: int = 2)[source]#

Bases: ABC

The NeuroBucketTrainer class trains the classifier from the given data and features. It is an abstract template: subclasses implement the data-access methods below (a minimal subclass sketch follows the method reference).

depth_features_index(bucket_feat_name: str, depth: int, current_filters={})[source]#

Recursively index the features based on the given depth and current filters

Parameters:
  • bucket_feat_name (str) – The bucket feature name

  • depth (int) – The depth

  • current_filters (dict, optional) – The current filters, defaults to {}

abstract get_at(target_feature: str, rank: int, filters: dict = {})[source]#

Get the value at the given rank for the given target feature and filters

Parameters:
  • target_feature (str) – The target feature for which the value is to be found

  • rank (int) – The rank at which the value is to be found

  • filters (dict) – The filters to be applied

Returns:

The value at the given rank for the given target feature and filters

Return type:

int

abstract get_bucket_features()[source]#

Get the bucket features

Returns:

The bucket features

Return type:

List

abstract get_non_bucket_features()[source]#

Get the non-bucket features

Returns:

The non-bucket features

Return type:

dict

index()[source]#

Index the classifier based on the given data and features

Returns:

The classifier model

Return type:

NeuroBucketClassifier

Note

Indexing the data can take a while.

abstract total_items(filters: dict = {})[source]#

Get the total number of items matching the given filters

Parameters:

filters (dict) – The filters to be applied

Returns:

The total number of items matching the given filters

Return type:

int
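
To show how the abstract methods fit together, below is a minimal in-memory trainer sketch. The backing list of dicts, the 1-based ascending rank convention in get_at(), and the assumption that get_non_bucket_features() maps each filter feature to its list of possible values are illustrative guesses; a real implementation would typically delegate these lookups to a database or search index.

from neurobcl.base.model import NeuroBucketTrainer

class InMemoryTrainer(NeuroBucketTrainer):
    """Toy trainer backed by a plain list of dicts (illustrative only)."""

    def __init__(self, docs, quantile_gap=10, max_depth=2):
        super().__init__(quantile_gap=quantile_gap, max_depth=max_depth)
        self.docs = docs

    def get_bucket_features(self):
        return ["age"]  # numeric features to be bucketed

    def get_non_bucket_features(self):
        # Assumed shape: filter feature name -> list of values it can take
        return {"country": sorted({d["country"] for d in self.docs})}

    def _matching(self, filters):
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in filters.items())]

    def total_items(self, filters={}):
        return len(self._matching(filters))

    def get_at(self, target_feature, rank, filters={}):
        # Assumed 1-based rank over ascending values of the matching items
        values = sorted(d[target_feature] for d in self._matching(filters))
        return values[rank - 1]

docs = [{"age": 18 + i, "country": "India" if i % 2 else "Japan"}
        for i in range(60)]
trainer = InMemoryTrainer(docs)
classifier = trainer.index()          # may take a while on larger datasets
print(classifier.get("age", 1, '<'))  # upper bound of the first quartile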

neurobcl.base.model.order_invariant(feature: str, filters: dict = {})[source]#

Converts the filters to a string and joins it with the feature name to produce an order-invariant key (so the same entry can be found in the hash table regardless of filter order)

Parameters:
  • feature (str) – The feature name

  • filters (dict) – The filters to be applied

Returns:

A string representation of the feature and filters

Return type:

str
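
The exact string format is not specified here; the only documented guarantee is that the key does not depend on filter ordering. A quick illustrative check (the filter names and values are arbitrary):

from neurobcl.base.model import order_invariant

key_a = order_invariant("age", {"country": "India", "gender": "F"})
key_b = order_invariant("age", {"gender": "F", "country": "India"})
assert key_a == key_b  # same key regardless of filter insertion order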

Module contents#