neurobcl.base package#

Submodules#

neurobcl.base.model module#

class neurobcl.base.model.NeuroBucketClassifier(quantile_gap: int, max_depth: int, indexer_hash: dict = {}, filter_features: Dict[str, List] = [], bucket_features: List = [])[source]#

Bases: ABC

The NeuroBucketClassifier class classifies data into buckets based on the given filters and percentiles.

classmethod fromJson(json_str)[source]#

Converts the JSON string to a classifier

Parameters:

json_str (str) – The JSON string to be converted

Returns:

The classifier from the JSON string

Return type:

NeuroBucketClassifier
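
As an illustration, a minimal sketch of restoring a classifier from a previously saved JSON string; the file name classifier.json and the surrounding I/O are assumptions, not part of the library.

from neurobcl.base.model import NeuroBucketClassifier

# Assumed: classifier.json holds a string previously produced by toJson()
with open("classifier.json") as f:
    classifier = NeuroBucketClassifier.fromJson(f.read())
classifier.get("age", 1, '>')  # the restored classifier can be queried as usual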

get(target_key: str, target_key_bucket: int, operator: str = '<', filters: dict = {}, buckets: List[int] = [25, 25, 25, 25], debug: bool = False)[source]#

Get the value for the given target key and target key bucket based on the given filters and buckets

Parameters:
  • target_key (str) – The target key for which the value is to be found

  • target_key_bucket (int) – The target key bucket for which the value is to be found

  • operator (str, optional) – The operator to be used, defaults to ‘<’

  • filters (dict, optional) – The filters to be applied, defaults to {}

  • buckets (List[int], optional) – The buckets to be used, defaults to [25, 25, 25, 25]

  • debug (bool, optional) – Whether to output verbose debug information, defaults to False

Returns:

The bucket boundary value on the side selected by the operator (upper bound for ‘<’, lower bound for ‘>’)

Return type:

int

Raises:
  • ValueError – If the feature is not found in bucket features

  • ValueError – If the bucket is out of range

  • ValueError – If the sum of all buckets is not 100

  • ValueError – If no percentile list is found for the given filters

classifier = trainer.index()
# Minimum age that should be present in the first bucket (i.e. 0-25% of the data)
classifier.get("age", 1, '>', buckets=[25, 25, 25, 25])
# Maximum age that should be present in the fourth bucket (i.e. 75-100% of the data) for the country "India"
classifier.get("age", 4, '<', buckets=[25, 25, 25, 25], filters={"country": "India"})
Note:

The operator can be either ‘<’ or ‘>’. If ‘<’, the upper bound of the bucket is returned; if ‘>’, the lower bound is returned.
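
Bucket sizes need not be equal as long as they sum to 100. The sketch below reuses the classifier from the example above and is illustrative only: it queries the upper bound of the top decile, then shows the ValueError raised when the sizes do not sum to 100.

# Upper bound of the top 10% of ages (sizes sum to 100)
classifier.get("age", 4, '<', buckets=[10, 40, 40, 10])

# Sizes summing to anything other than 100 raise ValueError
try:
    classifier.get("age", 1, '<', buckets=[30, 30, 30])
except ValueError as exc:
    print(exc)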

toJson()[source]#

Converts the classifier to a JSON string

Returns:

The JSON string representation of the classifier

Return type:

str
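
A minimal persistence sketch, assuming a trained classifier obtained from a trainer's index() call; writing to classifier.json is an illustrative choice and pairs with the fromJson() example above.

json_str = classifier.toJson()
with open("classifier.json", "w") as f:
    f.write(json_str)  # reload later with NeuroBucketClassifier.fromJson()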

class neurobcl.base.model.NeuroBucketTrainer(quantile_gap: int = 10, max_depth: int = 2)[source]#

Bases: ABC

The NeuroBucketTrainer class trains the classifier from the given data and features. It is an abstract template: subclasses implement the data-access methods below (a minimal subclass sketch follows the method reference).

depth_features_index(bucket_feat_name: str, depth: int, current_filters={})[source]#

Recursively index the features based on the given depth and current filters

Parameters:
  • bucket_feat_name (str) – The bucket feature name

  • depth (int) – The depth

  • current_filters (dict, optional) – The current filters, defaults to {}

abstract get_at(target_feature: str, rank: int, filters: dict = {})[source]#

Get the value at the given rank for the given target feature and filters

Parameters:
  • target_feature (str) – The target feature for which the value is to be found

  • rank (int) – The rank at which the value is to be found

  • filters (dict) – The filters to be applied

Returns:

The value at the given rank for the given target feature and filters

Return type:

int

abstract get_bucket_features()[source]#

Get the bucket features

Returns:

The bucket features

Return type:

List

abstract get_non_bucket_features()[source]#

Get the non-bucket features

Returns:

The non-bucket features

Return type:

dict

index()[source]#

Index the classifier based on the given data and features

Returns:

The classifier model

Return type:

NeuroBucketClassifier

Note

Indexing the data can take a while.

abstract total_items(filters: dict = {})[source]#

Get the total number of items matching the given filters

Parameters:

filters (dict) – The filters to be applied

Returns:

The total number of items matching the given filters

Return type:

int
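
To show how the abstract methods fit together, below is a minimal in-memory trainer sketch. The backing list of dicts, the 1-based ascending rank convention in get_at(), and the assumption that get_non_bucket_features() maps each filter feature to its list of possible values are illustrative guesses; a real implementation would typically delegate these lookups to a database or search index.

from neurobcl.base.model import NeuroBucketTrainer

class InMemoryTrainer(NeuroBucketTrainer):
    """Toy trainer backed by a plain list of dicts (illustrative only)."""

    def __init__(self, docs, quantile_gap=10, max_depth=2):
        super().__init__(quantile_gap=quantile_gap, max_depth=max_depth)
        self.docs = docs

    def get_bucket_features(self):
        return ["age"]  # numeric features to be bucketed

    def get_non_bucket_features(self):
        # Assumed shape: filter feature name -> list of values it can take
        return {"country": sorted({d["country"] for d in self.docs})}

    def _matching(self, filters):
        return [d for d in self.docs
                if all(d.get(k) == v for k, v in filters.items())]

    def total_items(self, filters={}):
        return len(self._matching(filters))

    def get_at(self, target_feature, rank, filters={}):
        # Assumed 1-based rank over ascending values of the matching items
        values = sorted(d[target_feature] for d in self._matching(filters))
        return values[rank - 1]

docs = [{"age": 18 + i, "country": "India" if i % 2 else "Japan"}
        for i in range(60)]
trainer = InMemoryTrainer(docs)
classifier = trainer.index()          # may take a while on larger datasets
print(classifier.get("age", 1, '<'))  # upper bound of the first quartile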

neurobcl.base.model.order_invariant(feature: str, filters: dict = {})[source]#

Converts the filters to a string and joins it with the feature name to produce an order-invariant key (so the same entry can be found in the hash table regardless of filter order)

Parameters:
  • feature (str) – The feature name

  • filters (dict) – The filters to be applied

Returns:

A string representation of the feature and filters

Return type:

str
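
The exact string format is not specified here; the only documented guarantee is that the key does not depend on filter ordering. A quick illustrative check (the filter names and values are arbitrary):

from neurobcl.base.model import order_invariant

key_a = order_invariant("age", {"country": "India", "gender": "F"})
key_b = order_invariant("age", {"gender": "F", "country": "India"})
assert key_a == key_b  # same key regardless of filter insertion order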

Module contents#