ads.data_labeling package
Subpackages
- ads.data_labeling.interface package
- ads.data_labeling.loader package
- ads.data_labeling.mixin package
- ads.data_labeling.parser package
- ads.data_labeling.reader package
- Submodules
- ads.data_labeling.reader.dataset_reader module
- ads.data_labeling.reader.dls_record_reader module
- ads.data_labeling.reader.export_record_reader module
- ads.data_labeling.reader.jsonl_reader module
- ads.data_labeling.reader.metadata_reader module
- ads.data_labeling.reader.record_reader module
- Module contents
- ads.data_labeling.visualizer package
Submodules
ads.data_labeling.boundingbox module
- class ads.data_labeling.boundingbox.BoundingBoxItem(top_left: Tuple[float, float], bottom_left: Tuple[float, float], bottom_right: Tuple[float, float], top_right: Tuple[float, float], labels: List[str] = <factory>)
Bases:
object
BoundingBoxItem class representing bounding box label.
Examples
>>> item = BoundingBoxItem(
...     labels=['cat', 'dog'],
...     bottom_left=(0.2, 0.4),
...     top_left=(0.2, 0.2),
...     top_right=(0.8, 0.2),
...     bottom_right=(0.8, 0.4))
>>> item.to_yolo(categories=['cat', 'dog', 'horse'])
- classmethod from_yolo(bbox: List[Tuple], categories: List[str] | None = None) BoundingBoxItem
Converts YOLO-formatted annotations to a BoundingBoxItem.
- Parameters:
bbox (List[Tuple]) – The list of bounding box annotations in YOLO format. Example: [(0, 0.511560675, 0.50234826, 0.47013485, 0.57803468)]
categories (List[str], optional) – The list of object categories in proper order for model training. Example: ['cat', 'dog', 'horse']
- Returns:
The BoundingBoxItem.
- Return type:
BoundingBoxItem
- Raises:
TypeError – When the categories list has a wrong format.
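The conversion described above can be sketched without the ADS package. This is a minimal illustration, assuming the usual YOLO layout of (class index, x center, y center, width, height) with coordinates normalized to the image size. The function name yolo_to_corners and the returned dictionary are hypothetical and are not part of the ADS API; the library's own implementation may differ.

```python
from typing import Dict, List, Optional, Tuple


def yolo_to_corners(
    bbox: Tuple[int, float, float, float, float],
    categories: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Expand one YOLO tuple into four corner points plus a label.

    Hypothetical helper for illustration only, not the ADS implementation.
    """
    class_index, x_center, y_center, width, height = bbox
    left, right = x_center - width / 2, x_center + width / 2
    top, bottom = y_center - height / 2, y_center + height / 2
    # Assumption: without a category list, the class index is kept as the label.
    label = categories[class_index] if categories else str(class_index)
    return {
        "top_left": (left, top),
        "bottom_left": (left, bottom),
        "bottom_right": (right, bottom),
        "top_right": (right, top),
        "labels": [label],
    }


corners = yolo_to_corners(
    (0, 0.5, 0.3, 0.6, 0.2), categories=["cat", "dog", "horse"]
)
```

The image origin is taken to be the top-left corner, so "top" has the smaller y value.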
- to_yolo(categories: List[str]) List[Tuple[int, float, float, float, float]]
Converts BoundingBoxItem to the YOLO format.
- Parameters:
categories (List[str]) – The list of object categories in proper order for model training. Example: ['cat', 'dog', 'horse']
- Returns:
The list of YOLO formatted bounding boxes.
- Return type:
List[Tuple[int, float, float, float, float]]
- Raises:
ValueError – When the categories list is not provided, or when it does not match the labels.
TypeError – When the categories list has a wrong format.
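As a rough sketch of what a corner-to-YOLO conversion involves, the snippet below derives the center, width, and height from two opposite corners and emits one tuple per label. It assumes normalized coordinates and one output row per label; the function name corners_to_yolo is hypothetical, and the error checks only mirror the documented Raises section rather than the library's actual code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def corners_to_yolo(
    top_left: Point,
    bottom_right: Point,
    labels: List[str],
    categories: List[str],
) -> List[Tuple[int, float, float, float, float]]:
    """Hypothetical helper for illustration only, not the ADS implementation."""
    if not categories:
        raise ValueError("The categories list must be provided.")
    if not isinstance(categories, list) or not all(
        isinstance(c, str) for c in categories
    ):
        raise TypeError("The categories must be a list of strings.")
    unknown = [label for label in labels if label not in categories]
    if unknown:
        raise ValueError(f"Labels missing from categories: {unknown}")

    x_center = (top_left[0] + bottom_right[0]) / 2
    y_center = (top_left[1] + bottom_right[1]) / 2
    width = bottom_right[0] - top_left[0]
    height = bottom_right[1] - top_left[1]
    # Assumption: one YOLO row per label, indexed by position in categories.
    return [
        (categories.index(label), x_center, y_center, width, height)
        for label in labels
    ]


rows = corners_to_yolo(
    (0.2, 0.2), (0.8, 0.4), ["cat", "dog"], ["cat", "dog", "horse"]
)
```

The class index depends on the position of each label in categories, which is why the documentation asks for the list "in proper order for model training".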
- class ads.data_labeling.boundingbox.BoundingBoxItems(items: List[BoundingBoxItem] = <factory>)
Bases:
object
BoundingBoxItems class which consists of a list of BoundingBoxItem.
- items
List of BoundingBoxItem.
- Type:
List[BoundingBoxItem]
Examples
>>> item = BoundingBoxItem(
...     labels=['cat', 'dog'],
...     bottom_left=(0.2, 0.4),
...     top_left=(0.2, 0.2),
...     top_right=(0.8, 0.2),
...     bottom_right=(0.8, 0.4))
>>> items = BoundingBoxItems(items=[item])
>>> items.to_yolo(categories=['cat', 'dog', 'horse'])
- items: List[BoundingBoxItem]
- to_yolo(categories: List[str]) List[Tuple[int, float, float, float, float]]
Converts BoundingBoxItems to the YOLO format.
- Parameters:
categories (List[str]) – The list of object categories in proper order for model training. Example: ['cat', 'dog', 'horse']
- Returns:
The list of YOLO formatted bounding boxes.
- Return type:
List[Tuple[int, float, float, float, float]]
- Raises:
ValueError – When the categories list is not provided, or when it does not match the labels.
TypeError – When the categories list has a wrong format.
ads.data_labeling.constants module
- class ads.data_labeling.constants.AnnotationType
Bases:
object
AnnotationType class which contains all the annotation types that the data labeling service supports.
- BOUNDING_BOX = 'BOUNDING_BOX'
- ENTITY_EXTRACTION = 'ENTITY_EXTRACTION'
- MULTI_LABEL = 'MULTI_LABEL'
- SINGLE_LABEL = 'SINGLE_LABEL'
ads.data_labeling.data_labeling_service module
- class ads.data_labeling.data_labeling_service.DataLabeling(compartment_id: str | None = None, dls_cp_client_auth: dict | None = None, dls_dp_client_auth: dict | None = None)
Bases:
OCIWorkRequestMixin
Class for the data labeling service. Integrates the data labeling service APIs.
Examples
>>> import ads
>>> import pandas as pd
>>> from ads.data_labeling.data_labeling_service import DataLabeling
>>> ads.set_auth("api_key")
>>> dls = DataLabeling()
>>> dls.list_dataset()
>>> metadata_path = dls.export(dataset_id="your dataset id",
...     path="oci://<bucket_name>@<namespace>/folder")
>>> df = pd.DataFrame.ads.read_labeled_data(metadata_path)
Initializes a DataLabeling instance.
- Parameters:
compartment_id (str, optional) – OCID of the compartment that contains the data labeling datasets.
dls_cp_client_auth (dict, optional) – Data Labeling control plane client auth. Default is None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys or ads.common.auth.resource_principal to create the appropriate authentication signer and the kwargs required to instantiate the IdentityClient object.
dls_dp_client_auth (dict, optional) – Data Labeling data plane client auth. Default is None. The default authentication is set using the ads.set_auth API. To override the default, use ads.common.auth.api_keys or ads.common.auth.resource_principal to create the appropriate authentication signer and the kwargs required to instantiate the IdentityClient object.
- Returns:
Nothing.
- Return type:
None
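- export(dataset_id: str, path: str, include_unlabeled=False) str
Exports the dataset identified by dataset_id, saves the metadata JSONL file and the records JSONL file to the object storage path provided by the user, and returns the path of the metadata JSONL file.
- Parameters:
dataset_id (str) – The ID of the dataset to export.
path (str) – The object storage path under which the JSONL files are saved.
include_unlabeled (bool, optional) – Whether to include unlabeled records in the export. Defaults to False.
- Returns:
OCI path of the metadata JSONL file.
- Return type:
str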
- list_dataset(**kwargs) DataFrame
Lists all the datasets created from the data labeling service under a given compartment.
- Parameters:
kwargs (dict, optional) – Additional keyword arguments passed to the oci.data_labeling_service.DataLabelingManagementClient.list_datasets method.
- Returns:
pandas DataFrame which contains the dataset information.
- Return type:
pandas.DataFrame
- Raises:
Exception – If pagination.list_call_get_all_results() fails.
ads.data_labeling.metadata module
- class ads.data_labeling.metadata.Metadata(source_path: str = '', records_path: str = '', labels: List[str] = <factory>, dataset_name: str = '', compartment_id: str = '', dataset_id: str = '', annotation_type: str = '', dataset_type: str = '')
Bases:
DataClassSerializable
The class representing the labeled dataset metadata.
- source_path
Contains information on where all the source data (image/text/document) is stored.
- Type:
str
- labels
List of classes/labels for the dataset.
- Type:
List
- annotation_type
Type of the labeling/annotation task. Currently supports SINGLE_LABEL, MULTI_LABEL, ENTITY_EXTRACTION, BOUNDING_BOX.
- Type:
str
- classmethod from_dls_dataset(dataset: Dataset) Metadata
Constructs a Metadata instance from an OCI DLS dataset.
- Parameters:
dataset (OCIDLSDataset) – OCIDLSDataset object.
- Returns:
The ADS labeled dataset metadata instance.
- Return type:
Metadata
ads.data_labeling.ner module
- class ads.data_labeling.ner.NERItem(label: str = '', offset: int = 0, length: int = 0)
Bases:
object
NERItem class which is a representation of a token span.
- class ads.data_labeling.ner.NERItems(items: List[NERItem] = <factory>)
Bases:
object
NERItems class consists of a list of NERItem.
- exception ads.data_labeling.ner.WrongEntityFormatLabelIsEmpty
Bases:
ValueError
- exception ads.data_labeling.ner.WrongEntityFormatLabelNotString
Bases:
ValueError
- exception ads.data_labeling.ner.WrongEntityFormatLengthIsNegative
Bases:
ValueError
- exception ads.data_labeling.ner.WrongEntityFormatLengthNotInteger
Bases:
ValueError
- exception ads.data_labeling.ner.WrongEntityFormatOffsetIsNegative
Bases:
ValueError
- exception ads.data_labeling.ner.WrongEntityFormatOffsetNotInteger
Bases:
ValueError
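The exception names above suggest that a token span is validated on its label, offset, and length. The sketch below shows one way such checks could look and how an offset/length pair selects text. It is a stand-alone illustration under that assumption: SpanSketch, the shortened exception names, and span_text are all hypothetical, and the actual checks in ads.data_labeling.ner may differ.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the ValueError subclasses listed above.
class LabelIsEmpty(ValueError): ...
class LabelNotString(ValueError): ...
class OffsetNotInteger(ValueError): ...
class OffsetIsNegative(ValueError): ...
class LengthNotInteger(ValueError): ...
class LengthIsNegative(ValueError): ...


@dataclass
class SpanSketch:
    """Illustrative token span, not the ADS NERItem class."""

    label: str = ""
    offset: int = 0
    length: int = 0

    def __post_init__(self) -> None:
        # Each check raises its own ValueError subclass so callers can
        # tell the failure modes apart or catch them all as ValueError.
        if not isinstance(self.label, str):
            raise LabelNotString()
        if not self.label:
            raise LabelIsEmpty()
        if not isinstance(self.offset, int):
            raise OffsetNotInteger()
        if self.offset < 0:
            raise OffsetIsNegative()
        if not isinstance(self.length, int):
            raise LengthNotInteger()
        if self.length < 0:
            raise LengthIsNegative()


def span_text(text: str, span: SpanSketch) -> str:
    """Return the substring covered by the span."""
    return text[span.offset : span.offset + span.length]


sentence = "Oracle is based in Austin."
company = SpanSketch(label="company", offset=0, length=6)
city = SpanSketch(label="city", offset=19, length=6)
```

Because every exception derives from ValueError, a caller that does not care about the specific cause can handle all of them with a single except ValueError clause.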
ads.data_labeling.record module
- class ads.data_labeling.record.Record(path: str = '', content: Any | None = None, annotation: Tuple | str | List[BoundingBoxItem] | List[NERItem] | None = None)
Bases:
object
Class representing Record.
- content
Content of the record.
- Type:
Any
- annotation
Annotation/label of the record.
- Type:
Union[Tuple, str, List[BoundingBoxItem], List[NERItem]]