==============
Productionize
==============

Configure
---------

After having set up ``ads opctl`` on your desired machine using ``ads opctl configure``, you are ready to begin your anomaly detection application. At a bare minimum, you will need to provide the following details about the data:

- Path to the input data (input_data)
- Name of the Datetime column (datetime_column)
- Name of the Target column (target_column)

These details exactly match the initial anomaly.yaml file generated by running ``ads operator init --type anomaly``:

.. code-block:: yaml

    kind: operator
    type: anomaly
    version: v1
    spec:
        datetime_column:
            name: Date
        input_data:
            url: data.csv
        target_column: target

Optionally, you can specify much more. The most common additions are:

- Path to the validation data, which has all of the columns of the input_data plus an ``anomaly`` column (validation_data)
- Path to test data, in the event you want to evaluate the selected model on a test set (test_data)
- List of column names that index different timeseries within the data, such as a product ID or some other such series (target_category_columns)
- Path to the output directory, where the operator will place the ``outliers.csv``, ``report.html``, and other artifacts produced from the run (output_directory)

An extensive list of parameters can be found in the ``YAML Schema`` section.

Run
---

Once written, run the anomaly.yaml file:

.. code-block:: bash

    ads operator run -f anomaly.yaml

Interpret Results
-----------------

The anomaly detection operator produces many output files: ``outliers.csv``, ``report.html``, and optionally ``inliers.csv``. We will go through each of these output files in turn.

**outliers.csv**

This file contains the entire historical dataset with the following columns:

* Date: Time series data
* Series: Categorical or numerical index
* Target Column: Input data
* Score: A score from 0 to 1 indicating how anomalous the datapoint is

**report.html**

The report.html file is designed differently for each model type. Generally, it contains a summary of the input and validation data, a plot of the target from the input data overlaid with red dots for anomalous values, analysis of the models used, and details about the model components. It also includes a receipt YAML file, providing a fully detailed version of the original anomaly.yaml file.

**metrics.csv**

The metrics file includes relevant metrics calculated on the training set.

Examples
--------

**Simple Example**

The simplest yaml file is generated by ``ads operator init --type anomaly`` and looks like the following:

.. code-block:: yaml

    kind: operator
    type: anomaly
    version: v1
    spec:
        datetime_column:
            name: Date
        input_data:
            url: data.csv
        model: auto
        target_column: target

**Typical Example**

A typical anomaly detection application may have the following fields:

.. code-block:: yaml

    kind: operator
    type: anomaly
    version: v1
    spec:
        input_data:
            connect_args:
                user: XXX
                password: YYY
                dsn: "localhost/orclpdb"
            sql: 'SELECT Series, Total, time FROM live_data'
        datetime_column:
            name: time
            format: "%H:%M:%S"
        model: "auto"
        output_directory:
            url: results
        target_category_columns:
            - Series
        target_column: Total
        test_data:
            url: oci://bucket@namespace/test_data.csv
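After a run of the typical configuration above completes, the ``results`` output directory will contain ``outliers.csv``. The following is a minimal sketch of filtering that file for the most anomalous points with pandas; the column name ``score`` and the 0.9 threshold are illustrative assumptions, not a guaranteed schema:

.. code-block:: python

    # Minimal sketch: inspect the operator's outliers.csv with pandas.
    # Assumption: the 0-1 anomaly score lives in a column named "score";
    # the actual column name may differ.
    import pandas as pd

    # The typical example above writes its artifacts into "results".
    outliers = pd.read_csv("results/outliers.csv")

    # Keep only the most anomalous points, worst offenders first.
    suspicious = outliers[outliers["score"] > 0.9]
    print(suspicious.sort_values("score", ascending=False).head(10))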
**Complex Example**

The yaml can also be maximally stated as follows:

.. code-block:: yaml

    kind: operator
    type: anomaly
    version: v1
    spec:
        input_data:
            connect_args:
                user: XXX
                password: YYY
                dsn: "localhost/orclpdb"
            sql: 'SELECT Store_ID, Sales, Date FROM live_data'
        validation_data:
            url: oci://bucket@namespace/additional_data.csv
            columns:
                - Date
                - Store_ID
                - v1
                - v3
                - v4
        output_directory:
            url: results
        test_data:
            url: test_data.csv
        target_category_columns:
            - Store_ID
        target_column: Sales
        datetime_column:
            format: "%d/%m/%y"
            name: Date
        model: automlx
        model_kwargs:
            time_budget: 100
        preprocessing: true
        generate_metrics: true
        generate_report: true
        metrics_filename: metrics.csv
        report_filename: report.html
        report_theme: light
        test_metrics_filename: test_metrics.csv
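As noted in the Configure section, ``validation_data`` must contain every column of the input data plus an ``anomaly`` column. If you have hand-labeled outliers available, a sketch like the following could assemble such a file; the input file name, label column, and label encoding are all hypothetical, and uploading the result to Object Storage is a separate step:

.. code-block:: python

    # Hypothetical sketch: build a validation file with the required
    # ``anomaly`` column from hand-labeled sales data. The file name
    # "labeled_sales.csv" and its "is_outlier" column are assumptions.
    import pandas as pd

    labeled = pd.read_csv("labeled_sales.csv")  # Date, Store_ID, Sales, is_outlier

    # Carry over the input columns and rename the label to "anomaly",
    # assuming 1 marks an anomalous row and 0 a normal one.
    validation = labeled[["Date", "Store_ID", "Sales"]].copy()
    validation["anomaly"] = labeled["is_outlier"].astype(int)

    validation.to_csv("additional_data.csv", index=False)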