Skip to content

Latest commit

 

History

History
673 lines (585 loc) · 24.8 KB

File metadata and controls

673 lines (585 loc) · 24.8 KB
copyright
years
2016, 2017
lastupdated 2017-11-16

{:new_window: target="_blank"} {:shortdesc: .shortdesc} {:screen: .screen} {:codeblock: .codeblock} {:pre: .pre}

Continuous learning system

The {{site.data.keyword.pm_full}} service includes a continuous learning system. Continuous learning systems provide automated monitoring of model performance, retraining, and redeployment to ensure prediction quality. {: shortdesc}

Scenario name: Best drug for heart treatment selection.

Scenario description: A biomedical company that produces heart drugs wants to deploy a model that selects the best drug for heart treatment. When new evidence emerges on drug effectiveness, a model update should be considered. A data scientist prepared a predictive model and shares it with you (the developer). Your task is to deploy the model, set learning configuration and execute learning iteration to evaluate, retrain, and redeploy the model.

Note: You can also play with a sample python notebook that creates a sample model, configures learning system, and runs the learning iteration.

Prerequisites

To work with the samples, you must have the following services:

Using the sample model

  1. Go to the Samples tab of the {{site.data.keyword.pm_full}} Dashboard.
  2. In the Sample Models section, find the Heart Drug Selection tile and click the Add model icon (+).

The sample Heart Drug Selection model appears in the list of available models on the Models tab.

Generating the access token

Generate an access token that uses the user and password available from the Service Credentials tab of the {{site.data.keyword.pm_full}} service instance.

Request example:

curl --basic --user <username>:<password> https://ibm-watson-ml.mybluemix.net/v3/identity/token

{: codeblock}

Output example:

{"token":"**********"}

{: codeblock}

Use the following terminal command to assign your token value to the environment variable token:

token="<token_value>"

{: codeblock}

Working with published models

Use the following API call to get your instance details, such as:

  • published models url address
  • deployments url address
  • usage information

Request example:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer  $token" https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}

{: codeblock}

Output example:

{
   "metadata":{
      "guid":"360c510b-012c-4793-ae3f-063410081c3e",
      "url":"https://ibm-watson-ml.mybluemix.net/v3/wml_instances/360c510b-012c-4793-ae3f-063410081c3e",
      "created_at":"2017-08-04T09:15:48.344Z",
      "modified_at":"2017-08-22T08:34:47.759Z"
   },
   "entity":{
      "source":"Bluemix",
      "published_models":{
         "url":"https://ibm-watson-ml.mybluemix.net/v3/wml_instances/360c510b-012c-4793-ae3f-063410081c3e/published_models"
      },
      "usage":{
         "expiration_date":"2017-09-01T00:00:00.000Z",
         "computation_time":{
            "limit":18000,
            "current":4
         },
         "model_count":{
            "limit":200,
            "current":2
         },
         "prediction_count":{
            "limit":5000,
            "current":16
         },
         "deployment_count":{
            "limit":5,
            "current":1
         }
      },
      "plan_id":"0f2a3c2c-456b-40f3-9b19-726d2740b11c",
      "status":"Active",
      "organization_guid":"b0e61605-a82e-4f03-9e9f-2767973c084d",
      "region":"us-south",
      "account":{
         "id":"f52968f3dbbe7b0b53e15743d45e5e90",
         "name":"Umit Cakmak's Account",
         "type":"TRIAL"
      },
      "owner":{
         "ibm_id":"31000292EV",
         "email":"umit.cakmak@pl.ibm.com",
         "user_id":"43e0ee0e-6bfb-48fc-bcd8-d61e40d19253",
         "country_code":"POL",
         "beta_user":true
      },
      "deployments":{
         "url":"https://ibm-watson-ml.mybluemix.net/v3/wml_instances/360c510b-012c-4793-ae3f-063410081c3e/deployments"
      },
      "space_guid":"4c55eb1c-d6fe-4f0a-9390-35c9a7ecf27a",
      "plan":"standard"
   }
}

{: codeblock}

After you obtain the published_models url address, use the following API call to get model's details:

Request example:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer  $token" https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models

{: codeblock}

Output example:

{  
   "count":1,
   "resources":[  
      {  
         "metadata":{  
            "guid":"361d0edb-c5c9-4b8c-b353-d968e7dc0b8d",
            "url":"https://ibm-watson-ml.mybluemix.net/v3/wml_instances/360c510b-012c-4793-ae3f-063410081c3e/published_models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d",
            "created_at":"2017-08-22T08:34:47.597Z",
            "modified_at":"2017-08-22T08:34:47.691Z"
         },
         "entity":{  
            "runtime_environment":"spark-2.0",
            "author":{  
               "name":"IBM",
               "email":""
            },
            "name":"Best Heart Drug Selection",
            "label_col":"DRUG",
            "training_data_schema":{  
               "fields":[  
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"AGE"
                     },
                     "type":"integer",
                     "name":"AGE",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"SEX"
                     },
                     "type":"string",
                     "name":"SEX",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"BP"
                     },
                     "type":"string",
                     "name":"BP",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"CHOLESTEROL"
                     },
                     "type":"string",
                     "name":"CHOLESTEROL",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":6,
                        "name":"NA"
                     },
                     "type":"decimal(12,6)",
                     "name":"NA",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":6,
                        "name":"K"
                     },
                     "type":"decimal(13,6)",
                     "name":"K",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"DRUG"
                     },
                     "type":"string",
                     "name":"DRUG",
                     "nullable":true
                  }
               ],
               "type":"struct"
            },
            "latest_version":{  
               "url":"https://ibm-watson-ml.mybluemix.net/v2/artifacts/models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/versions/8dfe9548-57ec-4768-8d3e-6c1e7e5cdfc3",
               "guid":"8dfe9548-57ec-4768-8d3e-6c1e7e5cdfc3",
               "created_at":"2017-08-22T08:34:47.691Z"
            },
            "model_type":"sparkml-model-2.0",
            "deployments":{  
               "count":0,
               "url":"https://ibm-watson-ml.mybluemix.net/v3/wml_instances/360c510b-012c-4793-ae3f-063410081c3e/published_models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/deployments"
            },
            "input_data_schema":{  
               "fields":[  
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"AGE"
                     },
                     "type":"integer",
                     "name":"AGE",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"SEX"
                     },
                     "type":"string",
                     "name":"SEX",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"BP"
                     },
                     "type":"string",
                     "name":"BP",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":0,
                        "name":"CHOLESTEROL"
                     },
                     "type":"string",
                     "name":"CHOLESTEROL",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":6,
                        "name":"NA"
                     },
                     "type":"decimal(12,6)",
                     "name":"NA",
                     "nullable":true
                  },
                  {  
                     "metadata":{  
                        "scale":6,
                        "name":"K"
                     },
                     "type":"decimal(13,6)",
                     "name":"K",
                     "nullable":true
                  }
               ],
               "type":"struct"
            }
         }
      }
}

{: codeblock}

Set up the continuous learning system for a published model

To configure continuous learning system for your model, you must perform the following tasks:

  1. Prepare the authorization header.
  2. Prepare the feedback data set.
  3. Prepare the configuration payload.

After you complete preparation of the authorization header, feedback data set, and the configuration payload, you can begin iterating your continuous learning system. For more information, see Run continuous learning system iteration.

Prepare the authorization header

To prepare the authorization header that combines {{site.data.keyword.pm_full}} token and Spark instance credentials, provide the following details:

  • The access token created in the previous step

  • Spark service credentials, which can be found on the Service Credentials tab of the {{site.data.keyword.Bluemix_notm}} Spark service dashboard. Before you make the deployment request, Spark credentials must be encoded as base64. They are passed in the header of request in X-Spark-Service-Instance field.

    Depending on the operating system that you are using, you must issue one of the following terminal commands to perform base64 encoding and assign it to the environment variable.

    On the macOS operating system, use the following command:

    spark_credentials=$(echo '{"credentials": {"tenant_id": "s068-ade10277b64956-05b1d10fv12b","tenant_id_full": "00fd89e6-8cf2-4712-a068-ade10277b649_41f37bf2-1b95-4c65-a156-05b1d10fb12b","cluster_master_url": "https://spark.bluemix.net", "instance_id": "00fd89e6-8cf2-4712-a068-ade10277b649","tenant_secret": "c74c37cf-482a-4da4-836e-f32ca26ccbb9","plan": "ibm.SparkService.PayGoPersonal"},"version": "2.0"}' | base64)
    

    {: codeblock}

    On the Microsoft Windows or Linux operating systems, you must use the --wrap=0 parameter with the base64 command to perform base64 encoding:

    spark_credentials=$(echo '{"credentials": {"tenant_id": "s068-ade10277b64956-05b1d10fv12b","tenant_id_full": "00fd89e6-8cf2-4712-a068-ade10277b649_41f37bf2-1b95-4c65-a156-05b1d10fb12b","cluster_master_url": "https://spark.bluemix.net", "instance_id": "00fd89e6-8cf2-4712-a068-ade10277b649","tenant_secret": "c74c37cf-482a-4da4-836e-f32ca26ccbb9","plan": "ibm.SparkService.PayGoPersonal"},"version": "2.0"}' | base64 --wrap=0)
    

    {: codeblock}

Prepare feedback data set

The learning system requires a connection to training data (the data that is used in model training) as well as feedback data. The feedback data store is used to monitor and retrain your model when needed. {{site.data.keyword.dashdbshort}} is supported as a feedback data store. The feedback table is managed, modified, and used by the {{site.data.keyword.pm_short}} service. To prepare the DRUG_FEEDBACK_DATA table in {{site.data.keyword.dashdbshort}}, you must complete the following steps:

  1. Create a {{site.data.keyword.dashdbshort}} Service instance (an entry plan is offered).
  2. Create the DRUG_FEEDBACK_DATA table in {{site.data.keyword.dashdbshort}}.
    1. From the GitHub repository, download the drug_feedback_data.csv file.
    2. To get started with {{site.data.keyword.dashdbshort_notm}}, click the Open the console icon.
    3. Select the Load Data and Desktop load type.
    4. Drag the previously downloaded file and press Next.
    5. To import data, click Schema and then click New Table.
    6. In the new table field type a name and click Next.
    7. For the field separator use a semicolon (;).
    8. Click Next to create table with uploaded data.

Note: You can add new feedback records to the feedback database by using this REST API endpoint. For more information, see Feedback data store section.

Prepare the configuration payload

To specify the learning configuration, you must use the following endpoint:

https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/learning_configuration

{: codeblock}

To finalize the payload, you must define the values of the following parameters:

feedback_data_reference
connection and source of feedback data store
min_feedback_data_size
minimal number of records in the feedback data set to start continuous learning system iteration
auto_retrain
[never, always, conditionally] specifies when the retraining process triggers. When set to [conditionally], it triggers the retraining process when the model quality is less than specified threshold value.
auto_redeploy
[never, always, conditionally] specifies when retrained model should be deployed. When set to [conditionally], it triggers model redeployment when newly trained model quality is better than the quality of the currently deployed one.

Request example:

curl -v -X PUT \
    -H "Content-Type:application/json" \
    -H "Authorization: Bearer $token" \
    -H "X-Spark-Service-Instance: $spark_credentials" \
    -d '{
         "definition": {
           "method": "multiclass",
           "metrics": [
             {
               "name": "accuracy",
               "threshold": 0.8
             }
           ]
         },
         "feedback_data_reference": {
           "connection": {
            "db": "BLUDB",
            "host": "awh-yp-small02.services.dal.bluemix.net",
            "username": "***",
            "password": "***"
           },
           "source": {
            "tablename": "DRUG_FEEDBACK_DATA",
            "type": "dashdb"
           }
          },
          "min_feedback_data_size": 10,
          "auto_retrain": "conditionally",
          "auto_redeploy": "never",
          "last_training_record": 0
        }' \
    https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/learning_configuration

{: codeblock}

Output example:

{
   "auto_retrain":"conditionally",
   "auto_redeploy":"never",
      "evaluation_definition": {
    "method": "multiclass",
    "metrics": [
      {
        "name": "accuracy",
        "threshold": 0.8
      }
    ]
   },
   "feedback_data_reference":{
      "connection":{
         "db":"BLUDB",
         "host":"awh-yp-small02.services.dal.bluemix.net",
         "username":"***",
         "password":"***"
      },
      "source":{
         "tablename":"DRUG_FEEDBACK_DATA",
         "type":"dashdb"
      }
   },
   "min_feedback_data_size":10,
   "spark_service":{
      "credentials":{
         "tenant_id":"***",
         "cluster_master_url":"https://spark.bluemix.net",
         "tenant_id_full":"***",
         "tenant_secret":"***",
         "instance_id":"***",
         "plan":"ibm.SparkService.PayGoPersonal"
      },
      "version":"2.0"
   },
   "training_data_reference": {
   "connection": {
     "db": "BLUDB",
     "host": "dashdb-entry-yp-dal09-08.services.dal.bluemix.net",
     "password": "***",
     "username": "***"
   },
   "source": {
     "tablename": "DRUG_TRAIN_DATA_UPDATED",
     "type": "dashdb"
    }
   }
}

{: codeblock}

Note: The example uses default values for the auto_retrain and auto_redeploy parameters. For more information about these parameters, see REST API documentation for the learning_configuration parameter.

Run continuous learning system iteration

To start iteration of the learning system, use the following REST API method. During iteration the published model is evaluated. If the evaluated accuracy is less than the specified threshold value, the model retraining triggers. Both the training and feedback data sets are used for retraining and evaluation.

Request example:

curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer $token" https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/learning_iterations

{: codeblock}

Output example:

The request has been fulfilled and resulted in a new resource being created.

{: codeblock}

To check the status of the iteration, issue the following request:

Request example:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer $token" https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/learning_iterations

{: codeblock}

Output example:

{
  "count": 1,
  "resources": [
    {
      "entity": {
        "status": {
          "state": "INITIALIZED"
        },
        "model_version": {
          "url": "https://ibm-watson-ml-svt.stage1.mybluemix.net/v2/artifacts/models/71dc688a-ebda-4174-9574-e8805059e08f/versions/de0df5a6-32c6-408b-bf94-d7b0e3a7fc2d",
          "created_at": "2017-10-30T15:45:12.926Z",
          "guid": "de0df5a6-32c6-408b-bf94-d7b0e3a7fc2d"
        },
        "spark_service": {
          "credentials": {
            "tenant_id_full": "***",
            "tenant_secret": "***",
            "tenant_id": "***",
            "instance_id": "***",
            "plan": "ibm.SparkService.PayGoPersonal",
            "cluster_master_url": "https://spark.bluemix.net"
          },
          "version": "2.0"
        },
        "published_model": {
          "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/ff558a0e-af34-40e9-9f13-2d51b1a4c8bb/published_models/71dc688a-ebda-4174-9574-e8805059e08f",
          "guid": "71dc688a-ebda-4174-9574-e8805059e08f"
        }
      },
      "metadata": {
        "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/ff558a0e-af34-40e9-9f13-2d51b1a4c8bb/published_models/71dc688a-ebda-4174-9574-e8805059e08f/learning_iterations/a308838b-445f-45b8-9fbf-1c3dd1b392c1",
        "created_at": "2017-10-30T15:54:30.657Z",
        "guid": "a308838b-445f-45b8-9fbf-1c3dd1b392c1"
      }
    }
  ]
}

{: codeblock}

You can also get the list of evaluation metrics and values by issuing the following request:

Request example:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer $token" https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/evaluation_metrics

{: codeblock}

Output example:

{
  "count": 1,
  "resources": [
      {
      "phase": "setup",
      "values": [
        {
          "name": "accuracy",
          "value": 0.870968,
          "threshold": 0.8
        }
      ],
      "timestamp": "2017-08-07T10:11:02.000Z",
      ""model_version_url": "https://ibm-watson-ml.mybluemix.net/v2/artifacts/models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/versions/8dfe9548-57ec-4768-8d3e-6c1e7e5cdfc3"
    },
      {
      "phase": "monitoring",
      "values": [
        {
          "name": "accuracy",
          "value": 0.75,
          "threshold": 0.8
        }
      ],
      "timestamp": "2017-08-08T10:11:12.000Z",
      ""model_version_url": "https://ibm-watson-ml.mybluemix.net/v2/artifacts/models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/versions/8dfe9548-57ec-4768-8d3e-6c1e7e5cdfc3"
    },
    {
      "phase": "setup",
      "values": [
        {
          "name": "accuracy",
          "value": 0.870968,
          "threshold": 0.8
        }
      ],
      "timestamp": "2017-08-08T10:11:12.000Z",
      ""model_version_url": "https://ibm-watson-ml.mybluemix.net/v2/artifacts/models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/versions/e390cd8a-e043-4da6-b3e8-1d2d02d971fb"
    },    
    {
      "phase": "training",
      "values": [
        {
          "name": "accuracy",
          "value": 0.88281694,
          "threshold": 0.8
        }
      ],
      "timestamp": "2017-08-08T10:11:22.000Z",
      ""model_version_url": "https://ibm-watson-ml.mybluemix.net/v2/artifacts/models/361d0edb-c5c9-4b8c-b353-d968e7dc0b8d/versions/e390cd8a-e043-4da6-b3e8-1d2d02d971fb"
    }
  ]
}

{: codeblock}

Feedback data store

You can use a feedback endpoint to send new records (records that were not used in training process) to feedback store that is defined in the learning configuration. The feedback data store is used to monitor and retrain your model when needed. {{site.data.keyword.dashdbshort}} is supported as a feedback data store. If the feedback table does not exist, the service creates it. If the table exists, the schema is verified to match that of the training table and an extra column named _training is appended to the table. The additional column is used to mark records that are consumed in the retraining process.

Note: Feedback table is managed, modified, and used by the {{site.data.keyword.pm_short}} service.

You can send a new feedback record to the feedback data store by issuing the following request:

Request example:

curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer $token" \
-d '{
   "fields":[
      "AGE",
      "SEX",
      "BP",
      "CHOLESTEROL",
      "NA",
      "K",
      "DRUG"
   ],
   "values":[
      [
         16,
         "M",
         "HIGH",
         "NORMAL",
         0.58301000000000003,
         0.033884999999999998,
         "drugY"
      ]
   ]
}' \
https://ibm-watson-ml.mybluemix.net/v3/wml_instances/{instance_id}/published_models/{published_model_id}/feedback

{: codeblock}

Learn more

Ready to get started? To create an instance of a service or bind an application, see Using the service with Spark and Python models or Using the service with IBM® SPSS® models.

For more information about the API, see Service API for Spark and Python models or Service API for IBM® SPSS® models.

For more information about IBM® SPSS® Modeler and the modeling algorithms it provides, see IBM Knowledge Center.

For more information about IBM Data Science Experience and the modeling algorithms it provides, see https://datascience.ibm.com.