Last Updated on November 4, 2022 by InfraExam
DP-100 : Designing and Implementing a Data Science Solution on Azure : Part 09
-
You register a model that you plan to use in a batch inference pipeline.
The batch inference pipeline must use a ParallelRunStep step to process files in a file dataset. The script that the ParallelRunStep step runs must process six input files each time the inferencing function is called.
You need to configure the pipeline.
Which configuration setting should you specify in the ParallelRunConfig object for the ParallelRunStep step?
- process_count_per_node= "6"
- node_count= "6"
- mini_batch_size= "6"
- error_threshold= "6"
Explanation:
mini_batch_size is the correct setting. For FileDataset input, this field is the number of files the user script can process in one run() call. For TabularDataset input, this field is the approximate size of data the user script can process in one run() call. Example values are 1024, 1024KB, 10MB, and 1GB.
Incorrect Answers:
A: process_count_per_node
Number of processes executed on each node. (Optional; the default value is the number of cores on the node.)
B: node_count
The number of nodes in the compute target used for running the ParallelRunStep.
D: error_threshold
The number of record failures for TabularDataset and file failures for FileDataset that should be ignored during processing. If the error count goes above this value, the job is aborted.
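To illustrate what mini_batch_size means for a FileDataset input, here is a minimal pure-Python sketch of grouping files into mini-batches of six and passing each group to one run() call. It does not use the Azure ML SDK; the helper split_into_mini_batches, the stand-in run() body, and the file names are all hypothetical.

```python
from typing import List


def split_into_mini_batches(file_paths: List[str], mini_batch_size: int) -> List[List[str]]:
    """Group file paths into consecutive mini-batches of at most mini_batch_size files."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be a positive integer")
    return [
        file_paths[start:start + mini_batch_size]
        for start in range(0, len(file_paths), mini_batch_size)
    ]


def run(mini_batch: List[str]) -> List[str]:
    """Stand-in for the scoring function: returns one result per input file."""
    return [f"{path}: scored" for path in mini_batch]


# Hypothetical input: 14 files in a file dataset.
files = [f"data/file_{i:02d}.csv" for i in range(14)]
batches = split_into_mini_batches(files, mini_batch_size=6)
results = [row for batch in batches for row in run(batch)]
print([len(batch) for batch in batches])
```

With 14 files and a mini-batch size of 6, run() is called three times, and the last call receives the remaining two files.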
-
You deploy a real-time inference service for a trained model.
The deployed model supports a business-critical application, and it is important to be able to monitor the data submitted to the web service and the predictions the data generates.
You need to implement a monitoring solution for the deployed model using minimal administrative effort.
What should you do?
- View the explanations for the registered model in Azure ML studio.
- Enable Azure Application Insights for the service endpoint and view logged data in the Azure portal.
- View the log files generated by the experiment used to train the model.
- Create an ML Flow tracking URI that references the endpoint, and view the data logged by ML Flow.
Explanation:
Configure logging with Azure Machine Learning studio
You can also enable Azure Application Insights from Azure Machine Learning studio. When you’re ready to deploy your model as a web service, use the following steps to enable Application Insights:
1. Sign in to the studio at https://ml.azure.com.
2. Go to Models and select the model you want to deploy.
3. Select +Deploy.
4. Populate the Deploy model form.
5. Expand the Advanced menu.
6. Select Enable Application Insights diagnostics and data collection.
-
HOTSPOT
You use Azure Machine Learning to train and register a model.
You must deploy the model into production as a real-time web service to an inference cluster named service-compute that the IT department has created in the Azure Machine Learning workspace.
Client applications consuming the deployed web service must be authenticated based on their Azure Active Directory service principal.
You need to write a script that uses the Azure Machine Learning SDK to deploy the model. The necessary modules have been imported.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation:
Box 1: AksCompute
Example:
aks_target = AksCompute(ws, "myaks")
# If deploying to a cluster configured for dev/test, ensure that it was created with enough
# cores and memory to handle this deployment configuration. Note that memory is also used by
# things such as dependencies and AML components.
deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config, aks_target)
Box 2: AksWebservice
Box 3: token_auth_enabled=True
Whether or not token auth is enabled for the Webservice.
Note: A service principal defined in Azure Active Directory (Azure AD) can act as a principal on which authentication and authorization policies can be enforced in Azure Databricks.
The Azure Active Directory Authentication Library (ADAL) can be used to programmatically get an Azure AD access token for a user.
Incorrect Answers:
auth_enabled (bool): Whether or not to enable key auth for this Webservice. Defaults to True.
-
An organization creates and deploys a multi-class image classification deep learning model that uses a set of labeled photographs.
The software engineering team reports there is a heavy inferencing load for the prediction web services during the summer. The production web service for the model fails to meet demand despite having a fully-utilized compute cluster where the web service is deployed.
You need to improve performance of the image classification web service with minimal downtime and minimal administrative effort.
What should you advise the IT Operations team to do?
- Create a new compute cluster by using larger VM sizes for the nodes, redeploy the web service to that cluster, and update the DNS registration for the service endpoint to point to the new cluster.
- Increase the node count of the compute cluster where the web service is deployed.
- Increase the minimum node count of the compute cluster where the web service is deployed.
- Increase the VM size of nodes in the compute cluster where the web service is deployed.
Explanation:
The Azure Machine Learning SDK does not provide support for scaling an AKS cluster. To scale the nodes in the cluster, use the UI for your AKS cluster in Azure Machine Learning studio. You can only change the node count, not the VM size of the cluster.
-
You use Azure Machine Learning designer to create a real-time service endpoint. You have a single Azure Machine Learning service compute resource.
You train the model and prepare the real-time pipeline for deployment.
You need to publish the inference pipeline as a web service.
Which compute type should you use?
- a new Machine Learning Compute resource
- Azure Kubernetes Services
- HDInsight
- the existing Machine Learning Compute resource
- Azure Databricks
Explanation:
Azure Kubernetes Service (AKS) can be used for real-time inference.
-
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You train and register a machine learning model.
You plan to deploy the model as a real-time web service. Applications must use key-based authentication to use the model.
You need to deploy the web service.
Solution:
Create an AciWebservice instance.
Set the value of the ssl_enabled property to True.
Deploy the model to the service.
Does the solution meet the goal?
- Yes
- No
Explanation:
Instead, set only auth_enabled = TRUE. The ssl_enabled property controls TLS for the service and does not turn on key-based authentication.
Note: Key-based authentication.
Web services deployed on AKS have key-based auth enabled by default. ACI-deployed services have key-based auth disabled by default, but you can enable it by setting auth_enabled = TRUE when creating the ACI web service. The following is an example of creating an ACI deployment configuration with key-based auth enabled.
deployment_config <- aci_webservice_deployment_config(cpu_cores = 1,
                                                       memory_gb = 1,
                                                       auth_enabled = TRUE)
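On the client side, a key (or a token, when token-based auth is enabled) is typically sent in a Bearer Authorization header. The following is a minimal sketch using only the Python standard library; it builds the request without sending it. The scoring URI, the key value, and the helper name build_scoring_request are hypothetical placeholders.

```python
import json
import urllib.request

# Hypothetical values: replace with the service's real scoring URI and key.
scoring_uri = "http://example.invalid/score"
service_key = "my-primary-key"


def build_scoring_request(uri: str, credential: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the key or token
    in a Bearer Authorization header."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(uri, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    request.add_header("Authorization", f"Bearer {credential}")
    return request


request = build_scoring_request(scoring_uri, service_key, {"data": [[1.0, 2.0]]})
print(request.get_method(), request.get_header("Authorization"))
```

The same request shape applies to token-based authentication; only the credential placed after "Bearer" changes.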
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You train and register a machine learning model.
You plan to deploy the model as a real-time web service. Applications must use key-based authentication to use the model.
You need to deploy the web service.
Solution:
Create an AksWebservice instance.
Set the value of the auth_enabled property to True.
Deploy the model to the service.
Does the solution meet the goal?
- Yes
- No
Explanation:
Key-based authentication.
Web services deployed on AKS have key-based auth enabled by default. ACI-deployed services have key-based auth disabled by default, but you can enable it by setting auth_enabled = TRUE when creating the ACI web service. The following is an example of creating an ACI deployment configuration with key-based auth enabled.
deployment_config <- aci_webservice_deployment_config(cpu_cores = 1,
                                                       memory_gb = 1,
                                                       auth_enabled = TRUE)
-
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You train and register a machine learning model.
You plan to deploy the model as a real-time web service. Applications must use key-based authentication to use the model.
You need to deploy the web service.
Solution:
Create an AksWebservice instance.
Set the value of the auth_enabled property to False.
Set the value of the token_auth_enabled property to True.
Deploy the model to the service.
Does the solution meet the goal?
- Yes
- No
Explanation:
Instead, set only auth_enabled = TRUE. The token_auth_enabled property turns on token-based authentication, not key-based authentication.
Note: Key-based authentication.
Web services deployed on AKS have key-based auth enabled by default. ACI-deployed services have key-based auth disabled by default, but you can enable it by setting auth_enabled = TRUE when creating the ACI web service. The following is an example of creating an ACI deployment configuration with key-based auth enabled.
deployment_config <- aci_webservice_deployment_config(cpu_cores = 1,
                                                       memory_gb = 1,
                                                       auth_enabled = TRUE)
-
You use the following Python code in a notebook to deploy a model as a web service:
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig
inference_config = InferenceConfig(runtime='python',
                                   source_directory='model_files',
                                   entry_script='score.py',
                                   conda_file='env.yml')
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, 'my-service', [model], inference_config, deployment_config)
service.wait_for_deployment(True)
The deployment fails.
You need to use the Python SDK in the notebook to determine the events that occurred during service deployment and initialization.
Which code segment should you use?
- service.state
- service.get_logs()
- service.serialize()
- service.environment
Explanation:
The first step in debugging errors is to get your deployment logs. In Python: service.get_logs()
-
You use the Azure Machine Learning Python SDK to define a pipeline that consists of multiple steps.
When you run the pipeline, you observe that some steps do not run. The cached output from a previous run is used instead.
You need to ensure that every step in the pipeline is run, even if the parameters and contents of the source directory have not changed since the previous run.
What are two possible ways to achieve this goal? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
- Use a PipelineData object that references a datastore other than the default datastore.
- Set the regenerate_outputs property of the pipeline to True.
- Set the allow_reuse property of each step in the pipeline to False.
- Restart the compute cluster where the pipeline experiment is configured to run.
- Set the outputs property of each step in the pipeline to True.
Explanation:
B: If regenerate_outputs is set to True, a new submit will always force generation of all step outputs, and disallow data reuse for any step of this run. Once this run is complete, however, subsequent runs may reuse the results of this run.
C: Keep the following in mind when working with pipeline steps, input/output data, and step reuse.
– If data used in a step is in a datastore and allow_reuse is True, then changes to the data won’t be detected. If the data is uploaded as part of the snapshot (under the step’s source_directory), though this is not recommended, then the hash will change and will trigger a rerun.
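The reuse behaviour described above can be pictured with a toy cache keyed by a fingerprint of the step's script and parameters. This is a simplified illustration only, not how Azure Machine Learning is implemented; the class ToyPipelineRunner and its fields are hypothetical, and only the flag names allow_reuse and regenerate_outputs come from the explanation.

```python
import hashlib
from typing import Dict, List


class ToyPipelineRunner:
    """Toy model of step-output reuse; not the Azure ML implementation."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}
        self.executed: List[str] = []  # names of steps that actually ran

    def run_step(self, name: str, script_source: str, params: dict,
                 allow_reuse: bool = True, regenerate_outputs: bool = False) -> str:
        # Fingerprint the step from its name, script text, and parameters.
        fingerprint = hashlib.sha256(
            repr((name, script_source, sorted(params.items()))).encode("utf-8")
        ).hexdigest()
        # Reuse is possible only if the step allows it and the run does not force regeneration.
        can_reuse = allow_reuse and not regenerate_outputs
        if can_reuse and fingerprint in self._cache:
            return self._cache[fingerprint]
        self.executed.append(name)
        output = f"output of {name} with {params}"
        self._cache[fingerprint] = output
        return output


runner = ToyPipelineRunner()
script = "print('prep')"
runner.run_step("prep", script, {"n": 1})                           # first run: executes
runner.run_step("prep", script, {"n": 1})                           # unchanged: cached output reused
runner.run_step("prep", script, {"n": 1}, allow_reuse=False)        # step-level override: executes
runner.run_step("prep", script, {"n": 1}, regenerate_outputs=True)  # run-level override: executes
print(runner.executed)
```

In this toy model the step executes three times out of four calls: only the second call, with defaults and an unchanged fingerprint, is served from the cache.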
You train a model and register it in your Azure Machine Learning workspace. You are ready to deploy the model as a real-time web service.
You deploy the model to an Azure Kubernetes Service (AKS) inference cluster, but the deployment fails because an error occurs when the service runs the entry script that is associated with the model deployment.
You need to debug the error by iteratively modifying the code and reloading the service, without requiring a re-deployment of the service for each code update.
What should you do?
- Modify the AKS service deployment configuration to enable application insights and re-deploy to AKS.
- Create an Azure Container Instances (ACI) web service deployment configuration and deploy the model on ACI.
- Add a breakpoint to the first line of the entry script and redeploy the service to AKS.
- Create a local web service deployment configuration and deploy the model to a local Docker container.
- Register a new version of the model and update the entry script to load the new version of the model from its registered path.
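Explanation:
Deploying to a local Docker container lets you modify the entry script and reload the local web service to apply the change, without redeploying for each code update.
The following background is from the article on how to work around or solve common Docker deployment errors with Azure Container Instances (ACI) and Azure Kubernetes Service (AKS) using Azure Machine Learning.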
The recommended and the most up to date approach for model deployment is via the Model.deploy() API using an Environment object as an input parameter. In this case our service will create a base docker image for you during deployment stage and mount the required models all in one call. The basic deployment tasks are:
1. Register the model in the workspace model registry.
2. Define Inference Configuration:
a) Create an Environment object based on the dependencies you specify in the environment yaml file or use one of our procured environments.
b) Create an inference configuration (InferenceConfig object) based on the environment and the scoring script.
3. Deploy the model to Azure Container Instance (ACI) service or to Azure Kubernetes Service (AKS).
-
You use Azure Machine Learning designer to create a training pipeline for a regression model.
You need to prepare the pipeline for deployment as an endpoint that generates predictions asynchronously for a dataset of input data values.
What should you do?
- Clone the training pipeline.
- Create a batch inference pipeline from the training pipeline.
- Create a real-time inference pipeline from the training pipeline.
- Replace the dataset in the training pipeline with an Enter Data Manually module.
Explanation:
To generate predictions asynchronously for a dataset of input values, create a batch inference pipeline from the training pipeline. A real-time inference pipeline, by contrast, removes training modules and adds web service inputs and outputs to handle individual requests.
Incorrect Answers:
D: Use the Enter Data Manually module to create a small dataset by typing values.
-
You retrain an existing model.
You need to register the new version of a model while keeping the current version of the model in the registry.
What should you do?
- Register a model with a different name from the existing model and a custom property named version with the value 2.
- Register the model with the same name as the existing model.
- Save the new model in the default datastore with the same name as the existing model. Do not register the new model.
- Delete the existing model and register the new one with the same name.
Explanation:
Model version: A version of a registered model. When a new model is added to the Model Registry, it is added as Version 1. Each model registered to the same model name increments the version number.
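The versioning rule can be sketched with a small in-memory stand-in for a registry. This is an illustration under stated assumptions, not the Azure ML SDK: the class ToyModelRegistry, its methods, and the model name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ToyModelRegistry:
    """In-memory stand-in for a model registry; not the Azure ML SDK."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[dict]] = defaultdict(list)

    def register(self, name: str, tags: Optional[dict] = None) -> int:
        """Add a model under `name`; earlier versions stay in the registry."""
        version = len(self._versions[name]) + 1
        self._versions[name].append({"version": version, "tags": dict(tags or {})})
        return version

    def versions(self, name: str) -> List[int]:
        return [entry["version"] for entry in self._versions[name]]


registry = ToyModelRegistry()
first = registry.register("loan-model")
second = registry.register("loan-model")  # same name: version increments
print(first, second, registry.versions("loan-model"))
```

Registering under the same name keeps version 1 and adds version 2, which is the behaviour the question relies on.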
You use the Azure Machine Learning SDK to run a training experiment that trains a classification model and calculates its accuracy metric.
The model will be retrained each month as new data is available.
You must register the model for use in a batch inference pipeline.
You need to register the model and ensure that the models created by subsequent retraining experiments are registered only if their accuracy is higher than the currently registered model.
What are two possible ways to achieve this goal? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
- Specify a different name for the model each time you register it.
- Register the model with the same name each time regardless of accuracy, and always use the latest version of the model in the batch inferencing pipeline.
- Specify the model framework version when registering the model, and only register subsequent models if this value is higher.
- Specify a property named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy property value of the currently registered model.
- Specify a tag named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy tag value of the currently registered model.
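Explanation:
D: Properties are key-value pairs stored with a registered model, so the accuracy metric can be recorded as a property and compared before a new model is registered.
E: Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric.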
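The comparison step can be sketched as a small decision function. This is a minimal sketch that assumes the accuracy is stored as a string value under the key "accuracy" in the currently registered model's tags (or properties); the function name should_register is hypothetical and no SDK call is made.

```python
from typing import Mapping, Optional


def should_register(new_accuracy: float, current_tags: Optional[Mapping[str, str]]) -> bool:
    """Return True when no accuracy is recorded yet or the new accuracy is higher."""
    if not current_tags or "accuracy" not in current_tags:
        return True
    return new_accuracy > float(current_tags["accuracy"])


print(should_register(0.91, None))                  # nothing registered yet
print(should_register(0.91, {"accuracy": "0.89"}))  # better than the current model
print(should_register(0.85, {"accuracy": "0.89"}))  # worse than the current model
```

A retraining script would call the registration API only when this function returns True, passing the new accuracy as the tag or property value.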
-
You are a data scientist working for a hotel booking website company. You use the Azure Machine Learning service to train a model that identifies fraudulent transactions.
You must deploy the model as an Azure Machine Learning real-time web service using the Model.deploy method in the Azure Machine Learning SDK. The deployed web service must return real-time predictions of fraud based on transaction data input.
You need to create the script that is specified as the entry_script parameter for the InferenceConfig class used to deploy the model.
What should the entry script do?
- Register the model with appropriate tags and properties.
- Create a Conda environment for the web service compute and install the necessary Python packages.
- Load the model and use it to predict labels from input data.
- Start a node on the inference cluster where the web service is deployed.
- Specify the number of cores and the amount of memory required for the inference compute.
Explanation:
The entry script receives data submitted to a deployed web service and passes it to the model. It then takes the response returned by the model and returns that to the client. The script is specific to your model. It must understand the data that the model expects and returns.
The two things you need to accomplish in your entry script are:
– Loading your model (using a function called init())
– Running your model on input data (using a function called run())
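The two functions can be sketched as follows. This is a minimal, self-contained illustration: ThresholdModel is a hypothetical stand-in for a real trained model that init() would normally load from the registered model path, and the JSON shape {"data": [...]} is an assumed request format.

```python
import json

model = None  # populated once by init()


class ThresholdModel:
    """Hypothetical stand-in for a trained model loaded from disk."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, rows):
        # Flag a row when the sum of its features exceeds the threshold.
        return [int(sum(row) > self.threshold) for row in rows]


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    model = ThresholdModel(threshold=10.0)


def run(raw_data):
    """Called for each request: parse the JSON input and return predictions."""
    rows = json.loads(raw_data)["data"]
    return model.predict(rows)


# Local demonstration only; a deployed service calls init() and run() itself.
init()
print(run(json.dumps({"data": [[1.0, 2.0], [8.0, 5.0]]})))  # prints [0, 1]
```

Keeping the model in a module-level variable means the loading cost is paid once in init() rather than on every request.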
DRAG DROP
You use Azure Machine Learning to deploy a model as a real-time web service.
You need to create an entry script for the service that ensures that the model is loaded when the service starts and is used to score new data as it is received.
Which functions should you include in the script? To answer, drag the appropriate functions to the correct actions. Each function may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Explanation:
Box 1: init()
The entry script has only two required functions, init() and run(data). These functions are used to initialize the service at startup and run the model using request data passed in by a client. The rest of the script handles loading and running the model(s).
Box 2: run()
-
You develop and train a machine learning model to predict fraudulent transactions for a hotel booking website.
Traffic to the site varies considerably. The site experiences heavy traffic on Monday and Friday and much lower traffic on other days. Holidays are also high web traffic days.
You need to deploy the model as an Azure Machine Learning real-time web service endpoint on compute that can dynamically scale up and down to support demand.
Which deployment compute option should you use?
- attached Azure Databricks cluster
- Azure Container Instance (ACI)
- Azure Kubernetes Service (AKS) inference cluster
- Azure Machine Learning Compute Instance
- attached virtual machine in a different region
Explanation:
An Azure Kubernetes Service (AKS) inference cluster is the compute option intended for high-scale production real-time deployments, and it supports autoscaling of the deployed web service. An Azure Machine Learning compute cluster is a managed-compute infrastructure that scales up automatically when a job is submitted, but it is used for training and batch inference rather than for real-time endpoints.
-
You are a data scientist working for a bank and have used Azure ML to train and register a machine learning model that predicts whether a customer is likely to repay a loan.
You want to understand how your model is making selections and must be sure that the model does not violate government regulations such as denying loans based on where an applicant lives.
You need to determine the extent to which each feature in the customer data is influencing predictions.
What should you do?
- Enable data drift monitoring for the model and its training dataset.
- Score the model against some test data with known label values and use the results to calculate a confusion matrix.
- Use the Hyperdrive library to test the model with multiple hyperparameter values.
- Use the interpretability package to generate an explainer for the model.
- Add tags to the model registration indicating the names of the features in the training dataset.
Explanation:
When you compute model explanations and visualize them, you’re not limited to an existing model explanation for an automated ML model. You can also get an explanation for your model with different test data. The steps in this section show you how to compute and visualize engineered feature importance based on your test data.
Incorrect Answers:
A: In the context of machine learning, data drift is the change in model input data that leads to model performance degradation. It is one of the top reasons why model accuracy degrades over time, so monitoring data drift helps detect model performance issues.
B: A confusion matrix is used to describe the performance of a classification model. Each row displays the instances of the true, or actual class in your dataset, and each column represents the instances of the class that was predicted by the model.
C: Hyperparameters are adjustable parameters you choose for model training that guide the training process. The HyperDrive package helps you automate choosing these parameters.
-
HOTSPOT
You write code to retrieve an experiment that is run from your Azure Machine Learning workspace.
The run used the model interpretation support in Azure Machine Learning to generate and upload a model explanation.
Business managers in your organization want to see the importance of the features in the model.
You need to print out the model features and their relative importance in an output that looks similar to the following.
[Exhibit image: Q19 169]
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation:
Box 1: from_run_id
from_run_id(workspace, experiment_name, run_id)
Create the client with factory method given a run ID.
Returns an instance of the ExplanationClient.
Parameters
– workspace Workspace – An object that represents a workspace.
– experiment_name str – The name of an experiment.
– run_id str – A GUID that represents a run.
Box 2: list_model_explanations
list_model_explanations returns a dictionary of metadata for all model explanations available.
Returns
A dictionary of explanation metadata such as id, data type, explanation method, model type, and upload time, sorted by upload time.
Box 3: explanation
-
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You train a classification model by using a logistic regression algorithm.
You must be able to explain the model’s predictions by calculating the importance of each feature, both as an overall global relative importance value and as a measure of local importance for a specific set of predictions.
You need to create an explainer that you can use to retrieve the required global and local feature importance values.
Solution: Create a MimicExplainer.
Does the solution meet the goal?
- Yes
- No
Explanation:
A MimicExplainer can return both global and local feature importance values, so the solution meets the goal. A Permutation Feature Importance (PFI) explainer would not, because it does not explain individual predictions.
Note 1: Mimic explainer is based on the idea of training global surrogate models to mimic black box models. A global surrogate model is an intrinsically interpretable model that is trained to approximate the predictions of any black box model as accurately as possible. Data scientists can interpret the surrogate model to draw conclusions about the black box model.
Note 2: Permutation Feature Importance Explainer (PFI): Permutation Feature Importance is a technique used to explain classification and regression models. At a high level, the way it works is by randomly shuffling data one feature at a time for the entire dataset and calculating how much the performance metric of interest changes. The larger the change, the more important that feature is. PFI can explain the overall behavior of any underlying model but does not explain individual predictions.
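The permutation technique in Note 2 can be sketched from scratch with the standard library. This is an illustrative implementation, not the Azure ML interpretability package; the function names, the toy data, and the rule-based stand-in model are hypothetical, and accuracy is the assumed performance metric.

```python
import random
from typing import Callable, List, Sequence


def accuracy(predict: Callable[[Sequence[float]], int],
             rows: List[List[float]], labels: List[int]) -> float:
    """Fraction of rows for which the prediction matches the label."""
    predictions = [predict(row) for row in rows]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def permutation_importance(predict: Callable[[Sequence[float]], int],
                           rows: List[List[float]], labels: List[int],
                           repeats: int = 5, seed: int = 0) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for feature in range(len(rows[0])):
        drops = []
        for _ in range(repeats):
            column = [row[feature] for row in rows]
            rng.shuffle(column)
            # Rebuild the dataset with only this feature's values permuted.
            shuffled = [
                row[:feature] + [value] + row[feature + 1:]
                for row, value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


def toy_model(row: Sequence[float]) -> int:
    """Stand-in model that looks only at feature 0."""
    return int(row[0] >= 10)


# Hypothetical data: the label depends only on feature 0; feature 1 is noise.
rows = [[float(i), float(i % 3)] for i in range(20)]
labels = [int(row[0] >= 10) for row in rows]

scores = permutation_importance(toy_model, rows, labels)
print(scores)
```

Shuffling feature 1 never changes a prediction, so its importance is zero, while shuffling feature 0 lowers accuracy. The result is one global score per feature, which is why this technique alone does not give local importance for individual predictions.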