Last Updated on November 4, 2022 by InfraExam

DP-100 : Designing and Implementing a Data Science Solution on Azure : Part 07

  1. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

    After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn. The script includes code to load a training data file which is also located in the scripts folder.

    You must run the script as an Azure ML experiment on a compute cluster named aml-compute.

    You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

    Solution: Run the following code:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q01 118

    Does the solution meet the goal?

    • Yes
    • No

    Explanation:

    The scikit-learn estimator provides a simple way of launching a scikit-learn training job on a compute target. It is implemented through the SKLearn class, which can be used to support single-node CPU training.

    Example:
    from azureml.train.sklearn import SKLearn

    estimator = SKLearn(source_directory=project_folder,
                        compute_target=compute_target,
                        entry_script='train_iris.py')
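
    The estimator can then be submitted against the cluster. A minimal sketch, assuming ws references the workspace; the experiment name is hypothetical:

    from azureml.core import Experiment

    # submit the estimator configuration as an experiment run
    experiment = Experiment(workspace=ws, name='train-regression')  # hypothetical name
    run = experiment.submit(config=estimator)
    run.wait_for_completion(show_output=True)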

  2. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

    After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn. The script includes code to load a training data file which is also located in the scripts folder.

    You must run the script as an Azure ML experiment on a compute cluster named aml-compute.

    You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

    Solution: Run the following code:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q02 119

    Does the solution meet the goal?

    • Yes
    • No

    Explanation:

    The scikit-learn estimator provides a simple way of launching a scikit-learn training job on a compute target. It is implemented through the SKLearn class, which can be used to support single-node CPU training.

    Example:
    from azureml.train.sklearn import SKLearn

    estimator = SKLearn(source_directory=project_folder,
                        compute_target=compute_target,
                        entry_script='train_iris.py')
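
    Note that the environment requirement can also be met without the SKLearn class, by using the generic Estimator and requesting scikit-learn explicitly. A sketch, assuming aml_compute references the target cluster:

    from azureml.train.estimator import Estimator

    # the generic estimator does not pre-install scikit-learn,
    # so the package must be listed explicitly
    estimator = Estimator(source_directory='scripts',
                          entry_script='train.py',
                          compute_target=aml_compute,
                          conda_packages=['scikit-learn'])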

  3. DRAG DROP

    You create machine learning models by using Azure Machine Learning.

    You plan to train and score models by using a variety of compute contexts. You also plan to create a new compute resource in Azure Machine Learning studio.

    You need to select the appropriate compute types.

    Which compute types should you select? To answer, drag the appropriate compute types to the correct requirements. Each compute type may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q03 120 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q03 120 Answer
    Explanation:

    Box 1: Compute cluster
    Create a single or multi node compute cluster for your training, batch inferencing or reinforcement learning workloads.

    Box 2: Inference cluster

    Box 3: Attached compute
    The compute types that can currently be attached for training include:

    A remote VM
    Azure Databricks (for use in machine learning pipelines)
    Azure Data Lake Analytics (for use in machine learning pipelines)
    Azure HDInsight

    Box 4: Compute cluster

    Note: There are four compute types:
    Compute instance
    Compute clusters
    Inference clusters
    Attached compute

    Note 2:
    Compute clusters
    Create a single or multi node compute cluster for your training, batch inferencing or reinforcement learning workloads.

    Attached compute
    To use compute targets created outside the Azure Machine Learning workspace, you must attach them. Attaching a compute target makes it available to your workspace. Use Attached compute to attach a compute target for training. Use Inference clusters to attach an AKS cluster for inferencing.

    Inference clusters
    Create or attach an Azure Kubernetes Service (AKS) cluster for large scale inferencing.
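
    For illustration, a compute cluster can be provisioned from the SDK. A sketch; the VM size and node counts are assumptions:

    from azureml.core.compute import AmlCompute, ComputeTarget

    # provision a single- or multi-node cluster for training workloads
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS3_V2',
                                                           min_nodes=0,
                                                           max_nodes=4)
    aml_compute = ComputeTarget.create(ws, 'aml-compute', compute_config)
    aml_compute.wait_for_completion(show_output=True)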

  4. DRAG DROP

    You are building an experiment using the Azure Machine Learning designer.

    You split a dataset into training and testing sets. You select the Two-Class Boosted Decision Tree as the algorithm.

    You need to determine the Area Under the Curve (AUC) of the model.

    Which three modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q04 121 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q04 121 Answer
    Explanation:

    Step 1: Train Model
    Two-Class Boosted Decision Tree
    First, set up the boosted decision tree model.

    1. Find the Two-Class Boosted Decision Tree module in the module palette and drag it onto the canvas.
    2. Find the Train Model module, drag it onto the canvas, and then connect the output of the Two-Class Boosted Decision Tree module to the left input port of the Train Model module.
    The Two-Class Boosted Decision Tree module initializes the generic model, and Train Model uses training data to train the model.
    3. Connect the left output of the left Execute R Script module to the right input port of the Train Model module (in this tutorial you used the data coming from the left side of the Split Data module for training).

    This portion of the experiment now looks something like this:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q04 122

    Step 2: Score Model
    Score and evaluate the models
    You use the testing data that was separated out by the Split Data module to score the trained models. You can then compare the results of the two models to see which generated better results.

    Add the Score Model modules
    1. Find the Score Model module and drag it onto the canvas.
    2. Connect the Train Model module that’s connected to the Two-Class Boosted Decision Tree module to the left input port of the Score Model module.
    3. Connect the right Execute R Script module (the testing data) to the right input port of the Score Model module.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q04 123

    Step 3: Evaluate Model
    To evaluate the two scoring results and compare them, you use an Evaluate Model module.

    1. Find the Evaluate Model module and drag it onto the canvas.
    2. Connect the output port of the Score Model module associated with the boosted decision tree model to the left input port of the Evaluate Model module.
    3. Connect the other Score Model module to the right input port.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q04 124

  5. You create a multi-class image classification deep learning model that uses a set of labeled images. You create a script file named train.py that uses the PyTorch 1.3 framework to train the model.

    You must run the script by using an estimator. The code must not require any additional Python libraries to be installed in the environment for the estimator. The time required for model training must be minimized.

    You need to define the estimator that will be used to run the script.

    Which estimator type should you use?

    • TensorFlow
    • PyTorch
    • SKLearn
    • Estimator

    Explanation:
    For PyTorch, TensorFlow and Chainer tasks, Azure Machine Learning provides respective PyTorch, TensorFlow, and Chainer estimators to simplify using these frameworks.
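
    A minimal sketch of such a PyTorch estimator; the use_gpu flag, which helps minimize training time, is an assumption about the available compute:

    from azureml.train.dnn import PyTorch

    # the PyTorch estimator pre-installs the framework, so the script
    # needs no additional Python libraries in the environment
    estimator = PyTorch(source_directory='scripts',
                        entry_script='train.py',
                        compute_target=compute_target,
                        framework_version='1.3',
                        use_gpu=True)
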
  6. You create a pipeline in designer to train a model that predicts automobile prices.

    Because of non-linear relationships in the data, the pipeline calculates the natural log (Ln) of the prices in the training data, trains a model to predict this natural log of price value, and then calculates the exponential of the scored label to get the predicted price.

    The training pipeline is shown in the exhibit. (Click the Training pipeline tab.)

    Training pipeline

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q06 125

    You create a real-time inference pipeline from the training pipeline, as shown in the exhibit. (Click the Real-time pipeline tab.)

    Real-time pipeline

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q06 126

    You need to modify the inference pipeline to ensure that the web service returns the exponential of the scored label as the predicted automobile price and that client applications are not required to include a price value in the input values.

    Which three modifications must you make to the inference pipeline? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    • Connect the output of the Apply SQL Transformation to the Web Service Output module.
    • Replace the Web Service Input module with a data input that does not include the price column.
    • Add a Select Columns module before the Score Model module to select all columns other than price.
    • Replace the training dataset module with a data input that does not include the price column.
    • Remove the Apply Math Operation module that replaces price with its natural log from the data flow.
    • Remove the Apply SQL Transformation module from the data flow.
  7. HOTSPOT

    You register the following versions of a model.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q07 127

    You use the Azure ML Python SDK to run a training experiment. You use a variable named run to reference the experiment run.

    After the run has been submitted and completed, you run the following code:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q07 128

    For each of the following statements, select Yes if the statement is true. Otherwise, select No.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q07 129 Question

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q07 129 Answer
  8. You are creating a classification model for a banking company to identify possible instances of credit card fraud. You plan to create the model in Azure Machine Learning by using automated machine learning.

    The training dataset that you are using is highly unbalanced.

    You need to evaluate the classification model.

    Which primary metric should you use?

    • normalized_mean_absolute_error
    • AUC_weighted
    • accuracy
    • normalized_root_mean_squared_error
    • spearman_correlation

    Explanation:

    AUC_weighted is a Classification metric.

    Note: AUC is the Area under the Receiver Operating Characteristic Curve. Weighted is the arithmetic mean of the score for each class, weighted by the number of true instances in each class.

    Incorrect Answers:
    A: normalized_mean_absolute_error is a regression metric, not a classification metric.

    C: When comparing approaches to imbalanced classification problems, consider using metrics beyond accuracy such as recall, precision, and AUROC. It may be that switching the metric you optimize for during parameter selection or model selection is enough to provide desirable performance detecting the minority class.

    D: normalized_root_mean_squared_error is a regression metric, not a classification metric.
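
    For illustration, the primary metric is set on the AutoMLConfig object. A sketch; the dataset variable and label column name are hypothetical:

    from azureml.train.automl import AutoMLConfig

    automl_config = AutoMLConfig(task='classification',
                                 primary_metric='AUC_weighted',
                                 training_data=train_data,       # hypothetical dataset
                                 label_column_name='is_fraud')   # hypothetical label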

  9. You create a machine learning model by using the Azure Machine Learning designer. You publish the model as a real-time service on an Azure Kubernetes Service (AKS) inference compute cluster. You make no changes to the deployed endpoint configuration.

    You need to provide application developers with the information they need to consume the endpoint.

    Which two values should you provide to application developers? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    • The name of the AKS cluster where the endpoint is hosted.
    • The name of the inference pipeline for the endpoint.
    • The URL of the endpoint.
    • The run ID of the inference pipeline experiment for the endpoint.
    • The key for the endpoint.

    Explanation:

    Deploying an Azure Machine Learning model as a web service creates a REST API endpoint. You can send data to this endpoint and receive the prediction returned by the model.

    You create a web service when you deploy a model to your local environment, Azure Container Instances, Azure Kubernetes Service, or field-programmable gate arrays (FPGA). You retrieve the URI used to access the web service by using the Azure Machine Learning SDK. If authentication is enabled, you can also use the SDK to get the authentication keys or tokens.

    Example:
    # URL for the web service
    scoring_uri = '<your web service URI>'
    # If the service is authenticated, set the key or token
    key = '<your key or token>'
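
    A client application can then call the endpoint with the URL and key. A sketch; the input payload shape depends on the deployed model:

    import json
    import requests

    headers = {'Content-Type': 'application/json',
               'Authorization': 'Bearer ' + key}
    input_data = json.dumps({'data': [[1.0, 2.0, 3.0]]})  # hypothetical feature values
    response = requests.post(scoring_uri, data=input_data, headers=headers)
    print(response.json())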

  10. HOTSPOT

    You collect data from a nearby weather station. You have a pandas dataframe named weather_df that includes the following data:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q10 130

    The data is collected every 12 hours: noon and midnight.

    You plan to use automated machine learning to create a time-series model that predicts temperature over the next seven days. For the initial round of training, you want to train a maximum of 50 different models.

    You must use the Azure Machine Learning SDK to run an automated machine learning experiment to train these models.

    You need to configure the automated machine learning run.

    How should you complete the AutoMLConfig definition? To answer, select the appropriate options in the answer area.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q10 131 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q10 131 Answer
    Explanation:

    Box 1: forecasting
    Task: The type of task to run. Values can be ‘classification’, ‘regression’, or ‘forecasting’ depending on the type of automated ML problem to solve.

    Box 2: temperature
    The label column is the value that the model is trained to predict; here, the model predicts temperature. The training data should contain both the training features and the label column.

    Box 3: observation_time
    time_column_name: The name of the time column. This parameter is required when forecasting to specify the datetime column in the input data used for building the time series and inferring its frequency. This setting is being deprecated. Please use forecasting_parameters instead.

    Box 4: 14
    “predicts temperature over the next seven days”
    The data is collected every 12 hours, so a seven-day horizon spans 14 periods of the series frequency.

    max_horizon: The desired maximum forecast horizon in units of time-series frequency. The default value is 1.

    Units are based on the time interval of your training data (for example, monthly or weekly) that the forecaster should predict out. When the task type is forecasting, this parameter is required.

    Box 5: 50
    “For the initial round of training, you want to train a maximum of 50 different models.”

    Iterations: The total number of different algorithm and parameter combinations to test during an automated ML experiment.
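
    Putting the boxes together, a sketch of the completed definition; it assumes weather_df is available as training data and uses the direct time_column_name parameter, which is deprecated in favor of forecasting_parameters:

    from azureml.train.automl import AutoMLConfig

    automl_config = AutoMLConfig(task='forecasting',
                                 training_data=weather_df,
                                 label_column_name='temperature',
                                 time_column_name='observation_time',
                                 max_horizon=14,  # seven days at a 12-hour frequency
                                 iterations=50)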

  11. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

    After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You create a model to forecast weather conditions based on historical data.

    You need to create a pipeline that runs a processing script to load data from a datastore and pass the processed data to a machine learning model training script.

    Solution: Run the following code:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q11 132

    Does the solution meet the goal?

    • Yes
    • No

    Explanation:

    The two steps are present: process_step and train_step
    Data_input correctly references the data in the data store.

    Note:
    Data used in pipeline can be produced by one step and consumed in another step by providing a PipelineData object as an output of one step and an input of one or more subsequent steps.

    PipelineData objects are also used when constructing Pipelines to describe step dependencies. To specify that a step requires the output of another step as input, use a PipelineData object in the constructor of both steps.

    For example, the pipeline train step depends on the process_step_output output of the pipeline process step:

    from azureml.pipeline.core import Pipeline, PipelineData
    from azureml.pipeline.steps import PythonScriptStep

    datastore = ws.get_default_datastore()
    process_step_output = PipelineData("processed_data", datastore=datastore)
    process_step = PythonScriptStep(script_name="process.py",
                                    arguments=["--data_for_train", process_step_output],
                                    outputs=[process_step_output],
                                    compute_target=aml_compute,
                                    source_directory=process_directory)
    train_step = PythonScriptStep(script_name="train.py",
                                  arguments=["--data_for_train", process_step_output],
                                  inputs=[process_step_output],
                                  compute_target=aml_compute,
                                  source_directory=train_directory)

    pipeline = Pipeline(workspace=ws, steps=[process_step, train_step])
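
    The pipeline can then be submitted as an experiment run. A sketch; the experiment name is hypothetical:

    from azureml.core import Experiment

    pipeline_run = Experiment(ws, 'forecast-pipeline').submit(pipeline)
    pipeline_run.wait_for_completion(show_output=True)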

  12. You run an experiment that uses an AutoMLConfig class to define an automated machine learning task with a maximum of ten model training iterations. The task will attempt to find the best performing model based on a metric named accuracy.

    You submit the experiment with the following code:

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q12 133

    You need to create Python code that returns the best model that is generated by the automated machine learning task.

    Which code segment should you use?

    • best_model = automl_run.get_details()
    • best_model = automl_run.get_metrics()
    • best_model = automl_run.get_file_names()[1]
    • best_model = automl_run.get_output()[1]

    Explanation:
    The get_output method returns the best run and the fitted model.
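
    Example (a sketch of the idiomatic call):

    # get_output() returns a (best_run, fitted_model) tuple,
    # which is why indexing with [1] yields the trained model
    best_run, fitted_model = automl_run.get_output()
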
  13. You plan to use the Hyperdrive feature of Azure Machine Learning to determine the optimal hyperparameter values when training a model.

    You must use Hyperdrive to try combinations of the following hyperparameter values. You must not apply an early termination policy.

    – learning_rate: any value between 0.001 and 0.1
    – batch_size: 16, 32, or 64

    You need to configure the sampling method for the Hyperdrive experiment.

    Which two sampling methods can you use? Each correct answer is a complete solution.

    NOTE: Each correct selection is worth one point.

    • No sampling
    • Grid sampling
    • Bayesian sampling
    • Random sampling

    Explanation:

    C: Bayesian sampling is based on the Bayesian optimization algorithm and makes intelligent choices on the hyperparameter values to sample next. It picks the sample based on how the previous samples performed, such that the new sample improves the reported primary metric.
    Bayesian sampling does not support any early termination policy.

    Example:
    from azureml.train.hyperdrive import BayesianParameterSampling
    from azureml.train.hyperdrive import uniform, choice

    param_sampling = BayesianParameterSampling({
        "learning_rate": uniform(0.05, 0.1),
        "batch_size": choice(16, 32, 64, 128)
    })

    D: In random sampling, hyperparameter values are randomly selected from the defined search space. Random sampling allows the search space to include both discrete and continuous hyperparameters.
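
    A sketch of random sampling over the two hyperparameters in the question:

    from azureml.train.hyperdrive import RandomParameterSampling
    from azureml.train.hyperdrive import uniform, choice

    param_sampling = RandomParameterSampling({
        "learning_rate": uniform(0.001, 0.1),
        "batch_size": choice(16, 32, 64)
    })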

    Incorrect Answers:
    B: Grid sampling can be used if your hyperparameter space can be defined as a choice among discrete values and if you have sufficient budget to exhaustively search over all values in the defined search space. Additionally, one can use automated early termination of poorly performing runs, which reduces wastage of resources.

    Example: the following space has a total of six samples:
    from azureml.train.hyperdrive import GridParameterSampling
    from azureml.train.hyperdrive import choice

    param_sampling = GridParameterSampling({
        "num_hidden_layers": choice(1, 2, 3),
        "batch_size": choice(16, 32)
    })

  14. You are training machine learning models in Azure Machine Learning. You use Hyperdrive to tune hyperparameters.

    In previous model training and tuning runs, many models showed similar performance.

    You need to select an early termination policy that meets the following requirements:

    – accounts for the performance of all previous runs when evaluating the current run
    – avoids comparing the current run with only the best performing run to date

    Which two early termination policies should you use? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    • Median stopping
    • Bandit
    • Default
    • Truncation selection

    Explanation:

    The Median Stopping policy computes running averages across all runs and cancels runs whose best performance is worse than the median of the running averages.
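
    A sketch of configuring the policy; the evaluation settings are assumptions:

    from azureml.train.hyperdrive import MedianStoppingPolicy

    # evaluate runs at every interval after an initial delay
    early_termination_policy = MedianStoppingPolicy(evaluation_interval=1,
                                                    delay_evaluation=5)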

    If no policy is specified, the hyperparameter tuning service will let all training runs execute to completion.

    Incorrect Answers:
    B: BanditPolicy defines an early termination policy based on slack criteria, and a frequency and delay interval for evaluation.
    The Bandit policy takes the following configuration parameters:

    slack_factor: The amount of slack allowed with respect to the best performing training run. This factor specifies the slack as a ratio.

    D: The Truncation selection policy periodically cancels the given percentage of runs that rank the lowest for their performance on the primary metric. The policy strives for fairness in ranking the runs by accounting for improving model performance with training time. When ranking a relatively young run, the policy uses the corresponding (and earlier) performance of older runs for comparison. Therefore, runs aren’t terminated for having a lower performance because they have run for less time than other runs.

  15. HOTSPOT

    You are hired as a data scientist at a winery. The previous data scientist used Azure Machine Learning.

    You need to review the models and explain how each model makes decisions.

    Which explainer modules should you use? To answer, select the appropriate options in the answer area.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q15 134 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q15 134 Answer
    Explanation:

    Meta explainers automatically select a suitable direct explainer and generate the best explanation info based on the given model and data sets. The meta explainers leverage all the libraries (SHAP, LIME, Mimic, etc.) that we have integrated or developed. The following are the meta explainers available in the SDK:
    Tabular Explainer: Used with tabular datasets.
    Text Explainer: Used with text datasets.
    Image Explainer: Used with image datasets.

    Box 1: Tabular

    Box 2: Text

    Box 3: Image
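
    For illustration, a tabular model can be explained with a sketch like the following; the model and data variables are hypothetical, and the explainer classes ship with the azureml-interpret package:

    from interpret.ext.blackbox import TabularExplainer

    # a meta explainer that selects a suitable direct explainer for tabular data
    explainer = TabularExplainer(model, x_train, features=feature_names)
    global_explanation = explainer.explain_global(x_test)
    print(global_explanation.get_feature_importance_dict())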

    Incorrect Answers:
    Hierarchical Attention Network (HAN)
    HAN was proposed by Yang et al. in 2016. Key features of HAN that differentiate it from existing approaches to document classification are (1) it exploits the hierarchical nature of text data and (2) its attention mechanism is adapted for document classification.

  16. HOTSPOT

    You have a dataset that includes home sales data for a city. The dataset includes the following columns.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q16 135

    Each row in the dataset corresponds to an individual home sales transaction.

    You need to use automated machine learning to generate the best model for predicting the sales price based on the features of the house.

    Which values should you use? To answer, select the appropriate options in the answer area.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q16 136 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q16 136 Answer
    Explanation:

    Box 1: Regression
    Regression is a supervised machine learning technique used to predict numeric values.

    Box 2: Price

  17. You use the Azure Machine Learning SDK in a notebook to run an experiment using a script file in an experiment folder.

    The experiment fails.

    You need to troubleshoot the failed experiment.

    What are two possible ways to achieve this goal? Each correct answer presents a complete solution.

    • Use the get_metrics() method of the run object to retrieve the experiment run logs.
    • Use the get_details_with_logs() method of the run object to display the experiment run logs.
    • View the log files for the experiment run in the experiment folder.
    • View the logs for the experiment run in Azure Machine Learning studio.
    • Use the get_output() method of the run object to retrieve the experiment run logs.

    Explanation:

    Use get_details_with_logs() to fetch the run details and logs created by the run.

    You can monitor Azure Machine Learning runs and view their logs with the Azure Machine Learning studio.
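
    A sketch, assuming run references the failed run:

    # returns the run details together with the content of the log files
    details = run.get_details_with_logs()
    for log_name in details['logFiles']:
        print(log_name)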

    Incorrect Answers:
    A: You can view the metrics of a trained model using run.get_metrics().
    E: get_output() gets the output of the step as PipelineData.

  18. DRAG DROP

    You have an Azure Machine Learning workspace that contains a CPU-based compute cluster and an Azure Kubernetes Service (AKS) inference cluster. You create a tabular dataset containing data that you plan to use to create a classification model.

    You need to use the Azure Machine Learning designer to create a web service through which client applications can consume the classification model by submitting new data and getting an immediate prediction as a response.

    Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q18 137 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q18 137 Answer
    Explanation:

    Step 1: Create and start a Compute Instance
    To train and deploy models using Azure Machine Learning designer, you need compute on which to run the training process, test the model, and host the model in a deployed service.

    There are four kinds of compute resource you can create:
    Compute Instances: Development workstations that data scientists can use to work with data and models.
    Compute Clusters: Scalable clusters of virtual machines for on-demand processing of experiment code.
    Inference Clusters: Deployment targets for predictive services that use your trained models.
    Attached Compute: Links to existing Azure compute resources, such as Virtual Machines or Azure Databricks clusters.

    Step 2: Create and run a training pipeline.
    After you’ve used data transformations to prepare the data, you can use it to train a machine learning model.

    Step 3: Create and run a real-time inference pipeline
    After creating and running a pipeline to train the model, you need a second pipeline that performs the same data transformations for new data, and then uses the trained model to inference (in other words, predict) label values based on its features. This pipeline will form the basis for a predictive service that you can publish for applications to use.

  19. You use the Two-Class Neural Network module in Azure Machine Learning Studio to build a binary classification model. You use the Tune Model Hyperparameters module to tune accuracy for the model.

    You need to configure the Tune Model Hyperparameters module.

    Which two values should you use? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    • Number of hidden nodes
    • Learning Rate
    • The type of the normalizer
    • Number of learning iterations
    • Hidden layer specification

    Explanation:

    D: For Number of learning iterations, specify the maximum number of times the algorithm should process the training cases.

    E: For Hidden layer specification, select the type of network architecture to create.
    Between the input and output layers you can insert multiple hidden layers. Most predictive tasks can be accomplished easily with only one or a few hidden layers.

  20. HOTSPOT

    You are running a training experiment on remote compute in Azure Machine Learning.

    The experiment is configured to use a conda environment that includes the mlflow and azureml-contrib-run packages.

    You must use MLflow as the logging package for tracking metrics generated in the experiment.

    You need to complete the script for the experiment.

    How should you complete the code? To answer, select the appropriate options in the answer area.

    NOTE: Each correct selection is worth one point.

    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q20 138 Question
    DP-100 Designing and Implementing a Data Science Solution on Azure Part 07 Q20 138 Answer
    Explanation:

    Box 1: import mlflow
    Import the mlflow and Workspace classes to access MLflow’s tracking URI and configure your workspace.

    Box 2: mlflow.start_run()
    Set the MLflow experiment name with set_experiment() and start your training run with start_run().

    Box 3: mlflow.log_metric('..')
    Use log_metric() to activate the MLflow logging API and begin logging your training run metrics.

    Box 4: mlflow.end_run()
    Close the run:
    mlflow.end_run()
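
    Assembled, the completed script looks something like this sketch; the metric name and value are illustrative:

    import mlflow

    mlflow.start_run()
    mlflow.log_metric('accuracy', 0.91)  # hypothetical metric
    mlflow.end_run()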