End-to-End Data Science with IBM Watson Studio and Auto AI


YouTube video ID: 3DElV51FqL8

Source: YouTube video by IBM Developer


Pawa Siddiki and Khalil Faraj, Developer Advocates at IBM, presented a workshop that walked through the data science lifecycle and a hands-on example using IBM Watson Studio. The agenda covered the stages of data science, distinctions among AI/ML/DL, types of data analysis, data preparation and exploration, the data science pipeline, model types, Auto AI, and a practical customer churn prediction exercise. Housekeeping noted that an IBM Cloud account is required for the hands-on portion and that guides and assets were provided.

Understanding AI, ML, and Deep Learning

Artificial Intelligence (AI) was defined as any technique that enables computers to mimic human intelligence or aid human decision-making. Machine Learning (ML) was described as a subset of AI in which computers learn from data. Deep Learning (DL) was framed as a subset of ML that uses multiple layers of neural networks to emulate aspects of the human brain. Data science ties these concepts to practical business challenges: "Data science it's basically the study of data you analyze it you get these statistics you identify patterns and then you apply them on business challenges that you focus on."

Types of Data Analysis

Four types of analysis were outlined: descriptive, diagnostic, predictive, and prescriptive. Descriptive analysis answers what happened and is exemplified by dashboards. Diagnostic analysis explains why something happened, such as analyzing a social media campaign. Predictive analysis forecasts what will happen, and prescriptive analysis recommends actions, for example through recommendation engines.

Type | Purpose | Example
Descriptive | What happened | Dashboards
Diagnostic | Why it happened | Campaign analysis
Predictive | What will happen | Forecasting future sales
Prescriptive | What to do | Recommendation engines

Data Scientist's Role

A data scientist was characterized as someone who wears many hats and does not deal only with machine learning models. Responsibilities span understanding domain problems, preparing data, and selecting or building models. The role blends domain knowledge with technical skills to produce reliable models and sound assumptions about the future.

Types of Data

Data types were categorized as structured, semi-structured, and unstructured. Each type requires different handling during preparation and analysis. Understanding the data type is a prerequisite to choosing appropriate cleaning, transformation, and modeling techniques.

Data Preparation

Data preparation steps included cleaning, transformation, and enrichment. Data cleaning removes bad formats, handles missing data, eliminates useless variables, and corrects wrong values. Data transformation changes formats and column types (for example, integer, string, boolean), and derived variables can be created from existing fields (such as age from an ID). Normalization, handling inconsistent spellings or nicknames, and feature value rescaling were highlighted. Data enrichment refers to looking up and adding information, for instance deriving age from a profile record.
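The cleaning and transformation steps described above can be sketched in pandas. The frame and column names below are made up for illustration; this is not the workshop dataset:

```python
import pandas as pd

# Hypothetical raw customer records (not the workshop's customer_churn.csv)
df = pd.DataFrame({
    "name": ["Alice", "alice ", "Bob", None],
    "age": ["34", "34", "29", "41"],
    "plan": ["gold", "Gold", "silver", "gold"],
})

# Cleaning: handle missing data and trim stray whitespace
df = df.dropna()
df["name"] = df["name"].str.strip()

# Transformation: fix column types and normalize inconsistent spellings
df["age"] = df["age"].astype(int)
df["plan"] = df["plan"].str.lower()

# Feature value rescaling: a simple min-max normalized age column
df["age_scaled"] = (df["age"] - df["age"].min()) / (df["age"].max() - df["age"].min())
```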

Data Exploration

Data exploration relies on visualization to surface insights. Typical visualizations described were bar charts, line charts, scatter plots, bubble charts, and pie charts. Visual exploration supports hypothesis generation and guides feature selection for modeling.

Data Science Pipeline

The pipeline was presented as an iterative sequence: raw data → processing (cleaning, transformation) → exploration → model design → learning → verification → update/improve → deployment. The process repeats until the model is satisfactory. Domain knowledge and expert knowledge both play roles: domain knowledge informs the use case, while expert knowledge helps build reliable models. The stages map to common tasks such as Data Cleaning, Exploratory Data Analysis (EDA), and ML/Data Modeling, culminating in model deployment.
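The processing → learning → verification loop above can be mirrored with a scikit-learn `Pipeline`. This is a minimal sketch on synthetic data, not the workshop's Watson Studio flow:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for "raw data" (the workshop used customer_churn.csv)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Processing and learning chained as one reproducible pipeline
pipe = Pipeline([
    ("scale", StandardScaler()),      # transformation step
    ("model", LogisticRegression()),  # learning step
])
pipe.fit(X_train, y_train)

# Verification: hold-out accuracy; if unsatisfactory, update and repeat
score = pipe.score(X_test, y_test)
```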

Machine Learning Models

Machine learning model categories were outlined as unsupervised, supervised, and reinforcement learning. Unsupervised learning focuses on clustering—grouping data based on features. Supervised learning includes classification and regression, targeting categorical and continuous outcomes respectively. Reinforcement learning involves agents, environments, rewards, and observations and is based on actions and feedback.

Category | Focus | Example Task
Unsupervised | Clustering | Grouping similar customers
Supervised | Classification / Regression | Predict churn / predict price
Reinforcement | Agent-based learning | Reward-driven policies

Supervised Learning Deep Dive

Classification was described as predicting a categorical or qualitative target label, while regression predicts a continuous numerical target. Examples included predicting a class label such as "apple" or "cupcake" for classification, and predicting a numerical price for regression. Feature variables and target labels are central to supervised learning workflows.
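The classification/regression distinction can be illustrated with scikit-learn; the tiny datasets and feature meanings below are invented for the example:

```python
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Classification: categorical target (0 = "apple", 1 = "cupcake")
# Hypothetical features: weight in grams, sugar-coated flag
X_cls = [[150, 0], [160, 0], [90, 1], [85, 1]]
y_cls = [0, 0, 1, 1]
clf = DecisionTreeClassifier().fit(X_cls, y_cls)
label = clf.predict([[155, 0]])[0]   # predicts the "apple" class (0)

# Regression: continuous numerical target (a price), feature = size
X_reg = [[1], [2], [3], [4]]
y_reg = [10.0, 20.0, 30.0, 40.0]
reg = LinearRegression().fit(X_reg, y_reg)
price = reg.predict([[5]])[0]        # a continuous value, ~50.0 here
```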

Unsupervised Learning Deep Dive

Clustering was presented as grouping records based on features without labeled outcomes. Once clusters are defined, unseen data can be placed into existing clusters for segmentation or downstream decisions.
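This group-then-assign idea can be sketched with k-means clustering (synthetic points standing in for customer features):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious groups of hypothetical customer features
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
              [8.0, 8.2], [7.9, 8.1], [8.1, 7.9]])

# Fit two clusters without any labeled outcomes
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Unseen records are placed into the nearest existing cluster
a = km.predict([[8.0, 8.0]])[0]
b = km.predict([[1.0, 1.0]])[0]
# a and b fall into different clusters
```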

Machine Learning vs. Deep Learning

Deep learning was characterized by multiple hidden layers that learn features automatically; it is computationally intensive and time-consuming. Traditional machine learning requires feature engineering and often has models available via APIs. The trade-off centers on automated feature learning in deep learning versus manual feature engineering in traditional ML.

Auto AI

Auto AI was introduced as automated artificial intelligence built on an AutoML framework. The outlined steps are: upload data, prepare data, select the modeling task, perform hyperparameter optimization, and apply feature engineering. Benefits emphasized include faster model building, bridging the skills gap, discovering use cases, and rapid deployment. It was noted that Auto AI visualizes how it generates pipelines and can build models "without any line of code basically you want to do it that easy."

Details included the Auto AI runtime: typical experiment duration is 2–4 minutes, and for each selected algorithm Auto AI generates four pipelines (a base model, two hyperparameter optimization variants, and a feature engineering variant). Quotes emphasized automation of the time-consuming aspects: "Hyper parameter optimization feature engineering are most are the most time consuming aspects in a data science pipeline and this automation with this automation you can actually rapidly develop them."

Workshop Use Case: Customer Churn Prediction

The practical use case was predicting which customers are likely to stop using a service. Tools used in the demonstration were IBM Watson Studio, Data Refinery, and Auto AI, with a customer_churn.csv dataset. The workshop showed an end-to-end flow from data ingestion and refining to Auto AI experiments and deployment.

Hands-on Session (Khalil Faraj)

Khalil demonstrated setting up IBM Cloud resources needed for the hands-on: Watson Studio, Cloud Object Storage, and a Machine Learning instance. In Watson Studio, a project was created and the customer_churn.csv file uploaded and previewed. Data Refinery was used to refine the churn column: converting boolean strings to integers via conditional replace (True/False → 1/0) and saving the refined dataset as a job.
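The Data Refinery conditional replace (True/False → 1/0) has a one-line pandas equivalent, sketched here on a made-up frame rather than the actual customer_churn.csv:

```python
import pandas as pd

# Stand-in for customer_churn.csv, where churn arrives as boolean strings
df = pd.DataFrame({"churn": ["True", "False", "True", "False"]})

# Conditional replace: "True"/"False" strings become 1/0 integers
df["churn"] = df["churn"].map({"True": 1, "False": 0}).astype(int)
```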

An Auto AI experiment was created and associated with the Machine Learning service instance. The cleaned data was uploaded, the churn column selected as the target, and Auto AI auto-parsed the features. Settings included binary classification with accuracy as the metric. Auto AI generated model pipelines for selected algorithms such as LGBM Classifier and XGB Classifier; pipelines included base models, hyperparameter optimization, and feature engineering variants. The run produced progress and relationship maps, and the results were analyzed for accuracy, other metrics, and feature importance. The best performing model was saved.

Model deployment involved promoting the saved model to a deployment space and creating an online deployment. The deployment was tested via the Watson Studio interface and the API reference. An API key was generated through IBM Cloud IAM for application integration. The workshop demonstrated a test where a male customer with specified attributes was predicted as 'False' for churn with a confidence score of 0.79.
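Calling an online deployment from an application follows the general Watson Machine Learning v4 scoring shape (a `fields`/`values` payload posted with an IAM bearer token). The field names and values below are hypothetical; the real field list comes from the deployment's API reference page:

```python
import json

# Hypothetical feature fields for the churn model (illustrative only)
fields = ["gender", "age", "est_income", "usage"]
values = [["M", 35, 40000.0, 120]]

# General shape of a Watson Machine Learning v4 scoring payload
payload = {"input_data": [{"fields": fields, "values": values}]}

def score(deployment_url, iam_token):
    """Sketch of calling the online deployment (requires `requests`).

    The token would be obtained from IBM Cloud IAM using the generated
    API key; the network call is not executed here.
    """
    import requests
    headers = {"Authorization": f"Bearer {iam_token}",
               "Content-Type": "application/json"}
    return requests.post(deployment_url, headers=headers, json=payload).json()

body = json.dumps(payload)
```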

Practical Summary of Steps Demonstrated

The workshop sequence as shown in the hands-on session consolidated into these steps: set up IBM Cloud resources, create a Watson Studio project, upload and preview data, refine data in Data Refinery, create and run an Auto AI experiment, analyze and save the best model, promote and create an online deployment, and test the deployed model via the interface or API. Presenters summarized: "So it was a very simple workshop what we did was loaded our data onto watson studio used data refined we prepared the data created an auto ai experiment we ran analyze the auto ai job selected the best model deployed the model to watson studio and we made our prediction."

Takeaways

  • AI, ML, and Deep Learning are nested concepts where data science applies them to analyze data and solve business challenges.
  • Data preparation—cleaning, transformation, derived variables, normalization, and enrichment—is essential before modeling.
  • The data science pipeline is iterative: raw data to processing, exploration, model design, learning, verification, and deployment.
  • Auto AI automates model generation, hyperparameter optimization, and feature engineering, producing multiple pipelines per algorithm.
  • The hands-on workflow demonstrated loading data, refining it in Data Refinery, running Auto AI experiments, saving the best model, deploying it, and testing via API.

