Comprehensive Guide to Importing, Cleaning, and Analyzing Data in R

Name: Day 2: Research Methods and Statistics Training
Uploaded: 2026-01-16T09:43:47.813954+00:00
Channel: RUFORUMNetwork

YouTube video ID: Ic9PU70slN4

Source: YouTube video by RUFORUMNetwork — Watch original video

Recap of the First Session

Installed R (v4.4.0) and RStudio.
Updated both the R version and RStudio interface.
Loaded essential packages (e.g., tidyverse, readxl).
Learned basic syntax: variable assignment (<- or =), case‑sensitivity, using R as a calculator, and the importance of specifying measurement levels (categorical vs. numeric).

Setting Up Your Workspace

Folder Structure – Keep a dedicated folder (e.g., R_Training/Day1) on Desktop or Documents containing all scripts and data files.
Working Directory – In RStudio: Session → Set Working Directory → Choose Directory. This tells R where to look for files and avoids path errors.
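Cleaning Up – Click the broom icon in the Console pane to clear the console. To remove all objects and start with a clean environment, click the broom icon in the Environment pane or run rm(list = ls()).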
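
The working-directory step can also be scripted rather than clicked. The following is a minimal sketch, assuming a hypothetical project folder created under tempdir() so that it runs on any machine; in practice you would substitute your own path, such as the R_Training/Day1 folder mentioned above.

```r
# Hypothetical project folder; replace with your own path,
# e.g. "~/Desktop/R_Training/Day1"
project_dir <- file.path(tempdir(), "R_Training", "Day1")
dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)

# setwd() changes the working directory and returns the previous one
old_dir <- setwd(project_dir)

getwd()       # confirm where R will now look for files
list.files()  # list the files R can see in this folder
```

Keeping the setwd() line at the top of a saved script means the menu steps do not have to be repeated in the next session; old_dir holds the previous location in case you want to switch back with setwd(old_dir).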

Importing Data

1. Excel Files (`.xlsx`)

library(readxl)
my_data <- read_excel("file.xlsx")

RStudio also offers a GUI route: File → Import Dataset → From Excel, then Browse to the file and click Import.
After import, R shows the number of observations and variables in the Environment pane.
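
To try the Excel import without the course file, one option is the example workbook that ships with readxl. This is a sketch under that assumption; the workbook and its sheets are readxl's own demo data, not the financial literacy data set used in the session.

```r
library(readxl)

# Path to a demo workbook bundled with the readxl package
path <- readxl_example("datasets.xlsx")

excel_sheets(path)                      # list the sheet names in the workbook
my_data <- read_excel(path, sheet = 1)  # read the first sheet

dim(my_data)  # rows (observations) and columns (variables)
```

The two numbers returned by dim() correspond to the observations and variables that RStudio reports in the Environment pane.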

2. CSV Files (`.csv`)

my_data <- read.csv("file.csv", stringsAsFactors = FALSE)

Ensure the working directory points to the folder containing the CSV.
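stringsAsFactors = FALSE keeps character columns as characters until you explicitly convert them. This has been the default since R 4.0.0, so the argument is optional in current versions but harmless to state.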
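
A quick way to check the CSV workflow end to end is to write a small file and read it back. The sketch below assumes made-up toy values and illustrative column names; only the file name file.csv follows the example above.

```r
# Toy data standing in for the course file; values are made up
toy <- data.frame(
  quiz_score    = c(12, 7, 18, 15),
  parent_school = c("Yes", "No", "Yes", "No")
)

# Write the CSV to a temporary folder so the example runs anywhere
csv_path <- file.path(tempdir(), "file.csv")
write.csv(toy, csv_path, row.names = FALSE)  # no extra row-index column

my_data <- read.csv(csv_path, stringsAsFactors = FALSE)

head(my_data, 3)  # first three rows
str(my_data)      # parent_school is character, quiz_score is integer
```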

Inspecting the Imported Data

head(my_data) – first six rows (or specify a number: head(my_data, 3)).
str(my_data) – structure, data types, and factor levels.
summary(my_data) – quick descriptive statistics for all variables.

Converting Characters to Categorical Variables (Factors)

my_data$parent_school <- as.factor(my_data$parent_school)
my_data$rank         <- as.factor(my_data$rank)

For many columns at once:

library(dplyr)
my_data <- my_data %>% mutate(across(where(is.character), as.factor))

After conversion, str(my_data) will show Factor with the appropriate number of levels.

Basic Descriptive Statistics

Continuous variables: mean(), sd(), summary().
Categorical variables: table(my_data$parent_school) for frequencies; prop.table(table(my_data$parent_school)) for proportions (multiply by 100 for percentages).
The summarytools package offers a one‑step freq() function that returns counts, percentages, cumulative frequencies, and handles missing values.
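
The base R route to a frequency table with percentages can be sketched as follows. The vector is toy data, with one missing value included to show how it can be counted.

```r
# Toy categorical variable with one missing value
parent_school <- factor(c("Yes", "No", "Yes", "Yes", NA, "No"))

counts <- table(parent_school, useNA = "ifany")  # counts, including NA
props  <- prop.table(counts)                     # proportions summing to 1
pct    <- round(100 * props, 1)                  # convert to percentages

counts
pct
```

Note that prop.table() takes the table of counts as its input, not the raw column.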

Creating New Variables with Conditional Logic

# na.rm = TRUE keeps the mean from becoming NA when some scores are missing
my_data$financial_literacy <- ifelse(my_data$quiz_score > mean(my_data$quiz_score, na.rm = TRUE),
                                     "Literate", "Illiterate")

ifelse() tests a logical condition and assigns values accordingly.
The new variable appears in the Environment and can be examined with table(my_data$financial_literacy).
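
How ifelse() behaves is easiest to see on a handful of values. This sketch uses made-up scores, one of them missing, and the same above-the-mean rule.

```r
# Toy quiz scores; the third student has no score
quiz_score <- c(4, 12, NA, 18, 9)

# Mean of the four observed scores: (4 + 12 + 18 + 9) / 4 = 10.75
cutoff <- mean(quiz_score, na.rm = TRUE)

financial_literacy <- ifelse(quiz_score > cutoff, "Literate", "Illiterate")

financial_literacy
table(financial_literacy, useNA = "ifany")
```

A missing score stays missing in the new variable rather than being placed in either group, which is usually the safer outcome.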

Renaming, Subsetting, and Dropping Columns

Rename (using dplyr):

my_data <- rename(my_data, quiz_score = Q_score)

Subset rows (e.g., only first‑year students):

first_year <- filter(my_data, year == "First")

Select / drop columns:

my_data_reduced <- select(my_data, -parent_school, -rank)
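
The three operations can also be chained in one pipeline. The sketch below assumes toy data whose column names mirror those used above; the values are made up.

```r
library(dplyr)

# Toy data; values are illustrative only
my_data <- data.frame(
  Q_score       = c(12, 7, 18, 15),
  year          = c("First", "Second", "First", "Second"),
  parent_school = c("Yes", "No", "Yes", "No"),
  rank          = c("A", "B", "A", "C")
)

first_year_scores <- my_data %>%
  rename(quiz_score = Q_score) %>%  # new_name = old_name
  filter(year == "First") %>%       # keep first-year rows only
  select(-parent_school, -rank)     # drop two columns

first_year_scores
```

Assigning the result to a new name leaves my_data untouched, so the original columns remain available if a later step needs them.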

Workflow Tips

Reproducibility: Save the full script (.R file) and always start with setwd() and library() calls.
Error handling: Read R’s error messages; they often point to missing packages or incorrect file paths.
Practice: Use the provided Google Drive folder, the YouTube recordings, and the Day1 script to repeat each step until it feels automatic.
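
The two most common causes of errors named above, a missing package and a wrong file path, can be checked before importing. This is a hedged sketch, not part of the course script; the file name is a placeholder.

```r
# Placeholder file name; replace with your own
csv_path <- "file.csv"

# Check that a needed package is installed before calling library()
if (!requireNamespace("readxl", quietly = TRUE)) {
  message("Package 'readxl' is not installed; run install.packages(\"readxl\")")
}

# Check that the file is visible from the current working directory
if (file.exists(csv_path)) {
  my_data <- read.csv(csv_path)
} else {
  message("Cannot find '", csv_path, "' in ", getwd(),
          "; check the working directory and the file name")
}
```

If the file is not found, the message prints the folder R is actually looking in, which is usually enough to spot a wrong working directory.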

What Comes Next?

The next session will cover exploratory data analysis, visualisation (histograms, bar charts, pie charts), and an introduction to inferential statistics (Chi‑square, correlation, ANOVA). You will also learn how to fit simple regression models using the tidyverse workflow.

Key take‑away: Mastering data import, proper variable typing, and basic manipulation in R creates a solid foundation for any statistical analysis you will perform later.

By following the step‑by‑step procedures for setting the working directory, importing Excel or CSV files, converting character columns to factors, and creating new variables, you can confidently prepare any dataset for analysis in R and move swiftly into more advanced statistical techniques.

Full Transcript

programming language I believe yesterday
was a a good introduction for the basis
that we need to to understand and today
we are going to continue to see how do
we get our data into R but just some few
housekeeping rules um we updated the
folder with more data if you check in
the Google Drive we've put more more
materials again and then also you need
to check in day one where we've added
another R file an R script so you can
check and see that and then um I believe
that right now you've got a folder
either in your desktop or in your
documents that shows that has all the
materials or the work that we're going
to be using because as we are recording
the work in two in hour we going you'll
be always referring to that folder where
you're working out
everything okay so now just a recap of
what we covered yesterday and then I go
to um what we are going to to to cover
right now we downloaded and installed R
and RStudio I'm hoping that now we're on
the same page then we went for those
that um we saw how the R and RStudio
workspace looks like those that had R
already we went ahead to update R using
the different options to an updated
version of 4.4.0 um then we also
updated RStudio such that we are on
the same page we went ahead we did some
we installed some packages that we
thought they're very important and then
after installing the packages we
required we went ahead and loaded them
after installing we load the package um
we saw the multiple ways on how you can
do it you can use the H command you how
to assign variables using an equal sign
or a less-than dash we say that R is also
case sensitive and you can also use it
as a
calculator we further mentioned that R
does not understand the different levels
of measurement so you have to instruct
it if it's a categorical if it's um or
if it's a numerical variable depending
on the data frame that you're using so
we talked about the data frame the
levels of measurement why we need the
levels of measurement and then the last
thing that I mentioned was about the R
scripts and then the comments so today
we are going to move on on how to import
data so let me first check in the chat
and then I see if we are all on the same
page on what I've just um just a recap
I've
given
um okay
please type yes if the recap is okay and
then we now we continue from what we
covered
yesterday all
right um for those that are asking about
the materials kindly check in the in the
Google link that we we shared it's even
on the WhatsApp group we just keep
updating that link and then you'll find
the material there okay today we're
going to look at how to import data in R
um
like the the
volume increase
volume yes Helen it's me speaking I'm
saying that your volume is a little bit
low okay can you hear me yes I can how
about there that is okay
right
okay um so we are going to move on and
work out on how to how to import our in
how to import our data into R okay there
different there are different ways on
how you can import your data I mean
because as you keep learning the the
language you're going to discover
multiple ways so um you always pick
what is really easy for you having added
the packages that we need for example
yesterday we loaded readxl
which is going to help us to import the
Excel files that particular package and
then most of the other files are usually
saved as CSV files which we are also
going to import into R and then
also the other thing that uh you can try
to attempt you can you can also be able
to get different files in other programs
and import them into R for example you
can have a Stata file you can have a
service file in SPSS then from SS all
those one they have got different
commands and then you can import them in
into R and then you start working
accordingly that can be an exercise in
case you're
interested especially for some of you
that most maybe you're having a um a Stata
file somewhere and you want to import it
into R and then you work out everything
in the R
language okay so usually when you want
to import uh data into R you have to
look at the file extension most of the
time when we go to the field and collect
data so what happens we you know if
you're using maybe um um um you're
using an Excel sheet most of the time
for example if you're collecting with
ODK when they're inputting the different
values or is the output most of the time
the output is in an Excel form a
CSV or depending on what you really um
on how on how the program was designed
so I'm going to first um uh when you
look at the Excel file I'm going to
first stop sharing this and then uh I
share
[Music]
something okay I'm going to share the
Excel file this Excel file is within
documents that we sent if you look at it
critically when you look at how this
file is how it is saved it is it has it
is file.xlsx
so it's a Microsoft Excel
file this is the extension most of the
time when you're saving a file they'll
ask you what type of uh what type of
extension do you want um for example you
can go to file and then you say save as
when you say save as they will ask you
um what format do you want to use and
then you can you can you can tell um the
computer that uh um as for what I want I
want it as an Excel um
workbook that is the
xlsx or you can save it as a CSV this is
the comma delimited depending you can
play around with whatever that you want
to save into such that you're able to to
import it in
R
okay I'm not going to save it again I
just wanted to share with you that if it
has that particular extension then the
other file that we have in our folder if
you check again closely it has an
extension of CSV so depending on the
extension that we have it will it will
require us to use a certain command as
I'm going to
illustrate all right so as per the as
per the data that we're going to use
um I I just want to explain briefly what
it's all about such that we are all
clear on on what we are going to on what
we going to use I'm not going to give
all the details because you can still
quickly relate to whatever field that
you're doing so um the data that we are
using is about financial literacy among
vocational students so in this
particular case when how the financial
when it comes to financial literacy
where look we gave them different um
different questions we gave them 20
questions on different things for
example interest money institution Etc
and then we went ahead like if if a
student scored one out of 20 then that
would be the that will be her or his
mark so the range was between zero
and 20 if
somebody gets zero implying they did not
get anything so it's like it was like a
quiz on different financial literacy
questions okay and so we got a code book
where we had all the different questions
coded and then their description
depending on the data set that we are
using so I'm just going to use it as an
example to illustrate what I want to
communicate today then we also had a
question which was talking about
background characteristics and among the
different characteristics that we looked
at we had um parents parents parents
school that was in case maybe a parent
of a student went to school or not the
rank of the student the year of
study with year one or year two
somebody's age also went ahead and
grouped them we asked about their gender
the parental personal monthly income
even the students income the work
experience of a student Etc so this is
just an example to show you the what we
are going to see within our data set the
data set that we want to import into R
it's always important that you have a
quick feel or understanding of the
different variables that you're
collecting depending on your field of
specialization this one I didn't share
it but I shared the Excel sheet so we
can see the variables so if I share
again uh the file that I want to use
this is what this is what you're seeing
at your end we have uh we have different
variables we have the Q score that was
the quiz score what when the student
took the quiz uh what they got out of 20
okay so and uh we can do a summary
statistics and then we check what what
was the mean the median the average and
so on this is what I was talking about
the Q the q21 parent school we wanted to
check if the parent of this student
either attained a college education or
not we have also the Q2 22 rank so we
are going to see all that the year we
had the year at school whether first
year or second year the ages of the
students the the sex that that this is
the gender Etc there are just some few
variables that I picked that just for
illustration
purposes
okay let me pause again and ask any
question on that or I continue are we
together with the data set with what
I've just Illustrated
okay great so I'm going to share
again so this is what I've just
explained in case um in case we are
having our data saved with an extension
of xlsx that is a Microsoft Excel
extension what we are going to do we are
going to how we are going to import that
particular data it's very
easy the package that we're going to use
it is readxl which we already have all
we need to do today is we're going to
load that particular package and then
we'll go to file import data set and
then we go to we go to excel from Excel
we browse and then we see that our data
is import imported into into um into
R okay and then this is something
different so let me first stop sharing
and then let's launch our R and see how
can we now we want to to Cod this file
into
R
okay let's launch R um the R on our
computers now we are going to see how do
we import our data set in R with an
extension of
xlsx so like I mentioned what we do
in the R script that I sent
there different illustrations that we
looked at yesterday so you can try to
you can try to follow
some so what we going to first do how
how how do we load down data especially
if you're looking at the other
extension so what we do we go to
file then from
file you you you you you move down and
click import data
set then from import data set you move
to from
Excel then from Excel you click on that
when you click from Excel it's going to
bring you a a certain window
and then in this which has got import
Excel
data then here you click you go to
browse you click
browse after clicking browse it's going
to it's going to take you to the folder
where you're working from or you have to
navigate to the folder where the data is
being stored so in case I browse I go to
my desktop which has got the R
training folder and then my the data is
within the day one folder so I click on
the day one folder and then I click on
the file the Excel file that I want and
I click
open so when I click open it um it gives
me a data preview it gives me the file
path that uh I'm
using and then also it gives me the code
preview and then after I click
import if I click import then I'll be
able to view the data into um into R and
then also I'll be able to see within my
environment space when it has changed
and it's telling me that file which is
the default name on that I saved on the
Excel sheet that it has got 400
observations of 17 variables
let me first pause there and I check in
the chat can I repeat the procedure
again or you've been able to do
it
okay
okay all right let me repeat um once
more and then we do it the last time
remember the file we are importing it is
an Excel file okay which has got an ex
the usual Excel files that we we usually
have but now it has got data in it that
we want to import into into R and then
we we we further go ahead and do our
analysis A Kind reminder if you have
questions just post them in in the Q&A
and then you'll receive
answers okay I'm going to cancel this
and then so this is the
procedure all right you go and click
file where you see RStudio app you
click
file after clicking file you're going to
get a
drop-down from the drop-down move down till
where you see import data
set when you reach at import data set
there's another small window that comes
and then
you move and then you click on from
Excel okay so I click from
Excel when I click from
Excel um it brings me a window which has
got um import Excel data it has got a
first line the first box app so I follow
to the end where there is a word browse
on the
right I click on browse when I click on
browse the the r is asking me that I
need to find where the data the data
file is that I want to import into R so
I click browse when I click on browse I
need to navigate till where that file
is so I go to my folder where I saved
the
file okay and then I click on the file
itself the file is file F
A that is the default name
xlsx all right so when I click on it
then I click
open when I click open I give it some
time to retrieve the data within within
um the import Excel data window
So within this window it is showing me a
data preview on how my data looks like
it shows me the code preview the library
that I use which is readxl it shows me
the path and the command and then also
the view so after that what I do I come
down and I click
import if I click
import um it gives me if I click import
I I within my R it show it it shows me
the data that I've
imported but also when I look in my
environment now I can see the
environment I'm looking on the right
upper window it has got file that is the
default name that I saved the file it
gives me more information 400
observations of 17
variables so
if you look in your console the R
console this is what it is giving us it
is showing us the library which
is library readxl
okay and then it goes further to show
us the command the command is
read_excel and then in brackets it
shows the path okay that it picked the
work from day one folder and then at the
end it put the name of
the data set which is file.xlsx
then this last command that I'm
having here it is the view so if I and
it's what it brought later for me to be
able to see that my file has been
successfully imported into
R so I'm going to check in the data set
again and see if we've been you've been
able to import um data set saved with um
the xlsx
extension you can also actually try now
with other data sets maybe you have one
on your computer and you want to just
use that one directly you can as well
use yours but then you just you just you
know you just play around with the codes
because we say that uh the different
codes that we are writing are
reproducible so you can easily transfer
them to another project that you're
having okay okay I want to check in the
chat and see if we are on the same
page some people are adding WhatsApp add
me on WhatsApp please let's
concentrate then the WhatsApp uh will
come
later
okay all
right
great so that is one option
on how you can import your
data okay with that particular extension
so now I'm moving on to the second
option how about in case your data is
saved as a dot csv these are most of the
time the common ways of how
they save files that you can import into
R but I'm not saying they're the only ones
there are very many of them some
of the data is also saved into R directly
and then you can just
use it directly and then also
there are some inbuilt data sets within
R that you can use to practice so
depending on what exactly that you want
to do okay so I'm going to stop sharing
this and then I briefly go back to the
slides okay so this is this is the using
option one we are
done now let's move on to option two now
we want to import data which has a CSV
extension if you have an Excel file like
the one we used previously you can just
go and say save as and then with the
type with the saving type you scroll
down the given window and you use the
comma delimited
extension with this particular approach
step number one we are going to set the
working directory okay we want to
understand where is the where the folder
where our data set is at the moment we
understand that then all what we'll do
is just to use the command um the read.csv
so what we are going to do um we
going to go within R you have already
this material we are going to go to
session we set the working directory we
choose the directory and then we go and
direct to our
folder okay or you can still do it
manually but let's do the the simpler
way and then after that we shall use the
read.csv command and import our data
and we'll be able to see it in R okay
so I'm going to stop here then go back
to
R all right
okay so I'm going to cross this out I
don't need it for
now um now let's go and create the we
want to create the working
directory okay how do we set the
directory so what we do like I
mentioned you go to session I believe
you're also working at your end as I
speak also try to work out so click on
session after session you go to set
working
directory okay then it will bring a
small window and then you'll click on
choose
directory when you click on choose
directory it is it is going to take you
to the folder where you saved your work
so you click on that folder
after clicking on the folder where that
data set is that you want to import then
you click
open when you click open when you look
in your in your console you should have
something like this the one that I've
highlighted setwd that is set working
directory and then in bracket it is
showing you the path where the where
that data set is that you want to import
here it is telling me that it's on on
the computer from the users then it is
on my desktop after the desktop then I
go to the R training folder then from
there I go to day one within day one
that is where the folder I want to
import is the file that I want to import
is so the moment you set your working
directory it will be good for you for
example like you know you can click on
it you come and save it you can you can
say copy and then you come and save it
into your file such that you don't need
to redo it again all the time you can
come and say um you can just
copy this line and then paste it up here
within RStudio such that you're
able to save it when you come another
time you do not really need to um you do
not really need to redo the same process
again because the path or the setwd
line will be there
automatically okay
so in my case in case I run that
particular
line so what I'm going to do I want now
to import the data set using the
read.csv
command Okay so that is that this is the
command we are going to
use I can write um you remember the
comments I talked about this is the
command that we're going to use and then
inside the brackets we are going to put
our the the the how the file was saved
so it is you need to check again in your
in your file and confirm the file
extension so it is file.csv
such that we don't get errors so what is
going to happen we are going to come
inside our bracket and then we put um
file dot
csv sorry this is small
csv okay so this is what because you
already have the you already have the
path with you so you're just telling r
that go into this particular folder
where I've got my material and then you
get for me the um the folder the the
Excel file that I'm that I'm interested
in and then also let me first pause a
little bit and then check in the chat
are we on the same
page okay
some people are saying no I'm
fast yeah Dr Helen I think you need to
repeat the procedure yeah
sure
okay all
right I'm going to reduce the speed and
then we move on the same Pace
right
so um let me first stop
sharing we are going to
import
okay we are going to import
now I just want us to be on the same
page
I want to to I want to show you the file
which has got a CSV extension that I
want us to import okay so this is the
file it is the same file that I shared
earlier but this time around this this
one has an extension of a CSV I don't
know if you can if you can see where I'm
putting the the the pointer it has got F
it's file. CSV that is the
extension okay so now for us to import
this kind of um file into R we you we
use the read.csv
command but before we we do that
we need to First create a path to tell r
that this file which is the CSV it is
within a given folder on my
desktop okay so um I'm going
to I'm going now to stop sharing this
and then I launch R we go through the
procedure again so if I stop sharing
here I open my
R
okay here we are I've launched
R um maybe the other thing before we go
to the session let me clear everything
in case you want you don't want to see
this particular information within your
within the r console what you can do you
can click here there's a broom and you
sweep everything such that you you you
have a clean
space okay so I'm going to I'm going to
clear the console by clicking on the
Broom if I do that that space becomes
clear I'm going to do the same within
the environment um such that we see
everything closely so if I come and
click this
broom hope you're seeing it I'm within
the environment when you come and click
the broom here
here every it will ask you that are you
sure you want to remove all objects from
the environment and then I'll say
yes okay all right so now let's see how
do we create the the the path or how do
we set the
directory what we do step number one you
go to session
after session there is a drop down that
you see a small um
Dropbox and then you move down to set
working
directory when you reach to set working
directory it has another pointer and it
brings a small window so you move up to
choose directory
when you reach there you click on choose
directory when you click on it it is
telling you that it is you to choose
where where is your folder is it on is
it in your doc is it in your documents
or it is in your or it's on your
desktop okay so you navigate until where
your desktop is so for my case my the
file that I want to um the file I want
to import or the file the CSV file that
I want to use it's within the my it's
it's within a folder on my desktop which
is R
training okay so what happened is that
uh now after after getting to that
folder I click open I just click on the
folder I stop there I don't go to check
whatever materials are inside I'm just
telling R that everything that I'm
going to work with is within this
folder and then after that I click
open if I click open you see that in the
r console it is giving me a path that is
telling me that when you reach on your
desk after when I get to the desktop
then it has to continue to the last
folder which is uh the R training
2024 and within that within this
particular folder that is where I that
is where there is the file file. CSV
which I want to import using the
read.csv um
command and then I mentioned you can
copy this path the set working
directory and you paste it into
RStudio up here such that when you come
again another time you don't need to
follow through the same process
because you would have already said you
would have already told r that know this
is where all all the work that I'm going
to do it's within this particular folder
and so you don't need to go through the
same process of session then working
directory choose working directory
that's why I got this link and I saved
it up
here such that in case I click save when
I come next time I don't need to to
follow that procedure again all I'll
need to do is to just come I click at it
at the end and then I click runs like I
just launch the path and then I'm able
to continue to do whatever that I
want okay let me go back to the
chat um this is what I'm getting error
object cannot be found
okay
um you're saying that uh you're getting
an error that the object cannot be found
we've not yet reached there so you're
actually ahead of us have not reached at
that particular point if you
are
um okay um of course what I'm saying the
working this the path at my end is going
to be totally different from yours so
based on if you're having then Ka
downloads day to exploratory you're
using that is if the if the file. CSV
file is within that folder then you're
right but I think that path is wrong
because all the materials that we are
using they are still in day one not day
two try to check again otherwise you're
going to get an error we are all the the
file that we want to import it's within
the day one folder so I expect that your
last your last folder should be day one
not day two day two materials are going
to be um used later after this try to
cross check
again okay um you can cannot save the
directory we just save the path then
when you when you get the path and you
put it into your script you're able to
save it and then next time you don't
need to re you don't need to follow the
same procedure
again
okay um repeat again how to clean the
console uh how to
[Music]
save repeat where to copy
um yes Victor I think the path is
okay my path is going to be different
from yours depending on where your
folder is where the file that you want
to import is so it is different and
remember we are now using the CSV file
some of you are putting the Excel we've
we have already done with that
particular part with the first option
we're importing the Excel now the second
option we are importing a file that is
saved as a CSV so if you if you're
looking at the one of the CSV
automatically you're going to come up
with
errors kindly let's follow closely
because we are repeating ourselves some
of you are asking how do we
import
okay in case you're having the
read.csv and you're
getting an error then it implies that
the path is not correct I'm seeing
somebody having an error Error in
file(file, "rt") cannot open the connection or
cannot open file no such file or directory it
implies that the directory you're having
is not right the path you set is not
right so that's the reason you're getting
that
error okay I'm going to repeat once more
and I'm hoping that we are we can all
get on the same
page all right
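When read.csv stops with "cannot open file ... No such file or directory", it can help to ask R where it is currently looking. The following is a minimal sketch, assuming the file is called file.csv as in the session; the temporary folder and the sample data are stand-ins made up for illustration, not the training materials.

```r
# Hypothetical setup: a temporary folder stands in for your real data folder
data_dir <- tempdir()
write.csv(data.frame(x = 1:3), file.path(data_dir, "file.csv"), row.names = FALSE)

old_wd <- setwd(data_dir)                   # point R at the folder holding the file

print(getwd())                              # which folder is R looking in right now?
found     <- file.exists("file.csv")        # TRUE only if the file is in that folder
not_found <- file.exists("file_not_here.csv")
print(found)
print(not_found)

setwd(old_wd)                               # restore the previous working directory
```

If file.exists() returns FALSE for the name you typed, the working directory or the spelling of the file name is the first thing to check.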
so please note that we are importing a
file saved with an extension of
.csv the option which is saved
with an Excel extension which is um
.xlsx we are done with that okay so now we
want to import data that is saved as a
CSV first thing you have to know
where is the folder that has got this
particular file is it in your downloads
is it on your desktop is it in your
documents have that file somewhere
because at the end of the day as we are
setting the working directory it is that
folder we are
targeting at the end as as you navigate
through the path the last folder should
be the one where that file is in this
particular case it's the file.csv
file so as we are setting the working
directory I expect from your side that
uh of course I'm saying that mine has
got R training because that is the
folder where I'm having all my
materials okay let me first stop
sharing um
okay so this is what I want to show
you
um I don't know if you're seeing what
I'm sharing
Professor Susan are you seeing the folder
I'm sharing just to
confirm yes I'm seeing okay great so
participants when you look up here um
I've got R training this is the folder
on my desktop but inside the R
training folder there is day one okay
so within day one I've got these
different materials so
in case I use the set working
directory I'm targeting this folder which
has got file.csv so it depends on
the folder
where this file is because that's what we
are
targeting also I can even stop at the R
training folder because still here I've got the
same file which is CSV so depending on
where you saved that file the file.csv
we are targeting that folder for me
it's on the desktop so that's
why as I navigate through the path it
goes up to R training
2024 okay so I'm going to stop this I'm
I'm just trying to clarify more such
that you you get to understand what I'm
trying to
explain so if I share again my r
okay okay so um what we do we go
to I'm going to clean my R pil again
just click on the Broom we we are
looking at the on the left on the left
hand side the bottom window in the
corner there is where I clear console
it's like a small broom that you can
click on and then everything will will
be swept off so I'm resuming so what you
do you go to
session okay after session move on to it
it will bring a Dropbox then move and
click set working
directory after setting working
directory then you move on to choose
directory all right if you choose
directory then you have to navigate it
it is you have to navigate up to where
your folder is if your folder is within
um downloads then you have to navigate
up to there if it is on your desktop
where I told you that it will be easy to
put it somewhere where you can easily um
like get the data from there or if it is
in your documents still you have to
navigate to your documents but most
of the time um it will automatically
take you to where that
particular work is but it depends on the
computer that you're having of course my
path will be different from the path
that you're using so when I get to that
folder I click
open when I click open it's going to
give me this particular path okay that
is telling me that when I reach my desktop
then the last folder is R training
2024 so within this folder that is where
I've got my file the CSV file that I
want to
import and then I save that um I can
copy this
line okay I just highlight it then I
copy and then I come and I paste it
anywhere within my R script such that
next time I don't need to repeat the
same procedure again when I come another
time and I want to use the
same folder if within that folder
there is a file that I want to work on I
just come and launch this line
again and I just click run and then it
will run
automatically
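The line that RStudio prints after Session, Set Working Directory, Choose Directory is a setwd() call, and pasting it into the script is what makes it reusable. Below is a rough sketch of that idea. The folder name R_Training_demo and the example path in the comment are hypothetical; your own path will differ.

```r
# Hypothetical folder: in your own script, paste the line RStudio printed,
# for example setwd("C:/Users/yourname/Desktop/R_Training_2024")
data_dir <- file.path(tempdir(), "R_Training_demo")
dir.create(data_dir, showWarnings = FALSE)

old_wd  <- setwd(data_dir)   # setwd() returns the previous directory invisibly
current <- getwd()           # confirm the change took effect
print(basename(current))

setwd(old_wd)                # go back so this demo leaves no side effects
```

Keeping the setwd() line near the top of the script means one click on Run restores the same folder in a later session.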
okay um and then after that what we are
going to do the moment your path is
right that is the condition if the the
set working directory path is right then
it implies that you're not going to get
any error the errors that you're getting
is because your path is not right
I'm going to move on a little bit and
then later I'll I'll ask in the chat if
you're on the same page so how do we use
the read.csv command okay
what we are going to do this is the command
that we want read.csv and then
inside we put the file name how did you
save the data set the data set that
you have we saved it as
file.csv in case you're using a different
data set then you have to quote the
name that you used such that R is able
to capture
it okay so in case I run
line
42 okay remember inside I've put
file.csv if I run line 42
what happens I see that in my
console I'm seeing some data that has
come up and my R console
looks messy
now I don't know if you you're also
having that at your
end so what we can do in order to have
to have our data um combined for example
within um let's say a data set within R
we can create we can create an object
name that we give to our data set for
example I can say I'm going to call my
data remember we talked about assigning
variables I'm going to call my data
since it's about financial literacy I'm
going to I I'm just picking a name that
I'm interested in any you can you can
give it any so I'm going to call the all
of this data literacy and I create a
literacy data set you can give it uh
Finance you can you can even set data
you know whatever name that you want you
want to create an object name you can
give it your name anything that you're
interested in we are not specific on that
I'm assigning a variable remember like
yesterday where I said like age is equal
to 30 age is equal to this this one is
equivalent to a
certain value so here I'm going to
create a variable which I'm calling
literacy and I assign my data set to it
such that I'm able to refer to it each
time I want to do
computations for example
I say literacy then I can put less than dash <- or I
can use an equal sign and I copy this
when I copy that and then I
come and I paste it
here hope you're seeing line 45 I create
an
object that is the name I want to call
the whole of my data set and then I say
let literacy be equal to line
42 why I'm doing that such that I
stop having my data set printed within the
R
console
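The difference between running read.csv on its own and assigning its result to an object name can be sketched as follows. The small CSV written here is a made-up stand-in for file.csv, and its two columns are hypothetical.

```r
# Hypothetical stand-in for file.csv: two made-up columns, three rows
csv_path <- file.path(tempdir(), "file.csv")
write.csv(data.frame(score = c(28, 31, 25),
                     rank  = c("Diploma", "Certificate", "Diploma")),
          csv_path, row.names = FALSE)

# Run on its own in the console, this prints the whole table
read.csv(csv_path)

# With an object name nothing is printed; the data is stored for later use
literacy  <- read.csv(csv_path)   # `<-` is the usual assignment operator
literacy2 =  read.csv(csv_path)   # `=` also works for a top-level assignment

print(dim(literacy))              # number of rows, then number of columns
```

The object name is free to choose; every later command simply has to use the same name.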
so if I run line
45 take a close look then I'll I'll
recap it again if I run line
45 okay see what is going to
happen if I write line 45 when I come to
my R console I see it
here okay but I don't see the
results check what happened to the
environment within the
environment I'm seeing oh the data
called literacy with 400 observations
of 17
variables and for you to tell that you've
successfully imported your data set into
R um number one when you
check down here in the R console you do not
see any error or any red text
coming up at the same time when you look in
the environment you'll be able to see
that oh the data set has been
successfully imported and you're able to
read some of the information the 400
observations with 17
variables okay I'm going to pause and
then check if we at the same
page okay Victor has that worked
whether I use an equal sign it will still
work if the path is right um csv in small
letters it's not capital kindly change
that that's why you're getting an error
file.csv put it in small
letters okay does the CSV have a
specific package no that is just an
extension that's
just uh can you ask one of the guys
somebody who is failing to get
the file to show their screen okay
that's a great idea
yes Thomas go ahead hello have you
got me yes I have thank you I
said you could ask
somebody who's having a problem getting
the file to show their screen yeah okay
um can we get at least two
people who can share and then uh we work
through it together we can get um two we do
the first one and then another one
can also share and then we support you
if you have a problem if your side
hasn't yet worked out for
you I think they have to be promoted to
yes we
promoted uh
Sol and then also
Elisa okay thank you
um Sol can you share and then Elisa will
share
later
so please you have all the rights arei
is
it and
Elisa please go ahead and share your
screen
Solan I'm sharing the screen please can
you see
it
yes so you said we should go to session
um a minute could you load the
libraries please just a second Salan
could you load the libraries the libraries
we had
yesterday are you talking to me sure
yeah okay where did you work from
yesterday okay let me there's another
one here see this one
um I wanted to see where you loaded the
libraries you wanted to load them
again okay so please direct me on how to
load
them did you attend yesterday's session
yes I was there in fact I was working on
that screen but after some time the
thing took me to save so I saved it and
I left the place and I open this one
again do you have the file that you
saved yes I think I should have it okay
go to go within that folder and just
double click on it it should
automatically appear
here otherwise we can we can just want
otherwise we can we can load them again
you just type Library if they already
installed
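Loading a package with library() only works when the package is already installed. One way to check first is sketched below, assuming the two packages named in the first session (tidyverse and readxl). The check-then-load pattern is an illustration, not the instructor's script.

```r
# requireNamespace() reports whether a package is installed without attaching it
pkgs <- c("tidyverse", "readxl")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Attach only the packages that are actually installed;
# any others would first need install.packages("name")
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```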
okay I'm not seeing anything okay let's
go to
R because it should have an extension of
.R in case you saved it
right it is not that one that was when
we wanted to correct the problem
of Rtools please
cancel okay let's go back um launch
R
okay close and launch RStudio we
want to see the four um yeah I'm seeing
that you already have
it close
this this is RStudio
okay can you go to
file okay create an R
script new file go to new file new file
R script
okay okay so these are the four
workspaces that we are talking about yes now
load library
tidyverse library I'm believing that you
installed them yesterday so if
you I installed them yesterday great so
type library then we want to load the
package
tidyverse you can click on the
blue okay okay then type in tidyverse
t i
d um there's a
y
um that is not the right okay
yeah can you run it and we
see click the run
button
okay great so that shows that it's you
already have it wonderful so now what we
are going to do um um let's create the
working directory yeah you go to session
okay um okay continue choose
directory choose directory
Dory
Okay so we need to which folder are you
using I have it on my desktop great
okay so it is day one is it day one yes
yes okay click on that and then come and
click
open
okay are you seeing the path down here
yes can you copy it and paste it in the
R script such that you're able to save it
copy that
line copy from
setwd thank
you okay then I should put it there yeah
sure so for example if you want to
launch that folder again tomorrow you
don't need to
go through this again okay okay so now go to
the next line
enter okay so let's use
read.csv
we are using the command now
okay
read.csv it's even there okay then now
inside in double quotes put
file.csv the file that we want to
read okay day
one we are putting the file
name I believe
yeah in day one you have that file
file.csv yeah it's the one we want to
read but file.csv csv in small letters and
then put everything in double quotes
okay
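To see exactly how a file name is spelled on disk, including the case of its extension, the folder contents can be listed from R. This is a small sketch with a made-up folder and an empty file named file.CSV. Whether a case mismatch actually causes an error depends on the operating system, so matching the name exactly is the safer habit.

```r
# Hypothetical folder with one file whose extension is in capitals
demo_dir <- file.path(tempdir(), "name_check_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "file.CSV"))

# List every CSV regardless of letter case to see the exact spelling on disk
csv_files <- list.files(demo_dir, pattern = "\\.csv$", ignore.case = TRUE)
print(csv_files)

# Does the name you plan to type match what is on disk exactly?
exact_match <- "file.csv" %in% csv_files
print(exact_match)
```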
okay can you run that
line
okay you see what happens that your
console becoming
messy yes so now to avoid that what you
want to do is we want to create an
object name that will have all our data
inside okay okay so go on a new Line
enter okay pick any name that you're
interested in or you can put put you can
put your name whatever you want anything
okay okay now let's say um
we are assigning your name to this
data set okay okay so you can put an
equal sign or you can say less than
dash you removed line number three I
wanted us to copy okay then put
read.csv and then put in our file
name no the file name not that one
the file
yeah okay put it in quotes double quotes
yes
okay can you run that line now
okay tell me what has
happened what has changed I've seen
that the name and the read.csv have gone
to the lower part of the left hand side
okay so this is the command
that we've set now you're saying that
you're assigning your file to your name
all that information within
file.csv you're calling it your name your name is
now the data set
name when you come and read in the
environment we also have more
information yes yes okay it is telling
you that within this data set you're
having 400
observations of 17 Vari variables yes
then it implies that now you're good to
go and continue with other things that
you want to do thank
you okay you're welcome so can I stop
sharing great yes thank you thank you so
much Dr elen you can sweep in case you
don't want to see that information the
console um can we get one more
one more person to share yes Elisa Elisa
please go
ahead
Elisa share your
screen
Elisa Lisa any other person interested
to share if you're still stuck
as we as you're sharing then somebody
the rest should also be learning what we
are
doing okay MH Alisa take her through
then I'll I'll I'll
support yes
um here I have my screen in
R
um so I tried to load the tidyverse
uh I managed to do that I think you can
see in the
console uh so I had
that setwd at the bottom there okay
so I tried to copy it and paste it here
great so from there uh now I don't know
what to do you want to get the data into
R using read.csv so I need to put
read.csv
okay read dot
csv which is this one no the first one
the first oh the first one oh
okay that one this
one
so I need to put the file name right
sure oh it's
oh
sorry it's like this let me ask you
where are you getting file
2 when I was saving in CSV I added
a 2 at the end of the name of the
original file oh okay okay yes the csv
should be in
small letters yes csv small letters oh
okay leave the name alone I'm talking about the
extension oh okay okay that
whatever oh what is
happening
okay
csv but here oh it's small already
I mean here within the brackets in
the brackets yes oh okay okay
csv like this sure uh-huh can you run
that line and we
see oh okay so okay now I want you to
condense this information into a
variable
name okay you can put Elisa or sometimes
we put the object names
depending on the data set that you're
working on
I want to put college
okay uh-huh college is equal to um you can
copy line number three and then
whatever is within number three let it
be saved in
college like this
yeah okay oh so the whole of this
collapses to college uh-huh great okay so
run that line so that's why you're
saying that it can be different so what
has
changed are you seeing any changes
when you look in your
console we don't have any red text
implying that our path is perfect is
okay and then also when you check in the
environment are you seeing under data
there's a word
College in the
environment Elis are you
following hope you've not disappeared
it could be um Elisa is muted can we
have
Michelle college is an object
name and so our data set we are going to
call our data set as college so that
name can change it doesn't matter
whatever you give it to okay let's get
one person more Michelle Michelle is
ready
okay thank you
is it
Michelle okay Michelle is promoted her
so you should be able to share her
screen Michelle please go
ahead to share your
screen okay
okay hi uh doc um yeah so my issue
is as you can see I don't seem to have
the Run function up here for reason but
I've managed to set my working directory
what version of R are you
using uh
4.4 Thomas would you have any
idea or Professor
Susan I'm
not I'm not very sure
because there could be something
relating to the showing of
the of the the the
short because this one is actually the
the runs there the function about here
not being short I don't know whether
it's just some
setting I don't know can you click on
these arrows here I'm seeing arrows here
and then here which
are you see where where I'm putting the
pointer yeah you try to click on that
and we
see it goes to the top to line number
they're inactive they're inactive can
can you can you close the the close and
open
again should I no save now okay can you
try to open it again again okay and then
you you share share my screen
okay uh somebody said you probably
shared you probably saved it with the
wrong extension just open your open your
your your
studio okay
once you open then you
share there's one saying that you can
click control+ enter and then we see if
it
runs still the same okay you go to
file click on file
new okay click on that and then you say
R
script I see okay click on that ah there
because I see you saved it as a history
you're opening a history file ah okay
so save this as a script and that should
be able to solve the problem all right
thank you so much okay right let me stop
sharing so we can
continue yeah you can save it as as as
okay
okay back to you thank
you
okay okay can we now continue have we
now on the same page with getting data
into R please type in the chat type yes
if it is
okay
okay um
all right um and also we having a
recording if there's something that
might not be clear yet you
can you can you can try to listen in
after the after the training okay let's
continue I think this this slide is
okay
um we now know how to install packages
load packages set the working directory
import data we can now do the
commands there are also some other things
that you can check within your data
set for example if you want to
know the number of rows the number of
columns um the column names etc these ones
you can try on your own and then
also I think we know right now how to
save the R script okay we know
how to clean the console you see all
these commands that you can try out but
before we move on to this um let's do
some simple things within
R I'm going back to R
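The checks mentioned above (number of rows, number of columns, column names) can be tried with base R functions. A minimal sketch follows, using a made-up four-row data frame in place of the imported data set; the column names are hypothetical.

```r
# Hypothetical stand-in for the imported data set
literacy <- data.frame(score = c(28, 31, 25, 30),
                       rank  = c("Diploma", "Certificate", "Diploma", "Diploma"))

print(nrow(literacy))    # number of rows (observations)
print(ncol(literacy))    # number of columns (variables)
print(dim(literacy))     # both at once: rows, then columns
print(names(literacy))   # column names
```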
okay all
right so
here um I called my data set literacy it
is the name I'm having
here if you've if you've used a
different name then it implies that all
the next steps that we're going to be
doing you should be using that name that
you use to refer to your data set the
what I'm calling the object
name okay we can do some few things for
example um I know already I've imported
my data
successfully I can actually run the
object name in case I want to see the
data but we've already seen it earlier
because like I said um literacy is equal
to this so
if you want to see what is
inside literacy you
can run it to see the data set or you
can come just within your environment
and then you click on the Excel like
icon when you click on
it when you click here you'll be able
also to view your data set very
well hope you've gotten that
part so um sometimes it is good to first
get to know what your data looks like so
in case maybe you want to see the first
few rows of your
data set you can use the head command
okay you can say head the head command
and then inside the head command you
type in the name of the data set that
you used in this particular case for me
I used literacy so in case I run
this line it's going to give me the
first six rows within my data set
it will give me only the
first six rows in case I'm not
interested in the six and I want to be
specific I can type head and then I put
in a number for example like okay just
give me only the first three and then I
run that particular line and it will
only give me one two three the first
three rows depending on
what you
want
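The two uses of head() described above can be sketched like this. The ten-row data frame is invented for the example; with the real data set the object name would be whatever you chose at import.

```r
# Hypothetical stand-in for the imported data set: 10 made-up rows
literacy <- data.frame(id = 1:10, score = seq(20, 38, by = 2))

first_six   <- head(literacy)      # default: the first six rows
first_three <- head(literacy, 3)   # ask for a specific number of rows

print(first_six)
print(first_three)
```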
okay and then also it is good to
understand the structure of the data set
if you come to the structure of the data
set we use the command
str okay it gives us the structure of
the data set where we see the different
data types whether the
variables are numerical in nature if
they are integers or um if they are
categorical so I use str and inside
I type the name I've used remember that
is very important the name that you set
the object name so in case I type str
literacy and I run this particular
line you see that it will
tell me if I come down here I'm having
my data frame which is the data set it is
also telling me 400 observations of 17
variables and then it is showing me that
my first variable Q score is an
integer okay and it is giving me
some of the beginning values and
then my next variable which is q21
parent school it is telling me that it's
a chr okay and uh it is having um
some college
because here it was categorical
telling us that either a parent has a
college education or a
parent doesn't have and at the same time
when I come to the Q rank I'm seeing
that it's also a character and with the
rank in this particular case we wanted
to check if a student has a diploma or a
certificate so for me to tell
R that no they are not characters
they're not just wordings
we need to tell R to read the
variables that are categorical in nature
as categorical so we are going to use
the as.factor
command okay because this is where the
relevance of the levels of measurement
comes in we said under the levels of
measurement you have to tell whether
it's nominal or ordinal those are all
categoricals and then if they are
ratio or scale or um interval all those
are integers or sometimes they would
show up as numericals so what is
going to happen um this is what I mean
it's happening on all the other
variables so this is enough to tell me
that no the
categorical variables are not yet read
properly because they have to appear as
factors okay in case you don't want to
read everything you can say okay
I want to know the structure of for
example only one variable you can still
type str structure you type
the name of your data set in my case
it's
literacy okay then I put a dollar
sign if I put a dollar sign now I'm
specific that I want only this variable
for example I can say I want only the
rank
variable okay so if I run this
particular line what I'm going to see
here it is telling me that oh the rank
variable is a
character and there are two things
diploma and certificate but what I'm
saying is that no instead
of reading it as a character
I want R to read it as a
categorical variable and with that I'm
going to use as.factor of course
there are also other ways you can
read everything automatically in just
one command but uh let me first show you
this
one okay so if you want to
tell R that no read this variable as a
factor what we are going to do this is
what we do you write
literacy
okay um
I'm considering only one variable then
I'll show you with the rest and then I
say
as.factor then inside I put
everything
okay so in case I run now I say that
within my literacy data set pick for me
the variable q22 rank and then I'm
using the as.factor command read it as
a categorical
variable if I run line
68 okay I'm seeing it again within my
R console so I want to run its
structure again and see
what its structure now looks like has it
been converted if I run the str
if I run line
69 now it is going to tell me
that the rank is a factor with
two levels remember we said there are two
levels either a student
is studying for a certificate or a
diploma so um in this
particular case I'm telling R don't read
rank as a character read it as um a
categorical variable and for that I
use the as.factor command so depending
on the
number of variables that are categorical
that I want to convert I'll do the same
on everything such that when I run the
structure command
again those that were characters have to
read as
categoricals so when you check in the
day one extension sorry in the day one
script that you're using that's why
you're having all these other commands
where we are converting to categorical
everything I just highlight and run at
once so when I run again then everything
has changed it will tell you that okay
now the Q parent school is a factor with two
levels when you come to um
the age group the age that we grouped
it's a factor with five levels then also
the income level is a factor with five
levels
so it's no longer a
character okay let me pause a little bit
and I'm
checking are we on the same
page are we on the same
page okay
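The convert-then-check cycle described above can be sketched on a tiny made-up data frame. The column names q_score, q21_parent_school and q22_rank are assumptions based on how the variables are spoken in the session; the real file may spell them differently.

```r
# Hypothetical stand-in for the data set; column names and values are assumed
literacy <- data.frame(
  q_score           = c(28L, 31L, 25L, 30L),
  q21_parent_school = c("Some college", "No college", "Some college", "No college"),
  q22_rank          = c("Diploma", "Certificate", "Diploma", "Certificate"),
  stringsAsFactors  = FALSE
)

str(literacy$q22_rank)                              # chr before conversion

literacy$q22_rank <- as.factor(literacy$q22_rank)   # overwrite the column as a factor
str(literacy$q22_rank)                              # now a factor with 2 levels

# Repeat the same line for each remaining categorical column
literacy$q21_parent_school <- as.factor(literacy$q21_parent_school)
str(literacy)                                       # check the whole data set again
```

Numeric columns such as the score are left untouched; only the columns that hold categories are converted.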
okay all
right Dr Helen yes please they are more
no than yes sure and some are saying
you're so
fast
okay
um I'm going to recop
again Let Me Clear My console
Dr Helen yes please those who would like
to use the YouTube there is a YouTube
channel for Forum that's what they
should look for we don't have a link for
the YouTube channel but there is if you
go online and just type in Forum YouTube
channel you will go get all these
recordings
there
okay thank you all right so
um I'm
rep so there are two things that we want
to learn here objective number one we
want to know the structure of the data
set maybe what I can
do um let me go back to my data
set and uh let me reduce on the
variables maybe that that will make it
more
clear
and reduce just a
second it could be that many variables
are creating
confusion delete all right I think this
will make
sense okay I'm going to go back to
okay
um I want to read in a different data
set where I've reduced the
variables to make this part a bit
simpler so that we learn very
easily I'm going to create um
let me switch
this
okay
um this
is yes so I'm reading in a different
file it's still the same file but I've
just removed some of the variables such
that we get this part very well I'm
going to be very slow so kindly take a
look so I'm reading in a new
CSV file which I've called file reduced
but I'm still using the same object name
remember we said you can adjust
the object names okay so if I read
that let me check
again reduced
okay I've seen the path so why am I
getting this error it is telling me
cannot open file no such file or
directory this is what I was telling you
that sometimes you have to read the
error and then find out what could be
the problem so um I'm going to go
back and launch the working
directory where this
file is the one that I want to use so in
case I run that okay so I'm going to
launch this
again and I will not have the error
again so what I'm having now I'm having
literacy my file with 400
observations I've reduced the variables
to seven so I can check here this is
what I'm having 1 2 3 4 5 6 7
I've just deleted the rest just for
demonstration purposes what we are
having among these
variables it's only the Q score and
the open age that are
continuous the parent school the rank
the year and then the sex are
all categorical
variables okay and in this particular
case they are all nominal variables
because there's nothing like maybe you
need to know um any order the order
doesn't
matter all right
so um when you come here let's go
ahead and I'm going back to
the structure of the data set which I
said we want to describe the structure
of the data set and we use
str all right so str
implies structure it's like
we want to see whether the different
levels of measurement are read
correctly within R so if I run line
64 this is what I'm seeing I'm told that
it's a data
frame with 400
observations of seven variables this
one we've already
seen then this side is giving me the
different variables that I'm having
I've got the Q
score okay and it is telling me the data
type that is an
integer and it is giving me
some of the data like
28
etc when I come to the second variable
which is talking about parent school or
education it is telling me that it's a
character that these are just wordings
yet I know when I collected my data
I wanted to categorize them into those
with college education and those
without when I come to the rank it is
still showing me that it's a character
yet there are two things
some have a diploma some have
certificates so the moment I see these
characters I want to change these
characters such that R reads them as
categorical
variables are we on the same page here
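What str() reports for a freshly imported file can be reproduced on a few made-up rows. In this sketch the data is read from inline text rather than a file, and the column names are assumed for illustration.

```r
# Hypothetical stand-in for the reduced data set; names and values are assumed
literacy <- read.csv(text = "q_score,q21_parent_school,q22_rank
28,Some college,Diploma
31,No college,Certificate
25,Some college,Diploma", stringsAsFactors = FALSE)

str(literacy)            # one line per variable: name, type, first values
str(literacy$q22_rank)   # `$` picks out a single variable

types <- sapply(literacy, class)   # the same type information as a named vector
print(types)
```

Any column listed as chr that really holds categories is a candidate for conversion to a factor.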
let me check in the chat if that is okay
before I continue is that okay our
aim we seeing the structure of the data
set after seeing the structure of the
data set what we are seeing that some of
the variables are read as characters
instead of
categoricals so for us to convert
character to categorical we are going to
use the as.factor command
okay so what we
do
um in case I want to convert for example
I want to convert like Q rank to
categorical what I
do first I pick the name of my data set
which is
literacy okay literacy
how do I attach the respective variable
that I want I use a dollar sign on my
computer the moment I put a dollar sign
you see a
dropdown asking you which variable do
you want and say I want for example the
parent
school I want to convert the parent
school into a
categorical variable as I had it in the
beginning
so I assign with less than dash or I put an
equal sign and I use the as.factor
command it is
as.factor then
bracket okay so inside the bracket I
put so
as.factor then inside the bracket I
put everything
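Written out, the line being dictated has the same data set and variable on both sides of the assignment. Here is a short sketch with a made-up column; the name q21_parent_school and its values are assumptions.

```r
# Hypothetical column; the name q21_parent_school and its values are assumed
literacy <- data.frame(
  q21_parent_school = c("Some college", "No college", "Some college"),
  stringsAsFactors  = FALSE
)

before <- class(literacy$q21_parent_school)   # "character" before conversion

# data set $ variable  <-  as.factor( the same data set $ variable )
literacy$q21_parent_school <- as.factor(literacy$q21_parent_school)

after <- class(literacy$q21_parent_school)    # "factor" after conversion
print(c(before, after))
print(levels(literacy$q21_parent_school))     # the categories R now recognises
```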
okay now the question is how do I read
this line please
listen go to my data set
literacy pick for me variable q21 parent
school convert everything to a factor
read it as a categorical variable by
using the as.factor command so that's why
inside the bracket we put the same thing
we want to read that variable as a
factor in case I have another variable I
want to do the same thing I can do the
same for rank because I know rank is
also categorical go to my data set
literacy pick for me the
q22 rank variable read it as a
categorical using as.factor so in case
I run this line okay and I run also this
particular line and I run the structure
again see
now what is happening to our output I
can see that the parent school is no
longer a character it's now a factor
they're telling me it has got two
levels those with some college
education others without
I'm also seeing that rank is also a
factor now with two
levels okay whereby some students are
in for a certificate others are in for a
diploma we can do the same for this
other one to convert everything to
categorical let me check in the chat is
that a bit clear or we get someone to
share and then I
direct somebody saying that I put the
comments on each
command but now we are working together
why don't you do it at at your end
anybody still finding it hard to to
convert a um a character to a
categorical how does it say
literacy not a your Bo
why I said literacy is just our object
name it's just the name of our data set
you can use anything like how our
colleagues were sharing using their
names and so on what is the purpose of
having it as categorical data why
are we converting it into categorical
number one because remember when we are
doing analysis um for categorical variables we
need to run frequency tables so you're
not going to be able to come up with a
frequency table if you're
having it as a character because
characters are just wordings okay and
then also for example
if you want to interpret your
results how are you going to check
the
relationship within the different levels
so that's why you're saying that I
need to make everything clear from the
beginning that is one of the reasons why
yesterday we were looking at why it is
important to understand the levels of
measurement okay it is going to support
you on how you're going to interpret the
data and how you're going to present it
and what statistical technique you are
going to use it's the reason why we are
doing all
this is it converting or categorizing
them please um share the questions
in the Q&A
so can I move on
I'm believing now this step is okay so
we can do the
same
great
okay
I'm going to go back and read the original data set, because that's
what I'm interested in. I believe this one you all know by now.
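The column-by-column conversion described above can be sketched as follows. This is a minimal illustration, not the trainer's script: the small `literacy` data frame and the column names `QScore`, `Q21_ParentSchool` and `Q22_Rank` are assumptions standing in for the real training file.

```r
# Hypothetical stand-in for the session's data set; the real column names
# and values in the training file may differ.
literacy <- data.frame(
  QScore = c(12, 8, 15, 11),
  Q21_ParentSchool = c("College education or higher", "No college education",
                       "College education or higher", "No college education"),
  Q22_Rank = c("Certificate", "Diploma", "Certificate", "Certificate"),
  stringsAsFactors = FALSE
)

str(literacy)  # the two text columns show up as chr (character)

# Overwrite each character column with its factor (categorical) version.
literacy$Q21_ParentSchool <- as.factor(literacy$Q21_ParentSchool)
literacy$Q22_Rank <- as.factor(literacy$Q22_Rank)

str(literacy)               # both text columns are now factors
levels(literacy$Q22_Rank)   # the category labels R has stored
```

Running str before and after the conversion is the quickest way to confirm that the column type actually changed.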
Okay. There is also another simple technique that you can use if you
have a big data set rather than a small one. Within our folder there is
a file called the day one extension R script. It has the same
information; I just thought that with a big data set you may not want
to write as.factor for each variable.
I'm still using the same thing: I run the set working directory, and I
read in everything. Here I'm seeing the structure again; it's messy
with characters and it has very many variables.
So look at line 22. I can use this particular line to convert
everything automatically in one line. Literacy is my data set name,
which I have here. So I say literacy, then the less-than and dash (<-),
which assigns, then from my data set literacy I have these symbols
(%>%); sometimes we refer to them as the pipe. It's like passing my
literacy data set through. Then I put mutate_if. This is the command I
need to convert everything within one line instead of writing a line
for each variable. Inside mutate_if I put is.character: I'm telling R
that within my data set literacy, wherever you see a column that is a
character, convert it to a factor, which is as.factor.
I don't know if this line is very clear. It does everything within one
line, and you don't need to work on each and every variable; that's the
reason why I called it an extension. So in case I run line 22 and then
check the structure again, it shows me that everything has been
converted automatically. What the command does is convert all the
character columns within my literacy data set to factors, using just
the condition in the mutate_if command. In case you don't want to do a
lot of writing, you can do that within one line.
Let me check in the chat: is that okay? If you are getting an error,
make sure you load the tidyverse library. I hope you loaded it at the
beginning. Remember what we said: each time you open R, load the
libraries first; that's why you sometimes come up with errors. For
example, that particular error should disappear if you load tidyverse.
After installing any package that you want, each time you come back to
work within R, remember to load those libraries.
So this was just another, simpler way to do everything within one line.
Can I continue? Please type yes if it is okay.
Okay, great, I'm seeing more yeses than nos. Also, this day one
extension R file is within the day one folder; you can go back and
check there.
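The one-line conversion described above can be sketched like this, assuming the dplyr package is installed. The data frame and its column names are hypothetical stand-ins, and the second form using across is an addition not shown in the session; newer dplyr versions (1.0.0 and later) document it as the preferred spelling.

```r
library(dplyr)  # provides %>% and mutate_if(); library(tidyverse) loads it too

# Hypothetical stand-in for the training data set.
raw <- data.frame(
  QScore = c(12, 8, 15, 11),
  Q21_ParentSchool = c("College", "No college", "College", "No college"),
  Q22_Rank = c("Certificate", "Diploma", "Certificate", "Certificate"),
  stringsAsFactors = FALSE
)

# Every column for which is.character() is TRUE is passed through as.factor().
literacy <- raw %>% mutate_if(is.character, as.factor)
str(literacy)

# Same result with the across() form (an assumption beyond the session).
literacy_across <- raw %>% mutate(across(where(is.character), as.factor))
```

Numeric columns such as the quiz score are left alone because the condition only matches character columns.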
Okay. Professor Susan, I don't know if we need a break or I continue,
only maybe for 10 minutes. Let me ask the participants: can I continue,
or do we take a break? Already they are showing you: break, break,
break. Okay, up to 4; at 4 we are coming back. The break is 8 minutes.
Thank you. Somebody is saying continue. We are not giving you 15
minutes; we are giving you only eight.
And those that are asking for the data set, please check the Google
link.
wck needs you a goba
WS
you he needs to go to a break room with
you yeah um I'm seeing it
here maybe um we can we can do that
after if you can send an email we can
create a zoom after
the when after dissolving the whole
group today how is
that I will be running away from
Jam maybe you can conduct it alone yeah
I can
yeah Thomas you're also running
away running
okay wli you can uh you can send me an
email and then we can do a zoom after
after the after the grou has dis
somebody that because say not
him I think he has joined today that
somebody hacked his account so it's
not oh okay
registration
link
Okay, welcome back. I think we can continue. I'm seeing some people
complaining that they are getting an error about literacy and so on.
Make sure you load the tidyverse library, or dplyr; then you should
stop getting that error.
Let's get started; other comments will be answered along the way in the
Q&A. Are you back? Please type yes if you're back. I hope I'm not
talking to myself. Great.
Let's continue. We now know how to convert the character variables to
categorical, and how to use the str command to check the structure of
our data set. I have just shown you the different alternatives that you
can use, and we've seen from the structure that everything is now okay.
We're going to do a few simple statistics.
(The volume is low. Sorry, the volume is low.)
Okay, somebody says they have a mutate error. Can they show their
screen? Let me stop sharing.
Who is that person? If you have a mutate error, can you share?
What is the name again?
I wanted them to share because she says she has a mutate error. I think
it's Abdul Kim.
Is it possible for them to raise their hand? We have so many
participants.
The message is about mutate_if with is.character and as.factor: could
not find the function for the pipe.
He needs to share, but I think he can solve that. Maybe he didn't load
the dplyr or the tidyverse package, because the mutate_if command needs
the dplyr package. That's what I was saying: if he runs that first, the
error should be sorted.
Abdul Kim is now promoted. Abdul Kim, please go ahead and share your
screen.
Can you see my screen, please? Good afternoon, and thank you for the
opportunity.
Yes, we can see your screen.
All right, thank you. My problem is with the mutate function, trying to
change all the character variables to factors.
Okay.
So this is the command. I'm not sure if I got it right in the first
instance, but I've tried to load dplyr and the tidyverse.
Okay, can you run it and we see?
Okay, I'll run it.
Then run the structure: type str, then in brackets put literacy. Can
you run that?
Fantastic, thank you.
You're welcome; you can stop sharing.
Okay, thanks a lot.
Anyone with a similar challenge: all you need is to load the tidyverse
library, and you should be able to solve that error. It is spelled
t-i-d-y-v-e-r-s-e. If you don't have the right name of the package,
you'll still come up with an error, because R will not recognize it.
For example, somebody is reporting an error, but the spelling is not
right.
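The troubleshooting advice above (check the spelling, then load the package before using the pipe) can be sketched as a small check. This is an illustration only; the variable names `pkg` and `pipe_available` are made up for the example.

```r
# Package names are case-sensitive and must be spelled exactly,
# e.g. "dplyr" or "tidyverse".
pkg <- "dplyr"

if (requireNamespace(pkg, quietly = TRUE)) {
  # The package is installed, so attach it for this session.
  library(pkg, character.only = TRUE)
  pipe_available <- exists("%>%")
} else {
  # Not installed (or misspelled): install it once, then load it every session.
  message("Package '", pkg, "' was not found; run install.packages(\"", pkg, "\") first.")
  pipe_available <- FALSE
}

pipe_available  # TRUE once dplyr is attached and %>% can be found
```

Installing is done once per machine, while library has to be run again in every new R session.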
Okay, I'm sharing my R script again.
All right. Now that we've converted, we know that our data set reads
right: all the categorical variables have been converted. Then we also
have the integers, which are sometimes called numerical variables;
those ones you really don't need to worry about.
Most of the time you can do summary statistics using the summary
command. The command is summary, and inside you type the name of your
data set. When you run that and come to your console, it shows you
several things. For the case of the Q score, since it's a continuous
variable, it tells you that it has a minimum of 1, a mean of 11.23, and
so on. When you come to parent school, it tells you that college
education or higher is 231 and no college education is 169. This is
what we mean by doing the descriptives. When you look at sex, the
females are 271 and the males are 129.
Now, in this case, when you just put in the name of the data set, it
gives you a summary of all the variables that you have. However, you
can also target only a specific variable if you are interested in that.
For example, if it's a continuous variable, you type summary, then your
data set, and then a dollar sign followed by what you want a summary
of. You're saying: I need a summary; go to my literacy data set and
pick for me the Q score.
Let me rewrite this again. We say summary, that is my command, and type
in literacy, then the dollar sign. When you put a dollar sign, it asks
you: a summary of what? Then you click the variable, and if you run it,
it gives you the same information as you had earlier, only that this
time around you're interested in a specific variable.
The summary works for all continuous variables. You can also do other
things: you may want only the mean or the minimum, or the standard
deviation using sd, etc. These are all built-in commands that you can
refer to. But remember, you start with the data set name, then the
dollar sign, then exactly which variable you are interested in.
In case of categorical variables, you can still do the same. For
example, here you can say summary, literacy, dollar sign, then parent
school. Run that and it will show you the different frequencies.
Alternatively, you can use the table command. It is table, and inside
you put literacy, the name of your data set, the dollar sign, then the
categorical variable that you're considering. If you run that, you
still get the same results.
You can do the same on other variables, like family income. If you run
line 123,
then it will tell you that oh okay um
those that are when we looking at family
income those with no income they're just
758 and then with income between this to
that it's that those that have got um
income above 500 and more they 95 I mean
that is the simple descriptive that
we're talking about. But you just need to know the common ones: you can
use summary for everything; you can use summary for a specific
variable, continuous as well as categorical; and you can also use the
table command for categorical variables.
Let me check in the chat: are we on the same page?
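The summary and table commands walked through above can be sketched on a small example. The six-row data frame and its column names are assumptions for illustration; they are not the session's 400-student data set.

```r
# Hypothetical stand-in for the training data set.
literacy <- data.frame(
  QScore = c(12, 8, 15, 11, 9, 14),
  Q21_ParentSchool = factor(c("College", "No college", "College",
                              "College", "No college", "College"))
)

summary(literacy)                    # every variable at once
summary(literacy$QScore)             # one continuous variable
mean(literacy$QScore)                # just the mean
sd(literacy$QScore)                  # just the standard deviation
summary(literacy$Q21_ParentSchool)   # counts per level of a factor

# table() gives the same counts for a categorical variable.
parent_counts <- table(literacy$Q21_ParentSchool)
parent_counts
```

The pattern is always the data set name, the dollar sign, then the variable of interest.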
Okay, "object not found": you need to find out which object they are
talking about. That one is resolved? Okay, all right.
We can also run relative frequencies by using prop.table. You need the
prop.table of the table that you got previously. For example, I already
know that when I'm looking at parent school there are two levels:
either a parent has a college education or no college education. I can
go ahead and get proportions or percentages by using prop.table. So I
use prop.table and put inside it the table that I want proportions of.
I run this particular line and it gives me the proportions: this one is
0.57, and those with no college are 0.42. If I want to put them into
percentages, I use the same line but multiply by 100. So I'm running
line 132 and it's giving me 57.75 and 42.25. You're reporting the
frequencies, and you're reporting the percentages.
But in case you're not interested in doing the breakdown step by step
and you want to do everything at a go, there's another interesting
command that you can use. You install the summarytools package and then
load the library, which is summarytools. I'm just giving you an
example. Inside here our main command is freq. Remember, we said we are
going to build our libraries as we move on, depending on what we need.
So I run line 137; I've already installed it, so I don't need to
install again, I just need to load the library.
I'm going to use the freq command. I can see the frequencies for
everything within my literacy data set, or I can do it for a specific
variable; it doesn't matter. If I run freq and put in my data set name,
see what is going to happen: it gives me a summary of all the variables
that I have. For example, this one is literacy with parent school. It
gives me the frequency for both levels, and it also gives me the
percentages, 57.75 and 42.25. It also gives you the cumulative
percentages, and the NA row is telling you that there are no missing
values. Then it gives you the total, and the total percentage is 100.
What this command does is give you more information, so that you don't
need to take a lot of time. The other bit you need to do is pick the
information, if you're writing a paper, and put it in a Word document.
We talked about the rank variable: it is telling us that it's
certificate and diploma. Those under certificate are 286 and those
under diploma are 114. You can pick your frequencies and transfer this
information wherever you want it to work for you. Let me pause.
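The proportion and percentage step described with prop.table can be sketched as below. The counts 231 and 169 are the ones quoted in the session for parent school; rebuilding the variable with rep is only an illustrative shortcut, since the actual data file is not shown here.

```r
# Rebuild a factor with the counts quoted in the session (231 and 169).
parent_school <- factor(rep(
  c("College education or higher", "No college education"),
  times = c(231, 169)
))

counts   <- table(parent_school)   # frequencies
props    <- prop.table(counts)     # proportions that sum to 1
percents <- props * 100            # percentages that sum to 100

counts
round(percents, 2)
```

With 400 students in total, the percentages come out as 57.75 and 42.25, matching the figures read out above.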
Is anyone lost? Someone says, "I am lost in percentage and frequency."
Okay.
Can you type one if we are on the same page? If not, then I can recap a
little bit. Someone is typing two, someone zero.
Okay, all right.
Let me ask a question: have you understood how to use the summary
command? Can you type yes if you have learned how to use the summary
command?
All right. The summary just gives us a few descriptives. Inside the
summary command, in case you want the whole data set, you type the name
of the data set within the bracket. In case you want a specific
variable, you type the name of the data set, put a dollar sign, and
after the dollar sign you choose the variable that you're interested
in.
I also said you can do the proportions. What I showed first were just
the separate steps, but there are also other commands whereby you can
do everything at a go.
If you have categorical variables and you don't want to use summary,
you can use the table command. Inside the table command you start with
the name of the data set, then the dollar sign, then the specific
variable that you're interested in. If you run that line it gives you
an output, and from that output you can go on and get percentages; in
that case we use the prop.table command.
In case you don't want to have all those breakdowns, you can use
summarytools and the freq command. The freq command generates
comprehensive frequency tables with the counts as well as the
percentages, so you're combining everything. It does not separate them
for you; it just gives you a summary of everything, with the
frequencies as well as the percentages.
I'm just giving a recap. If you put in freq and type in the data set
name, which is literacy, it gives you a summary for each variable. For
example, this is literacy and the Q score; as you proceed down it gives
you literacy and parent school, then literacy and rank, then literacy
and year, for all your variables. I think you're seeing that dollar
sign that is popping up. It also goes ahead to tell you that this is a
factor, so it is categorical. I had first year students and second year
students: the first years were 255 and the second years were 144. You
can go ahead and pick the frequencies. Here the percentage is 63.75,
and the percentage for the second years is 36.2. If you don't want to
pick this particular column, you can also pick the total percentages;
they are still the same results.
If you want to transfer your output to a Word document, you can go
through the different counts. For example, we looked at family income:
those that never had income were this many, and this is the percentage.
The NA row shows missing values; there are no missing entries, which is
why we have zero. This one just gives you what you need if you want to
summarize everything at a go. I don't know if we are on the same page.
Someone asks how I imported into R. You can refer to the YouTube
recording; that will give you a good answer. So can I continue? Are we
on the same page now?
You can also try it at your end. As you try, it will be easy for you to
know: when I tried, I came up with an error, or at my side it is not
coming out. Then it is easy to ask a question if you have a problem.
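To try the all-in-one frequency table at your end, a sketch along these lines may help. It assumes the summarytools package is installed (the call is skipped otherwise) and uses the rank counts quoted earlier in the session, 286 and 114; the column name `Q22_Rank` is a hypothetical stand-in.

```r
# Rebuild the rank variable from the counts quoted in the session.
literacy <- data.frame(
  Q22_Rank = factor(rep(c("Certificate", "Diploma"), times = c(286, 114)))
)

# Base R equivalent: counts and percentages in two steps.
rank_counts <- table(literacy$Q22_Rank)
rank_pct    <- round(100 * prop.table(rank_counts), 2)
rank_pct

# summarytools::freq() combines counts, percentages, cumulative
# percentages and missing values in one table.
if (requireNamespace("summarytools", quietly = TRUE)) {
  library(summarytools)
  print(freq(literacy$Q22_Rank))
}
```

Passing a whole data frame to freq, as done in the session, is expected to produce one such table per variable.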
Okay. Kindly, in case you have time, remember that analysis is always
about practicing, and then these things will become part of you. I'm
going to stop sharing this and share my slides.
We have some other things that we can do within our data sets;
remember, it depends on the objective that you want to achieve. You can
create new variables, you can rename columns, you can subset rows. This
could even be part of the assignment: ask yourself, given this
particular data set, how do I add a new variable? How do I delete a
variable? The descriptive analysis part is already done, but you can
still try out the rest.
If you want to do data manipulation, for example to create a new
variable, you create it within the same data set. For example, within
my data set the quiz score ranges from 1 to 20. I can say that those
that are below, let me say, 14 have low levels of financial literacy,
and those above 13 have high levels of financial literacy. In that case
I need to create a new variable that is categorical. I can move from a
continuous to a categorical variable, but not the other way around.
I can use the rename function to rename a given column. All these
functions are within the tidyverse package; that is all that you need,
since tidyverse has many packages nested in it. I can also subset the
rows. For example, we said that I've got first year and second year
students: I can subset the rows and work with only first year students,
or subset my data set and work with only the second year students. I
can delete columns by using either the subset or the select function.
This next part is very important. It helps you, when you're doing
descriptives, to see the different things that might be of interest. If
you have one categorical variable, like what we've seen, most of the
time the method of analysis is frequencies, which we have just talked
about, and for visualization you can do a pie chart or a bar chart. A
lot of this work is going to be covered in the next session on
exploratory data analysis. In case of one scale variable, that is one
continuous variable, we do summary statistics, which we've seen; if you
want to do it visually you can do a histogram. In case of two
categorical variables, the method of analysis is a chi-square test. If
there are two continuous variables we use a correlation coefficient. If
it's one categorical and one scale variable we do a comparison of
means, or what you refer to as an analysis of variance. And of course
there are different methods of visualization that we can use.
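The data-manipulation steps listed above (renaming a column, keeping only some rows, dropping a column) can be sketched with dplyr. The data frame, the column names and the new name `QuizScore` are all hypothetical, and the choice of filter and select here is one reasonable reading of the "subset" and "select" functions mentioned.

```r
library(dplyr)

# Hypothetical stand-in for the training data set.
literacy <- data.frame(
  QScore = c(12, 8, 15, 11),
  Year = c("First year", "Second year", "First year", "Second year"),
  Q22_Rank = c("Certificate", "Diploma", "Certificate", "Certificate")
)

# Rename a column: new name on the left, old name on the right.
literacy <- literacy %>% rename(QuizScore = QScore)

# Keep only the rows for first year students.
first_years <- literacy %>% filter(Year == "First year")

# Drop a column with a minus sign inside select().
without_rank <- literacy %>% select(-Q22_Rank)

# Base R alternative for row subsetting.
second_years <- subset(literacy, Year == "Second year")
```

Each step returns a new data frame, so the original is only changed when the result is assigned back to the same name.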
okay um let me pause a little bit and I
ask in the chat are we on the same
page
okay if you have a question you can post
it in the Q&A and then it will be
answered all right
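The pairing of variable types with analysis methods described on the slides can be sketched with base R functions. The built-in mtcars data set is used purely as a stand-in, since the training file is not available here; which columns count as categorical is an assumption made for the example.

```r
# Two categorical variables: chi-square test of independence.
chi <- chisq.test(table(mtcars$vs, mtcars$am))
chi

# Two continuous variables: correlation coefficient (Pearson by default).
r <- cor(mtcars$mpg, mtcars$wt)
r

# One categorical and one continuous variable: comparison of means
# through a one-way analysis of variance.
fit <- aov(mpg ~ factor(cyl), data = mtcars)
summary(fit)
```

The same three calls apply to any data frame once the categorical columns are stored as factors.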
I'm going to stop sharing here and do the last part when I go back to
R.
Okay, I just want us to do a few things within data manipulation. For
the other things, I believe that by the end of the week you'll know how
to navigate your different data; you just keep building and you keep
learning. So I'm only going to do one thing: how do I create a new
variable? I've got a variable which is Q score; these are the scores
that students got. In case I read my data set again, that is the
literacy data set, I can pick the summary statistics; this one at least
we now know. I can use mean to find just the mean alone. Then I want to
create a new variable using the ifelse command.
Remember what we talked about yesterday: if you're not certain how a
particular command works, you can type a question mark and then ifelse.
Sometimes you can read what it does from the yellow window, or you can
just run that line and come to your Help window, where they tell you
how to use it and give you the description. They are telling us that it
returns a value with the same shape as test, filled with elements of
either yes or no depending on whether the element of test is true or
false. This is the usage that we are going to use: first write the
test, that is the condition that you want to test. If the condition is
satisfied then it will be yes; otherwise it will be no.
This is what we want to create, because from our continuous variable we
want to set a condition and categorize students into two groups:
financially literate and financially illiterate. That implies I'm going
to create a condition here; my test is the first thing. The test is
going to be on the financial literacy score: within my data set I pick
the financial literacy score, and if it is greater than my mean, that
is the test, then yes. For my yes I'm saying call them financially
literate; otherwise it is the opposite. That is what I have on line
160.
Yes, that is what I have on this particular line: my ifelse. The test
is: go to my data set and pick for me the quiz score; if it is greater
than the mean. This is the test I'm considering.
In case this test is passed, call them financially literate; otherwise
let them be financially illiterate. And I'm creating a new variable,
called Q score 2. I also want this variable Q score 2 to be attached
within the same data set. That's why I write literacy, my data set,
then the dollar sign, then the new variable that I'm creating, which
I'm calling Q score 2. It is the one that is going to show me whether
students are financially literate or financially illiterate.
After that I run this particular line. Please note that initially we
have 17 variables, so if I run this line I expect my variables to
increase. If I come here and run it, you see what is happening: my
variables have increased from 17 to 18. If I want to view the data, I
can click here, and within my data set I check whether there has been
any change. You see that I've created a new variable, and it has
categorized two things for me: either a student is financially
illiterate or a student is financially literate.
I can even do the tabulation, since it's now categorical. I run the
table command on the new variable and I see that I've got 201 that
belong to this category and 199 that belong to that category.
Let me pause a bit and check the chat. Are we on the same page? Is
anybody lost? Is there anyone who has tried and wants to share?
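The ifelse step explained above can be sketched as follows. The six scores and the column names `QScore` and `QScore2` are made up for illustration; in the session the cut-off was the mean of the real quiz scores (11.23).

```r
# Hypothetical quiz scores standing in for the real data.
literacy <- data.frame(QScore = c(5, 9, 12, 14, 16, 11))

# Use the mean of the score as the cut-off.
cutoff <- mean(literacy$QScore)

# ifelse(test, yes, no): label each student according to the test.
literacy$QScore2 <- ifelse(literacy$QScore > cutoff,
                           "Financially literate",
                           "Financially illiterate")

# Store the new labels as a factor so they behave as a categorical variable.
literacy$QScore2 <- as.factor(literacy$QScore2)

ncol(literacy)           # one more variable than before
table(literacy$QScore2)  # how many students fall in each group
```

The original score column is kept, so the new variable sits alongside it in the same data set.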
Okay, let's get two people on that, and then maybe we can learn from
the rest. If you want to share, kindly put up your hand and then you
can be promoted. There's Olga; if you want to share, then we can see
from you. Then Sharon. Anyone who has tried and it's not clear, we can
explain from your end.
Gloria, okay. Gloria, can you share? We shall have two people.
Some people are saying they can't see the script, so maybe you zoom in,
and others are saying they are completely lost, so you may have to
recap a few things.
Okay, let's first hear from Gloria and then I'll recap. Thank you for
that. Gloria, go ahead. Oh sorry, let me stop sharing. Please go ahead.
okay yeah
trying to be loud such that we can hear
you
yeah
yeah been able to run
the
my I
don't see my
um I can see
the
okay yes that you're showing me the data
set I'm seeing
that
But you already have the Q score 2.
Yes, the Q score 2.
Okay, can you scroll up? Yes, continue. When you look at the first
person, we can see that they are categorized as financially illiterate.
It implies that their score was below the mean, because the mean was
about 11. So when you go back to the original Q score, where we have
their values, they should have a score less than 11.
Can you briefly explain to me what you understand by that line? Then
I'll recap from there; once I know what you've understood, I can
support.
Okay, I'm looking at line 136.
Line 136, yes. I want to hear what you've learned about that line, then
I can support you.
Sorry...
You're lost?
Yes.
Okay. Let's see the beginning of line 136. We are still on line 136.
Okay, great.
Now try to follow closely. You know what line 128 does: it was just
reading the data set for us, isn't it?
Yes.
And we saw that it was read successfully within the environment. We
said that we wanted the summary of the Q score. The Q score is our
dependent variable; it is the quiz score. So I came here and I wanted
to know the mean, so I used the mean command: mean, then in brackets
you put whatever you want the mean of. Are we there? I want to use the
mean as my threshold, to say whether a student is below the mean, and
the mean is 11.23. Are you following?
Yes.
If you run line 132, you'll see that the mean is 11.23.
Yes, I can see that.
That is what I build on, because if I want to convert from continuous
to categorical I need a test. What test can I use? We said to come to
the Help window. Are you seeing the Help window? You have it here; this
is the Help window. Are you seeing where I'm putting the pointer?
Yeah.
Great. With the Help window I just want to understand how to use
ifelse. They're telling you the usage: ifelse, and in brackets there
are three arguments. The test is the logical condition that you want to
use; for example, my test is whether my Q score is greater than the
mean, which I already have. Then your yes is what you want to call it
if the test is passed; I want to say these students are financially
literate. Otherwise it is the no, the opposite. Am I making sense?
Yes.
So now I come and pick the ifelse command, which is here on this
particular line. What is my test? The test is on the Q score, the quiz
score. Within my literacy data set, pick for me the quiz score; if it
is greater than the mean, which is 11.23, then yes, call it financially
literate; otherwise the no is financially illiterate. Are we on the
same page there?
Yes.
But I want to introduce a new variable within my data set, called Q
score 2, because I need a column that gives me this output. Remember,
I'm converting from continuous to categorical, and I want to maintain
the Q score. So I create a new variable, Q score 2, within the same
data set, that will show me financially literate and financially
illiterate. Q score 2 is a new variable, but I attach it to the
literacy data set so that it's part of the other variables that are
there. Am I making sense?
Yes.
That's why, when you run that line, you see that the variables increase
from 17 to 18. The 18th one is the Q score 2 variable, which shows
financially literate and financially illiterate, yet we've also
maintained our original Q score. Does it make sense?
Yes.
Okay, let me give you a simple exercise. Okay, let's run
a summary statistics for age the open
age because it's also continuous and
then we say we we use the if el
command you can still do it under
there okay
You can just type. Come to where there is table. Okay, press Enter to create space there, space where you can work from. Enter. Okay, great. Let's do it for age; maybe it will be clearer. Let's have a summary: literacy, our data set, dollar sign, and let's get Q24 open age, which is the continuous one. This one here; scroll down and you can pick it. Yeah, open age, great. Okay, run it. So we are seeing that the mean age is around 22.5. Now, just for learning purposes, let's say young and old. If a student's age is above 22, okay, let's say 22.5 for learning purposes, we say that person is old; below that, we say young. You get what I mean? Okay, just copy that line and then edit it.
So we need to create... 22. It was 22 point what? 22... even if we just put 20 it doesn't matter, but just for learning purposes put point five. Uh-huh. We need to change this; our test has to change: we put age. We put age. Yes, that one. Okay, now, what is the condition? If somebody is above the age. Now, instead of financially literate, that person is... okay, you get it. If the condition is passed, let's put old; otherwise let's put young. We are saying that, if it is greater, let that person be old. Yes, wonderful.
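The age exercise can be sketched the same way. The ages below are made up and the column names `Q24_open_age` and `age_group` are assumptions based on how the variable is spoken in the session; the 22.5 cut-off is the one chosen for learning purposes.

```r
# Made-up ages; Q24_open_age is an assumed spelling of the column name.
literacy <- data.frame(Q24_open_age = c(19, 21, 22, 23, 25, 30))

# Look at the distribution first, as done in the session.
summary(literacy$Q24_open_age)

# Above 22.5 is labelled "Old", everything else "Young".
literacy$age_group <- ifelse(literacy$Q24_open_age > 22.5, "Old", "Young")

# Optional: store the result as a factor so R treats it as categorical.
literacy$age_group <- factor(literacy$age_group, levels = c("Young", "Old"))

table(literacy$age_group)
```

Only the test (`Q24_open_age > 22.5`) and the two labels change compared with the quiz-score line; the structure of the `ifelse()` call stays the same.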
Are you still there? Oh wow, we lost her. Okay, who can demonstrate that part for us, for the case of age, to show that you've got it, that you can convert from continuous to categorical? Anyone who wants to demonstrate?
Okay. From what our colleague was sharing with us, I think you've picked up the concept, so I'm going to leave you with an assignment. You can still categorize into three levels, and this one I want you to do on your own. Yes, I've put the commands there for you, but try to read through them and understand how they work. Maybe, for financial literacy, you want to put it into low, medium, and high, or break it into whatever levels you want to consider. If you want, you can still use the ifelse command, or you can use the cut command; all these things you can learn as you move on. Okay: you can use the cut command, you can use the case_when function, or you can use the ifelse command.
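For the three-level version left as an assignment, one possible sketch with base R's `cut()` is shown here. The break points 7 and 14, the toy scores, and the column name `Qscore3` are hypothetical choices for illustration only; the session does not specify them.

```r
# Toy quiz scores; Qscore and Qscore3 are assumed column names.
literacy <- data.frame(Qscore = c(3, 7, 10, 12, 16, 19))

# Hypothetical break points: up to 7 is Low, above 7 up to 14 is Medium,
# above 14 is High. By default cut() closes each interval on the right.
literacy$Qscore3 <- cut(literacy$Qscore,
                        breaks = c(-Inf, 7, 14, Inf),
                        labels = c("Low", "Medium", "High"))

table(literacy$Qscore3)
```

The same grouping could be written with nested `ifelse()` calls or with `dplyr::case_when()`; `cut()` is used here only because it needs no extra package and returns a factor directly.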
All of that is great for you. I want to wrap up. You can rename your variable; for that we use the rename command. Inside it, you put the name of the data set and then, after the name of the data set, what you want to change. For example, instead of Q score I want to call it quiz score, so you type the name you want to change it to and set it equal to the name that exists in the data set. So I'm changing Q score to quiz score. If I run that line and then run colnames to check whether it has taken effect, I'll see that I no longer have Q score; I have quiz score.
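A minimal sketch of the renaming step, assuming the `rename()` meant here is the one from dplyr (part of the tidyverse loaded earlier in the training) and that the columns are spelled `Qscore` and `quiz_score`; both spellings are assumptions.

```r
library(dplyr)  # assumes dplyr (part of the tidyverse) is installed

# Toy data; the column names are illustrative assumptions.
literacy <- data.frame(Qscore = c(8, 15, 11), rank = c(3, 1, 2))

# New name on the left, existing name on the right.
literacy <- rename(literacy, quiz_score = Qscore)

# Check that the change has taken effect.
colnames(literacy)
```

Note that the result has to be assigned back to `literacy`; calling `rename()` without the assignment only prints the renamed data.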
The same command can be reused for other variables. Then, what if I want to subset the rows of the data set? I can use the subset command. Let's take an example. For year of study we had first year and second year. In case I want to work with only first years, what do I do? It implies that I'm going to create a new data set: from the master data set I'm going to create a smaller data set for first years only. So how do I do that? This is the last bit. I name the new data set I'm creating for first years only literacy year one. I use the subset command, I put in the name of my data set, which is literacy, and then within literacy I pick the variable I'm interested in. The variable is Q23 year. This variable has two levels, first year and second year, and I tell R to give me first years only. That is why I say Q23 year, double equals, first year. Okay, so if I run this line, what is going to happen? I've created a new data set, now with 255 observations. These are only the first years, implying that the balance are second years. I still have the 18 variables. I can click on the icon and check my new data set, which has only the first-year students.
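The row subset described above could look like this. The toy rows, the column name `Q23_year`, the level label "First year", and the object name `literacy_year1` are all assumptions about how the real data and script are spelled.

```r
# Toy data; names and level labels are illustrative assumptions.
literacy <- data.frame(
  Q23_year = c("First year", "Second year", "First year",
               "Second year", "First year"),
  Qscore   = c(8, 15, 11, 12, 9.5)
)

# Keep only the rows where the year of study equals "First year".
# Note the double equals sign: == tests equality, = would assign.
literacy_year1 <- subset(literacy, Q23_year == "First year")

nrow(literacy_year1)  # fewer observations than the master data set
ncol(literacy_year1)  # same number of variables
```

The master data set `literacy` is not modified; the first-year rows are copied into the new object.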
Okay. In case I want to reproduce it for something else, all I need to know is my data set and the variable I want to subset on. If you feel stuck, you can still use the Help window or the help command to support you.
All right, and then the last one here: what if I'm not interested in certain columns within my data set? I can delete them and remain with only the ones I need. But it still implies that you create a data set again, excluding all the variables you don't need, so that you stay only with what you want. I called my data set literacy 2. I use the subset command, I give my data set, literacy, and then I use the select argument: select equals, and you put minus c. The c is the concatenate command, which brings things together. Inside it I'm saying: from the main data set, remove for me the parents variable and remove for me the rank variable. You can add as many variables as you want here; I considered only two. So when I run it, I expect my variables to reduce by two: they were 17, now they will reduce to 15. As you work, you have to keep that at the back of your mind. When I run that, I can see that my variables have reduced, and when you check again you will not see the rank variable, and the parents variable will be deleted.
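Dropping columns with `subset()` and a negative `select` can be sketched as below. The toy data and the exact column names (`parents`, `rank`, `Qscore`, `Q23_year`) are assumptions; the object name `literacy2` follows the name used in the session.

```r
# Toy data with four columns; the names are illustrative assumptions.
literacy <- data.frame(
  Qscore   = c(8, 15, 11),
  parents  = c("Both", "One", "Both"),
  rank     = c(3, 1, 2),
  Q23_year = c("First year", "Second year", "First year")
)

# The minus sign in front of c() means "everything except these columns".
literacy2 <- subset(literacy, select = -c(parents, rank))

ncol(literacy)       # before
ncol(literacy2)      # two fewer
colnames(literacy2)  # parents and rank are gone
```

More column names can be added inside `c()` to drop more variables in one step.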
Those are some of the ways that we can do data manipulation. I think that will be the end of my part for today. The other bits are going to be on exploratory data analysis, but there's still a lot within the day one script which you can refer to and teach yourself some of these things, especially as we are going to see more of data visualization in the coming session with the rest of the colleagues. Professor Susan, Thomas, I'm handing over to you now, unless there is a quick question that I need to answer.

Oh, thank you very much, Dr Helen. Can you hear me?
Sure. Okay, thank you very much, Dr Helen, and thank you, participants. I hope you've now become our gurus in importing and exporting data.

Can we see your face?

Oh, you want to see my face? Why do you want to see my face? You already know me. But to see Dr Helen, you pay a fine. Okay, so I was saying that I think you are now comfortable with the first bits of R. You can now work in R and import data, either Excel or any other file extension, and I encourage you to go and practice. We promised to end at 5, and it's 2 minutes to 5. Tomorrow we will be having exploratory data analysis with Dr Thomas, who will take us through exploratory data analysis and a bit of inferential and nonparametric statistics. From there, time allowing, we will do a bit of regression. But for now, I encourage you to practice; practice makes perfect. Go to YouTube: you can either download or play the YouTube video and you'll be able to follow. I saw some people say they came late. You don't need to worry about coming late, because it's already recorded; you can always take time to listen to the YouTube recording and follow along, so that after this week, next week, we are not going to repeat any of the introductory part. Once we finish this introductory part we will be doing the advanced statistics, and we are not going to talk about entering data, loading files, and installing packages. So please try to actively practice. Thank you so much. Unless my colleague Thomas has something to say; I'll give him the opportunity.

Hello? Yes, thank you, everyone. I'm not as nice looking as my two colleagues; I hope you'll forgive my looks.
Yeah, thank you very much. I've been trying as much as possible to answer the queries raised in the Q&A. I still see some people posting in the chat what should have been posted in the Q&A, so please use the Q&A if you have any question. I've been seeing how people are struggling with the code. It is very important to read the error that R gives you, because in that error there are solutions. What I mean is, sometimes R says a package is not available or is missing; then you first install that one before you can install the next one. We have a lot of material: there is the YouTube channel for RUFORUM, and there is also a lot of material on Google, so you can always Google. If you get any error, you can copy the error and put it into Google, and there will be suggestions on how that particular error can be sorted out. You have the material for exploratory data analysis, and the inferential material has already been given to you. So please download the folder, put it on your desktop, open the R script, and try to install all the packages that are required, so that tomorrow we can pick up from there and move on. Thank you very much.
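One way to check which required packages are still missing before the next session is sketched below. The package list is hypothetical (the real list is whatever the shared R script loads), and `missing_packages()` is a helper written for this illustration, not a built-in function.

```r
# Hypothetical package list; replace it with the packages named in the script.
required <- c("readxl", "dplyr", "ggplot2")

# Helper for this example: return the names in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install whatever is missing (needs an internet connection).
# if (length(to_install) > 0) install.packages(to_install)
```

If R later reports that some other package is not available, that name can be added to `required` and the check run again.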
Maybe I can hand over to the RUFORUM team. Is there anybody? There's Amit, there's Jolene, so please, I hand over to you. Thank you. Let me remove my face from here now.

Thank you so much, Dr Thomas. Thank you, Doctor, and thank you very much, Professor Susan. Thank you so much, participants. Today we end here, and tomorrow we start again at 2:00. Please take time to go and review some of the sessions on our RUFORUM Network YouTube channel for any issues you might be facing. Thank you so much; have a good evening.