Advanced Statistics and Experimental Design: Day 2 R Workshop – From Data Management to ANOVA and Post‑hoc Tests

Name: Advanced Statistics and Experimental design Day 2
Uploaded: 2026-01-14T15:42:42.658797+00:00
Channel: RUFORUMNetwork
Description: Summary and key takeaways on Advanced Statistics and Experimental Design: Day 2 R Workshop – From Data Management to ANOVA and Post‑hoc Tests, covering


YouTube video ID: 6f4_o1G49pE

Source: YouTube video by RUFORUMNetwork — Watch original video


Introduction

The session was the second (actually third) day of the Advanced Statistics and Experimental Designs training organized by the Forum Secretariat. Participants were welcomed, reminded of meeting etiquette, and thanked for their punctuality and engagement.

Session Overview

Facilitators: Prof. Balava, Dr. Thomas, and technical specialist Salman.
Funding acknowledgment: World Bank and the Institute of Food and Nutrition.
Options for in‑person training were offered for those who prefer physical workshops.

Preparing the R Environment

Create a project folder for each analysis (data, scripts, README, etc.).
Set the working directory in RStudio via Session → Set Working Directory → Choose Directory.
Install required packages (e.g., agricolae, doBy, lattice, effects).
Load packages with library() calls before running any analysis.
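
The setup steps above can be sketched in a few lines. This is a minimal sketch, not the workshop's own script: the package names come from the session, while the folder path in the commented setwd() call is a hypothetical placeholder.

```r
# Packages named in the session.
pkgs <- c("agricolae", "doBy", "lattice", "effects")

# Find out which ones are not installed yet.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # run once, then keep it commented out
}

# Load the packages that are available before running any analysis.
for (p in setdiff(pkgs, missing_pkgs)) {
  library(p, character.only = TRUE)
}

# Script equivalent of Session > Set Working Directory > Choose Directory.
# setwd("~/Desktop/day2_materials")  # hypothetical path, adjust to your folder
getwd()
```

Keeping the install line commented out after the first run mirrors the advice given in the session: install once, load every time.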

Data Management

The example data set “bond_time” was built from vectors representing replicates for four treatments (A, B, C, D).
Participants were instructed to run a series of lines (43‑82) that create the data frame, compute summary statistics, and generate the vector of observation numbers (1‑12).
Emphasis was placed on not re‑typing code; instead, use the provided script files (*.R) and run them directly.
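
A data frame of this shape can be built as follows. The numbers are invented for illustration and are not the workshop's data; the object names a, b, c_trt, d and the column names are assumptions.

```r
# Invented values, three replicates per treatment (not the workshop data).
a     <- c(20, 22, 24)
b     <- c(28, 30, 32)
c_trt <- c(25, 27, 29)   # named c_trt to avoid confusion with the c() function
d     <- c(38, 44, 50)

# Summary statistics for one treatment vector.
summary(a)
var(a)
sd(a)

# Stack the vectors into one data frame: 4 treatments x 3 replicates.
bond_time <- data.frame(
  rep       = rep(1:3, times = 4),
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(a, b, c_trt, d)
)

# Observation numbers 1 to 12, used later when plotting residuals.
bond_time$obs <- 1:12

# Treatment means as a quick check.
aggregate(bond_time ~ treatment, data = bond_time, FUN = mean)
```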

Running One‑Way ANOVA

model1 <- lm(bond_time ~ treatment, data = bond_time)
anova(model1)

The ANOVA table reports degrees of freedom, sum of squares, mean squares, F‑value, and p‑value.
In the example, the p‑value was 1.084e‑5, far below the conventional α = 0.05, indicating a significant difference between at least one pair of treatments.

Interpreting the ANOVA Output

P‑value: compared to α to decide significance.
Stars notation (***, **, *) reflects the magnitude of significance (e.g., *** ≈ p < 0.001).
Coefficients: the intercept represents treatment A; other coefficients show differences relative to A.
R‑squared and adjusted R‑squared give a sense of model fit (discussed later for regression contexts).
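
To see how the intercept and the other coefficients relate to the treatment means, a small sketch with invented data may help (the values are assumptions, chosen only so the arithmetic is easy to follow).

```r
# Invented data: 4 treatments, 3 replicates each (not the workshop data).
bond_time <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(20, 22, 24, 28, 30, 32, 25, 27, 29, 38, 44, 50)
)

model1 <- lm(bond_time ~ treatment, data = bond_time)

anova(model1)     # ANOVA table: Df, Sum Sq, Mean Sq, F value, Pr(>F)
summary(model1)   # coefficients, residual standard error, R-squared

# Treatment means and model coefficients side by side.
trt_means <- tapply(bond_time$bond_time, bond_time$treatment, mean)
coefs     <- coef(model1)
trt_means
coefs
```

With R's default treatment contrasts, the intercept equals the mean of the first factor level (A here), and each remaining coefficient is that treatment's mean minus the mean of A.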

Checking Model Assumptions

Residual vs. Fitted Plot – looks for constant variance; the example showed increasing spread for larger fitted values, suggesting heteroscedasticity.
QQ Plot – assesses normality of residuals; most points followed the 45° line, with a few outliers.
Residuals by Treatment – boxplots revealed that treatment D had markedly higher variability than the others.
Statistical Tests
Levene’s Test (via leveneTest, from the car package) for homogeneity of variance.
Shapiro‑Wilk (shapiro.test) or Shapiro‑Francia (sf.test, from the nortest package) for normality of residuals.
Results: p > 0.05 for normality (fail to reject H₀ → normality holds); p > 0.05 for Levene’s (variances appear homogeneous), but the rule‑of‑thumb ratio (max variance / min variance) was > 5, flagging potential heterogeneity.
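
The checks in this section can be sketched as below. The data are invented (treatment D is deliberately more variable), and the Levene step runs only if the car package is installed; treat this as an illustration under those assumptions, not the workshop script.

```r
# Invented data: treatment D is given a larger spread on purpose.
bond_time <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(20, 22, 24, 28, 30, 32, 25, 27, 29, 38, 44, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond_time)
res <- resid(model1)

# Four diagnostic plots on one page: 2 rows x 2 columns.
op <- par(mfrow = c(2, 2))
plot(fitted(model1), res, xlab = "Fitted", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
qqnorm(res); qqline(res)
boxplot(split(res, bond_time$treatment), main = "Residuals by treatment")
plot(seq_along(res), res, xlab = "Observation", ylab = "Residual",
     main = "Residuals by observation")
par(op)

# Normality of residuals (Shapiro-Wilk, base R).
shapiro.test(res)

# Rule-of-thumb variance ratio: largest / smallest treatment variance.
vars <- tapply(bond_time$bond_time, bond_time$treatment, var)
var_ratio <- max(vars) / min(vars)
var_ratio

# Levene's test, only if the car package is available.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(bond_time ~ treatment, data = bond_time))
}
```

With only three replicates per treatment, formal tests have little power, which is one reason a test can return p > 0.05 while the variance ratio still looks large.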

Multiple Comparison Procedures

LSD (Least Significant Difference): LSD.test(model1, "treatment", p.adj = "none")
Assigned a different letter to each of the four treatments, indicating all pairwise differences were significant in the unadjusted test.
Bonferroni adjustment: LSD.test(model1, "treatment", p.adj = "bonferroni")
Made the test more stringent; some previously significant pairs became non‑significant, illustrating the importance of controlling Type I error when many comparisons are made.
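
A hedged sketch of both procedures on invented data follows. The agricolae calls run only if that package is installed; the base R pairwise.t.test() calls are added here as a cross-check and were not part of the session.

```r
# Invented data (not the workshop data).
bond_time <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(20, 22, 24, 28, 30, 32, 25, 27, 29, 38, 44, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond_time)

# agricolae: letter groupings, unadjusted and Bonferroni-adjusted.
if (requireNamespace("agricolae", quietly = TRUE)) {
  lsd_none <- agricolae::LSD.test(model1, "treatment", p.adj = "none")
  lsd_bonf <- agricolae::LSD.test(model1, "treatment", p.adj = "bonferroni")
  print(lsd_none$groups)
  print(lsd_bonf$groups)
}

# Base R cross-check: pairwise t-tests with a pooled standard deviation.
p_none <- pairwise.t.test(bond_time$bond_time, bond_time$treatment,
                          p.adjust.method = "none")$p.value
p_bonf <- pairwise.t.test(bond_time$bond_time, bond_time$treatment,
                          p.adjust.method = "bonferroni")$p.value
p_none
p_bonf
```

Treatments that share a letter in the groups table are not declared different. A Bonferroni-adjusted p-value is never smaller than the unadjusted one, which is why some pairs can lose significance after adjustment.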

Practical Examples

Dog‑Bone Crushing Experiment – examined how different dog sizes and bone types affect crushing time. Two models were fitted:
Model 3: only bone type as predictor (non‑significant).
Model 4: bone type + dog type (both predictors significant, residuals became random).
Demonstrated how omitting a relevant factor can inflate residual variance and mask true effects.
Irrigation Treatments on Yield – a second data set with five irrigation levels (including “no irrigation”).
Boxplots highlighted that method 2 gave the highest yield but also the greatest variability.
ANOVA indicated significant differences; post‑hoc tests (LSD, Bonferroni) clarified which methods differed.
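
The effect of leaving out a relevant factor can be illustrated with simulated data. Everything below is invented (factor levels, effect sizes, and noise values are assumptions), and the model names simply echo the ones used in the session.

```r
# Invented design: 3 bone types crossed with 4 dog sizes, one run each.
crush <- expand.grid(
  bone = factor(c("beef", "pork", "lamb")),
  dog  = factor(c("small", "medium", "large", "giant"))
)

# Assumed effects: dog size matters a lot, bone type a little.
dog_effect  <- c(small = 40, medium = 30, large = 20, giant = 10)
bone_effect <- c(beef = 6, pork = 3, lamb = 0)
noise <- c(0.5, -0.4, 0.1, -0.3, 0.6, -0.2, 0.2, -0.5, 0.4, -0.1, 0.3, -0.6)

crush$time <- dog_effect[as.character(crush$dog)] +
  bone_effect[as.character(crush$bone)] + noise

# Model 3: bone type only. Model 4: bone type plus dog type.
model3 <- lm(time ~ bone, data = crush)
model4 <- lm(time ~ bone + dog, data = crush)

anova(model3)
anova(model4)

# Residual standard error with and without the dog factor.
sigma(model3)
sigma(model4)
```

In Model 3 the dog-size variation is left in the residual, so the bone effect is tested against an inflated error term. Adding dog type removes that variation and the same bone effect becomes detectable.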

Common Troubleshooting Tips

Always run the library calls before any analysis; missing packages generate red error messages.
Use the provided script files instead of typing commands manually to avoid syntax errors.
When sharing screens, ensure the correct RStudio window is selected and that the microphone is muted when not speaking.
If a script fails, check that the data objects have been created (e.g., bond_time must exist before fitting the model).
For persistent errors, copy the exact error message, Google it, and apply the suggested fix.
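
Two of these checks can be scripted. This is a small sketch; the package name notARealPackage is a deliberately nonexistent, hypothetical name used only to trigger an error.

```r
# Check that the data object exists before fitting a model.
has_data <- exists("bond_time")
if (!has_data) {
  message("bond_time not found: rerun the data-creation lines first")
}

# Capture the exact text of an error so it can be copied into a search.
msg <- tryCatch(
  {
    library(notARealPackage)  # hypothetical package that is not installed
    "loaded"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```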

Closing Remarks

Participants were encouraged to review the recorded sessions on YouTube and revisit the scripts.
The next day will cover two‑way ANOVA, split‑plot designs, regression, and correlation analysis.
A reminder that R is a general‑purpose statistical language, not limited to agriculture.
The facilitators thanked everyone for their patience and participation.

Effective experimental analysis hinges on disciplined data organization, proper setup of the R environment, rigorous checking of ANOVA assumptions, and the judicious use of post‑hoc tests (LSD, Bonferroni) to draw reliable conclusions about treatment effects.



Full Transcript YouTube

[Music]
good afternoon colleagues good afternoon
participants
you are most welcome to day two of the
advanced statistics and experimental
designs
is it day two today or three today's day
three
so you're most welcome to day two of the
experimental designs and um
thank you for joining in on time
thank you for joining in on time and
always being with us here making our
trainings not to be in vain we also
thank the facilitators for always being
here with us so my name is Salman
and I'm sure most of you have already
seen me in the previous meetings I work
at the Forum Secretariat as a technical
specialist research and development
today we are continuing with our
training on day two of the experimental
design and we really request you to make
sure that you continue participating as
you've been
also please if you
if you have been called to share the
screen or to ask your questions please
always check your microphone
so that you start sharing the screen
right away so that we do not waste a lot
of time in uh in trying to share this
screen and all these in questions
please if you are if you finished with
what you are sharing or with your
questions please make sure that you mute
your microphone
do not cause a lot of distraction during
the meeting
and most importantly what has been
always been asked by the facilitators we
we tend to
well we tend to not listen to what the
instructions that are being given to the
facilitators and you find that uh most
questions are being raised over and over
when they've already been explained
please do listen attentively and make
sure that you follow up is at the
facilitators are taking us through so
that we do not waste a lot of time and
we make sure that we deliver what is
supposed to be delivered at the end of
the course
so once again you are most welcome let
me hand over to the facilitators
Professor Balava thank you very much for
being with us again the floor is yours
thank you
good afternoon ladies and gentlemen
thank you very much Salman and we also
extend our thanks to RUFORUM to the
World Bank to
[Music]
The Institute of
food and nutrition
we are grateful for the funding and
support and so grateful opportunity to
invite my colleague Dr Thomas but at the
same time yesterday I was seeing some
comments that some people would like a
physical training if you're interested
in a physical training you can always
contact us and we make a program we can
travel and come for a physical contact
thank you very much and I wish you
Pleasant and enjoyable training pay
attention practice practice to be
perfect I invite my colleague Dr Thomas
to go ahead and present thank you very
much
yes
good afternoon everyone
we are going to pick up from where we
left yesterday but before that we are
going to do the usual ritual when
you're using R I am still going to
repeat those steps
let me share my screen
okay now I think I can remove my video
you really see me
okay so so we like I I I started with
yesterday
uh there are a few things that you need
to do when you're trying to do your data
analysis using r or any package uh the
first thing is uh you need to create a
folder for each project for each
analysis that you're going to do
so you create a folder and in that
folder you put everything your data
sets your description of the data set
you can have a small readme file uh you
can have your other scripts uh so that's
the first step and then the second step
is ensure that when you start running
R you make the folder
uh your working directory uh that is it
makes life a lot easier if you don't
make the folder your working directory
then every time that you want to call a
data set you need to be able to specify
the path to where the data set is
stored if you're good at programming
that may not be an issue but if you are
a learner it's better to do things that
will make you move faster
okay so that is the is the first one uh
the second one is that you need
to
know
the packages the different packages and
functions that you require to do a
particular analysis
um that is not a big deal because you
can easily go to Google but with the
time you will also get to know exactly
how to go around so for those who are
doing agriculture experimentation uh one
very very popular uh
package is what is called the agricolae
package
that was uh the Master's uh work for
some uh some gentleman after he did
this then he decided to compile it into
an R package and it's one of the
most popular R packages that we have
around so you make sure that you know
the kind of packages that you want and
then you also need to know whether those
packages are installed already or not if
they are not installed then you need to
install them and we already know how to
do the installation you go to
install.packages you type the name of the
package and then you go ahead and
proceed if
the package turns out not to be
available for that version of R then you go
the long way
but
it's always easy for you to just copy
the error that you see in R and
ensure
that you you you copy the error put it
in Google you will be able to see some
solution that has been suggested by by
people
okay so we already installed the
packages yesterday so now we're no
longer going to install the packages so
always good to make
them become like comments so any
installation now we are not going to do
okay so after the packages are installed the
next thing is then every time that I do
the analysis I need to call the packages
so that they come and they're ready for
use so we are going to start uh looking
at let's say we want like we don't want
the library devtools we used it
yesterday just for installation of that
so we can start from line number six we
need agricolae
uh the other one
we need doBy
we need lattice
we need effects
so please always see what what is what
is on the screen
they are red but these reds are
not the bad ones they are kind of just
telling you that something is happening
okay
so if we have all our
our package our our packages ready the
next thing that we need to do is we need
to check whether our
we need to check whether we have set the
working directory
so let's go to so for setting a working
directory we need to follow the same
procedure we go to session
uh after session
you go to set working directory
after set working directory you go to
choose a directory so just be what I
have out there we have session
we have set working directory and we
have choose a directory
let's see
okay
so so you're going they're going to
share with you uh another folder
the same link so you need to check under
in the same link you need to have a
download week to day one day two
materials yeah the folder will be there
so the link is going to be shared please
don't forget to download the link is
already shared but we'll keep on sharing
okay so to set the working directory we
go to session
set working directory choose a directory
okay
so mine is on the desktop
so I go to desktop
I look for
please
don't click and don't double click on
this to open because that that would
mean you are looking for something
inside if there is no any other folder
inside you just highlight it and then
you use the open
okay
so from here
we have set our working directory and
everything is ready so for me you can
see here let me try to zoom in
this is the command that you set working
directory and you can see from there
that I've successfully set my working
directory
so now we are not going to run the
t-test and we are going to go and run
our analysis of variance straight away
so you check the line where we are going
to
so let's go to my line is is 97
96 97
so here
so here I have a simple program I say
running
analysis ANOVA using the lm function
so like we stated before
R works on objects we work on an object
in R so we create an object called
model one
and so the in this object model one we
are going to store the results of the
analysis here so here you have LM which
is the linear model
then
yes I'll learn a bit
okay so here
oh okay I want us to run this of course
we are going to see some error also
uh remember the the data set that we
created yesterday was the bond data set
and this data set was created uh as an
object in the memory of R so if we
didn't save that that means it is not is
is is not there for now so we first need
to go ahead and and create the data set
so let's go up
sorry let's go up
okay so let's start running from
we run from line 43
up to where we finish the the data
analysis so
line 43 we create an object called a
uh 44
we run the summary
we got the variance got the standard
deviation
then we create B
we got the summary
the variance
the standard deviation you don't need to
run the other one you can only run the
the one that creates the the that
creates for us
uh the vector
we run 53 54 55 56 57
58 59 60
we run 62
we run 64 65 66
67
68 69
71
72 73 just run this line until
until line 82.
so you should be able to end up with
this so this is our data set that we are
ready to analyze so as you see from here
we have rep
rep one two three for AAA rep one two
three for BBB rep one two three for CCC
and so on and so forth
okay so we can now skip all this line
with the box plot because we saw it
yesterday so let's go now and run our
line 97 so in line 97
we are going to do an analysis we are
using the function lm which is a linear
model
and
this is our variable that we want to
analyze
we want to compare different treatments
and then the data set that we're going
to use here is what we call Bond time
Bond time is the last data set that we
created
okay
so let's run
okay so have we all succeeded in in
running
up to line 97 can I see in the chat
there are still a few nodes I don't know
why
can you show us your screen and we see
why you're not succeeding
can you promote Agnes 1z
1 I want somebody who is who has not go
who has not got the thing to to show us
we have promoted Agnes Wanjiku
please Agnes
please go ahead and second has been
promoted yes yes please
again please go ahead hello doctor hello
Thomas you might have to stop sharing
your screen then your love has a chance
to share us
I think I know where the problem is so
we're waiting
waiting hello dog
we have a person who has failed to and
so that can trust the screen Thomas
Agnes is talking
please go ahead go ahead Agnes okay I'm
saying yes I don't know whether it's uh
whether I'm I'm sure I'm sorry anybody
raise your hand if you you fail and you
want to show us your screen so that they
can promote you
talking
again
okay I'm saying
um I don't know I don't know whether I'm
throwing typing it's it's typing um
that is that is the issue is not it's
not loading the or updating them they
they are okay so I can assume that
everything is okay I will proceed
the screen is blank okay I think it
Thomas has a problem I can stop sharing
my screen and then I can share the
screen
again as you share your screen
okay yes please share the screen okay
you share the screen go for sharing the
screen
okay okay yeah click on sharing the
screen and then we should be able sorry
my my my my volume is muted so I can't
hear you talking
okay can you hear me now yes can you see
my screen yes
okay I'm saying my issue is my issue
isn't um
uh loading or something it's typing
because I'm a bit left behind
uh like no here I typed for a and I no
no you don't need to type you can run
the command as we have already given you
um the reason why we we we gave you
those because it will save time next
time if you want to create your own data
set you're just going to modify
the the the script that we've given you
don't start typing your own command now
because it's going to delay us
okay yes we have the script already I
want you to run the script not to type
it
oh so where is the script
hey but the one you're trying to I mean
it was shared yesterday
okay but you have you have it in there
you have it in day five material
the last
yeah but you have the scripts here so
why those are the script that why are
you trying to modify them
no I'm not modifying it I'm not
modifying them all these all these have
been I've been typing them
as we go along but now when we came to
creating the the the the bond data
I'm left behind from there no that's why
I'm saying you don't need to type them
you have the script already with you you
need to run the script that we've given
you
okay yeah so don't create don't try to
create your own your own link now it's
not going to to
you're going to be remained behind Okay
yeah so later you can modify the the
script that we have given you to do your
own analysis what you simply need to do
you need to be able to change the names
of the files and the names of the
variables and everything will be fine
okay okay
great so you can stop the sharing now
so you can okay
so you can go back and open the a new a
new script and then continue from there
okay anybody else with the problem that
needs to share
uh we have a number of ones raised maybe
I don't know whether Thomas you can
select uh there is Brian Brian you just
select one of them okay yeah
so let me promote Brian
and a safer
we have
we have a safer
okay just get one of them to to do maybe
one or two you promote one
have you promoted any
yes yes
please go ahead can you share your
screen and unmute yourself
we are waiting for you you might be
talking but you are muted
yeah you're muted so you you click on
the the the unmute
okay how about now yes I can hear you
yeah
so can you share they go to sharing the
screen screen so that we can see what
you
hello yes can you share your screen
you if you go below there is a something
in green say share screen click on that
okay
you you double click after you have
clicked on the share screen you need to
select the window which select the
window you want to share is it coming
all right
no you except for yourself yeah
no problem there is some problem with my
application then let I go back
no you you first stop sharing and then
you're sharing a window first or you
select the window which is showing they
are open they are
okay
come back
just stop sharing start sharing again
stop stop sharing and then you open you
you select the window which has r
right first stop oh I can stop it for
you
let me stop it for you okay uh-huh hey
no you have not stopped
I will stop to share your share your
screen click again yeah
Click Share screen
and then you select the window which is
displaying the r
they are they are window
then you double click
Click Share screen
ing
my application
have you clicked here screen
yeah
yes so select the window which has r
yeah I mean
have you done that
yes I've done but there's some problem
happened with my other programming
um
so what just happened to your other
program
maybe you can be guided through
without sharing the screen so just
explain the problem can you tell me the
problem
please explain the problem to the
facilitators
I think and safer has a challenge with
the network okay so maybe you can you
can promote somebody else and then I can
I proceed after that
okay
so before we get someone can I ask
yes
before we promote that someone so um
uh you're saying you're saying uh the
script was shared yesterday
yes the script is in the in the in the
folder that I saw you have the fall that
I mean you have the script
so uh how do I rode it now to have the
to have the detail
you double click on double click on that
script there the one that is written
t-test analysis double click on it then
it will open immediately
if you go in that folder for day five do
you see a a file that stopped with that
ends with DOT r
uh
there's a file
analysis of
I have it can you right click on
it and say open with RStudio
okay
okay nice I see it now okay so you can
follow the scripts that you can run the
Line Design that I indicated and then
you should be able to we should be good
to go
nice thank you so much okay okay we have
a mandio
please go ahead and share your screen
mandio
yes yes yes
I have some I have some that introduce
yourself and where are you from
uh I am from Mozambique
uh I got the the Green in iteration
uh
uh is very nice to participate in this
program
I hope to learn to learn more
about it the statistic
okay do you have any challenge
what are you having any challenge with
our such that you can share your screen
do you have any challenge or there is no
problem
um in this moment I I can do that so
um I'm preparing to to make to make it
[Music]
hello
yes please yes sir
my name is
from Zambia
yes please can you get me yes yes can
you get me yes we are getting you
yes
um can you see on my screen yes yes
please
I've shared my screen okay yes we are
seeing
so so the the first thing
before you proceed
uh you see on top there you have there
is a warning sign
which one yes I've seen it package okay
yes
required but are not installed can you
click install
click on what is written
install then okay click on that
okay
I've installed
wait it is still installing
are they all installed now
okay until they all finished
so please always try to read read
through read through your
RStudio page to see what is going on
okay
I'm sure your problem is is related to
the running the libraries and they are
telling you that they are not found
okay can you go ahead and start
explaining what your challenges is
as we wait for you for the download to
complete
um
I think it I was trying to to run one of
them I think maybe it could have been
the problem with the error which was
appearing there
if you you try running number four I
tried to run yeah I was trying to run
number four but I was unable to do that
and you got an error yes
yeah because you know you need first
install and then after installing you
can then run the rest
okay yeah you can't ignore the the
number line five if you don't have that
you can ignore a line five for now
okay yeah then you can run the rest
it's going to take a a while as long as
you still see the the red
I you see now it needs to run until all
of them are complete so it's going to
take a while don't you just stop let
continue running don't don't stop it
okay yeah
okay so I think we can we can proceed
so I can stop sharing yeah you can stop
sharing
okay yes let's proceed
okay
okay so so at this point I assume that
we all have the data
and then we we can go ahead and and do
the interpretation so the first thing
here is uh we have created an object
called Model one
and the object model one has several
things in one thing that we can extract
from the model one is the analysis of
variance table and line 98
will extract for us the analysis of
variance table so let's run line number
98
okay
so do we all see the analysis of
variance table
can I see those who are already seeing
the ANOVA I want to see those ones who
have the analysis of variance table
okay I think we have
a good number
good
okay we can stop there
okay so so let's look at the content of
the analysis of variance table so the
first thing is on top here it tells you
analysis of variance and then it tells you
the response which is bond time and then
it tells you that this treatment
has three degrees of freedom
uh these are the sums of squares the
mean sums of squares it give us the F
value and then it gives us the the
probability now uh we there are times we
say
uh the first the most important figure
in the analysis of variance table is this
probability now this probability we are
going to compare with a level of
significance which is uh uh which is
zero point let's say 0.05 so you can see
from here that uh our figure
is
1.084 times 10 to the power negative
five so that means you have uh
four zeros before a one so you have zero
point zero zero zero zero one and then
you have zero eight four so this
probability here is much smaller than uh
our sorry this p-value is much
smaller than our alpha so in this case
we say there is a significant difference
between
the treatments at least one pair of
treatments that we are trying to
compare
okay so that is
the first part
so we've generated the analysis of
variance table you know how to interpret
the analysis of variance table we can also
generate a summary of the model what we
got in our model
now this is a bit expanded and and maybe
a bit challenging
uh to to for for a number of people so
here the first thing is it gives us the
the the the the model that we have run
then it gives us
a summary of the residuals
we're saying the minimum residual is
negative four the maximum residual is
positive six
and then it gives us
uh the estimates
so here we have the intercept
here we have treatment B
treatment C
and treatment d
I I don't know which seems to be lacking
treatment a where is treatment a
can anybody tell us where treatment a is
just write in the chat
somebody did not get the certificate for
proposal writing
thank you
I also attended I didn't get
so we are making comparison with the
intercept so the intercept first of all
is going to compare whether the value
for a is zero or not so you also need to
look at the the T probability here so
you can see from here that uh this is
significant at 0.0
uh zero zero
at I mean it is almost so so you can see
that uh the value for a is not is
significantly different from zero and
then we'll be you're comparing a and and
B then you're comparing C and B then
you're comparing d and a sorry and a not
a b
okay so you can see from here but I'll
give you I'll give you time to go and
you will go and read more
and then down here you have the residual
standard error it gives you standard of
3.1
two two
eight degrees of freedom it gives you
the R square we're going to talk about R
square when we come to regression
analysis
and then it gives you adjusted R square
give it f statistics so this information
here this information here is the
information that was that is that was in
the analysis of variance table
so you can see here
the analysis of variance table the f is
here
okay
so so that is the easier I mean this
easier part now what we want to do is
that uh uh we we need to check for the
analysis the assumptions of analysis of
variance I'm going to talk more about
this uh when we're going to do further
analysis but I
I'm going to make a presentation uh on
experimental design and then on the
analysis in general so I'm going to show
for you uh we're going to look at the
different assumptions of analysis of
variant so what we are going to do here
is we are trying to
uh to check for assumptions of analysis
of variance
uh upgraded uh from line one zero two
I'm just creating data point one two
three four up to 12 because the 12 data
points here so I'm going to create an
object
and that object now I'm going to combine
it with my data set the bond data set so
that I can attach let me run this then
I'll show you the result first before we
can I can we can discuss that
let me do this
okay so what I wanted to do this is the
data set that we had before
I just wanted to attach a point number
1 2 3 up to 12 to each of these data
point because what we want to do now is
we want to see uh how the residual
behaves
we want to see how the residual from
each point are behaving so let's create we've
created a new data set it's almost like
the the all that created a new data set
okay
one way of of looking at the assumptions
of analysis of variance
one way of making Assumption of looking
at assumptions of analysis of variant uh
the assumptions you can use plots mainly
plots so we are going to draw several
plots and then from the several plots we
are going to see whether the assumptions
of analysis of variants are violated or
not so the first thing is
we want to have four different
observations four different graphs we
want to plot them together that's why we
use this
this little command here so this one
will give us two rows and two columns
okay so let's run line 104
okay
so now we should be able to
I need to to create more space so that I
can have the plot
so let me first plot them and then I can
zoom them and we can have a look
so you can see here
okay just hold on a bit
I need to change the
just hold on a bit I need to change my
what I'm sharing
okay so do you see my plots
you see the three plots
okay good okay so what we have done here
is we are looking at uh each data point
each uh here is is the plot of our
residual remember from our summary the
minimum residual was negative four the
maximum residual was positive six so and
this is the one that gave us the highest
residual
and then this is the one that gave us
the lowest residue
and then you can see from here
that
there is
a deviation from the straight line
and what you see here that there seems
to be a lot more variation
for the bigger values like at 45
and there is a bit uh the the smaller
variation yeah so we we seems to seeing
that there is some variation some uh
some violations of analysis of variance
analysis of variance one of the
assumptions of analysis of variance
indicates that
one of the assumptions of analysis of
variance is that the error term should be
constant the variance in the error term
should be constant As you move from one
treatment to the other so we can see
from here uh that there seems to be a
lot more variability uh when you have
higher values uh remember when we look
at the box plot yesterday we found that
D had more variability compared to A B
and c
so so this is actually for B and this is
for a and C and A and C seems to have
similar variability
The second plot here is what we call a QQ plot. A QQ plot is supposed to show us whether the normality assumption of analysis of variance is violated, so here we expect the points to follow a straight line at 45 degrees.
We seem not to be doing badly, but there is this data point that is way off from the rest. So the assumption of normality seems to be okay.
Then here we plot the residual for each data point, point one, two, three, four, five, six and so on, and we seem to see some kind of randomness, except that this one is a lot higher than the rest.
One of the assumptions of analysis of variance is that the error term should be random; it should not take any pattern. Once there is a pattern, you know there is going to be an issue with the interpretation of your results.
Okay, let's proceed. Let me come back and first explain my commands.
The first thing we did was create a simple vector from 1 to 12, and we combined that vector with our original data set, which we call the bond data set. We did not want to change the name, so we are just adding it onto the data that we had before.
So you have your one, two, three, up to twelve, attached to the data set that we created before, and we came up with something that looks like this.
We just wanted to say: this is our data point one, this is our data point two, this is our data point three. You can see the first three data points are from treatment A, the next three are from treatment B, and so on and so forth.
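The step just described, attaching an observation number to the data, can be sketched roughly as follows. This is a minimal illustration and not the workshop script: the names bond and data_point are assumptions, treatment A uses the values read out in the session (27, 28, 30), and the values for B, C and D are made-up placeholders.

```r
# Hypothetical stand-in for the workshop data: A = 27, 28, 30 as quoted in
# the session; the B, C and D values are invented for illustration only.
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)

# Simple 1-to-12 vector, one number per observation
data_point <- 1:12

# Attach it to the existing data frame, keeping the data frame's name
bond <- cbind(bond, data_point)

print(bond)
```

Rows 1 to 3 then belong to treatment A, rows 4 to 6 to treatment B, and so on, so a residual can later be traced back to a specific observation.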
We want to check what is happening to the residual at data points one, two, three and so on, so that we can see which of the data points is behaving in a strange way. That will help us check a few other things.
The next thing is that we wanted four different graphs. Let me first draw the last one.
You can see we have now drawn four different graphs on the same page; that is what this function did.
Once we have created the four blank spaces for our graphs, for the first plot we go to model one and we say plot, with which equal to one. There are several plots that you can get from this model: one is the plot of the fitted values versus the residuals, and two is the theoretical quantile plot, the QQ plot.
With this one here, we go to model one and pick the residuals. Remember, each of the 12 points has a residual, and we want to show the value of the residual for each data point.
Let me enlarge this again so that I can explain.
So you can see what we are trying to do here. How are the residuals for treatment A behaving? There are three observations in treatment A. Then how are the residuals in treatment B behaving, how are the residuals in treatment C behaving, and how are the residuals in treatment D behaving?
If you look at this point here, it has a higher variability; this is related to the variability in D. These two, A and C, are the ones here, and B is the one that is there.
Remember, B is the one with the lowest bond time, the one with the lowest observations.
Okay, let me see if there are some questions.
Let me try to address some of the questions that were raised.
'Re-explain the interpretation of analysis of variance.' I am going to do that.
'When do we use the analysis of variance table?' When you are writing your thesis there are different requirements, and people use the analysis of variance table differently depending on the field.
In most cases there may be no need to show the analysis of variance table in your text; you may put it in the appendix. What you are interested in is the p-value, and from the p-value you can tell whether there is a significant difference or not.
But people who do plant breeding are more interested in the sums of squares. When you are doing what is called genotype by environment interaction, you want to see how the variability in the data is distributed among the genotype, the environment and the interaction, compared to the error term.
So the field you are in will influence what you present in your paper or your thesis. In most cases one or two sentences are enough to explain what is in the analysis of variance table, and you can attach the table itself in your appendix.
'Can you show how to draw climate?' No, we are still doing analysis of variance.
'Can you please explain the residual versus fitted plot?' Yes. For each data point we have the measurement that we got in the field. Remember, for A we had 27, 28 and 30, and then the values for B, C and D that we created.
When you fit the analysis of variance model, the model makes a prediction for each value that you have, so for A there is going to be a prediction for A. The difference between the actual value and the value that the model predicts is your residual.
The residual, like the name suggests, is what remains after you have taken away the part explained by the model. The residual is supposed to be random noise, and that is why we check the assumptions of analysis of variance using the residuals.
The residuals are not supposed to show any pattern. Once the residuals show a pattern, that means there is still some information remaining in them, your model is not complete, and you need to fit another model.
As for the post hoc tests, we are going to come to that.
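The definition given here, residual equals actual value minus predicted value, can be checked by hand. The sketch below is only an illustration: the data frame bond is a hypothetical stand-in (A = 27, 28, 30 comes from the session, the other values are invented), and it relies on the fact that a one-way model predicts each observation by its treatment mean.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond)

# In a one-way model the prediction for each observation is its treatment mean
fitted_by_hand <- ave(bond$bond_time, bond$treatment)

# Residual = observed value minus the value the model predicts
resid_by_hand <- bond$bond_time - fitted_by_hand

# Both agree with R's own extractor functions
print(all.equal(fitted_by_hand, unname(fitted(model1))))
print(all.equal(resid_by_hand, unname(residuals(model1))))

# Smallest and largest residual, as reported at the top of summary(model1)
print(range(resid_by_hand))
```

The residuals always sum to zero within each treatment, so it is their spread and pattern, not their average, that carries the diagnostic information.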
Okay. The plot of fitted values versus residuals is supposed to show us that there is no particular pattern. But you can see here that we seem to have higher variability for bigger values and lower variability for smaller values, which indicates that this is not a random pattern; there must be some issue with this.
Sorry, no, the data point here is just a label: I am just saying this is data point number one, this is data point number two, number three, number four and so on. It has nothing to do with the fitted value.
Why is this important? Sometimes when you are doing field experimentation you can take the residuals and plot them onto the field map. When you do that, you can see if there is any strange pattern in the field itself, because there could be a situation where the residuals in one block are higher than the residuals in the other blocks. That would mean that particular block has a problem as far as the model estimating the values is concerned.
Remember, an ideal model would have zero residuals, meaning the fitted value and the actual value are the same. So we expect the residuals to behave randomly.
Okay, somebody is saying they are getting confused. Afarima, can you tell us why you are getting confused? Can you promote Afarima to tell us where the confusion is?
What should I repeat?
It is very important to find the residuals, because you are going to use the residuals to check whether the assumptions are met or not.
Okay, I am going to repeat some of these things. Let me go back and start from the ANOVA itself. Just hold on a bit.
Where do you want me to start from, the model itself? Okay, let's start from the ANOVA and then we come back.
Okay, so here we have the analysis of variance table. The most important part of the analysis of variance table is the p-value.
The p-value is used to test the null hypothesis, which states that there is no significant difference between the means that we are trying to compare. We compare this p-value with our alpha, which is the level of significance.
What R does for you is make that comparison and give you these stars, and the codes are listed here.
Three stars means the p-value is below 0.001, so you are 99.9 percent confident.
Two stars means it is significant at 0.01, that is 99 percent confident.
One star means it is significant at 0.05, that is 95 percent confident.
A dot means it is significant at 10 percent, and if it doesn't give you anything, the p-value is somewhere between 0.1 and 1. That is the interpretation of these stars.
The F value here is just the mean sum of squares for treatment divided by the mean sum of squares for error. We don't need it for interpretation if we are going to look at the p-value; the p-value is enough.
So that is the analysis of variance table. I then went ahead and also got a summary. Let me run the summary.
The summary tells us the model that we ran, and then it gives us a summary of the residuals: the minimum residual is negative four and the maximum residual is positive six.
Then it gives you the different coefficients. We agreed that the intercept is the coefficient for A, because A is not appearing here, so every other comparison is going to be versus A.
For A the p-value is 2.6 times 10 to the power negative seven, which means it is significant at the 0.001 level; you can see the stars here.
For B, you are comparing B with A. The estimate is negative because B has a smaller value compared to A.
With C we are comparing C versus A. The estimate is also negative, which means that C also has a smaller value compared to A.
With D we are comparing D versus A. The estimate is positive, which means D has a larger value than A. If you look at our graph you see the same thing.
The codes for the stars are given here, with cut-offs at 0, 0.001, 0.01, 0.05, 0.1 and 1.
Remember, the residual is the difference between the actual value and the value that the model predicts.
Down here you have the residual standard error, the degrees of freedom, the R-squared and the adjusted R-squared, and the F statistic with its p-value. These last values are the ones in the analysis of variance table.
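The reading of the coefficients, intercept equals the mean of A and every other estimate is a difference from A, can be verified directly. This is a minimal sketch on hypothetical stand-in data (A = 27, 28, 30 from the session, the other values invented), assuming R's default treatment contrasts with A as the reference level.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond)

# Mean bond time per treatment
group_means <- tapply(bond$bond_time, bond$treatment, mean)

# Coefficients as shown in summary(model1)
print(coef(model1))

# Intercept = mean of the reference level A
print(group_means["A"])

# treatmentB, treatmentC, treatmentD = each mean minus the mean of A
print(group_means[c("B", "C", "D")] - group_means["A"])
```

A negative estimate therefore only says that the treatment mean is below the mean of A; whether that difference is significant is what the p-value on the same row reports.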
Okay. The next thing we did was explore the residuals; we wanted to see how each of the residuals behaves.
Can you promote the participant to share the screen? He says he is getting an error and not getting the same results.
Maybe they can raise their hands so that we are able to see them.
I am going to explain that; I am getting there.
We have promoted him. Yes, can you talk? Show us your screen. Please unmute yourself and share the screen.
Okay, greetings.
Yes, introduce yourself, tell us where you are from, and share your screen.
I am from Ethiopia.
Can you share your screen so that we can see?
Yes. Where should I start? I don't know; yesterday it was working properly.
Okay, can you clear the console, then start running your script line by line and we will see where the problem is.
No, no, we are not running that today. I think we are here. Let's go up. Continue. You want to run that?
Yes, I want to continue from this, and I am getting this type of problem.
But you need to create the data first before you can analyze it, so you need to go up. Continue up. Start from where a is created. Go down, until where it says a is equal to. This one, yes; start from there.
So I have to run everything?
Run everything until you have created the data set; then you can analyze that data set. Then you can run the rest and you will have no problem.
Thank you very much, I got it now.
So you were trying to analyze a data set which doesn't exist.
Yes, that was the problem.
Rose Oche said that the results are different; can you promote Rose?
The script that we have given you, you can modify it and use it with your own data. Can you stop sharing now?
Once you have a good script, you don't even need to bother about saving what you have done; the script is enough.
Okay, thank you very much.
Are you able to see my screen? Let me enlarge it.
Hello Doc, are you able to see my screen?
Yes, so where is the problem?
I have been trying to run the script, but whatever is coming out is different. I am from Kenya.
Can you first clear the console and start again? Just press the brush; there is a brush icon on top of the console. You need to pull the pane up to create a little bit of space so that we can see what is in the console.
Okay, now tell us where the problem is: start running the line which has a problem and we will see.
I have been running the lines from library, and it was not showing anything compared to what you are having, Doc.
Run line number four and we will see. No, you don't need to highlight; just put the cursor there and run. Okay, continue running. You have skipped number five.
When you highlight only the word library, that means you are only running that part.
And I am seeing that some places have red marks.
That's okay; first go down in the console and we will see what is red there. You have covered your console; can you adjust it so that the console becomes a little bigger? Click on the big box. You can click on the one with the two boxes, which shows both panes.
Okay, now go and start running from where we are running today. Go to where there is a equals. Yes, line 41; we start from line 41.
There's some error.
No, run everything from there; continue running.
I am trying to tell her, but she cannot pull the pane.
So you can see everything seems to be okay now.
I'm very happy, thank you.
We are also happy that you're happy. Can you then stop sharing the screen? Let me share my screen.
Okay, line 104; I think that one is clear: we want to create multiple plots.
Then somebody is asking about which equals one. There are several different types of plots available for model one. With which, you are saying: out of the plots that are available, plot number one for us, and then in the second command, plot number two.
One is the fitted values versus the residuals, and two is the QQ plot. You can also have three or four: you can replace it with which equal to three, which equal to four or which equal to five and see. If you ask for one that does not exist, R will tell you that you don't have it.
You can run these lines and have your plots. Let me now try to explain the residuals, and from there we can proceed.
Plot one is the fitted values versus the residuals. Remember, for each observation there is the value that you measured in the field, and when you put it in a statistical model, the model also predicts a value for it. The difference between the actual value from the field and the value that the model predicts is the residual.
An ideal model would have zero residuals, meaning what the model predicts and what is in the field are the same, but that hardly ever happens. So the residuals become important in telling us what is going on.
You can see that point number 12, for example, has a much bigger residual. Zero is the ideal: the points that are close to zero are the ones that are predicted very accurately. This one has a much higher residual compared to the rest, so this particular model does not seem to predict point number 12 well.
The second plot, the QQ plot, shows whether there is a normal distribution. Remember, the residuals are supposed to be normally distributed with a mean of zero and a constant variance. This tells us that we are actually okay; it doesn't seem to be very bad.
With the plot of residuals versus data point, I also wanted to show you which point is giving us the highest residual. Point number 12 is the one that gives us the highest residual, so I may want to go and check why the model is not predicting this particular point well. Later you may also want to check whether this point is an influential point.
Here we also decided to draw a box plot of the residuals for each treatment, to see the amount of variability within each treatment and whether the variability is constant. Remember, one of the assumptions of analysis of variance is that the variance should be constant.
You can see that there is a lot more variability in D compared to the other cases, which is what is reflected here. You can also see that they all have both positive and negative residuals.
I hope that makes the explanation a lot clearer now.
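The four diagnostic graphs discussed above could be drawn along these lines. This is a sketch under stated assumptions rather than the workshop script: the data frame bond, the columns data_point and res, and the axis labels are hypothetical, and only the treatment A values (27, 28, 30) come from the session.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)
bond$data_point <- 1:12
model1 <- lm(bond_time ~ treatment, data = bond)
bond$res <- residuals(model1)

# Two rows and two columns of plots on one page; remember the old settings
op <- par(mfrow = c(2, 2))

plot(model1, which = 1)                 # residuals versus fitted values
plot(model1, which = 2)                 # normal QQ plot of the residuals

# Residual for each observation, with a reference line at zero
plot(bond$data_point, bond$res,
     xlab = "Data point", ylab = "Residual")
abline(h = 0, lty = 2)

# Spread of the residuals within each treatment
boxplot(res ~ treatment, data = bond, ylab = "Residual")

par(op)                                 # back to one plot per page
```

In the placeholder data the largest residual also belongs to observation 12, in treatment D, so the third and fourth panels show the same kind of picture described in the session.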
Okay, now we are going to proceed, and I am going to show you another plot.
First I want you to run line 110. I want to have only one plot, not more than one. There are different ways of doing this, but I always try to find a simpler way for my illustration.
So let's run line 110, and then I am interested in lines 114 to 120; you will see what happens. Let's start from line 114.
You can see that we plotted a histogram of the residuals, and on the histogram we have superimposed a normal distribution curve. This can help us check whether the residuals have a normal distribution or not.
It's not that bad. The problem is that there are some gaps in the data, but we can see that the residuals are fairly normally distributed, so the assumption of normality is not violated in this particular case.
These sessions are being recorded and put on YouTube, so you can always go to YouTube, revisit them and learn more.
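A histogram of the residuals with a normal curve laid over it can be sketched as below. The workshop's own lines 110 to 120 are not shown in the recording, so this is one possible way of doing it, on hypothetical data (A = 27, 28, 30 from the session, the rest invented); resetting to a single plot with par(mfrow = c(1, 1)) is an assumption about what line 110 does.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond)
res <- residuals(model1)

# One plot per page again
par(mfrow = c(1, 1))

# freq = FALSE puts the histogram on the density scale,
# so a density curve can be drawn on the same axes
hist(res, freq = FALSE,
     main = "Residuals with normal curve", xlab = "Residual")

# Normal density with the same mean and standard deviation as the residuals
curve(dnorm(x, mean = mean(res), sd = sd(res)), add = TRUE, lwd = 2)
```

With only 12 observations the bars are bound to look gappy, so the picture is a rough visual check rather than a test.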
So we looked at the residual plots, and the residual plots are giving us some information. We can go further and actually test these assumptions formally.
So we have line 123, and please note that we have an explanation here of what we are testing: here we are testing for homogeneity of variance among treatments using an analysis of variance approach.
R can be notorious for having one function nested inside another. The first thing we do is run our analysis of variance. Once you have run the analysis of variance, you extract the residuals from it, and then you square the residuals that you have extracted.
Then you analyze those squared residuals against the treatment from the same data set. So you are running an analysis of variance, similar to what we started with.
You see, we started with this model here, so we could have just picked it and put it inside: all of this can be replaced with model one, because all of this is what generated model one for us. So I can actually type model one there.
Then we go to model one, pick the residuals, square them, and that becomes the variable that we analyze using the treatment. We are comparing the variability in the residuals for the different treatments. Line 125 should give us the same answer as line 124.
Can somebody tell us in the chat: is there a significant difference in the variance among the treatments? I am not asking you to just say no; if you can explain, try it for me.
Okay, 'no' is correct. Here the p-value is greater than 0.05, so we fail to reject the null hypothesis that the variances in the four groups are equal.
So although the variability in treatments A, B, C and D appears to be different, when you test it statistically there is no significant difference.
Since we fail to reject the null hypothesis, we can proceed and do Levene's test.
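The nested command described above, an analysis of variance on the squared residuals, can be written in both the long and the short form. The sketch below is an illustration on hypothetical data (A = 27, 28, 30 from the session, the other values invented); the object names are assumptions and the workshop's lines 124 and 125 may be worded differently.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)

# Long form: fit the model, extract residuals, square them, analyze by treatment,
# all nested inside one call
nested <- anova(lm(residuals(lm(bond_time ~ treatment, data = bond))^2 ~ treatment,
                   data = bond))

# Short form: reuse the model that was already fitted
model1 <- lm(bond_time ~ treatment, data = bond)
bond$res_sq <- residuals(model1)^2
var_tab <- anova(lm(res_sq ~ treatment, data = bond))

print(var_tab)

# p-value for "the mean squared residual is the same in every treatment"
p_var <- var_tab["treatment", "Pr(>F)"]
print(p_var > 0.05)
```

Both forms give the same table; the short form is easier to read because the inner model is fitted only once and has a name.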
Levene's test will also test for homogeneity of variance for us. So let's run line 128.
Are the variances in the four groups homogeneous? Do we reject the null hypothesis or not?
Yes, there is no difference between the variances. You can see that both Levene's test and the analysis of variance approach tell us that there are no significant differences in the variances of the different groups that we have.
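To see what Levene's test is doing, it can be computed by hand in base R as an analysis of variance on absolute deviations. The sketch below uses deviations from the treatment median, which is also the default centering of leveneTest in the car package; which package the workshop script loads for this test is not stated in the recording, so that link is an assumption. The data are hypothetical (A = 27, 28, 30 from the session, the rest invented).

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)

# Absolute deviation of each observation from its own treatment median
treatment_median <- ave(bond$bond_time, bond$treatment, FUN = median)
bond$abs_dev <- abs(bond$bond_time - treatment_median)

# Levene's test = one-way ANOVA on those absolute deviations
levene_tab <- anova(lm(abs_dev ~ treatment, data = bond))
print(levene_tab)

# Null hypothesis: the variances are equal in all treatments
p_levene <- levene_tab["treatment", "Pr(>F)"]
print(p_levene > 0.05)

# If the car package is installed, this call should report the same F and p:
# car::leveneTest(bond_time ~ treatment, data = bond)
```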
Then there is what we call a rule of thumb. Although the analysis of variance approach and Levene's test don't show significant differences between the variances in the different groups, we can look at the ratio of the maximum variance to the minimum variance, and it should not exceed five.
Let us first run the aggregate line, and then I can explain from there what is going on. What aggregate will do is take bond time and summarize it for each treatment separately, and the summary we want is the variance. So let's run this line.
If R could not find leveneTest, that means you did not load one of the packages at the beginning. Go up and run those library lines so that the packages are loaded; otherwise you will not be able to run it.
The snake sign is called a tilde. In aggregate it says which groups to aggregate by, but we also use it when running the model. It is what you use to state a relationship between this and that. Here you are using treatment to split the bond time data, and you get the variance for each group.
Now I want you to tell me: which one has the maximum variance? Okay, D has the maximum variance. And which has the smallest variance? A and C have the smallest variance.
I want you to divide the maximum variance by the smallest variance. What do you get? You say you get 14.
Fourteen is greater than five, so we cannot ignore the differences in variability among the treatments.
Let me run line 132. Line 132 gives me 12, so I think your calculation is not right. Let me see: 28 divided by 2.33. Your 14 is not correct; you gave me a wrong answer.
This is also a very simple command: the upper part gets the maximum variance, then you get the minimum variance, you divide, and you get your 12.
'Why divide D by A?' We are dividing D by A because the rule of thumb says you check the ratio of the maximum variance to the minimum variance, and it should not be more than five.
In our case the value is more than five, so by the rule of thumb the variances in the groups that we are dealing with are too different to ignore.
You have the script with you, so you can run it, and then we are going to move on.
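The rule-of-thumb check, variance per treatment followed by the ratio of the largest to the smallest, can be sketched as follows. The data are a hypothetical stand-in: treatment A uses the values quoted in the session (27, 28, 30), while B, C and D are invented so that D is the most variable group, as described; the object names are assumptions.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)

# Variance of bond_time within each treatment;
# the tilde reads "bond_time split by treatment"
vars <- aggregate(bond_time ~ treatment, data = bond, FUN = var)
print(vars)

# Ratio of the largest to the smallest variance
ratio <- max(vars$bond_time) / min(vars$bond_time)
print(ratio)

# Rule of thumb from the session: a ratio above 5 is a warning sign
print(ratio > 5)
```

Only the two extreme groups enter the ratio; the groups in between cannot make it larger.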
Let's look at another test: the test for normality. You can use shapiro.test to test for normality.
'What is the likely effect of a variance ratio greater than five?' What we are saying is that when you compare a group that is highly variable with a group that is not very variable, the comparison becomes unreliable, and at some point you may actually get different results.
Remember, the assumption is that the variability in the results from the different treatments is uniform. The quickest way to check whether it is uniform is to compare the highest variance with the smallest variance. The rest in the middle you can ignore, because once the highest and the lowest are different from each other, that is already enough for you to say the variances are not the same.
Sylvester, you have just connected your laptop? Don't worry, continue and follow along. Eric, what is the problem?
So, the test for normality: the test for whether the residuals are normally distributed. Let's run line 135.
When you look at the Shapiro test, do we confirm that there is normality or not? The null hypothesis is that there is normality, so are we rejecting the null hypothesis or not?
We are not rejecting the null hypothesis, because the p-value is greater than 0.05.
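A normality test on the residuals can be run with shapiro.test from base R. The sketch below is illustrative only: the data are hypothetical (A = 27, 28, 30 from the session, the other values invented), so the statistic it prints will not match the one shown in the workshop.

```r
# Hypothetical data: A as quoted in the session; B, C, D are placeholders
bond <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  bond_time = c(27, 28, 30, 12, 14, 17, 20, 21, 23, 40, 42, 50)
)
model1 <- lm(bond_time ~ treatment, data = bond)

# Shapiro-Wilk test; null hypothesis: the residuals are normally distributed
sw <- shapiro.test(residuals(model1))
print(sw)

# TRUE means we fail to reject normality at the 5 percent level
print(sw$p.value > 0.05)
```

The test is applied to the residuals of the model, not to the raw bond times, because the treatment means are expected to differ.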
Okay, I think we are doing great. I am going to run just one more simple thing and then we will have a break.
If you cannot find leveneTest, that means you have not loaded one of the packages up there.
Why can't you run the Shapiro test? There are a number of questions that you can also ask Google, and it gives you the answers.
Of course, then you will not see any results there.
We are going to come back and do a little bit of these tests again. After I talk about experimental design I am going to run another set of analyses, and these tests I am going to run again.
So let's try to run a post hoc test. Let me run the analysis of variance table first, because I want to explain something.
From the analysis of variance, when you look at the p-value, it has three stars. With three stars, we are 99.9 percent confident in our rejection of the null hypothesis.
When we reject the null hypothesis, the conclusion is that at least one pair of the treatments that we are comparing is significantly different. Remember we have A, B, C and D, so you can compare A and B, A and C, and so on.
We say at least one pair of those is significantly different, but the table doesn't tell us which ones. The post hoc test is the one that tells us which of the pairs are significantly different.
We are going to run a two-way analysis of variance later on; I will explain that then.
I think I have already explained why I created the data point variable: I just want to know what is happening at each observation, how each measurement behaves.
The null hypothesis in Levene's test is that the variances within the different treatments are equal. So equal variance is the null hypothesis, and if we fail to reject the null hypothesis, we proceed as if the variances are equal.
'How do we get the training certificate?' That one will be answered by RUFORUM.
'Which test should we do on the data in case the variances are not uniform?' We will talk about that.
You couldn't find the function LSD.test? Did you install and load the package? Let's go up and look at the library lines. Once R tells you that it could not find the function, that means you have not loaded the package.
Try to run the agricolae line, line 11; that package has the LSD test.
Okay, so let's look here.
After running line 138, you can see it tells us that the least significant difference is 5.87,
and that treatments with the same letter are not significantly different.
Here you can see that D has letter a, A has letter b, C has letter c, and B has letter d.
Each of them has a different letter, which means they are significantly different from each other.
So all the pairs here are different.
What we are doing here is comparing the first one with the second one,
the second one with the third one, the third one with the fourth, and so on.
Okay, now you can also adjust using what is called Bonferroni.
What happens here is that, because you are making so many comparisons
(A and B, A and C, A and D, then B and C, B and D, then C and D),
in the process you can commit what we call a type I error.
Even if the F statistic does not show significant differences between the treatments,
if you go ahead and do your LSD you can sometimes find
that some pairs come out significantly different,
and that significant difference comes about because of what we call multiple comparisons.
One of the examples that I am going to run will show how this comes in.
So let's run line 139.
Okay, so you can see now that the minimum significant difference has increased to about 8.87.
Now if you look here, A is different from B,
and A is different from D; they don't share the same letter.
But A and C share the letter b in common, which means they are not significantly different from each other.
C and B also share the letter c, which means they are not significantly different either.
So once we do the Bonferroni correction, you can see the result changes.
The Bonferroni correction is a lot stricter, so some of what we got before as significant
has now stopped being significant.
Remember, in the first case everything was significant, but once we change and run Bonferroni,
you get all these differences.
Okay.
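To see where the two thresholds come from, here is a minimal sketch in base R of the least significant difference and its Bonferroni-adjusted counterpart for a balanced one-way design. The data are invented, and the formulas are the standard equal-replication ones; the session itself used `LSD.test`, so treat this as an illustration under stated assumptions rather than that function's internals.

```r
# Hypothetical data (invented for illustration, not the workshop data)
example_data <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 3)),
  response  = c(10, 11, 12,  20, 21, 22,  30, 31, 32,  40, 41, 42)
)
model_example <- lm(response ~ treatment, data = example_data)

alpha    <- 0.05
df_error <- df.residual(model_example)                  # error degrees of freedom
mse      <- sum(residuals(model_example)^2) / df_error  # residual mean square
r        <- 3                                           # replicates per treatment
n_pairs  <- choose(nlevels(example_data$treatment), 2)  # number of pairwise comparisons

# Standard error of a difference between two treatment means
se_diff <- sqrt(2 * mse / r)

# Unadjusted least significant difference
lsd <- qt(1 - alpha / 2, df_error) * se_diff

# Bonferroni version: alpha is split across all pairwise comparisons
msd_bonferroni <- qt(1 - alpha / (2 * n_pairs), df_error) * se_diff

c(LSD = lsd, MSD_Bonferroni = msd_bonferroni)
```

Because the adjusted threshold uses a more extreme t quantile, it is always larger than the plain LSD, which is why some pairs stop being declared different after the correction.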
Okay, so let's start from line number 138.
In line 138 we are using LSD.test. We are looking at our model 1,
and what are we comparing? We are comparing the different treatments.
console = TRUE means we want to show the result on the console;
otherwise the result will not be shown there. Okay, so let's run this.

Based on this, you can see the letters a, b, c and d,
and it is stated clearly here that treatments with the same letter are not significantly different.
What happens is that we use the LSD to compare the different treatments.
For example, if you compare A and D, the difference here is greater than the LSD, which is 5.87,
so we can declare A and D to be significantly different and give them different letters:
if this one has letter a, the other should have letter b.
Then we can compare A and C, and if they are different we also give them different letters.
Here you can see the difference between A and C is 8; 8 is still greater than 5.87,
so we can consider them to be significantly different.
When you come to this one, the difference is also about 8, and that is why you have different letters.

Now if we run the second one, we are using something called the minimum significant difference,
which in this case has a value of about 8.87.
If you compare A and D, you can see that the difference between them is still greater than 8.87.
But once you compare A and C, the difference between them is 8, and 8 is less than 8.87,
so that is why they now have the same letter.
If you compare C and B, the difference between them is also less than the minimum significant difference.

Now, why do we adjust for Bonferroni?
In this case we have A, B, C and D.
In terms of pairwise comparisons, I am just writing this down: we need to compare A and B,
A and C, A and D, B and C, B and D, and C and D.
When we set our significance level at 0.05, every time we make a comparison
that level of significance is 0.05, so in the end you accumulate a lot of error.
By the end of the day your type I error is going to increase; it gets a lot bigger,
and there is a higher chance that even what is not significantly different
may be declared significantly different simply by chance alone.
That is what we have actually seen in our case.
When you adjust for Bonferroni, what you are simply doing is becoming a lot stricter.
Instead of having your alpha at 0.05, you divide 0.05 by the number of pairs you are comparing.
How many pairs did we compare in this case? We listed six, so alpha becomes 0.05 / 6, about 0.0083.
If there were four comparisons, say, alpha would change to 0.0125 instead of 0.05.
So you become a lot stricter when you are making the comparisons.
There are several adjustments that can be done; there are a lot more
that you need to read about in the literature. Okay.
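The accumulation of error described here can be put into numbers. The sketch below counts the pairs among four treatments and computes the chance of at least one false positive, under the simplifying assumption (mine, not the speaker's) that the comparisons are independent, together with the Bonferroni per-comparison alpha.

```r
alpha        <- 0.05
n_treatments <- 4

# Number of distinct pairs among A, B, C and D
n_pairs <- choose(n_treatments, 2)

# Chance of at least one false positive across all pairs,
# under the simplifying assumption that the tests are independent
familywise_error <- 1 - (1 - alpha)^n_pairs

# Bonferroni: per-comparison alpha that keeps the overall rate at or below 0.05
alpha_bonferroni <- alpha / n_pairs

round(c(pairs = n_pairs, familywise = familywise_error, per_test = alpha_bonferroni), 4)
```

With six comparisons at 0.05 each, the overall chance of a spurious difference is roughly one in four, which is the motivation for tightening the per-comparison level.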
So we can now run our example number two, very quickly, because it is the same thing.
The first thing we have done is to create the data set,
so let's run line 145.
I am not going to do a lot of explanation here because what we have is similar.
Always check what you created.
Okay, let's have a 10-minute break; we start again at four.
Yes, there are different ways of correcting for multiple comparisons.
Okay, so we are starting again after 10 minutes.
I am going to mute myself, so don't expect me to be saying anything.
Do you want to share the screen?
Thomas, who did you say wants to share? Hello?
Okay. Oh, okay, okay.
Well, we have promoted him. Okay.
Yes, okay. Please share your screen.
I think there is a challenge with the...
Okay, okay. So please share your screen.
Unmute yourself.
Yes, yes, please go ahead.
Hey, so where should I go from here?
Look down for the share screen button, in green,
then select the screen to share, which should be your RStudio.
Wait, do you see something down there, "Share Screen", in green?
Yes. Yeah, click on that.
Click, and then select which screen you want to share.
I select my screen, isn't it?
Yeah, the one you want to share. There will be a number of screens showing up on your side;
then you select one and double-click on it.
It is still showing a number of names of participants here.
There is a "Share Screen" in green, yes?
Share screen, yes. What does it tell you?
It should give you a window where you can select what to share.
I'm seeing something now. Okay, RStudio.
Yes, you click on RStudio, then click Share.
Yes, this is my screen. Okay, it's coming. Not yet, but it's going to come. Yes.
Are you able to see it? Okay, full screen. Okay, yes.
Uh-huh, so where is the problem?
Until now I have been trying to run it. You said we run from number 40, and I have tried to run it.
Okay. I started from the library.
Yes, when you run, you run all the libraries. Yes, okay.
Yes, let me first take you here.
I'm seeing a lot of errors.
First clean up. Which errors? Not everything in red is an error.
I should clean everything here? Okay.
Yes.
Skip five.
Do I go to six?
Hello? Run line six. I said six.
Number six, okay.
Number eleven?
From number six, then up to number 11. Just run until 11. Okay.
All right.
Let's see how it is coming. No, but you need to wait. Okay, you need to wait.
That's fine. Up to 11, yes.
Line 11 is not yet done. Okay.
Okay, can you go down to where the bones data starts?
41 is where it is, yes.
Can you run from number 41 onwards?
Okay. Just continue running.
You'll tell me where it is over? Yes, yes, continue running.
Okay.
I think I'm seeing something interesting.
Yes. Yeah, continue running. Yes, run.
Now I have started seeing some error down here.
Where is that? We are not seeing anything.
There is an error in plot, so you need to adjust that pane and make it bigger.
Can you click on the extreme end of the plot pane? There is a smaller and a bigger box on it.
Where the graph is, not there. Wait, on the extreme end.
Here? Same line?
No, no, no.
You saw the box down there? Go to the extreme end of the fourth window; it is the box on top.
This one? Yeah, okay, click on that.
Can you adjust the left-hand edge? Try to pull it and make that one bigger.
Okay, if you think you are doing it wrong, then push it to the other side.
Okay, now.
All right, can you now try to run line 104?
Yes, okay, yes, you run.
I have run it, but it is again showing the same error.
We are seeing something here. Go on and continue running; there is no error there.
I ran one.
Okay, you continue running.
No, no, check. The five is not okay. Plot five does not run, so you need to change that too;
you can remove the five and just run one and two.
Okay. Then I run this again? Yes, and run that one also.
Okay.
Yeah, so you can continue running the rest of the lines; there should be no problem now.
Okay, thank you.
So you run quickly and we see what is coming, if you want us to continue staying there.
Okay.
Okay, you can now stop sharing the screen and continue running on your own. Okay?
Okay, thank you.
Okay, so let us now run the remaining part.
Can we start running now from line 145?
Yes, let's run from line 145; you should be able to get the plots that we had before.
So we run 145, 146, 147, 148, 149.
Okay, look, the green part is the comment explaining to you what you are doing.
So we can have a summary of yield.
Okay, so let's run 153, 154, 155, 156, 157, 158.
Okay.
Then what we do, yes, is highlight the irrigation data and run that data alone.
What we did there was just to create this data set.
Here you have replicates one, two, three, four, five for the treatment no irrigation,
then you have one, two, three, four, five for irrigation method one,
then for irrigation method two, irrigation method three and irrigation method four.
That is all the steps we have done: the first step was to create the yield,
then we created the treatment, then we created the replicate,
and then we brought them all together, just like what we did in the previous example.
Okay.
So once you have created your data set, you first need to make sure you have created the correct thing,
so you need to print it and see what is going on.
Let's look at the names of the columns.
This one tells us that there are three columns: the first column is called replication,
the second column is called treatment, and the third column is called yield.
We can also check the first six observations.
So you can see the first six observations.
Okay, so what we are running here is the same thing that we ran with the first data set;
I just brought a new data set so that we can see some of the differences.
You can look at the structure.
It will tell you that this is a data frame with 25 observations on three variables.
The first variable is replication; it says it is character.
You can see everything is put in double quotes; that means it is character.
The second is treatment, which is also character; we have everything in quotes.
The third one is yield; it is also telling us this is character,
but remember yield is supposed to be numeric, because that is what we are analyzing.
So what we are going to do now is modify our data set.
First we need to convert replication and treatment into factors,
and then we need to convert yield into numeric using as.numeric.
Then we can print and see what we have done again.
Okay. Then we can also check the structure of what we have done.
If it says the object for the irrigation data is not found, that means you have not created it,
so please go and run the previous steps first; it means you skipped some step.
If you first go and run the other steps and then come back, you will see the error will no longer be there.
Somebody asked, why not five observations?
If you want five observations, then you put head, the name of the data set in brackets,
a comma, and then five.
I don't know what the person who wrote head was thinking when deciding it should show the first six observations. Okay.
So we have seen now that we still have a data frame.
The first column is called replication; it is now a factor with five levels,
and the levels are one, two, three, four, five.
The second column is treatment; it is also a factor with five levels.
And the last one, yield, is now numeric.
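The conversion steps just described can be sketched as follows. The object name `irrigation`, the column values and the treatment labels are assumptions made for this illustration; only the column names replication, treatment and yield come from the session.

```r
# Hypothetical stand-in for the irrigation data set; the values are invented,
# and the object name and treatment labels are assumptions
irrigation <- data.frame(
  replication = as.character(rep(1:5, times = 2)),
  treatment   = rep(c("none", "method1"), each = 5),
  yield       = c("101", "98", "105", "99", "102", "120", "118", "125", "119", "121"),
  stringsAsFactors = FALSE
)
str(irrigation)   # all three columns are character at this point

# Convert the grouping columns to factors and the response to numeric
irrigation$replication <- as.factor(irrigation$replication)
irrigation$treatment   <- as.factor(irrigation$treatment)
irrigation$yield       <- as.numeric(irrigation$yield)

str(irrigation)       # replication and treatment are factors, yield is numeric
head(irrigation, 5)   # first five rows instead of the default six
```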
Okay, so we can go ahead and do the aggregation.
This is the same thing that we did before.
It says that for treatment one the minimum yield is this,
then the first quartile, the median yield, the mean yield, the third quartile,
and then the maximum yield.
You can run this for lines 170, 171, 172, 173, 174.
This one is good because it gives you a good summary that you can use;
at least it will tell you a few things about your data before you start the analysis.
So let's run a box plot; that's line 179.
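A minimal sketch of that per-treatment summary, assuming invented yields and an assumed object name `irrigation`: `aggregate()` with `summary` gives the minimum, quartiles, median, mean and maximum for each treatment, and `tapply()` gives a single statistic per group.

```r
# Invented yields for three of the treatments; not the workshop values
irrigation <- data.frame(
  treatment = factor(rep(c("none", "method1", "method2"), each = 5)),
  yield     = c(100, 98, 105, 99, 103,  120, 118, 125, 119, 123,  190, 210, 230, 200, 220)
)

# Minimum, quartiles, median, mean and maximum of yield for each treatment
yield_summary <- aggregate(yield ~ treatment, data = irrigation, FUN = summary)
print(yield_summary)

# The same idea for a single statistic, returned as a named vector
yield_means <- tapply(irrigation$yield, irrigation$treatment, mean)
yield_means
```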
Okay, so based on the box plot,
let me first expand it a bit and then ask you a few questions.
You can see from here method one, method two, method three, method four and the no irrigation method.
Which method produced the highest yield?
Okay, so method two produced the highest yield.
Do you think there is a significant difference between method four and no irrigation?
What can you say about the likely difference between method four and no irrigation?
Okay, you think there is no significant difference.
Okay, so what is clear from here is that method two seems to be completely off on its own,
and the other methods do not seem to fare very differently from the no irrigation method.
What can you also say about the variability?
There is a lot more variability in method two, and there is very little variability in method four.
Okay. Let me change the screen and go back to my RStudio.
Okay.
So we can also look at the yield by replication.
There seem to be some particular values that are high up.
What could be happening with the values that are up?
Can somebody tell us in the chat?
Somebody said they are outliers.
Okay, but from which treatment do you think they are coming?
Yeah, they could be coming from treatment two, which is the one with the highest yield.
So these are the other four methods, and these should be method two.
So you can actually see it. Okay, good.
Then we can go ahead and run our model number two and do the summary.
You can see from here that the p-value says there is a significant difference
between at least one pair of treatments, one pair of irrigation methods.
We can look at the summary just like we did before.
We can also go and do the residual plots; it should be a lot easier now for most of you.
Oh, sorry, I first need to draw it again.
Okay, I want you to remove the five for now, plot five, and do one and two.
Then let's do one of the box plots first, before we run line 196,
so that we can compare observations.
I am going to change the screen that I am sharing so that I share the zoomed one,
and we have a look. Okay.
So you can see from here that the values here are close together and the values here are close together,
but there is one group with a bit more variability,
and the one with more variability seems to be the one with values around 200 kilograms.
This is the one for treatment number two.
So you can see that treatment number two, which has the highest yield,
is also the one with a lot more variability compared to the other treatments;
those are the ones clustered together here.
Now if we look at the normality assumption, we seem to be doing well.
If you look at the randomness, this also seems to be okay.
And then you see that method two is still the one with a lot more variability
compared to the other methods that we are looking at.
Okay, so you should then be in a position to interpret this without really having any problem.
Okay, so let me continue.
I want to use the last 20 to 30 minutes to introduce experimental design,
which we will summarize tomorrow.
Okay, so we can go and run line 195, so that we have just one plot at a time.
So you can see this is the residuals versus treatment,
then residuals versus replication.
We can also plot our histogram.
Okay, you can see your beautiful curve here.
This one also tells us that there seems to be no problem with normality; we seem to be doing well,
although there seems to be a slightly longer tail on this side compared to the other side.
Okay. Then we run the test that we did before for homogeneity of variance.
Here the p-value is 0.0658.
So are we rejecting the null hypothesis or not?
Can I see in the chat: are we rejecting the null hypothesis?
Okay, so we are not rejecting the null hypothesis,
which means we conclude that the variances in the different groups are equal.
Good. You guys are now becoming super students.
Okay, so let's run the Levene test; it should confirm what we had before.
The p-value is greater than 0.05, so again we do not reject the null hypothesis.
So in this case we say the variances are homogeneous.
But let's also apply the rule of thumb, where we look at the maximum variance versus the minimum variance.
The maximum variance is in method two, and the minimum variance is in method four.
If you take the ratio, you have 393.46673 divided by the smallest, which is 30.76787.
Okay, we have a value of about 12.
If you run these two lines, you get a value of about 12.79.
By the rule of thumb, we check whether this value is greater than five, and it is.
So even though the two tests above indicate that there are no significant differences
in the variances within the different groups,
the rule of thumb seems to suggest otherwise.
We can do the Shapiro test as well.
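The rule-of-thumb check is simple arithmetic, sketched below. The first part uses the two variances quoted in the session; the second part shows how the same ratio could be obtained from raw data, using invented yields and an assumed object name `irrigation`. The cut-off of five is the one mentioned by the speaker.

```r
# Ratio quoted in the session: largest variance over smallest variance
ratio_quoted <- 393.46673 / 30.76787
round(ratio_quoted, 2)

# The same check from raw data; these values are invented for illustration
irrigation <- data.frame(
  treatment = factor(rep(c("method2", "method4"), each = 5)),
  yield     = c(190, 210, 230, 200, 220,  100, 102, 104, 101, 103)
)
group_var <- tapply(irrigation$yield, irrigation$treatment, var)
variance_ratio <- max(group_var) / min(group_var)

# Threshold of five as used in the session
exceeds_rule_of_thumb <- variance_ratio > 5
```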
And then we can do our LSD.
You see the LSD now shows that letter a, which is method two,
is different from all the rest,
but methods one and three are not different from each other.
Method four is not different from no irrigation,
but four is different from the others as well.
Now if we do the Bonferroni correction, let's see what will happen.
You can see that with the Bonferroni correction things have changed.
Letter a, method two, is still different from all the rest.
Methods one and three are not different,
but method one is different from method four and no irrigation,
while method three, method four and no irrigation are the same; they share the same letter.
Okay.
You are getting an error because you first need to go up and load the library.
You need to call the library that has the function, please.
Leonard, are you back in Kenya?
Okay, so that was what we had before.
Okay.
So now we can run what we call a randomized complete block design.
In the other case we assumed that there was no blocking.
I am going to talk about blocking when we are talking about experimental design.
So let's run from line 231 downwards.
Always check what you have created.
Look at the structure.
Change the bone time to numeric, change the dog type to a factor,
change the bone type to a factor, and check again.
People are insisting on the registration link.
Okay, so now you should be able to run this without any problem.
You can run your box plot,
and you can check which one seems to be higher than the rest and which one has more variability.
Okay, so you can see that the big dogs take a shorter time to crush the bone
and the small dogs take a longer time to crush the bones,
but there is a lot more variability among the small and the medium-sized dogs in crushing the bones.
Now we can go ahead.
Here we have two things: we have the type of dog, and we also have the type of bone.
Now we are going to run a simple model, model number three,
in which we want to explain the time taken by a dog to crush the bone
using only the type of bone; we want to see whether there are differences between the types of bone.
In the second model we want to see whether there is a difference between the types of bone
while also taking into consideration the fact that there are different types of dogs.
I will talk a bit more about this in my experimental design session;
this is a toy example of mine that I like using frequently.
Okay, so we are going to run model 3 and model 4,
and then we are going to check the results.
Okay, so you can see from here that for model number three the p-value is 0.0881.
So is there a significant difference between the types of bone?
I want to see it in the chat: is there a significant difference?
No, no, no, no. Great, good.
You can now stop typing.
Okay, now let's look at model number four.
Ah, what happens now? Is there a significant difference between the types of bone?
Okay.
So after taking into consideration the type of dog,
you now see that the bone types become significantly different.
That means that if you don't correct for the type of dog,
your residual error is going to be much, much bigger.
You can see from here that the bone type is significant,
and the types of dog are also significantly different from each other.
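A minimal sketch of this effect, with invented crushing times: three dog sizes act as blocks and four bone types as treatments. The object and column names (`dog_example`, `dog`, `bone`, `time`) are assumptions for illustration. Leaving the block out pushes the dog-to-dog variation into the residual, which hides the bone effect.

```r
# Invented data: 3 dog sizes (blocks) x 4 bone types; not the workshop values
dog_example <- data.frame(
  dog  = factor(rep(c("big", "medium", "small"), each = 4)),
  bone = factor(rep(c("A", "B", "C", "D"), times = 3)),
  time = c(10, 12, 14, 17,  30, 33, 34, 36,  50, 52, 55, 56)
)

# Bone type only (block ignored) versus bone type plus dog size as a block
model_bone_only <- lm(time ~ bone, data = dog_example)
model_with_dog  <- lm(time ~ bone + dog, data = dog_example)

table_bone_only <- anova(model_bone_only)
table_with_dog  <- anova(model_with_dog)

# p-value for bone type under each model
p_bone_only <- table_bone_only["bone", "Pr(>F)"]
p_with_dog  <- table_with_dog["bone", "Pr(>F)"]

# Residual mean square under each model
mse_bone_only <- table_bone_only["Residuals", "Mean Sq"]
mse_with_dog  <- table_with_dog["Residuals", "Mean Sq"]
```

In this invented example the bone effect is not significant when the dog size is ignored and becomes significant once it is included, because the residual mean square drops sharply.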
Then you can go ahead and look at the residuals like we did before; it is still the same.
Allow me to rush now, because I think we have done this,
so that we also become advanced, as has been promised.
We need to advance slowly but surely. So we can run from line 263.
Okay, so what I am trying to do here is compare plot 1 for model 3 and model 4,
and then compare plot 2 for model 3 and model 4.
Okay, so let me just expand this, then share the screen, and we try to explain what is going on.
Okay, so first, you can see that here there is a straight line,
but here there seems to be some kind of pattern:
the residuals seem to be moving in this direction and forming a kind of curve.
So here we don't seem to have a big problem, but there seems to be a problem here:
the model that we are using does not seem to fit the data properly.
Now let's compare normality for model 3 and model 4:
it seems to be okay in both cases; this one is fine and this one is fine.
But this one is telling us that there is some kind of lack of fit in our model.
You can see that for smaller values the errors are positive,
for larger values the errors are also positive,
and for values in the middle the errors are closer to zero,
so there the estimation seems to be okay, although they are mostly negative.
So we seem to see some kind of pattern in our data set.
Okay, let me stop sharing and I will share the next one.
Okay, so we can also run line 271, which is for model 3,
and we can run the same for model 4.
This one said "dog" not found; I think it should be "dogs", an "s" is missing there.
Okay, then we can also run the box plots for the two models,
and then we can compare them.
So let me change to the zoomed one.
Okay.
So what you can see from here is that for model number three,
if you look at the first four dogs, their residuals are negative;
the next dogs have residuals that are close to zero;
and the last dogs have residuals that are positive.
This is a clear indication that there is lack of fit here; there is a pattern in this.
If you look at the residuals for model 4, we seem to see a random distribution in this particular case.
Now if you look at the bone type, we seem to be okay.
For model 3 we still have high variability here and there,
but for model 4 we seem to have a very good fit for C and also a good fit for B,
while the fit for D seems to be a little bit off.
These are the things you need to look at when you are doing your data analysis,
and you should try to get an explanation of what is really going on in your data set.
Okay, let me stop this, and then we can continue with the different plots.
I am not so much interested in these plots anymore; you can go and look at them,
but let me just have them ready so we can compare them.
Now you can see the importance of putting these plots side by side.
By putting the plots side by side, you can make your comparisons of the models
a lot more easily than if they are not side by side.
You can see from here that for big dogs the residuals are all negative
and for small dogs the residuals are all positive;
it is only for the medium-sized dogs that we have both negative and positive residuals.
This is a clear sign that for the big dogs the model is overestimating
the time the dogs take to crush the bones, while for the small dogs it is underestimating;
the model seems to work fairly well only for the medium-sized dogs.
Now if you take the type of dog into consideration, that is model number four,
you can see that the residuals are all now above and below zero, so there is no pattern;
the pattern that we saw here has completely disappeared.
The pattern that we see in model 3 is there because we did not include the dog type in our model.
Once we include the dog type in our model, the residuals become random,
and what you want is residuals that are random, not residuals that are non-random.
The normality is fairly okay.
In most cases normality is not a very big challenge as far as analysis of variance is concerned,
unless the sample size is extremely small,
and analysis of variance is fairly robust to violations of normality.
The main challenge is always the issue of non-constant variance.
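The residual pattern described for model 3 and its disappearance in model 4 can be reproduced on invented data. In this sketch the object and column names (`dog_example`, `dog`, `bone`, `time`) and all values are assumptions for illustration; residuals are observed minus fitted, so negative residuals mean the model predicts too long a time.

```r
# Invented data: 3 dog sizes (blocks) x 4 bone types; not the workshop values
dog_example <- data.frame(
  dog  = factor(rep(c("big", "medium", "small"), each = 4)),
  bone = factor(rep(c("A", "B", "C", "D"), times = 3)),
  time = c(10, 12, 14, 17,  30, 33, 34, 36,  50, 52, 55, 56)
)

model_bone_only <- lm(time ~ bone, data = dog_example)
model_with_dog  <- lm(time ~ bone + dog, data = dog_example)

# Residuals grouped by dog size for each model
resid_by_dog_bone_only <- split(residuals(model_bone_only), dog_example$dog)
resid_by_dog_with_dog  <- split(residuals(model_with_dog),  dog_example$dog)

# Bone-only model: big dogs all negative, small dogs all positive (a pattern)
sapply(resid_by_dog_bone_only, range)

# Model with dog size: residuals fall on both sides of zero within each group
sapply(resid_by_dog_with_dog, range)
```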
Okay, let me see whether there are some questions.
Okay, we have a few minutes to go, but that's fine; let's run the remaining script.
We can still repeat the same tests; you can run this,
and I am not going to explain it anymore because we know how to interpret it.
What I want to do now is look at the post hoc tests for the two models.
So here you can see that D, A and C are not significantly different;
they all have the letter a.
But B, C and A all have the letter b.
So the only difference here is between D and B;
the other ones are not different from each other.
Okay, now let's see what happens when we adjust for Bonferroni.
When you adjust for Bonferroni, you see everything is now not significant,
because Bonferroni is a lot stricter.
So sometimes, if you are too strict, you may actually miss real differences.
Now let's look at our model number four.
Model number four seems to indicate D being different from the rest,
while the other ones are all the same.
Now let's look at the adjustment and see what will happen.
Okay, when you adjust it is still the same.
That means D is actually far apart from the others,
so it doesn't matter whether you apply Bonferroni or not;
you will still have that significant difference we are talking about.
Collins asks: when do we use Holm's method of adjustment?
I give that to you as an assignment. You go and read about it;
I don't want to bring it in here.
Tomorrow, first thing in the morning, Collins is going to explain to us
when you would use Holm's method.
Okay.
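As a starting point for that reading, here is a small sketch comparing the two adjustments with base R's `p.adjust()`. The three p-values are invented. Holm's step-down procedure multiplies the smallest p-value by the number of tests, the next by one fewer, and so on, so its adjusted values are never larger than Bonferroni's.

```r
# Invented unadjusted p-values for three pairwise comparisons
p_raw <- c(0.01, 0.02, 0.03)

# Bonferroni multiplies every p-value by the number of comparisons
p_bonferroni <- p.adjust(p_raw, method = "bonferroni")

# Holm multiplies the smallest by 3, the next by 2, the last by 1,
# then enforces that adjusted values do not decrease
p_holm <- p.adjust(p_raw, method = "holm")

data.frame(p_raw, p_bonferroni, p_holm)
```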
Okay, so it is the same thing here; we have another data set, so you can also run this.
I think we are just going to run this and finish, and then we call it a day.
We have had a lot of examples here for you to appreciate analysis of variance.
This is one-way analysis of variance because we have only one factor.
Tomorrow we are going to look at two-way analysis of variance,
and we may also look at how you analyze a split plot and so on,
but that depends entirely on the time we have.
So we can run this.
Now, this is a lot easier because we have created our data set in R,
so you don't get errors from trying to bring the data into R.
But tomorrow we are going to be getting data that is already saved, and we are going to analyze them.
Let me share the plots here, the zoomed ones.
Please put your questions in the Q&A; don't put them in the chat,
because people have put a lot of things in the chat.
Okay, so if we look here at variety A,
its performance seems to be very uniform; there is really little variation.
There is a lot of variability in variety B, and I think there is some variability in variety E.
Now when you look at the blocks, block one seems to have measurements that are quite uniform.
Block two seems to have more variability, followed by block three and then block four.
So there is low variability in block one.
So we can go ahead and run our models five and six.
This is similar to what we did before: we run model 5 with variety only
and model 6 with variety and block,
and then we can go ahead and look at the differences in the results.
Okay, so you can see from model 5 that the varieties are not significantly different,
because the p-value is greater than 0.05.
Let's look at model 6 and see whether there is any change.
For model 6, variety is also not significant,
but there are significant differences between the blocks.
You can see this has resulted in a reduction in the residual mean square:
you have 0.1 compared to 0.2, so you have reduced the residual mean square by almost half.
now what happens here is that
the variety is not
significantly different and when the
variety is not significantly different
there is no need for you to
go ahead and do the post hoc test using
LSD now
if you do use LSD anyway
you may actually find that some
varieties are different from each other
and that will be due to what we call
chance now
if you've heard of Fisher's protected LSD
it means that
you only do LSD when there
is a significant difference detected by
the analysis of variance table
that's what is called protected LSD so you
don't proceed to do LSD unless there is
a significant difference
detected in the analysis of variance
table
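
The "protected" rule can be sketched in base R as a small gate in front of the LSD calculation. The helper `protected_lsd()` below is hypothetical (it is not from the workshop script or from any package); it assumes a balanced design and uses the built-in `PlantGrowth` data purely for illustration.

```r
# Hypothetical helper: Fisher's protected LSD for one factor in a fitted lm.
# Returns NULL (no pairwise testing) unless the ANOVA F-test is significant.
protected_lsd <- function(model, factor_name, alpha = 0.05) {
  tab <- anova(model)
  p_f <- tab[factor_name, "Pr(>F)"]
  if (p_f >= alpha) {
    message("F-test for ", factor_name, " not significant (p = ",
            signif(p_f, 3), "); LSD not computed.")
    return(NULL)
  }
  mse <- tab["Residuals", "Mean Sq"]   # residual mean square
  dfe <- tab["Residuals", "Df"]        # residual degrees of freedom
  r   <- min(table(model$model[[factor_name]]))  # replicates per level (balanced design assumed)
  # Least significant difference between two treatment means
  qt(1 - alpha / 2, dfe) * sqrt(2 * mse / r)
}

# Illustration with a data set that ships with R
fit <- lm(weight ~ group, data = PlantGrowth)
protected_lsd(fit, "group")                # F-test significant at 0.05: LSD returned
protected_lsd(fit, "group", alpha = 0.01)  # not significant at 0.01: NULL, no LSD
```

Two means are then declared different only when their absolute difference exceeds the returned value, and only when a value is returned at all.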
so from line 349 onward it is just the
same simple running of the graphs that we had
before
so you can
have them there you can compare them
you can run them
and try to explain
them the way we did up there I just want
here to go to the post hoc test to
show you why it's important to use
Fisher's protected LSD
because sometimes analysis of variance
tells you there is no difference between
the things you're trying to compare
but if you insist and go ahead
and do the LSD test you might actually find
significant differences
let me go to the last
okay so we are going to run lines 411
and 412
so let's run line 411
so you can see from line 411
for model 5
when you run line 411
actually everything is
so there's no significant difference
between the varieties you're
testing but we should actually not even
do the LSD test because we already know
from the analysis of variance that the
pairs are not significantly different
let's run our
with adjustment you don't expect any
changes but let's see what will happen
when you run
model 6.
okay now see what happens in model 6
now model 6 is telling us that
A
is actually different from E and F
but remember the analysis of variance
table told us that the pairs
are not significant
remember in analysis of variance
we failed to reject H naught and when we
fail to reject H naught we are simply saying
the means are all the same but if you
insist and go on and do your LSD you
actually find significant
differences now this is where the
correction becomes critical now
when you come and run line 415
you can see now because of the
correction everything is actually a a a so
here you can see that the correction is
really very useful
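
The effect of a multiple-comparison correction on pairwise p-values can be sketched in base R as below. The transcript does not say which correction line 415 of the script applies, so Bonferroni is an assumption here, and the built-in `PlantGrowth` data (a one-way layout without blocks) stands in for the workshop data. In agricolae, `LSD.test()` takes a `p.adj` argument for the same purpose.

```r
# Unadjusted pairwise t-tests with a pooled SD (LSD-style comparisons)
raw <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                       p.adjust.method = "none")
# The same comparisons with a Bonferroni correction (assumed method)
adj <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                       p.adjust.method = "bonferroni")

raw$p.value
adj$p.value

# The same adjustment applied by hand to the raw p-values
p_raw <- raw$p.value[!is.na(raw$p.value)]
p_adj <- p.adjust(p_raw, method = "bonferroni")
p_adj
```

Every adjusted p-value is at least as large as its unadjusted counterpart, which is why pairs flagged by plain LSD can fall back into a single letter group after correction.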
okay so I think uh we have four six
minutes I'm going to ask you to run the
remaining part
so let me just explain what is remaining
here so we have the iris data the iris
data is one of the data sets
that comes together with R
so you don't need to do anything
if you want to know what the iris data is
all about you run line 422
then it gives you a description then you
can run the rest of the lines
without any problem and this one you can
also see some beautiful
maybe you'll run and then explain one
person can demonstrate for us tomorrow
so let me just show you this picture
then we call it a day
okay so here we are looking at the
sepal length
versus the sepal width
but then here we colored it using the
different species so you can see the
different species showing different
colors this is sepal length and
sepal width and then
here it is the length versus the
width
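
A plot like the one described can be sketched with base R graphics as follows. The workshop script may well use a different plotting package; the colour choices and the name `species_cols` are illustrative assumptions, while `iris` and its columns ship with R.

```r
# iris ships with R; running ?iris opens its description
str(iris)

# One colour per species (arbitrary choice for illustration)
species_cols <- c(setosa = "red", versicolor = "darkgreen", virginica = "blue")

# Sepal length against sepal width, points coloured by species
plot(iris$Sepal.Length, iris$Sepal.Width,
     col = species_cols[as.character(iris$Species)], pch = 19,
     xlab = "Sepal length (cm)", ylab = "Sepal width (cm)")
legend("topright", legend = names(species_cols),
       col = species_cols, pch = 19)
```

Swapping in `Petal.Length` and `Petal.Width` gives the corresponding petal plot.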
so at this point I'd like to hand you
back to my boss
to proceed from there
so Susan
you have your student with you so thank
you very much everyone
so thank you very much everyone
thank you very much ladies and gentlemen
you've been so patient and except
this is
uh we thank you so much for tomorrow
we'll have the three and uh depending on
what my colleague Dr Thomas has covered
probably we will come in with the
regression and correlation tomorrow
okay so I'll hand over to
Ada David
amitu or Dr nyararo or Selma whoever is
around
to close the session thank you so much
and we see you tomorrow
thank you very much Prof might be just a
quick quick one before you you hand over
to us there was a question that I saw in
the chats asking if R is only for
agriculture yeah
can you hear me
yeah you
you can okay there was a question in the
chat that was asking if R is only for
agriculture
no no R is not only for agriculture we
can also use it for survey data
if there is time Dr Helen can
demonstrate that when she's
talking about survey sampling
okay thank you thank you very much thank
you very much Prof to you and the
team thank you very much Dr adong for
taking us through day
two of this course and training which is
going on very well we just want to thank
the participants and also thank my
colleagues who are online and are busy
responding putting up the links here and
there thank you very much colleagues and
we call it a day please if you have
missed anything go and catch up on the
YouTube channel so that tomorrow you
come and be on the same page thank you
and see you tomorrow