No data is perfect. However much data you work with, you will run into missing values, and algorithms will not return good results if the data quality is poor. So, today, we are going to discuss the na.omit() function in R. This function removes missing values, i.e. NA values, from the data.

Syntax of the na.omit() function in R

na.omit(): This function removes the missing values (NA values) from the data.

na.omit(data)

Here,

data = The input data, for example a vector or a dataframe.

Simple example of the na.omit() function in R

Now that the syntax is clear, let’s see an example that shows how na.omit() works. Let us take a vector as input data and pass it to the na.omit() function.

#Input vector
df <- c(34,56,4,NA,78,5,438.9,NA,7.88,45,0.56)
#Eliminates the NA values.
na.omit(df)
[1]  34.00  56.00   4.00  78.00   5.00 438.90   7.88  45.00   0.56
attr(,"na.action")
[1] 4 8
attr(,"class")
[1] "omit"

As you can see, the output is free of NA values. The result also carries an attribute, na.action, which records the positions of the NA values in the input data (4 and 8). The class of that attribute is "omit", which indicates that those values were omitted.
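The positions stored in na.action can also be read in code, which is useful when you need to know which observations were dropped. The following is a minimal sketch; the variable names x, cleaned, removed and plain are illustrative and not part of the original example.

```r
# Illustrative names: x, cleaned, removed and plain are assumptions for this sketch
x <- c(34, 56, 4, NA, 78, 5, 438.9, NA, 7.88, 45, 0.56)
cleaned <- na.omit(x)

# Positions of the removed NA values in the original vector
removed <- as.integer(attr(cleaned, "na.action"))
print(removed)

# Strip the bookkeeping attributes to get a plain numeric vector
plain <- as.vector(cleaned)
print(length(plain))
```

Calling as.vector() is optional; most functions work fine with the attributes attached, but a plain vector prints more cleanly.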

Using na.omit() with a dataframe in R

Now, let us consider a dataframe that includes NA values. We can apply the na.omit() function to remove the rows that contain missing values.

#Input dataframe
df <- datasets::airquality
#Display the first 10 rows
head(df, 10)
    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
2      36     118  8.0   72     5   2
3      12     149 12.6   74     5   3
4      18     313 11.5   62     5   4
5      NA      NA 14.3   56     5   5
6      28      NA 14.9   66     5   6
7      23     299  8.6   65     5   7
8      19      99 13.8   59     5   8
9       8      19 20.1   61     5   9
10     NA     194  8.6   69     5  10
#Removes the rows that contain NA values (first 8 rows of the result shown)
head(na.omit(df), 8)
    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
2      36     118  8.0   72     5   2
3      12     149 12.6   74     5   3
4      18     313 11.5   62     5   4
7      23     299  8.6   65     5   7
8      19      99 13.8   59     5   8
9       8      19 20.1   61     5   9
12     16     256  9.7   69     5  12

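As you can see, for a dataframe the function removes every row that contains at least one NA value, not just the NA cells themselves. Rows 5, 6 and 10 are gone from the output, and the original row numbers are kept for the rows that remain.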
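Because whole rows are dropped, it is worth checking how much data is lost before relying on the cleaned dataframe. Here is a small sketch using the same airquality dataset; the variable name cleaned is illustrative.

```r
df <- datasets::airquality
cleaned <- na.omit(df)  # 'cleaned' is an illustrative name

# Number of rows before and after removing incomplete rows
print(nrow(df))
print(nrow(cleaned))

# complete.cases() flags rows without any NA; na.omit() keeps exactly those rows
print(sum(complete.cases(df)) == nrow(cleaned))

# Number of missing values in each column
print(colSums(is.na(df)))
```

If most of the missing values sit in one or two columns, removing whole rows may discard more data than necessary.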

Handle missing values in a particular column

Till now, we have removed the missing values from a vector and a dataframe. But what if you need to remove the missing values present in only one particular column?

You can do that by passing just that column to the na.omit() function. Let’s see how it works.

#Input dataframe
df <- datasets::airquality
#Display the first 10 rows
head(df, 10)
    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
2      36     118  8.0   72     5   2
3      12     149 12.6   74     5   3
4      18     313 11.5   62     5   4
5      NA      NA 14.3   56     5   5
6      28      NA 14.9   66     5   6
7      23     299  8.6   65     5   7
8      19      99 13.8   59     5   8
9       8      19 20.1   61     5   9
10     NA     194  8.6   69     5  10
#Removes the NA values from the Ozone column (first 10 values shown)
head(na.omit(df$Ozone), 10)
 [1] 41 36 12 18 28 23 19  8  7 16

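These are the first 10 non-missing values in the column “Ozone”. In this way, you can pass a single column to the function to deal with it alone. Note that the result is a vector of that column’s values, not a dataframe; the other columns are not touched.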
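If the goal is instead to keep the dataframe but drop only the rows where one column is missing, a common approach is to subset with is.na(). The sketch below is one way to do it, not the only one; the variable name ozone_known is illustrative.

```r
df <- datasets::airquality

# Keep every row whose Ozone value is present;
# NA values in other columns (e.g. Solar.R) are left alone
ozone_known <- df[!is.na(df$Ozone), ]  # 'ozone_known' is an illustrative name

print(nrow(ozone_known))
print(anyNA(ozone_known$Ozone))
print(anyNA(ozone_known$Solar.R))
```

This keeps more rows than na.omit(df), because a row is dropped only when Ozone itself is missing.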

Wrapping Up

The na.omit() function is very handy for data analytics and data pre-processing. With a single line of code you can remove the missing values from a vector, or the rows that contain them from a dataframe. That’s all for now. Happy R!!!

Read: R docs
