Delete rows with zero value under condition

by Noura   Last Updated September 20, 2018 20:26 PM

I have a data frame:

dt <- read.table(text = "
350 16 
366 11 
376  0
380  0
397  0
398 45  
400 19  
402 0
510 0
525 0
537 0
549 0
569 112
578 99")

I want to delete every row that has a zero in the second column, except the rows immediately before and after a non-zero value.

The result should be:

dt1 <- read.table(text = "
350 16 
366 11 
376  0
397  0
398 45  
400 19  
402 0
549 0
569 112
578 99")
Tags : r dataframe


Answers 3


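library(data.table)
setDT(dt)  # convert dt to a data.table by reference

# n0 flags the non-zero rows; keep a row if it or either neighbour is non-zero
dt[{n0 <- V2 != 0; n0 | shift(n0) | shift(n0, type = 'lead')}]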

#      V1  V2
#  1: 350  16
#  2: 366  11
#  3: 376   0
#  4: 397   0
#  5: 398  45
#  6: 400  19
#  7: 402   0
#  8: 549   0
#  9: 569 112
# 10: 578  99
Ryan
September 20, 2018 20:04 PM

A simple base R solution that compares the column with copies of itself shifted up and down by one row:

dt[dt$V2 != 0 | c(dt$V2[-1], 0) != 0 | c(0, dt$V2[-nrow(dt)]) != 0, ]
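The shifted-vector idea can also be spelled out step by step, which makes each condition easier to check. This is only a sketch: the names `nonzero`, `prev_nonzero`, `next_nonzero`, `keep` and `result` are illustrative, and padding the ends with `FALSE` assumes that a missing neighbour should count as zero.

```r
# Sample data from the question; read.table names the columns V1 and V2.
dt <- read.table(text = "
350 16
366 11
376 0
380 0
397 0
398 45
400 19
402 0
510 0
525 0
537 0
549 0
569 112
578 99")

n <- nrow(dt)
nonzero <- dt$V2 != 0

# Shift the flags by one position. The padded ends are FALSE
# (assumption: a missing neighbour counts as zero).
prev_nonzero <- c(FALSE, nonzero[-n])  # flag of the row above
next_nonzero <- c(nonzero[-1], FALSE)  # flag of the row below

# Keep a row if it is non-zero itself or is adjacent to a non-zero row.
keep <- nonzero | prev_nonzero | next_nonzero
result <- dt[keep, ]
```

Including `nonzero` in the condition means that a non-zero row surrounded by zeros on both sides would still be kept, a case that does not occur in the sample data.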
Patricio Moracho
September 20, 2018 20:19 PM

Using dplyr:

library(dplyr)

dt %>%
  filter(V2 != 0 | lag(V2) != 0 | lead(V2) != 0)

    V1  V2
1  350  16
2  366  11
3  376   0
4  397   0
5  398  45
6  400  19
7  402   0
8  549   0
9  569 112
10 578  99

Or:

dt %>%
  group_by(cond = V2 != 0 | lag(V2, 1) != 0 | lead(V2, 1) != 0) %>%
  filter(cond == TRUE) %>%
  ungroup() %>%
  select(-cond)

# A tibble: 10 x 2
      V1    V2
   <int> <int>
 1   350    16
 2   366    11
 3   376     0
 4   397     0
 5   398    45
 6   400    19
 7   402     0
 8   549     0
 9   569   112
10   578    99
tmfmnk
September 20, 2018 20:21 PM
