How to Remove Duplicates From NumPy Array in Python?

Are you struggling with duplicated data in a NumPy array and looking for a way to remove it? If so, you have landed in the right place. 🎉

In Python, the NumPy library provides many useful functions for manipulating data in arrays. In this article, we’ll look at multiple ways to remove duplicate values, rows, and columns from a NumPy array.

 

Note: For more information and detailed steps to download, install, and import NumPy on your system (Windows, macOS, Linux), refer to our complete guide here. Additionally, to update NumPy on your system using pip, you can refer to our guide for that here.

 

 

How to Remove Duplicates From a Numpy Array?

Sometimes you need to remove duplicate data from an array. The data may have been duplicated by mistake, or you may want to find out which values are repeated. There are several ways to remove duplicates from a NumPy array. Let’s discuss two approaches here:

  1. np.unique()
  2. set()

 

Method 1: Remove Duplicates From a Numpy Array Using np.unique() Function

We can use the np.unique() function to remove duplicates from an array. We pass the array as an argument, and the function returns a new array containing the sorted unique elements.

The syntax of this function is given below:

 

Syntax

# remove np array duplicates

np.unique(array)

 

a. Remove Duplicates from 1-D Array

Let’s use the np.unique(array) function to remove duplicates from a 1-D array.

 

Code

# import numpy
import numpy as np

# create a NumPy array that contains duplicate values
array = np.array([1, 2, 4, 2, 3, 3, 5, 3, 1, 4])

# print the original array, duplicates included
print(array)

# print the sorted unique values
print(np.unique(array))

 

Output

[1 2 4 2 3 3 5 3 1 4]
[1 2 3 4 5]

 

In the above example 👆 we created a NumPy array with the np.array() function. The array contains duplicate values. To remove them, we passed the array to np.unique() and printed the result. Note that np.unique() returns the unique values in sorted order, not in the order they first appeared, and it leaves the original array unchanged.
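Because np.unique() sorts its result, the order of first appearance is lost. If that order matters, one possible approach (a minimal sketch, not the only way) is to ask np.unique() for the index of each value's first occurrence with return_index=True and then sort those indices. The variable names below are illustrative.

```python
import numpy as np

array = np.array([1, 2, 4, 2, 3, 3, 5, 3, 1, 4])

# return_index=True also returns the index of the first occurrence
# of each unique value in the original array
unique_sorted, first_idx = np.unique(array, return_index=True)

# sorting those indices restores the order of first appearance
unique_in_order = array[np.sort(first_idx)]

print(unique_sorted)    # [1 2 3 4 5]
print(unique_in_order)  # [1 2 4 3 5]
```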

 

b. Remove Duplicate Rows from 2-D Numpy Array

Let’s remove duplicate rows from a 2-D array using the np.unique() function with axis=0:

 

Code

# import numpy
import numpy as np

# create a 2-D NumPy array with a duplicate row
array = np.array([[1, 2, 1], [2, 3, 5], [1, 2, 1]])

# print the numpy array with duplicate rows
print("The numpy array with a duplicate row is:")
print(array)

# remove duplicate rows from numpy array
print("Remove duplicate rows from numpy array:")
print(np.unique(array, axis=0))

 

Output

The numpy array with a duplicate row is:
[[1 2 1]
 [2 3 5]
 [1 2 1]]
Remove duplicate rows from numpy array:
[[1 2 1]
 [2 3 5]]

 

In the above example 👆 the first and third rows are identical. Passing axis=0 tells np.unique() to compare whole rows instead of individual values, so the second printed output contains each distinct row only once.
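If you want to find the duplicated rows instead of just dropping them, np.unique() can also report how often each row occurs through its return_counts parameter. The following is a small sketch under that assumption; the variable names are illustrative.

```python
import numpy as np

array = np.array([[1, 2, 1], [2, 3, 5], [1, 2, 1]])

# return_counts=True also returns how many times each unique row occurs
unique_rows, counts = np.unique(array, axis=0, return_counts=True)

# keep only the rows that appear more than once
duplicated_rows = unique_rows[counts > 1]

print(counts)           # [2 1]
print(duplicated_rows)  # [[1 2 1]]
```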

 

c. Remove Duplicate Columns from 2-D Numpy Array

Let’s remove duplicate columns from a 2-D array using the np.unique() function with axis=1:

 

Code

# import numpy
import numpy as np

# create a 2-D NumPy array with a duplicate column
array = np.array([[1, 2, 2], [4, 1, 1], [3, 1, 1]])

# print the numpy array with duplicate columns
print("The numpy array with duplicate column is:")
print(array)

# remove duplicate columns from numpy array
print("Remove duplicate columns from numpy array:")
print(np.unique(array, axis=1))

 

Output

The numpy array with duplicate column is:
[[1 2 2]
 [4 1 1]
 [3 1 1]]
Remove duplicate columns from numpy array:
[[1 2]
 [4 1]
 [3 1]]

 

In the above example 👆 the second and third columns are identical. Passing axis=1 tells np.unique() to compare whole columns, so the second printed output contains each distinct column only once.

 

 

Method 2: Remove Duplicates from a Numpy Array Using set() Function

set() is a Python built-in that takes an iterable and returns a set containing its distinct elements. The elements must be hashable, and a set does not preserve their order.

Let’s use the set() function to remove duplicate rows from a 2-D array:

 

Code

# import numpy
import numpy as np

# create a 2-D NumPy array with duplicate rows
array = np.array([[1, 2, 3],
                  [3, 2, 1],
                  [4, 5, 6],
                  [7, 8, 9],
                  [9, 8, 9],
                  [4, 5, 6],
                  [7, 8, 9]])

# Delete duplicate rows from 2D NumPy Array
array = np.vstack(list(set(tuple(row) for row in array)))

print("Distinct array values are:")
print(array)

 

Output

Distinct array values are:
[[9 8 9]
 [3 2 1]
 [7 8 9]
 [1 2 3]
 [4 5 6]]

 

In the above example 👆 we created a 2-D NumPy array with duplicate rows. We iterated over the rows and converted each one to a tuple, because NumPy arrays are not hashable and so cannot be stored in a set directly. Passing those tuples to set() keeps only the distinct rows, and the numpy.vstack() function stacks them vertically back into a 2-D array. Keep in mind that a set does not preserve order, so the rows in your output may appear in a different order than shown here.
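If the original row order matters, a variation on the same idea is to use a dict in place of a set, since dict keys are unique and keep insertion order in Python 3.7 and later. This is a sketch of an alternative, not part of the method above.

```python
import numpy as np

array = np.array([[1, 2, 3],
                  [3, 2, 1],
                  [4, 5, 6],
                  [7, 8, 9],
                  [9, 8, 9],
                  [4, 5, 6],
                  [7, 8, 9]])

# dict.fromkeys() keeps the first occurrence of each row tuple,
# in the order the rows were encountered
unique_rows = np.array(list(dict.fromkeys(tuple(row) for row in array)))

print(unique_rows)
```

With this input, the result keeps the five distinct rows in the order they first appear: [1 2 3], [3 2 1], [4 5 6], [7 8 9], [9 8 9].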

 

 

Conclusion

In this article, we have discussed how to remove duplicates from a NumPy array using two alternative solutions.

A quick recap of the topics we’ve explained in this article:

  1. What is the NumPy library in Python?
  2. How to remove duplicates from a NumPy array?
  3. How to remove duplicates using the np.unique() function?
  4. How to remove duplicates using the set() function?

Hope this guide helps you out 😇 Do share your experience using each method and let us know in the comment section below 👇 which solution you find more feasible 🥰

 
