Sometimes, while working with records, we can have a problem in which we need to remove duplicate records. This kind of problem is common in web development domain. Let’s discuss certain ways in which this task can be performed.
Method #1 : Using list comprehension + set()
In this method, we test for each list as it appears and add it to set so that it’s repeated occurrence can be avoided and then this is added to newly maintained unique tuple, removing duplicates.
Python3
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
print ( "The original tuple is : " + str (test_tup))
temp = set ()
res = [ele for ele in test_tup if not ( tuple (ele) in temp or temp.add( tuple (ele)))]
print ( "The unique lists tuple is : " + str (res))
|
Output :
The original tuple is : ([4, 7, 8], [1, 2, 3], [4, 7, 8], [9, 10, 11], [1, 2, 3])
The unique lists tuple is : [[4, 7, 8], [1, 2, 3], [9, 10, 11]]
Time complexity: O(n)
Auxiliary Space: O(n)
Method #2 : Using OrderedDict() + tuple()
The combination of above functions can also be used to perform this particular task. In this, we convert the tuple to OrderedDict(), which automatically removes the duplicate elements and then construct a new tuple list using tuple().
Python3
from collections import OrderedDict
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
print ( "The original tuple is : " + str (test_tup))
res = list (OrderedDict(( tuple (x), x) for x in test_tup).values())
print ( "The unique lists tuple is : " + str (res))
|
Output :
The original tuple is : ([4, 7, 8], [1, 2, 3], [4, 7, 8], [9, 10, 11], [1, 2, 3])
The unique lists tuple is : [[4, 7, 8], [1, 2, 3], [9, 10, 11]]
Method #3: Using recursive method.
Python3
def remove_duplicates(tup, result, seen):
if not tup:
return result
if tuple (tup[ 0 ]) not in seen:
result.append(tup[ 0 ])
seen.add( tuple (tup[ 0 ]))
return remove_duplicates(tup[ 1 :], result, seen)
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
result = []
seen = set ()
res = remove_duplicates(test_tup, result, seen)
print ( "The original tuple is : " + str (test_tup))
print ( "The unique lists tuple is : " + str (res))
|
Output
The original tuple is : ([4, 7, 8], [1, 2, 3], [4, 7, 8], [9, 10, 11], [1, 2, 3])
The unique lists tuple is : [[4, 7, 8], [1, 2, 3], [9, 10, 11]]
Time Complexity: O(n)
Auxiliary Space: O(n)
Method #4:Using a loop and a set:
Algorithm:
1.Create an empty list called “result” and an empty set called “seen”.
2.For each sublist in the input tuple:
a. Convert the sublist to a tuple called “t”.
b. If “t” is not in the “seen” set, append the original sublist to the “result” list and add “t” to the “seen” set.
3.Convert the “result” list to a tuple and return it.
Python3
def remove_duplicates(tup):
result = []
seen = set ()
for sublist in tup:
t = tuple (sublist)
if t not in seen:
result.append(sublist)
seen.add(t)
return tuple (result)
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
unique_tup = remove_duplicates(test_tup)
print ( "Original tuple:" , test_tup)
print ( "Tuple with duplicate lists removed:" , unique_tup)
|
Output
Original tuple: ([4, 7, 8], [1, 2, 3], [4, 7, 8], [9, 10, 11], [1, 2, 3])
Tuple with duplicate lists removed: ([4, 7, 8], [1, 2, 3], [9, 10, 11])
Time complexity: O(n*m) where n is the number of sublists in the input tuple and m is the length of the largest sublist. This is because we need to iterate over each sublist and create a tuple from it, and checking membership in a set takes constant time on average.
Auxiliary Space: O(n*m) because we need to store the input tuple, the output list, and the set of seen tuples. In the worst case where all sublists are unique, the size of the output list will be equal to the size of the input tuple, so the space complexity will be proportional to the size of the input tuple.
Method #5: Using loop and if-else statement
Algorithm:
- Initialize an empty list res to store unique lists.
- Loop through each list i in the given tuple test_tup.
- If the list i is not already in the res list, then append i to the res list.
- Return the res list as the result.
Python3
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
res = []
for i in test_tup:
if i not in res:
res.append(i)
print ( "The unique lists tuple is : " + str (res))
|
Output
The unique lists tuple is : [[4, 7, 8], [1, 2, 3], [9, 10, 11]]
Time Complexity:
The loop runs n times, where n is the length of the input tuple.
The list append operation and the list membership test both have O(1) time complexity on average.
Therefore, the overall time complexity of the algorithm is O(n) on average.
Auxiliary Space:
The space used by the res list is O(k), where k is the number of unique lists in the input tuple.
The space used by other variables is constant.
Therefore, the overall space complexity of the algorithm is O(k).
Method 6: Using generator function
Define a generator function that takes a tuple as input.
Initialize an empty set to store the seen tuples.
Loop through the original tuple.
Convert the current list to a tuple and check if it is already in the set.
If it is not in the set, yield the current list and add the tuple to the set.
Return the generator function.
Python3
def remove_duplicates(tup):
seen = set ()
for lst in tup:
tup_lst = tuple (lst)
if tup_lst not in seen:
yield lst
seen.add(tup_lst)
test_tup = ([ 4 , 7 , 8 ], [ 1 , 2 , 3 ], [ 4 , 7 , 8 ], [ 9 , 10 , 11 ], [ 1 , 2 , 3 ])
res = tuple (remove_duplicates(test_tup))
print ( "The original tuple is : " + str (test_tup))
print ( "The unique lists tuple is : " + str (res))
|
Output
The original tuple is : ([4, 7, 8], [1, 2, 3], [4, 7, 8], [9, 10, 11], [1, 2, 3])
The unique lists tuple is : ([4, 7, 8], [1, 2, 3], [9, 10, 11])
This method has a time complexity of O(n), where n is the length of the input tuple, since we only loop through the tuple once.
This method has an auxiliary space complexity of O(n), where n is the length of the input tuple, since we need to store the unique tuples in a set.
Please Login to comment...