Find Indices In Numpy Arrays Consisting Of Lists Where Element Is In List
Solution 1:
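import numpy as np
# Ragged input needs dtype=object on recent NumPy versions
np_arr = np.array([['hello', 'salam', 'bonjour'], ['a', 'b', 'c'], ['hello']], dtype=object)
# Apply the membership test to each list
vec_func = np.vectorize(lambda x: 'hello' in x)
ind = vec_func(np_arr)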
Output:
# ind: array([ True, False,  True])
# np_arr[ind]: array([list(['hello', 'salam', 'bonjour']), list(['hello'])], dtype=object)
However, if you wish to get the output as a list of integers for indices, you might use:
np.where(vec_func(np_arr))
#(array([0, 2], dtype=int64),)
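Since np.vectorize is essentially a Python-level loop, the same membership test can also be written as a plain list comprehension. The following is a minimal sketch under that assumption, not part of the original answer; the helper name rows_containing is hypothetical.

```python
import numpy as np

def rows_containing(arr, value):
    """Return the integer positions of the lists in `arr` that contain `value`."""
    # Build a boolean mask with an ordinary Python loop over the lists.
    mask = np.array([value in lst for lst in arr], dtype=bool)
    # np.flatnonzero returns the indices of the True entries as a 1-D array.
    return np.flatnonzero(mask)

# Same ragged data as above; dtype=object keeps each list as one element.
np_arr = np.array([['hello', 'salam', 'bonjour'], ['a', 'b', 'c'], ['hello']],
                  dtype=object)
indices = rows_containing(np_arr, 'hello')
print(indices)  # [0 2]
```

This returns a flat index array directly, rather than the one-element tuple that np.where produces.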
Solution 2:
Following @aminrd's answer, you can also use np.isin instead of Python's in, which gives you the benefit of returning a boolean numpy array representing where the string hello appears.
import numpy as np
myarray = np.array(
[["hello", "salam", "bonjour"], ["a", "b", "c"], ["hello"]], dtype=object
)
ids = np.frompyfunc(lambda x: np.isin(x, "hello"), 1, 1)(myarray)
idxs = [(i, np.where(curr)[0][0]) for i, curr in enumerate(ids) if curr.any()]
Result:
>>> print(ids)
[array([ True, False, False]) array([False, False, False]) array([ True])]
>>> print(idxs)
[(0, 0), (2, 0)]
EDIT: If you want to avoid the explicit loop, you could pad the array with 0 (same as False) and then use numpy's broadcasting normally (this is necessary since ids becomes an object array with shape (3,)).
>>> import itertools
>>> padded_ids = np.column_stack(list(itertools.zip_longest(*ids, fillvalue=0)))
>>> print(np.stack(np.where(padded_ids), axis=1))
[[0 0]
 [2 0]]
Keep in mind padding methods usually have some kind of a loop somewhere, so I don't think you can totally get away from it.
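To make that hidden loop visible, here is a sketch that pads the ragged lists into a rectangular array by hand. It assumes the elements are strings and that the empty string never occurs as a real value; the names pad_ragged and fill are hypothetical and not taken from the answer above.

```python
import numpy as np

def pad_ragged(lists, fill=''):
    """Copy ragged lists into a rectangular object array, padding with `fill`."""
    width = max(len(lst) for lst in lists)
    padded = np.full((len(lists), width), fill, dtype=object)
    # This is the loop that padding cannot avoid: one pass per list.
    for row, lst in enumerate(lists):
        padded[row, :len(lst)] = lst
    return padded

data = [['hello', 'salam', 'bonjour'], ['a', 'b', 'c'], ['hello']]
padded = pad_ragged(data)
# Elementwise comparison now works on the rectangular array.
positions = np.argwhere(padded == 'hello')
print(positions.tolist())  # [[0, 0], [2, 0]]
```

Each row of positions is a (list index, element index) pair, matching the idxs result above.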
Solution 3:
Extending the answer about np.vectorize to cover the follow-up question about also returning the index within each list (asked by the OP in a comment under the accepted answer), you could define a function that returns the index, or -1 if the item is not present, and then vectorize it. You can then post-process the result of the vectorized function to obtain both kinds of index.
import numpy as np

# Ragged input needs dtype=object on recent NumPy versions
myarr = np.array([['foo', 'bar', 'baz'],
                  ['quux'],
                  ['baz', 'foo']], dtype=object)

def get_index(val, lst):
    "return the index in a list, or -1 if the item is not present"
    try:
        return lst.index(val)
    except ValueError:
        return -1

func = lambda x: get_index('foo', x)
list_indices = np.vectorize(func)(myarr)  # [0 -1 1]
valid = (list_indices >= 0)  # [True False True]
array_indices = np.where(valid)[0]  # [0 2]
valid_list_indices = list_indices[valid]  # [0 1]
print(np.stack([array_indices, valid_list_indices]).T)
# [[0 0]   <== list 0, element 0
#  [2 1]]  <== list 2, element 1
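Note that list.index reports only the first match in each list. If every occurrence is wanted, one option is a nested comprehension; this is a sketch under that assumption rather than part of the original answer, and the name all_positions is hypothetical.

```python
import numpy as np

def all_positions(arr, value):
    """Return an (n, 2) array of (list index, element index) for every match."""
    pairs = [(i, j)
             for i, lst in enumerate(arr)
             for j, item in enumerate(lst)
             if item == value]
    # reshape keeps the (n, 2) shape even when there are no matches.
    return np.array(pairs, dtype=int).reshape(-1, 2)

# The last list contains 'foo' twice to show the difference from list.index.
myarr = np.array([['foo', 'bar', 'baz'], ['quux'], ['baz', 'foo', 'foo']],
                 dtype=object)
result = all_positions(myarr, 'foo')
print(result.tolist())  # [[0, 0], [2, 1], [2, 2]]
```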