Imagine you are working with a massive dataset represented by multi-dimensional arrays called tensors. In simple terms, tensors are the building blocks of mathematical operations on the data. Sometimes, however, most of the values in a tensor are zero. Such a tensor is called a sparse tensor. Sparse tensors are commonly encountered in computer vision and natural language processing, and they can be overwhelming at first. This article covers what sparse tensors are, how to create and manipulate them in TensorFlow, and how to distinguish explicit zeros from missing values.

What are Sparse Tensors?

A sparse tensor is a tensor in which most of the values are zero. Unlike a dense tensor, which stores every element, a sparse tensor optimizes storage and computation by keeping track of the non-zero values only. This makes it ideal for data with high sparsity.

For example, picture a gigantic chessboard with most of its squares empty (zero values). A sparse tensor is like a clever agent that records only the pieces (non-zero values) and their precise positions. It ignores the empty squares, which reduces the storage requirements and the computational load associated with unnecessary zeros.

In TensorFlow, a sparse tensor is represented by a tf.sparse.SparseTensor object. Sparse tensors are currently encoded using the coordinate list (COO) format, which stores the non-zero values, their corresponding indices, and the overall shape of the tensor. A sparse tensor preserves all the information in the data while leaving out the redundant zeros, which gives an efficient way of storing and processing large datasets.

How to create Sparse Tensors in TensorFlow?

There are two ways to create a sparse tensor.
Both are discussed in detail below, each with an example.

By Directly Specifying Indices and Values

As discussed earlier, a sparse tensor is represented by the tf.sparse.SparseTensor object. When creating one, you need to specify three components: indices (the positions of the non-zero values), values (the non-zero values themselves, in the same order as the indices), and dense_shape (the shape of the equivalent dense tensor).
Python3
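A minimal sketch of this construction, assuming TensorFlow 2.x with eager execution. The values 41 and 30 and their indices follow the description in the text; the 4 x 4 dense_shape is an illustrative assumption.

```python
import tensorflow as tf

# Positions of the non-zero values, one [row, column] pair per value.
indices = [[1, 2], [3, 3]]
# The non-zero values, in the same order as the indices.
values = [41, 30]
# Shape of the equivalent dense tensor (assumed here to be 4 x 4).
dense_shape = [4, 4]

sparse_tensor = tf.sparse.SparseTensor(
    indices=indices, values=values, dense_shape=dense_shape
)
print(sparse_tensor)
```

Calling tf.sparse.to_dense(sparse_tensor) is a convenient way to check that the values landed where intended.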
Output: SparseTensor(indices=tf.Tensor(

The above code creates a sparse tensor with two non-zero values, 41 and 30, at indices (1, 2) and (3, 3), respectively. The tf.sparse.SparseTensor constructor builds the sparse tensor from the values, indices, and dense_shape. Printing the sparse tensor shows its indices, values, and dense_shape.

Creating from Dense Tensor

Suppose you are working with a large dataset stored as dense tensors and you notice that most of the values are zero. In that case, you can convert a dense tensor into a sparse one with the tf.sparse.from_dense method from the tf.sparse module.

Python
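The conversion can be sketched as follows, assuming TensorFlow 2.x. The contents of dense_tensor are made-up illustrative data; only the dense_tensor name and the tf.sparse.from_dense call come from the text.

```python
import tensorflow as tf

# A mostly-zero dense tensor (illustrative values).
dense_tensor = tf.constant([[0, 0, 7, 0],
                            [0, 0, 0, 0],
                            [3, 0, 0, 0]])

# from_dense keeps only the non-zero entries and records their positions.
sparse_tensor = tf.sparse.from_dense(dense_tensor)

print("Dense Tensor:")
print(dense_tensor)
print("Sparse Tensor:")
print(sparse_tensor)
```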
Output: Dense Tensor:

In the above code, the dense tensor is stored in the dense_tensor variable. Passing dense_tensor to the tf.sparse.from_dense method returns the sparse tensor. The output shows both the dense and the sparse tensor.

How to manipulate sparse tensors?

You can manipulate sparse tensors in several ways, depending on the operation you need. The tf.sparse module provides the tools for arithmetic manipulations, matrix multiplication, concatenation, and more. Let us discuss a few of them.

Arithmetic Manipulations

Traditional arithmetic manipulations are not directly applicable to sparse tensors. To add sparse tensors of the same shape, use the tf.sparse.add function. The example adds two sparse tensors and stores the result in the result_sparse_tensor variable.

Python3
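A minimal sketch of the addition, assuming TensorFlow 2.x. The two input tensors are hypothetical illustrative data; result_sparse_tensor is the variable name used in the text.

```python
import tensorflow as tf

# Two sparse tensors with the same dense_shape (illustrative values).
sparse_tensor_a = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2]], values=[1.0, 2.0], dense_shape=[2, 3]
)
sparse_tensor_b = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1]], values=[3.0, 4.0], dense_shape=[2, 3]
)

# Entries at the same index are summed; the others are carried over.
result_sparse_tensor = tf.sparse.add(sparse_tensor_a, sparse_tensor_b)
print(result_sparse_tensor)
```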
Output: SparseTensor(indices=tf.Tensor(

Sparse-Dense Matrix Multiplication

To multiply a sparse tensor by a dense tensor, use the tf.sparse.sparse_dense_matmul method.

Python3
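The multiplication can be sketched as follows, assuming TensorFlow 2.x. Both matrices are hypothetical illustrative data, and the variable names are chosen for this example only.

```python
import tensorflow as tf

# Sparse 2 x 2 matrix, equivalent to [[0, 2], [3, 0]].
sparse_tensor = tf.sparse.SparseTensor(
    indices=[[0, 1], [1, 0]], values=[2.0, 3.0], dense_shape=[2, 2]
)
# Dense 2 x 2 matrix with the same dtype as the sparse values.
dense_matrix = tf.constant([[1.0, 2.0],
                            [3.0, 4.0]])

# The product of a sparse and a dense matrix is a dense tensor.
result = tf.sparse.sparse_dense_matmul(sparse_tensor, dense_matrix)

print("Sparse Tensor:")
print(sparse_tensor)
print("Result:")
print(result)
```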
Output: Sparse Tensor:

Concatenation

To concatenate two sparse tensors, use the tf.sparse.concat method. The example concatenates two sparse tensors.

Python3
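A minimal sketch of concatenation along the first axis, assuming TensorFlow 2.x. The input tensors and the choice of axis=0 are illustrative assumptions.

```python
import tensorflow as tf

sparse_tensor_1 = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1]], values=[1, 2], dense_shape=[2, 2]
)
sparse_tensor_2 = tf.sparse.SparseTensor(
    indices=[[0, 1], [1, 0]], values=[3, 4], dense_shape=[2, 2]
)

# Stack the second tensor below the first one (axis=0 concatenates rows).
concatenated = tf.sparse.concat(
    axis=0, sp_inputs=[sparse_tensor_1, sparse_tensor_2]
)

print("Sparse Tensor 1:")
print(sparse_tensor_1)
print("Sparse Tensor 2:")
print(sparse_tensor_2)
print("Concatenated Sparse Tensor:")
print(concatenated)
```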
Output: Sparse Tensor 1:

Slicing

Slicing a sparse tensor is straightforward with the tf.sparse.slice function.

Python3
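Slicing can be sketched as follows, assuming TensorFlow 2.x. The tensor contents and the start and size arguments are illustrative assumptions: start gives the first index to keep along each dimension, and size gives how many elements to take.

```python
import tensorflow as tf

# Sparse 3 x 4 tensor with three non-zero values (illustrative data).
sparse_tensor = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2], [2, 3]], values=[1, 2, 3], dense_shape=[3, 4]
)

# Keep rows 0-1 and columns 1-3; indices in the result are relative to start.
sliced = tf.sparse.slice(sparse_tensor, start=[0, 1], size=[2, 3])

print("Original Sparse Tensor:")
print(sparse_tensor)
print("Sliced Sparse Tensor:")
print(sliced)
```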
Output: Original Sparse Tensor:

Elementwise Operations

To perform elementwise operations on the non-zero values of a sparse tensor, use tf.sparse.map_values (available in TensorFlow 2.4 and higher).

Python3
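A minimal sketch of an elementwise operation, assuming TensorFlow 2.4 or later. The tensor contents and the choice of adding 5 are illustrative assumptions.

```python
import tensorflow as tf

sparse_tensor = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1], [2, 2]], values=[1, 2, 3], dense_shape=[3, 3]
)

# Apply tf.add to the stored values only; positions that are not stored
# stay absent, so they remain zero in the dense view.
result = tf.sparse.map_values(tf.add, sparse_tensor, 5)

print("Original Sparse Tensor:")
print(sparse_tensor)
print("Result:")
print(result)
```

Because only the stored values are transformed, this differs from adding 5 to every element of the equivalent dense tensor.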
Output: Original Sparse Tensor:

Handling Sparse Tensors: Distinguishing Zero vs Missing Values

As shown so far, sparse tensors in TensorFlow offer a powerful mechanism for efficiently managing datasets with a large number of zero values. In some cases, however, distinguishing between explicit zero values and missing values is crucial for accurate data representation. This section explains how to handle sparse tensors while maintaining that distinction.

Representing Missing Values

Most operations on sparse tensors treat explicit zeros and missing values identically. Sometimes, though, you need to differentiate between them. For example, suppose you have a tensor of scores that can take any floating-point value between -Inf and +Inf, and some of the scores are missing. Using a sparse tensor, where a missing score is simply not stored rather than written as zero, encodes this information accurately.

Operations Consideration

As mentioned earlier, most operations on tf.sparse.SparseTensor objects treat explicit zeros and missing values the same. However, tf.sparse.reduce_max does not treat missing values as zero.

Python3
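The behaviour of tf.sparse.reduce_max can be sketched as follows, assuming TensorFlow 2.x with eager execution. Both tensors are hypothetical illustrative data; the second one uses only negative stored values so that the difference from a dense reduction becomes visible.

```python
import tensorflow as tf

# Three stored values; every other position is missing (not stored).
sparse_tensor = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1], [2, 2]],
    values=[1.0, 2.0, 3.0],
    dense_shape=[3, 3],
)
result = tf.sparse.reduce_max(sparse_tensor)
print("Result of reduce_max:", result.numpy())

# from_dense drops the 0.0, so that position becomes a missing value.
scores = tf.sparse.from_dense([-5.0, 0.0, -3.0])
sparse_max = tf.sparse.reduce_max(scores)              # ignores the missing value
dense_max = tf.reduce_max(tf.sparse.to_dense(scores))  # sees it as 0.0
print("Sparse max:", sparse_max.numpy())
print("Dense max:", dense_max.numpy())
```

In the second case the sparse reduction returns -3.0 while the dense reduction returns 0.0, because the missing position is filled with zero only in the dense view.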
Output: Result of reduce_max: 3.0

Explicit Zeros and Missing Data

When creating sparse tensors, keep in mind that certain operations, like tf.sparse.reduce_max, do differentiate between explicit zeros and missing values: an explicitly stored zero takes part in the reduction, while a missing value does not. Encoding this information appropriately during data preprocessing is therefore crucial for accurate representation.

Once you understand these intricacies of sparse tensor handling, you can tailor your data representation to specific use cases. This nuanced approach allows you to encode unknown or absent data, contributing to more accurate and meaningful analyses.
Referred: https://www.geeksforgeeks.org