Algorithm
The algorithm of prototypical networks is shown here:
- Let's say we have the dataset, D, comprising {(x1, y1), (x2, y2), ..., (xn, yn)}, where xi is the feature and yi is the class label.
- Since we perform episodic training, we randomly sample n data points per class from our dataset, D, and prepare our support set, S.
- Similarly, we sample n data points per class and prepare our query set, Q.
- We learn the embeddings of the data points in our support set using our embedding function, f_φ(). The embedding function can be any feature extractor, say, a convolutional network for images or an LSTM network for text.
- Once we have the embeddings for each data point, we compute the prototype, c_k, of each class by taking the mean of the embeddings of the data points under that class (see the first sketch after this list):

  c_k = (1/|S_k|) Σ_{(xi, yi) ∈ S_k} f_φ(xi)

  Here, S_k denotes the support points labeled with class k.
- Similarly, we learn the query set embeddings.
- We calculate the Euclidean distance, d, between the query set embeddings and each class prototype.
- We predict the probability, p_φ(y = k|x), of a query point belonging to class k by applying softmax over the negative distances:

  p_φ(y = k|x) = exp(-d(f_φ(x), c_k)) / Σ_{k'} exp(-d(f_φ(x), c_{k'}))
- We compute the loss function, J(φ), as the negative log-probability of the true class, J(φ) = -log p_φ(y = k|x), and we minimize it using stochastic gradient descent (see the second sketch after this list).
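
To make the prototype, distance, and softmax steps concrete, here is a minimal sketch of the forward pass in PyTorch. The function name `prototypical_logits`, the embedding module `f_phi`, and all shapes are illustrative assumptions, not the book's own code:

```python
import torch

def prototypical_logits(f_phi, support_x, support_y, query_x, n_classes):
    """Class scores for each query point: negative Euclidean
    distances to the class prototypes."""
    support_emb = f_phi(support_x)   # (n_support_total, emb_dim)
    query_emb = f_phi(query_x)       # (n_query_total, emb_dim)

    # Prototype of class k: mean embedding of its support points.
    prototypes = torch.stack([
        support_emb[support_y == k].mean(dim=0) for k in range(n_classes)
    ])                               # (n_classes, emb_dim)

    # Euclidean distance d between each query embedding and each prototype.
    dists = torch.cdist(query_emb, prototypes)   # (n_query_total, n_classes)

    # Softmax over the negative distances gives p_phi(y = k | x); we return
    # the negative distances as logits so the caller can apply (log-)softmax.
    return -dists
```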
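And here is a sketch of one episodic training step under the same assumptions, covering the sampling of the support and query sets, the loss J(φ), and the stochastic gradient descent update. `train_episode` and its parameters (`n_way`, `n_support`, `n_query`) are hypothetical names for illustration:

```python
import torch
import torch.nn.functional as F

def train_episode(f_phi, optimizer, data_x, data_y,
                  n_way=5, n_support=5, n_query=15):
    """One episode: sample a task, compute J(phi), take one SGD step."""
    # Randomly pick n_way classes for this episode (assumes labels are
    # 0..C-1 and every class has at least n_support + n_query examples).
    classes = torch.randperm(int(data_y.max()) + 1)[:n_way]

    support_x, query_x, support_y, query_y = [], [], [], []
    for new_label, k in enumerate(classes):
        idx = torch.nonzero(data_y == k).flatten()
        idx = idx[torch.randperm(len(idx))[:n_support + n_query]]
        support_x.append(data_x[idx[:n_support]])
        query_x.append(data_x[idx[n_support:]])
        support_y += [new_label] * n_support
        query_y += [new_label] * n_query

    support_x, query_x = torch.cat(support_x), torch.cat(query_x)
    support_y, query_y = torch.tensor(support_y), torch.tensor(query_y)

    logits = prototypical_logits(f_phi, support_x, support_y, query_x, n_way)

    # J(phi) = -log p_phi(y = k | x); cross_entropy applies log-softmax
    # to the logits and averages the negative log-probabilities.
    loss = F.cross_entropy(logits, query_y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()   # one stochastic gradient descent update of phi
    return loss.item()
```

Given an embedding network and an optimizer such as `torch.optim.SGD(f_phi.parameters(), lr=0.01)`, calling `train_episode` repeatedly would implement the episodic loop described in this list.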