khot greater than 1 #14

Open
yujianll opened this issue Jun 7, 2022 · 3 comments
yujianll commented Jun 7, 2022

Hi, thanks for sharing the code!

I'm using the SubsetOperator to sample k elements from n elements (k &lt; n). However, I found that khot contains values greater than 1 for some scores. I wonder whether this is the expected output: the paper says each value in the k-hot vector should lie within [0, 1].

@yujianll
Copy link
Author

yujianll commented Jun 7, 2022

As a follow-up question, is there any method that can give a relaxed khot vector within [0, 1]? I want to represent the unselected elements as 1 - khot, but I can't do that when entries of khot exceed 1. The best workaround I've found so far is clamping; see the sketch below.
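Here is the clamping workaround I mean (just a sketch with made-up numbers, not the repo's code; note that `torch.clamp` zeroes the gradient wherever it saturates):

```python
import torch

# Stand-in for the operator's output; the middle entry plays the role
# of a value that overshot 1 (made-up numbers for illustration).
khot = torch.tensor([[0.56, 1.11, 0.33]])

khot_clamped = torch.clamp(khot, 0.0, 1.0)  # force entries into [0, 1]
unselected = 1.0 - khot_clamped             # now a valid relaxed complement
print(unselected)                           # tensor([[0.4400, 0.0000, 0.6700]])
```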

sangmichaelxie (Collaborator) commented:
Sorry this is late. How much greater than 1 is it? It might just be a numerical issue that renormalizing would resolve.
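Something like the following is what I mean by renormalizing (a sketch with made-up numbers; it only rescales rows whose maximum exceeds 1, so outputs that are already valid are left untouched):

```python
import torch

# Stand-in output with one entry that overshot 1 (made-up numbers).
khot = torch.tensor([[0.45, 1.107, 0.62]])

# Rescale each row so its max is at most 1; a no-op when max <= 1 already.
row_max = khot.max(dim=-1, keepdim=True).values
khot = khot / row_max.clamp(min=1.0)
print(khot)  # tensor([[0.4065, 1.0000, 0.5601]])
```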

SiyuanHuangSJTU commented Oct 27, 2024

I've encountered the same problem.

Here is a minimal example that reproduces it:

```python
import torch
import numpy as np
from matplotlib import pyplot as plt

# Smallest positive float32, used to avoid log(0) in the soft mask.
EPSILON = np.finfo(np.float32).tiny

class SubsetOperator(torch.nn.Module):
    def __init__(self, k, tau=1.0):
        super(SubsetOperator, self).__init__()
        self.k = k
        self.tau = tau

    def forward(self, scores):
        # Gumbel noise is fixed here so the example is reproducible:
        # g = torch.distributions.gumbel.Gumbel(
        #     torch.zeros_like(scores), torch.ones_like(scores)).sample()
        g = torch.tensor([[-0.1926, -1.8388, -0.7433, -0.0096, -0.6953,  1.1131, -0.4599,  3.3375,
                            0.5950,  0.0287, -0.5226,  0.5701,  0.0701,  1.9500,  0.2947, -0.1015,
                           -0.7057,  1.9853,  2.0632, -0.0271]])
        scores = scores + g

        khot = torch.zeros_like(scores)
        onehot_approx = torch.zeros_like(scores)
        for i in range(self.k):
            # Soft mask: down-weight (but never fully remove) already-selected elements.
            khot_mask = torch.max(1.0 - onehot_approx, torch.tensor([EPSILON]))
            scores = scores + torch.log(khot_mask)
            onehot_approx = torch.nn.functional.softmax(scores / self.tau, dim=-1)
            khot = khot + onehot_approx

        return khot

k = 10    # number of top elements to select
tau = 3.  # temperature parameter

subset_op = SubsetOperator(k=k, tau=tau)

# Fixed scores for reproducibility (originally torch.randn(size=(1, 20))):
score = torch.tensor([[ 1.4577,  0.1808, -0.4349,  0.5752,  0.4115, -0.1067, -2.4714,  0.2549,
                        0.4630,  0.7541, -0.0901,  1.8299,  0.5339, -0.5185,  0.6371, -0.4329,
                        1.0309, -1.8988,  0.9963,  1.7575]])

res = subset_op(score).detach().numpy().flatten()
print(res)
plt.plot(res)
plt.show()
```

which outputs (note the eighth entry, 1.107019, exceeds 1):

```
[0.5602655  0.2231781  0.26037002 0.45167285 0.34610078 0.5175879
 0.14770216 1.107019   0.5258558  0.483119   0.3118482  0.78725654
 0.4570909  0.58937955 0.5058449  0.31969813 0.4190858  0.38890892
 0.9528999  0.64511645]
```

In fact, I think the claim that "the numerical range is between 0 and 1" is problematic. I don't believe the algorithm in the paper (https://arxiv.org/abs/1901.10517) can guarantee values within [0, 1]: the soft k-hot mask only down-weights the selected elements rather than fully masking them, so the same element can accumulate probability mass across iterations. I wonder if my understanding is incorrect?
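To make the point concrete, here is a stripped-down two-element version of the loop above (tau = 3, k = 2, epsilon guard omitted): the soft mask only down-weights the first winner, so it picks up substantial mass again on the second pass and its accumulated value exceeds 1.

```python
import torch

tau = 3.0
scores = torch.tensor([2.0, 0.0])

p1 = torch.softmax(scores / tau, dim=-1)  # first pass: p1 ≈ [0.66, 0.34]
scores = scores + torch.log(1.0 - p1)     # soft mask; element 0 is NOT fully removed
p2 = torch.softmax(scores / tau, dim=-1)  # second pass: p2 ≈ [0.61, 0.39]

khot = p1 + p2
print(khot)  # ≈ tensor([1.2700, 0.7300]); element 0 exceeds 1
```

Note that khot still sums to k exactly (each softmax sums to 1), so the overshoot on one element is balanced by undershoot on the others.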
