Hi, thanks for sharing your wonderful work.
I have a question about this function (ConditionalDETR/models/transformer.py, line 33 in ead865c):

```python
def gen_sineembed_for_position(pos_tensor):
```

which embeds positional information into `query_pos`.
However, I don't understand why `2 * (dim_t // 2)` has to be divided by 128 instead of the actual dimension of `pos_tensor` (e.g., 256 by default) (ConditionalDETR/models/transformer.py, line 38 in ead865c):

```python
dim_t = 10000 ** (2 * (dim_t // 2) / 128)
```
Does it work correctly even though `dim_t` is divided by 128? I would appreciate being corrected!
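For reference, here is a minimal sketch of how I currently read that function (my own reconstruction, not the repo's exact code). I am assuming that 128 is half of the 256-d model dimension, i.e. each of the two coordinates in `pos_tensor` gets its own 128-d sine/cosine embedding and the two halves are concatenated, so the 128 would refer to the per-coordinate embedding size rather than the full dimension of `pos_tensor`:

```python
import math
import torch

def sine_embed_sketch(pos_tensor, num_pos_feats=128, temperature=10000):
    """Sketch of a 2D sine position embedding; names and shapes are my assumption."""
    # pos_tensor: (num_queries, batch, 2) holding normalized (x, y) in [0, 1].
    scale = 2 * math.pi
    dim_t = torch.arange(num_pos_feats, dtype=torch.float32, device=pos_tensor.device)
    # Adjacent (sin, cos) channels share a frequency, hence 2 * (dim_t // 2).
    # Dividing by num_pos_feats (= 128) spreads the frequencies over each
    # per-coordinate half, not over the full 256-d output.
    dim_t = temperature ** (2 * (dim_t // 2) / num_pos_feats)

    x_embed = pos_tensor[..., 0] * scale
    y_embed = pos_tensor[..., 1] * scale
    pos_x = x_embed[..., None] / dim_t                      # (..., 128)
    pos_y = y_embed[..., None] / dim_t                      # (..., 128)
    pos_x = torch.stack((pos_x[..., 0::2].sin(), pos_x[..., 1::2].cos()), dim=-1).flatten(-2)
    pos_y = torch.stack((pos_y[..., 0::2].sin(), pos_y[..., 1::2].cos()), dim=-1).flatten(-2)
    return torch.cat((pos_y, pos_x), dim=-1)                # (..., 256) = query_pos size
```

If that reading is right, the 128 just keeps the frequency schedule consistent within each per-coordinate half, but I would like to confirm it.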
My other question concerns the calculation of equation (1) in the paper (ConditionalDETR/models/conditional_detr.py, line 89 in ead865c):

```python
tmp[..., :2] += reference_before_sigmoid
```

Can I understand this as the model learning "offsets" from the corresponding reference points?
What is the precise role of the reference points?
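To make my reading of equation (1) concrete, here is a rough sketch of how I currently understand the box head; the helper names (`inverse_sigmoid`, `bbox_embed`, `predict_boxes`) and the surrounding lines are my paraphrase, not a verbatim copy of the repo:

```python
import torch

def inverse_sigmoid(x, eps=1e-5):
    # Map normalized coordinates back to logit space so offsets can be added there.
    x = x.clamp(min=eps, max=1 - eps)
    return torch.log(x / (1 - x))

def predict_boxes(decoder_output, reference_points, bbox_embed):
    # My reading of equation (1): the head predicts raw deltas, the reference
    # point supplies the (x, y) anchor in logit space, and the sum is squashed
    # back into [0, 1] to give the normalized box.
    reference_before_sigmoid = inverse_sigmoid(reference_points)  # (..., 2) logits
    tmp = bbox_embed(decoder_output)                              # (..., 4) raw deltas
    tmp[..., :2] += reference_before_sigmoid                      # center = offset + reference
    return tmp.sigmoid()                                          # normalized (cx, cy, w, h)
```

So my understanding is that the reference point acts as a per-query anchor for the box center, and the decoder only needs to regress a residual offset relative to it. Is that correct?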
Thank you!