The value stands for "thousandths of an em",
and the "em" stands for the current font size.
So in essence, it's a relative size and certainly not a fixed pixel size.
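To make the arithmetic concrete, here is a minimal sketch of that conversion, assuming the tracking value really is thousandths of an em as described above. The function names `trackingToEm` and `trackingToPx` are made up for this illustration; they are not part of Adobe XD or any library.

```typescript
// Hypothetical helper: turn a tracking value (thousandths of an em)
// into a relative CSS letter-spacing value.
function trackingToEm(tracking: number): string {
  const em = tracking / 1000;
  // Round to 3 decimals to hide floating-point noise, drop trailing zeros.
  return `${parseFloat(em.toFixed(3))}em`;
}

// For comparison only: the pixel equivalent at ONE specific font size.
// This number is only valid for that font size, which is the whole problem.
function trackingToPx(tracking: number, fontSizePx: number): number {
  return (tracking / 1000) * fontSizePx;
}

// The relative value stays the same whatever the font size is...
console.log(trackingToEm(50));
// ...while the pixel value changes as soon as the font size changes.
console.log(trackingToPx(50, 16), trackingToPx(50, 32));
```

A value like the one returned by `trackingToEm` can go straight into a CSS `letter-spacing` declaration and will keep its proportion when the text is resized, whereas a pixel value would have to be recalculated for every font size.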
Adobe XD loosely based its spacing/tracking value on a typographer's method for defining character spacing.
Historically, the character M used to be the widest plain character in the letterpress collection of a font. That M's physical block of lead or wood [see the image in my other reply] was often square, with equal width and height, and not entirely by coincidence. That is why typesetters also used to call that size the "em" or "the square". And that very idea of using a flexible size called em still lives on in CSS!
BTW, the various lengths of dashes also refer to that principle: the em-dash and the en-dash (half of the em). See this Wikipedia page for their usage, but keep in mind that it differs a lot between languages!
So if you hand off your materials to a developer with all flexible values converted into pixels, the type can no longer scale automatically in responsive designs, and it can no longer accommodate accessibility needs. IMHO, any conversion is better left to front-end developers and their frameworks, because of the difficulties and debates about screen densities.
That's why you need to hand it over to a developer (or an intermediary) who also understands these modern and appropriate principles of using flexible sizes. Anyone demanding a 100% pixel-perfect execution from a mockup is asking for trouble.
There's also some trouble with Adobe XD, though. The starting position of text boxes, the height of the baseline within a text box, and the way line height and paragraph spacing are actually added above and/or below the text line: these are all very ambiguous in XD. The W3C's CSS definitions are very clear on this! On the other hand, I wonder if browser engines and operating systems are consistent with each other by now...