
I have a solution for any number of missing values (not just 27), using attributes:

- each component has one subcomponent for each missing type;
- each subcomponent has exactly two further subcomponents:
  - one that stores the original values of the missing;
  - one that stores the IDs of the unique id variable where the missings are found.

Element paths in the printed attribute include:

    $MISSING1$`Missing (on paper form)`$values
    $MISSING2$`I don't have videogames`$values
    $MISSING2$`Missing (on paper form)`$values

In the case of the Missing1 variable, we have a missing value of 5, found at the position where IndividualId is equal to 3. Personally, I consider missing values with no labels equivalent to undeclared system missings. They are completely useless for any practical purpose, so I think only declared missing values with labels should be treated.

One other thing: it may very well be that some missing values are left undeclared in the SPSS file but are declared in other metadata files (e.g. ...). This is off topic, but it touches on important documentation points, and I intend to combine attributes with DDI as an import/export tool.

First, it seems that your proposal could substantially increase the size of a data frame, since you copy all cases into an attribute. Secondly, you store the information at the data frame level and not at the variable level. Thirdly, you don't always have a unique identifier for each row. If the identifier is the row number, the information will be lost once you sort your data frame.

What I really like about the labelled class is that it doesn't change your microdata. You keep the exact codes used in the original SPSS/Stata file and you're able, later, to transform your data into a factor with a dedicated function. My feeling is that the same approach should be used for missing values. Microdata should never be changed when importing a data set, and attributes should be used to store metadata. So, when importing an SPSS file, declared missing values should not be changed to NA; we should keep the original code. Then, we could provide a function like missing2NA that returns a vector with all declared missing values transformed to NA (we could also imagine applying this function to a whole dataset).

Finally, we also need to consider that value labels and declared missing values are two different kinds of metadata, implying that attributes for missing values could also be applied to non-labelled variables (similarly, variable labels could be attached to a non-labelled ...

... would increase the size of the dataset, but not substantially (in any case, completely manageable for decent-size datasets), because it doesn't copy >allstandard< anymore).
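
To make the two storage strategies discussed in this thread concrete, here is a minimal base R sketch. It is only an illustration under assumptions: the attribute name `declared_missing`, the element name `ids`, the example data, and the body of `missing2NA` are hypothetical. Only the function name `missing2NA`, the `values` element, and the Missing1 / IndividualId example (value 5 at id 3) come from the discussion.

```r
# Illustrative data: code 5 in Missing1 stands for "Missing (on paper form)".
df <- data.frame(
  IndividualId = 1:4,
  Missing1 = c(1, 2, 5, 4)
)

# Strategy 1 (dataset level): replace the missing code with NA and keep the
# original value plus the id of the affected row in a nested attribute.
# Attribute name "declared_missing" and element name "ids" are hypothetical.
is_paper <- df$Missing1 == 5
df1 <- df
df1$Missing1[is_paper] <- NA
attr(df1, "declared_missing") <- list(
  Missing1 = list(
    "Missing (on paper form)" = list(
      values = df$Missing1[is_paper],
      ids = df$IndividualId[is_paper]
    )
  )
)

# Strategy 2 (variable level): leave the microdata untouched and record the
# declared missing codes as an attribute of the vector itself.
x <- df$Missing1
attr(x, "declared_missing") <- 5

# Hypothetical helper in the spirit of the missing2NA idea: returns a plain
# vector in which every declared missing code is replaced by NA.
missing2NA <- function(v) {
  codes <- attr(v, "declared_missing")
  out <- as.vector(v)  # drops attributes, keeps the values
  out[out %in% codes] <- NA
  out
}

missing2NA(x)  # returns c(1, 2, NA, 4); x itself still holds the code 5
```

With the second strategy the original codes stay in the data and conversion to NA happens only on demand, which is the behaviour argued for above. The first strategy depends on the id variable remaining available to map the stored values back to rows.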
