The details depend on the particular capability being targeted, but we typically filter training data using automatic measures of writing quality, such as grammaticality, fluency, and clarity. This kind of filtering helps improve the writing quality of the text our models generate. We also automatically pseudonymize certain sensitive fields in the data before training, which makes it less likely that our AI models will reproduce those fields verbatim from the training data.
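To make the two steps concrete, here is a minimal sketch of what quality filtering followed by pseudonymization could look like. It is an illustration under stated assumptions, not a description of the actual pipeline: the scoring heuristic, the `min_quality` threshold, the regular expressions, and every function name below are hypothetical, and a production system would likely rely on trained quality classifiers and much broader detection of sensitive fields.

```python
import hashlib
import re

# Hypothetical patterns for two kinds of sensitive fields (assumption:
# the real set of fields and the detection method are not stated in the text).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def quality_score(text: str) -> float:
    """Toy stand-in for an automatic writing-quality measure, in [0, 1].

    Combines the share of purely alphabetic words with a check that the
    text ends in sentence punctuation. Purely illustrative.
    """
    words = text.split()
    if not words:
        return 0.0
    alphabetic = sum(1 for w in words if w.strip(".,;:!?\"'()").isalpha())
    alpha_ratio = alphabetic / len(words)
    ends_cleanly = 1.0 if text.rstrip().endswith((".", "!", "?")) else 0.0
    return 0.7 * alpha_ratio + 0.3 * ends_cleanly


def _placeholder(kind: str, value: str) -> str:
    """Map a sensitive value to a stable pseudonym such as <EMAIL_1a2b3c4d>."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    return f"<{kind}_{digest}>"


def pseudonymize(text: str) -> str:
    """Replace matched sensitive fields with deterministic placeholders."""
    # Emails first, so digits inside an address are not mistaken for a phone number.
    text = EMAIL_RE.sub(lambda m: _placeholder("EMAIL", m.group()), text)
    text = PHONE_RE.sub(lambda m: _placeholder("PHONE", m.group()), text)
    return text


def prepare_corpus(docs, min_quality=0.8):
    """Keep documents at or above the quality threshold, then pseudonymize them."""
    return [pseudonymize(d) for d in docs if quality_score(d) >= min_quality]


if __name__ == "__main__":
    sample = [
        "Please contact alice@example.com for details.",
        "asdf 1234 $$$ ###",
    ]
    for doc in prepare_corpus(sample):
        print(doc)
```

In this sketch the second sample document falls below the threshold and is dropped, while the first is kept with its email address replaced by a placeholder. The placeholder is derived from a hash, so the same value always maps to the same pseudonym; that keeps repeated mentions consistent without exposing the original string.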