In some instances, we have trained models on publicly available and open-source datasets. These datasets are widely used across the industry for training language models and may include some protected text. We also use text made available under Creative Commons licenses and text in the public domain.