One year after its launch, ORCID has published its first Public Data File. I downloaded it and had a brief look at the data. It comes in both XML and JSON, with one file per ORCID record. There are a little over 360,000 files.
Here is what my own record looks like in XML. The orcid-activities element marks the start of the list of publications, which is cut off here:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orcid-message xmlns="http://www.orcid.org/ns/orcid">
  <message-version>1.0.22</message-version>
  <orcid-profile type="user">
    <orcid>0000-0002-6892-9305</orcid>
    <orcid-id>http://orcid.org/0000-0002-6892-9305</orcid-id>
    <orcid-preferences>
      <locale>en</locale>
    </orcid-preferences>
    <orcid-history>
      <creation-method>website</creation-method>
      <completion-date>2013-08-26T13:58:52.798Z</completion-date>
      <submission-date>2013-08-26T13:47:05.325Z</submission-date>
      <last-modified-date>2013-09-05T11:41:03.758Z</last-modified-date>
      <claimed>true</claimed>
    </orcid-history>
    <orcid-bio>
      <personal-details>
        <given-names>Tuija</given-names>
        <family-name>Sonkkila</family-name>
      </personal-details>
      <external-identifiers>
        <external-identifier>
          <orcid>0000-0002-6892-9305</orcid>
          <external-id-orcid>0000-0001-7707-4137</external-id-orcid>
          <external-id-common-name>ResearcherID</external-id-common-name>
          <external-id-reference>I-6344-2013</external-id-reference>
          <external-id-url>http://www.researcherid.com/rid/I-6344-2013</external-id-url>
        </external-identifier>
      </external-identifiers>
    </orcid-bio>
    <orcid-activities>
    ...
As a quick first test, here is a bar chart of how often each locale value occurs. The locale says little about the demographics of ORCID users. Rather, it confirms that even among scholars, English is the predominant web browser language setting.
The code for the XSLT transformation and for making the plot is available here.
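For readers who would rather not use XSLT, the locale count can also be sketched in a few lines of Python. This is not the code linked above, only a minimal illustration under some assumptions: the function names locale_of and count_locales are made up for this example, the XML files are assumed to sit in one directory with an .xml extension, and the element path follows the sample record shown earlier.

```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Default namespace taken from the sample record above.
NS = {"o": "http://www.orcid.org/ns/orcid"}


def locale_of(xml_bytes):
    """Return the <locale> value of one ORCID record, or None if it is absent."""
    root = ET.fromstring(xml_bytes)
    node = root.find("o:orcid-profile/o:orcid-preferences/o:locale", NS)
    if node is None or not node.text:
        return None
    return node.text.strip()


def count_locales(directory):
    """Count locale values over all *.xml files in a directory (assumed layout)."""
    counts = Counter()
    for path in Path(directory).glob("*.xml"):
        counts[locale_of(path.read_bytes()) or "(none)"] += 1
    return counts


# A trimmed-down record with the same structure as the sample above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orcid-message xmlns="http://www.orcid.org/ns/orcid">
  <orcid-profile type="user">
    <orcid-preferences><locale>en</locale></orcid-preferences>
  </orcid-profile>
</orcid-message>"""

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_locales.py <directory with the XML files>
    for value, n in count_locales(sys.argv[1]).most_common():
        print(value, n)
```

The resulting counts are all a bar chart needs as input. Records without a locale element are grouped under "(none)" rather than dropped, so the totals still add up to the number of files read.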