Brussels / 30 & 31 January 2016


API-Powered Dictionaries For Digitally Under-Represented Languages

At the end of 2014 Oxford University Press (OUP) launched the Oxford Global Languages (OGL) initiative, which focuses on developing linguistic resources, particularly for digitally under-represented languages. The aim of the programme is to help language communities around the world create, maintain, and use digital language resources, while developing digital-ready content formats to support the growing language needs of technology worldwide. OGL aims to create a win-win situation in which communities of digitally under-represented languages can contribute content through an online platform, licensees can consume that data in the digital format they need for a fee, and OUP generates enough revenue to sustain free community access to the online platform.

In September 2015 OUP launched the first two OGL websites, for isiZulu and Northern Sotho. The backend of these websites draws on an API (Application Programming Interface) which interacts with the RDF master data in a triple store and delivers it to the frontend serialized as JSON-LD. The websites enable language communities worldwide to add, review, and share language-related content for digitally under-represented languages. Three months after the launch of the isiZulu and Northern Sotho websites, crowdsourced contributions had created or improved 537 entries in the online dictionaries, and the number of community contributors is increasing day by day.
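To give a feel for what "serialized as JSON-LD" could mean for a dictionary entry, here is a minimal sketch. The vocabulary URL, the property names (`headword`, `definition`), the `Entry` type, and the entry identifier are all hypothetical; the abstract does not describe the actual OGL data model.

```python
import json

# Hypothetical vocabulary; the real OGL vocabulary is not given in this abstract.
EXAMPLE_VOCAB = "http://example.org/vocab#"


def entry_to_jsonld(entry_id, headword, language, definition):
    """Return one dictionary entry as a JSON-LD document (illustrative shape only)."""
    return {
        # @vocab lets the plain keys below expand to full IRIs in the vocabulary.
        "@context": {"@vocab": EXAMPLE_VOCAB},
        "@id": f"http://example.org/entries/{entry_id}",
        "@type": "Entry",
        # Language-tagged strings keep the headword and its gloss apart.
        "headword": {"@value": headword, "@language": language},
        "definition": {"@value": definition, "@language": "en"},
    }


# "umuntu" is the isiZulu word for "person"; "zu" is the language tag for isiZulu.
doc = entry_to_jsonld("umuntu-1", "umuntu", "zu", "person")
print(json.dumps(doc, indent=2, ensure_ascii=False))
```

Because JSON-LD is ordinary JSON, a frontend can read such a document with any JSON parser, while the `@context` keeps it interpretable as RDF.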

The presentation focuses on the API developed to power these websites. We show API calls to search and retrieve dictionary entries, add new content to the website in real time, and delete it if need be. We discuss the advantages of API-powered dictionaries, how the API allows OUP to crowdsource linguistic data from online language communities, and how APIs make it easier for external systems and developers to integrate the data.
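The four kinds of call mentioned above (search, retrieve, add, delete) might map onto HTTP requests roughly as follows. This is a sketch under assumptions: the base URL, the URL paths, the `q` parameter, and the payload shape are hypothetical and are not taken from the OGL API. The requests are only constructed, never sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://api.example.org/v1"  # hypothetical base URL, not the OGL API


def search_entries(language, query):
    """Build a GET request that searches the entries of one language."""
    params = urlencode({"q": query})  # "q" is an assumed parameter name
    return Request(f"{BASE_URL}/{language}/entries?{params}", method="GET")


def get_entry(language, entry_id):
    """Build a GET request that retrieves a single entry by identifier."""
    return Request(f"{BASE_URL}/{language}/entries/{quote(entry_id)}", method="GET")


def add_entry(language, entry):
    """Build a POST request that submits a new entry as a JSON body."""
    body = json.dumps(entry).encode("utf-8")
    return Request(
        f"{BASE_URL}/{language}/entries",
        data=body,
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


def delete_entry(language, entry_id):
    """Build a DELETE request that removes an entry."""
    return Request(f"{BASE_URL}/{language}/entries/{quote(entry_id)}", method="DELETE")


# Print the method and URL of each request instead of sending it.
planned = [
    search_entries("zu", "umuntu"),
    get_entry("zu", "umuntu-1"),
    add_entry("zu", {"headword": "umuntu", "definition": "person"}),
    delete_entry("zu", "umuntu-1"),
]
for req in planned:
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it; a real client would also need whatever authentication the API requires, which is left out here.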

Finally, we would like to appeal to language enthusiasts, linguists, and software developers to contribute to the OGL initiative and experiment with our APIs. In the coming months OUP is planning to release API-powered dictionary websites for Malay and Urdu, and we are building online communities around these languages. On the technology side, we are currently giving software developers early access to our APIs and SPARQL endpoint in order to gather requirements for publicly accessible APIs for specific languages and for publishing some of our RDF as Linked Data.
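For developers curious about what working against a SPARQL endpoint involves, the sketch below builds a query and the corresponding HTTP GET request in the style of the SPARQL 1.1 Protocol (a `query` URL parameter and an `Accept` header asking for JSON results). The endpoint URL and the `ex:` vocabulary are hypothetical placeholders, not the OGL endpoint or data model, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

SPARQL_ENDPOINT = "https://sparql.example.org/query"  # hypothetical endpoint URL


def build_headword_query(language_tag, limit=10):
    """Return a SPARQL query listing entries whose headword has the given language tag."""
    # The ex: vocabulary is an assumption made for illustration only.
    return f"""
PREFIX ex: <http://example.org/vocab#>
SELECT ?entry ?headword WHERE {{
  ?entry a ex:Entry ;
         ex:headword ?headword .
  FILTER (lang(?headword) = "{language_tag}")
}}
LIMIT {int(limit)}
""".strip()


def sparql_request(query):
    """Build a GET request carrying the query in the 'query' URL parameter."""
    url = f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


query = build_headword_query("zu", limit=5)
request = sparql_request(query)
print(query)
print(request.full_url)
```

The language tag is interpolated directly into the query text, so a real client should validate it first rather than accept arbitrary user input.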


Sandro Cirulli