Monika Zarczuk-Engelsma | Have a Book
Posted on 19 November 2024

Metadata: The Key to Accessibility

Accessibility
Illustration on a pink background showing a large tablet with books on its screen. A woman holding a magnifying glass examines one of the books, while a man stands to the right, reading from a mobile device.

Well-constructed metadata is key to the discoverability and availability of digital publications. It creates the first impression of a book, encouraging potential readers to read or buy it.

The importance of metadata grows as more and more ebooks become available on the market. In the context of the accessibility of ebooks, it will grow by leaps and bounds in June 2025, once the requirements of the European Accessibility Act (EAA) start to apply. It is also worth mentioning that publishers use metadata for market analysis and production planning.

At the end of the day, it is crucial to understand who is responsible for creating correct metadata and which elements of metadata are key to accessibility, for example for blind readers.

What Metadata Is

Metadata is data about data. To put it simply, metadata is a “label” which describes a book. It contains all the basic data, such as the book’s ISBN, author, title, publication date, format and genre (the more precise it is, the better). It is metadata that enables readers, librarians and search engines to find specific publications in the desired format. The information that an ebook’s metadata should feature depends on the type of the book. For instance, in the case of a streaming audiobook it is important to know whether it is narrated by a human or by AI and, if the narrator is a real person, who they are. (The Polish book market offers many examples of audiobook descriptions that lack this basic information.) In the case of an ebook it is often stated that it can be purchased in the EPUB format, but where accessibility is concerned there is a significant difference between the EPUB2 and EPUB3 formats. (EPUB3 is the only one that, at least in theory, guarantees full accessibility.)
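
To make the EPUB2 versus EPUB3 distinction concrete: an EPUB file declares its version in the `version` attribute of the root `package` element of its package document (the OPF file). The Python sketch below reads that attribute, together with any `schema:accessibilityFeature` entries, from a package document passed in as a string. The sample document, its title and identifier, and the function name `inspect_package` are made up for illustration; a real tool would first have to extract the OPF file from the EPUB archive.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical, heavily shortened package document used only for illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Title</dc:title>
    <dc:language>en</dc:language>
    <meta property="schema:accessibilityFeature">alternativeText</meta>
    <meta property="schema:accessibilityFeature">tableOfContents</meta>
  </metadata>
</package>"""


def inspect_package(opf_xml: str) -> dict:
    """Return the EPUB major version and the declared accessibility features."""
    root = ET.fromstring(opf_xml)
    if root.tag != f"{{{OPF_NS}}}package":
        raise ValueError("not an OPF package document")
    version = root.get("version")
    if not version:
        raise ValueError("package element has no version attribute")
    # <meta> elements inherit the OPF default namespace from <package>.
    features = [
        meta.text.strip()
        for meta in root.iter(f"{{{OPF_NS}}}meta")
        if meta.get("property") == "schema:accessibilityFeature" and meta.text
    ]
    return {"major_version": int(version.split(".")[0]), "features": features}


info = inspect_package(SAMPLE_OPF)
print(info["major_version"], info["features"])
```

A store or library could use a check of this kind to state “EPUB3” rather than just “EPUB” in the product description.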

Why Metadata Matters

Metadata is key to identifying, finding, understanding and reusing an ebook:

  • Precise metadata facilitates finding the book the reader is looking for in a search engine or a library catalogue.
  • Correctly defined metadata describes the accessibility of books for people with disabilities.
  • Precise metadata facilitates managing large digital resources.
  • Standardised metadata enables information exchange between different systems and platforms. Bear in mind that there are hundreds of thousands, if not millions, of ebooks on the market (depending on the language, e.g. fewer in Polish than in English).

Metadata Usage Standards

First of all, not only are there tens of millions of books, but they are also very diverse: academic or science books are often open access or Creative Commons licensed, whereas school textbooks are usually subject to national regulations. We wish for all of them to be available to blind individuals, and blind readers should know in advance whether the books really are accessible. Novels differ from picture books, and colouring books for children are a different universe which probably will not be analysed in the context of accessibility for blind people. In short, the world of ebooks is complex.

There are at least three main sources of guidelines for creating accessible ebooks: publishers (mainly the ONIX standard), creators of ebook formats (like PDF or EPUB) and creators of general accessibility guidelines (WCAG & W3C).

The sheer number of books on the market, combined with the number of guidelines, brings us to metadata from another angle. The diversity of data formats and information systems requires unified standards which guarantee interoperability and efficient information management. In reality, there is no universal set of guidelines for publishers.

Who Standardises Metadata

Choosing the right standard for metadata depends on specific needs and requirements. As a result, there are a number of entities which present their stances on the matter. The best known organisations are:

1. World Wide Web Consortium (W3C)

An international non-profit organisation with a focus on creating standards which enable creating, publishing and connecting web content.

W3C prepares a set of standards regarding metadata, including standards for RDF (Resource Description Framework), which is a language used to describe web content.

The most important guidelines for blind people are the Web Content Accessibility Guidelines (WCAG) 2.1, used in many legal acts of the European Union.

2. EDItEUR

EDItEUR is an international non-profit organisation which standardises the book market and is responsible for developing ONIX specifications.

ONIX is a standard for the exchange of data about publishing products. It is widely used in the publishing industry and enables data exchange between different systems.

ONIX code list 196 (linked below) is of vital importance to accessibility. It is worth noting that by following the link you will also find the ONIX specification prepared in a way that is accessible to blind people.

3. Schema.org (https://schema.org/)

Created by the largest companies specialising in finding information online: Google, Inc., Yahoo, Inc., Microsoft Corporation, Yandex. These four entities sponsor the specification, which defines how an item should be described so that robots browsing websites can identify it.

For those who love challenges: it would be best to understand the “accessModeSufficient” element of the specification first.
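
As a starting point for that challenge, here is a minimal Python sketch of how “accessModeSufficient” might appear in a Schema.org description of a book serialised as JSON-LD. The property lists the combinations of access modes (such as “textual”, “visual” or “auditory”) that are each enough, on their own, to take in the whole content. The book, its title and the helper name `book_jsonld` are hypothetical.

```python
import json


def book_jsonld(name, access_modes, sufficient_sets):
    """Build a schema.org Book description as a JSON-LD-ready dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": name,
        # Every access mode the content uses (e.g. text plus images).
        "accessMode": list(access_modes),
        # Each ItemList is one combination of modes that is enough on its own
        # to take in all of the book's content.
        "accessModeSufficient": [
            {"@type": "ItemList", "itemListElement": list(modes)}
            for modes in sufficient_sets
        ],
    }


# Hypothetical ebook: text with images, where every image has a text
# alternative, so the book can be read fully through text alone.
record = book_jsonld(
    name="Example Title",
    access_modes=["textual", "visual"],
    sufficient_sets=[["textual", "visual"], ["textual"]],
)
print(json.dumps(record, indent=2))
```

For a blind reader the second entry is the important one: it says that the textual mode alone is sufficient, which is exactly what a screen reader needs.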

4. Dublin Core Metadata Initiative (DCMI)

An international non-profit organisation which develops and sustains a set of simple interoperable metadata which can be used with any digital resource. DCMI focuses on creating basic elements of metadata which can be used in different fields and environments. It offers a number of standards, such as Dublin Core, which is one of the best known sets of metadata widely used by libraries. Libraries also widely use the MARC standard, which is maintained separately by the Library of Congress rather than by DCMI.

MARC (Machine-Readable Cataloging) is an international standard used to save and exchange digital bibliographic data. It is like a universal language which enables libraries around the world to exchange information about their resources. Thanks to this format it is possible to save a lot of information about a book, such as its title, author, publisher, year of publication, ISBN, keywords, and information about accessibility for people with disabilities. MARC enables the transfer of bibliographic data between different computer systems and, as a result, the creation of databases and resource catalogues.

5. International Organization for Standardization (ISO)

ISO focuses on creating standards which facilitate international exchange of goods and services. It develops a number of standards regarding metadata for various fields, such as librarianship, archival science and information management. The most important ISO standards for creating accessible ebooks are PDF/UA and EPUB, and, for metadata, ISBN and DOI.

There are at least two other names which should be mentioned. OASIS (Organization for the Advancement of Structured Information Standards) focuses on creating standards which facilitate data exchange between different systems and apps, and develops metadata standards for catalogues, archives and other data management systems. METS (Metadata Encoding and Transmission Standard) is a standard maintained by the Library of Congress and used for describing complex digital objects, such as ebooks, in library environments.

Codename “ONIX List 196”

Choosing the right standard depends on one’s objectives (e.g. distribution in online stores, cataloguing in libraries), on what systems will be used to manage and publish ebooks, and on a book’s complexity – the more complex it is, the more detailed the description needs to be.

The good news is that it is the publisher who is responsible for creating the file and its metadata so that they suit market needs. Libraries create the metadata needed for borrowing books. Since the two sets of metadata should match online, the better a publisher’s description is, the more likely it is to be used by other systems.

ONIX is the most widely used standard in the publishing industry. It is therefore advisable to use it as a base. In every publishing company it should work like this: there should be one person (the owner, director, production manager, editor-in-chief, marketing director, or ideally, an accessibility specialist) who knows that there is someone in the company who understands the ONIX standards, or knows someone who does, and can refer to them when asked about metadata. It is acceptable for the ONIX specialist to be an external person, as long as they are recognised within the company.

Another key feature is that the ONIX standards and WCAG are “synchronised” or “compatible”, which means they form a link between publishers and accessibility.

Then the (internal or external) ONIX specialist should fully understand the array of possibilities that ONIX list 196 offers regarding accessibility.

The result should be a combination of basic elements of ebook description and those regarding accessibility, such as:

  • ISBN (International Standard Book Number) or, if an ebook does not have an ISBN, another identifier, such as DOI (Digital Object Identifier)
  • A precise and unequivocal book title. It would be best if it were separate from the volume number and the name of the book series (if books are numbered, the numbers should be in separate fields, not in the same field as the title). Likewise, the title field should not contain information about the format, e.g. Mobi or ePub.
  • Author names and surnames.
  • Names and surnames of co-creators, if applicable (translators, editors, illustrators)
  • Publisher’s name
  • Date of publication and market release date (if they are different, which is often the case with printed books and ebooks)
  • A concise description of the book’s content.
  • A precise specification of the book’s genre, subject and subgenres (if possible). If we use ONIX it would be best to also add a Schema description, as it is the basic scheme used in the publishing industry.
  • Keywords which describe the book’s content and make it easier to find.
  • The book’s original language.
  • Price at which the ebook may be purchased.

Regarding accessibility

  • Electronic file format and all the details (if it is a PDF file, is it DRM protected? If it is an EPUB file, which version is it, EPUB2 or EPUB3?). Analysing lists 79 and 150 of the ONIX specification is a good starting point.
  • Data about the book’s accessibility for people with disabilities: is it available in text format? Does it contain alternative image descriptions? Does it meet A or AA WCAG 2.1 guidelines? Does it follow ISO guidelines (e.g. PDF/UA-2 in the case of PDF files)? Does it contain searchable indexes and a searchable table of contents? Do mathematical formulas fulfil MathML requirements? Is page numbering from the printed version available in the digital version as well? Are ebook and audiobook versions synchronised (if both formats are available)? Are ARIA markers used? Apart from this, there are a number of guidelines for visually impaired individuals regarding contrast, colour palette and font type. All of this is available under the codename “ONIX list 196”.
  • Data regarding exemption from regulations. Bear in mind that microenterprises are exempt from the Accessibility Act, which all sellers of ebooks in the EU must be familiar with. British and American publishers who sell ebooks in the EU are also under the obligation to introduce accessible ebooks and correct ebook metadata.
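
To illustrate how the accessibility questions above might end up in an ONIX 3.0 record, here is a rough Python sketch. In ONIX, each accessibility statement is carried by a `ProductFormFeature` element whose `ProductFormFeatureType` is code 09 from list 79 (e-publication accessibility detail) and whose `ProductFormFeatureValue` is a code from list 196. The mapping keys and the function name are hypothetical, and the code values shown reflect one reading of the code lists, so they should be verified against the current issue linked in the sources before use.

```python
import xml.etree.ElementTree as ET

# List 79 code for "E-publication accessibility detail".
ACCESSIBILITY_DETAIL = "09"

# A few list 196 codes; illustrative only, verify against the current
# issue of the ONIX code lists before relying on them.
LIST_196 = {
    "toc_navigation": "11",      # table of contents navigation
    "index_navigation": "12",    # index navigation
    "short_alt_text": "14",      # short alternative textual descriptions
    "mathml": "17",              # accessible maths content as MathML
    "print_page_numbers": "19",  # print-equivalent page numbering
}


def accessibility_features(feature_keys):
    """Build one <ProductFormFeature> element per requested feature."""
    elements = []
    for key in feature_keys:
        feature = ET.Element("ProductFormFeature")
        ET.SubElement(feature, "ProductFormFeatureType").text = ACCESSIBILITY_DETAIL
        ET.SubElement(feature, "ProductFormFeatureValue").text = LIST_196[key]
        elements.append(feature)
    return elements


# Hypothetical ebook with a navigable table of contents and image descriptions.
features = accessibility_features(["toc_navigation", "short_alt_text"])
for element in features:
    print(ET.tostring(element, encoding="unicode"))
```

In a complete record these fragments would sit inside the product’s `DescriptiveDetail` block, next to the product form codes from list 150; that surrounding structure is left out here to keep the sketch short.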

Summary: dear publisher, remember that...

There are many reasons why it is good to make sure the metadata is well constructed. Not only will it be required by the European Accessibility Act from June 2025 for all ebooks, but it is also key to a satisfying search for electronic books. Thanks to it, every reader, regardless of their needs and limitations, will be able to easily find and use the publications they are interested in. Search engines like Google also use metadata to index and classify books. The better you construct the metadata and the better use you make of categories, keywords and the book description, the more likely your book is to appear at the top of search results. The better you understand how metadata is searched, the better prepared you will be. Recommendation algorithms used by ebook stores also rely on metadata. The more precisely you describe your book, the better the suggestions offered to readers who have already bought a similar publication. Metadata is a system of communicating vessels.

Sources:

https://www.editeur.org/files/ONIX%203/APPNOTE%20Accessibility%20metadata%20in%20ONIX%20(advanced).pdf

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/draft/principles/

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/draft/techniques/onix-metadata/index.html

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/draft/techniques/epubmetadata/index.html

https://www.w3.org/TR/WCAG21/

https://www.editeur.org/83/overview/

https://www.editeur.org/files/ONIX%20for%20books%20-%20code%20lists/ONIX_BookProduct_Codelists_Issue_67.html

https://schema.org/accessModeSufficient

The article was created in close collaboration between the Polish Foundation for the Blind and Visually Impaired “Trakt” and Have a Book.

Translated by Aleksandra Kallas

Monika Zarczuk-Engelsma

Monika Zarczuk-Engelsma was born in the Lublin region, and her early years were connected to a school for the blind in Kraków. She graduated from the Academy of Podlasie with a degree in Polish Philology, and then continued her education at the University of Gdańsk, obtaining a diploma in postgraduate studies in Partnership Marketing and Public Relations. This closely aligned with the position she was entrusted with at one of the non-governmental organizations in the Tricity area.

As part of expanding her professional experience, Monika participated in projects teaching computer skills to blind individuals, conducted training on volunteerism, tried her hand at copywriting, took part in the European Union’s European Voluntary Service project, and also secured an internship at the European Economic and Social Committee. Monika is now the office manager of the Polish Foundation for the Blind and Visually Impaired “Trakt”.

In her daily life, she uses Braille but also readily takes advantage of modern technology, such as an iPhone equipped with VoiceOver, a screen-reading laptop, a talking thermometer, and a bathroom scale that announces measurements in a clear female voice. Thanks to these tools, Monika achieves a high level of independence and is able to live the way she enjoys—actively and engagingly, meeting new people, working, reading books, and pursuing hobbies such as playing chess, board and card games, cooking, and traveling (she has already visited over 30 countries).

Currently, she is also raising two wonderful children, taking care of two already adult cats, and running her dream home with a garden.