Moderating Crowdsourced Content


Nancy Proctor

Beth Ziebarth





Crowdsourcing, verbal description, visual description, quality, authority, professional description

What is this study about?

This document presents responses from the field regarding the idea of crowdsourcing visual descriptions. Many professionals have concerns about the quality and veracity of descriptions created by non-professionals. In addition, there are concerns about personal judgment, interpretations, and emotions conveyed in non-professional descriptions that can potentially mislead or distort the perception of what is being described.

Top line summary

Although skepticism about crowdsourcing visual descriptions is pervasive and unlikely to disappear quickly or entirely, interpretive trends in art museums, and the use of inquiry-based learning in particular, make the use of non-expert descriptions integral to some public programs. Art museums may therefore be leading the way in finding value and creating greater acceptance for crowdsourcing visual descriptions in the field.

About the authors

Beth Ziebarth is Director of the Smithsonian’s Accessibility Programs Office and a member of the Access App team, heading up the content-based part of the initiative.

Nancy Proctor worked with Ziebarth on the first iteration of the Access App and has participated primarily in the technical team of the Access App project, advising on the development of the software-based elements of the Access App toolkit.


From the beginning of the Access App project with its 2012-2013 pilot at the National Museum of American History, the team has been aware of concerns from professionals in the field that crowdsourced visual descriptions – i.e. those created by people who don’t necessarily have any training or experience in visual describing – would not be of high enough quality or reliability to be useful to blind and visually impaired people. Worse, inaccuracies and personal interpretations conveyed in amateur descriptions could mislead people and give them an incorrect impression of what is being described. This has been the primary objection voiced from within the field to the Access App, and to crowdsourcing visual description in general.

To some extent this concern is also a manifestation of a style of visual description in which the describer refrains from including any interpretation or personal judgement about what is being described. The aim is to produce a “what you see is what you get” quality description that should be more objective and faithful to the actual visual attributes of what is being described.

Professionals working with visual description in museums, however, have found that the “WYSIWYG” school of description could impoverish the visitor experience by leaving aside one of museums’ greatest strengths: subject matter expertise. Museum professionals typically know a great deal more about objects in their museums than just what they look like, and indeed it is this broader knowledge that brings people to museums, hoping to hear from the experts. To withhold this subject matter expertise from the visual description seems to go against the interests of the visitors and the mission of the institution.

In addition, the inquiry method, used in art museums in particular to engage visitors in close looking and thinking about exhibitions and objects, has grown in prevalence in museum education practice. The inquiry method uses questions to prompt people to look closely at objects. New York’s Museum of Modern Art (MoMA) is a leading exponent and developer of this approach. An inquiry-based session typically begins by asking participants “what do you see?” in front of an object or exhibit. This question is clearly unhelpful for blind and visually-impaired visitors if there are no other sensory experiences available for the object (e.g. touching, smelling, or listening).

Educators using the inquiry method have sought to include blind and visually-impaired participants in the experience of the object by asking the group to describe what they see, with the educator occasionally interjecting additional description as necessary to offer a complete picture and summarize the group’s description. All participants are encouraged to ask questions as the object is being described as well. Inclusive inquiry-based encounters in the museum, therefore, are by nature a hybrid of professional and crowdsourced visual description.

The role of inquiry-based learning and its impact on perceptions of the role and value of crowdsourcing were discussed at a recent visual description workshop led by Rebecca McGinnis, Hannah Goodwin and Lorena Bradford at the 2017 LEAD Conference. The workshop was attended by professional describers as well as art, accessibility and other museum and cultural heritage professionals with a range of levels of experience in visual describing. Three conclusions of the conversation argued in favor of at least some use of crowdsourcing for visual descriptions in museums:

  1. The hybrid approach that uses crowdsourcing as well as professional describing can and must work in order to ensure that visitors’ experiences of common educational programs are inclusive of blind and visually-impaired people. There is an expectation, therefore, that inclusive, inquiry-based programming will help inform discussions about the risks and benefits of crowdsourcing to visual description practice.
  2. There is also a value in introducing a range of voices in discussions of objects and exhibits in museums. People of different ages and life experiences bring a diversity of perspectives to the museum’s exhibitions and collections, and these can in turn create new opportunities for other audiences to find relevance and interest in the museum. In addition, hearing descriptions from a human voice can add texture and pleasure for the listener that is unavailable in a curator’s text or wall label. Furthermore, some visitors welcome the introduction of emotional input and personal responses in visual descriptions, and do not fear being “misled” by the opinion or perception of another, particularly if it is offered as one view among many. It has been proven with crowdsourced publications like Wikipedia or some online translation and transcription projects that when a preponderance of objects have been described by a range of voices and describers, the aggregate body of descriptive content may serve to produce a higher quality description than a single, professional describer could offer. At that point, however, the question of how many descriptions a listener would want to have to listen to in order to get a comprehensive view of the object being described may become more pressing.
  3. However, at the moment there are still so few museums offering visual description that arguably some description is better than none in helping to open the door to museum experiences for people who are blind or have low vision. In other words, quantity or availability rather than quality is the primary problem museums face in terms of accessibility to blind and visually-impaired visitors. Museums urgently need to find and offer new services in addition to in-person facilitation in order to provide blind and visually-impaired visitors with the same level of independence and on-demand access to collections and exhibitions that sighted visitors enjoy.

One major barrier to museums creating more visual descriptions is perceived cost. Several elements can serve as barriers to adoption: a) deciding on and possibly providing the hardware or devices through which the visual descriptions are heard; b) the adoption or creation of software and accessible interfaces to the content; and c) the production of the visual description content itself. The Access App project aimed to mitigate the first two by a) choosing to provide descriptions through visitors’ own devices and b) developing open source software and interfaces to an app that would make both the recording and playback of visual descriptions easy for museums and visitors.

There has not yet been enough crowdsourced visual description content produced to determine if crowdsourcing actually saves on costs compared to having the same amount of content created by professional describers. Nonetheless, new business models are already emerging, including services that allow individuals and institutions like museums to purchase a number of minutes or hours per month of visual description via mobile device. The describers are not subject-matter experts like professional museum describers, but they are trained in visual describing and, perhaps more importantly than anything for the blind or visually-impaired visitor, are available on-demand through the convenience of the visitor’s own mobile device. This kind of service suggests yet another tool that museums can adopt to have a range of solutions at the ready for increasing accessibility to collections and exhibitions, beyond the dominant paradigm of live guides and professional describers.


The arguments against crowdsourcing visual descriptions hinge largely on issues of control over the visitor experience and content created about the collection or exhibition. As museum professionals become more comfortable with ceding control to visitors and non-expert participants, many of the concerns about crowdsourcing are likely to abate or fall away.

There are benefits to engaging visitors and non-expert audiences in visual description beyond the content they produce: sighted visitors look more closely at objects when asked to describe them, and become more engaged in both the museum and its mission when explicitly participating in a service that will benefit other visitors. In addition, sighted visitors have found it interesting to hear how others see and describe objects and exhibitions. This is analogous to the experience of visiting and discussing an exhibition or object with a friend: the friend may not be a subject-matter expert, but there is pleasure in sharing views with them.

What we would do differently

As a next step in this research, user response to various combinations of content (professional, crowdsourced, and mixed in various ways) should be tested to see whether any patterns or best practices emerge that suggest the right balance among content sources.

What we would not change

Though we have not converted everyone to using crowdsourcing to grow the available body of visual descriptions, we find that independent trends and developments in the field, like inquiry-based learning, seem to be gradually creating greater openness to approaches like those used in the Access App project. We believe we are on the right path, albeit still very much at the beginning!


Many people, including blind and visually-impaired people, want to be able to visit museums independently and access information that enhances their experience on demand. Museums also want to provide the most satisfying experiences possible in order to achieve their missions and fulfill their responsibility to the public. These demands are leading museums to try multiple new routes to increasing accessibility through providing more visual descriptions of their collections and exhibitions. Arguments against crowdsourcing visual descriptions are generally based on the concern that amateur describers don’t have the necessary knowledge to describe accurately or well. But description is an activity that does not require subject matter expertise, and we have found that sighted visitors are willing to record their descriptions of objects and benefit from the closer looking and greater engagement with the collection and museum that the activity affords. The challenge now is not only to find ways to encourage sighted visitors to participate so that their visual descriptions benefit blind and visually-impaired visitors, but also to moderate the visual description content for accuracy, as is done on other major crowdsourced content platforms like Wikipedia.

An additional hurdle is getting museums to invest in platforms that allow for the delivery of verbal description. Museums have been investing in audio tours for years, and it is clear that all visitors benefit from the addition of a layer of visual description content. Engaging with crowdsourcing to achieve this end, even if the amateur content is moderated and framed by professional content, is a promising route forward with multiple additional benefits as well.