NSF Support
for the American Sign Language Linguistic Research Project

    We are grateful for support from the National Science Foundation.

I. The Architecture of Functional Categories in American Sign Language

#SBR-9410562 November 1, 1994 - January 31, 1998
C. Neidle, P.I.; J. Kegl, co-P.I.; B. Bahan, co-investigator ($355,000)

#SBR-9729010 and #SBR-9729065 March 15, 1998-February 28, 2002
C. Neidle, P.I.; J. Kegl, co-P.I.; B. Bahan and D. MacLaughlin, co-investigators ($355,000)

II. SignStream: A Multimedia Tool for Language Research

#IRI-9528985 and #IIS-9528985 December 1, 1995 - May 31, 2000
C. Neidle, P.I. ($748,169)

III. National Center for Sign Language and Gesture Resources

#EIA-9809340 and #EIA-9809209 October 1, 1998 - September 30, 2003
Boston University: C. Neidle, P.I.; S. Sclaroff, co-P.I. ($649,999)
University of Pennsylvania: D. Metaxas, P.I.; N. Badler and M. Liberman, co-P.I.'s ($650,000)

IV. Essential Tools for Computational Research on Visual-Gestural Language Data

#IIS-9912573 May 15, 2000 - April 30, 2004
C. Neidle, P.I.; S. Sclaroff, co-P.I. ($687,602)

V. Pattern Discovery in Signed Languages and Gestural Communication

#IIS-0329009 September 15, 2003 - August 30, 2007
C. Neidle, P.I.; M. Betke, G. Kollios, and S. Sclaroff, co-P.I.'s ($749,999)

VI. ITR-Collaborative Research: Advances in recognition and interpretation of human motion: An Integrated Approach to ASL Recognition

#CNS-04279883, 0427267, 0428231 October 15, 2004 - March 31, 2009
Boston University:
C. Neidle, P.I. ($500,000)
Gallaudet University: C. Vogler, P.I. ($249,998)
Rutgers University: D. Metaxas, P.I.; A. Elgammal and V. Pavlovic, co-P.I.'s ($1,099,815)

VII. HCC-Large Lexicon Gesture Representation, Recognition, and Retrieval

#HCC-0705749
September 15, 2007 - September 30, 2011
Boston University:
S. Sclaroff, P.I.; C. Neidle, co-P.I. ($899,985)
University of Texas at Arlington: V. Athitsos, P.I.

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0705749

This project aims to develop sign "look-up" technology, allowing the computer to recognize and identify a sign produced by a user in front of a webcam (or in a video clip for which the user has specified the start or end point). Such technology could, for example, serve as the interface to an ASL video dictionary. For purposes of developing and training computer algorithms for recognition, a set of about 3,000 signs in citation form was elicited from up to 6 native signers each, and these approximately 9,000 tokens have been linguistically annotated with unique gloss labels and start and end handshapes.
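
A minimal sketch of how such annotated tokens might be represented and queried, purely for illustration (the field names, handshape labels, and file names below are hypothetical and are not the project's actual annotation scheme):

from dataclasses import dataclass

@dataclass
class SignToken:
    """One citation-form sign token, with the kinds of labels described above."""
    gloss: str            # unique gloss label, e.g. "BOOK" (hypothetical)
    signer_id: str        # which native signer produced this token
    start_handshape: str  # handshape label at the start of the sign
    end_handshape: str    # handshape label at the end of the sign
    video_file: str       # path to the video clip for this token

tokens = [
    SignToken("BOOK", "signer01", "open-B", "open-B", "book_s01.mov"),
    SignToken("UNDERSTAND", "signer02", "S", "1", "understand_s02.mov"),
]

# Example query: all tokens whose (hypothetical) start handshape is "open-B".
print([t.gloss for t in tokens if t.start_handshape == "open-B"])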

See details of the project here: http://www.bu.edu/av/asllrp/dai-asllvd.html

For preliminary reports of this work, see [2], [3], [5], and [8].

Further reports, as well as a doctoral dissertation by Ashwin Thangali on the use of linguistic constraints on the composition of signs to improve recognition results, will be forthcoming. The complete, linguistically annotated data set will be made available as soon as possible, hopefully early in 2012.

VIII. II-EN: Infrastructure for Gesture Interface Research Outside the Lab

#IIS-0855065 July 20, 2009 - August 31, 2012
Boston University:
S. Sclaroff, P.I.; C. Neidle and M. Betke, co-P.I.'s ($591,445)

IX. Collaborative Research: II-EN: Development of Publicly Available, Easily Searchable, Linguistically Analyzed, Video Corpora for Sign Language and Gesture Research (planning grant)

#CNS-0958442, 0958247, and 0958286 April 1, 2010 - March 31, 2012
Boston University:
C. Neidle, P.I.; S. Sclaroff, co-P.I. ($70,000)
Rutgers University: D. Metaxas, P.I. ($20,000)
University of Texas at Arlington: V. Athitsos, P.I. ($10,000)

X. III: Collaborative Research: CI-ADDO-EN: Development of Publicly Available, Easily Searchable, Linguistically Analyzed, Video Corpora for Sign Language and Gesture Research

#CNS-1059218, 1059281, 1059235 and 1059221
August 1, 2011 - July 31, 2017
Boston University:
C. Neidle, P.I.; S. Sclaroff, co-P.I. ($368,205)
Rutgers University: D. Metaxas, P.I. ($97,908)
Gallaudet University:
B. Bahan, P.I.; C. Vogler, co-P.I. ($92,257)
University of Texas at Arlington: V. Athitsos, P.I. ($66,630)

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1059218

This project aims to make publicly available the resources that have been developed in conjunction with all of the projects listed on this page. This especially includes linguistically annotated ASL video data from native signers (with synchronized video showing the signing from multiple angles, along with a close-up of the face). An additional video collection from Gallaudet University will also be included.

See the data now available through the Data Access Interface, which is currently under development (it will come to include additional data sets and enhanced possibilities for browsing, searching, and downloading data):

http://secrets.rutgers.edu/dai/queryPages/

XI. III: Medium: Collaborative Research: Linguistically Based ASL Sign Recognition as a Structured Multivariate Learning Problem

#IIS-0964385 and 0964597
September 1, 2010 - August 31, 2015
Boston University:
C. Neidle, P.I. ($469,000)
Rutgers University: D. Metaxas, P.I. ($739,000)

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0964385

The goal is to enable computer algorithms to distinguish among and identify different morphological classes of signs in ASL (e.g., lexical signs, fingerspelled signs, loan signs, classifier constructions), which follow different compositional principles, and to exploit the linguistic constraints appropriate to the relevant class in order to improve recognition of manual signs.

XII. HCC: Collaborative Research: Medium: Generating Accurate, Understandable Sign Language Animations Based on Analysis of Human Signing

#IIS-1065013, 10650090 and 1054965
July 1, 2011 - June 30, 2016
Boston University:
C. Neidle, P.I. ($385,957)
CUNY (Queens College): M. Huenerfauth, P.I. ($338,005)
Rutgers University: D. Metaxas, P.I. ($469,996)

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1065013

This project applies the models developed (with funding from VI above) for recognition of linguistically significant facial expressions and head movements (see, for example, these presentations: [1], [4], [6], [7]) to the production of more realistic signing avatars. One of the biggest challenges in generating sign language through signing avatars is incorporating the realistic facial expressions and head movements that are essential to the grammars of signed languages.

XIII. EAGER: Collaborative Research: Data Visualizations for Linguistically Annotated, Publicly Shared, Video Corpora for American Sign Language (ASL)

#1748016, 1748022
August 1, 2017 - July 31, 2018
Boston University:
C. Neidle, P.I. ($18,001)
Rutgers University: D. Metaxas, P.I. ($54,999)

https://www.nsf.gov/awardsearch/showAward?AWD_ID=1748016

The goal of this project is to further improve the existing SignStream 3 and DAI 2 applications by incorporating several powerful enhancements and additional functionalities, enabling the shared tools and data to support new kinds of research in both linguistics (analysis of the linguistic properties of ASL and other signed languages) and computer science (work on sign language recognition and generation). Specifically, graphical representations of computer-generated analyses of ASL videos will be incorporated into the displays of both the annotation software and the Web interface. Users will then be able to visualize the distribution and characteristics of key aspects of facial expressions and head movements that carry critical linguistic information in sign languages (e.g., head nods and shakes, eyebrow height, and eye aperture).
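
As a purely illustrative sketch of the kind of visualization described above, the following Python/matplotlib snippet plots hypothetical frame-by-frame eyebrow-height and eye-aperture tracks, with a shaded span standing in for an annotated head shake; the data, value ranges, and layout are invented and do not reflect the actual SignStream 3 or DAI 2 displays:

import numpy as np
import matplotlib.pyplot as plt

frames = np.arange(200)                              # toy video of 200 frames
eyebrow_height = 0.5 + 0.2 * np.sin(frames / 15.0)   # synthetic measurement track
eye_aperture = 0.6 + 0.1 * np.cos(frames / 25.0)     # synthetic measurement track
headshake_span = (80, 130)                           # hypothetical annotated event

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(frames, eyebrow_height, label="eyebrow height")
ax.plot(frames, eye_aperture, label="eye aperture")
ax.axvspan(*headshake_span, alpha=0.2, label="annotated head shake")
ax.set_xlabel("video frame")
ax.set_ylabel("normalized measurement")
ax.legend(loc="upper right")
fig.tight_layout()
plt.show()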

Resulting publications: [9], [10], [11].

XIV. CHS: Medium: Collaborative Research: Scalable Integration of Data-Driven and Model-Based Methods for Large Vocabulary Sign Recognition and Search

#1763486, 1763523, 1763569
August 1, 2018 - July 31, 2022
Boston University:
C. Neidle, P.I. ($300,023)
Rutgers University: D. Metaxas, P.I. ($689,999)
RIT: M. Huenerfauth, P.I. ($209,896)

https://www.nsf.gov/awardsearch/showAward?AWD_ID=1763486

This research will create a framework that will enable the development of a user-friendly, video-based sign-lookup interface, for use with online ASL video dictionaries and resources, and for facilitation of ASL annotation. Input will consist of either a webcam recording of a sign by the user, or user identification of the start and end frames of a sign from a digital video. To test the efficacy of the new tools in real-world applications, the team will partner with the leading producer of pedagogical materials for ASL instruction in high schools and colleges, which is developing the first multimedia ASL dictionary with video-based ASL definitions for signs. The lookup interface will be used experimentally to search the ASL dictionary in ASL classes at Boston University and RIT. Project outcomes will revolutionize how deaf children, students learning ASL, or families with deaf children search ASL dictionaries. They will accelerate research on ASL linguistics and technology, by increasing efficiency, accuracy, and consistency of annotations of ASL videos through video-based sign lookup. And they will lay the groundwork for future technologies to benefit deaf users, such as search by video example through ASL video collections, or ASL-to-English translation, for which sign-recognition is a precursor. The new linguistically annotated video data and software tools will be shared publicly, for use by others in linguistic and computer science research, as well as in education.

This research will strategically combine state-of-the-art computer vision, machine-learning methods, and linguistic modeling. It will leverage the team's existing publicly shared ASL corpora and Sign Bank - linguistically annotated and categorized video recordings produced by native signers - which will be augmented to meet the requirements of this project.

See https://www.nsf.gov/awardsearch/showAward?AWD_ID=1763486 for resulting publications.

XV. NSF Convergence Accelerator [Phase I]--Track D: Data & AI Methods for Modeling Facial Expressions in Language with Applications to Privacy for the Deaf, ASL Education & Linguistic Research

#2040638
September 15, 2020 - May 31, 2022
award to:
Rutgers University:
D. Metaxas, P.I.; M. D'Imperio, co-P.I. ($960,000); with subcontracts to
Boston University: C. Neidle, P.I. ($213,342) and to
RIT: M. Huenerfauth, P.I.

https://www.nsf.gov/awardsearch/showAward?AWD_ID=2040638

The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future.

This award will support development of “Data & AI Methods for Modeling Facial Expressions in Language with Applications to Privacy for the Deaf, ASL Education & Linguistic Research.” Facial expressions and head gestures constitute an essential component of signed languages such as American Sign Language (ASL), which is the primary means of communication for over 500,000 people in the United States and the third most studied "foreign" language in the US. They also play an important role in spoken languages, but this has been much less well studied, in part because of the lack of appropriate analytic tools.

The team of linguists, computer scientists, deaf and hearing experts on ASL, and industry partners will address research and societal challenges through three types of deliverables targeted to diverse user and research communities. They will develop:

1. Tools to facilitate and accelerate research into the role of facial expressions in both signed and spoken languages.

2. An application to help ASL second-language learners produce the facial expressions and head gestures that convey grammatical information in the language. This is one of the most challenging aspects of second language acquisition of ASL.

3. An application to enable ASL users to have private conversations about sensitive topics, by de-identifying the signer in video communications while preserving the essential linguistic information expressed non-manually (through use of 4-dimensional face-tracking algorithms to separate facial geometry from facial movement and expression). This last deliverable addresses a real problem for ASL users who seek private communication in their own language. Obscuring the face is not an option for hiding the signer’s identity, since critical linguistic information expressed non-manually would be lost.

See https://www.nsf.gov/awardsearch/showAward?AWD_ID=2040638 for resulting publications.

XVI. CHS: Medium: Collaborative Research: Linguistically-Driven Sign Recognition from Continuous Signing for American Sign Language (ASL)

#2212302, 2212301, 2212303
August 15, 2022 - July 31, 2025
Boston University:
C. Neidle, P.I. ($405,106)
Rutgers University: D. Metaxas, P.I. ($628,966)
RIT: M. Huenerfauth, P.I. ($15,014)

https://nsf.gov/awardsearch/showAward?AWD_ID=2212302

There is currently no method to segment and recognize signs from videos of continuous signing. In this project, computer-based sign recognition techniques developed for American Sign Language (ASL) will be extended from isolated, citation-form signs to signs in sentences. Sign recognition from continuous signing is a challenging task, in part because the articulation of one sign is often affected by that of neighboring signs. Much of the recent deep-learning research has focused on short videos of continuous signing using an “unsegmented” approach, in which the words in a video are identified without detection of start and end times for each sign. However, ASL utterances typically contain signs of different types (e.g., lexical, fingerspelled, classifier constructions) that have significantly different internal structures, requiring distinct recognition strategies. Thus, “segmented” recognition to identify video sub-durations containing distinct sign types is necessary for a complete recognition architecture. Segmented recognition would also enable automatic time-stamped annotation of ASL videos to produce training data for AI research, as well as powerful tools to provide ASL learners (who often have trouble parsing continuous signing) with a word-segmented analysis of videos. The new methods and all data collected for this project will be shared publicly, facilitating new research in computer vision, graphics, HCI, ASL, linguistics, and other related sciences. The research will also pave the way for a wide variety of technologies to improve communication between deaf and hearing individuals, such as: ASL-to-English translation (for which sign recognition is a precursor); educational applications to support ASL learners; and Google-like sign search by example over videos on the Web. Additional broad impact will derive from project outcomes because these same technologies can be applied to other signed languages, and because the new methods will be incorporated into educational programs in the PIs’ institutions. Beyond these societal impacts and benefits for research and education, the PIs will continue their long tradition of recruiting students who are deaf or hard-of-hearing, as well as members of other underrepresented groups, to participate in this research.

This project will develop a novel end-to-end machine learning approach with the following key components: (1) segmentation of regions from continuous signing that contain distinct types of ASL signs (based on 2D skeleton data extracted from a window of a specific number of frames, using AlphaPose); (2) segmentation of the individual signs within those regions (using a spatiotemporal GCN approach); and (3) recognition of the segmented signs (using a transformer for sign classification). The recognition of segmented signs will leverage linguistic constraints applicable to the recognized sign type, with a focus in this project on lexical signs, and will also consider coarticulation effects. To these ends, the research will build on the PIs’ large, publicly shared, linguistically annotated, video corpora of isolated signs and continuous signing. This approach, with explicit sign type detection, sign segmentation, and type-specific recognition strategies for segmented signs, is essential for successful continuous sign recognition at scale and for a wide range of applications.
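
The following PyTorch sketch is offered only to illustrate the general shape of such a segment-then-classify pipeline; it assumes 2D skeleton keypoints have already been extracted (e.g., by a pose estimator such as AlphaPose, which is not invoked here), and the module structure, dimensions, and toy thresholding rule are hypothetical rather than the project's actual architecture:

import torch
import torch.nn as nn

NUM_JOINTS = 25          # assumed number of 2D skeleton keypoints per frame
FEAT_PER_JOINT = 2       # (x, y) coordinates
NUM_SIGN_CLASSES = 1000  # hypothetical lexical-sign vocabulary size


class BoundaryScorer(nn.Module):
    """Scores each frame for how likely it is to be a sign boundary, from a
    short temporal window of skeleton features (a stand-in for the
    spatiotemporal GCN segmentation stage described above)."""
    def __init__(self, window: int = 9):
        super().__init__()
        d = NUM_JOINTS * FEAT_PER_JOINT
        self.temporal = nn.Conv1d(d, 64, kernel_size=window, padding=window // 2)
        self.head = nn.Conv1d(64, 1, kernel_size=1)

    def forward(self, skel):                          # skel: (T, NUM_JOINTS, 2)
        x = skel.flatten(1).T.unsqueeze(0)            # (1, d, T)
        return self.head(torch.relu(self.temporal(x))).squeeze()  # (T,)


class SignClassifier(nn.Module):
    """Transformer encoder over the frames of one segmented sign."""
    def __init__(self):
        super().__init__()
        d = NUM_JOINTS * FEAT_PER_JOINT
        self.proj = nn.Linear(d, 128)
        layer = nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(128, NUM_SIGN_CLASSES)

    def forward(self, skel):                          # skel: (T_seg, NUM_JOINTS, 2)
        x = self.proj(skel.flatten(1)).unsqueeze(0)   # (1, T_seg, 128)
        return self.out(self.encoder(x).mean(dim=1))  # (1, NUM_SIGN_CLASSES)


if __name__ == "__main__":
    video = torch.randn(120, NUM_JOINTS, FEAT_PER_JOINT)   # 120 frames of toy data
    scores = BoundaryScorer()(video)
    # Toy rule: treat unusually high-scoring frames as segment boundaries.
    cuts = (scores > scores.mean() + scores.std()).nonzero().flatten().tolist()
    classifier = SignClassifier()
    for start, end in zip([0] + cuts, cuts + [len(scores)]):
        if end - start > 1:                               # skip degenerate segments
            logits = classifier(video[start:end])
            print(f"frames {start}-{end}: predicted sign class {logits.argmax().item()}")

In the actual project, the segmentation and recognition stages would of course be trained on the annotated corpora and would incorporate the type-specific linguistic constraints and coarticulation modeling described above.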

XVII. NSF Convergence Accelerator Track H: AI-based Tools to Enhance Access and Opportunities for the Deaf

#2235405
December 15, 2022 - November 30, 2023

award to:
Rutgers University:
D. Metaxas, P.I. ($750,000); with subcontracts to
Boston University: C. Neidle, P.I. ($180,000) and to
RIT: M. Huenerfauth, P.I.

https://nsf.gov/awardsearch/showAward?AWD_ID=2235405

We propose to develop sustainable, robust AI methods to overcome obstacles to digital communication and information access faced by Deaf and Hard-of-Hearing (DHH) individuals, empowering them personally and professionally. Users of American Sign Language (ASL), which has no standard written form, lack parity with hearing users in the digital arena. The proposed tools for privacy protection for ASL video communication and video search-by-example for access to multimedia digital resources build on prior NSF-funded AI research on linguistically-informed computer-based analysis and recognition of ASL from videos.

PROBLEM #1. ASL signers cannot communicate anonymously about sensitive topics through videos in their native language; this is perceived by the Deaf community to be a serious problem.

PROBLEM #2. There is no good way to look up a sign in a dictionary. Many ASL dictionaries enable sign look-up based on English translations, but what if the user does not understand the sign, or does not know its English translation? Others allow for search based on properties of ASL signs (e.g., handshape, location, movement type), but this is cumbersome, and a user must often look through hundreds of pictures of signs to find a target sign (if it is present at all in that dictionary).

The tools to be developed will enable signers to anonymize ASL videos while preserving essential linguistic information conveyed by hands, arms, facial expressions, and head movements; and enable searching for a sign based on ASL input from a webcam or a video clip.

Participants include DHH individuals, Deaf-owned companies, and members of other underrepresented minorities. The products will serve the >500,000 US signers and could be extended to other sign languages. The proposed application development brings together state-of-the-art research on: (1) video anonymization (using an asymmetric encoder-decoder structured image generator to generate high-resolution target frames driven by the original signing from the low-resolution source frames for anonymization, based on optical flow and confidence maps); (2) computer-based sign recognition from video (bidirectional skeleton-based isolated sign recognition using Graph Convolution Networks); and (3) HCI, including DHH user studies to assess desiderata for user interfaces for the proposed applications.
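
As a rough illustration of the skeleton-based graph-convolution idea mentioned in component (2), the following sketch applies a single graph-convolution layer over a hypothetical joint graph and classifies a clip of 2D joint positions; the joint graph, layer sizes, and class count are invented, and actual systems (e.g., ST-GCN variants) are considerably more elaborate:

import torch
import torch.nn as nn

# Hypothetical 5-joint "skeleton": a wrist joint connected to four fingertips.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4)]
NUM_JOINTS, NUM_CLASSES = 5, 100


def normalized_adjacency(num_joints, edges):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = torch.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = torch.diag(a.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ a @ d_inv_sqrt


class SkeletonGCNClassifier(nn.Module):
    """One graph-convolution layer applied per frame, followed by average
    pooling over joints and frames and a linear classifier."""
    def __init__(self, in_feats=2, hidden=64):
        super().__init__()
        self.register_buffer("adj", normalized_adjacency(NUM_JOINTS, EDGES))
        self.gc = nn.Linear(in_feats, hidden)        # graph convolution: A_hat X W
        self.out = nn.Linear(hidden, NUM_CLASSES)

    def forward(self, skel):                         # skel: (T, NUM_JOINTS, in_feats)
        h = torch.relu(self.adj @ self.gc(skel))     # propagate features along edges
        return self.out(h.mean(dim=(0, 1)))          # (NUM_CLASSES,)


if __name__ == "__main__":
    clip = torch.randn(30, NUM_JOINTS, 2)            # 30 frames of toy 2D joints
    print(SkeletonGCNClassifier()(clip).argmax().item())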

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.


References cited above, in reverse chronological order; for a more complete list of publications, see http://www.bu.edu/asllrp/talks.html, http://www.bu.edu/asllrp/reports.html, and http://www.bu.edu/asllrp/publications.html. Publications are also listed on the NSF pages with award information.

[1] Nicholas Michael, Peng Yang, Dimitris Metaxas, and Carol Neidle, A Framework for the Recognition of Nonmanual Markers in Segmented Sequences of American Sign Language, British Machine Vision Conference 2011, Dundee, Scotland, August 31, 2011.

[2] Ashwin Thangali, Joan P. Nash, Stan Sclaroff and Carol Neidle, Exploiting Phonological Constraints for Handshape Inference in ASL Video, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.

[3] Haijing Wang, Alexandra Stefan, Sajjad Moradi, Vassilis Athitsos, Carol Neidle, and Farhad Kamangar, A System for Large Vocabulary Sign Search. International Workshop on Sign, Gesture, and Activity (SGA) 2010, in conjunction with ECCV 2010. September 11, 2010. Hersonissos, Heraklion, Crete, Greece.

[4] Nicholas Michael, Carol Neidle, and Dimitris Metaxas, Computer-based recognition of facial expressions in ASL: from face tracking to linguistic interpretation. 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, LREC 2010, May 22-23, 2010.

[5] Vassilis Athitsos, Carol Neidle, Stan Sclaroff, Joan Nash, Alexandra Stefan, Ashwin Thangali, Haijing Wang, and Quan Yuan, Large Lexicon Project: American Sign Language Video Corpus and Sign Language Indexing/Retrieval Algorithms. 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, LREC 2010, May 22-23, 2010.

[6] Nicholas Michael, Dimitris Metaxas, and Carol Neidle, Spatial and Temporal Pyramids for Grammatical Expression Recognition of American Sign Language. Eleventh International ACM SIGACCESS Conference on Computers and Accessibility. Philadelphia, PA, October 26-28, 2009.

[7] Carol Neidle, Nicholas Michael, Joan Nash, and Dimitris Metaxas, A Method for Recognition of Grammatically Significant Head Movements and Facial Expressions, Developed Through Use of a Linguistically Annotated Video Corpus. Workshop on Formal Approaches to Sign Languages, held as part of the 21st European Summer School in Logic, Language and Information, Bordeaux, France, July 20-31, 2009.

[8] V. Athitsos, C. Neidle, S. Sclaroff, J. Nash, A. Stefan, Q. Yuan, & A. Thangali, The ASL Lexicon Video Dataset. First IEEE Workshop on CVPR for Human Communicative Behavior Analysis. Anchorage, Alaska, June 28, 2008.

[9] D. Metaxas, M. Dilsizian, & C. Neidle, Linguistically-driven Framework for Computationally Efficient and Scalable Sign Recognition. Proceedings of LREC 2018. Miyazaki, Japan, May 2018.

[10] C. Neidle, A. Opoku, G. Dimitriadis, & D. Metaxas, NEW Shared & Interconnected ASL Resources: SignStream® 3 Software; DAI 2 for Web Access to Linguistically Annotated Video Corpora; and a Sign Bank. 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community. pp. 147-154. LREC 2018. Miyazaki, Japan, May 2018.

[11] D. Metaxas, M. Dilsizian, and C. Neidle, Scalable ASL Recognition using Model-based Machine Learning and Linguistically Annotated Corpora. 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community. pp. 127-132. LREC 2018. Miyazaki, Japan, May 2018.
