[visionlist] The 6th Workshop on Vision and Language (VL'17): First Call for Papers
Posted: November 10, 2016
The 6th Workshop on Vision and Language (VL’17)
At EACL 2017 in Valencia, Spain

First Call for Papers

Computational vision-language integration is commonly taken to mean the process of associating visual and corresponding linguistic pieces of information. Fragments of natural language, in the form of tags, captions, subtitles, surrounding text or audio, can aid the interpretation of image and video data by adding context or disambiguating visual appearance. Labelled images are essential for training object or activity classifiers. Visual data can help resolve challenges in language processing such as word sense disambiguation, language understanding, machine translation and speech recognition. Sign languages and gestures are forms of language that require visual interpretation. Studying language and vision together can also provide new insight into cognition and universal representations of knowledge and meaning, and the focus of researchers in these areas is increasingly turning towards models for grounding language in action and perception. There is growing interest in models that are capable of learning from, and exploiting, multi-modal data, i.e. in constructing semantic representations from both linguistic and visual or perceptual input.

The 6th Workshop on Vision and Language (VL'17) aims to address all of the above, with a particular focus on the integrated modelling of vision and language. We welcome papers describing original research combining language and vision.
To encourage the sharing of novel and emerging ideas, we also welcome papers describing new datasets, grand challenges, open problems, benchmarks and work in progress, as well as survey papers.

Topics of interest include (in alphabetical order), but are not limited to:

* Computational modelling of human vision and language
* Computer graphics generation from text
* Cross-lingual image captioning
* Detection/segmentation by referring expressions
* Human-computer interaction in virtual worlds
* Human-robot interaction
* Image and video description and summarisation
* Image and video labelling and annotation
* Image and video retrieval
* Language-driven animation
* Machine translation with visual enhancement
* Medical image processing
* Models of distributional semantics involving vision and language
* Multi-modal discourse analysis
* Multi-modal human-computer communication
* Multi-modal machine translation
* Multi-modal temporal and spatial semantics recognition and resolution
* Recognition of narratives in text and video
* Recognition of semantic roles and frames in text, images and video
* Retrieval models across different modalities
* Text-to-image generation
* Visual question answering / visual Turing challenge
* Visually grounded language understanding
* Visual storytelling

Accepted technical submissions will be presented at the workshop as 20+5 minute oral presentations; poster submissions will be presented in the form of brief 'teaser' presentations, followed by a poster presentation during the workshop poster session. Authors of longer technical papers will have the option of additionally presenting their work in poster form.

Paper Submission

Submissions should be up to 8 pages long plus references for long papers, and 4 pages long plus references for poster papers.
Submissions should adhere to the EACL 2017 format (style files available at http://eacl2017.org/index.php/calls/call-for-papers) and should be in PDF format. Please make your submission via the workshop submission pages: a link will be provided in the second call.

Important Dates

Nov 10, 2016: First Call for Workshop Papers
Dec 9, 2016: Second Call for Workshop Papers
Jan 16, 2017: Workshop Paper Due Date
Feb 11, 2017: Notification of Acceptance
Feb 21, 2017: Camera-ready Papers Due
April 4, 2017: VL'17 Workshop

Programme Committee

Raffaella Bernardi, University of Trento, Italy
Darren Cosker, University of Bath, UK
Aykut Erdem, Hacettepe University, Turkey
Jacob Goldberger, Bar Ilan University, Israel
Jordi Gonzalez, Autonomous University of Barcelona, Spain
Frank Keller, University of Edinburgh, UK
Douwe Kiela, University of Cambridge, UK
Adrian Muscat, University of Malta, Malta
Arnau Ramisa, IRI UPC Barcelona, Spain
Carina Silberer, University of Edinburgh, UK
Caroline Sporleder, Germany
Josiah Wang, University of Sheffield, UK
Further members t.b.c.

Organisers

Anya Belz, University of Brighton, UK
Katerina Pastra, Cognitive Systems Research Institute (CSRI), Athens, Greece
Erkut Erdem, Hacettepe University, Turkey
Krystian Mikolajczyk, Imperial College London, UK

Contact

a.email@example.com
http://vision.cs.hacettepe.edu.tr/vl2017/

This workshop is organised by European COST Action IC1307: The European Network on Integrating Vision and Language (iV&L Net).

The explosive growth of visual and textual data (both on the World Wide Web and held in private repositories by diverse institutions and companies) has led to urgent requirements in terms of search, processing and management of digital content. Solutions for providing access to or mining such data depend on bridging the semantic gap between vision and language, which in turn calls for expertise from two so far unconnected fields: Computer Vision (CV) and Natural Language Processing (NLP).
The central goal of iV&L Net is to build a European CV/NLP research community, targeting four focus themes: (i) Integrated Modelling of Vision and Language for CV and NLP Tasks; (ii) Applications of Integrated Models; (iii) Automatic Generation of Image & Video Descriptions; and (iv) Semantic Image & Video Search. iV&L Net will organise annual conferences, technical meetings, partner visits, data/task benchmarking, and industry/end-user liaison. Europe has many of the world's leading CV and NLP researchers. By tapping into this expertise, and bringing to bear the collaboration, networking and community building enabled by COST Actions, iV&L Net will have substantial impact, in terms of advances in both theory/methodology and real-world technologies.