ARQMath aims to advance math-aware search and the semantic analysis of
mathematical notation and texts.
Task 1: Answer Retrieval.
Given a math question post, return relevant answer posts.
Task 2: Formula Retrieval.
Given a formula in a math question post, return relevant formulas from both
question and answer posts.
Pilot Task 1: Open Domain Question Answering.
Given a math question post, return an automatically generated answer composed
of excerpts from arbitrary sources and/or machine-generated text.
(approval pending)
BioASQ: Large-scale biomedical semantic indexing and question answering
The aim of the BioASQ Lab is to push the research frontier towards systems that
use the diverse and voluminous information available online to respond directly
to the information needs of biomedical scientists.
Task 1: Large-Scale Online Biomedical Semantic Indexing.
Classify new PubMed documents into classes from the MeSH hierarchy, before
PubMed curators annotate (in effect, classify) them manually.
Task 2: Biomedical Semantic Question Answering.
It uses benchmark datasets of biomedical questions, in English, along
with gold standard (reference) answers constructed by a team of biomedical
experts. Participants must respond with relevant articles and snippets from
designated resources, as well as with exact and "ideal" answers.
Task 3 - DisTEMIST: Disease Text Mining and Indexing Shared
Task.
It focuses on the recognition and indexing of diseases in medical
documents in Spanish, by posing subtasks on
(1) indexing medical documents with controlled terminologies;
(2) automatically detecting textual evidence for indexing (i.e., disease
entity mentions in text); and
(3) normalization of these disease mentions to terminologies.
Task 4 - Task Synergy: Question Answering for developing
problems.
Biomedical experts pose unanswered questions for the developing problem of
COVID-19, receive the responses provided by the participating systems, and
provide feedback, together with updated questions in an iterative procedure
that aims to facilitate the incremental understanding of COVID-19.
CheckThat! lab on Fighting the COVID-19 Infodemic and Fake News Detection
The CheckThat! lab aims at fighting misinformation and disinformation in social
media, in political debates and in the news, with focus on three tasks (in seven
languages: Arabic, Bulgarian, Dutch, English, German, Spanish, and Turkish).
Task 1: Fighting the COVID-19 Infodemic.
It focuses on disinformation related to the ongoing COVID-19 infodemic and
asks systems to identify which posts in a Twitter stream are worth
fact-checking, contain a verifiable factual claim, are harmful to society,
and why.
This task is offered in Arabic, Bulgarian, Dutch, English, Spanish, and
Turkish.
Task 2: Detecting Previously Fact-Checked Claims.
Given a check-worthy claim and a collection of previously fact-checked
claims, determine whether the claim has already been fact-checked. The input
text can be a tweet or a sentence from a political debate. The task is
offered in Arabic and English.
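A simple baseline for this matching problem is to rank the previously fact-checked claims by textual similarity to the input claim. The sketch below uses a plain bag-of-words cosine similarity over hypothetical example claims; real systems use far stronger retrieval models.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_verified_claims(query: str, verified: list[str]) -> list[str]:
    # Rank previously fact-checked claims by similarity to the query claim.
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(v.lower().split())), v) for v in verified]
    return [v for _, v in sorted(scored, key=lambda x: -x[0])]

# Hypothetical example claims, for illustration only.
verified = [
    "drinking bleach cures covid-19",
    "the eiffel tower is in berlin",
]
ranked = rank_verified_claims("bleach can cure covid-19", verified)
```

In practice the lexical overlap step would be replaced by a learned semantic matcher, since claims are rarely restated verbatim.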
Task 3: Fake news detection.
Given the text and the title of a news article, determine whether the main
claim made in the article is true, partially true, false, or other (e.g.,
articles in dispute and unproven articles). This task is offered in English
and German.
ChEMU: Cheminformatics Elsevier Melbourne University lab
The ChEMU lab series provides a unique opportunity for the development of
information extraction tools over chemical patents. ChEMU 2022 focuses on
information extraction in chemical patents, including five tasks ranging from
document- to expression-level.
Task 1a: Named entity recognition.
This task aims to identify chemical compounds, their specific types,
temperatures, reaction times, yields, and the label of the reaction.
Task 1b: Event extraction.
A chemical reaction leading to an end product often consists of a sequence
of individual event steps. The task is to identify those steps which
involve chemical entities recognized from Task 1a.
Task 1c: Anaphora resolution.
It requires the resolution of anaphoric dependencies between expressions in
chemical patents. The participants are required to find five types of
anaphoric relationships in chemical patents: coreference,
reaction-associated, work-up, contained, and transform.
Task 2a: Chemical reaction reference resolution.
Given a reaction description, this task requires identifying references to
other reactions that the reaction relates to, and to the general conditions
that it depends on.
Task 2b: Table semantic classification.
This task is about classifying tables in chemical patents into 8 categories
based on their contents.
eRisk explores the evaluation methodology, effectiveness metrics, and practical
applications of early risk detection on the Internet. Early detection
technologies can be employed in different areas, particularly those related to
health and safety. For instance, early alerts could be sent when a predator
starts interacting with a child for sexual purposes, or when a potential
offender starts publishing antisocial threats on a blog, forum, or social
network. Our main goal is to pioneer a new interdisciplinary research area that
would be potentially applicable to a wide variety of situations and to many
different personal profiles. Examples include potential paedophiles, stalkers,
individuals that could fall into the hands of criminal organisations, people
with suicidal inclinations, or people susceptible to depression.
Task 1: Early Detection of Signs of Pathological Gambling.
The challenge consists of sequentially processing pieces of evidence and
detecting early traces of pathological gambling (also known as compulsive
or disordered gambling) as soon as possible. The task is mainly concerned
with evaluating text mining solutions and thus concentrates on texts written
on social media.
Task 2: Early Detection of Depression.
The challenge consists of sequentially processing pieces of evidence and
detecting early traces of depression as soon as possible. The task is mainly
concerned with evaluating text mining solutions and thus concentrates on
texts written on social media.
Task 3: Measuring the Severity of the Signs of Eating Disorders.
The task consists of estimating the level of features associated with a
diagnosis of eating disorders from a thread of user submissions. Each
participant will be given a user's history of postings and will have to fill
in a standard eating disorder questionnaire based on the evidence found in
that history.
HIPE - Named Entity Recognition and Linking in Multilingual Historical
Documents
HIPE ('Identifying Historical People, Places and other Entities') focuses on
named entity recognition and linking in historical documents, with the objective
of assessing and advancing the development of robust, adaptable, and
transferable named entity processing systems. Compared to the first HIPE edition
in 2020, HIPE 2022 will confront systems with the challenges of dealing with
more languages, learning domain-specific entities, and adapting to diverse
annotation schemas.
Task 1: Named Entity Recognition and Classification (NERC).
With two subtasks: NERC-coarse, on high-level entity types, for all
languages; and NERC-fine, on finer-grained entity types, for English,
French, and German only.
Task 2: Named Entity Linking (EL).
Or the linking of named entity mentions to a unique referent in a knowledge
base (Wikidata) or to a NIL node if the mention does not have a referent in
the KB.
Amyotrophic Lateral Sclerosis (ALS) is a severe chronic disease characterized
by progressive or alternating impairment of neurological functions, with high
heterogeneity in both symptoms and disease progression. The goal of
iDPP is to design and develop an evaluation infrastructure for AI
algorithms able to:
(1) better describe disease mechanisms;
(2) stratify patients according to their phenotype assessed all over the disease
evolution; and
(3) predict disease progression in a probabilistic, time dependent fashion.
Task 1: Ranking Risk of Impairment.
This task will focus on ranking of patients based on the risk of impairment
in specific domains. We will use the ALSFRS-R scale to monitor speech,
swallowing, handwriting, dressing/hygiene, walking, and respiratory ability
over time, and will ask participants to rank patients by time-to-event risk
of experiencing impairment in each specific domain.
Task 2: Predicting Time of Impairment.
This task will refine Task 1 asking participants to predict when specific
impairments will occur (i.e. in the correct time-window). We will assess
model calibration in terms of the ability of the proposed algorithms to
estimate a probability of an event close to the true probability within a
specified time-window.
Task 3: Explainability of AI algorithms [Position Papers].
This task will call for position papers to start a discussion on AI
explainability, including proposals on how single-patient data can be
visualized in a multivariate fashion, contextualizing its dynamic nature and
presenting the model predictions together with information on the variables
that most influence the prediction. We will evaluate proposals for
visualization frameworks able to show the multivariate nature of the data
and the model predictions in an explainable, and possibly interactive, way.
ImageCLEF is set to promote the evaluation of technologies for annotation,
indexing, classification and retrieval of multi-modal data, with the objective
of providing information access to large collections of images in various usage
scenarios and domains. ImageCLEF 2022 focuses on medical, nature, Internet, and
system fusion applications.
Task 1: ImageCLEFmedical
The caption task focuses on interpreting and summarizing the insights gained
from radiology images, i.e., developing systems that are able to predict
UMLS concepts from visual image content, and implementing models to predict
captions for given radiology images. The tuberculosis task fosters systems
that detect and localize cavern regions rather than simply providing a label
for the CT images.
Task 2: ImageCLEFcoral
It fosters tools for creating 3-dimensional models of underwater coral
environments. It requires participants to label coral underwater images with
types of benthic substrate together with their bounding box, and to segment
and parse each coral image into different image regions associated with
benthic substrate types.
Task 3: ImageCLEFaware
The online disclosure of personal data often has effects which go beyond the
initial context in which data were shared. Participants are required to
provide automatic rankings of photographic user profiles in a series of
real-life situations such as searching for a bank loan, an accommodation, a
waiter job or a job in IT. The ranking will be based on an automatic
analysis of profile images and the aggregation of individual results.
Task 4: ImageCLEFfusion
System fusion exploits the complementary nature of individual systems to
boost performance. Participants will be tasked with creating
novel ensembling methods that are able to significantly increase the
performance of precomputed inducers in various use-case scenarios, such as
visual interestingness and video memorability prediction.
JokeR: Automatic Wordplay and Humour Translation
Workshop
The goal of the JOKER workshop is to bring together translators and computer
scientists to work on an evaluation framework for creative language, including
data and metric development, and to foster work on automatic methods for
wordplay translation.
Pilot task 1: Classify and interpret wordplay.
Classify single words containing wordplay according to a given typology,
and provide lexical-semantic interpretations.
Pilot task 2: Translate single term wordplay.
Translate single words containing wordplay.
Pilot task 3: Translate phrase wordplay.
Translate entire phrases that subsume or contain wordplay.
Task 4: Unshared Task.
We welcome submissions that use our data in other ways!
The aim of LeQua 2022 (the 1st edition of the lab) is to allow the comparative
evaluation of methods for “learning to quantify” in textual datasets; i.e.
methods for training predictors of the relative frequencies of the classes of
interest in sets of unlabelled textual documents. These predictors (called
“quantifiers”) will be required to issue predictions for several such sets, some
of them characterized by class frequencies radically different from the ones of
the training set.
Task 1:
Participants will be provided with documents already converted into vector
form; the task is thus suitable for participants who do not wish to engage
in generating representations for the textual documents, but want instead to
concentrate on optimizing the methods for learning to quantify.
Task 2:
Participants will be provided with the raw text of the documents; the task
is thus suitable for participants who also wish to engage in generating
suitable representations for the textual documents, or to train end-to-end
systems.
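The simplest quantification baseline, "classify and count", illustrates what a quantifier must output: run a classifier over the unlabelled set and report the observed label proportions. The sketch below uses a toy keyword "classifier"; all names and data are illustrative assumptions, and LeQua baselines are considerably more sophisticated (e.g., they correct for classifier bias).

```python
from collections import Counter

def classify_and_count(docs, classifier):
    # Classify-and-count baseline: the predicted class prevalence is the
    # relative frequency of each predicted label in the unlabelled set.
    labels = [classifier(d) for d in docs]
    counts = Counter(labels)
    return {label: n / len(docs) for label, n in counts.items()}

# Toy sentiment "classifier", for illustration only.
toy_classifier = lambda text: "pos" if "good" in text else "neg"

docs = ["good movie", "bad plot", "good acting", "boring"]
prevalence = classify_and_count(docs, toy_classifier)
```

Because the test sets are drawn with class frequencies that can differ radically from the training set, plain classify-and-count is biased, which is precisely what dedicated quantification methods try to correct.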
PAN is a series of scientific events and shared tasks on digital text forensics
and stylometry, studying how to quantify writing style and improve authorship
technology.
Task 1: Authorship Verification.
Given two texts, determine if they are written by the same author.
Task 2 - IROSTEREO: Profiling Irony and Stereotype Spreaders on
Twitter.
Given a Twitter feed, determine whether its author spreads irony and
stereotypes.
Task 3: Style Change Detection.
Given a document, determine the number of authors and at which positions the
author changes.
Task 4: Trigger Warning Prediction.
Given a document, determine whether its content warrants a warning of
potential negative emotional responses in readers.
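As a toy illustration of the verification setting in Task 1 (not a competitive method), one can compare character-bigram frequency profiles of the two texts and threshold their cosine similarity; the 0.8 threshold is an arbitrary assumption.

```python
import math
from collections import Counter

def style_similarity(text_a: str, text_b: str) -> float:
    # Cosine similarity of character-bigram frequency profiles,
    # a crude stand-in for a real stylometric model.
    def profile(t: str) -> Counter:
        t = t.lower()
        return Counter(t[i:i + 2] for i in range(len(t) - 1))
    a, b = profile(text_a), profile(text_b)
    dot = sum(a[g] * b[g] for g in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def same_author(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    # Verification decision: similarity above an (assumed) threshold.
    return style_similarity(text_a, text_b) >= threshold
```

Character n-gram profiles are a classic stylometric feature because they capture spelling, punctuation, and morphology at once, though state-of-the-art verifiers learn such representations rather than hand-coding them.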
SimpleText: Automatic Simplification of Scientific Texts
The 2022 SimpleText track addresses the challenges of text simplification
approaches in the context of promoting scientific information access, by
providing appropriate data and benchmarks, and creating a community of NLP and
IR researchers working together to resolve one of the greatest challenges of
today.
Task 1: What is in (or out)?
Select passages to include in a simplified summary, given a query.
Task 2: What is unclear?
Given a passage and a query, rank the terms/concepts that need to be
explained in order to understand the passage (definitions, context,
applications, ...).
Task 3: Rewrite this!
Given a query, simplify passages from scientific abstracts.
Task 4: Unshared task.
We welcome any submission that uses our data!
Touché: Argument Retrieval
Decision making processes, be it at the societal or at the personal level, often
come to a point where one side challenges the other with a why-question, which
is a prompt to justify some stance based on arguments. Since technologies for
argument mining are maturing at a rapid pace, ad hoc argument retrieval is
also coming within reach.
Task 1: Argument Retrieval for Controversial Questions.
Given a controversial topic and a collection of argumentative documents,
the task is to retrieve and rank sentences (the main claim and its most
important premise in the document) that convey key points pertinent to the
controversial topic.
Task 2: Argument Retrieval for Comparative Questions.
Given a comparative topic and a collection of documents, the task is to
retrieve relevant argumentative passages for either of the compared objects,
or for both, and to detect their respective stances with respect to the
objects they discuss.
Task 3: Image Retrieval for Arguments.
Given a controversial topic, the task is to retrieve images (from web pages)
for each stance (pro/con) that show support for that stance.