|
NIST TAC Belief and Sentiment (BeSt) Track
Data Sets
There are three types of data files:
1. The 2016 Belief and Sentiment training data. This is data in best format and includes discussion forum and newswire data. There are separate releases for the three languages (English, Chinese, Spanish). The releases include separate source, ERE, and best files.
2. The 2016 Belief and Sentiment test data. This is data in best format and includes discussion forum and newswire data. There is a single release for all three languages. The release includes separate source, ERE, and best files for each language.
3. Data tagged for committed belief. This is a different type of belief annotation (target only; the source is only ever the author). This may be useful for training belief taggers. The content and format are explained in more detail in the data releases.
Note that participants in the 2017 eval are free to choose how they use the existing data; there is no need to treat the 2016 Belief and Sentiment test data differently from the 2016 Belief and Sentiment training data in preparing for the 2017 evaluation. Details are as follows.
- ALL LANGUAGES
  2016 Eval data: LDC2016E114 (TAC KBP 2016 Belief and Sentiment Evaluation Gold Standard Annotation V2)
  Committed belief annotation: LDC2014E125 (DEFT Committed Belief Annotation Self-Evaluation Package)
- ENGLISH
  2016 training: LDC2016E27 (DEFT English Belief and Sentiment Annotation V2)
  - 2016 training: 236 documents, 165k words
  - 2016 eval: 165 documents, 100k words
  Committed belief annotation: LDC2014E55 (DEFT Committed Belief Annotation R1 V1.1) and LDC2014E106 (DEFT Committed Belief Annotation R2)
- CHINESE
  2016 training: LDC2016E61 (DEFT Chinese Belief and Sentiment Annotation)
  - 2016 training: 180 documents
  - 2016 eval: 159 documents
  Committed belief annotation: LDC2015E99 (DEFT Chinese Committed Belief Annotation)
- SPANISH
  2016 training: LDC2016E62 (DEFT Spanish Belief and Sentiment Annotation)
  - 2016 training: 90 documents, 82k words
  - 2016 eval: 168 documents, 85k words
  Committed belief annotation: LDC2016E40 (DEFT Spanish Committed Belief Annotation)
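
Since the 2016 training and test data may be pooled when preparing for the 2017 evaluation, it can help to see how much material each language offers in total. The following is a minimal sketch in Python that only restates the document and word counts listed above. The names `BEST_2016_COUNTS`, `pooled_documents`, and `pooled_words` are illustrative and are not part of any release. The word counts are the rounded "k" figures from the listing, and no word counts are given for Chinese.

```python
# Document and word counts copied from the per-language listing above.
# Word counts are the rounded figures given there (e.g. "165k" -> 165_000).
# None marks a figure the listing does not provide (Chinese word counts).
BEST_2016_COUNTS = {
    "English": {"training": (236, 165_000), "eval": (165, 100_000)},
    "Chinese": {"training": (180, None), "eval": (159, None)},
    "Spanish": {"training": (90, 82_000), "eval": (168, 85_000)},
}


def pooled_documents(language):
    """Total documents if the 2016 training and eval sets are pooled."""
    return sum(docs for docs, _ in BEST_2016_COUNTS[language].values())


def pooled_words(language):
    """Approximate pooled word count, or None if the listing omits it."""
    words = [w for _, w in BEST_2016_COUNTS[language].values()]
    if any(w is None for w in words):
        return None
    return sum(words)


if __name__ == "__main__":
    for language in BEST_2016_COUNTS:
        print(language, pooled_documents(language), pooled_words(language))
```

Pooling this way treats the two 2016 sets as one resource per language; how to split the pooled data into training and development portions is left to each participant.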
|
|