NEXT: An Event Schema Extension Approach for Closed-Domain Event Extraction Models

Researchers from our project partner Ontotext, together with the Faculty of Mathematics and Informatics at Sofia University in Bulgaria, have published a paper titled "NEXT: An Event Schema Extension Approach for Closed-Domain Event Extraction Models". The paper presents an approach to extending event extraction models so that they accommodate new event types with minimal resources. This is particularly relevant in domains like disinformation, where emerging topics require flexible adaptation of event schemas.

The paper was published in the Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (pp. 113-123). Its primary goal is to address the challenge of extending closed-domain event extraction models to new event types when annotated data and computational resources are limited. The authors introduce NEXT (New Event eXTraction), a methodology that extends an existing model to novel event types by fine-tuning it on a small number of annotated samples. The approach is tested on a dataset of fake news debunks, highlighting its potential for domains characterized by rapidly evolving topics.

Key elements of the paper

1. Efficiency in Resource Utilization: NEXT requires a very small set of annotated examples (as few as a dozen) to fine-tune the model for a new event type, demonstrating a high degree of efficiency in utilizing resources.

2. Improvement in Model Performance: The application of NEXT not only enables the extraction of new event types but also enhances the model's performance on existing event types. This dual benefit is achieved without significant additional computational costs, as the methodology operates effectively on a single GPU.

3. Precision and Recall Trade-offs: The findings show a trade-off between precision and recall as the number of annotated samples used for fine-tuning varies. Precision can be maintained with a minimal set of samples, but achieving high recall requires a larger set of annotated samples.

4. Model Overfitting Concerns: Overfitting emerges as a potential concern when fine-tuning the model for an excessive number of epochs. However, this issue can be mitigated through strategies like pre-filtering input data or employing a voting mechanism among multiple fine-tuned models.
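
To make the voting idea in point 4 concrete, here is a minimal sketch in Python. It is not the authors' implementation: the representation of an event as an (event type, trigger) pair, the function name vote_on_events, the min_votes parameter, and the example event types are all assumptions made for illustration. The sketch simply keeps an extracted event only when enough independently fine-tuned models agree on it.

```python
from collections import Counter
from typing import List, Set, Tuple

# Assumption for illustration: an extracted event is an (event_type, trigger) pair.
Event = Tuple[str, str]


def vote_on_events(predictions: List[Set[Event]], min_votes: int) -> Set[Event]:
    """Keep only the events predicted by at least `min_votes` models.

    Each element of `predictions` is the set of events one model extracted
    from the same input text, so every model contributes at most one vote
    per event.
    """
    counts = Counter(
        event for model_events in predictions for event in model_events
    )
    return {event for event, votes in counts.items() if votes >= min_votes}


# Hypothetical outputs of three separately fine-tuned models for one sentence.
model_outputs = [
    {("Debunk", "refuted"), ("Attack", "strike")},
    {("Debunk", "refuted")},
    {("Debunk", "refuted"), ("Protest", "rally")},
]

# Require agreement from at least two of the three models.
agreed = vote_on_events(model_outputs, min_votes=2)
print(agreed)
```

In a scheme like this, raising min_votes would be expected to discard more spurious extractions from any single overfitted model, at the cost of also dropping correct events that only some models find. How the vote is actually configured in NEXT is described in the paper itself.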

"NEXT: An Event Schema Extension Approach for Closed-Domain Event Extraction Models" makes a significant contribution to the field of natural language processing by addressing a critical challenge in event extraction with an innovative, resource-efficient solution.

The NEXT approach presents a promising avenue for enhancing the adaptability and efficiency of event extraction models, particularly in domains where new events frequently emerge. The methodology's ability to improve recall for existing event types while incorporating new ones offers substantial benefits for applications in real-time information monitoring and analysis.

Future work could explore the application of NEXT in other domains beyond disinformation, assess its effectiveness against open-domain approaches, and investigate further optimizations to balance precision and recall. Additionally, expanding the annotated dataset and comparing NEXT with alternative methodologies could provide deeper insights into its relative strengths and areas for improvement.
