DataScienceUIBK/TemporalQA-Survey

It's High Time: A Survey of Temporal Question Answering

⏳ Temporal Question Answering & Temporal Information Retrieval — A Comprehensive Survey in the Era of LLMs



📘 Overview

This repository provides a comprehensive, curated collection of research papers, datasets, methods, and resources focused on Temporal Question Answering (TQA) and Temporal Information Retrieval (Temporal IR). It accompanies our survey paper on how AI models reason about time, adapt to evolving knowledge, answer temporally constrained questions, and retrieve time-sensitive information.


✨ Key Contributions

  • 📚 Comprehensive Survey
    Coverage of 27+ datasets and 50+ methods (2003–2025)

  • 🧭 Unified Taxonomy
    Structured view of datasets, tasks, and modeling approaches

  • 🧪 Critical Analysis
    Identifies key limitations in temporal reasoning and retrieval

  • 🚀 Research Roadmap
    Highlights 7 open challenges for future work

⏳ Why Temporal QA Matters

Time fundamentally shapes how we interpret and retrieve information:

  • 🗞️ Retrieval
    "Latest climate policies" vs. "policies from the 1990s"

  • 🧠 Reasoning
    Understanding causality, evolution, and event ordering

  • 💬 Interaction
    Expecting temporally grounded responses from AI systems

  • 🔄 Adaptation
    Handling continuously evolving knowledge and facts


📊 Datasets

📌 Quick Statistics

| Metric | Value |
|---|---|
| 📚 Total Datasets | 27+ |
| ❓ Questions | 2.5M+ |
| 📅 Time Span | 1367 → 2025 |
| 🌍 Domains | News, Web, Knowledge Bases |

🗂️ Dataset Categories

  • 🕰️ Diachronic — Longitudinal corpora over time
  • 📸 Synchronic — Snapshot-based datasets
  • 🌐 Web-based — Real-time evolving data
  • ⚙️ Synthetic — Controlled evaluation
  • 🧩 KG-based — Structured temporal reasoning

Featured Datasets

🗞️ Diachronic Datasets (Time-Stamped Historical Documents)

| Dataset | Year | #Questions | Source | Time Coverage | Answer Type | Links |
|---|---|---|---|---|---|---|
| ArchivalQA | 2022 | 532K | NYT Corpus | 1987-2007 | Extractive | Paper · GitHub |
| ChroniclingAmericaQA | 2024 | 485K | Historical Newspapers | 1800-1920 | Extractive | Paper · GitHub |
| StreamingQA | 2022 | 147K | News Articles | 2007-2020 | Extractive | Paper · GitHub |
| NewsQA | 2017 | 119K | CNN/Daily Mail | 2007-2015 | Freeform | Paper · GitHub |
| TempLAMA | 2022 | 50K | News | 2010-2020 | Extractive | Paper · GitHub |
| TORQUE | 2020 | 21K | News | - | Abstractive | Paper · GitHub |
| ForecastQA | 2021 | 10.3K | News | 2015-2019 | Multiple Choice | Paper · Website |
| TDDiscourse | 2019 | 6.1K | News | Unspecified | Extractive | Paper · GitHub |

📖 Synchronic Datasets (Wikipedia Snapshots)

| Dataset | Year | #Questions | Time Scope | Answer Type | Multi-Hop | Links |
|---|---|---|---|---|---|---|
| ComplexTempQA | 2024 | 100.2K | 1987-2023 | Extractive | | Paper · GitHub |
| TEMPREASON | 2023 | 52.8K | 634-2023 | Abstractive | | Paper · GitHub |
| TimeQA | 2021 | 41.2K | 1367-2018 | Extractive | | Paper · GitHub |
| TemporalAlignmentQA | 2024 | 20K | 2000-2023 | Abstractive | | Paper · GitHub |
| SituatedQA | 2021 | 12.2K | ≤ 2021 | Mixed | | Paper · GitHub |
| TempTabQA | 2023 | 11.4K | Infoboxes | Abstractive | | Paper · Website |
| TiQ | 2024 | 10K | Unspecified | Entities | | Paper · GitHub |
| PAT-Questions | 2024 | 6.1K | Present-anchored | Extractive | | Paper · GitHub |
| TRACIE | 2021 | 5.4K | ≤ 2020 | Abstractive | | Paper · GitHub |
| MenatQA | 2023 | 2.8K | 1367-2018 | Extractive | | Paper · GitHub |

🌐 Web & Real-Time Datasets

| Dataset | Year | #Questions | Source | Update Frequency | Links |
|---|---|---|---|---|---|
| ReaLTimeQA | 2023 | 5.1K | Web Search | Weekly (2020-2024) | Paper · Website |
| FreshQA | 2024 | 600 | Google Search | Periodic | Paper · GitHub |

🧪 Synthetic & Reasoning-Focused Datasets

| Dataset | Year | #Questions | Focus | Links |
|---|---|---|---|---|
| COTEMPQA | 2024 | 4.7K | Co-temporal reasoning | Paper · GitHub |
| UnSeenTimeQA | 2024 | 3.6K | Beyond memorization | Paper · GitHub |
| Test of Time (ToT) | 2024 | 1.8K | Temporal reasoning eval | Paper · GitHub |
| TIMEDIAL | 2021 | 1.1K | Temporal commonsense | Paper · GitHub |

🔧 Methods & Approaches

Evolution Timeline

📅 2003-2010: Rule-Based Era
   └─ TimeML, TERSEO, temporal taggers

📅 2011-2019: Statistical & Early Neural
   └─ Language models, temporal embeddings

📅 2020-2022: Transformer Revolution
   └─ Temporal pretraining, time-aware architectures

📅 2023-2025: LLM & RAG Era
   └─ Retrieval-augmented generation, temporal reasoning

Method Categories

🤖 Temporal Language Models

| Model | Year | Key Innovation | Architecture | Paper | Code |
|---|---|---|---|---|---|
| TempoT5 | 2022 | Temporal conditioning via prefixes | T5 + timestamp prefixes | Paper | GitHub |
| BiTimeBERT | 2023 | Dual temporal encoding (timestamp + content) | BERT + bi-temporal module | Paper | GitHub |
| TempoBERT | 2022 | Time-aware masking strategy | BERT + temporal masking | Paper | GitHub |
| TALM | 2023 | Hierarchical temporal word representations | BERT + temporal adapter | Paper | GitHub |
| SG-TLM | 2023 | Syntax-guided + temporal-aware masking | BERT + dual masking | Paper | GitHub |
| TSM | 2023 | Temporal span masking | T5 + salient span masking | Paper | Contact authors |
| Temporal Attention | 2022 | Time matrix in attention mechanism | Transformer + time matrix | Paper | GitHub |
| TCQA | 2023 | Synthetic QA + span selection | T5-based | Paper | GitHub |
| Time-aware Prompting | 2022 | Temporal prompts for generation | GPT-2 + temporal prompts | Paper | GitHub |

🔍 Temporal RAG Systems

| System | Year | Pipeline Architecture | Temporal Signals | Paper | Code |
|---|---|---|---|---|---|
| TempRetriever | 2025 | Fusion-based dense retrieval | Query + doc timestamps | Paper | Contact authors |
| TimeR4 | 2024 | Retrieve-Rewrite-Retrieve-Rerank | TKG timestamps + constraints | Paper | GitHub |
| MRAG | 2024 | Modular multi-hop framework | Symbolic + semantic temporal scoring | Paper | Contact authors |
| TempRALM | 2024 | Dense retrieval + temporal proximity | Timestamp-based ranking | Paper | Contact authors |
| TsContriever | 2024 | Contrastive time-sensitive retrieval | Time-aware embeddings | Paper | GitHub |
| FreshLLMs | 2024 | Search augmentation for recency | Web search integration | Paper | GitHub |

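
Several systems in this table combine a semantic relevance score with a timestamp-based signal. The sketch below illustrates that general idea only; it is not the implementation of any listed system. The `Doc` class, the `temporal_score` decay function, and the `alpha` weight are all hypothetical choices made for illustration, and the semantic scores are assumed to come from some dense retriever.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    text: str
    published: date
    semantic_score: float  # assumed output of any dense retriever, in [0, 1]


def temporal_score(query_date: date, doc_date: date, scale_days: float = 365.0) -> float:
    """Map the gap between query time and document time to (0, 1]."""
    gap = abs((query_date - doc_date).days)
    return 1.0 / (1.0 + gap / scale_days)


def rerank(docs: list[Doc], query_date: date, alpha: float = 0.5) -> list[Doc]:
    """Sort by a weighted sum of semantic and temporal scores (alpha is hypothetical)."""
    def fused(d: Doc) -> float:
        return (1 - alpha) * d.semantic_score + alpha * temporal_score(query_date, d.published)
    return sorted(docs, key=fused, reverse=True)


docs = [
    Doc("Climate policy overview", date(1995, 6, 1), 0.80),
    Doc("New climate policy announced", date(2024, 3, 1), 0.75),
]
ranked = rerank(docs, query_date=date(2024, 6, 1))
print([d.text for d in ranked])
```

With a query anchored in mid-2024, the more recent document ranks first even though its semantic score is slightly lower. Setting `alpha=0` recovers plain semantic ranking.
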
🧠 Temporal Reasoning Methods

| Method | Year | Reasoning Type | Key Contribution | Paper | Code |
|---|---|---|---|---|---|
| ECONET | 2021 | Continual adaptation | Event consistency across updates | Paper | GitHub |
| ConTempo | 2024 | Contrastive temporal relations | Unified temporal relation extraction | Paper | GitHub |
| TIMERS | 2021 | Document-level relations | Structured inference layers | Paper | GitHub |
| TRAM | 2024 | Multi-dimensional reasoning | Event frequency, duration, ordering | Paper | GitHub |
| TODAY | 2023 | Differential analysis | Temporal robustness testing | Paper | GitHub |
| Narrative-of-Thought | 2024 | Narrative-based reasoning | Recounted narratives for coherence | Paper | GitHub |

📜 Classical Methods (Rule-Based & Statistical)

| Era | Methods | Key Papers |
|---|---|---|
| Rule-Based | TimeML, TERSEO, temporal taggers | Harabagiu & Bejan, 2005; Saquete et al., 2004; Saquete et al., 2004 |
| Statistical IR | Time-based language models, temporal ranking | Li & Croft, 2003; Berberich et al., 2010; Arikan et al., 2009; Alonso et al., 2007 |

📚 Complete historical overview →


📖 Temporal Tasks

Core temporal prediction tasks supporting TQA systems:

| Task | Input | Output | Key Applications | Representative Papers |
|---|---|---|---|---|
| Event Dating | Event description | Event timestamp | Historical analysis, timeline construction | Das et al., 2017; Wang et al., 2021 |
| Document Dating | Document text | Creation date | Digital preservation, metadata recovery | Kumar et al., 2012; Niculae et al., 2014; Vashishth et al., 2018; Jatowt et al., 2007; SalahEldeen and Nelson, 2013 |
| Focus Time Estimation | Document content | Discussed time period | Historical QA, event-centric retrieval | Jatowt et al., 2013; Jatowt et al., 2013; Shrivastava et al., 2017 |
| Query Time Profiling | Search query | Temporal intent/distribution | Time-aware search, query understanding | Kanhabua & Nørvåg, 2010; Jones and Diaz, 2007; Dakka et al., 2008; Gupta and Berberich, 2014 |
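
To make the input/output shape of query time profiling concrete, here is a minimal sketch that handles only explicit temporal expressions (four-digit years and decades such as "1990s"). It is a simplification under stated assumptions: the cited papers also target implicit temporal intent, which this sketch does not attempt. The function name `query_time_profile` and both regular expressions are hypothetical.

```python
import re
from collections import Counter

# Explicit four-digit years between 1000 and 2099 (an assumed range).
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
# Decade expressions such as "1990s" or "2010s".
DECADE = re.compile(r"\b(1[0-9]{2}0|20[0-9]0)s\b")


def query_time_profile(query: str) -> Counter:
    """Count the years a query mentions explicitly or implies via a decade."""
    profile: Counter = Counter()
    for m in DECADE.finditer(query):
        start = int(m.group(1))
        profile.update(range(start, start + 10))  # spread over the decade
    for m in YEAR.finditer(query):
        profile[int(m.group(1))] += 1
    return profile


print(sorted(query_time_profile("policies from the 1990s")))
print(query_time_profile("latest climate policies"))  # no explicit time signal
```

An empty profile, as for "latest climate policies", signals that the temporal intent is implicit and needs a different method than pattern matching.
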

🏥 Domain-Specific Applications

Medical Domain

Challenges: Patient timeline reconstruction, symptom progression, treatment sequencing

| System/Dataset | Focus | Key Paper |
|---|---|---|
| TimeText | Time-oriented clinical QA | Zhou et al., 2008 |
| Temporal Clinical QA | Semantic web techniques | Tao et al., 2010 |
| Time-aware Health QA | Evidence retrieval with recency | Vladika & Matthes, 2024 |

Legal Domain

Challenges: Evolving statutes, precedent timelines, jurisdiction-specific temporal expressions

| System/Dataset | Focus | Key Paper |
|---|---|---|
| ChronosLex | Time-aware incremental training | T.y.s.s et al., 2024 |

Financial Domain

Challenges: Regulatory changes, market events, time-sensitive numerical reasoning

| Dataset | Focus | Key Paper |
|---|---|---|
| FinQA | Numerical reasoning over financial data | Chen et al., 2021 |
| FinTextQA | Long-form financial QA | Chen et al., 2024 |
| FinDER | Financial QA with RAG | Choi et al., 2025 |

🛠️ Resources & Tools

Temporal Taggers & NLP Tools

| Tool | Year | Languages | Type | Features | Link |
|---|---|---|---|---|---|
| HeidelTime | 2010 | 200+ | Rule-based | High precision, domain adaptation | Paper · GitHub |
| SUTime | 2012 | English | Rule-based | Stanford CoreNLP integration | Paper · Website |
| CogCompTime | 2018 | English | Neural | Compositional temporal understanding | Paper · GitHub |
| Temponym Tagger | 2016 | English | Hybrid | Implicit temporal references | Paper |

Document Collections

| Collection | Period | Size | Domain | Access |
|---|---|---|---|---|
| NYT Annotated Corpus | 1987-2007 | 1.8M articles | News | LDC License |
| Chronicling America | 1800-1920 | | Historical Newspapers | Free Access |
| Newswire Corpus | 1878-1977 | 2.7M articles | News | HuggingFace |
| Wikipedia Dumps | Various | TB-scale | Encyclopedia | Wikimedia |

Evaluation Frameworks


🚀 Future Directions

Our survey identifies 7 critical research areas requiring immediate attention:

1️⃣ Dynamic Temporal Knowledge Management

Problem: Static corpora can't handle evolving facts
Challenge: Temporal propagation when updating related events
Needed: Real-time knowledge graphs with dependency tracking

2️⃣ Temporally-Aware LLM Agents

Problem: LLMs hallucinate temporal information
Challenge: Resolving "last Tuesday" or "since our last chat"
Needed: Timeline memory, temporal reference resolution
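
As a small illustration of temporal reference resolution, the sketch below resolves expressions of the form "last <weekday>" against a reference date. It assumes one particular reading: the most recent such weekday strictly before the reference date. Speakers do not always mean that, which is part of why the problem is hard. The helper `resolve_last_weekday` is hypothetical and requires Python 3.9 or later for `str.removeprefix`.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def resolve_last_weekday(expression: str, reference: date) -> date:
    """Resolve 'last <weekday>' to the most recent such day strictly before `reference`."""
    name = expression.lower().removeprefix("last ").strip()
    target = WEEKDAYS.index(name)  # raises ValueError for unknown weekday names
    delta = (reference.weekday() - target) % 7
    # If the reference day is itself the target weekday, go back a full week.
    return reference - timedelta(days=delta or 7)


# 2025-05-28 is a Wednesday, so "last Tuesday" is the day before.
print(resolve_last_weekday("last Tuesday", date(2025, 5, 28)))  # 2025-05-27
```

An expression such as "since our last chat" cannot be resolved this way, because the anchor is a conversation event rather than a calendar pattern; that case needs stored timeline memory.
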

3️⃣ Diachronic-Synchronic Integration

Problem: Most systems use only one knowledge type
Challenge: Aligning historical trends with current snapshots
Needed: Cross-source temporal alignment algorithms

4️⃣ Temporal Uncertainty & Confidence

Problem: Systems treat all dates as exact
Challenge: "Around 476 AD", "mid-20th century"
Needed: Probabilistic temporal representations
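
One simple way to avoid treating vague dates as exact is to store them as year intervals with a distribution over the range. The sketch below uses a uniform distribution, which is an assumption made for brevity and not a recommendation from the survey. The class `YearInterval`, the helpers `around` and `mid_century`, the default tolerance of 5 years, and the choice of years 40 to 60 for "mid-century" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearInterval:
    """A vague date as an inclusive year range with a uniform distribution."""
    low: int
    high: int

    def probability(self, year: int) -> float:
        """Probability mass assigned to a single year."""
        if not (self.low <= year <= self.high):
            return 0.0
        return 1.0 / (self.high - self.low + 1)

    def overlap(self, other: "YearInterval") -> float:
        """Fraction of this interval's mass that falls inside `other`."""
        lo, hi = max(self.low, other.low), min(self.high, other.high)
        return 0.0 if lo > hi else (hi - lo + 1) / (self.high - self.low + 1)


def around(year: int, tolerance: int = 5) -> YearInterval:
    """'Around 476 AD' becomes 471..481 with the assumed default tolerance."""
    return YearInterval(year - tolerance, year + tolerance)


def mid_century(century: int) -> YearInterval:
    """'Mid-20th century' becomes 1940..1960 under the assumed boundaries."""
    start = (century - 1) * 100
    return YearInterval(start + 40, start + 60)


fall_of_rome = around(476)
postwar = mid_century(20)
print(fall_of_rome, postwar.overlap(YearInterval(1950, 1959)))
```

A retrieval or QA component can then score a candidate answer by how much probability mass it shares with the question's time constraint, instead of requiring an exact year match.
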

5️⃣ Multilingual & Multimodal TQA

Problem: Most work is English text-only
Challenge: Lunar calendars, visual time cues, cultural references
Needed: Cross-lingual temporal taggers, vision-language models

6️⃣ Implicit Temporal Intent Understanding

Problem: Many questions hide their time constraints
Challenge: Inferring "now" vs. "historically" from context
Needed: Context-dependent temporal intent detection

7️⃣ Evaluation & Benchmarking

Problem: Standard metrics don't capture temporal coherence
Challenge: Measuring temporal grounding, not just accuracy
Needed: Temporal-aware evaluation protocols


✨ Citation

If you find this work useful, please cite 📜 our paper:

Plain

Piryani, B., Abdullah, A., Mozafari, J., Anand, A., & Jatowt, A. (2025). It's High Time: A Survey of Temporal Question Answering. arXiv preprint arXiv:2505.20243.

BibTeX

@article{piryani2025s,
  title={It's High Time: A Survey of Temporal Question Answering},
  author={Piryani, Bhawna and Abdullah, Abdelrahman and Mozafari, Jamshid and Anand, Avishek and Jatowt, Adam},
  journal={arXiv preprint arXiv:2505.20243},
  year={2025}
}

🪪 License

This project is licensed under the MIT License - see the LICENSE file for details.

📝 Contributing

We welcome contributions to keep this survey comprehensive and up-to-date!

Missing a Paper or Dataset?

If we've missed your work or you know of a relevant paper/dataset that should be included, please send us an email at:

📧 bhawna.piryani@uibk.ac.at

Please include:

  • Paper title and authors
  • Link to paper and code/data (if available)
  • Brief description of the contribution

You can also open an issue on GitHub.
