Document Type

Conference Proceeding

Publication Date



Keywords

extractive reading comprehension, spatial-temporal information, span mask


Abstract

Extractive reading comprehension aims to answer a given question by extracting consecutive subsequences from a given article. Previous work often adopted Byte Pair Encoding (BPE), which can separate semantically correlated words. In addition, previous feature extraction strategies cannot effectively capture global semantic information. In this paper, an extractive summarization model with enhanced spatial-temporal information and span mask encoding (ESSM) is proposed to promote global semantic information. ESSM uses an Embedding Layer to reduce the semantic segmentation of correlated words and adopts a TemporalConvNet Layer to relieve the loss of feature information. The model can also handle unanswerable questions. To verify the effectiveness of the model, experiments are conducted on the SQuAD1.1 and SQuAD2.0 datasets. The model achieves an EM of 86.31% and an F1 score of 92.49% on SQuAD1.1, and 80.54% and 83.27%, respectively, on SQuAD2.0. These results indicate that the model is effective for the extractive QA task.
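The abstract reports EM (exact match) and F1 without defining them. The following is a minimal sketch of how these two scores are commonly computed for a single predicted answer span, assuming SQuAD-style answer normalization (lowercasing, removing punctuation and English articles). The function names are illustrative and are not taken from the paper, and the authors' exact evaluation script is not described in this record.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between the predicted span and the reference span."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    # Assumption: an empty string stands for "unanswerable"; two empty
    # answers count as a match, one empty answer counts as a miss.
    if not pred_tokens or not true_tokens:
        return float(pred_tokens == true_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the average of these per-question values, taking the maximum over the available reference answers for each question.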

Digital Object Identifier (DOI)


Originally published as:

Li, R., Zheng, F., Liang, G., Jiang, L., Wu, P., & Chen, B. (2022, October). ESSM: an extractive summarization model with enhanced spatial-temporal information and span mask encoding. In 5th International Conference on Computer Information Science and Application Technology (CISAT 2022) (Vol. 12451, pp. 1003-1007). SPIE.

Copyright 2022. Society of Photo‑Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this publication for a fee or for commercial purposes, and modification of the contents of the publication are prohibited.

Archived on this site as part of the SPIE green open access program. For more information, please see: