Uses & citations
Htmldate is used at several institutions, included in other software packages and cited in research publications. This page lists projects and publications mentioning the library.
To add further references, please edit this page and suggest changes.
Notable projects using this software
Institutional users
- Media Cloud platform for media analysis, through its meta-extractor package
- The Internet Archive’s sandcrawler, which crawls and processes the scholarly web for the Fatcat catalog of research publications
- SciencesPo médialab, through its Minet webmining package
Various repositories
Citations in papers
To reference this software in a publication, please cite the following paper:
Barbaresi, A. “htmldate: A Python package to extract publication dates from web pages”, Journal of Open Source Software, 5(51), 2439, 2020. DOI: 10.21105/joss.02439
@article{barbaresi-2020-htmldate,
title = {{htmldate: A Python package to extract publication dates from web pages}},
author = "Barbaresi, Adrien",
journal = "Journal of Open Source Software",
volume = 5,
number = 51,
pages = 2439,
url = {https://doi.org/10.21105/joss.02439},
publisher = {The Open Journal},
year = 2020,
}
Publications citing Htmldate
Grabovoy, A., Bakhteev, O., & Chekhovich, Y. (2021). “The automatic approach for scientific papers dating”, 2021 Ivannikov ISPRAS Open Conference (ISPRAS), pp. 107-113, IEEE, DOI: 10.1109/ISPRAS53967.2021.00020.
Hanley, H. W., Kumar, D., & Durumeric, Z. (2022). “Happenstance: Utilizing Semantic Search to Track Russian State Media Narratives about the Russo-Ukrainian War On Reddit”. arXiv preprint arXiv:2205.14484.
Kupi, M. (2021). “Late to the Party? Agile Methods in British and German Government Institutions”, Master’s thesis, Hertie School Berlin.
Smits, T., & Ros, R. (2021). “Distant reading 940,000 online circulations of 26 iconic photographs”, New Media & Society, DOI: 10.1177/14614448211049.
Stefanovitch, N. (2022). Team TMA at SemEval-2022 Task 8: Lightweight and Language-Agnostic News Similarity Classifier. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022) (pp. 1178-1183).
Viduka, D., Ličina, B., & Ilić, L. (2021). Choosing the best Python web framework for beginner according to experienced users. In 11th International Conference on Applied Information and Internet Technologies (pp. 100-103).
Ports
- Go port
Software ecosystem
Barbaresi, A. (2021). “Trafilatura: A web scraping library and command-line tool for text discovery and extraction”. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations (pp. 122-131).
