Uses & citations
================

.. meta::
    :description lang=en:
        Htmldate is used at several institutions, included in other software packages and cited in research publications. This page lists projects and publications mentioning the library.


Htmldate is used at several institutions, included in other software packages and cited in research publications. This page lists projects and publications mentioning the library.

To add further references, please `edit this page `_ and suggest changes.


Notable projects using this software
------------------------------------

Institutional users
^^^^^^^^^^^^^^^^^^^

- `Media Cloud platform `_ for media analysis through its `meta-extractor `_ package
- The Internet Archive's `sandcrawler `_ which crawls and processes the scholarly web for the `Fatcat catalog `_ of research publications
- `SciencesPo médialab `_ through its `Minet `_ webmining package


Various repositories
^^^^^^^^^^^^^^^^^^^^

- see `dependency graph on GitHub `_


Citations in papers
-------------------

**To reference this software in a publication please cite the following paper:**

- Barbaresi, A. "`htmldate: A Python package to extract publication dates from web pages `_", *Journal of Open Source Software*, 5(51), 2439, 2020. DOI: 10.21105/joss.02439

.. image:: https://joss.theoj.org/papers/10.21105/joss.02439/status.svg
   :target: https://doi.org/10.21105/joss.02439
   :alt: JOSS article

.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.3459599.svg
   :target: https://doi.org/10.5281/zenodo.3459599
   :alt: Zenodo archive

.. code-block:: bibtex

    @article{barbaresi-2020-htmldate,
      title = {{htmldate: A Python package to extract publication dates from web pages}},
      author = "Barbaresi, Adrien",
      journal = "Journal of Open Source Software",
      volume = 5,
      number = 51,
      pages = 2439,
      url = {https://doi.org/10.21105/joss.02439},
      publisher = {The Open Journal},
      year = 2020,
    }


Publications citing Htmldate
----------------------------

- Grabovoy, A., Bakhteev, O., & Chekhovich, Y. (2021). "The automatic approach for scientific papers dating," 2021 Ivannikov Ispras Open Conference (ISPRAS), pp. 107-113, IEEE, DOI: 10.1109/ISPRAS53967.2021.00020.
- Hanley, H. W., Kumar, D., & Durumeric, Z. (2023). Happenstance: Utilizing Semantic Search to Track Russian State Media Narratives about the Russo-Ukrainian War On Reddit. In Proceedings of the international AAAI conference on web and social media (Vol. 17, pp. 327-338).
- Hanley, H. W., Kumar, D., & Durumeric, Z. (2023). A Golden Age: Conspiracy Theories' Relationship with Misinformation Outlets, News Media, and the Wider Internet. arXiv preprint arXiv:2301.10880.
- Kupi, M. (2021). "Late to the Party? Agile Methods in British and German Government Institutions", Master’s thesis, Hertie School Berlin.
- Smits, T., & Ros, R. (2021). "Distant reading 940,000 online circulations of 26 iconic photographs", New Media & Society, DOI: 10.1177/14614448211049.
- Smits, T., & Ros, R. (2023). Space and Place in Online Visual Memory: The Tank Man in Hong Kong. The Visual Memory of Protest, Amsterdam University Press.
- Stefanovitch, N. (2022). Team TMA at SemEval-2022 Task 8: Lightweight and Language-Agnostic News Similarity Classifier. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022) (pp. 1178-1183).
- Viduka, D., Ličina, B., & Ilić, L. (2021). Choosing the best Python web framework for beginner according to experienced users. In 11th International Conference on Applied Information and Internet Technologies (pp. 100-103).


Ports
-----

- Go port: `go-htmldate `_


Software ecosystem
------------------

- Barbaresi, A. (2021). "Trafilatura: A web scraping library and command-line tool for text discovery and extraction". In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations (pp. 122-131).

.. image:: software-ecosystem.png
    :alt: Software ecosystem
    :align: center
    :width: 65%