Scrape Quran translations from a website

Apertium

Bible and Quran translations often serve as parallel corpora for NLP tasks because both texts are available in many languages. Your goal in this task is to write a program, in the language of your choice, that scrapes the Quran translations available on the following website: http://www.quran-ebook.com/. You can adapt the scraper described on the Writing a scraper page or write your own from scratch. The output should be plain text in Tanzil format ('text with aya numbers'). You can see examples of that format on the http://tanzil.net/trans/ page. Before starting, check whether the translation is already available on the Tanzil project's page (there is no need to re-scrape those, but you should use them to test the output of your program). The format of the translations on the site appears to be the same throughout, so your program is expected to work for all of them, but the translations we are most interested in are the following: Azerbaijani version 2, Bashkir, Chechen, Karachay and Kyrgyz. When scraping, please be polite and request data at a reasonable rate.
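The output and politeness requirements above can be sketched in Python as follows. This is a minimal sketch, not a complete scraper: it assumes the Tanzil 'text with aya numbers' layout is one verse per line in the form sura|aya|text, and that reference files may contain blank lines and comment lines starting with '#'. All function names here (format_tanzil_line, to_tanzil, polite_fetch, verse_keys) are hypothetical, and the HTML parsing step is left out because the markup of the source site is not described here.

```python
import time
import urllib.request


def format_tanzil_line(sura, aya, text):
    """Return one verse as 'sura|aya|text' (assumed Tanzil layout),
    collapsing runs of whitespace and newlines into single spaces."""
    clean = " ".join(text.split())
    return f"{sura}|{aya}|{clean}"


def to_tanzil(verses):
    """Turn an iterable of (sura, aya, text) tuples into the file body,
    ordered by sura number and then aya number."""
    ordered = sorted(verses, key=lambda v: (v[0], v[1]))
    return "\n".join(format_tanzil_line(s, a, t) for s, a, t in ordered) + "\n"


def _default_fetch(url):
    """Download a page and decode it as UTF-8 (encoding is an assumption)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def polite_fetch(urls, delay_seconds=2.0, fetch=None, sleep=time.sleep):
    """Yield (url, body) pairs, pausing delay_seconds between requests.

    fetch and sleep are injectable so the logic can be tested offline.
    """
    if fetch is None:
        fetch = _default_fetch
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        yield url, fetch(url)


def verse_keys(tanzil_text):
    """Return the set of (sura, aya) pairs found in Tanzil-style text,
    skipping blank lines and lines that start with '#'."""
    keys = set()
    for line in tanzil_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sura, aya, _ = line.split("|", 2)  # the text itself may contain '|'
        keys.add((int(sura), int(aya)))
    return keys
```

To test a scraper against a translation that Tanzil already hosts, one option is to compare verse_keys(scraped_output) with verse_keys(reference_file): any pair present in one set but not the other points to a verse that was dropped or misnumbered. The two-second default delay is an arbitrary illustrative value, not a rate stated by the site.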

scraper

Students who completed this task

Ryan A. Chi

Task type

Code