Requested uploads

This is a help page for requesting uploads of scans of public domain books to the Internet Archive (IA). Wikisource:Sources contains links to other digital libraries from which scans can be added to IA. However, PDF scans derived from Google Books contain a warning page that needs to be stripped before the text is added to IA.

  • English Books from British India collection of 18 books from Savifa Virtual Library South Asia, University of Heidelberg. Solomon7968 (talk) 09:29, 4 February 2014 (UTC)
    • I can do the upload, but most of the work here is metadata. Please prepare 1) a list of URLs of the books to download, and 2) a CSV table with title, creator, date, description, sponsor (digitising institution), etc. I can help you revise the table if you're unsure how to name the fields, but I don't have time for the data entry. --Nemo 09:46, 4 February 2014 (UTC)
    • @Solomon7968: This set is small, and I'm fine with working on it. I can do the metadata sorting and the upload process, but only during the next week. Is this timeframe OK for you? I can also provide you with the resulting CSV for further requests (i.e., to help you write new CSV files for new or bigger requests). Lugusto 19:03, 7 February 2014 (UTC)
      • Thanks for the offer! Many books in the collection are rare and not available elsewhere. I also sent an email to their contact person, Nicole Merkel, a month or so ago, but there has been no response so far. Solomon7968 (talk) 03:31, 8 February 2014 (UTC)
        • The scans provided by SavifaDok for Transactions of the Agricultural and Horticultural Society of India v. II are of very low quality. Fortunately, I found a better-quality scan on GBS, so I picked that one, did the OCR and DjVu processing locally on my machine (with ABBYY 11), and uploaded it to Commons as File:Transactions of the Agricultural and Horticultural Society of India - Vol 2.djvu. Please note that Transactions of the Agricultural and Horticultural Society of India isn't exactly a book but a journal; GBS also has more issues of this title. I will check and work on the 17 remaining files from SavifaDok tomorrow. Lugusto 19:46, 10 February 2014 (UTC)
          • @Solomon7968: Sorry for the delay. Apparently all scans from SavifaDok are of very low quality. Again I found a better-quality scan on GBS (although both versions have problems on exactly the same pages...), this time for A rapid sketch of the life of Raja Radhakanta Deva Bahadur with some notices of his ancestors, and testimonials of his character and learning. Done as with the previous book: File:A rapid sketch of the life of Raja Radhakanta Deva Bahadur.djvu. I will process more of the 16 remaining books as soon as possible. Lugusto 02:47, 15 February 2014 (UTC)
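
The metadata table requested in the thread above could be prepared along these lines. This is only a minimal sketch: the column names follow the fields listed in the request (title, creator, date, description, sponsor), the extra "url" column and the build_metadata_csv helper are hypothetical, and the exact field names the uploader needs are an assumption that should be confirmed with whoever does the upload. The URL and the "(to be filled in)" values are placeholders, not real data.

```python
import csv
import io

# Column names taken from the request in this thread; "url" is a
# hypothetical extra column that ties each row to its download link.
FIELDS = ["url", "title", "creator", "date", "description", "sponsor"]


def build_metadata_csv(rows):
    """Return CSV text with one row per book, rejecting incomplete rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Refuse rows with empty or missing fields so gaps are noticed early.
        missing = [f for f in FIELDS if not row.get(f)]
        if missing:
            raise ValueError(f"row {row.get('title', '?')!r} is missing: {missing}")
        writer.writerow(row)
    return buf.getvalue()


# Placeholder values only; replace them with the real metadata for each book.
books = [
    {
        "url": "https://example.org/placeholder-scan.pdf",
        "title": "Transactions of the Agricultural and Horticultural Society of India v. II",
        "creator": "(to be filled in)",
        "date": "(to be filled in)",
        "description": "(to be filled in)",
        "sponsor": "Savifa Virtual Library South Asia, University of Heidelberg",
    },
]

csv_text = build_metadata_csv(books)
print(csv_text)
```

Values containing commas, such as the sponsor above, are quoted automatically by the csv module, so the file stays readable by any spreadsheet or upload script.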

PDF scans derived from Google Books contain a warning page that needs to be stripped before the text is added to IA, to facilitate proofreading on Wikisource. This is normally done by the user/bot "tpb" (not affiliated with the Internet Archive). We would like a way to suggest books of interest to tpb; we can start accumulating Google Books URLs here, and maybe tpb will fetch them at some point.

Also see this Scriptorium thread opened by Yann. Solomon7968 (talk) 10:37, 4 February 2014 (UTC)

  • The work tpb started seems to have ended years ago, but I'm not sure. In the meantime, the original GBS collection has grown considerably. Maybe we need a tool that searches GBS directly, removes the warning page, and uploads to IA instead? Lugusto 19:03, 7 February 2014 (UTC)
    • Many editors here who have software for removing the warnings and watermarks replace the existing IA-derived file on Commons with a clean one without warnings and watermarks. This is especially troublesome for large files. An automated system for uploading to IA would certainly help. Solomon7968 (talk) 03:31, 8 February 2014 (UTC)


Please strip off Google boilerplate and upload to IA


I would like to add the original 1847 edition of "History of the Press of Western New York" by Frederick Follett to en.wikisource.org. IA has the 1920 and 1973 reprints but not the original 1847 edition; books.google.com has the 1847 edition, which is about 80 pages long. I would appreciate help from someone who can remove the Google boilerplate and upload the scan to IA so that the DjVu file is created. Robin2014 (talk) 19:35, 23 February 2015 (UTC)