Digital Resources: Licensed Resources for Text Mining

Sketch Engine

Helsinki University Library renewed its organisational licence for the Sketch Engine corpus tool in 2023. The tool is available to all researchers, teachers, and students at the University of Helsinki.

Sketch Engine is suited to retrieving information from large text corpora and to analysing text materials in all fields of teaching and research. Among its areas of application, the tool's developers single out the digital humanities.

Log in to Sketch Engine

Helsinki University Library resources for text analysis

This site is intended to support members of the Helsinki University community who are interested in text and data mining (TDM) of library-provided licensed resources. The resources listed below offer some mode of access to texts and data for large-scale analysis.

If there are particular vendors or resources in which you are interested, please let us know: e-library at Upon request, we can investigate further and assist researchers, teachers, and students affiliated with HU.


Databases available for TDM in Helsinki University (2021) Publisher or Vendor
British Library Newspapers, years 1732 - 1950

Contains full runs of influential national and regional newspapers representing different political and cultural segments of British society. Note: all Gale databases can also be found behind this link.

The Times Digital Archive 

The Times of London is widely considered to be the world's 'newspaper of record'. The Times Digital Archive allows users to search over 200 years of this invaluable historical source. Includes a learning centre with tips for searching and other useful information for beginners. 
The Economist Historical Archive 1843-2015

The Economist Historical Archive is the fully searchable facsimile edition of The Economist, the weekly paper for anyone engaged in politics, current affairs, business and trade worldwide. Containing every issue since its launch in 1843, the archive offers full-colour images, multiple search indexes, topic and area supplements and surveys. It is an unrivalled multidisciplinary primary source for researching and teaching the nineteenth and twentieth centuries.


Eighteenth Century Collections Online - ECCO I-II

Use Eighteenth Century Collections Online to access the digital images of every page of books published during the 18th Century. With full-text searching of millions of pages, the product allows researchers new methods of access to critical information in the fields of history, literature, religion, law, fine arts, science and more.


Nineteenth Century Collections Online: parts 7 and 12

The Science, Technology, and Medicine (7), 1780–1925 collection comprises:

  • Journals in general science, medicine, biology, botany, chemistry, physics, mathematics, geology, technology and other disciplines.
  • Monographs in such areas as history of medicine, dentistry, anthropology, ecology, public health, geography, astronomy, optics, color theory, mathematics, physics, electricity, engineering, and the philosophy of science.

Religion, Spirituality, Reform, and Society (12)

  • Topics include Christian Socialism, Secularism, and Materialism, relations between church and state, mysticism and symbolism, the philosophies of Hegel, Kant, and Comte and more.
  • Languages: English, Dutch, French, German, Italian, Latin, Spanish.

17th and 18th Century Burney Collection Newspapers (Seventeenth and Eighteenth Century Burney Newspapers Collection)

The newspapers and news pamphlets gathered by the Reverend Charles Burney (1757‒1817) represent the largest single collection of seventeenth and eighteenth-century English news media. The 700 or so bound volumes of newspapers and news pamphlets were published mostly in London, but there are also some English provincial, Irish and Scottish papers, and a few examples from the American colonies, Europe and India. Includes a learning centre with tips for searching and other useful information for beginners.




17th and 18th Century Nichols Collection (Seventeenth and Eighteenth Century Nichols Newspapers Collection)

Seventeenth and Eighteenth Century Nichols Newspapers Collection features London newspapers and pamphlets gathered by antiquarian and printer John Nichols. This collection, sourced from the Bodleian Library, spans the years 1672 to 1737 and complements the titles and issues found in seventeenth and eighteenth-century Burney Collection Newspapers. Includes a learning centre with tips for searching and other useful information for beginners. 



19th Century U.S. Newspapers, years 1800–1899

With digital facsimile images of both full pages and clipped articles for hundreds of nineteenth-century U.S. newspapers and advanced searching capabilities, researchers will be able to research history in ways previously unavailable. For each issue, the newspaper is captured from cover-to-cover, providing access to every article, advertisement and illustration.


Indigenous Peoples: North America

This resource draws on collections from Canadian and American institutions, providing insight into the cultural, political and social history of Native Peoples from the seventeenth into the twentieth century. It includes diverse manuscripts; book collections; newspapers from various tribes and Indian-related organizations; and materials such as Bibles, dictionaries and primers in Indigenous languages, all of which let students examine important primary source materials.


U.S. Declassified Documents Online

Offers access to over 750,000 pages of government documents. Covering major policy issues from the period before World War II into the twenty-first century, the archive serves as a convenient source for documents from government departments including Defense; State; Treasury; CIA; and the White House. U.S. Declassified Documents Online supports the study of history, politics, international relations, and journalism, among other fields.




ProQuest Historical Newspapers: New York Times (1851-1937); The Guardian (1821-1909); The Observer (1791-1909); Wall Street Journal (1889-1936); Washington Post (1877-1936)

Early English Books Online (EEBO) ProQuest

Secret Files from World Wars to Cold War
British government secret intelligence and foreign policy files from 1873 to 1953. Sourced from The National Archives, U.K.

Taylor & Francis

Past Masters - 40 different data sets of collected works by widely known philosophers and authors, including original texts and translations.

The collected list and descriptions of Past Masters data sets

Journal backfiles  
Elsevier - journal backfile - 1700 titles Elsevier
Springer - journal archive - 913 titles SpringerNature
Wiley - journal archive - 781 titles Wiley
Taylor & Francis - journal archive - 1140 titles Taylor & Francis
Nature - journal archive SpringerNature

Integrum Profi - Authorised Users of the University of Helsinki may, without obtaining Integrum's prior written consent, download a few thousand articles from Integrum Profi or My Integrum for text and data mining in academic research and for other non-commercial educational purposes. To download larger datasets, the Slavonic Library must request the data from Integrum on your behalf; in that case, please contact:

Integrum World Wide

Gale Learning Centres

Text and data mining example - Vogue Archive (ProQuest)

DH in Europe

DSM Directive in EU