Digital Scholar Lab is a tool by Gale for study and research in the digital humanities (DH). It is designed to facilitate learning about and researching Gale's digital resources licensed by the University of Helsinki. The Digital Scholar Lab is a cloud-based platform that enables students and researchers with UH credentials to access content and OCR data from Gale Primary Sources, as well as their own plain text files, and to analyse these with text and data mining tools. You can create custom data sets from the available archives and use digital tools to analyse them. A few sample projects are provided for teaching and learning the different stages of DH research. In addition, a learning centre within the tool offers instructional videos and texts on the various elements involved.
With a database-style interface, the Digital Scholar Lab provides a familiar navigational style for students who are new to text data mining as well as seasoned scholars. Users of the Gale Digital Scholar Lab can download individual content sets (up to 5,000 documents). Downloaded content sets are for scholarly, non-commercial use only.
The Digital Scholar Lab has three basic functions, which follow the basic research methodology of the digital humanities. The Lab provides helpful videos and information on all of these functions.
1. Building a custom content set (data set)
By searching for a topic in the Gale Primary Sources available to the library, you can find and collect appropriate material, which you can then analyze with the Lab. You can also upload plain text files of your own choice. You can search (basic or advanced search) using words that appear in a text or in the metadata describing it. Once you have results, you can review the information on each document to determine whether it is right for your research. You can click into a result to see the original document scan and its text, and you can use filters to further refine your search. Once you are happy with the results, you can create a custom content set: choose the results (or scanned documents) most relevant to your research question and add them to the set.
2. Cleaning the content set
Once the content set is ready, it is usually necessary to clean it (remove unwanted words, punctuation or characters) before analyzing the data. You can create multiple cleaning configurations, so you can tailor how a content set is cleaned to the analysis you are trying to do. To test a configuration, select a content set; the first 10 documents will be cleaned with your settings. You can then download the original and cleaned versions of those documents for comparison. When you are happy with the results, you can use the configuration you have chosen as the basis for analysis.
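To make the idea of a cleaning configuration more concrete, here is a minimal Python sketch of the kind of steps involved: lowercasing, replacing punctuation with spaces, and dropping unwanted words. It is not the Lab's own implementation; the function name `clean_text`, its parameters, and the short stop-word list are assumptions chosen for illustration only.

```python
import string

# Hypothetical stop-word list for illustration; a real configuration
# would typically use a longer, language-specific list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def clean_text(text, stop_words=STOP_WORDS, lowercase=True):
    """Return a cleaned copy of `text`.

    Steps: optionally lowercase, replace ASCII punctuation with spaces,
    then remove any token found in `stop_words`.
    """
    if lowercase:
        text = text.lower()
    # Map every ASCII punctuation character to a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    # Split on whitespace and drop the unwanted words.
    tokens = [token for token in text.split() if token not in stop_words]
    return " ".join(tokens)


if __name__ == "__main__":
    sample = "The Times, of London: 1851!"
    print(clean_text(sample))
```

Changing the stop-word list or the `lowercase` flag corresponds loosely to keeping several cleaning configurations for different analyses, and comparing the input with the output mirrors the original-versus-cleaned comparison described above.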
3. Analyzing the content set
Analysis allows you to take hundreds or thousands of documents and examine them with digital tools in ways that would be too time-consuming without the help of computational algorithms. Several analysis tools are available in the Lab (see the box on the right), each with helpful video instructions. Each tool has settings you can use to tune the results you get. With these tools you can also create visualizations like the one below.
With Digital Scholar Lab you can: